The following list describes the parameters of a Hadoop Job.

Connection Profile: Defines the name of the Control-M for Hadoop connection profile. Control-M rule: 1-30 characters (capital letters only).

Execution Type: Determines the type of execution for the Hadoop Job, as described in the following parameters.

Java-Map-Reduce: Defines the parameters used to execute MapReduce Jobs (see the command sketch after this list):
- Full Path to Jar: Defines the full path to the jar that contains the MapReduce Java program on the Hadoop host.
- Main Class (optional): Defines the class in the jar that contains the main function and the MapReduce implementation.
- Arguments: Defines the arguments used by the command.
- Append Yarn aggregated logs to output: Determines whether to append the Yarn aggregated logs to the Job output.
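For illustration, a Job defined with these parameters typically resolves to a standard "hadoop jar" invocation on the Hadoop host. A minimal sketch, in which the jar path, main class, and arguments are hypothetical:

    # Run the MapReduce program packaged in the jar (all values are examples only)
    hadoop jar /home/user1/jobs/wordcount.jar org.example.WordCount \
        /user/input /user/output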
Streaming: Defines the parameters that enable you to create and run MapReduce Jobs with any executable or script as the mapper and/or the reducer.
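Such a Job corresponds to the standard Hadoop Streaming command line. A minimal sketch, assuming the streaming jar shipped with the Hadoop distribution and using hypothetical paths and scripts:

    # Use shell scripts as mapper and reducer (all paths are examples only)
    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
        -input /user/input \
        -output /user/output \
        -mapper mapper.sh \
        -reducer reducer.sh \
        -file mapper.sh -file reducer.sh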
Spark: Defines the parameters used to execute Spark Jobs.
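A Spark Job typically maps to a spark-submit command. A minimal sketch with a hypothetical application jar, main class, and arguments:

    # Submit a Spark application to the YARN cluster (values are examples only)
    spark-submit --master yarn --class org.example.App \
        /home/user1/jobs/spark-app.jar arg1 arg2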
DistCp: Defines the parameters used to execute DistCp Jobs (see the command sketch after this list):
- Target Path: Defines the absolute destination path.
- Source Path: Defines the source path (a minimum of one source is required).
- Command Line Options: Defines the sets of parameters and values that are added to the command line, where Name and Value define each option name and its associated value.
- Append Yarn aggregated logs to output: Determines whether to append the Yarn aggregated logs to the Job output.
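These parameters assemble into the standard hadoop distcp command. A minimal sketch copying between two hypothetical clusters, with -m (maximum map tasks) as an example command-line option:

    # Copy from a source cluster to a target cluster (URIs are examples only)
    hadoop distcp -m 20 \
        hdfs://source-nn:8020/user/data \
        hdfs://target-nn:8020/user/data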
Oozie: Defines the parameters used to execute Oozie Jobs (see the command sketch after this list):
- Job Properties File: Defines the path to the Job properties file.
- Job Properties: Defines the Oozie Job properties, where Key and Value define each property name and its associated value.
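An Oozie Job of this kind corresponds to submitting a workflow through the Oozie CLI. A minimal sketch, assuming a hypothetical properties file and Oozie server URL:

    # Submit and start an Oozie workflow (URL and file path are examples only)
    oozie job -oozie http://oozie-host:11000/oozie \
        -config /home/user1/jobs/job.properties -run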
Oozie Extractor: The Oozie extractor is a service that fetches Oozie workflows from the Oozie server at set intervals, based on defined rules, and pushes the actions of each workflow as submitted Jobs.

Pig: Defines the list of parameters for the Pig program (see the command sketch after this list):
- Full Path to Pig Program: Defines the full path to the Pig program on the Hadoop host.
- Name: Defines a name for each property.
- Value: Defines the value associated with each property.
- Append Yarn aggregated logs to output: Determines whether to append the Yarn aggregated logs to the Job output.
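A Pig Job of this kind typically resolves to the pig command line, with each Name/Value pair passed as a -param definition. A minimal sketch with hypothetical values:

    # Run a Pig script with one parameter (paths and values are examples only)
    pig -param run_date=2024-01-01 /home/user1/jobs/report.pig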
Hive: Defines the script used to execute Hive Jobs (see the command sketch after this list):
- Full Path to Hive Script: Defines the full path to the Hive script on the Hadoop host.
- Script Parameters: Defines the list of parameters for the script, where Name and Value define each parameter name and its associated value.
- Append Yarn aggregated logs to output: Determines whether to append the Yarn aggregated logs to the Job output.
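Such a Job typically maps to the hive CLI, with script parameters passed as --hivevar definitions. A minimal sketch with a hypothetical script and parameter:

    # Execute a Hive script with one variable (values are examples only)
    hive --hivevar run_date=2024-01-01 -f /home/user1/jobs/report.hql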
Sqoop: Defines the parameters required to execute Sqoop Jobs (see the command sketch after this list):
- Sqoop Command: Defines any valid Sqoop command necessary for Job execution. Sqoop can be used for Job execution only if it is defined in the Sqoop connection parameters.
- Append Yarn aggregated logs to output: Determines whether to append the Yarn aggregated logs to the Job output.
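For illustration, a typical Sqoop command imports a relational table into HDFS. A minimal sketch with a hypothetical JDBC URL, table, and target directory:

    # Import a database table into HDFS (connection details are examples only)
    sqoop import --connect jdbc:mysql://db-host:3306/sales \
        --username etl -P --table orders \
        --target-dir /user/sqoop/orders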
HDFS File Watcher: Defines the parameters required to execute HDFS File Watcher Jobs (see the sketch after this list):
- File Name Full Path: Defines the full path of the file being watched.
- Max Time to Wait: Determines the maximum number of minutes to wait for the file to meet the watching criteria. If the criteria are not met (the file did not arrive, or the minimum size was not reached), the Job fails after this number of minutes.
- Min Detected Size: Determines the minimum file size, in bytes, that meets the criteria and completes the Job as OK. If the file arrives but the minimum size is not reached, the Job continues to watch the file.
- File Name Variable: Defines the global variable name that is used in succeeding Jobs. For more information, see the Control-M Parameters guide.
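Conceptually, the watcher behaves like a loop that polls the file's size until the criteria are met or the timeout expires. A minimal shell sketch of that logic (not the product's actual implementation; the path, size, and timeout are hypothetical):

    # Poll an HDFS file until it reaches 1 MB or 60 minutes pass (examples only)
    for i in $(seq 1 60); do
        size=$(hdfs dfs -stat %b /user/incoming/data.csv 2>/dev/null || echo 0)
        [ "$size" -ge 1048576 ] && echo "criteria met" && exit 0
        sleep 60
    done
    echo "timed out" && exit 1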
Tajo: Defines all the parameters required to execute Tajo Jobs (see the command sketch after this list):
- Input File: Defines the input file used as the Tajo command source.
- Parameters: Determines the script parameters, where Name and Value define each parameter name and its associated value.
- Open Query: Determines the source type used to run the query or queries.
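An Apache Tajo Job of this kind typically maps to the tsql shell, which can read a query file or run an inline query. A minimal sketch with hypothetical inputs:

    # Run queries from a file, or a single inline query (values are examples only)
    tsql -f /home/user1/jobs/report.sql
    tsql -c "SELECT count(*) FROM logs;"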
Distributed Shell: Defines all the parameters required to execute Distributed Shell Jobs.
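Distributed Shell is the YARN sample application that runs a shell command in containers across the cluster. A minimal sketch, assuming the distributedshell jar shipped with the Hadoop distribution (the jar path and values are examples only):

    # Run "date" in two YARN containers (jar path is an example only)
    yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
        -jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
        -shell_command date -num_containers 2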
HDFS Commands: Determines the HDFS commands to run (see the sketch after this list):
- Command: Defines the HDFS command that is performed on Job execution.
- Arguments: Defines the arguments used by the command.
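Each Command/Arguments pair corresponds to an hdfs dfs invocation. A minimal sketch with hypothetical paths:

    # Typical HDFS file-system commands (paths are examples only)
    hdfs dfs -mkdir -p /user/etl/staging
    hdfs dfs -ls /user/etl
    hdfs dfs -rm -r /user/etl/tmp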
Impala: Defines all the parameters required to execute Impala Jobs (see the sketch after this list):
- Source: Determines the source type used to run the query or queries. The two options are Query File and Open Query.
- Query File Full Path: Defines the location of the file used to run the query or queries.
- Query: Defines the query command used to run the query or queries.
- Command Line Options: Defines the sets of parameters and values that are added to the command line.
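An Impala Job typically maps to the impala-shell command, with Query File and Open Query corresponding to the -f and -q options. A minimal sketch with a hypothetical host and inputs:

    # Run queries from a file, or a single inline query (values are examples only)
    impala-shell -i impala-host:21000 -f /home/user1/jobs/report.sql
    impala-shell -i impala-host:21000 -q "SELECT count(*) FROM orders;"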
Set Properties Parameters: Defines a list of properties that are applied to the Job and override the Hadoop defaults (see the sketch after this list):
- Name: Defines a name for each property.
- Value: Defines the value associated with each property.
- Archives: Indicates the location of the Hadoop archives.
- Files: Indicates the location of the Hadoop files.
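These properties correspond to the Hadoop generic options: each Name/Value pair becomes a -D definition, and Archives and Files become -archives and -files. A minimal sketch, assuming the program parses generic options through ToolRunner (all values are hypothetical):

    # Override a Hadoop default and ship extra files (values are examples only)
    hadoop jar /home/user1/jobs/wordcount.jar org.example.WordCount \
        -D mapreduce.job.reduces=4 \
        -files /home/user1/conf/lookup.txt \
        -archives /home/user1/libs/deps.zip \
        /user/input /user/output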
Pre/Post Commands: Defines commands that run before or after the Job is executed.
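For illustration, pre/post commands are typically small shell or HDFS commands that prepare or clean up the Job's environment. Hypothetical examples:

    # Pre-command: remove a stale output directory so the Job can recreate it
    hdfs dfs -rm -r -f /user/output
    # Post-command: archive the Job output (path is an example only)
    hdfs dfs -mv /user/output /user/archive/output-$(date +%F)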