Hadoop Job parameters

The following describes the Job parameters of a Hadoop Job.

NOTE: Some parameters might not appear in your properties pane if they were not specified in the User View that you applied. If the parameter rules were not specified by the Site Standard, the Control-M rules apply.

Connection Profile

Defines the name of the Control-M for Hadoop connection profile. Control-M rule: 1-30 characters (capital letters only).

Execution Type

Defines the execution type of the Hadoop Job.

Java-Map-Reduce

Defines the parameters to execute MapReduce Jobs. Parameters:

  • Full path to jar: Defines the full path to the jar that contains the MapReduce Java program on the Hadoop host
  • Main Class (optional): Defines the class, included in the jar, that contains the main function and the MapReduce implementation
  • Arguments: Defines the arguments that are used by the command
  • Append Yarn aggregated logs to output: Determines whether to add the YARN aggregated logs to the Job output
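
These fields map onto the standard hadoop jar command line. The following Python sketch is illustrative only, not Control-M's actual invocation; the jar path, class name, and arguments are hypothetical placeholders.

  import subprocess

  # Illustrative "hadoop jar" equivalent; all values are hypothetical.
  subprocess.run([
      "hadoop", "jar",
      "/apps/mr/wordcount.jar",       # Full path to jar on the Hadoop host
      "com.example.WordCount",        # Main Class (optional if set in the jar manifest)
      "/data/input", "/data/output",  # Arguments
  ], check=True)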

Streaming

Defines the parameters that enable you to create and run MapReduce Jobs with any executable or script as the mapper and/or the reducer.
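
Conceptually, a streaming Job runs the Hadoop streaming jar with the chosen mapper and reducer executables. A minimal illustrative sketch in Python; the jar location and script names are hypothetical:

  import subprocess

  # Illustrative Hadoop streaming equivalent; paths and scripts are hypothetical.
  subprocess.run([
      "hadoop", "jar", "/opt/hadoop/share/hadoop/tools/lib/hadoop-streaming.jar",
      "-input", "/data/input",
      "-output", "/data/output",
      "-mapper", "mapper.py",         # any executable or script as the mapper
      "-reducer", "reducer.py",       # any executable or script as the reducer
      "-file", "mapper.py",
      "-file", "reducer.py",
  ], check=True)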

Spark

Defines the parameters to execute Spark Jobs.
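
A Spark Job corresponds to a spark-submit invocation. An illustrative Python sketch; the class, jar, and argument values are hypothetical:

  import subprocess

  # Illustrative "spark-submit" equivalent; class, jar, and arguments are hypothetical.
  subprocess.run([
      "spark-submit",
      "--master", "yarn",
      "--class", "com.example.SparkApp",
      "/apps/spark/app.jar",
      "2024-01-01",                   # application argument
  ], check=True)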

DistCp

Defines the parameters to execute DistCp Jobs. Parameters:

  • Target Path: Defines the absolute destination path
  • Source path: Defines the source path (a minimum of one source is required)
  • Command Line options: Defines the sets of parameters and values that are added to the command line. Name and Value: Defines a name and the value associated with each property
  • Append Yarn aggregated logs to output: Determines whether to add the YARN aggregated logs to the Job output
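
These fields correspond to the hadoop distcp command line, where the options precede the source and target paths. An illustrative Python sketch; the hosts, paths, and the option shown are hypothetical:

  import subprocess

  # Illustrative "hadoop distcp" equivalent; hosts and paths are hypothetical.
  subprocess.run([
      "hadoop", "distcp",
      "-Dmapreduce.map.memory.mb=2048",  # Command Line options (Name=Value)
      "hdfs://cluster-a/data/logs",      # Source path (minimum of one)
      "hdfs://cluster-b/backup/logs",    # Target Path (absolute destination)
  ], check=True)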

Oozie

Defines the parameters to execute Oozie Jobs. Parameters:

  • Job Properties File: Defines the Job properties file path
  • Job Properties: Defines the Oozie Job properties. Key and Value: Defines a key name and value associated with each property.
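
An Oozie Job corresponds to the oozie job CLI: the properties file is passed with -config, individual properties as -D pairs, and the Job is started with -run. An illustrative Python sketch; the server URL, file path, and property are hypothetical:

  import subprocess

  # Illustrative "oozie job" equivalent; the server URL and paths are hypothetical.
  subprocess.run([
      "oozie", "job",
      "-oozie", "http://oozie-host:11000/oozie",
      "-config", "/apps/oozie/job.properties",  # Job Properties File
      "-DqueueName=default",                    # Job Properties (Key=Value)
      "-run",
  ], check=True)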

Oozie Extractor

The Oozie extractor is a service that fetches Oozie workflows from the Oozie server at defined intervals, based on defined rules, and pushes the actions of each workflow as submitted Jobs.

Pig

Defines the list of parameters for the Pig program. Parameters:

  • Full Path to Pig Program: Defines the full path to the Pig program on the Hadoop host
  • Name: Defines a name for each property
  • Value: Defines the value associated with each property
  • Append Yarn aggregated logs to output: Determines whether to add the YARN aggregated logs to the Job output
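
A Pig Job corresponds to running the pig client, with each Name/Value pair passed as a -param option. An illustrative Python sketch with hypothetical values:

  import subprocess

  # Illustrative "pig" equivalent; the script path and parameter are hypothetical.
  subprocess.run([
      "pig",
      "-param", "run_date=2024-01-01",  # Name=Value property
      "-f", "/apps/pig/etl.pig",        # Full Path to Pig Program
  ], check=True)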

Hive

Defines the script used to execute Hive Jobs. Parameters:

  • Full path to Hive script: Defines the full path to the Hive script on the Hadoop host
  • Script Parameters: Defines the list of parameters for the script. Name and Value: Defines the name and value associated with each property
  • Append Yarn aggregated logs to output: Determines whether to add the YARN aggregated logs to the Job output
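
A Hive Job corresponds to running the hive client with -f, with each script parameter passed as a --hivevar Name=Value pair. An illustrative Python sketch with hypothetical values:

  import subprocess

  # Illustrative "hive" equivalent; the script path and variable are hypothetical.
  subprocess.run([
      "hive",
      "-f", "/apps/hive/daily_report.hql",  # Full path to Hive script
      "--hivevar", "run_date=2024-01-01",   # Script Parameters (Name=Value)
  ], check=True)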

Sqoop

Defines the parameters required to execute Sqoop Jobs. Parameters:

  • Sqoop command: Defines any valid Sqoop command necessary for Job execution. Sqoop can be used for Job execution only if it is defined in the Sqoop connection parameters
  • Append Yarn aggregated logs to output: Determines whether to add the YARN aggregated logs to the Job output
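
The Sqoop command field holds the command much as you would type it after the sqoop executable. An illustrative import in Python; the JDBC URL, table, and target directory are hypothetical:

  import subprocess

  # Illustrative "sqoop" equivalent; connection details are hypothetical.
  subprocess.run([
      "sqoop", "import",
      "--connect", "jdbc:mysql://db-host/sales",
      "--table", "orders",
      "--target-dir", "/data/orders",
  ], check=True)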

HDFS File Watcher

Defines the parameters required to execute HDFS File Watcher Jobs. Parameters:

  • File name full path: Defines the full path of the file being watched
  • Max time to wait: Determines the maximum number of minutes to wait for the file to meet the watching criteria. If the criteria are not met (the file did not arrive, or the minimum size was not reached), the Job fails after this maximum number of minutes
  • Min detected size: Determines the minimum file size, in bytes, that meets the criteria and finishes the Job as OK. If the file arrives but the minimum size is not reached, the Job continues to watch the file
  • File Name Variable: Defines the global variable name that is used in succeeding Jobs. For more information, see the Control-M Parameters guide.
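
The watching behavior can be pictured as a polling loop that checks the file size until the criteria are met or the wait time elapses. A simplified Python sketch of that logic (not Control-M's implementation); the path and limits are hypothetical, and hdfs dfs -stat %b prints a file's size in bytes:

  import subprocess
  import time

  path = "/data/incoming/trades.csv"  # File name full path (hypothetical)
  max_wait_minutes = 30               # Max time to wait
  min_size_bytes = 1024               # Min detected size

  deadline = time.time() + max_wait_minutes * 60
  while time.time() < deadline:
      probe = subprocess.run(["hdfs", "dfs", "-stat", "%b", path],
                             capture_output=True, text=True)
      if probe.returncode == 0 and int(probe.stdout.strip()) >= min_size_bytes:
          break                       # criteria met: the Job ends OK
      time.sleep(60)                  # file missing or too small: keep watching
  else:
      raise TimeoutError("file did not meet the watching criteria in time")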

Tajo

Defines all the parameters required to execute Tajo Jobs. Parameters:

  • Input File: Defines the input file used as the Tajo command source
  • Parameters: Defines the script parameters. Name and Value: Defines a name and the value associated with each property
  • Open Query: Defines the query/queries to run directly, as an alternative to an input file.
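
A Tajo Job corresponds to running a query through Tajo's tsql shell, either from a file or as an inline query. An illustrative Python sketch, assuming tsql's -f (file) and -c (command) options; the file path and query are hypothetical:

  import subprocess

  # Illustrative "tsql" equivalents; the file and query are hypothetical.
  subprocess.run(["tsql", "-f", "/apps/tajo/report.sql"], check=True)         # Input File
  subprocess.run(["tsql", "-c", "SELECT count(*) FROM orders;"], check=True)  # Open Query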

Distributed Shell

Defines all the parameters required to execute Distributed Shell Jobs.
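
Distributed Shell corresponds to YARN's stock distributed shell application, which runs a shell command in containers across the cluster. An illustrative Python sketch, assuming the client class that ships with Hadoop; the jar path and command are hypothetical:

  import subprocess

  # Illustrative YARN distributed shell equivalent; the jar path is hypothetical.
  subprocess.run([
      "yarn", "org.apache.hadoop.yarn.applications.distributedshell.Client",
      "-jar", "/opt/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell.jar",
      "-shell_command", "uptime",
      "-num_containers", "2",
  ], check=True)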

HDFS Commands

Determines the HDFS commands to run. Parameters:

  • Command: Defines the HDFS command that is performed with Job execution
  • Arguments: Defines the arguments that are used by the command
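
Each Command/Arguments pair maps onto one hdfs dfs invocation. An illustrative Python sketch with hypothetical paths:

  import subprocess

  # Illustrative "hdfs dfs" equivalents; the paths are hypothetical.
  subprocess.run(["hdfs", "dfs", "-mkdir", "-p", "/data/staging"], check=True)  # Command + Arguments
  subprocess.run(["hdfs", "dfs", "-ls", "/data/staging"], check=True)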

Impala

Defines all the parameters required to execute Impala Jobs. Parameters:

  • Source: Determines the source type to run the query/queries. The two options to select from are Query File and Open Query
  • Query File Full Path: Defines the location of the file used to run the query/queries
  • Query: Defines the query command used to run the query/queries
  • Command Line Options: Defines the sets of parameters and values that are added to the command line
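
An Impala Job corresponds to an impala-shell invocation: -f for a Query File source, -q for an Open Query. An illustrative Python sketch; the host, file, and query are hypothetical:

  import subprocess

  # Illustrative "impala-shell" equivalents; host, file, and query are hypothetical.
  subprocess.run([
      "impala-shell", "-i", "impalad-host",
      "-f", "/apps/impala/report.sql",       # Source = Query File
  ], check=True)
  subprocess.run([
      "impala-shell", "-i", "impalad-host",
      "-q", "SELECT count(*) FROM orders",   # Source = Open Query
  ], check=True)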

Set Properties Parameters

Defines a list of properties that are applied to the Job execution. These properties override the Hadoop defaults. Parameters:

  • Name: Defines a name for each property
  • Value: Defines the value associated with each property
  • Archives: Indicates the location of the Hadoop archives
  • Files: Indicates the location of the Hadoop files.
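
These properties correspond to Hadoop's generic options: each Name/Value pair becomes a -D option, and Files/Archives become -files and -archives. An illustrative Python sketch, assuming the program parses generic options through ToolRunner; all values are hypothetical:

  import subprocess

  # Generic Hadoop options inserted before the program arguments; values are hypothetical.
  subprocess.run([
      "hadoop", "jar", "/apps/mr/wordcount.jar", "com.example.WordCount",
      "-Dmapreduce.job.queuename=batch",  # Name=Value property (overrides the default)
      "-files", "/config/lookup.txt",     # Files
      "-archives", "/config/dicts.zip",   # Archives
      "/data/input", "/data/output",
  ], check=True)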

Pre/Post Commands

Defines commands that run before or after the Job is executed.
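
For example, a pre-command might stage a working directory before the Job runs and a post-command might remove it afterward. An illustrative Python sketch with a hypothetical path:

  import subprocess

  # Hypothetical pre- and post-commands surrounding the Job.
  subprocess.run(["hdfs", "dfs", "-mkdir", "-p", "/data/staging"], check=True)  # pre-command
  # ... the Job executes here ...
  subprocess.run(["hdfs", "dfs", "-rm", "-r", "/data/staging"], check=True)     # post-command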
