Data Processing Connection Profiles

The following topics describe the connection profile parameters for data processing platforms and services:

AWS Data Pipeline Connection Profile Parameters

The following table describes the AWS Data Pipeline connection profile parameters.

Parameter

Description

Data Pipeline URL

Determines the location of the AWS Data Pipeline resources.

https://datapipeline.us-east-1.amazonaws.com

For more information about regional endpoints available for the AWS Data Pipeline service, refer to the AWS documentation.

AWS Region

Determines the region where the AWS Data Pipeline jobs are located.

us-east-1

Authentication

Determines one of the following authentication methods:

  • AWS Key & Secret: Used for services outside the AWS infrastructure.

  • AWS IAM Role: Used for services within the AWS infrastructure.

AWS Access Key

Defines the AWS Data Pipeline account access key.

AWS Secret

Defines the AWS Data Pipeline account secret access key.

IAM Role

Defines the Identity and Access Management (IAM) role for the AWS Data Pipeline connection.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to AWS Data Pipeline before a timeout occurs.

Default: 30 seconds
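
The example Data Pipeline URL above embeds the same region code as the AWS Region parameter. As a minimal sketch, assuming the regional endpoint pattern shown in that example holds for your region, the URL can be derived from the region value. The helper name data_pipeline_url is hypothetical and is not part of Control-M. Refer to the AWS documentation for the regions in which the service is available.

```python
import re


def data_pipeline_url(region):
    """Build a regional AWS Data Pipeline endpoint from a region code.

    Assumption: the endpoint follows the pattern of the documented example,
    https://datapipeline.<region>.amazonaws.com
    """
    # Loose shape check for region codes such as us-east-1 or eu-west-2.
    if not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", region):
        raise ValueError(f"unexpected region format: {region!r}")
    return f"https://datapipeline.{region}.amazonaws.com"


print(data_pipeline_url("us-east-1"))
# https://datapipeline.us-east-1.amazonaws.com
```

Deriving both values from one region code avoids a profile in which the URL and the AWS Region parameter point to different regions.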

AWS EMR Connection Profile Parameters

The following table describes the AWS EMR connection profile parameters.

Parameter

Description

Region

Determines the region where the AWS EMR jobs are located.

us-east-1

EMR Access Key

Defines the token for the connection to AWS.

EMR Service Key

Defines an additional security token for AWS.

Azure Databricks Connection Profile Parameters

The following table describes the Azure Databricks connection profile parameters.

Parameter

Description

Tenant ID

Defines the Tenant ID in Azure AD.

Application ID

Defines the application (service principal) ID of the registered application.

The service principal must meet the following requirements:

  • The service principal must be an Azure Databricks workspace user and an admin. In the Databricks Admin Console, it must appear under users and also under admins.

  • The service principal must be associated with a Contributor or Owner role.

Client Secret

Defines the secret (password) associated with the Azure user and the application.

Azure Login URL

Defines the URL for login to Azure:

login.microsoftonline.com

Do not change the default value unless you are required to by your Azure Administrator.

Databricks URL

Defines the URL of your Databricks workspace.

Databricks Resource

Defines the resource parameter for the Azure Databricks login application:

2ff814a6-3304-4ab8-85cb-cd0e6f879c1d

Do not change the default value unless you are required to by your Azure Administrator.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Azure Databricks before a timeout occurs.

Default: 50 seconds
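
To illustrate how the Tenant ID, Application ID, Client Secret, Azure Login URL, and Databricks Resource parameters relate to each other, the following sketch builds (but does not send) an Azure AD client-credentials token request. It assumes the Azure AD v1 token endpoint. Whether Control-M issues exactly this request is an assumption, the function name build_token_request is hypothetical, and all credential values are placeholders.

```python
from urllib.parse import parse_qs, urlencode

# Default Databricks Resource value from the table above.
DATABRICKS_RESOURCE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def build_token_request(tenant_id, application_id, client_secret,
                        login_host="login.microsoftonline.com",
                        resource=DATABRICKS_RESOURCE):
    """Return (url, form_body) for an Azure AD v1 client-credentials request."""
    url = f"https://{login_host}/{tenant_id}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": application_id,      # Application ID parameter
        "client_secret": client_secret,   # Client Secret parameter
        "resource": resource,             # Databricks Resource parameter
    })
    return url, body


# Placeholder values only; nothing is sent over the network.
url, body = build_token_request("my-tenant-id", "my-app-id", "my-secret")
print(url)
print(parse_qs(body)["resource"])
```

A request of this shape would be sent as an HTTP POST with the body as form data. The access token in the response would then be presented to the Databricks URL.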

Azure HDInsight Connection Profile Parameters

The following table describes the Azure HDInsight connection profile parameters.

Parameter

Description

Cluster Name

Defines the name of the HDInsight cluster.

Cluster Username

Defines the username of the administrator account used to connect to Azure HDInsight.

Cluster Password

Defines the Administrator password, which is configured in Azure HDInsight.

Azure Synapse Connection Profile Parameters

The following table describes the Azure Synapse connection profile parameters.

Parameter

Description

Authentication Method

Determines one of the following identity types to connect to Azure Synapse Analytics:

  • Managed Identity: Enables you to access other Azure AD-protected resources. The identity is managed by the Azure platform and does not require you to provide credentials within Control-M. Use this option if the Agent is installed on an Azure virtual machine that has an assigned Managed Identity with the required permissions.

    Managed Identity authentication is based on an Azure token that is valid, by default, for 24 hours. Token lifetime can be extended by Azure.

  • Service Principal: An Azure service principal, also known as an App Registration, is an identity created for use with applications, hosted services, and automated tools to access Azure resources. This access is restricted by the roles assigned to the service principal, which gives the Azure Administrator control over which resources can be accessed and at which level. Use this option if the Agent is installed on-premises or with another cloud vendor.

To prepare for authentication with either method, do the following:

  • Grant your Managed Identity or Service Principal access to your Synapse workspace through the Synapse Studio (Manage > Access Control).

  • Assign a Contributor role to the Synapse workspace accessed by the Managed Identity or service principal.

Specify Managed Identity Client ID

(Managed Identity) Determines whether the client ID for your Managed Identity is specified by the Managed Identity Client ID parameter.

Select this checkbox if you are using the Managed Identity authentication method and you have multiple Managed Identities defined on your Azure virtual machine.

Managed Identity Client ID

(Managed Identity) Determines which client ID to use as the Managed Identity.

This parameter requires a value only if you have multiple Managed Identities defined on your Azure virtual machine and you selected the Specify Managed Identity Client ID checkbox.

If you have only one Managed Identity, it is detected automatically.

Azure AD URL

(Service Principal) Defines the Azure AD authentication endpoint base URL.

https://login.microsoftonline.com

Tenant ID

(Service Principal) Defines the Tenant ID in Azure AD.

App ID

Defines the application (service principal) ID of the registered application for the Azure Synapse service.

Client Secret

(Service Principal) Defines the secret (password) associated with the Azure user and the application.

Synapse URL

Defines the workspace development endpoint.

https://myworkspace.dev.azuresynapse.net

Synapse Resource

Defines the resource parameter that serves as the identifier for Azure Synapse login via Azure AD:

https://dev.azuresynapse.net/

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Azure Synapse Analytics before a timeout occurs.

Default: 50 seconds
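
The Managed Identity Client ID parameter is only needed when several identities are assigned to the virtual machine. The following sketch shows why, by building the token request URL for the Azure Instance Metadata Service (IMDS), where client_id is an optional query parameter. Whether Control-M calls IMDS in exactly this way is an assumption, and the function name managed_identity_token_url and the client ID value are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# IMDS token endpoint, reachable only from inside an Azure virtual machine.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def managed_identity_token_url(resource, client_id=None):
    """Build the IMDS token URL for a resource.

    client_id is included only when a specific Managed Identity must be
    selected; with a single identity on the VM it can be omitted.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id:
        params["client_id"] = client_id
    return f"{IMDS_TOKEN_URL}?{urlencode(params)}"


# The request must also carry the HTTP header "Metadata: true".
single = managed_identity_token_url("https://dev.azuresynapse.net/")
chosen = managed_identity_token_url("https://dev.azuresynapse.net/",
                                    client_id="my-identity-client-id")
print(parse_qs(urlparse(single).query))
print(parse_qs(urlparse(chosen).query))
```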

Databricks Connection Profile Parameters

The following table describes the Databricks connection profile parameters.

Parameter

Description

Databricks Workspace URL

Defines the URL of your Databricks workspace.

Databricks Personal Access Token

Defines a Databricks token for authentication of connections to the Databricks workspace.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Databricks before a timeout occurs.

Default: 50 seconds
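
As a minimal sketch of how a personal access token is typically presented to a Databricks workspace, the following example builds (but does not send) a REST request with the token as a bearer credential. The workspace URL and token are placeholders, the function name build_databricks_request is hypothetical, and the clusters list path is used here only as a simple connectivity check. How Control-M itself tests the connection is not described in this topic.

```python
import urllib.request


def build_databricks_request(workspace_url, personal_access_token,
                             path="/api/2.0/clusters/list"):
    """Build a GET request that authenticates with a personal access token."""
    url = workspace_url.rstrip("/") + path
    headers = {"Authorization": f"Bearer {personal_access_token}"}
    return urllib.request.Request(url, headers=headers)


# Placeholder values; replace with your Databricks Workspace URL and token.
req = build_databricks_request("https://example.cloud.databricks.com/",
                               "dapi-placeholder-token")
print(req.full_url)

# To send it with the default Connection Timeout from the table above:
# urllib.request.urlopen(req, timeout=50)
```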

DBT Connection Profile Parameters

The following table describes the DBT (Data Build Tool) connection profile parameters.

Parameter

Description

DBT URL

Defines the DBT authentication endpoint.

Default: https://cloud.getdbt.com

DBT Token

Defines the authentication code that is used to create a connection to the DBT platform.

This is located in the API Access section in the DBT Cloud platform.

Account ID

Defines the unique ID that is assigned to your DBT Cloud account.

This is located in the Account Info section in the DBT Cloud platform.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to DBT before a timeout occurs.

Default: 60

GCP BigQuery Connection Profile Parameters

The following table describes the GCP BigQuery connection profile parameters.

Parameter

Description

Identity Type

Determines one of the following authentication types using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • IAM: Authenticates based on a detected IAM role, which removes the need to provide additional credentials.

GCP BigQuery URL

Defines the Google Cloud Platform (GCP) authentication endpoint for BigQuery.

https://bigquery.googleapis.com

Service Account Key

(Service Account) Defines a service account key that is associated with an RSA key pair.

GCP Dataflow Connection Profile Parameters

The following table describes the Google Cloud Platform (GCP) Dataflow connection profile parameters.

Parameter

Description

Identity Type

Defines one of the following types of authentication to perform using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • Managed Identity: Does not require credentials. Available on GCP VMs only.

Dataflow URL

Defines the Google Cloud Platform (GCP) authentication endpoint.

Required only for Service Account authentication.

https://dataflow.googleapis.com

Service Account Key

Defines a JSON body that contains the required service account credentials to access GCP.

Required only for Service Account authentication.

Service account JSON format:

{
   "type": "service_account",
   "project_id": "project-id",
   "private_key_id": "key-id",
   "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
   "client_email": "service-account-email",
   "client_id": "client-id",
   "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   "token_uri": "https://accounts.google.com/o/oauth2/token",
   "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
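
Because the Service Account Key is pasted as raw JSON, a quick structural check before saving the connection profile can catch truncated or mis-edited keys. The following is a minimal sketch, assuming that the fields listed in the format above are all required. The function name check_service_account_key is hypothetical and the key values are placeholders.

```python
import json

# Field names taken from the service account JSON format shown above.
REQUIRED_FIELDS = (
    "type", "project_id", "private_key_id", "private_key", "client_email",
    "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
)


def check_service_account_key(raw):
    """Return a list of problems; an empty list means the key looks complete."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not key.get(name)]
    kind = key.get("type")
    if kind and kind != "service_account":
        problems.append(f"unexpected type: {kind}")
    return problems


# Placeholder key built from the documented field names.
sample = {name: "placeholder" for name in REQUIRED_FIELDS}
sample["type"] = "service_account"
print(check_service_account_key(json.dumps(sample)))  # []
```

The check only verifies the structure of the key. It does not confirm that the credentials are valid in GCP.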

GCP Dataproc Connection Profile Parameters

The following table describes the Google Cloud Platform (GCP) Dataproc connection profile parameters.

Parameter

Description

Identity Type

Defines one of the following types of authentication to perform using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • Managed Identity: Does not require credentials. Available on GCP VMs only.

Dataproc URL

Defines the Google Cloud Platform (GCP) authentication endpoint.

Required only for Service Account authentication.

https://dataproc.googleapis.com

Service Account Key

Defines a JSON body that contains the required service account credentials to access GCP.

Required only for Service Account authentication.

Service account JSON format:

{
   "type": "service_account",
   "project_id": "project-id",
   "private_key_id": "key-id",
   "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
   "client_email": "service-account-email",
   "client_id": "client-id",
   "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   "token_uri": "https://accounts.google.com/o/oauth2/token",
   "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}

Connection Timeout

Defines a timeout value, in seconds, for the trigger call to Google Cloud Platform.

Default: 20 seconds

Snowflake Connection Profile Parameters

The following table describes the Snowflake connection profile parameters.

Parameter

Description

Account Identifier

Defines the Snowflake account identifier.

To obtain this string, run the DESCRIBE SECURITY INTEGRATION command in Snowflake and copy the first segment of the host name from one of the authorization properties.

For example, if OAUTH_AUTHORIZATION_ENDPOINT has the following value:

https://abc123.us-east-1.snowflakecomputing.com/oauth/authorize

then abc123 is the account identifier.

For more information about obtaining values for the parameters required by the connection profile, see Setting Up a Snowflake API Connection.

Region

Determines the region where the Snowflake jobs are located.

us-east-1

Client ID

Defines the client ID assigned to the account in the Snowflake integration setup.

Client Secret

Defines the client secret assigned to the account in the Snowflake integration setup.

Refresh Token

Defines the value for the refresh token.

Rule: This string must be URL-encoded.

Redirect URI

Defines the redirect URI assigned to the account in the Snowflake integration setup.

Rule: This string must be URL-encoded.
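
Two of the steps above are simple string operations: taking the account identifier from the authorization endpoint, and URL-encoding the Refresh Token and Redirect URI values. The following minimal sketch uses only the Python standard library. The function names and the redirect URI value are hypothetical, and the sketch assumes that every reserved character must be percent-encoded.

```python
from urllib.parse import quote, urlparse


def account_identifier(authorization_endpoint):
    """Return the first label of the endpoint's host name."""
    host = urlparse(authorization_endpoint).hostname or ""
    return host.split(".")[0]


def url_encode(value):
    """Percent-encode every reserved character, including ':' and '/'."""
    return quote(value, safe="")


endpoint = "https://abc123.us-east-1.snowflakecomputing.com/oauth/authorize"
print(account_identifier(endpoint))                # abc123
print(url_encode("https://example.com/callback"))  # https%3A%2F%2Fexample.com%2Fcallback
```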