Data Processing and Analytics Connection Profiles

The following topics describe the connection profile parameters for data processing platforms and services:

AWS Athena Connection Profile Parameters

The following table describes the AWS Athena connection profile parameters.

Parameter

Description

AWS API Base URL

Defines the AWS Athena API authentication endpoint.

https://athena.us-east-1.amazonaws.com

AWS Region

Determines the region where the AWS Athena jobs are located.

us-east-1

Authentication

Determines one of the following authentication methods:

  • AWS Key: Used for services outside the AWS infrastructure.

  • AWS IAM Role: Used for services within the AWS infrastructure.

AWS Access Key

Defines the AWS Athena account access key.

AWS Secret Key

Defines the AWS Athena account secret access key.

IAM Role

Defines the Identity and Access Management (IAM) role for the AWS Athena connection.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to AWS Athena before a timeout occurs.

Default: 20
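
The example values above suggest that the AWS API Base URL embeds the same region that is set in AWS Region. As a minimal sketch under that assumption, the following Python helpers derive the endpoint from a region and check a URL against it. The function names are hypothetical and are not part of Control-M or any AWS SDK, and the URL pattern is inferred only from the example value shown in this table.

```python
from urllib.parse import urlparse


def athena_endpoint(region: str) -> str:
    """Build a regional endpoint that follows the pattern of the example URL."""
    return f"https://athena.{region}.amazonaws.com"


def url_matches_region(base_url: str, region: str) -> bool:
    """Return True if the host in base_url is the Athena endpoint for region."""
    host = urlparse(base_url).hostname or ""
    return host == f"athena.{region}.amazonaws.com"
```

A check of this kind can catch a profile in which the URL and the region were edited independently and no longer agree.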

AWS Data Pipeline Connection Profile Parameters

The following table describes the AWS Data Pipeline connection profile parameters.

Parameter

Description

Data Pipeline URL

Defines the AWS Data Pipeline API authentication endpoint.

https://datapipeline.us-east-1.amazonaws.com

For more information about regional endpoints available for the AWS Data Pipeline service, refer to the AWS documentation.

AWS Region

Determines the region where the AWS Data Pipeline jobs are located.

us-east-1

Authentication

Determines one of the following authentication methods:

  • AWS Key & Secret: Used for services outside the AWS infrastructure.

  • AWS IAM Role: Used for services within the AWS infrastructure.

AWS Access Key

Defines the AWS Data Pipeline account access key.

AWS Secret

Defines the AWS Data Pipeline account secret access key.

IAM Role

Defines the Identity and Access Management (IAM) role for the AWS Data Pipeline connection.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to AWS Data Pipeline before a timeout occurs.

Default: 30

AWS EMR Connection Profile Parameters

The following table describes the AWS EMR connection profile parameters.

Parameter

Description

Region

Determines the region where the AWS EMR jobs are located.

us-east-1

EMR Access Key

Defines the token for the connection to AWS.

EMR Service Key

Defines an additional security token for AWS.

Azure Databricks Connection Profile Parameters

The following table describes the Azure Databricks connection profile parameters.

Parameter

Description

Tenant ID

Defines the Tenant ID in Azure AD.

Application ID

Defines the application (service principal) ID of the registered application.

The service principal must meet the following requirements:

  • The service principal must be an Azure Databricks workspace user and an admin. In the Databricks Admin Console, it must appear under users and also under admins.

  • The service principal must be associated with a Contributor or Owner role.

Client Secret

Defines the secret (password) associated with the Azure user and the application.

Azure Login URL

Defines the URL for login to Azure:

https://login.microsoftonline.com

Do not change the default value unless you are required to by your Azure Administrator.

Databricks URL

Defines the URL of your Databricks workspace.

Databricks Resource

Defines the resource parameter for the Azure Databricks login application:

2ff814a6-3304-4ab8-85cb-cd0e6f879c1d

Do not change the default value unless you are required to by your Azure Administrator.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Azure Databricks before a timeout occurs.

Default: 50 seconds
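
To show how the Tenant ID, Application ID, Client Secret, Azure Login URL, and Databricks Resource values relate to one another, the following Python sketch assembles an OAuth 2.0 client-credentials token request for Azure AD. It assumes the Azure AD token endpoint path `/<tenant>/oauth2/token`, which accepts a `resource` form field; whether Control-M issues exactly this request is not stated in this document. The function name is hypothetical, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode


def build_token_request(login_url, tenant_id, application_id,
                        client_secret, resource):
    """Return (url, form_body) for a client-credentials token request."""
    # Assumed endpoint layout: <login URL>/<tenant ID>/oauth2/token
    url = f"{login_url.rstrip('/')}/{tenant_id}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": application_id,
        "client_secret": client_secret,
        "resource": resource,
    })
    return url, body
```

The body is form-encoded and would be sent with the `application/x-www-form-urlencoded` content type.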

Azure HDInsight Connection Profile Parameters

The following table describes the Azure HDInsight connection profile parameters.

Parameter

Description

Cluster Name

Defines the name of the HDInsight cluster.

Cluster Username

Defines the name of the Administrator to use to connect to Azure HDInsight.

Cluster Password

Defines the Administrator password, which is configured in Azure HDInsight.

Azure Synapse Connection Profile Parameters

The following table describes the Azure Synapse connection profile parameters.

Parameter

Description

Authentication Method

Determines one of the following identity types to connect to Azure Synapse Analytics:

  • Managed Identity: Enables you to access other Azure AD-protected resources. The identity is managed by the Azure platform and does not require you to provide credentials within Control-M. Use this option if the Agent is installed on an Azure virtual machine that has an assigned Managed Identity with the required permissions.

    Managed Identity authentication is based on an Azure token that is valid, by default, for 24 hours. Token lifetime can be extended by Azure.

  • Service Principal: An Azure service principal, also known as App Registration, is an identity created for use with applications, hosted services, and automated tools to access Azure resources. This access is restricted by the roles assigned to the service principal, which gives the Azure Administrator control over which resources can be accessed and at which level. Use this option if the Agent is installed on-premises or with any other cloud vendor.

To prepare for authentication using each of these methods:

  • Grant your Managed Identity or Service Principal access to your Synapse workspace through the Synapse Studio (Manage > Access Control).

  • Assign a Contributor role to the Synapse workspace accessed by the Managed Identity or service principal.

Specify Managed Identity Client ID

(Managed Identity) Determines whether the client ID for your Managed Identity is specified by the Managed Identity Client ID parameter.

Select this checkbox if you are using the Managed Identity authentication method and you have multiple Managed Identities defined on your Azure virtual machine.

Managed Identity Client ID

(Managed Identity) Determines which client ID to use as the Managed Identity.

This parameter requires a value only if you have multiple Managed Identities defined on your Azure virtual machine and you selected the Specify Managed Identity Client ID checkbox.

If you have only one Managed Identity, it is detected automatically.

Azure AD URL

(Service Principal) Defines the Azure AD authentication endpoint base URL.

https://login.microsoftonline.com

Tenant ID

(Service Principal) Defines the Tenant ID in Azure AD.

App ID

Defines the application (service principal) ID of the registered application for the Azure Synapse service.

Client Secret

(Service Principal) Defines the secret (password) associated with the Azure user and the application.

Synapse URL

Defines the workspace development endpoint.

https://myworkspace.dev.azuresynapse.net

Synapse Resource

Defines the resource parameter that serves as the identifier for Azure Synapse login via Azure AD:

https://dev.azuresynapse.net/

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Azure Synapse Analytics before a timeout occurs.

Default: 50 seconds

Databricks Connection Profile Parameters

The following table describes the Databricks connection profile parameters.

Parameter

Description

Databricks Workspace URL

Defines the URL of your Databricks workspace.

Databricks Personal Access Token

Defines a Databricks token for authentication of connections to the Databricks workspace.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to Databricks before a timeout occurs.

Default: 50 seconds

DBT Connection Profile Parameters

The following table describes the DBT (Data Build Tool) connection profile parameters.

Parameter

Description

DBT URL

Defines the DBT API authentication endpoint.

Default: https://cloud.getdbt.com

DBT Token

Defines the authentication code that is used to create a connection to the DBT platform.

This is located in the API Access section in the DBT Cloud platform.

Account ID

Defines the unique ID that is assigned to your DBT Cloud account.

This is located in the Account Info section in the DBT Cloud platform.

Connection Timeout

Determines the number of seconds to wait after Control-M initiates a connection request to DBT before a timeout occurs.

Default: 60
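
As an illustration of how the DBT URL, DBT Token, and Account ID values might combine, the following Python sketch builds (but does not send) a request for the account resource. It assumes the dbt Cloud v2 REST path `/api/v2/accounts/<account_id>/` and the `Authorization: Token <token>` header scheme; the function name is hypothetical, and the exact calls that Control-M makes are not described in this document.

```python
from urllib.request import Request


def build_account_request(dbt_url, token, account_id):
    """Return an unsent Request for the account resource."""
    # Assumed path layout: <DBT URL>/api/v2/accounts/<Account ID>/
    url = f"{dbt_url.rstrip('/')}/api/v2/accounts/{account_id}/"
    headers = {
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` with a `timeout` argument would mirror the Connection Timeout parameter.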

GCP BigQuery Connection Profile Parameters

The following table describes the GCP BigQuery connection profile parameters.

Parameter

Description

Identity Type

Determines one of the following authentication types using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • IAM: Authenticates based on a detected IAM role, which removes the need to provide additional credentials.

GCP BigQuery URL

Defines the Google Cloud Platform (GCP) authentication endpoint for BigQuery.

https://bigquery.googleapis.com

Service Account Key

(Service Account) Defines a service account that is associated with an RSA key pair.

GCP Dataflow Connection Profile Parameters

The following table describes the Google Cloud Platform (GCP) Dataflow connection profile parameters.

Parameter

Description

Identity Type

Determines one of the following authentication types using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • IAM: Authenticates based on a detected IAM role, which removes the need to provide additional credentials. This is only available on GCP virtual machines.

Dataflow URL

Defines the Google Cloud Platform (GCP) authentication endpoint for Dataflow.

https://dataflow.googleapis.com

Service Account Key

(Service Account) Defines a JSON body that contains the required service account credentials to access GCP.

Service account JSON format:

{
   "type": "service_account",
   "project_id": "project-id",
   "private_key_id": "key-id",
   "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
   "client_email": "service-account-email",
   "client_id": "client-id",
   "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   "token_uri": "https://accounts.google.com/o/oauth2/token",
   "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
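
Before pasting a key into the Service Account Key field, it can help to confirm that the JSON body contains every field shown in the format above. The following Python sketch is one way to do that. The function name, the problem messages, and the rule that every field must be non-empty are assumptions made for illustration, not documented Control-M or GCP behavior.

```python
import json

# Field names taken from the service account JSON format shown above.
EXPECTED_FIELDS = (
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
)


def check_service_account_key(raw_json):
    """Return a list of problems found in the key; empty if none."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}"
                for name in EXPECTED_FIELDS if not data.get(name)]
    # A present but unexpected "type" usually means the wrong kind of key.
    if data.get("type") not in (None, "", "service_account"):
        problems.append('"type" is not "service_account"')
    return problems
```

An empty list means the body has the expected shape; it does not confirm that the credentials themselves are valid.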

GCP Dataproc Connection Profile Parameters

The following table describes the Google Cloud Platform (GCP) Dataproc connection profile parameters.

Parameter

Description

Identity Type

Determines one of the following authentication types using GCP Access Control:

  • Service Account: Authenticates using an application ID (service account) and client secret.

  • IAM: Authenticates based on a detected IAM role, which removes the need to provide additional credentials. This is only available on GCP virtual machines.

Dataproc URL

Defines the Google Cloud Platform (GCP) authentication endpoint for Dataproc.

https://dataproc.googleapis.com

Service Account Key

(Service Account) Defines a JSON body that contains the required service account credentials to access GCP.

Service account JSON format:

{
   "type": "service_account",
   "project_id": "project-id",
   "private_key_id": "key-id",
   "private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
   "client_email": "service-account-email",
   "client_id": "client-id",
   "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   "token_uri": "https://accounts.google.com/o/oauth2/token",
   "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}

Connection Timeout

Defines a timeout value, in seconds, for the trigger call to Google Cloud Platform.

Default: 20 seconds

Snowflake Connection Profile Parameters

The following table describes the Snowflake connection profile parameters.

This connection profile uses token-based authentication. To authenticate using an Identity Provider (IdP), see Snowflake IdP Connection Profile Parameters.

Parameter

Description

Account Identifier

Defines the Snowflake account identifier.

To obtain this string, execute the Describe Security Integration command in Snowflake and copy the initial string from one of the authorization properties.

OAUTH_AUTHORIZATION_ENDPOINT has the following value:

https://abc123.us-east-1.snowflakecomputing.com/oauth/authorize

abc123 is the account identifier.

For more information about obtaining values for the parameters required by the connection profile, see Setting Up a Snowflake API Connection.

Region

Determines the region where the Snowflake jobs are located.

us-east-1

Client ID

Defines the client ID assigned to the account in the Snowflake integration setup.

Client Secret

Defines the client secret assigned to the account in the Snowflake integration setup.

Refresh Token

Defines the value for the refresh token.

Rule: This string must be URL-encoded.

Redirect URI

Defines the redirect URI assigned to the account in the Snowflake integration setup.

Rule: This string must be URL-encoded.
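
Two of the steps above lend themselves to a short script: extracting the account identifier from an authorization property, and URL-encoding the Refresh Token and Redirect URI values. The following Python sketch shows both. The function names are hypothetical, and the choice to percent-encode every reserved character (including `/` and `:`) is an assumption about what "URL-encoded" requires here.

```python
from urllib.parse import quote, urlparse


def account_identifier(endpoint_url: str) -> str:
    """Return the initial label of the host name in endpoint_url."""
    host = urlparse(endpoint_url).hostname or ""
    return host.split(".")[0]


def url_encode(value: str) -> str:
    """Percent-encode value, leaving only unreserved characters as-is."""
    return quote(value, safe="")
```

For example, `account_identifier("https://abc123.us-east-1.snowflakecomputing.com/oauth/authorize")` returns `abc123`, and `url_encode("https://example.com/callback")` returns `https%3A%2F%2Fexample.com%2Fcallback`, where the redirect URI is a placeholder value.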

Snowflake IdP Connection Profile Parameters

The following table describes the Snowflake Identity Provider (IdP) connection profile parameters.

This connection profile authenticates using an Identity Provider (IdP). To use token-based authentication, see Snowflake Connection Profile Parameters.

Parameter

Description

Account Identifier

Determines the Snowflake IdP account identifier.

To obtain this string, run the Describe Security Integration command in Snowflake and copy the initial string from one of the authorization properties.

EXTERNAL_OAUTH_AUDIENCE_LIST has the following value:

https://abc123.us-east-1.snowflakecomputing.com

abc123 is the account identifier.

For information about the values for the parameters required by the connection profile, see the IdP-specific External OAuth configuration instructions in the Snowflake documentation.

Region

Determines the region where the Snowflake jobs are located.

us-east-1

Client ID

Defines the client ID assigned to the account in the Snowflake integration setup.

Client Secret

Defines the client secret assigned to the account in the Snowflake integration setup.

IDP URL

Defines the authentication endpoint for Snowflake IdP.

Scope

Defines the scope, which limits the operations you can do and the roles you can use in the Snowflake IdP plug-in.

Define the scope as follows:

session:role:<custom_role>

session:role:sysadmin
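
The scope format above can be generated from a role name, as in the following Python sketch. The function name is hypothetical, and the validation rules (no whitespace and no colon in the role name) are assumptions added for illustration rather than documented Snowflake constraints.

```python
def session_role_scope(role: str) -> str:
    """Build a scope string of the form session:role:<custom_role>."""
    role = role.strip()
    # Assumed sanity checks; adjust to the naming rules of your IdP.
    if not role or ":" in role or any(ch.isspace() for ch in role):
        raise ValueError(f"invalid role name: {role!r}")
    return f"session:role:{role}"
```

For example, `session_role_scope("sysadmin")` returns `session:role:sysadmin`.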