Control-M for Databricks
Databricks is a cloud-based data analytics platform that enables you to process and analyze large workloads of data.
Control-M for Databricks enables you to do the following:

- Execute Databricks jobs.
- Manage Databricks credentials in a secure connection profile.
- Connect to any Databricks endpoint.
- Apply all Control-M capabilities to Databricks jobs, including advanced scheduling criteria, complex dependencies, Resource Pools, Lock Resources, and variables.
- Integrate Databricks jobs with other Control-M jobs into a single scheduling environment.
- Monitor the status, results, and output of Databricks jobs.
- Attach an SLA job to Databricks jobs.
- Run up to 50 Databricks jobs simultaneously per Agent.
Control-M for Databricks Compatibility
The following table lists the components that are required to use the Databricks plug-in, each with its minimum required version.

Component | Version
---|---
Control-M/EM | 9.0.20.200
Control-M/Agent | 9.0.20.200
Control-M Application Integrator | 9.0.20.201
Control-M Automation API | 9.0.20.245
Control-M for Databricks is supported on Control-M Web and Control-M Automation API, but not on the Control-M client.
To download the required installation files for each prerequisite, see Obtaining Control-M Installation Files.
Setting up Control-M for Databricks
This procedure describes how to deploy the Databricks plug-in, create a connection profile, and define a Databricks job in Control-M Web or Automation API.
Integration plug-ins released by BMC require an Application Integrator installation. However, these plug-ins are not editable and you cannot import them into Application Integrator. To deploy these integrations to your Control-M environment, import them directly into Control-M with Control-M Automation API.
Before You Begin
Verify that Automation API is installed, as described in Automation API Installation.
Begin
1. Create a temporary directory to save the downloaded files.
2. Download the Databricks plug-in from the Control-M for Databricks download page on the EPD site.
3. Install the Databricks plug-in via one of the following methods:
   - (9.0.21 or higher) Use the Automation API Provision service, as follows:
     1. Log in to the Control-M/EM Server machine as an Administrator and store the downloaded zip file in one of the following locations:
        - Linux: $HOME/ctm_em/AUTO_DEPLOY
        - Windows: <EM_HOME>\AUTO_DEPLOY
     2. Log in to the Control-M/Agent machine and run the provision image command, as follows:
        - Linux: ctm provision image DBX_plugin.Linux
        - Windows: ctm provision image DBX_plugin.Windows
   - (9.0.20.200 or lower) Run the Automation API Deploy service, as described in deploy jobtype.
4. Create a Databricks connection profile in Control-M Web or Automation API, as follows:
   - Web: Create a Centralized Connection Profile with Databricks Connection Profile Parameters
   - Automation API: ConnectionProfile:Databricks
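With Automation API, a connection profile is defined as a JSON file and pushed to Control-M with the ctm deploy command. The sketch below illustrates the general shape of such a definition; the profile name, URL, and attribute names ("Databricks URL", "Databricks Token") are illustrative assumptions, so check ConnectionProfile:Databricks in the Automation API documentation for the exact parameter names required by your plug-in version.

```json
{
  "DBX_CONNECTION": {
    "Type": "ConnectionProfile:Databricks",
    "Centralized": true,
    "Databricks URL": "https://<your-instance>.cloud.databricks.com",
    "Databricks Token": "<personal-access-token>"
  }
}
```

Saving this as a file (for example, dbx-cp.json) and running ctm deploy dbx-cp.json makes the centralized profile available to jobs on any Agent where the plug-in is deployed.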
5. Define a Databricks job in Control-M Web or Automation API, as follows:
   - Web: Create a Job with Databricks Job parameters
   - Automation API: Job:Databricks
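With Automation API, a Databricks job is defined in a JSON file as a job inside a folder and deployed the same way as any other Control-M job definition. The sketch below shows the general structure only; the folder name, server, host, and job attribute names ("Databricks Job ID", "Parameters") are illustrative assumptions, so refer to Job:Databricks in the Automation API documentation for the exact attributes.

```json
{
  "DatabricksFolder": {
    "Type": "Folder",
    "ControlmServer": "ctmserver",
    "RunDatabricksJob": {
      "Type": "Job:Databricks",
      "ConnectionProfile": "DBX_CONNECTION",
      "Host": "agent-host",
      "Databricks Job ID": "123"
    }
  }
}
```

Deploying this file with ctm deploy, and then ordering the folder, triggers the referenced Databricks job and lets Control-M monitor its status and output alongside your other jobs.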
To remove this plug-in from an Agent, see Removing a Plug-in. The plug-in ID is DBX032022.
Change Log
The following table provides details about changes that were introduced in new versions of this plug-in:
Plug-in Version | Details
---|---
1.0.06 | Added the Failure Tolerance job parameter
1.0.05 | Added semantic changes
1.0.04 | Removed the Job Name attribute
1.0.03 | Added a new job icon
1.0.02 | Added an idempotency enhancement
1.0.01 | Added a multiple-task enhancement
1.0.00 | Initial version