When migrating from Matillion ETL (METL) to Matillion Data Productivity Cloud (DPC), one key difference you may encounter is how Python scripts interact with your data warehouse. In METL, Python Script components using the Jython interpreter could easily execute SQL queries against the environment warehouse using the context.cursor() function. However, DPC does not support Jython and doesn't provide this cursor functionality out of the box.
In this blog post, we'll show you how to maintain compatibility with your existing Python scripts by implementing cursor functionality in DPC, making your migration process smoother and more efficient.
Understanding the Challenge
While best practice would be to refactor your Python scripts to use components like Query Result to Grid, Query Result to Scalar, or SQL Script, we understand that migration timelines often require quicker solutions. To enable faster migrations to DPC, we've developed code snippets that replicate the context.cursor() function from METL.
Important Notes Before You Begin
This method will only work with a Hybrid SaaS agent, as the Python Script component is not available on Full SaaS.
The implementation uses your agent's cloud credentials to access either AWS Secrets Manager or Azure Key Vault.
We recommend creating the required variables as project variables for better management.
For Azure-based implementations, you'll need to add specific Python packages to the agent via the Extension Library Location parameter in the deployment template.
Implementation Options
We've prepared cursor implementations for various data platforms. Select the one that matches your environment:
Databricks on AWS (using a personal access token)
Required Project Variables:
databricks_server_hostname: The Server Hostname value for your cluster or SQL warehouse
databricks_http_path: The HTTP Path value for your cluster or SQL warehouse
databricks_pat_secret_name: The name of the secret in AWS Secrets Manager containing the personal access token
databricks_pat_secret_key: The key within the secret which has the personal access token as its value
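As a rough illustration of how these four variables could fit together, here is a minimal sketch. It is not the official snippet. It assumes the boto3 and databricks-sql-connector packages are available to the agent, that the agent's environment supplies AWS credentials and a default region, and that the secret is stored as a JSON key/value string. The function names extract_secret_value and get_databricks_cursor are hypothetical, chosen only for this example.

```python
import json


def extract_secret_value(secret_string, key):
    """Return the value stored under `key` in a JSON key/value secret string."""
    secret = json.loads(secret_string)
    if key not in secret:
        raise KeyError(f"Key {key!r} not found in secret")
    return secret[key]


def get_databricks_cursor(server_hostname, http_path, secret_name, secret_key):
    """Open a Databricks SQL connection and return a cursor (illustrative only)."""
    # Imported lazily so the helper above works without these packages installed.
    # Assumption: both packages are available on the Hybrid SaaS agent.
    import boto3
    from databricks import sql

    # Assumption: boto3 picks up the agent's AWS credentials and region
    # from its environment.
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    token = extract_secret_value(response["SecretString"], secret_key)

    connection = sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        access_token=token,
    )
    return connection.cursor()
```

In a Python Script component you would pass the values of the four project variables listed above as the arguments. How the returned cursor is then exposed as context.cursor() depends on the snippet for your platform; this sketch simply returns it.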
To implement cursor functionality in your DPC Python scripts:
Identify which database platform and cloud provider you're using from the options above.
Create the required project variables in your DPC project.
If using Azure, make sure to add the necessary Python packages to your agent.
Copy the appropriate code snippet and add it to the beginning of your Python Script component.
Once you've added the cursor setup code, you can continue using context.cursor() in your Python scripts just as you did in METL. The function will now connect to your data warehouse and return a cursor object that you can use to execute SQL queries.
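For reference, a typical METL-style script body that relies on the cursor looks something like the sketch below. So that the example runs anywhere, it uses a hypothetical StandInContext class backed by an in-memory SQLite database in place of the real context object; in a DPC Python Script component you would omit that class and the assignment to context, and rely on the cursor setup code instead.

```python
import sqlite3


class StandInContext:
    """Hypothetical stand-in for the script's context object, for local runs only."""

    def __init__(self):
        self._connection = sqlite3.connect(":memory:")

    def cursor(self):
        # Returns a DB-API style cursor with execute() and fetchall().
        return self._connection.cursor()


# Stand-in only: in a real Python Script component, context already exists.
context = StandInContext()

# The part below is the pattern carried over from METL scripts.
cursor = context.cursor()
cursor.execute("SELECT 1 AS one, 'a' AS letter")
rows = cursor.fetchall()
for row in rows:
    print(row[0], row[1])
```

The query here is a placeholder. Any SQL your warehouse accepts can be passed to execute(), and results are read back with fetchall() or fetchone() in the usual way.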
Long-Term Considerations
While this approach provides a quick solution for migration, we recommend gradually refactoring your Python scripts to use the native DPC components (Query Result to Grid, Query Result to Scalar, or SQL Script) for better performance and maintainability. This cursor implementation should be seen as a migration aid rather than a permanent solution.
Conclusion
By implementing cursor functionality in Matillion DPC, you can significantly ease the migration process from Matillion ETL. This approach allows you to maintain compatibility with your existing Python scripts while you transition to DPC's native components at your own pace.
For teams with tight migration timelines or extensive Python Script usage in METL, these cursor implementations provide a practical pathway to DPC adoption without requiring immediate refactoring of all Python scripts.
Need more help with your migration to Matillion Data Productivity Cloud? Contact our support team for personalized assistance.
Chris Upton
Staff Data Engineer
An experienced data professional, I've been working with databases for over 25 years. I have deep knowledge of SQL and have worked with a variety of relational and NoSQL databases, both on-premises and in the cloud.