Increase Concurrent Connections, decrease ETL run time
Until now, each job you run in Matillion has made only one connection to the target database: Amazon Redshift, Snowflake, or BigQuery. Even on an instance with 8 threads, where 8 jobs can run in parallel, jobs can be delayed while they wait for a connection to the target database. To improve performance and efficiency, we have released a new Enterprise feature, Concurrent Connections, available for both Amazon Redshift and Snowflake in Matillion ETL v1.41.
Users can now set a maximum number of concurrent connections at the environment level. This is the maximum number of target database connections that Matillion will create as part of a single job run.
When set to the maximum, each Matillion thread will have its own connection to the target database; the number of threads varies by instance size. This feature is accessible within the environment management settings.
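To make the idea concrete, here is a minimal sketch of how a connection limit interacts with parallel threads. It is a generic model and not Matillion's implementation: `run_job`, `fake_load`, and the sleep-based timing are hypothetical stand-ins, and a semaphore plays the role of the target database connections.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def run_job(num_tasks: int, num_threads: int, max_connections: int,
            task_seconds: float = 0.05) -> Tuple[float, int]:
    """Simulate parallel loads that share a limited number of connections.

    Hypothetical model: returns (elapsed_seconds, peak_connections_in_use).
    """
    # The semaphore stands in for the pool of target database connections.
    connections = threading.BoundedSemaphore(max_connections)
    lock = threading.Lock()
    in_use = 0
    peak = 0

    def fake_load(_: int) -> None:
        nonlocal in_use, peak
        with connections:  # a task blocks here until a connection is free
            with lock:
                in_use += 1
                peak = max(peak, in_use)
            time.sleep(task_seconds)  # pretend to run a load
            with lock:
                in_use -= 1

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        list(executor.map(fake_load, range(num_tasks)))
    return time.perf_counter() - start, peak


if __name__ == "__main__":
    for limit in (1, 8):
        elapsed, peak = run_job(num_tasks=8, num_threads=8, max_connections=limit)
        print(f"max_connections={limit}: {elapsed:.2f}s, peak connections={peak}")
```

With a limit of 1, the eight simulated tasks queue behind a single connection even though eight threads are available; with a limit of 8, each thread gets its own. In a real job the gain depends on how much of the run time is spent waiting for a connection.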
Take an example job that creates 8 tables in parallel and loads data into them using the S3 Load component.
Running it with one connection takes 30 seconds; increasing the concurrent connections to 8 and running it again takes 19 seconds.
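For a sense of scale, the two timings above can be turned into a percentage reduction and a speedup factor. The helper name `improvement` is made up for this sketch; the only inputs are the 30-second and 19-second figures from the example.

```python
from typing import Tuple


def improvement(before_seconds: float, after_seconds: float) -> Tuple[float, float]:
    """Return (percent reduction in run time, speedup factor)."""
    reduction_pct = (before_seconds - after_seconds) / before_seconds * 100
    speedup = before_seconds / after_seconds
    return reduction_pct, speedup


if __name__ == "__main__":
    reduction_pct, speedup = improvement(30, 19)
    # Prints: 36.7% less run time, 1.58x faster
    print(f"{reduction_pct:.1f}% less run time, {speedup:.2f}x faster")
```

These numbers describe this one example job only; other jobs may see a larger or smaller difference.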
To decrease your job run times with concurrent connections to Amazon Redshift or Snowflake, update your Matillion instance to version 1.41.
Check out our best practices for a smooth update.
The post Increase Concurrent Connections, decrease ETL run time appeared first on Matillion.