Increase Concurrent Connections, decrease ETL run time

 

Currently, each job you run in Matillion makes only one connection to the target database (Amazon Redshift, Snowflake, or BigQuery). Even on an instance with 8 threads, where 8 jobs can run in parallel, jobs can be delayed while they wait for a connection to the target database. To improve performance and efficiency, we released a new Enterprise feature, Concurrent Connections, available for both Amazon Redshift and Snowflake in Matillion ETL v1.41.

Users can now set a maximum number of concurrent connections at the environment level. This is the maximum number of target database connections that Matillion will create as part of a single job run.

 

[Image: Matillion ETL concurrent connections]

 

When set to the maximum, each Matillion thread has its own connection to the target database; the number of threads varies by instance size. This feature is accessible within the environment management settings.
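To see why the connection limit matters, here is a minimal sketch of a toy scheduling model, not Matillion's actual scheduler. It assumes a task needs both a free thread and a free connection, so the effective parallelism is the smaller of the two limits. The function name `simulate_makespan` and the 3-second task durations are hypothetical, chosen only for illustration.

```python
import heapq


def simulate_makespan(task_seconds, threads, max_connections):
    """Return the total wall-clock time for the tasks under a toy model."""
    # Assumption: a task needs a free thread AND a free connection,
    # so only min(threads, max_connections) tasks run at once.
    slots = min(threads, max_connections)
    if slots < 1:
        raise ValueError("need at least one thread and one connection")

    # Each heap entry is the time at which a slot next becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)

    for seconds in task_seconds:
        start = heapq.heappop(free_at)            # earliest available slot
        heapq.heappush(free_at, start + seconds)  # slot is busy until then

    return max(free_at)


tasks = [3.0] * 8  # eight hypothetical loads of 3 seconds each
one_conn = simulate_makespan(tasks, threads=8, max_connections=1)
eight_conn = simulate_makespan(tasks, threads=8, max_connections=8)
print(one_conn, eight_conn)  # prints: 24.0 3.0
```

This model is idealized: it ignores any part of a job that cannot run in parallel, so real jobs will typically see smaller gains than the model suggests.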

Taking an example job, which creates 8 tables in parallel and loads data into them using the S3 Load component:

 

[Image: Matillion ETL concurrent connections run jobs in parallel]

 

Running this job with one connection takes 30 seconds, but increasing the concurrent connections to 8 and running it again takes 19 seconds.
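As a quick check on what those two timings mean, the short sketch below converts them into a speedup factor and a percentage reduction. The helper name `speedup_summary` is hypothetical; only the 30-second and 19-second figures come from the example above.

```python
def speedup_summary(before_seconds, after_seconds):
    """Return (speedup factor, percent reduction in run time)."""
    speedup = before_seconds / after_seconds
    reduction_pct = (before_seconds - after_seconds) / before_seconds * 100
    return speedup, reduction_pct


speedup, reduction = speedup_summary(30, 19)
print(f"{speedup:.2f}x faster, {reduction:.1f}% less run time")
# prints: 1.58x faster, 36.7% less run time
```

These numbers describe this one example job; other jobs may see different results.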

To decrease your job run times with concurrent connections to Amazon Redshift or Snowflake, update your Matillion instance to version 1.41.

Check out our best practices for a smooth update.
