Increase Concurrent Connections, decrease ETL run time

  • Laura Malins, ETL Product Manager
  • October 30, 2019


Currently, for each job you run in Matillion, only one connection is made to the target database, whether that is Amazon Redshift, Snowflake, or BigQuery. Even on an instance with 8 threads, where 8 jobs can run in parallel, jobs can be delayed while they wait on a connection to the target database. To improve performance and efficiency, we have released a new Enterprise feature, Concurrent Connections, available for both Amazon Redshift and Snowflake in Matillion ETL v1.41.

Users can now set a maximum number of concurrent connections at the environment level. This is the maximum number of target database connections that Matillion will create as part of a single job run.

[Image: Matillion ETL concurrent connections]

When this is set to the maximum, each Matillion thread has its own connection to the target database. The number of threads varies by instance size. The setting is accessible within the environment management settings.
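To make the effect of the connection limit concrete, here is a minimal sketch that models a job as a list of tasks sharing a fixed number of connections. It is an illustration only, not Matillion's implementation: the function name `simulated_makespan`, the task durations, and the first-free-connection scheduling rule are all assumptions made for the example.

```python
import heapq


def simulated_makespan(task_seconds, max_connections):
    """Return the total wall-clock time when tasks share a fixed number of connections.

    Hypothetical model: each task holds one connection for its whole duration,
    and tasks are handed, in order, to whichever connection frees up first.
    """
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    # Each heap entry is the time at which one connection becomes free.
    # There is no point modelling more connections than there are tasks.
    free_at = [0.0] * min(max_connections, max(len(task_seconds), 1))
    heapq.heapify(free_at)
    for seconds in task_seconds:
        start = heapq.heappop(free_at)            # earliest available connection
        heapq.heappush(free_at, start + seconds)  # busy until the task finishes
    return max(free_at)


# Eight equal tasks (durations are made up for illustration).
tasks = [3.0] * 8
for limit in (1, 4, 8):
    print(limit, simulated_makespan(tasks, limit))
```

In this idealized model the tasks queue behind a single connection when the limit is 1 and all run at once when the limit is 8. Real jobs also spend time on work that does not wait on a connection, so measured gains will usually be smaller than the model suggests.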

Take an example job that creates 8 tables in parallel and loads data into them using the S3 Load component:

[Image: Matillion ETL concurrent connections run jobs in parallel]

Running this job with one connection takes 30 seconds; increasing the concurrent connections to 8 and running it again takes 19 seconds.
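For a sense of scale, the two timings above can be turned into a percentage reduction and a speedup factor. The helper below is a small sketch; the function name `runtime_change` is hypothetical, and only the 30-second and 19-second figures come from the example job.

```python
def runtime_change(before_seconds, after_seconds):
    """Summarize the change between two run times of the same job."""
    saved = before_seconds - after_seconds
    return {
        "seconds_saved": saved,
        "percent_reduction": round(100 * saved / before_seconds, 1),
        "speedup": round(before_seconds / after_seconds, 2),
    }


# Timings from the example job: 1 connection vs. 8 concurrent connections.
print(runtime_change(30, 19))
# {'seconds_saved': 11, 'percent_reduction': 36.7, 'speedup': 1.58}
```

So the example job finishes about 37% sooner, roughly a 1.6x speedup. That is well short of 8x, which suggests part of the run time does not depend on the number of connections, although the example does not break that down.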

See it in action in the video:

Update your Matillion instance to version 1.41 to use concurrent connections to Amazon Redshift or Snowflake and decrease your job runtimes.

Check out our best practices for a smooth update.