Now Available On-Demand | AI. The Future of Data Engineering - Today. Not Just Tomorrow.

Watch now

Using the Snowflake Multi-Table Insert

Technosnowflake1200

The Insert (Multi-Table) SQL command available in Snowflake makes it possible to insert data from a query into one or more tables, possibly incorporating conditions upon the insert action and how this behavior can be mirrored within Matillion ETL. 

 

There are two main options on the Snowflake Multi-Table Insert: unconditional and conditional. Let’s look at both in turn.

 

Unconditional Multi-table Insert

The unconditional form of the insert simply takes data from the defined source and loads this into the specified target tables. Note that a table may appear more than once in the target list and the source – target column mappings can, if required, be different for each of these targets.

Insert – ALL

When the ALL option is specified, all of the associated tables are loaded with the specified data.

For both the append and replace options, the basic Transformation Job for this data load is shown below.

Unconditional Insert (Transformation)

 

insert all
  into mti_target1
  into mti_target1 (c1, c2, c3) values (n2, n1, default)
  into mti_target2
  into mti_target2 values (n3, n2, n1)
select n1, n2, n3
from   mti_source;


 

Insert ALL – Orchestration

 

Note that at this stage, all of the Multi-Table Insert statements discussed in this post could be implemented in the base SQL language. This can be written directly into either an SQL Script component within an Orchestration Job or, as we show here, via a Transformation Job. The immediate benefit of adopting the transformation as the implementation method is that you see a visual representation of the task being performed.

Insert – OVERWRITE ALL

When the OVERWRITE ALL option is specified, the insert behavior is the same as that for the ALL option. However,  the target tables are truncated prior to the load process.

 

insert overwrite all
  into mti_target1
  into mti_target1 (c1, c2, c3) values (n2, n1, default)
  into mti_target2
  into mti_target2 values (n3, n2, n1)
select n1, n2, n3
from   mti_source;


Insert ALL with OVERWRITE Clause – Orchestration

 

Within Matillion ETL, this same functionality can be achieved using multiple table components taking input from a single data source.

 

 

Conditional Multi-table Insert

The conditional insert includes a WHEN clause that allows the user to include logic to be applied to the process of inserting data into the target tables.

 

There are two forms of the conditional insert statement, distinguished by the use of the clause ALL or FIRST.

Insert – ALL

When the ALL option is specified, all ‘when’ clauses are evaluated and, where found to be true, the associated tables are loaded with the specified data.

 

For both the append and replace options the basic Transformation Job for this data load is shown below.

 

Conditional Insert (Transformation)

 

insert all
  when n1 > 100 then
    into mti_target1
  when n1 > 10 then
    into mti_target1
    into mti_target2
  else
    into mti_target2
select n1 
from   mti_source;

 

Conditional Insert Using ALL Clause – Orchestration

 

Insert – OVERWRITE ALL

When the OVERWRITE ALL option is specified, the insert behavior is the same as that for the ALL option. However,  the target tables are truncated prior to the load process.

insert overwrite all
  when n1 > 100 then
    into mti_target1
  when n1 > 10 then
    into mti_target1
    into mti_target2
  else
    into mti_target2
select n1 
from   mti_source;

Using the same Transformation Job as for the insert all statement we can simulate the insert overwrite all process by including a truncate component within the Orchestration Job prior to calling the Transformation Job.

Conditional Insert Using OVERWRITE ALL Clause – Orchestration

 

Insert – FIRST

When the FIRST option is specified, each WHEN clause is evaluated in order. The load actions associated with the first WHEN clause to return true are executed and the remaining WHEN clauses are ignored.

Conditional Insert Using FIRST Clause (Transformation)

 

To append data to the target tables the script will simply use the first clause:

 

insert first
  when n1 > 100 then
    into mti_target1
  when n1 > 10 then
    into mti_target1
    into mti_target2
  else
    into mti_target2
select n1, n2, n3
From   mti_source;

 

Conditional Insert Using FIRST Clause – Orchestration

 

Insert – OVERWRITE FIRST

To overwrite data in the target tables the script will use the overwrite first clause:

 

insert overwrite first
  when n1 > 100 then
    into mti_target1
  when n1 > 10 then
    into mti_target1
    into mti_target2
  else
    into mti_target2
select n1, n2, n3
From   mti_source;

 

Conditional Insert Using OVERWRITE FIRST Clause – Orchestration

 

Performance Comparison

In order to determine the performance impact of simulating the multi-table insert behavior within Matillion ETL for Snowflake, we create a set of test Orchestration and Transformation Jobs to obtain the time taken to process data held within a table holding aircraft flight data for a number of years. In total the flight table contained 203,251,796 data rows.  The data was loaded into the target tables using an Orchestration Job running an SQL Script component and, for comparison, through the Transformation Job simulating the Snowflake Multi Table Insert SQL shown below.

Insert – ALL

 

 

insert all
  into mti_target1
  into mti_target2
select flightnum 
from mti_flight;

 

 

Execution Test    Average Time to Execute (s-1)
 
Transformation job simulating the Snowflake Multi Table Insert SQL    7.2
Orchestration job running an SQL Script component
 
    11

 

For more guidance on using Snowflake and Matillion ETL together, download our ebook, Optimizing Snowflake: A Real-World Guide.