Now Available On-Demand | AI. The Future of Data Engineering - Today. Not Just Tomorrow.

Watch now

Query vs. Extract Components: What’s the Difference?

Matillion ETL 1 46 Release Series Blog Image low res 10

With the release of version 1.46 of Matillion ETL you will notice some changes in the categorization and naming of our data loading components, the data source connectors that will load data into your target data warehouse. Instead of segmenting components into category folders, you will find all of the load components – the data source connectors that will load data into your target data warehouse – in the “Load” folder. And you’ll find all of the unload components – connectors that push data to the target sources – in the “Unload” folder. 

 

Within the Load folder, you will also see that some components are now called “Extract” components, in contrast to the usual “Query” components. What’s the difference between the two? When do you use query vs. extract? Let’s take a look at what’s different and how we distinguish the behavior of Extract components from that of our traditional Query components.

Query vs. Extract components

Prior to version 1.40 of Matillion ETL, all of our connectors were Query components. Traditional Query components extract data from source systems and flatten that data before landing it into a table in your cloud data warehouse. Using Query components made the extract & load simple and enabled you to start transforming and enriching that data faster.

 

Matillion ETL users, however, have an emerging need to extract and load raw data without any pre-flattening of the data sources – for example, when loading data into a data lake via external tables. To meet this need, we developed a new technology and framework that allows us to build connectors that stage the raw extract data and leverage specific features of the cloud data warehouses, such as external tables and variant data types.

 

Query componentsExtract components
Data staged in tabular formatData staged in raw format
Data is staged in regular tablesData is staged in external tables (Amazon Redshift & BigQuery)
or regular table with variant data type (Snowflake)*
 
Custom Connectors use API Query ProfilesCustom Connectors use API Extract Profiles**

*Currently not available in Matillion ETL for Azure Synapse

**Coming soon

Flattening data from Extract components

Depending on your use case and requirements, you may need to load data from a source that is offered via an Extract component, but you still need that data flattened and not in the raw format that the component will provide. For those users who want the flattened tabular data, we have Extract Nested Data Component, which performs that transformation using the power of your data warehouse, making this process quick and easy.

What data sources do you need to connect to? 

We hope this helps you determine whether to use query vs. extract components for your data. With the new technology and framework we developed to build connectors, we now have two great technologies and frameworks that allow us to build connectors at an accelerated rate.  Do you have a data source that you need to load data from that is not already supported? Head on over to the Ideas Portal in the new Matillion ETL Community to submit your connector request!