Query vs. Extract Components: What’s the Difference?
With the release of version 1.46 of Matillion ETL you will notice some changes in the categorization and naming of our data loading components, the data source connectors that will load data into your target data warehouse. Instead of segmenting components into category folders, you will find all of the load components – the data source connectors that will load data into your target data warehouse – in the “Load” folder. And you’ll find all of the unload components – connectors that push data to the target sources – in the “Unload” folder.
Within the Load folder, you will also see that some components are now called “Extract” components, in contrast to the usual “Query” components. What’s the difference between the two? When do you use query vs. extract? Let’s take a look at what’s different and how we distinguish the behavior of Extract components from that of our traditional Query components.
Query vs. Extract components
Prior to version 1.40 of Matillion ETL, all of our connectors were Query components. Traditional Query components extract data from source systems and flatten that data before landing it into a table in your cloud data warehouse. Using Query components made the extract & load simple and enabled you to start transforming and enriching that data faster.
Matillion ETL users, however, have an emerging need to extract and load raw data without any pre-flattening of the data sources – for example, when loading data into a data lake via external tables. To meet this need, we developed a new technology and framework that allows us to build connectors that stage the raw extract data and leverage specific features of the cloud data warehouses, such as external tables and variant data types.
|Query components||Extract components|
|Data staged in tabular format||Data staged in raw format|
|Data is staged in regular tables||Data is staged in external tables (Amazon Redshift & BigQuery)|
or regular table with variant data type (Snowflake)*
|Custom Connectors use API Query Profiles||Custom Connectors use API Extract Profiles**|
*Currently not available in Matillion ETL for Azure Synapse
Flattening data from Extract components
Depending on your use case and requirements, you may need to load data from a source that is offered via an Extract component, but you still need that data flattened and not in the raw format that the component will provide. For those users who want the flattened tabular data, we have Extract Nested Data Component, which performs that transformation using the power of your data warehouse, making this process quick and easy.
What data sources do you need to connect to?
We hope this helps you determine whether to use query vs. extract components for your data. With the new technology and framework we developed to build connectors, we now have two great technologies and frameworks that allow us to build connectors at an accelerated rate. Do you have a data source that you need to load data from that is not already supported? Head on over to the Ideas Portal in the new Matillion ETL Community to submit your connector request!
Matillion Adds AI Power to Pipelines with Amazon Bedrock
Data Productivity Cloud adds Amazon Bedrock to no-code generative ...Blog
Data Mesh vs. Data Fabric: Which Approach Is Right for Your Organization? Part 3
In our recent exploration, we've thoroughly analyzed two key ...eBooks
10 Best Practices for Maintaining Data Pipelines
Mastering Data Pipeline Maintenance: A Comprehensive GuideBeyond ...