
How Matillion Addresses Data Virtualization

This blog is a direct response to the following quote from Gartner’s TDI report, which calls out Matillion’s lack of data virtualization: “Basic data virtualization limits architectural scope: Matillion facilitates and actively markets data movement and physical data consolidation… It lacks data virtualization capabilities such as semantic abstraction, query acceleration, dynamic query optimization, and caching” - Gartner TDI

Where does data virtualization fit in?

Data virtualization is the practice of querying data from a variety of sources through a semantic layer, without physically consolidating it, which traditionally requires highly provisioned networks and maximized processing power at a premium price. In the modern age of affordable-storage data platforms like Snowflake and Redshift, backed by data lakes and storage buckets, processing power, not data storage, has become the currency. ELT has replaced the ETL methodologies of the turn of the century, and loading and storing data in the cloud costs pennies on the dollar. This shift in the modern data world has minimized the need for data virtualization in data integration architectures.

The Modern Data Stack

The shift toward cheap storage and compute led to an explosion of data loading and transformation tools. Even legacy giants like Informatica and Microsoft’s SSIS shifted their functionality to accommodate in-warehouse transformations, driven by the rise of Snowflake.

This new era of data integration has become known as the modern data stack. First, data is loaded en masse into a data platform or warehouse. Second, the data is transformed and prepped into a variety of states, such as a reporting layer or a star schema. Finally, the data is consumed by a visualization tool to be analyzed and used. This new stack leaves very little space for traditional data virtualization.
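
To make those three steps concrete, here is a minimal Python sketch of the ELT flow using the snowflake-connector-python package. The credentials, stage, and table names (RAW.ORDERS, REPORTING.DAILY_ORDERS, @raw_stage) are illustrative assumptions, not a prescribed setup.

```python
# Minimal ELT sketch: load raw data first, then transform in-warehouse.
# Assumes the snowflake-connector-python package; all names and
# credentials below are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical account identifier
    user="loader",          # hypothetical user
    password="***",         # placeholder credential
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="RAW",
)
cur = conn.cursor()

# Step 1 (load): copy raw files from a cloud stage as-is, no reshaping.
cur.execute("""
    COPY INTO RAW.ORDERS
    FROM @raw_stage/orders/
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
""")

# Step 2 (transform): shape the raw data into a reporting layer using
# the warehouse's own compute, rather than an external ETL server.
cur.execute("""
    CREATE OR REPLACE TABLE REPORTING.DAILY_ORDERS AS
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM RAW.ORDERS
    GROUP BY order_date
""")

# Step 3 (visualize): BI tools query REPORTING.DAILY_ORDERS directly.
conn.close()
```

The key point of the pattern is that the transformation in step 2 runs inside the warehouse, which is what distinguishes ELT from server-based ETL.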

Where does Matillion fit in?

Terabytes of data can now be loaded from any business platform, file store, or database directly into Snowflake in its raw form. This initial process is simple, affordable, and securely handled by a variety of tools on the market. The next step, processing the raw data into something usable, is slightly more complex, and it starts with a few buzzwords: logical replication, schema adaptation, incremental loading, and upserts. These buzzwords are really just the features that make for a successful data-loading tool, but they are an important next step. Next comes data transformation, a vague term for manipulating tables of data from a multitude of sources to make them usable, often pushing the data through multiple data warehouse layers. This step is far more manual and is the true art of data engineering. The cherry on top is visual orchestration of the entire process, with lineage and error handling, AI prompt components, and CoPilot-enabled multi-language coding IDEs alongside the pre-built templates.
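
As a concrete illustration of the “incremental loading” and “upserts” buzzwords, here is a hedged Python sketch that upserts a staged batch of changed rows with a Snowflake MERGE statement. The table and column names are hypothetical, and this shows the generic pattern, not Matillion’s actual implementation.

```python
# Hypothetical incremental-load sketch: only rows that changed since the
# last load land in STAGING.CUSTOMERS_DELTA; MERGE then upserts them into
# the target. All identifiers here are illustrative assumptions.
from snowflake.connector.cursor import SnowflakeCursor

def incremental_upsert(cur: SnowflakeCursor) -> None:
    # Update matched rows only when the source copy is newer, and insert
    # rows that do not exist yet: the classic upsert pattern.
    cur.execute("""
        MERGE INTO RAW.CUSTOMERS AS tgt
        USING STAGING.CUSTOMERS_DELTA AS src
          ON tgt.customer_id = src.customer_id
        WHEN MATCHED AND src.updated_at > tgt.updated_at THEN UPDATE SET
            name = src.name,
            email = src.email,
            updated_at = src.updated_at
        WHEN NOT MATCHED THEN
          INSERT (customer_id, name, email, updated_at)
          VALUES (src.customer_id, src.name, src.email, src.updated_at)
    """)
```

The MATCHED condition on updated_at means only newer records overwrite existing rows, which keeps repeated incremental loads idempotent.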

Matillion is truly the only data platform built to maximize the modern data stack. It offers a complete set of data loading, transformation, and orchestration features that are purpose-built for each leading platform. 

This blog is the second of a two-part mini-series. Please find part one here.  

Ryder Zgabay

Senior Product Marketing Manager