- Blog
- 11.07.2024
- Product, Data Fundamentals
Building a dynamic and reusable Matillion transformation with Grid Variables

Matillion’s Data Productivity Cloud is a productivity platform for data engineers. In this blog, I’ll show how to build a simple transformation and extend it for further use cases with Matillion Grid Variables.
Grid variables are 2-D Arrays similar to database tables on a smaller scale. They can store information like table schemas, key-value pairs, or even a small table of data.
In this blog, you’ll learn how to use grid variables to make a somewhat complex data transformation dynamic and reusable.
Use case
Consider a source dataset replicated from an on-prem database to Snowflake which includes a column representing a list of IDs, comma delimited. The number of IDs in the column is variable and, for example, could include 1, 2, 3, or more IDs within a single cell. In Snowflake, I would like to have each row of my dataset to be associated with a single ID. For this example, a small sample dataset can be created with a Fixed Flow Transformation Component. Setting the stage for a larger example, consider a case where there are many tables with this structure of multiple IDs in a single cell. We’ll eventually want to build a transformation that can be reused whenever I have a table that needs this split.
Splitting IDs for a Single Table
When building a reusable data pipeline in Matillion, start with a base case for a single table.
We need to:
1. Split the field into multiple columns
2. Transpose the columns to rows
3. Clean up and normalize our data for our final data model.
When configuring the transpose, you need to define exactly the number of transposed columns. This works for our simple example but would break if more than 4 ids were in a single cell.
Leverage Grid Variables to make the ID splitting dynamic
We need to add some dynamic functionality to our transformation to account for a variable number of ids stored in a single cell. It’s also important to be able to use this function with more than this single table. As such, everything will be parameterized.
1.Use Query Result to Scalar and Query Result to Grid components to instantiate the Scalar and Grid Variables that will be used in our split_rows Transformation.
2. Pass all variables to the split_rows_parameterized Transformation
3. Define our transformation logic with variable input
Starting with our table input, all of the references to columns in the transformation logic will now be defined by grid variables.
Next, the Split Field
.. and the Transpose
Continuing with the Filter
. the Rename, which can drop columns
Last the Rewrite table
As a final step, I can add a layer of orchestration that allows me to pass in a table name, as well as the name of the column with the list of IDs that need to be parsed into their own rows.
The next time you build a Transformation in the Data Productivity Cloud, consider if the logic will be needed again later. If so, leverage Grid Variables to build a dynamic and reusable transformation!
Not using Matillion yet? Try a free trial today!
Additional resources:
Using Grid Variables to customize Matillion’s task history
Fixed flow in focus: Simplifying data pipelines with static inputs
Angus Kinsey
Enterprise Solutions Engineer
Featured Resources
What Is Massively Parallel Processing (MPP)? How It Powers Modern Cloud Data Platforms
Massively Parallel Processing (often referred to as simply MPP) is the architectural backbone that powers modern cloud data ...
BlogETL and SQL: How They Work Together in Modern Data Integration
Explore how SQL and ETL power modern data workflows, when to use SQL scripts vs ETL tools, and how Matillion blends automation ...
WhitepapersUnlocking Data Productivity: A DataOps Guide for High-performance Data Teams
Download the DataOps White Paper today and start building data pipelines that are scalable, reliable, and built for success.
Share: