How to build a transformation pipeline in the Data Productivity Cloud

Data transformation is how raw data gets refined into something you can draw insights from. With Matillion's Data Productivity Cloud, both data engineers and analysts can build transformation pipelines visually on a design canvas. Here’s a quick walkthrough of the process to help you get started.

 

Building a Transformation pipeline in the Data Productivity Cloud

1. Start in the Pipelines pane, where you can use the Add menu to create a Transformation pipeline. After naming your pipeline, you’ll be directed to the Designer canvas.


2. Next, go to the Components pane to choose your first component. You’ll find a wide variety of transformation components, allowing you to join datasets, perform technical calculations, and even handle complex data formats like JSON or XML. All transformation pipelines begin with a component that reads data. In this walkthrough, we will start with a Table Input.


3. After you’ve selected your component and moved it onto the canvas, you can configure it in the Component Properties pane. The parameter values you set there tell the component how to execute within the pipeline. In this case, we are instructing the Table Input to read data from a specific table in the cloud data warehouse.

4. Because Matillion is always connected to the cloud data warehouse, you can sample data at any point while developing a transformation pipeline to see a subset of the resulting data. This gives you a chance to confirm that the result is what you’re looking for before running the entire pipeline. You can repeat this at each step, adding components and sampling as you go to make sure your transformation is working as intended.

5. You can also validate the pipeline to make sure everything is configured correctly. Green checks mean no issues, and red Xs mean you should take another look at how the component is configured.


6. Finally, once the pipeline is validated, click Run to execute the transformation. 
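
If it helps to picture what the steps above amount to, here is a rough conceptual sketch in Python. It is not Matillion's API and not how the Designer works internally: an in-memory SQLite database stands in for the cloud data warehouse, and the orders table, the aggregate step, and every function name are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the cloud data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0), (4, "US", 200.0)],
)

# Steps 2-3: a Table Input-like step, configured with a table and columns
# (hypothetical property names).
table_input = {"table": "orders", "columns": ["region", "amount"]}


def table_input_sql(cfg):
    """Build the query a read step would issue for its configuration."""
    return f"SELECT {', '.join(cfg['columns'])} FROM {cfg['table']}"


def aggregate_sql(upstream_sql):
    """A hypothetical downstream step: total amount per region."""
    return (
        f"SELECT region, SUM(amount) AS total FROM ({upstream_sql}) "
        "GROUP BY region ORDER BY region"
    )


def sample(conn, sql, limit=2):
    """Step 4: look at a small subset of a step's output."""
    return conn.execute(f"SELECT * FROM ({sql}) LIMIT {limit}").fetchall()


def validate(conn, cfg):
    """Step 5: return configured columns that are missing from the table."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({cfg['table']})")}
    return [col for col in cfg["columns"] if col not in existing]


def run(conn, sql):
    """Step 6: execute the full transformation."""
    return conn.execute(sql).fetchall()


read_sql = table_input_sql(table_input)
preview = sample(conn, read_sql)          # two rows from the read step
problems = validate(conn, table_input)    # empty list means no issues
result = run(conn, aggregate_sql(read_sql)) if not problems else []
```

The point of the sketch is the order of operations: configure a read step, sample its output, validate the configuration, and only then run the whole thing. In the Data Productivity Cloud you do all of this on the canvas rather than in code.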

It’s that easy! Now you can start experimenting and customizing your pipelines to see the full power of the Data Productivity Cloud. And don’t forget to check out this instructional blog on building orchestration pipelines, which you can use to move data from source systems to your cloud data warehouse and coordinate your entire data operation, all from Matillion.

Griffin Dassatti

Sr. Product Marketing Manager