Work Smarter, Not Harder: Pipeline and Code Reusability with Matillion, Slalom and Snowflake
When migrating and processing data, it's important to plan for workflow automation and future scale. Many companies run workflows that vary slightly but perform essentially the same task. By using available table metadata and just a few Matillion components, companies can drastically reduce duplicative workflows, cut runtime, and enable scheduling and automation.
This blog will discuss the value of metadata-driven pipeline code and how Slalom, Snowflake, and Matillion helped a small healthcare provider harness its power to achieve remarkable results.
Repetitive Work and Complex Workflows
Challenges in data management often ignite innovation. This section unpacks the data workflow challenges one healthcare provider faced, paving the way for the solutions that follow.
- Unpredictable daily influx, custom queries, and workloads: The healthcare provider serves a diverse customer base by accepting insurance from multiple insurers. They receive daily payor files from those insurers, using custom queries to identify new files, their locations, processing requirements, and storage destinations. The volume of files and the set of participating payors vary daily, creating unpredictable workloads. The queries for each payor are largely similar, with only minor differences.
- Inefficient maintenance and management processes, resulting in high costs: The healthcare provider's data engineers spend significant time managing workflows, dealing with slow queries that drive up compute costs, and maintaining over 100 stored procedures for ingestion and transformations. That time goes to modifying and maintaining stored procedures instead of analyzing the data to create valuable business insights.
The healthcare provider aims to reduce time and costs, simplify ETL, and streamline workflows, which can be achieved by leveraging table metadata in Snowflake and Matillion’s iteration components.
The Query Result To Grid component runs a query to determine which payors require processing and all of their associated file directories. The Grid Iterator component loops through the query results for each file directory, while the File Iterator loops through each file in the directory. Each file is ingested into Snowflake by the same ingestion orchestration, simply by passing variables for its source file, destination table, and file format.
Once all the files in a directory are processed, the job moves on to the next directory; once all directories for a payor have been processed, the next payor and its associated directories and files are processed. What was once a complex, compute-intensive series of stored procedures, Matillion reduced to three components passing parameters to a single ingestion orchestration.
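The three-component pattern can be sketched in plain Python. This is a minimal illustration, not Matillion's actual implementation: all table names, directories, and functions below are hypothetical stand-ins, and the dictionaries simulate the metadata query result and directory listings that a real pipeline would fetch from Snowflake and cloud storage.

```python
# Minimal sketch of the metadata-driven iteration pattern.
# All names here are hypothetical; in Matillion this logic is handled by the
# Query Result To Grid, Grid Iterator, and File Iterator components.

# Stand-in for the metadata query result: one row per payor directory,
# carrying the variables the parameterized ingestion job needs.
payor_metadata = [
    {"payor": "acme_health", "directory": "/landing/acme",
     "dest_table": "RAW.ACME_CLAIMS", "file_format": "CSV_FMT"},
    {"payor": "beta_care", "directory": "/landing/beta",
     "dest_table": "RAW.BETA_CLAIMS", "file_format": "CSV_FMT"},
]

# Simulated directory listings; a real pipeline would scan cloud storage.
sample_files = {
    "/landing/acme": ["claims_0101.csv", "claims_0102.csv"],
    "/landing/beta": ["claims_0101.csv"],
}

ingested = []

def ingest_file(source_file, dest_table, file_format):
    """Stand-in for the single parameterized ingestion orchestration."""
    ingested.append((source_file, dest_table, file_format))

# Outer loop ~ Grid Iterator (one iteration per metadata row);
# inner loop ~ File Iterator (one iteration per file in the directory).
for row in payor_metadata:
    for filename in sample_files.get(row["directory"], []):
        source = f"{row['directory']}/{filename}"
        ingest_file(source, row["dest_table"], row["file_format"])

print(f"Ingested {len(ingested)} files")  # Ingested 3 files
```

The key design point: onboarding a new payor amounts to adding a row to the metadata table, with no new ingestion code written.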
Achieving Maximum Potential with Matillion
- Efficient Data Ingestion and Automation: With a few Matillion components, the healthcare provider streamlined data processing. They now run a single orchestration ingestion job to handle all their files. Matillion's scheduling capabilities enable daily automation, reducing processing time significantly and eliminating manual intervention.
- Simplified Data Workflows: The healthcare provider reduced their 40 ingestion stored procedures to a parameterized Matillion orchestration. They plan to apply a similar approach to reduce their 60+ transformation stored procedures. This simplification empowers data engineers by making troubleshooting, modifying, and maintaining data workflows much more manageable.
- Swift Onboarding of New Payors: The onboarding process for new payors, which previously took days, has been streamlined to hours. This enhancement enables the healthcare provider to efficiently and confidently onboard new payors as they expand their network of accepted insurances and customer base.
- Scalable and User-Friendly: As the organization grows, Matillion's user-friendly UI facilitates knowledge transfer to new employees. Additionally, the auto-documentation feature eliminates the often-tedious task of manually documenting code and workflows.
This healthcare provider's transition to streamlined processes with Matillion and Snowflake was remarkable, overcoming challenges like unpredictable data influx and high costs. The shift to efficiency and automation not only saved time and resources but also facilitated the quick onboarding of new payors. Matillion's user-friendly interface and auto-documentation feature set them up for future growth, emphasizing the value of working smarter and promising a bright future in the evolving data landscape.
About the writers
Dan Greenberg, Senior Principal, Cloud Data Architect, and Data Strategist at Slalom Consulting
Dan Greenberg is a Senior Principal, Cloud Data Architect, and Data Strategist at Slalom Consulting. He brings decades of experience as a DBA, Data Architect, and Business Intelligence Architect. He was recognized as a Qlik Luminary in 2020 and is a frequent presenter at conferences like Qlik World and Snowflake Summit.
Lauren Thornton, Data Engineer at Slalom Consulting
Lauren Thornton is a Data Engineer at Slalom Consulting with a background in technical architecture. She works with clients at various stages of development to transform their technology and business landscape to meet their desired goals.