Introducing PipelineOS: Empowering Data Teams with Seamless Data Pipelines

What’s the secret sauce behind the Data Productivity Cloud, you might ask? Matillion uses PipelineOS at its core, deploying stateless microservice agents that execute pipelines at unlimited scale across multiple users and projects. The architecture allows data teams to harness the full potential of their cloud data platform, taking advantage of its push-down capabilities while maintaining control over costs.

The Data Integration Challenges Fueled by the Explosion in Data

Many organizations face challenges when trying to boost their productivity with data. A major contributing factor is the explosion in data volumes that must be processed accurately to make informed business decisions. Data sources supply detailed feature data that must be refined, stored, and used in machine learning routines. As a result, more users need data integration tools, create more pipelines, and process more data simultaneously.

But data workloads vary in size and require different levels of compute at different times. These are known as "lumpy workloads". It’s difficult to predict the level of compute required at any given time while also balancing the cost of processing data. To cope with infrequent peaks in workloads and meet business needs, organizations are often forced to run expensive compute 24/7.

Enter Matillion PipelineOS

Matillion is proud to introduce the powerhouse behind the Data Productivity Cloud, and the solution to your data challenges: PipelineOS. Our platform uses stateless microservice agents to execute pipelines seamlessly. Whether it's data movement or transformation, PipelineOS enables you to scale effortlessly across countless users and projects. With PipelineOS, you can embrace the full power of push-down architecture on your cloud data platform without compromising cost control.

Rather than relying on fixed-size on-premises servers, or even fixed-size cloud-based virtual machines, PipelineOS is powered by serverless container technology, which allows unlimited scaling.

Containers bring the ability to scale horizontally as opposed to vertically - rather than increasing the size of a single compute engine (which of course has an upper limit), you simply start more instances.

This lets you scale to whatever level you need. Run as many instances as required to get the desired level of processing power and concurrency. Allow all your engineers to work at the same time without slowing down your critical pipelines, so you can ingest and process all your data even within a limited window defined by the needs of the business. Process lumpy workloads while keeping an eye on your cloud infrastructure spend. And avoid schedule Tetris by scheduling pipelines when you want, rather than having to hunt for available capacity.
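To make the cost argument concrete, here is a minimal sketch comparing fixed, peak-sized capacity with instances that are started only when needed. The hourly workload figures and the per-instance capacity are invented for illustration. They are not Matillion measurements, and the function names are hypothetical rather than part of any PipelineOS API.

```python
import math

# Hypothetical numbers for illustration only; not Matillion figures.
# Work arriving in each hour of a day, in arbitrary "work units".
HOURLY_WORK_UNITS = [5, 5, 5, 5, 5, 5, 40, 120, 30, 10, 10, 10,
                     10, 10, 10, 10, 10, 10, 10, 10, 5, 5, 5, 5]

# Assumed amount of work one instance can complete in an hour.
UNITS_PER_INSTANCE_HOUR = 10


def instances_needed(work_units, capacity=UNITS_PER_INSTANCE_HOUR):
    """Smallest number of equal-sized instances that covers the work."""
    return math.ceil(work_units / capacity)


def fixed_instance_hours(workload):
    """Instance-hours when capacity is sized for the peak and runs all day."""
    peak = max(instances_needed(units) for units in workload)
    return peak * len(workload)


def elastic_instance_hours(workload):
    """Instance-hours when instances are started per hour, as demand requires."""
    return sum(instances_needed(units) for units in workload)


if __name__ == "__main__":
    print("fixed:  ", fixed_instance_hours(HOURLY_WORK_UNITS))
    print("elastic:", elastic_instance_hours(HOURLY_WORK_UNITS))
```

With these made-up figures, peak-sized capacity running around the clock consumes 288 instance-hours, while scaling horizontally with demand consumes 40. The exact ratio depends entirely on how lumpy the workload is.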

As an example, to prove the scalability of the Data Productivity Cloud, we ran a pipeline to ingest data from 100 different AWS Aurora MySQL database instances, each containing a table of approximately 11 million rows of sales data. The pipeline was required to ingest and transform all of the tables in less than three minutes to deliver near-real-time insights. It performed all of these actions in parallel, taking just 167 seconds from start to finish.
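As a rough back-of-the-envelope check, the figures above imply the following throughput. This sketch uses only the numbers quoted in the example. Because the row count is stated as approximate, the results are estimates, and the variable names are our own.

```python
# Figures taken from the example above.
SOURCES = 100                 # database instances ingested in parallel
ROWS_PER_TABLE = 11_000_000   # "approximately 11 million" rows per table
ELAPSED_SECONDS = 167         # end-to-end pipeline run time

# Total rows ingested and transformed across all sources.
total_rows = SOURCES * ROWS_PER_TABLE

# Aggregate throughput across the whole pipeline run.
aggregate_rows_per_second = total_rows / ELAPSED_SECONDS

# Average throughput attributable to each source, since all ran in parallel.
per_source_rows_per_second = ROWS_PER_TABLE / ELAPSED_SECONDS

if __name__ == "__main__":
    print(f"total rows:           {total_rows:,}")
    print(f"aggregate rows/sec:   {aggregate_rows_per_second:,.0f}")
    print(f"per-source rows/sec:  {per_source_rows_per_second:,.0f}")
```

That works out to roughly 1.1 billion rows in total, or about 6.6 million rows per second in aggregate and about 66 thousand rows per second per source.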

Conquer workload challenges and drive growth with PipelineOS today

With PipelineOS at the core of the Data Productivity Cloud, data teams can simplify data movement, bridge the skills gap for data transformation, and gain the scalability and orchestration capabilities they need.

Simplify, automate, and scale your data pipelines with PipelineOS. Start your Data Productivity Cloud trial today!