Never Say Never … But Successful Data Mesh Use Cases are Limited

The data mesh is a novel approach to data architecture introduced by Zhamak Dehghani in 2019. It is built on the foundation of domain-oriented, self-serve design principles, drawing inspiration from domain-driven design and team topologies. The core idea behind a data mesh is the decentralization of data management. Rather than relying on a central data team for everything, a data mesh places the responsibility for analytical data processing into the hands of domain teams. 

In theory, this all sounds great: one data platform team supports all these domain teams with a domain-agnostic data platform, allowing them to take control of their own data needs. It promotes data services over more traditional centralized methods of managing data for all insights and end-user needs. But I have yet to see a true data mesh be all that successful in the real world. This post will break down the critical pieces of a data mesh, when one may make sense, and, most importantly, why data meshes often fail.

Key principles and concepts of a data mesh

A true data mesh relies upon four core principles: 

1. Domain-oriented decentralized data ownership and architecture: Data ownership is distributed to domain-specific teams, and each domain manages its data independently through a decentralized architecture.

2. Data as a product: Data is treated as a valuable product, with a focus on making it discoverable, trustworthy, and interoperable. Clear ownership and documentation are fundamental.

3. Self-serve data infrastructure as a platform: Domain teams have access to self-serve tools and resources within the data infrastructure, reducing dependency on the centralized data team and enabling them to meet their specific needs.

4. Federated governance: This principle establishes global rules and standards for data management while allowing domain teams to govern their data within predefined boundaries, striking a balance between central oversight and domain-specific control for data quality and security.

You’ll notice that many of these principles are based on people and processes instead of architecture. 
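
Even so, the second and fourth principles can be made a little more concrete in code. The following is a minimal sketch, not a reference implementation: the `DataProduct` fields, the `REQUIRED_FIELDS` and `ALLOWED_CLASSIFICATIONS` rules, and the `governance_violations` helper are all hypothetical names chosen for illustration. It assumes each domain team publishes a small descriptor for its data product, and that a federated governance group defines a few global rules every descriptor must satisfy.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical global rules agreed on by a federated governance group.
REQUIRED_FIELDS = ("owner_team", "description", "schema")
ALLOWED_CLASSIFICATIONS = {"public", "internal", "restricted"}


@dataclass
class DataProduct:
    """Descriptor a domain team publishes for one data product (illustrative)."""
    name: str
    domain: str
    owner_team: str = ""            # clear ownership
    description: str = ""           # documentation for discoverability
    schema: Dict[str, str] = field(default_factory=dict)  # column -> type
    classification: str = "internal"


def governance_violations(product: DataProduct) -> List[str]:
    """Return the global rules this data product breaks (empty list if none)."""
    violations = []
    for field_name in REQUIRED_FIELDS:
        # Empty strings and empty dicts count as missing.
        if not getattr(product, field_name):
            violations.append(f"missing {field_name}")
    if product.classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {product.classification}")
    return violations
```

In this sketch the domain team still decides what goes into the schema and how the data is produced; the central rules only check that ownership, documentation, and classification are declared. That split is one way to picture "central oversight and domain-specific control", and it also shows why the hard part is getting every team to agree on and follow such rules, not writing the check itself.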

When data mesh can be a good choice 

I’m certainly not suggesting that a data mesh should never be considered or used. Exceptionally large organizations (FAANG comes to mind) cannot realistically pin every last data requirement and responsibility on a single team. Larger companies also tend to have more diverse data needs, and a central team cannot serve all the stakeholders in an organization’s data communities very well. A monolithic team can easily become a bottleneck and an impediment to timely progress.

When built in line with the guiding principles, the data mesh can work out well. A great example I came across is Souhaib Guitoun’s excellent blog post, “11 lessons learned managing a Data Platform team within a data mesh.” My favorite insights from his article include:

  • Key principles like decentralized data ownership and cross-functional teams support comprehensive management.
  • Adopting domain-driven design aligns data products with business needs, while treating the platform as a product prioritizes user feedback.
  • Clear APIs and standards ensure interoperability, while autonomy and self-service tools empower teams for innovation.
  • Governance ensures data quality and compliance, with monitoring for optimization.
  • Continuous improvement and knowledge sharing foster a collaborative culture.

When the data mesh methodology is followed and the teams are structured correctly, the organization can benefit from both the architecture and the organizational processes. It sounds like it has taken Souhaib and his organization some time to get there, but I commend his team for working through the good and the bad to get to a good place.

Where data meshes start to unravel 

Unfortunately, the data mesh is more often an ivory tower concept than something that is successfully executed in the wild. Many teams fall short of completing the end-to-end architecture of a data mesh, let alone the team or culture shift. From reading many data mesh stories and talking to customers who have attempted one, I have found that most people implement data products or build domain teams around a set of business needs and stop there. These would be considered data marts or Operational Data Stores (ODS) managed by siloed teams, not truly a data mesh as defined.

To be fair, there is nothing wrong with managing key data in your organization as a data product and creating dedicated domain expertise around those data products; it is just not an actual data mesh. It is what I see again and again: a data team creates a dedicated data mart or ODS for customers or users and calls it a data mesh. We should just call them what they are: data marts managed by a specific team.

A data mesh can also create a certain amount of redundancy between teams. Souhaib Guitoun’s blog mentioned above offers an important lesson here: his organization had a dedicated data infrastructure team that handled the things all data teams needed, and it rotated team members around so that everyone could see how the other teams worked and how to make the data mesh operate best. Without that, you will end up with redundant tools and practices across the different data teams.

Data mesh is more than just a methodology; it’s a cultural and organizational shift that can be a challenge and an ongoing process at many companies. To implement a data mesh, your entire company needs to buy in, because it will be fundamental to how every data team works.

It is extremely difficult to create a data mesh architecture, methodology, and culture without all data teams across the organization working towards the end goals. But there are some advantages in a data mesh if you can achieve the holy grail.

Takeaways

It’s okay to have data marts or ODS for specific data domains and have others manage that data. You don’t need to build a full data mesh architecture to support those needs. These can be extremely valuable components of a broader Medallion architecture. 

But my best advice for most data teams is to avoid messing up your organization by trying to implement a data mesh if you don’t need one or don’t have the full support of the entire data community.

Matillion can be a big part of your data mesh if you decide to go that route. We can also help build a really great centralized data warehouse or data mart if that is what your business needs. Learn more about Matillion’s Data Productivity Cloud or sign up for a free trial.

Mark Balkenende

VP of Product Marketing

Mark Balkenende, VP of Product Marketing at Matillion, has spent the last 20 years in the Data Management space. He started his career in IT roles managing large enterprise data integration projects, systems, and teams for companies like Motorola, Abbott Laboratories, and Walgreens. Mark has applied his data management subject matter expertise to customer-centric, practitioner-focused product marketing at data management software companies like Talend.