Despite the investments and effort poured into next-generation data storage systems, data warehouses and data lakes have failed to give data engineers, data analysts, and data leaders the trustworthy and agile business insights they need to make intelligent business decisions. The answer is Data Mesh – a decentralized, distributed approach to enterprise data management.
Zhamak Dehghani, the founder of Data Mesh, defines Data Mesh as “a sociotechnical approach to share, access, and manage analytical data in complex and large-scale environments – within or across organizations.” She’s authoring an O’Reilly book, Data Mesh: Delivering Data-Driven Value at Scale, and Starburst, the ‘Analytics Engine for Data Mesh,’ happens to be the sole sponsor. In addition to providing a complimentary copy of the book, we’re also sharing chapter summaries so we can read along and educate our readers about this (r)evolutionary paradigm. Enjoy Chapter Seven: Principle of Federated Computational Governance!
We made it!
In the final pre-release chapter of O’Reilly’s book Data Mesh: Delivering Data-Driven Value at Scale, we venture into the fourth founding principle of Data Mesh, the principle of federated computational governance.
Data governance teams have well-intended and necessary goals: ensure that data is usable, accessible, protected, and compliant with regulatory requirements. Traditionally, however, achieving these goals hasn’t been smooth. Zhamak writes, “Governance has relied heavily on manual interventions, complex central processes of data validation and certification, establishing global canonical modeling of data with minimal support for change, and often engaged too late after the fact.” Frankly, this approach won’t work for Data Mesh.
Data Mesh Governance
Data Mesh governance cultivates and embraces constant change within the data landscape. Domains are responsible for data modeling and data quality. Computational instructions are automated to ensure “data is secure, compliant, of quality and usable.” Moreover, this approach also “embeds the computational policies in each and every domain and data product.”
In this chapter, Zhamak introduces how to tailor Data Mesh governance to your organization in accordance with the federated computational model via these three components:
- Systems thinking,
- Federated operating model, and
- Computational policies.
The image below demonstrates how these three components interact within a Data Mesh federated computational governance model.
Apply Systems Thinking to Data Mesh Governance
Systems thinking means looking at the parts of an organization in order to see the whole, and treating those parts as dynamic processes rather than static components. Data Mesh governance follows a similar line of thought: more than the sum of its parts, it’s a collection of “interconnected systems – data products, data product providers, data product consumers, and platform teams and services.” A great reference for systems thinking is Peter Senge’s The Fifth Discipline, which has been a favourite book of mine for more than a decade. It’s a wonderful book filled with examples of how to break away from limiting beliefs and traditional, linear thinking and move toward systems thinking, a more effective way to achieve a common goal.
Maintain Dynamic Equilibrium Between Domain Autonomy and Global Interoperability
Running a successful Data Mesh ecosystem depends on preserving a balance between local domain optimization and global interoperability. The governance model respects the local autonomy of each domain; the challenge is to balance that autonomy against global interoperability, which Zhamak describes as “global-level security, legal conformance, interoperability standards, and other mesh-level policies applied to all data products.” Systems thinking gives us some ideas and approaches for how to incentivize and maintain this balance over the long term.
Apply Federation to the Governance Model
Organizationally, Data Mesh is a federation by design. Not the Star Trek kind of federation, but the kind where an organizational structure consists of self-sufficient, individual domains. For instance, the domains own and manage their data products, including how they are modeled and served. “The domains select what Service Level Objectives their data products guarantee, and ultimately they are responsible for the satisfaction of their data products’ consumers.”
As you might suspect, despite the autonomy of the domains, there is a set of standards and global policies that all domains must follow. That’s how the mesh remains a functioning ecosystem.
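To make the split between domain autonomy and global standards a little more concrete, here is a minimal Python sketch. It is an illustration only, not something taken from the book: the `DataProduct` class, the `GLOBAL_REQUIRED_METADATA` set, and every field name are hypothetical. The idea is that each domain picks its own Service Level Objectives, while every data product must declare the same mesh-wide metadata.

```python
from dataclasses import dataclass, field

# Hypothetical mesh-wide standard: metadata every data product must declare.
GLOBAL_REQUIRED_METADATA = {"owner", "classification", "schema_version"}


@dataclass
class DataProduct:
    name: str
    domain: str
    # SLOs are chosen freely by the owning domain (local autonomy).
    slos: dict = field(default_factory=dict)
    # Metadata must satisfy the mesh-wide standard (global interoperability).
    metadata: dict = field(default_factory=dict)


def missing_global_metadata(product: DataProduct) -> set:
    """Return the globally required metadata keys the product has not declared."""
    return GLOBAL_REQUIRED_METADATA - set(product.metadata)


# The sales domain decides its own SLO values but forgets one global field.
orders = DataProduct(
    name="orders",
    domain="sales",
    slos={"freshness_minutes": 15, "availability_pct": 99.5},
    metadata={"owner": "sales-team", "classification": "internal"},
)

print(missing_global_metadata(orders))
```

In this sketch the global standard says nothing about which SLO values a domain picks; it only constrains what every product has to declare so that the mesh stays interoperable.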
Data Mesh relies on a governance operating model built on federated decision making by a team composed of domain owners and self-serve infrastructure platform stakeholders. This federated team then needs to consider the guiding values, policies, and incentives that optimize the value of the Data Mesh.
Federated Team
Data Mesh governance is not something you can outsource. It’s a collective responsibility, with clear accountability of domains. “Governance is composed of a cross-functional team of representatives from domains, platform and subject matter experts from security, compliance, legal, etc.”
Guiding Values
Values are the bedrock of any governance system. They guide how decisions are made, the scope of influence, whether the global governance function should address a specific concern, and how to resolve conflicts in decision making.
Policies
Guides, rules, and policies all define an expected level of correctness and steer us toward maintaining it, particularly when it comes to security. For data, this covers “Data accessibility, data quality, modeling data, and many other cross-functional characteristics of data shared on the mesh.”
Incentives
Implementing Data Mesh at an organization not only requires a shift in technology and architecture, but also depends on change management. Incentives can motivate and “impact the behavior of governance function and particularly in balancing the priorities of the domain representatives, between local and global priorities.”
Apply Computation to the Governance Model
In a fully functioning Data Mesh, the governance function is fully integrated and invisible. Data product owners and consumers wouldn’t notice it because it’s automated and abstracted by the platform, embedded into each data product, and seamlessly applied. This is what Zhamak calls computational governance.
Here are some ways that the platform can support governance policies, computationally:
- Standards As Code
- Policies As Code
- Automated Tests
- Automated Monitoring
As part of publishing a data product to the mesh, we can envision a number of automated checks that run and notify the data product owner whether or not the product is compliant.
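As a rough illustration of policies as code combined with an automated publish-time check, consider the Python sketch below. Everything in it is an assumption made for the example: the descriptor layout, the two policy functions, and `check_before_publish` are hypothetical names, not an API from the book or from any particular platform.

```python
from typing import Callable, Dict, List, Optional

# A policy inspects a data product descriptor and returns a violation
# message, or None when the product complies. (Hypothetical convention.)
Policy = Callable[[Dict], Optional[str]]


def pii_must_be_masked(product: Dict) -> Optional[str]:
    """Flag the first column marked as PII that is not masked."""
    for column in product.get("columns", []):
        if column.get("pii") and not column.get("masked"):
            return f"PII column '{column['name']}' is not masked"
    return None


def owner_must_be_declared(product: Dict) -> Optional[str]:
    """Every data product needs an accountable owner."""
    if not product.get("owner"):
        return "no owner declared"
    return None


# Global policies, defined once and applied to every data product.
GLOBAL_POLICIES: List[Policy] = [pii_must_be_masked, owner_must_be_declared]


def check_before_publish(product: Dict) -> List[str]:
    """Run every global policy; an empty list means the product is compliant."""
    violations = []
    for policy in GLOBAL_POLICIES:
        message = policy(product)
        if message is not None:
            violations.append(message)
    return violations


product = {
    "name": "customer_profiles",
    "owner": "crm-team",
    "columns": [
        {"name": "customer_id", "pii": False},
        {"name": "email", "pii": True, "masked": False},
    ],
}

violations = check_before_publish(product)
if violations:
    print("Not compliant:", violations)
else:
    print("Compliant, ready to publish")
```

In a real mesh the platform would presumably run checks like these automatically and route the result to the data product owner; the sketch only shows the shape of the idea, with policies written as small, testable functions.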
Download the latest copy of the pre-released book to learn more about each of these methods.
Putting It All Together
Data Mesh governance aims to improve current data governance methods “at the intersection of a decentralized value system meeting automation and computation.”
The Data Mesh governance model consists of three complementary pillars:
- Systems thinking, which involves looking at the “mesh as an ecosystem of interconnected data product and platform systems, and their independent and yet connected teams.”
- A federated operating model, built by creating a federated team of domain and platform owners. Create incentives! Give domains autonomy and hold them accountable, and leave cross-functional concerns and policies (compliance, access control, auditing, privacy, etc.) to be defined globally.
- Lastly, governance policies embedded into each data product through automation and computation. This makes your underlying data platform critically important, because the platform is what makes it easy to do the right thing.
To bring these three pillars together, the image below shows an example of this model:
Read along with us!
Get complimentary access now to pre-release chapters from the O’Reilly book Data Mesh: Delivering Data-Driven Value at Scale, authored by Zhamak Dehghani.