Data mesh
Distributed architecture framework for data management
From Wikipedia, the free encyclopedia
Data mesh is a sociotechnical approach to building a decentralized data architecture through a domain-oriented, self-serve design (from a software development perspective). It borrows from Eric Evans’ theory of domain-driven design[1] and from Manuel Pais’ and Matthew Skelton’s theory of team topologies.[2] Data mesh concerns itself mainly with the data itself, treating the data lake and the pipelines as secondary concerns.[3] Its main proposition is scaling analytical data through domain-oriented decentralization.[4] With data mesh, responsibility for analytical data shifts from the central data team to the domain teams, supported by a data platform team that provides a domain-agnostic data platform.[5] This reduces data disorder and isolated data silos, because a centralized system ensures that fundamental principles are applied consistently across the nodes of the data mesh and allows data to be shared across different areas.[6]
History
The term data mesh was first defined in 2019 by Zhamak Dehghani[7] while she was working as a principal consultant at the technology company Thoughtworks.[8][9] She provided greater detail on its principles and logical architecture throughout 2020. The approach was predicted to be a “big contender” for companies in 2022.[10][11] Data meshes have been implemented by companies such as Zalando,[12] Netflix,[13] Intuit,[14] VistaPrint and PayPal,[15] among others.
In 2022, Dehghani left Thoughtworks to found Nextdata Technologies to focus on decentralized data.[16]
Principles
Data mesh is based on four core principles:[17]
- domain ownership;
- data as a product;
- self-serve data platform;
- federated computational governance.
In addition to these principles, Dehghani writes that the data products created by each domain team should be discoverable, addressable, trustworthy, self-describing in their semantics and syntax, interoperable, secure, and governed by global standards and access controls.[18] In other words, the data should be treated as a product that is ready to use and reliable.[19][20]
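The attributes listed above can be made concrete with a small descriptor that a domain team might publish alongside its data. The following is a minimal sketch, not an implementation taken from Dehghani's writing: the `DataProduct` class, its field names, the `warehouse://` address scheme, and the `missing_attributes` helper are all hypothetical, and the checks only test whether a field is filled in. Attributes such as trustworthiness and interoperability would need richer checks than this.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for one data product."""
    name: str
    domain: str                       # owning domain team (domain ownership)
    address: str = ""                 # stable location consumers read from
    description: str = ""             # plain-language meaning, aids discovery
    schema: dict[str, str] = field(default_factory=dict)    # column -> type
    allowed_roles: set[str] = field(default_factory=set)    # access control


def missing_attributes(product: DataProduct) -> list[str]:
    """Return the usability attributes this descriptor does not yet satisfy."""
    checks = {
        "discoverable": bool(product.description),
        "addressable": bool(product.address),
        "self-describing": bool(product.schema),
        "secure": bool(product.allowed_roles),
    }
    return [name for name, ok in checks.items() if not ok]


# Illustrative product owned by a hypothetical "sales" domain.
orders = DataProduct(
    name="orders_daily",
    domain="sales",
    address="warehouse://sales/orders_daily",
    description="One row per order, refreshed daily.",
    schema={"order_id": "string", "amount": "decimal"},
)
print(missing_attributes(orders))  # prints ['secure']
```

In this sketch the product is flagged only because no access roles have been declared. A self-serve platform could run a check of this kind automatically before listing a product in a catalog.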
In practice
After its introduction in 2019,[7] multiple companies started to implement a data mesh[12][14][15] and share their experiences. Challenges (C) and best practices (BP) for practitioners include:
- C1. Federated data governance
  - Companies report difficulty adopting a federated governance structure for activities and processes that were previously centrally owned and enforced. This is especially true for security, privacy, and regulatory topics.[21][22][23]
- C2. Responsibility shift
  - In a data mesh, individuals within domains are responsible end to end for data products. This new responsibility can be challenging because it is rarely compensated and usually benefits other domains.[21][22]
- C3. Comprehension
  - Research has shown a severe lack of comprehension of the data mesh paradigm among employees of companies implementing a data mesh.[21]
- BP1. Cross-domain unit
  - Addressing C1, organizations should introduce a cross-domain steering unit responsible for strategic planning, use case prioritization, and the enforcement of specific governance rules, especially those concerning security, regulatory, and privacy-related topics. Nevertheless, a cross-domain steering unit can only complement and support the federated governance structure and may become obsolete as the data mesh matures.[21][24]
- BP2. Track and observe
  - Addressing C2, organizations should observe and score data product quality, because tracking and ranking key data products can encourage high-quality offerings, motivate domain owners, and support budget negotiations.[21]
- BP3. Conscious adoption
  - Addressing C3, organizations should thoroughly assess their existing data systems, consider organizational factors, and weigh the potential benefits before implementing a data mesh. When adopting the approach, they are advised to introduce data mesh terminology carefully and consciously to ensure a clear understanding of the concept.[21]
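The scoring and ranking described under BP2 could be sketched as follows. This is an illustration under stated assumptions rather than a method prescribed by the cited sources: the metric names (`completeness`, `freshness`, `documentation`), the weights, and the `QualityReport`, `score`, and `rank` names are all hypothetical, and each metric is assumed to be already normalized to a value between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical weights; an organization would agree on its own metrics and
# weights through its governance process.
WEIGHTS = {"completeness": 0.5, "freshness": 0.3, "documentation": 0.2}


@dataclass
class QualityReport:
    product: str
    domain: str
    metrics: dict[str, float]   # each metric normalized to the range 0.0-1.0


def score(report: QualityReport) -> float:
    """Weighted sum of the tracked metrics; a missing metric counts as 0."""
    return sum(w * report.metrics.get(m, 0.0) for m, w in WEIGHTS.items())


def rank(reports: list[QualityReport]) -> list[tuple[str, float]]:
    """Return (product, score) pairs ordered from highest to lowest score."""
    scored = [(r.product, round(score(r), 2)) for r in reports]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative reports from two hypothetical domains.
reports = [
    QualityReport("customer_profiles", "marketing",
                  {"completeness": 0.8, "freshness": 1.0}),
    QualityReport("orders_daily", "sales",
                  {"completeness": 1.0, "freshness": 0.5, "documentation": 1.0}),
]
for product, value in rank(reports):
    print(product, value)
```

Treating a missing metric as zero is a deliberate choice in this sketch: a product that reports nothing about its documentation ranks lower than one that does, which gives domain owners a reason to publish the measurement.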
Community
Scott Hirleman started a data mesh community whose Slack channel has over 7,500 members.[25]
See also
- Data product
- Data management
- Data platform
- Data vault modeling, a method of data modeling that stores data from various operational systems and traces data origin, facilitating auditing, loading speed, and resilience
- Data warehouse, a well-established type of database system for organizing data thematically
- ETL and ELT
References