We're at an inflection point in data, where our data management solutions no longer match the complexity of organizations, the proliferation of data sources, and the scope of our aspirations to get value from data with AI and analytics. In this practical book, author Zhamak Dehghani introduces data mesh, a decentralized sociotechnical paradigm drawn from modern distributed architecture that provides a new approach to sourcing, sharing, accessing, and managing analytical data at scale. Dehghani guides practitioners, architects, technical leaders, and decision makers on their journey from traditional big data architecture to a distributed and multidimensional approach to analytical data management. Data mesh treats data as a product, considers domains as a primary concern, applies platform thinking to create self-serve data infrastructure, and introduces a federated computational model of data governance.
The concept (data mesh) itself is very interesting, and I can't deny the effort put into the creation of the book. But not everything went according to the author's plan (IMHO):
1. It feels like the author wants to apply the mechanisms we know from DDD and (properly shaped) microservices into the world of data - that sounds like a good idea, but there was not enough space/attention dedicated to problems specific to data: its inertia, the differences between data and query distribution, etc.
2. The author claims she wanted the book tech-agnostic (again: great idea), but in the end, I couldn't help the impression it's a convenient excuse to avoid inconvenient questions (e.g. about analytical queries)
3. I adore the concept of data product, but ... it can easily be implemented on the top of a data lake - imagine that each layer consists of private and public schemas and the latter ones are considered as contracts - layer n+1 can rely only on contracts from layers up to n
4. The concept is very simple, and it's presented in the initial section of the book; what you get later is a lot of repetition w/o practical advice or (what's even worse) any useful examples - and that's probably the biggest drawback of the book: it's by far too dry and theoretical
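To make point 3 a bit more tangible, here is a minimal Python sketch of the layering rule, assuming each schema in the lake is tagged with a layer number and a public/private flag. The names `Schema`, `Lake`, and `can_depend`, as well as the example schema names, are hypothetical; they come neither from the book nor from any data lake product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Schema:
    layer: int    # position of the schema in the lake's layering
    name: str
    public: bool  # public schemas act as contracts, private ones are internal


@dataclass
class Lake:
    schemas: dict = field(default_factory=dict)  # schema name -> Schema

    def add(self, schema: Schema) -> None:
        self.schemas[schema.name] = schema

    def can_depend(self, consumer: str, producer: str) -> bool:
        """Layer n+1 may rely only on public schemas from layers up to n."""
        c, p = self.schemas[consumer], self.schemas[producer]
        return p.public and p.layer < c.layer


# Hypothetical example: two layers, one private staging schema.
lake = Lake()
lake.add(Schema(layer=1, name="orders_staging", public=False))
lake.add(Schema(layer=1, name="orders_v1", public=True))
lake.add(Schema(layer=2, name="revenue_daily", public=True))

allowed = lake.can_depend("revenue_daily", "orders_v1")         # public, lower layer
leaky = lake.can_depend("revenue_daily", "orders_staging")      # private schema
backwards = lake.can_depend("orders_v1", "revenue_daily")       # higher layer
```

A check like this could run in CI against declared dependencies, which is roughly what "the public schema is the contract" would mean in practice.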
It's quite possible that Data Mesh will be The Next Big Thing (or rather buzzword) - this book can be a source of inspiration, but it definitely won't tell you how to build one (data mesh). To be honest, the author admits there are still gaps, but what she does not do is successfully "credentialize" the concept (show why it's one that doesn't just sound interesting but is also feasible to build).
If you are looking for a resource to explain what Data Mesh is, its core principles, origin, main advantages, pitfalls, etc., then this book has you covered.
If you are looking for a "playbook" for how to implement it, look elsewhere.
It is a very prescient work in the sense that the author deeply understands data, and the problems of working with data that originate from modern centralized structures (a.k.a. the "modern data stack" - lakes, warehouses, etc.). She prescribes a methodology for confronting these issues through strategy, organizational changes, and forward-thinking architectures.
Part I tells you what Data Mesh is and the underlying principles. Part II tells you the "why" of Data Mesh. Parts III & IV tell you how to design the various components, and part V tells you how to get started.
The concept of "data as a product" is very compelling, and this section is extremely well done. I had first gotten a glimpse of crude data products when studying Snowflake's Marketplace, which is in essence a catalog of datasets with a basic implementation of the SLOs (service-level objectives) and sharing features as prescribed by Zhamak Dehghani in this book.
The reason for the low rating has to do with the fact that beyond explaining what it is, and why we should care, there's hardly anything more of value for the interested reader. (Note: perhaps if you are already a seasoned veteran of Data Engineering you might see a clear way to implement these concepts. I am not, and my rating reflects the applicability of the material given that lack of exposure.)
It is far, far too high-level, and too theoretical to be of any use to somebody looking for a way to start. Some sections are downright philosophical. The author is aware of this and mentions it toward the end of the book.
It gets very terse and repetitive as well. You will read entire re-worded paragraphs from previous sections over and over.
Honestly, I kinda loved the book but each consecutive chapter reinforced my view that it needed a more exacting editor.
I would say that anyone working with data should read chapters 1-5 (Part I) and maybe Part II as well, since it presents a clear picture of the problems with the modern data stack. Parts III, IV, and V are unlovable.
I would perhaps end by saying that it feels like this is a text that is ahead of its time, as the implementation of these ideas is still beyond the reach of most with the technology and organizational structures of today (unless you are Spotify, Google, etc...)
Poorly written, poorly edited, this is a jumble of loosely connected ideas that don't flow well into a coherent narrative. There are parts of the book where the same paragraph is repeated verbatim within a matter of a couple pages. Ultimately Dehghani has taken a solid collection of interesting, if untested, ideas from her blog post and expanded them with enough fluff and barely relevant tangents to fill a book.
I highly recommend that you instead read "Data Management at Scale" by Piethein Strengholt, which espouses a nearly identical philosophy that's presented in a much clearer, more logical way. As far as I can tell, Strengholt's book predates the whole "Data Mesh" buzz, or at least evolved in parallel, but went largely unnoticed; meanwhile, Dehghani gets all the credit, to the point where even Strengholt has adopted her language.
Has your data organization become a bottleneck as the monolith grows bigger and each engineer has to become more and more specialized, consuming their cognitive capacity and losing motivation? All this while the rest of the organization fights with each other to get prioritized? If not, you’ll get there… and Data Mesh is the medicine you need.
Probably old news, as Martin Fowler’s blog brought this to light a while ago (a must-read if you haven’t). But here you’ll go deeper into the core principles (especially DDD and Product Design) and how to translate these into concrete actions and guidelines in your data platform.
This is the first and most comprehensive treatment of the data mesh concept. And even though it brings clarity and defines certain practices, as a first treatment of the topic written by the originator of the idea, it suffers when it comes to pragmatism and applicability.
I love the clarity of the 4 principles and the idea of the Data Product, and anchoring the whole concept in Domain-Driven Design (DDD) and Team Topologies definitely helps to draw parallels and make it more relevant.
Making the book tech-agnostic (a great idea on the surface) also did not help in the end - especially when it comes to providing blueprints, implementations, and exact solutions for certain classes of problems.
Overall, my impression was that the whole concept is genius but simple, yet it's presented in an overly complicated way. What you get later is a lot of repetition, little to no practical advice, and no useful examples - even though we get examples from a non-trivial domain, they are still too abstract, and the lack of detail leaves a ton of ambiguity. It would be great if it were a bit more pragmatic - as it stands, it's too dry and theoretical.
Great book for introducing all the data mesh concepts and principles. It is a bit frustrating to be exposed only to principles and then have to do additional digging to figure out which tools could be a good fit for a decentralized data platform.
So I guess this frustration is to be factored in when reading the book. Don't expect ready-made reference implementations. As a result, this book is for senior tech people who can turn concepts into concrete implementations.
Also, the concept of data mesh is really the concept of microservices applied to the analytical world. Thinking this way will make the core concepts much easier to understand for people familiar with microservices architectures. You'll be able to map them onto known patterns.
Anyway, I don't see any other path than this type of architecture as this is how Tech is always moving: Platform + decentralized "products"
Dehghani presents the data mesh as a distributed alternative to the centralised data system and defines it in terms of four principles, listed in order of importance.
- Domain Ownership – this says that analytical data is owned by the domains that generate it rather than a centralised data team;
- Data as a product – analytical data is owned as a product, with the associated management, discoverability, quality standards and so forth around it;
- Self-serve data platform – a self-serve data platform is introduced which makes the process of domain ownership of data products easier, delivering the self-contained infrastructure and services that the data product defines;
- Federated computational governance – this is the idea that policies such as access control, data retention, encryption requirements, and actions such as the “right to be forgotten” are determined centrally by a governance board but are stored, and executed, in machine-readable form by data products.
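The fourth principle is the one that is hardest to picture, so here is a rough Python sketch of what "determined centrally, executed by the data product" might look like. It assumes the policy is nothing more than a small data object handed to each product; `RetentionPolicy`, `Record`, `DataProduct`, and the example values are all hypothetical and are not taken from the book.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Decided centrally by the governance board, distributed as plain data."""
    max_age_days: int


@dataclass
class Record:
    subject_id: str
    created: date


class DataProduct:
    """Owns its records and executes the central policy locally."""

    def __init__(self, name, policy, records):
        self.name = name
        self.policy = policy
        self.records = list(records)

    def enforce_retention(self, today: date) -> int:
        """Drop records older than the policy allows; return how many were dropped."""
        cutoff = today - timedelta(days=self.policy.max_age_days)
        before = len(self.records)
        self.records = [r for r in self.records if r.created >= cutoff]
        return before - len(self.records)

    def forget(self, subject_id: str) -> int:
        """'Right to be forgotten': remove every record of one data subject."""
        before = len(self.records)
        self.records = [r for r in self.records if r.subject_id != subject_id]
        return before - len(self.records)


# Hypothetical example data.
product = DataProduct(
    name="orders",
    policy=RetentionPolicy(max_age_days=365),
    records=[
        Record("u1", date(2023, 1, 1)),
        Record("u2", date(2024, 5, 1)),
        Record("u1", date(2024, 5, 20)),
    ],
)
expired = product.enforce_retention(today=date(2024, 6, 1))
erased = product.forget("u1")
```

The point of the sketch is only the division of labour: the board changes `RetentionPolicy`, and every product applies it without a central team touching the data.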
I did not think that this adds much value to the blog post. I totally agree with most of the things pointed out there. However, the book gives no concrete steps to implement data mesh, and is not much more than a collection of ideas - some of them are good ones, though. Often, the book even uses "we are at an early stage" as an excuse.
To me, this is more to be looked at as a data strategy book.
A book to help you understand the Data Mesh revolution
By Paul Laughlin · May 2, 2023
If you are anything like me, you start from a healthy scepticism for new technology buzzwords, like Data Fabric or Data Mesh. All too often these have proven to be just marketing hype by IT suppliers who are rebranding old solutions. However, the risk of such a reaction is you can miss what is important new thinking. I recently completed reading “Data Mesh: Delivering Data-Driven Value at Scale” by Zhamak Dehghani. It has convinced me that there is much more to Data Mesh than I realised and that it matters. Unlike all those over-hyped IT supplier pitches, it painstakingly explains both the need for a new ‘inflection point’ in our approach to managing data and how data mesh can help. She also clarifies exactly what data mesh means, both in theory and in detailed logical design.
So, although this is a much more technical text than I would usually review on this blog, I consider it important enough to include it. Understanding the principles of a data mesh approach to producing, maintaining, discovering & sharing the data needed within an organisation could revolutionise your approach. So, I encourage data leaders to get their heads around this new vision of what Zhamak describes as a sociotechnical approach to managing data. It has implications far beyond the hardware & software involved.
How this book shares what Data Mesh means in practice
It is amazing how much detail Zhamak manages to include within this book, for two reasons. First, it is patiently explained step by step, so even those whose specialism is not data management or data engineering can grasp all the concepts. Second, as she honestly highlights at several points, this is still an emerging field. Many of the theoretical components are not yet available “off the shelf”. This could have made this book feel like it had been written too soon or was trying to hit a moving target.
Zhamak manages to avoid both pitfalls by both laying thorough theoretical foundations & by staying at a logical level of implementation design (but still sufficiently detailed to be helpful to practitioners).
So, what will readers discover in this book? The material is divided into 5 parts. Firstly, explaining the concept of data mesh and each of the key principles. Secondly, making the case for why data mesh and the benefits it can deliver after the inflection point of change. Thirdly, how to design the key components of a Data Mesh Architecture. Fourthly, how to design the key services of a Data Mesh Product Architecture. Lastly, how to get started in terms of both systems and organisational culture.
What makes this book so brilliant is how Zhamak manages to effectively communicate with multiple levels of readers. Data leaders will find parts 1 & 2 the most relevant. Once they are convinced they can share parts 3 & 4 with their teams to help them get to grips with the practical implications, especially in terms of architecture and design. Then they can both read and plan together using part 5. A recipe for a book that will not just be read once, but kept as a reference guide as this technology matures.
But what is a Data Mesh?
Put simply, it’s a different approach to storing, managing, sharing, using & deploying data to generate value. Within this book, Zhamak makes a convincing case for why we have reached the end of the centralised approach to data management. Relying on a central warehouse or lake and data team is no longer viable. The speed of change & complexity of today’s organisations, coupled with the proliferation of potential data sources and growth in expectations of use cases, all conspire to ask more than the old approach can deliver.
So, what is different with Data Mesh?
At its simplest it comes down to the four principles that are explained in this book (both in theory & at a design level):
1. Domain-Driven Design (business domains own data & responsibilities)
2. Data Products (encapsulated data & code, supporting services to use that data properly)
3. Self-Serve Data Platform (delivering a network of domains that effectively share data products)
4. Federated Computational Governance (semi-automated solutions embedded into all the above)
It is only when you see the traditional problems that each element solves that you begin to grasp the brilliance of this solution. It is a radical change from traditional ways of working. But, as technology & organisations catch up it offers a far more capable vision of more adaptive data usage in businesses.
What are some of the key takeaways for data leaders?
The first I think is that things cannot continue as they are. Even if you are not ready to make a major change like the transition to a Data Mesh approach, I recommend reading the start of this book. That is because it effectively critiques why the current approach is doomed to failure. Zhamak shines a light on how & why a central data team is doomed to failure in the light of emerging needs. It is not possible for one team to understand and respond to the changing realities and needs of the rest of a business. Especially as the scope of data usage extends to most functions within each business domain.
The other key challenge is organisational redesign and accountability. Perhaps the most challenging aspect of migrating to such a federated approach is the responsibility it places on areas that have avoided it up till now. To achieve this approach requires data expertise & technology expertise within each domain. Domain leaders will have to go far beyond fine words to actually act as data owners in practice. It is needed, but it is also a significant culture change.
As the author puts it “...executing data mesh needs a multifaceted organizational change. Iteratively and along with delivery of your data mesh thin slices, I encourage you to look at modifying all facets of organisational design decisions...”
Lastly, I would recommend reading this book to get ready for the future. As you work through the detail in the logical design chapters you realise how many components are not yet readily available. Pioneers will need to build key parts of this approach or work around limited systems. That will delay mass adoption, but also drive the development of future packaged solutions. As with all advances in technology, progress will probably be slower than we imagine now. However, the later implications may well be larger than we can imagine now. So, I advise data leaders to get their heads around this sooner rather than later. Otherwise, they too may end up as outdated as data warehouses are beginning to look.
It really bothers me when I feel like the author is wasting my time - this is one of those times. Don't get me wrong, the concepts laid out are interesting, but as pointed out by many reviewers, they could well have been summarized in half the pages. "The concept is very simple, and it's presented in the initial section of the book; what you get later is a lot of repetition w/o practical advice or (what's even worse) any useful examples - and that's probably the biggest drawback of the book: it's by far too dry and theoretical." There are far too many definitions that create confusion, and the book remains far too theoretical. This is what ChatGPT has to say about Data Mesh, and unfortunately for the author, it covers most of what you need to know about this movement.
Data mesh is a decentralized approach to data architecture that aims to overcome the limitations of traditional centralized data systems, particularly in large and complex organizations. It was introduced by Zhamak Dehghani in 2019. The core idea behind data mesh is to treat data as a product and to manage data ownership and responsibilities in a decentralized way, much like how microservices are managed in software development. Here are the key principles and components of data mesh:
Domain-Oriented Data Ownership: Data is owned by the teams that know the data best, typically the ones that generate it. Each domain team is responsible for the data it produces, ensuring high quality and relevance.
Data as a Product: Data is treated as a product with its own lifecycle, including development, maintenance, and deprecation. Domain teams are responsible for delivering their data in a way that is easily discoverable, understandable, and usable by others.
Self-Serve Data Infrastructure: A self-service infrastructure platform is provided to domain teams to enable them to manage their data independently. This platform typically includes tools for data storage, processing, governance, and access control.
Federated Computational Governance: Governance is implemented in a federated manner, balancing global standards with local autonomy. This involves establishing policies and standards that are enforced across all domains while allowing domains the flexibility to manage their own data.
Components of Data Mesh
Domain Data Products: These are datasets produced by different domain teams, designed to be used by other teams. Each data product comes with a clear contract, including schema, SLAs, quality metrics, and documentation.
Data Platform: A central platform provides common infrastructure services like data storage, processing, and security. The platform abstracts away the complexities of underlying technologies, allowing domain teams to focus on their data products.
Governance and Standards: Policies and standards are established to ensure data quality, security, and compliance. Governance is implemented in a federated manner, with responsibilities distributed across domain teams.
Interoperability and Communication: Mechanisms are put in place to ensure that data products from different domains can be easily integrated and used together. This may involve standardizing on formats, interfaces, and communication protocols.
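The "clear contract" mentioned under Domain Data Products stays abstract in the summary above, so here is a small Python sketch of what such a contract could check. It assumes a contract is just a schema, a freshness SLA, one completeness metric, and a documentation string; `Contract`, `violations`, and the sample rows are hypothetical illustrations, not an API from the book or from any tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Contract:
    schema: dict              # column name -> expected Python type
    max_staleness: timedelta  # freshness SLA
    min_completeness: float   # quality metric: share of non-null cells
    docs: str                 # human-readable documentation


def violations(contract, rows, last_updated, now):
    """Return a list of human-readable contract violations (empty if none)."""
    problems = []
    cols = contract.schema
    for i, row in enumerate(rows):
        if set(row) != set(cols):
            problems.append(f"row {i}: columns differ from schema")
            continue
        for col, typ in cols.items():
            value = row[col]
            if value is not None and not isinstance(value, typ):
                problems.append(f"row {i}: {col} is not {typ.__name__}")
    cells = len(rows) * len(cols)
    filled = sum(row.get(col) is not None for row in rows for col in cols)
    if cells and filled / cells < contract.min_completeness:
        problems.append(
            f"completeness {filled / cells:.2f} below {contract.min_completeness:.2f}"
        )
    if now - last_updated > contract.max_staleness:
        problems.append("data is stale")
    return problems


# Hypothetical example: one missing amount breaks the completeness target.
contract = Contract(
    schema={"order_id": int, "amount": float},
    max_staleness=timedelta(hours=24),
    min_completeness=0.9,
    docs="One row per order; amount in EUR.",
)
rows = [{"order_id": 1, "amount": 9.5}, {"order_id": 2, "amount": None}]
problems = violations(
    contract,
    rows,
    last_updated=datetime(2024, 1, 1, 0, 0),
    now=datetime(2024, 1, 1, 6, 0),
)
```

A consuming team could run a check like this before relying on the product, which is one plausible reading of "contract" here.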
Benefits of Data Mesh
Scalability: By decentralizing data ownership and management, organizations can scale their data practices more effectively. Each domain team can work independently, avoiding bottlenecks associated with centralized data teams.
Agility: Domain teams can develop and iterate on their data products more quickly, responding to changing business needs. This leads to faster innovation and time-to-market for data-driven initiatives.
Quality and Relevance: Data ownership by domain teams ensures that the people most familiar with the data are responsible for its quality and relevance. This leads to higher quality data that is more aligned with business needs.
Collaboration and Reuse: Data mesh promotes a culture of data sharing and reuse, making it easier for teams to discover and use data from other domains. This reduces duplication of effort and leads to more efficient use of data resources.
Challenges and Considerations
Cultural Change: Implementing data mesh requires a significant cultural shift, as teams need to take on new responsibilities for data ownership and product management. Organizations need to invest in training and change management to support this transition.
Complexity: Managing a decentralized data architecture can introduce new complexities, particularly around governance and interoperability. It requires careful planning and robust tooling to ensure that data remains discoverable, usable, and compliant.
Technology and Tooling: Building a self-serve data platform requires significant investment in technology and infrastructure. Organizations need to ensure they have the right tools and platforms to support the needs of their domain teams.
Data mesh represents a significant shift in how organizations manage and utilize their data. By decentralizing data ownership and treating data as a product, organizations can become more agile, scalable, and effective in their use of data. However, successful implementation requires careful planning, investment in infrastructure, and a commitment to cultural change.
Over the past week, I finished reading the book on Data Mesh, which I initially began following during its writing process. I had read up to Chapter 5 at that early stage, even though I received a complimentary edition of the book a few years ago. Finally completing it took longer than I had planned. I may have previously shared some thoughts on the book, so I’ll keep this review brief.
First and foremost, the introduction of distributed data management and governance in the book is revolutionary. We should all commend author Zhamak Dehghani for her bold, countercurrent thinking and the courage she demonstrated in solidifying this concept. Initially, I was skeptical, as I was deeply invested in Data Fabric, and I thought it was too soon to embrace another architectural model to manage data as a strategic asset within organizations. However, through numerous discussions, debates, and insights shared on LinkedIn and other forums, I’ve come to realize that both Data Fabric and Data Mesh are essential for any organization serious about data management.
These two distinct architectural concepts—Data Fabric as the technology architecture and Data Mesh as the operational architecture—serve a common goal, but they can seem like a hair-splitting distinction to many. To truly grasp their value, I believe individuals should step back, approach the concepts with an open mind, and read this book along with others on the topic. This will help them build an effective data infrastructure and operational model to treat data as a strategic asset in their business.
Personally, my reading of this book has been invaluable. It has enabled me to lead the study, evaluation, and onboarding of technologies to support Data Fabric while keeping an eye on the future establishment of Data Mesh. As we move forward with these efforts, we are also integrating AI-driven DataOps and Digital Data Stewardship activities into the setup. AI-driven DataOps focuses on automating many of the data management activities, while Digital Data Stewardship will scale our data governance efforts. Ironically, had we not embraced Data Fabric as our data technology architecture, we would now be struggling to establish distributed data management and governance practices—the very essence of Data Mesh.
Reflecting on these efforts brings a great sense of satisfaction, as we approached the challenge with an open mind, embracing modern technical concepts through Lean-Agile practices. These practices encourage risk-taking and a "fail fast" mentality. I highly recommend this book to Data Practice Leaders looking to make revolutionary changes, rather than just incremental upgrades to existing technology architectures. There’s a thrill in making transformational changes happen.
I'm a big fan of Domain Driven Design and have implemented it multiple times in the systems I contributed to. So I had high expectations for the idea of bringing it to Data Analytics. Unfortunately, this bloated and nebulous book did not deliver. Sometimes I thought that it's a parody and the book should be named "Wittgenstein on Data Mesh". The author describes issues with current architectures, e.g. with Data Lake pipelines, and suggests... not talking about them and leaving them as an implementation detail. Bingo, problem solved. The concept of "Data Product" itself is interesting, but there is no discussion of why Data Products can't be created on top of a Data Lake. And it's interesting that the author mentions that DW was introduced in the 1960s, but does not mention that Distributed Computing was introduced around the same time.
Some fascinating ideas but a challenging read. The text is very bloated with significant redundant sections. Each section seems to have been written in isolation so you’ll see the same points repeated many times between sections.
This book focuses on theory rather than practice. The author excuses this by saying that the technologies to facilitate data mesh properly do not yet exist. However, it would have been interesting to have an exploration of how data mesh principles can be applied to build a data platform using currently available technologies.
Bringing software design principles to the data world
Obvious once written, but obscured by years and years of siloed specialization: Zhamak Dehghani brings data analytics back into the fold of proper software development principles. First and foremost Domain-Driven Design (Evans), and secondly how to manage complexity. Highly recommended for every engineer and manager working on data-intensive applications at enterprise scale.
There is lots of good information here, and lots to learn (even outside Data Mesh).
I love the idea of Data Mesh, and would love to be involved in some kind of implementation of it someday.
That said, I do think there are some industry-wide technology deficiencies that will need to improve before this becomes easy to adopt. We are still in an early adopter phase. Our technology can do this; it will just take a significant level of effort.
I listened to the audiobook version. I couldn't bear it, unfortunately. While the book defines the challenges and the abstract concepts sufficiently, it feels very repetitive and lacks concrete solutions or actionable takeaways. It does not seem to add much value over the author's original post on Fowler's blog.
It's a decent book introducing a new concept. However, a lot of the material is on peripheral aspects like people management, product management, etc., which was not expected. It's ideal as an audiobook, and overall I am happy to get exposure to an upcoming trend.
It challenges the traditional idea of how data should be organized in organizations. But it is too theoretical and in the end felt like a transfer of microservices foundations to data. It is poorly written as well. It is more of an organization / strategy book.
Good to finally read through this, it was a bit of a slow read and lots of conceptual framing etc.
Felt a bit repetitive in places but decent for a v1. I expect v2 will be a lot punchier and help close out many of the emerging topics that are still to be matured.
Data Mesh is Domain Driven Design and Team Topologies applied to data analytics. It represents a significant move towards an outcome-driven agile approach.
The book provides extensive explanations, but its abstract perspective can sometimes make it challenging to follow.
Unstructured and overly complicated language. Simple concepts are presented in an overly complex, cryptic, and repetitive manner, and I don't know why. The only reason to read it is that Zhamak introduced this idea.