Building an Event-Driven Data Mesh: Patterns for Designing & Building Event-Driven Architectures
- Length: 259 pages
- Edition: 1
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2023-05-16
- ISBN-10: 1098127609
- ISBN-13: 9781098127602
- Sales Rank: #1623422
The exponential growth of data combined with the need to derive real-time business value is a critical issue today. An event-driven data mesh can power real-time operational and analytical workloads, all from a single set of data product streams. With practical real-world examples, this book shows you how to successfully design and build an event-driven data mesh.
Building an Event-Driven Data Mesh provides:
- Practical tips for iteratively building your own event-driven data mesh, including hurdles you’ll experience, possible solutions, and how to obtain real value as soon as possible
- Solutions to pitfalls you may encounter when moving your organization from monoliths to event-driven architectures
- A clear understanding of how events relate to systems and other events in the same stream and across streams
- A realistic look at event modeling options, such as fact, delta, and command type events, including how these choices will impact your data products
- Best practices for handling events at scale, privacy, and regulatory compliance
- Advice on asynchronous communication and handling eventual consistency
Table of Contents:
- Preface: Conventions Used in This Book; O’Reilly Online Learning; How to Contact Us; Acknowledgments
- 1. Event-Driven Data Communication: What Is Data Mesh?; An Event-Driven Data Mesh; Using Data in the Operational Plane; The Data Monolith; The Difficulties of Communicating Data for Operational Concerns; Strategy 1: Replicate data between services; Strategy 2: Use APIs to avoid data replication needs; The Analytical Plane: Data Warehouses and Data Lakes; The Organizational Impact of Schema on Read; Problem 1: Violated data model boundaries; Problem 2: Lack of single ownership; Problem 3: Do-it-yourself and custom point-to-point data connections; Bad Data: The Costs of Inaction; Can We Unify Analytical and Operational Workflows?; Rethinking Data with Data Mesh; Common Objections to an Event-Driven Data Mesh; Producers Cannot Model Data for Everyone’s Use Cases; Making Multiple Copies of Data Is Bad; There should only be a single master copy of the data, and all systems should reference it directly; It’s too computationally expensive to create, store, and update multiple copies of the same data; Managing information security policies across systems and distributed data sets is too hard; Eventual Consistency Is Too Difficult to Manage; Summary
- 2. Data Mesh: Principle 1: Domain Ownership; Domain-Driven Design in Brief; Selecting the Data to Expose from Your Domain; Principle 2: Data as a Product; Data Products Provide Immutable and Time-Stamped Data; Data Products Are Multimodal; Accessing a Data Product Via Push or Pull; The Three Data Product Alignment Types; Source-aligned data products; Aggregate-aligned data products; Consumer-aligned data products; Event-Driven Data Products as Inputs for Operational Systems; Principle 3: Federated Governance; Specifying Data Product Language, Framework, and API Support; Establishing Data Product Life Cycle Requirements; Establishing Data Handling and Infosec Policies; Identifying and Standardizing Cross-Domain Polysemes; Formalizing Self-Service Platform Requirements; Principle 4: Self-Service Platform; Discovering Data Products and Dependencies; Data Product Management Controls; Data Product Access Controls; Compute and Storage Resources for Building and Using Data Products; Providing Self-Service Through SaaS; Summary
- 3. Event Streams for Data Mesh: Events, Messages, and Records; What’s an Event Stream?; What Is It Not?; Ephemeral Message-Passing; Queuing; Consuming and Using Event-Driven Data Products; State Events and Event-Carried State Transfer; Materializing Events; Aggregating Events; The Kappa Architecture; The Lambda Architecture and Why It Doesn’t Work for Data Mesh; Supporting the Requirements for Kappa Architecture; Selecting an Event Broker; Summary
- 4. Federated Governance: Forming a Federated Governance Team; Implementing Standards; Supporting Multimodal Data Product Types; Supporting Data Product Schemas; Supporting Programming Languages and Frameworks; Metadata Standards and Requirements; Domain and owner; Tiered service levels; Data quality classifications; Privacy, financial, and custom tagging; Upstream metadata dependencies; Metadata wrap-up example; Ensuring Cross-Domain Data Product Compatibility and Interoperability; Defining and Using Common Entities; Event Stream Keying and Partitioning; Time and Time Zones; What Does a Governance Meeting Look Like?; 1. Identifying Existing Problems; 2. Drafting Proposals; 3. Reviewing Proposals; 4. Implementing Proposals; 5. Archiving Proposals; Data Security and Access Policies; Disable Data Product Access by Default; Consider End-to-End Encryption; Field-Level Encryption; Data Privacy, the Right to Be Forgotten, and Crypto-Shredding; Data Product Lineage; Topology-Based Lineage; Record-Based Lineage; Summary
- 5. Self-Service Data Platform: The Self-Service Platform Maturity Model; Level 1: The Minimal Viable Platform; The Schema Registry; An Extremely Basic Metadata Catalog; Connectors; Level 1 Wrap-Up: How Does It Work?; Level 2: The Expanded Platform; Full-Featured Metadata Catalog; The Data Product Management Service and UI; Service and User Identities; Basic Access Controls; Stream Processing for Building Data Products; Level 2 Wrap-Up: How Does It Work?; Level 3: The Mature Platform; Authentication, Identification, and Access Management; Integration with Existing Application Delivery Processes; Programmatic Data Product Management API; Monitoring and Alerting; Multiregion and Multicloud Data Products; Level 3 Wrap-Up: How Does It Work?; Summary
- 6. Event Schemas: A Brief Introduction to Serialization and Deserialization; What Is a Schema?; What Are Our Schema Technology Options?; Google’s Protocol Buffers, aka Protobuf; Apache Avro; JSON Schema; Schema Evolution: Changing Your Schemas Through Time; Negotiating a Breaking Schema Change; Step 1: Design the New Data Model; Step 2: Iterate with Your Existing Consumers and the Federated Governance Team; Step 3: Create a Release Schedule, a Data Migration Plan, and a Deprecation Plan; Step 4: Execute the Release; The Role of the Schema Registry; Best Practices for Managing Schemas in Your Codebase; Choosing a Schema Technology; Summary
- 7. Designing Events: Introduction to Event Types; Expanding on State Events and Event-Carried State Transfer; Current State Events; Before/After State Events; Delta Events; Event Sourcing with Delta Events; Why Delta Events Don’t Work for Event-Driven Data Products; There is an infinite set of possible event types; The logic to interpret the events must be replicated to each consumer; These events map poorly to event streams; Inversion of ownership: Consumers put their business logic into the producer; Inability to maintain historical data without excessive complications; Measurement Events; Measurement Events Often Form Aggregate-Aligned Data Products; Measurement Event Sources May Be Lossy; Measurement Events May Power Time-Sensitive Applications; Hybrid Events: State with a Bit of Delta; Notification Events; Summary
- 8. Bootstrapping Data Products: Getting Started: Bootstrapping with Connectors; Dual Writes; Polling the Database to Create Data Products; Change-Data Capture; Change-Data Capture Using a Transactional Outbox; Denormalization and Eventification; Eventification at the Transactional Outbox; Eventification in a Dedicated Service; What Should Go In the Event? And What Should Stay Out?; Slowly Changing Dimensions; Type 1: Overwrite with the new value; Type 2: Append the new value; Bootstrapping Cloud Storage Files to an Event Stream; Summary
- 9. Integrating Event-Driven Data into Data at Rest: Analytics and the Medallion Architecture; Connecting Event Streams Into Existing Batch-Data Flows; Through the Lens of Data Mesh: What’s Going On?; Through the Lens of Data Mesh: How Do We Solve It?; Balancing File Sizes, SLAs, and Latency; Budget Blues: A Tale of Overspending; Extending the Self-Service Platform for Nonstreaming Data Products; Summary
- 10. Eventual Consistency: Converging on Consistency, One Event at a Time; Strategies for Dealing with Eventual Consistency; Prevent Failures to Avoid Inconsistency; Use Event-Driven Data Products Instead of Request-Response Server API Calls; Expose Eventual Consistency in the Server Response; Plan for New Services and Reprocessing of Data; Synchronize Data Products on Time Boundaries; Out-of-Order Events; Resolving Late-Arriving Events; Summary
- 11. Bringing It All Together: Event Streams for Data Mesh; Integrating with Existing Systems; Operations, Analytics, and Everything in Between; Summary
- Index
How to download source code?
1. Go to: https://www.oreilly.com/
2. Search for the book title: Building an Event-Driven Data Mesh: Patterns for Designing & Building Event-Driven Architectures. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.