Data Management at Scale: Modern Data Architecture with Data Mesh and Data Fabric, 2nd Edition
- Length: 409 pages
- Edition: 2
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2023-05-16
- ISBN-10: 1098138864
- ISBN-13: 9781098138868
- Sales Rank: #102436
As data management continues to evolve rapidly, managing all of your data in a central place, such as a data warehouse, is no longer scalable. Today’s world is about quickly turning data into value. This requires a paradigm shift in the way we federate responsibilities, manage data, and make it available to others. With this practical book, you’ll learn how to design a next-gen data architecture that takes into account the scale you need for your organization.
Executives, architects and engineers, analytics teams, and compliance and governance staff will learn how to build a next-gen data landscape. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.
- Examine data management trends, including regulatory requirements, privacy concerns, and new developments such as data mesh and data fabric
- Go deep into building a modern data architecture, including cloud data landing zones, domain-driven design, data product design, and more
- Explore data governance and data security, master data management, self-service data marketplaces, and the importance of metadata
Foreword Preface Why I Wrote This Book and Why Now Who Is This Book For? How to Read or Use This Book Conventions Used in This Book O’Reilly Online Learning How to Contact Us Acknowledgments
1. The Journey to Becoming Data-Driven Recent Technology Developments and Industry Trends Data Management Analytics Is Fragmenting the Data Landscape The Speed of Software Delivery Is Changing The Cloud’s Impact on Data Management Is Immeasurable Privacy and Security Concerns Are a Top Priority Operational and Analytical Systems Need to Be Integrated Organizations Operate in Collaborative Ecosystems Enterprises Are Saddled with Outdated Data Architectures The Enterprise Data Warehouse: A Single Source of Truth The Data Lake: A Centralized Repository for Structured and Unstructured Data The Pain of Centralization Defining a Data Strategy Wrapping Up
2. Organizing Data Using Data Domains Application Design Starting Points Each Application Has a Data Store Applications Are Always Unique Golden Sources The Data Integration Dilemma Application Roles Inspirations from Software Architecture Data Domains Domain-Driven Design Bounded contexts Ubiquitous language Business Architecture Business capabilities Linking business capabilities with applications Capability realizations Shared capabilities Complex applications Domain Characteristics Patterns for complex integration challenges Strengths of business capability modeling Principles for Distributed and Domain-Oriented Data Management Design Principles for Data Domains Best Practices for Data Providers Domain Ownership Responsibilities Transitioning Toward Distributed and Domain-Oriented Data Management Wrapping Up
3. Mapping Domains to a Technology Architecture Domain Topologies: Managing Problem Spaces Fully Federated Domain Topology The elephant in the room Governed Domain Topology Partially Federated Domain Topology Value Chain–Aligned Domain Topology Coarse-Grained Domain Topology Coarse-Grained and Partially Governed Domain Topology Centralized Domain Topology Picking the Right Topology Landing Zone Topologies: Managing Solution Spaces Single Data Landing Zone Organizing data products Scaling a single landing zone Source- and Consumer-Aligned Landing Zones Hub Data Landing Zone Multiple Data Landing Zones Multiple Data Management Landing Zones Practical Landing Zones Example Wrapping Up
4. Data Product Management What Are Data Products? Problems with Combining Code, Data, Metadata, and Infrastructure Data Products as Logical Entities Data Product Design Patterns What Is CQRS? Read Replicas as Data Products Design Principles for Data Products Resource-Oriented Read-Optimized Design Data Product Data Is Immutable Using the Ubiquitous Language Capture Directly from the Source Clear Interoperability Standards No Raw Data Don’t Conform to Consumers Missing Values, Defaults, and Data Types Semantic Consistency Atomicity Compatibility Abstract Volatile Reference Data New Data Means New Ownership Data Security Patterns Establish a Metamodel Allow Self-Service Cross-Domain Relationships Enterprise Consistency Historization, Redeliveries, and Overwrites Business Capabilities with Multiple Owners Operating Model Data Product Architecture High-Level Platform Design Capabilities for Capturing and Onboarding Data Ingestion method Complex software packages External APIs and SaaS providers Lineage and metadata Data Quality Data Historization Point-in-time Interval Append-only Defining your historization strategy Solution Design Real-World Example Alignment with Storage Accounts Alignment with Data Pipelines Capabilities for Serving Data Data Serving Services File Manipulation Service De-Identification Service Distributed Orchestration Intelligent Consumption Services Direct Usage Considerations Getting Started Wrapping Up
5. Services and API Management Introducing API Management What Is Service-Oriented Architecture? Enterprise Application Integration Service Orchestration Service Choreography Public Services and Private Services Service Models and Canonical Data Models Parallels with Enterprise Data Warehousing Architecture Canonical model size ESB as wrapper for legacy middleware ESB managing application state A Modern View of API Management Federated Responsibility Model API Gateway API as a Product Composite Services API Contracts API Discoverability Microservices Functions Service Mesh Microservice Domain Boundaries Ecosystem Communication Experience APIs GraphQL Backend for Frontend Practical Example Metadata Management Read-Oriented APIs Serving Data Products Wrapping Up
6. Event and Notification Management Introduction to Events Notifications Versus Carried State The Asynchronous Communication Model What Do Modern Event-Driven Architectures Look Like? Message Queues Event Brokers Event Processing Styles Event Producers Application-generated events Database-generated events Event Consumers Event Streaming Platforms EDA reference architecture Data product creation Event stores Streaming analytics Governance Model Event Stores as Data Product Stores Event Stores as Application Backends Streaming as the Operational Backbone Guarantees and Consistency Consistency Level Processing Methods Message Order Dead Letter Queue Streaming Interoperability Governance and Self-Service Wrapping Up
7. Connecting the Dots Cross-Domain Interoperability Quick Recap Data Distribution Versus Application Integration Data Distribution Patterns Application Integration Patterns Consistency and Discoverability Inspiring, Motivating, and Guiding for Change Setting Domain Boundaries Exception Handling Organizational Transformation Team Topologies Organizational Planning Wrapping Up
8. Data Governance and Data Security Data Governance The Governance Framework Roles Creating the framework Governance body Processes: Data Governance Activities Making Governance Effective and Pragmatic Supporting Services for Data Governance Data Contracts Usage agreements Best practices for getting started Data Security Current Siloed Approach Trust Boundaries Data Classifications and Labels Data Usage Classifications Unified Data Security Identity Providers Real-World Example Typical Security Process Flow Securing API-Based Architectures Securing Event-Driven Architectures Wrapping Up
9. Democratizing Data with Metadata Metadata Management The Enterprise Metadata Model Practical Example of a Metamodel Data Domains and Data Products Data Models Conceptual data models Logical data models Physical data models Limitations and best practices Data Lineage Other Metadata Areas The Metalake Architecture Role of the Catalog Role of the Knowledge Graph Technologies and standards Data fabric example Data fabric for metadata management Metalake solution design Wrapping Up
10. Modern Master Data Management Master Data Management Styles Data Integration Designing a Master Data Management Solution Domain-Oriented Master Data Management Reference Data Master Data Master identification numbers MDM domains and data products Domain-level MDM MDM and Data Quality as a Service MDM and Data Curation Knowledge Exchange Integrated Views Reusable Components and Integration Logic Republishing Data Through Integration Hubs Republishing Data Through Aggregates Data Governance Recommendations Wrapping Up
11. Turning Data into Value The Challenges of Turning Data into Value Domain Data Stores Granularity of Consumer-Aligned Use Cases DDSs Versus Data Products Best Practices Business Requirements Target Audience and Operating Model Nonfunctional Requirements Data Pipelines and Data Models Scoping the Role Your DDSs Play Business Intelligence Semantic Layers Self-Service Tools and Data Best Practices Advanced Analytics (MLOps) Initiating a Project Experimentation and Tracking Data Engineering Model Operationalization Exceptions Wrapping Up
12. Putting Theory into Practice A Brief Reflection on Your Data Journey Centralized or Decentralized? Making It Real Opportunistic Phase: Set Strategic Direction Transformation Phase: Lay Out the Foundation Optimization Phase: Professionalize Your Capabilities Data-Driven Culture DataOps Governance and Literacy The Role of Enterprise Architects Blueprints and Diagrams Modern Skills Control and Governance Last Words
Index
How to download source code?
1. Go to: https://www.oreilly.com/
2. Search for the book title: Data Management at Scale: Modern Data Architecture with Data Mesh and Data Fabric, 2nd Edition. If that returns no results, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.