Cassandra: The Definitive Guide: Distributed Data at Web Scale, 3rd Edition
Imagine what you could do if scalability wasn’t a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This third edition—updated for Cassandra 4.0—provides the technical details and practical examples you need to put this database to work in a production environment.
Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s nonrelational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility.
- Understand Cassandra’s distributed and decentralized structure
- Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell
- Create a working data model and compare it with an equivalent relational model
- Develop sample applications using client drivers for languages including Java, Python, and Node.js
- Explore cluster topology and learn how nodes exchange data
Foreword Preface Why Apache Cassandra? Is This Book for You? What’s in This Book? New for the Third Edition Note on the Revised Third Edition Conventions Used in This Book Using Code Examples O’Reilly Interactive Katacoda Scenarios O’Reilly Online Learning How to Contact Us Acknowledgments
1. Beyond Relational Databases What’s Wrong with Relational Databases? A Quick Review of Relational Databases Transactions, ACID-ity, and Two-Phase Commit Schema Sharding and Shared-Nothing Architecture Web Scale The Rise of NoSQL Summary
2. Introducing Cassandra The Cassandra Elevator Pitch Cassandra in 50 Words or Less Distributed and Decentralized Elastic Scalability High Availability and Fault Tolerance Tuneable Consistency Brewer’s CAP Theorem Row-Oriented High Performance Where Did Cassandra Come From? Is Cassandra a Good Fit for My Project? Large Deployments Lots of Writes, Statistics, and Analysis Geographical Distribution Hybrid Cloud and Multicloud Deployment Getting Involved Summary
3. Installing Cassandra Installing the Apache Distribution Extracting the Download What’s in There? Building from Source Additional Build Targets Running Cassandra Setting the Environment Starting the Server Stopping Cassandra Other Cassandra Distributions Running the CQL Shell Basic cqlsh Commands cqlsh Help Describing the Environment in cqlsh Creating a Keyspace and Table in cqlsh Writing and Reading Data in cqlsh Running Cassandra in Docker Summary
4. The Cassandra Query Language The Relational Data Model Cassandra’s Data Model Clusters Keyspaces Tables Columns CQL Types Numeric Data Types Textual Data Types Time and Identity Data Types Other Simple Data Types Collections Tuples User-Defined Types Summary
5. Data Modeling Conceptual Data Modeling RDBMS Design Design Differences Between RDBMS and Cassandra Defining Application Queries Logical Data Modeling Hotel Logical Data Model Reservation Logical Data Model Physical Data Modeling Hotel Physical Data Model Reservation Physical Data Model Evaluating and Refining Calculating Partition Size Calculating Size on Disk Breaking Up Large Partitions Defining Database Schema Cassandra Data Modeling Tools Summary
6. The Cassandra Architecture Data Centers and Racks Gossip and Failure Detection Snitches Rings and Tokens Virtual Nodes Partitioners Replication Strategies Consistency Levels Queries and Coordinator Nodes Hinted Handoff Anti-Entropy, Repair, and Merkle Trees Lightweight Transactions and Paxos Memtables, SSTables, and Commit Logs Bloom Filters Caching Compaction Deletion and Tombstones Managers and Services Cassandra Daemon Storage Engine Storage Service Storage Proxy Messaging Service Stream Manager CQL Native Transport Server System Keyspaces Summary
7. Designing Applications with Cassandra Hotel Application Design Cassandra and Microservice Architecture Microservice Architecture for a Hotel Application Identifying Bounded Contexts Identifying Services Designing Microservice Persistence Extending Designs Secondary Indexes Materialized Views Reservation Service: A Sample Microservice Design Choices for a Java Microservice Deployment and Integration Considerations Services, Keyspaces, and Clusters Data Centers and Load Balancing Interactions Between Microservices Summary
8. Application Development with Drivers DataStax Java Driver Development Environment Configuration Connecting to a Cluster Statements Simple Statements Prepared Statements Query Builder Object Mapper Asynchronous Execution Driver Configuration Metadata Debugging and Monitoring DataStax Python Driver DataStax Node.js Driver DataStax C# Driver Other Cassandra Drivers Summary
9. Writing and Reading Data Writing Write Consistency Levels The Cassandra Write Path Writing Files to Disk Lightweight Transactions Batches Reading Read Consistency Levels The Cassandra Read Path Read Repair Range Queries, Ordering and Filtering Paging Deleting Summary
10. Configuring and Deploying Cassandra Cassandra Cluster Manager Creating a Cluster Adding Nodes to a Cluster Dynamic Ring Participation Node Configuration Seed Nodes Snitches Partitioners Tokens and Virtual Nodes Network Interfaces Data Storage Startup and JVM Settings Planning a Cluster Deployment Cluster Topology and Replication Strategies Sizing Your Cluster Selecting Instances Storage Network Cloud Deployment Amazon Web Services Google Cloud Platform Microsoft Azure Summary
11. Monitoring Monitoring Cassandra with JMX Cassandra’s MBeans Database MBeans Cluster-Related MBeans Internal MBeans Monitoring with nodetool Getting Cluster Information Getting Statistics Virtual Tables System Virtual Schema System Views Metrics Logging Examining Log Files Full Query Logging Summary
12. Maintenance Health Check Common Maintenance Tasks Flush Cleanup Repair Rebuilding Indexes Moving Tokens Adding Nodes Adding Nodes to an Existing Data Center Adding a Data Center to a Cluster Handling Node Failure Repairing Failed Nodes Replacing Nodes Removing Nodes Upgrading Cassandra Backup and Recovery Taking a Snapshot Clearing a Snapshot Enabling Incremental Backup Restoring from Snapshot SSTable Utilities Maintenance Tools Netflix Priam DataStax OpsCenter Cassandra Sidecars Cassandra Kubernetes Operators Summary
13. Performance Tuning Managing Performance Setting Performance Goals Benchmarking and Stress Testing Monitoring Performance Analyzing Performance Issues Tracing Tuning Methodology Caching Key Cache Row Cache Chunk Cache Counter Cache Saved Cache Settings Memtables Commit Logs SSTables Hinted Handoff Compaction Concurrency and Threading Networking and Timeouts JVM Settings Memory Garbage Collection Summary
14. Security Authentication and Authorization Password Authenticator Using CassandraAuthorizer Role-Based Access Control Encryption SSL, TLS, and Certificates Node-to-Node Encryption Client-to-Node Encryption JMX Security Securing JMX Access Security MBeans Audit Logging Summary
15. Migrating and Integrating Knowing When to Migrate Adapting the Data Model Translating Entities Translating Relationships Adapting the Application Refactoring Data Access Maintaining Consistency Migrating Stored Procedures Planning the Deployment Migrating Data Zero-Downtime Migration Bulk Loading Common Integrations Managing Data Flow with Apache Kafka Searching with Apache Lucene, SOLR, and Elasticsearch Analyzing Data with Apache Spark Summary
Index About the Authors
How do I download the source code?
1. Go to:
2. Search for the book title:
Cassandra: The Definitive Guide: Distributed Data at Web Scale, 3rd Edition. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.