High-Performance Big Data Computing
- Length: 272 pages
- Edition: 1
- Language: English
- Publisher: The MIT Press
- Publication Date: 2022-08-02
- ISBN-10: 0262046857
- ISBN-13: 9780262046855
- Sales Rank: #8689798
An in-depth overview of an emerging field that brings together high-performance computing, big data processing, and deep learning.
Over the last decade, the exponential explosion of data known as big data has changed the way we understand and harness the power of data. The emerging field of high-performance big data computing, which brings together high-performance computing (HPC), big data processing, and deep learning, aims to meet the challenges posed by large-scale data processing. This book offers an in-depth overview of high-performance big data computing and the associated technical issues, approaches, and solutions.
The book covers basic concepts and necessary background knowledge, including data processing frameworks, storage systems, and hardware capabilities; offers a detailed discussion of technical issues in accelerating big data computing in terms of computation, communication, memory and storage, codesign, workload characterization and benchmarking, and system deployment and management; and surveys benchmarks and workloads for evaluating big data middleware systems. It presents a detailed discussion of big data computing systems and applications with high-performance networking, computing, and storage technologies, including state-of-the-art designs for data processing and storage systems. Finally, the book considers some advanced research topics in high-performance big data computing, including designing high-performance deep learning over big data (DLoBD) stacks and HPC cloud technologies.
Cover
Title Page
Copyright
Table of Contents
Acknowledgments
1. Introduction
1.1. Overview
1.2. Big Data Characteristics and Trends
1.3. Current Systems for Data Management and Processing
1.4. Technological Trends
1.5. Convergence in HPC, Big Data, and Deep Learning
1.6. Outline of the Book
1.7. Summary
2. Parallel Programming Models and Systems
2.1. Overview
2.2. Batch Processing Frameworks
2.3. Stream Processing Frameworks
2.4. Query Processing Frameworks
2.5. Graph Processing Frameworks
2.6. Machine Learning and Deep Learning Frameworks
2.7. Interactive Big Data Tools
2.8. Monitoring and Diagnostics Tools
2.9. Summary
3. Parallel and Distributed Storage Systems
3.1. Overview
3.2. File Storage
3.3. Object Storage
3.4. Block Storage
3.5. Memory-Centric Storage
3.6. Monitoring and Diagnostics Tools
3.7. Summary
4. HPC Architectures and Trends
4.1. Overview
4.2. Computing Capabilities
4.3. Storage
4.4. Network Interconnects
4.5. Summary
5. Opportunities and Challenges in Accelerating Big Data Computing
5.1. Overview
5.2. C1: Computational Challenges
5.3. C2: Communication and Data Movement Challenges
5.4. C3: Memory and Storage Management Challenges
5.5. C4: Challenges of Codesigning Big Data Systems and Applications
5.6. C5: Challenges of Big Data Workload Characterization and Benchmarking
5.7. C6: Deployment and Management Challenges
5.8. Summary
6. Benchmarking Big Data Systems
6.1. Overview
6.2. Offline Analytical Data Processing
6.3. Streaming Data Processing
6.4. Online Data Processing
6.5. Graph Data Processing
6.6. Machine Learning and Deep Learning Workloads
6.7. Comprehensive Benchmark Suites
6.8. Summary
7. Accelerations with RDMA
7.1. Overview
7.2. Batch and Stream Processing Systems
7.3. Graph Processing Systems
7.4. RPC Libraries
7.5. Query Processing in Databases
7.6. In-Memory KV Stores
7.7. HiBD Project
7.8. Case Studies and Performance Benefits
7.9. Summary
8. Accelerations with Multicore/Accelerator Technologies
8.1. Introduction
8.2. Multicore CPUs
8.3. GPU Acceleration for Big Data Computing
8.4. FPGAs and ASICs
8.5. Case Studies and Performance Benefits
8.6. Summary
9. Accelerations with High-Performance Storage Technologies
9.1. Overview
9.2. Exploring NVM-Centric Designs
9.3. Hybrid and Hierarchical Storage Middleware
9.4. Burst Buffer Systems
9.5. Case Studies and Performance Benefits
9.6. Summary
10. Deep Learning over Big Data
10.1. Overview
10.2. Convergence of Deep Learning, Big Data, and HPC
10.3. Challenges of Designing DLoBD Stacks
10.4. Distributed Deep Learning Training Basics
10.5. Overview of DLoBD Stacks
10.6. Characterization of DLoBD Stacks
10.7. Case Studies and Performance Benefits
10.8. Discussions on Optimizations for Deep Learning Workloads
10.9. Summary
11. Designs with Cloud Technologies
11.1. Overview
11.2. Overview of High-Performance Cloud Technologies
11.3. State-of-the-Art Designs
11.4. Case Studies and Performance Benefits
11.5. Summary
12. Frontier Research on High-Performance Big Data Computing
12.1. Heterogeneity-Aware Big Data Processing and Management Systems
12.2. Big Data Processing and Management for Hybrid Storage Systems
12.3. Efficient and Coherent Communication and Computation in Network for Big Data Systems
12.4. Summary
References
Index