Parallel Programming: Concepts and Practice
- Length: 416 pages
- Edition: 1
- Language: English
- Publisher: Morgan Kaufmann
- Publication Date: 2017-11-27
- ISBN-10: 0128498900
- ISBN-13: 9780128498903
Parallel Programming: Concepts and Practice provides an upper-level introduction to parallel programming. In addition to covering general parallelism concepts, this text teaches practical programming skills for both shared-memory and distributed-memory architectures. The authors' open-source system for automated code evaluation provides easy access to parallel computing resources, making the book particularly suitable for classroom settings.
- Covers parallel programming approaches for single computer nodes and HPC clusters: OpenMP, multithreading, SIMD vectorization, MPI, UPC++
- Contains numerous practical parallel programming exercises
- Includes access to an automated code evaluation tool that lets students program in a web browser and receive immediate feedback on the validity of their results
- Features example-based teaching of concepts to enhance learning outcomes
Table of Contents
Chapter 1 Introduction
Chapter 2 Theoretical Background
Chapter 3 Modern Architectures
Chapter 4 C++11 Multithreading
Chapter 5 Advanced C++11 Multithreading
Chapter 6 OpenMP
Chapter 7 Compute Unified Device Architecture
Chapter 8 Advanced CUDA Programming
Chapter 9 Message Passing Interface
Chapter 10 Unified Parallel C++
Detailed Table of Contents
Cover image
Title page
Table of Contents
Copyright
Preface
Acknowledgments
Chapter 1: Introduction
- Abstract
- 1.1. Motivational Example and Its Analysis
- 1.2. Parallelism Basics
- 1.3. HPC Trends and Rankings
- 1.4. Additional Exercises
Chapter 2: Theoretical Background
- Abstract
- 2.1. PRAM
- 2.2. Network Topologies
- 2.3. Amdahl's and Gustafson's Laws
- 2.4. Foster's Parallel Algorithm Design Methodology
- 2.5. Additional Exercises
- References
Chapter 3: Modern Architectures
- Abstract
- 3.1. Memory Hierarchy
- 3.2. Levels of Parallelism
- 3.3. Additional Exercises
- References
Chapter 4: C++11 Multithreading
- Abstract
- 4.1. Introduction to Multithreading (Hello World)
- 4.2. Handling Return Values (Fibonacci Sequence)
- 4.3. Scheduling Based on Static Distributions (Matrix Vector Multiplication)
- 4.4. Handling Load Imbalance (All-Pairs Distance Matrix)
- 4.5. Signaling Threads with Condition Variables (Ping Pong)
- 4.6. Parallelizing Over Implicitly Enumerable Sets (Thread Pool)
- 4.7. Additional Exercises
- References
Chapter 5: Advanced C++11 Multithreading
- Abstract
- 5.1. Lock-Free Programming (Atomics, Compare-and-Swap)
- 5.2. Work-Sharing Thread Pool (Tree Traversal)
- 5.3. Parallel Graph Search (Binary Knapsack Problem)
- 5.4. Outlook
- 5.5. Additional Exercises
- References
Chapter 6: OpenMP
- Abstract
- 6.1. Introduction to OpenMP (Hello World)
- 6.2. The parallel for Directive (Basic Linear Algebra)
- 6.3. Basic Parallel Reductions (Nearest-Neighbor Classifier)
- 6.4. Scheduling of Imbalanced Loops (Inner Products)
- 6.5. Advanced Reductions (Softmax Regression/AVX Reductions)
- 6.6. Task Parallelism (Tree Traversal)
- 6.7. SIMD Vectorization (Vector Addition)
- 6.8. Outlook
- 6.9. Additional Exercises
- References
Chapter 7: Compute Unified Device Architecture
- Abstract
- 7.1. Introduction to CUDA (Hello World)
- 7.2. Hardware Architecture of CUDA-Enabled GPUs
- 7.3. Memory Access Patterns (Eigenfaces)
- 7.4. Memory Hierarchy (Dynamic Time Warping)
- 7.5. Optimization Guidelines
- 7.6. Additional Exercises
- References
Chapter 8: Advanced CUDA Programming
- Abstract
- 8.1. Warp Intrinsics and Atomic Operations (Parallel Reduction)
- 8.2. Utilizing Multiple GPUs and Streams (Newton Iteration)
- 8.3. Outlook
- 8.4. Additional Exercises
- References
Chapter 9: Message Passing Interface
- Abstract
- 9.1. Introduction to MPI
- 9.2. Basic Concepts (Hello World)
- 9.3. Point-to-Point Communication (Ping-Pong)
- 9.4. Nonblocking Communication (Ping-Pong in a Ring of Processes)
- 9.5. Collectives (Counting Primes)
- 9.6. Overlapping Computation and Communication (Jacobi Iteration)
- 9.7. Derived Datatypes (Matrix Multiplication With Submatrix Scattering)
- 9.8. Complex Communicators (Matrix Multiplication Using SUMMA)
- 9.9. Outlook
- 9.10. Additional Exercises
- References
Chapter 10: Unified Parallel C++
- Abstract
- 10.1. Introduction to PGAS and UPC++
- 10.2. Basic Concepts (Hello World)
- 10.3. Memory Affinity and Privatization (Vector Update)
- 10.4. Global Pointers and Collectives (Letter Count)
- 10.5. Locks (Image Histogramming)
- 10.6. Remote Function Invocation (Mandelbrot Sets)
- 10.7. Additional Exercises
- References
Index