Applied Machine Learning and High Performance Computing on AWS: Accelerate development of machine learning applications following architectural best practices
- Length: 398 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-12-09
- ISBN-10: 1803237015
- ISBN-13: 9781803237015
- Sales Rank: #12008312
Build, train, and deploy large machine learning models at scale in various domains such as computational fluid dynamics, genomics, autonomous vehicles, and numerical optimization using Amazon SageMaker.
Key Features
- Understanding the need for High Performance Computing (HPC).
- Build, train, and deploy large ML models with billions of parameters using Amazon SageMaker.
- Best practices and architectures for implementing ML at scale using HPC.
Book Description
Machine Learning (ML) and High Performance Computing (HPC) on AWS run compute-intensive workloads across industries and emerging applications. Their use cases span verticals such as computational fluid dynamics (CFD), genomics, and autonomous vehicles.
The book provides end-to-end guidance, starting with HPC concepts for storage and networking. Part 2 then goes deeper, with working examples of how to process large datasets using SageMaker Studio and EMR, and how to build, train, and deploy large models using distributed training. It also covers deploying models to edge devices using SageMaker and IoT Greengrass, and optimizing the performance of ML models for low-latency use cases.
By the end of this book, you will be able to build, train, and deploy your own large-scale ML application using HPC on AWS, following industry best practices and addressing the key pain points encountered in the application life cycle.
What you will learn
- Data management, storage, and fast networking for HPC applications
- Analysis and visualization of a large volume of data using Spark
- Train a visual transformer model using SageMaker distributed training
- Deploy and manage ML models at scale in the cloud and at the edge
- Performance optimization of ML models for low-latency workloads
- Apply HPC to industry domains like CFD, genomics, AV, and optimization
Who This Book Is For
The book begins with HPC concepts but expects you to have prior machine learning knowledge. It is for ML engineers and data scientists interested in advanced topics: using large datasets to train large models with distributed training on AWS, then deploying those models at scale and optimizing their performance for low-latency use cases. It is also useful for practitioners in fields such as numerical optimization, computational fluid dynamics, autonomous vehicles, and genomics who need HPC to apply ML models to applications at scale.
Table of Contents
Applied Machine Learning and High-Performance Computing on AWS
- Front matter: Contributors; About the authors; About the reviewers
- Preface: Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Download the color images; Conventions used; Get in touch; Share Your Thoughts; Download a free PDF copy of this book
Part 1: Introducing High-Performance Computing
- Chapter 1: High-Performance Computing Fundamentals. Why do we need HPC?; Limitations of on-premises HPC; Barrier to innovation; Reduced efficiency; Lost opportunities; Limited scalability and elasticity; Benefits of doing HPC on the cloud; Drives innovation; Enables secure collaboration among distributed teams; Amplifies operational efficiency; Optimizes performance; Optimizes cost; Driving innovation across industries with HPC; Life sciences and healthcare; AVs; Supply chain optimization; Summary; Further reading
- Chapter 2: Data Management and Transfer. Importance of data management; Challenges of moving data into the cloud; How to securely transfer large amounts of data into the cloud; AWS online data transfer services; AWS DataSync; AWS Transfer Family; Amazon S3 Transfer Acceleration; Amazon Kinesis; AWS Snowcone; AWS offline data transfer services; Process for ordering a device from AWS Snow Family; Summary; Further reading
- Chapter 3: Compute and Networking. Introducing the AWS compute ecosystem; General purpose instances; Compute optimized instances; Accelerated compute instances; Memory optimized instances; Storage optimized instances; Amazon Machine Images (AMIs); Containers on AWS; Serverless compute on AWS; Networking on AWS; CIDR blocks and routing; Networking for HPC workloads; Selecting the right compute for HPC workloads; Pattern 1 – a standalone instance; Pattern 2 – using AWS ParallelCluster; Pattern 3 – using AWS Batch; Pattern 4 – hybrid architecture; Pattern 5 – Container-based distributed processing; Pattern 6 – serverless architecture; Best practices for HPC workloads; Summary; References
- Chapter 4: Data Storage. Technical requirements; AWS services for storing data; Amazon Simple Storage Service (S3); Amazon Elastic File System (EFS); Amazon EBS; Amazon FSx; Data security and governance; IAM; Data protection; Data encryption; Logging and monitoring; Resilience; Tiered storage for cost optimization; Amazon S3 storage classes; Amazon EFS storage classes; Choosing the right storage option for HPC workloads; Summary; Further reading
Part 2: Applied Modeling
- Chapter 5: Data Analysis. Technical requirements; Exploring data analysis methods; Gathering the data; Understanding the data structure; Describing the data; Visualizing the data; Reviewing the data analytics life cycle; Reviewing the AWS services for data analysis; Unifying the data into a common store; Creating a data structure for analysis; Visualizing the data at scale; Choosing the right AWS service; Analyzing large amounts of structured and unstructured data; Setting up EMR and SageMaker Studio; Analyzing large amounts of structured data; Analyzing large amounts of unstructured data; Processing data at scale on AWS; Cleaning up; Summary
- Chapter 6: Distributed Training of Machine Learning Models. Technical requirements; Building ML systems using AWS; Introducing the fundamentals of distributed training; Reviewing the SageMaker distributed data parallel strategy; Reviewing the SageMaker model data parallel strategy; Reviewing a hybrid data parallel and model parallel strategy; Executing a distributed training workload on AWS; Executing distributed data parallel training on Amazon SageMaker; Executing distributed model parallel training on Amazon SageMaker; Summary
- Chapter 7: Deploying Machine Learning Models at Scale. Managed deployment on AWS; Amazon SageMaker managed model deployment options; The variety of compute resources available; Cost-effective model deployment; Blue/green deployments; Inference recommender; MLOps integration; Model registry; Elastic inference; Deployment on edge devices; Choosing the right deployment option; Using batch inference; Using real-time endpoints; Using asynchronous inference; Batch inference; Creating a transformer object; Creating a batch transform job for carrying out inference; Optimizing a batch transform job; Real-time inference; Hosting a machine learning model as a real-time endpoint; Asynchronous inference; The high availability of model endpoints; Deployment on multiple instances; Endpoints autoscaling; Endpoint modification without disruption; Blue/green deployments; All at once; Canary; Linear; Summary; References
- Chapter 8: Optimizing and Managing Machine Learning Models for Edge Deployment. Technical requirements; Understanding edge computing; Reviewing the key considerations for optimal edge deployments; Efficiency; Performance; Reliability; Security; Designing an architecture for optimal edge deployments; Building the edge components; Building the ML model; Deploying the model package; Summary
- Chapter 9: Performance Optimization for Real-Time Inference. Technical requirements; Reducing the memory footprint of DL models; Pruning; Quantization; Model compilation; Key metrics for optimizing models; Choosing the instance type, load testing, and performance tuning for models; Observing the results; Summary
- Chapter 10: Data Visualization. Data visualization using Amazon SageMaker Data Wrangler; SageMaker Data Wrangler visualization options; Adding visualizations to the data flow in SageMaker Data Wrangler; Data flow; Amazon’s graphics-optimized instances; Benefits and key features of Amazon’s graphics-optimized instances; Summary; Further reading
Part 3: Driving Innovation Across Industries
- Chapter 11: Computational Fluid Dynamics. Technical requirements; Introducing CFD; Reviewing best practices for running CFD on AWS; Using AWS ParallelCluster; Using CFD Direct; Discussing how ML can be applied to CFD; Summary; References
- Chapter 12: Genomics. Technical requirements; Managing large genomics data on AWS; Designing architecture for genomics; Applying ML to genomics; Protein secondary structure prediction for protein sequences; Summary
- Chapter 13: Autonomous Vehicles. Technical requirements; Introducing AV systems; AWS services supporting AV systems; Designing an architecture for AV systems; ML applied to AV systems; Model development; Step 1 – build and push the CARLA container to Amazon ECR; Step 2 – configure and run CARLA on RoboMaker; Summary; References
- Chapter 14: Numerical Optimization. Introduction to optimization; Goal or objective function; Variables; Constraints; Modeling an optimization problem; Optimization algorithm; Local and global optima; Common numerical optimization algorithms; Random restart hill climbing; Simulated annealing; Tabu search; Evolutionary methods; Example use cases of large-scale numerical optimization problems; Traveling salesperson optimization problem; Worker dispatch optimization; Assembly line optimization; Numerical optimization using high-performance compute on AWS; Commercial optimization solvers; Open source optimization solvers; Numerical optimization patterns on AWS; Machine learning and numerical optimization; Summary; Further reading
- Back matter: Index; Why subscribe?; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts; Download a free PDF copy of this book
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Applied Machine Learning and High Performance Computing on AWS: Accelerate development of machine learning applications following architectural best practices. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code to download.
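The search step above can also be scripted. The following is a minimal sketch, assuming the public GitHub REST repository-search endpoint and its `org:` qualifier; the helper names `main_title`, `build_search_url`, and `search_repos` are hypothetical and chosen only for illustration. It falls back to the main title (the part before the colon), mirroring the advice in step 2.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# GitHub REST endpoint for repository search.
GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def main_title(full_title: str) -> str:
    """Return the part of a book title before the first colon."""
    return full_title.split(":", 1)[0].strip()


def build_search_url(title: str, org: str = "PacktPublishing") -> str:
    """Build a repository-search URL restricted to one GitHub organization."""
    query = f"{title} org:{org}"
    return f"{GITHUB_SEARCH_API}?{urlencode({'q': query})}"


def search_repos(title: str, limit: int = 5) -> list:
    """Query GitHub (network access required) and return (name, URL) pairs."""
    with urlopen(build_search_url(title)) as resp:
        items = json.load(resp).get("items", [])
    return [(repo["full_name"], repo["html_url"]) for repo in items[:limit]]
```

Calling `search_repos(main_title(full_title))` would list candidate repositories. Unauthenticated requests to this endpoint are rate limited, so this is suited to occasional lookups rather than bulk use.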