Machine Learning Model Serving Patterns and Best Practices: A definitive guide to deploying, monitoring, and providing accessibility to ML models in production
- Length: 336 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-12-30
- ISBN-10: 1803249900
- ISBN-13: 9781803249902
Become a successful machine learning professional by effortlessly deploying machine learning models to production and implementing cloud-based machine learning models for widespread organizational use
Key Features
- Learn best practices about bringing your models to production
- Explore the tools available for serving ML models and the differences between them
- Understand state-of-the-art monitoring approaches for model serving implementations
Book Description
Serving patterns enable data science and ML teams to bring their models to production. Most ML models are not deployed for consumers, so ML engineers need to know the critical steps involved in serving an ML model.
This book will cover the whole process, from basic concepts such as stateful and stateless serving to the advantages and challenges of each. Batch, real-time, and continuous model serving techniques will also be covered in detail, and later chapters will give detailed examples of keyed prediction techniques and ensemble patterns. Valuable associated technologies such as TensorFlow Serving, BentoML, and Ray Serve will also be discussed, making sure that you have a good understanding of the most important methods and techniques in model serving. You'll then move on to topics such as monitoring and performance optimization, as well as strategies for managing model drift and handling updates and versioning. The book will provide practical guidance and best practices for ensuring that your model serving pipeline is robust, scalable, and reliable. Additionally, this book will explore the use of cloud-based platforms and services for model serving using AWS SageMaker, with the help of detailed examples.
By the end of this book, you’ll be able to save and serve your model using state-of-the-art techniques.
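To make the keyed prediction pattern mentioned in the description concrete, here is a minimal sketch. The stand-in "model", the key names, and the function names are all invented for illustration; this is not code from the book:

```python
def predict(features):
    # Stand-in for a real ML model: "predicts" the sum of the features.
    return sum(features)

def keyed_predict(keyed_records):
    """Serve a batch of (key, features) pairs, tagging each prediction
    with its key so out-of-order or asynchronous results can be matched
    back to the original requests."""
    results = {}
    for key, features in keyed_records:
        # Strip the key before prediction, then reattach it to the output.
        results[key] = predict(features)
    return results

batch = [("req-1", [1.0, 2.0]), ("req-2", [3.0, 4.0])]
print(keyed_predict(batch))  # {'req-1': 3.0, 'req-2': 7.0}
```

The key itself never reaches the model; it only travels alongside the request so the client can pair each answer with its input.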
What you will learn
- Explore specific patterns in model serving that are crucial for every data science professional
- Understand how to serve machine learning models using different techniques
- Discover the various approaches to stateless serving
- Implement advanced techniques for batch and streaming model serving
- Get to grips with the fundamental concepts in continuous model evaluation
- Serve machine learning models using a fully managed AWS SageMaker cloud solution
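As a taste of the stateless serving topic in the list above, the following sketch contrasts a stateful running average with a stateless equivalent in which the caller carries the state, so any replica of a service can handle any request. All names are illustrative, not taken from the book:

```python
class StatefulAverager:
    """Stateful: the result of add() depends on hidden internal state,
    so identical calls on different replicas can disagree."""
    def __init__(self):
        self.total, self.count = 0.0, 0

    def add(self, x):
        self.total += x
        self.count += 1
        return self.total / self.count

def stateless_average(state, x):
    """Stateless: the caller supplies the state and receives the new
    state back, so the function itself holds nothing between calls."""
    total, count = state
    total, count = total + x, count + 1
    return (total, count), total / count

state = (0.0, 0)
state, avg = stateless_average(state, 10.0)
state, avg = stateless_average(state, 20.0)
print(avg)  # 15.0
```

Extracting the hidden state this way is the essence of turning a stateful function into one that scales horizontally behind a load balancer.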
Who this book is for
This book is for machine learning engineers and data scientists who want to bring their models into production. Those who are familiar with machine learning and have experience using machine learning techniques, but are looking for options and strategies for bringing their models to production, will find great value in this book. Working knowledge of Python programming is a must to get started.
Table of Contents
Front matter: Contributors; About the author; About the reviewers; Preface; Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Conventions used; Get in touch; Share Your Thoughts; Download a free PDF copy of this book
Part 1: Introduction to Model Serving
- Chapter 1: Introducing Model Serving (Technical requirements; What is serving?; What are models?; What is model serving?; Understanding the importance of model serving; Using existing tools to serve models; Summary)
- Chapter 2: Introducing Model Serving Patterns (Design patterns in software engineering; Understanding the value of model serving patterns; ML serving patterns; Serving philosophy patterns; Patterns of serving approaches; Summary; Further reading)
Part 2: Patterns and Best Practices of Model Serving
- Chapter 3: Stateless Model Serving (Technical requirements; Understanding stateful and stateless functions; Stateless functions; Stateful functions; Extracting states from stateful functions; Using stateful functions; States in machine learning models; Using input data as states; Mitigating the impact of states from the ML model; Summary)
- Chapter 4: Continuous Model Evaluation (Technical requirements; Introducing continuous model evaluation; What to monitor in model evaluation; Challenges of continuous model evaluation; The necessity of continuous model evaluation; Monitoring errors; Deciding on retraining; Enhancing serving resources; Understanding business impact; Common metrics for training and monitoring; Continuous model evaluation use cases; Evaluating a model continuously; Collecting the ground truth; Plotting metrics on a dashboard; Selecting the threshold; Setting a notification for performance drops; Monitoring model performance when predicting rare classes; Summary; Further reading)
- Chapter 5: Keyed Prediction (Technical requirements; Introducing keyed prediction; Exploring keyed prediction use cases; Multi-threaded programming; Multiple instances of the model running asynchronously; Why the keyed prediction model is needed; Exploring techniques for keyed prediction; Passing keys with features from the clients; Removing keys before the prediction; Tagging predictions with keys; Creating keys; Summary; Further reading)
- Chapter 6: Batch Model Serving (Technical requirements; Introducing batch model serving; What is batch model serving?; Different types of batch model serving; Manual triggers; Automatic periodic triggers; Using continuous model evaluation to retrain; Serving for offline inference; Serving for on-demand inference; Example scenarios of batch model serving; Case 1 – recommendation; Case 2 – sentiment analysis; Techniques in batch model serving; Setting up a periodic batch update; Storing the predictions in a persistent store; Pulling predictions by the server application; Limitations of batch serving; Summary; Further reading)
- Chapter 7: Online Learning Model Serving (Technical requirements; Introducing online model serving; Serving requests; Use cases for online model serving; Case 1 – recommending the nearest emergency center during a pandemic; Case 2 – predicting the favorite soccer team in a tournament; Case 3 – predicting the path of a hurricane or storm; Case 4 – predicting the estimated delivery time of delivery trucks; Challenges in online model serving; Challenges in using newly arrived data for training; Underperforming of the model after online training; Overfitting and class imbalance; Increasing of latency; Handling concurrent requests; Implementing online model serving; Summary; Further reading)
- Chapter 8: Two-Phase Model Serving (Technical requirements; Introducing two-phase model serving; Exploring two-phase model serving techniques; Quantized phase one model; Training and saving an MNIST model; Full integer quantization of the model and saving the converted model; Comparing the size and accuracy of the models; Separately trained phase one model with reduced features; Separately trained different models; Use cases of two-phase model serving; Case 4 – route planners; Summary; Further reading)
- Chapter 9: Pipeline Pattern Model Serving (Technical requirements; Introducing the pipeline pattern; A DAG; Stages of the machine learning pipeline; Introducing Apache Airflow; Getting started with Apache Airflow; Creating and starting a pipeline using Apache Airflow; Demonstrating a machine learning pipeline using Airflow; Advantages and disadvantages of the pipeline pattern; Summary; Further reading)
- Chapter 10: Ensemble Model Serving Pattern (Technical requirements; Introducing the ensemble pattern; Using ensemble pattern techniques; Model update; Aggregation; Model selection; Combining responses; End-to-end dummy example of serving the model; Summary)
- Chapter 11: Business Logic Pattern (Technical requirements; Introducing the business logic pattern; Type of business logic; Technical approaches to business logic in model serving; Data validation; Feature transformation; Prediction post-processing; Summary)
Part 3: Introduction to Tools for Model Serving
- Chapter 12: Exploring TensorFlow Serving (Technical requirements; Introducing TensorFlow Serving; Servable; Loader; Source; Aspired versions; Manager; Using TensorFlow Serving to serve models; TensorFlow Serving with Docker; Using advanced model configurations; Summary; Further reading)
- Chapter 13: Using Ray Serve (Technical requirements; Introducing Ray Serve; Deployment; ServeHandle; Ingress deployment; Deployment graph; Using Ray Serve to serve a model; Using the ensemble pattern in Ray Serve; Using Ray Serve with the pipeline pattern; Summary; Further reading)
- Chapter 14: Using BentoML (Technical requirements; Introducing BentoML; Preparing models; Services and APIs; Bento; Using BentoML to serve a model; Summary; Further reading)
Part 4: Exploring Cloud Solutions
- Chapter 15: Serving ML Models Using a Fully Managed Cloud Solution (Technical requirements; Introducing Amazon SageMaker; Amazon SageMaker features; Using Amazon SageMaker to serve a model; Creating a notebook in Amazon SageMaker; Serving the model using Amazon SageMaker; Summary)
Back matter: Index; Why subscribe?; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts; Download a free PDF copy of this book
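As a toy illustration of the ensemble pattern listed in the contents above, the sketch below fans one request out to several stand-in models and aggregates their responses with a simple mean (one of several possible aggregation strategies). All functions here are invented for illustration, not code from the book:

```python
# Three stand-in "models" that each answer the same request differently.
def model_a(features):
    return sum(features)

def model_b(features):
    return max(features)

def model_c(features):
    return sum(features) / len(features)

def ensemble_predict(features, models):
    """Ensemble pattern: query every model, then combine the responses.
    Here the aggregation step is a simple mean of the predictions."""
    predictions = [m(features) for m in models]
    return sum(predictions) / len(predictions)

# For [1.0, 3.0]: model_a -> 4.0, model_b -> 3.0, model_c -> 2.0, mean -> 3.0
print(ensemble_predict([1.0, 3.0], [model_a, model_b, model_c]))  # 3.0
```

Other combination strategies the ensemble chapter lists, such as model selection (routing to one model) or weighted voting, would replace only the aggregation step.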
How to download the source code
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Machine Learning Model Serving Patterns and Best Practices. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code, then Download ZIP.