Machine Learning Model Serving Patterns and Best Practices: A definitive guide to deploying, monitoring, and providing accessibility to ML models in production
- Length: 336 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-12-30
- ISBN-10: 1803249900
- ISBN-13: 9781803249902
Become a successful machine learning professional by effortlessly deploying machine learning models to production and implementing cloud-based machine learning models for widespread organizational use
Key Features
- Learn best practices about bringing your models to production
- Explore the tools available for serving ML models and the differences between them
- Understand state-of-the-art monitoring approaches for model serving implementations
Book Description
Serving patterns enable data science and ML teams to bring their models to production. Most ML models are not deployed for consumers, so ML engineers need to know the critical steps involved in serving an ML model.
This book will cover the whole process, from basic concepts such as stateful and stateless serving to the advantages and challenges of each. Batch, real-time, and continuous model serving techniques will also be covered in detail, and later chapters will give detailed examples of keyed prediction techniques and ensemble patterns. Valuable associated technologies such as TensorFlow Serving, BentoML, and Ray Serve will also be discussed, making sure that you have a good understanding of the most important methods and techniques in model serving. You'll then move on to topics such as monitoring and performance optimization, as well as strategies for managing model drift and handling updates and versioning. The book will provide practical guidance and best practices for ensuring that your model serving pipeline is robust, scalable, and reliable. Additionally, this book will explore the use of cloud-based platforms and services for model serving using AWS SageMaker, with the help of detailed examples.
By the end of this book, you’ll be able to save and serve your model using state-of-the-art techniques.
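To make the keyed prediction pattern mentioned in the description concrete, here is a minimal sketch. The stand-in "model", the key names, and the function names are all invented for illustration; this is not code from the book:

```python
def predict(features):
    # Stand-in for a real ML model: "predicts" the sum of the features.
    return sum(features)

def keyed_predict(keyed_records):
    """Serve a batch of (key, features) pairs, tagging each prediction
    with its key so out-of-order or asynchronous results can be matched
    back to the original requests."""
    results = {}
    for key, features in keyed_records:
        # Strip the key before prediction, then reattach it to the output.
        results[key] = predict(features)
    return results

batch = [("req-1", [1.0, 2.0]), ("req-2", [3.0, 4.0])]
print(keyed_predict(batch))  # {'req-1': 3.0, 'req-2': 7.0}
```

The key itself never reaches the model; it only travels alongside the request so the client can pair each answer with its input.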
What you will learn
- Explore specific patterns in model serving that are crucial for every data science professional
- Understand how to serve machine learning models using different techniques
- Discover the various approaches to stateless serving
- Implement advanced techniques for batch and streaming model serving
- Get to grips with the fundamental concepts in continuous model evaluation
- Serve machine learning models using a fully managed AWS SageMaker cloud solution
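As a taste of the stateless serving topic in the list above, the following sketch contrasts a stateful running average with a stateless equivalent in which the caller carries the state, so any replica of a service can handle any request. All names are illustrative, not taken from the book:

```python
class StatefulAverager:
    """Stateful: the result of add() depends on hidden internal state,
    so identical calls on different replicas can disagree."""
    def __init__(self):
        self.total, self.count = 0.0, 0

    def add(self, x):
        self.total += x
        self.count += 1
        return self.total / self.count

def stateless_average(state, x):
    """Stateless: the caller supplies the state and receives the new
    state back, so the function itself holds nothing between calls."""
    total, count = state
    total, count = total + x, count + 1
    return (total, count), total / count

state = (0.0, 0)
state, avg = stateless_average(state, 10.0)
state, avg = stateless_average(state, 20.0)
print(avg)  # 15.0
```

Extracting the hidden state this way is the essence of turning a stateful function into one that scales horizontally behind a load balancer.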
Who this book is for
This book is for machine learning engineers and data scientists who want to bring their models into production. Those who are familiar with machine learning and have experience using machine learning techniques, but are looking for options and strategies for bringing their models to production, will find great value in this book. Working knowledge of Python programming is a must to get started.
Table of Contents
Front matter: Contributors; About the author; About the reviewers; Preface; Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Conventions used; Get in touch; Share Your Thoughts; Download a free PDF copy of this book
Part 1: Introduction to Model Serving
- Chapter 1: Introducing Model Serving (Technical requirements; What is serving?; What are models?; What is model serving?; Understanding the importance of model serving; Using existing tools to serve models; Summary)
- Chapter 2: Introducing Model Serving Patterns (Design patterns in software engineering; Understanding the value of model serving patterns; ML serving patterns; Serving philosophy patterns; Patterns of serving approaches; Summary; Further reading)
Part 2: Patterns and Best Practices of Model Serving
- Chapter 3: Stateless Model Serving (Technical requirements; Understanding stateful and stateless functions; Stateless functions; Stateful functions; Extracting states from stateful functions; Using stateful functions; States in machine learning models; Using input data as states; Mitigating the impact of states from the ML model; Summary)
- Chapter 4: Continuous Model Evaluation (Technical requirements; Introducing continuous model evaluation; What to monitor in model evaluation; Challenges of continuous model evaluation; The necessity of continuous model evaluation; Monitoring errors; Deciding on retraining; Enhancing serving resources; Understanding business impact; Common metrics for training and monitoring; Continuous model evaluation use cases; Evaluating a model continuously; Collecting the ground truth; Plotting metrics on a dashboard; Selecting the threshold; Setting a notification for performance drops; Monitoring model performance when predicting rare classes; Summary; Further reading)
- Chapter 5: Keyed Prediction (Technical requirements; Introducing keyed prediction; Exploring keyed prediction use cases; Multi-threaded programming; Multiple instances of the model running asynchronously; Why the keyed prediction model is needed; Exploring techniques for keyed prediction; Passing keys with features from the clients; Removing keys before the prediction; Tagging predictions with keys; Creating keys; Summary; Further reading)
- Chapter 6: Batch Model Serving (Technical requirements; Introducing batch model serving; What is batch model serving?; Different types of batch model serving; Manual triggers; Automatic periodic triggers; Using continuous model evaluation to retrain; Serving for offline inference; Serving for on-demand inference; Example scenarios of batch model serving; Case 1 – recommendation; Case 2 – sentiment analysis; Techniques in batch model serving; Setting up a periodic batch update; Storing the predictions in a persistent store; Pulling predictions by the server application; Limitations of batch serving; Summary; Further reading)
- Chapter 7: Online Learning Model Serving (Technical requirements; Introducing online model serving; Serving requests; Use cases for online model serving; Case 1 – recommending the nearest emergency center during a pandemic; Case 2 – predicting the favorite soccer team in a tournament; Case 3 – predicting the path of a hurricane or storm; Case 4 – predicting the estimated delivery time of delivery trucks; Challenges in online model serving; Challenges in using newly arrived data for training; Underperforming of the model after online training; Overfitting and class imbalance; Increasing of latency; Handling concurrent requests; Implementing online model serving; Summary; Further reading)
- Chapter 8: Two-Phase Model Serving (Technical requirements; Introducing two-phase model serving; Exploring two-phase model serving techniques; Quantized phase one model; Training and saving an MNIST model; Full integer quantization of the model and saving the converted model; Comparing the size and accuracy of the models; Separately trained phase one model with reduced features; Separately trained different models; Use cases of two-phase model serving; Case 4 – route planners; Summary; Further reading)
- Chapter 9: Pipeline Pattern Model Serving (Technical requirements; Introducing the pipeline pattern; A DAG; Stages of the machine learning pipeline; Introducing Apache Airflow; Getting started with Apache Airflow; Creating and starting a pipeline using Apache Airflow; Demonstrating a machine learning pipeline using Airflow; Advantages and disadvantages of the pipeline pattern; Summary; Further reading)
- Chapter 10: Ensemble Model Serving Pattern (Technical requirements; Introducing the ensemble pattern; Using ensemble pattern techniques; Model update; Aggregation; Model selection; Combining responses; End-to-end dummy example of serving the model; Summary)
- Chapter 11: Business Logic Pattern (Technical requirements; Introducing the business logic pattern; Type of business logic; Technical approaches to business logic in model serving; Data validation; Feature transformation; Prediction post-processing; Summary)
Part 3: Introduction to Tools for Model Serving
- Chapter 12: Exploring TensorFlow Serving (Technical requirements; Introducing TensorFlow Serving; Servable; Loader; Source; Aspired versions; Manager; Using TensorFlow Serving to serve models; TensorFlow Serving with Docker; Using advanced model configurations; Summary; Further reading)
- Chapter 13: Using Ray Serve (Technical requirements; Introducing Ray Serve; Deployment; ServeHandle; Ingress deployment; Deployment graph; Using Ray Serve to serve a model; Using the ensemble pattern in Ray Serve; Using Ray Serve with the pipeline pattern; Summary; Further reading)
- Chapter 14: Using BentoML (Technical requirements; Introducing BentoML; Preparing models; Services and APIs; Bento; Using BentoML to serve a model; Summary; Further reading)
Part 4: Exploring Cloud Solutions
- Chapter 15: Serving ML Models Using a Fully Managed Cloud Solution (Technical requirements; Introducing Amazon SageMaker; Amazon SageMaker features; Using Amazon SageMaker to serve a model; Creating a notebook in Amazon SageMaker; Serving the model using Amazon SageMaker; Summary)
Back matter: Index; Why subscribe?; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts; Download a free PDF copy of this book
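As a toy illustration of the ensemble pattern listed in the contents above, the sketch below fans one request out to several stand-in models and aggregates their responses with a simple mean (one of several possible aggregation strategies). All functions here are invented for illustration, not code from the book:

```python
# Three stand-in "models" that each answer the same request differently.
def model_a(features):
    return sum(features)

def model_b(features):
    return max(features)

def model_c(features):
    return sum(features) / len(features)

def ensemble_predict(features, models):
    """Ensemble pattern: query every model, then combine the responses.
    Here the aggregation step is a simple mean of the predictions."""
    predictions = [m(features) for m in models]
    return sum(predictions) / len(predictions)

# For [1.0, 3.0]: model_a -> 4.0, model_b -> 3.0, model_c -> 2.0, mean -> 3.0
print(ensemble_predict([1.0, 3.0], [model_a, model_b, model_c]))  # 3.0
```

Other combination strategies the ensemble chapter lists, such as model selection (routing to one model) or weighted voting, would replace only the aggregation step.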
How to download the source code
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Machine Learning Model Serving Patterns and Best Practices. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code, then Download ZIP.