Machine Learning on Kubernetes: A practical handbook for building and using a complete open source machine learning platform on Kubernetes
- Length: 384 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-06-24
- ISBN-10: 1803241802
- ISBN-13: 9781803241807
- Sales Rank: #3246653
Build a Kubernetes-based self-serving, agile data science and machine learning ecosystem for your organization using reliable and secure open source technologies
Key Features
- Build a complete machine learning platform on Kubernetes
- Improve the agility and velocity of your team by adopting the self-service capabilities of the platform
- Reduce time-to-market by automating data pipelines and model training and deployment
Book Description
MLOps is an emerging field that aims to bring repeatability, automation, and standardization of the software engineering domain to data science and machine learning engineering. By implementing MLOps with Kubernetes, data scientists, IT professionals, and data engineers can collaborate and build machine learning solutions that deliver business value for their organization.
You’ll begin by understanding the different components of a machine learning project. Then, you’ll design and build a practical end-to-end machine learning project using open source software. As you progress, you’ll understand the basics of MLOps and the value it can bring to machine learning projects. You will also gain experience in building, configuring, and using an open source, containerized machine learning platform. In later chapters, you will prepare data, build and deploy machine learning models, and automate workflow tasks using the same platform. Finally, the exercises in this book will help you get hands-on experience in Kubernetes and open source tools, such as JupyterHub, MLflow, and Airflow.
By the end of this book, you’ll have learned how to effectively build, train, and deploy a machine learning model using the machine learning platform you built.
What you will learn
- Understand the different stages of a machine learning project
- Use open source software to build a machine learning platform on Kubernetes
- Implement a complete ML project using the machine learning platform presented in this book
- Improve on your organization’s collaborative journey toward machine learning
- Discover how to use the platform as a data engineer, ML engineer, or data scientist
- Find out how to apply machine learning to solve real business problems
Who this book is for
This book is for data scientists, data engineers, IT platform owners, AI product owners, and data architects who want to build their own platform for ML development. Although this book starts with the basics, a solid understanding of Python and Kubernetes, along with knowledge of the basic concepts of data science and data engineering, will help you grasp the topics covered in this book more easily.
Table of Contents
- Machine Learning on Kubernetes
- Contributors; About the authors; About the reviewers
- Preface
  Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Download the color images; Conventions used; Get in touch; Reviews; Share Your Thoughts
- Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
- Chapter 1: Challenges in Machine Learning
  Understanding ML; Delivering ML value; Choosing the right approach; The importance of data; Facing the challenges of adopting ML; Focusing on the big picture; Breaking down silos; Fail-fast culture; An overview of the ML platform; Summary; Further reading
- Chapter 2: Understanding MLOps
  Comparing ML to traditional programming; Exploring the benefits of DevOps; Understanding MLOps; ML; DevOps; ML project life cycle; Fast feedback loop; Collaborating over the project life cycle; The role of OSS in ML projects; Running ML projects on Kubernetes; Summary; Further reading
- Chapter 3: Exploring Kubernetes
  Technical requirements; Exploring Kubernetes major components; Control plane; Worker nodes; Kubernetes objects required to run an application; Becoming cloud-agnostic through Kubernetes; Understanding Operators; Setting up your local Kubernetes environment; Installing kubectl; Installing minikube; Installing OLM; Provisioning a VM on GCP; Summary
- Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes
- Chapter 4: The Anatomy of a Machine Learning Platform
  Technical requirements; Defining a self-service platform; Exploring the data engineering components; Data engineer workflow; Exploring the model development components; Understanding the data scientist workflow; Security, monitoring, and automation; Introducing ODH; Installing the ODH operator on Kubernetes; Enabling the ingress controller on the Kubernetes cluster; Installing Keycloak on Kubernetes; Summary; Further reading
- Chapter 5: Data Engineering
  Technical requirements; Configuring Keycloak for authentication; Importing the Keycloak configuration for the ODH components; Creating a Keycloak user; Configuring ODH components; Installing ODH; Understanding and using JupyterHub; Validating the JupyterHub installation; Running your first Jupyter notebook; Understanding the basics of Apache Spark; Understanding Apache Spark job execution; Understanding how ODH provisions Apache Spark cluster on-demand; Creating a Spark cluster; Understanding how JupyterHub creates a Spark cluster; Writing and running a Spark application from Jupyter Notebook; Summary
- Chapter 6: Machine Learning Engineering
  Technical requirements; Understanding ML engineering; Using a custom notebook image; Building a custom notebook container image; Introducing MLflow; Understanding MLflow components; Validating the MLflow installation; Using MLflow as an experiment tracking system; Adding custom data to the experiment run; Using MLflow as a model registry system; Summary
- Chapter 7: Model Deployment and Automation
  Technical requirements; Understanding model inferencing with Seldon Core; Wrapping the model using Python; Containerizing the model; Deploying the model using the Seldon controller; Packaging, running, and monitoring a model using Seldon Core; Introducing Apache Airflow; Understanding DAG; Exploring Airflow features; Understanding Airflow components; Validating the Airflow installation; Configuring the Airflow DAG repository; Configuring Airflow runtime images; Automating ML model deployments in Airflow; Creating the pipeline by using the pipeline editor; Summary
- Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform
- Chapter 8: Building a Complete ML Project Using the Platform
  Reviewing the complete picture of the ML platform; Understanding the business problem; Data collection, processing, and cleaning; Understanding data sources, location, and the format; Understanding data processing and cleaning; Performing exploratory data analysis; Understanding sample data; Understanding feature engineering; Data augmentation; Building and evaluating the ML model; Selecting evaluation criteria; Building the model; Deploying the model; Reproducibility; Summary
- Chapter 9: Building Your Data Pipeline
  Technical requirements; Automated provisioning of a Spark cluster for development; Writing a Spark data pipeline; Preparing the environment; Understanding data; Designing and building the pipeline; Using the Spark UI to monitor your data pipeline; Building and executing a data pipeline using Airflow; Understanding the data pipeline DAG; Building and running the DAG; Summary
- Chapter 10: Building, Deploying, and Monitoring Your Model
  Technical requirements; Visualizing and exploring data using JupyterHub; Building and tuning your model using JupyterHub; Tracking model experiments and versioning using MLflow; Tracking model experiments; Versioning models; Deploying the model as a service; Calling your model; Monitoring your model; Understanding monitoring components; Configuring Grafana and a dashboard; Summary
- Chapter 11: Machine Learning on Kubernetes
  Identifying ML platform use cases; Considering AutoML; Commercial platforms; ODH; Operationalizing ML; Setting the business expectations; Dealing with dirty real-world data; Dealing with incorrect results; Maintaining continuous delivery; Managing security; Adhering to compliance policies; Applying governance; Running on Kubernetes; Avoiding vendor lock-ins; Considering other Kubernetes platforms; Roadmap; Summary; Further reading
- Why subscribe?; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the Find a repository… box, search for the book title: Machine Learning on Kubernetes: A practical handbook for building and using a complete open source machine learning platform on Kubernetes. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code, then Download ZIP.
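The manual steps above can also be approximated in a script. The following is a minimal sketch, not an official Packt tool: it assumes the repository name is the book's main title with spaces replaced by hyphens and that the default branch is `main`. Neither assumption is confirmed by this page, and the helper names `repo_slug` and `archive_url` are hypothetical. Only the `archive/refs/heads/<branch>.zip` path is GitHub's standard pattern for branch snapshots.

```python
from urllib.parse import quote


def repo_slug(title: str) -> str:
    """Guess a repository name from a book title.

    ASSUMPTION: the repository is named after the main title (the part
    before the first colon) with spaces replaced by hyphens.
    """
    main_title = title.split(":", 1)[0].strip()
    return "-".join(main_title.split())


def archive_url(owner: str, repo: str, branch: str = "main") -> str:
    """Build the URL of the ZIP snapshot GitHub serves for one branch.

    ASSUMPTION: the default branch is called "main"; pass another name
    if the repository uses a different one.
    """
    return (
        f"https://github.com/{quote(owner)}/{quote(repo)}"
        f"/archive/refs/heads/{quote(branch)}.zip"
    )


if __name__ == "__main__":
    title = (
        "Machine Learning on Kubernetes: A practical handbook for building "
        "and using a complete open source machine learning platform on Kubernetes"
    )
    # Print the candidate URL; open it in a browser or fetch it with any HTTP client.
    print(archive_url("PacktPublishing", repo_slug(title)))
```

If the printed URL returns a 404, the repository name or branch guess is wrong, and the manual search described in the steps above is the reliable fallback.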