Kubeflow Operations Guide: Managing Cloud and On-Premise Deployment
- Length: 304 pages
- Edition: 1
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2020-12-29
- ISBN-10: 1492053279
- ISBN-13: 9781492053279
- Sales Rank: #2175109
Building models is only a small part of deploying machine learning applications. The entire process involves developing, orchestrating, deploying, and running scalable and portable machine learning workloads, a process Kubeflow makes much easier. This practical book shows data scientists, data engineers, and platform architects how to plan and execute a Kubeflow project to make their Kubernetes workflows portable and scalable.
Authors Josh Patterson, Michael Katzenellenbogen, and Austin Harris demonstrate how this open source platform orchestrates workflows by managing machine learning pipelines. You’ll learn how to plan and execute a Kubeflow platform that can support workflows from on-premises to cloud providers including Google, Amazon, and Microsoft.
- Dive into Kubeflow architecture and learn best practices for using the platform
- Understand the process of planning your Kubeflow deployment
- Install Kubeflow on an existing on-premise Kubernetes cluster
- Deploy Kubeflow on Google Cloud Platform, AWS, and Azure
- Use KFServing to develop and deploy machine learning models
Preface What Is in This Book? Who Is This Book For? Conventions Used in This Book Using Code Examples O’Reilly Online Learning How to Contact Us Acknowledgments Josh Michael Austin
1. Introduction to Kubeflow Machine Learning on Kubernetes The Evolution of Machine Learning in Enterprise It’s Harder Than Ever to Run Enterprise Infrastructure Identifying Next-Generation Infrastructure (NGI) Core Principles Kubernetes for Production Application Deployment Enter: Kubeflow What Problems Does Kubeflow Solve? Origin of Kubeflow Who Uses Kubeflow? Team alignment for the line of business, DevOps, data engineering, and data science Common Kubeflow Use Cases Running Notebooks on GPUs Advantages of notebooks on GPUs Team alignment for notebooks on GPUs Shared Multitenant Machine Learning Environment Advantages of on-premise multitenant environment Team alignment Building a Transfer Learning Pipeline Advantages of running computer vision pipeline on Kubeflow Team alignment for computer vision pipeline Deploying Models to Production for Application Integration Advantages of deploying models to production on Kubeflow Team alignment for model deployment Components of Kubeflow Machine Learning Tools TensorFlow training and TFJob Keras Applications and Scaffolding Kubeflow UI Jupyter Notebooks Jupyter Notebook integration with Kubeflow Operators for machine learning frameworks Metadata and artifacts Hyperparameter tuning Pipelines Basic Kubeflow Pipeline concepts Machine Learning Model Inference Serving with KFServing Platforms and Clouds Public clouds Managed Kubernetes in the cloud On-premise Local Summary
2. Kubeflow Architecture and Best Practices Kubeflow Architecture Overview Kubeflow and Kubernetes Ways to Run a Job on Kubeflow Machine Learning Metadata Service Artifact Storage Istio Operations in Kubeflow Istio and KFServing Kubeflow Multitenancy Architecture Multitenancy and Isolation Multiuser Architecture Multiuser Authorization Flow Kubeflow Profiles Multiuser Isolation Notebook Architecture Notebook Server Launcher UI Notebook Controller Pipelines Architecture Kubeflow Best Practices Managing Job Dependencies Building a custom notebook Docker image Using GPUs Using GPUs with notebooks Validating that notebook code is using the GPU Experiment Management Install the metadata SDK Basic metadata SDK usage Summary
3. Planning a Kubeflow Installation Security Planning Components That Extend the Kubernetes API Components Running Atop Kubernetes Background and Motivation Kubeflow and Deployed Applications Integration Users Profiling Users Varying Skillsets Workloads Cluster Utilization Data Patterns Dedicated versus transient GPU Planning Planning for GPUs GPU use cases GPU anti-use cases Models that Benefit from GPUs Distributed versus multi-GPU training Infrastructure Planning Kubernetes Considerations On-Premise The Nvidia DGX Datacenter considerations Cloud Placement Container Management Serverless Container Operations with Knative Sizing and Growing Forecasting Storage Scaling Summary
4. Installing Kubeflow On-Premise Kubernetes Operations from the Command Line Installing kubectl Installing kubectl on macOS Understanding kubectl and contexts Getting the current context Adding clusters to our context file Switching contexts Using kubectl Getting running services Get cluster information Get currently running jobs Using Docker Basic Docker install Basic Docker commands Using Docker to build TensorFlow containers Basic Install Process Installing On-Premise Considerations for Building Kubernetes Clusters Gateway Host Access to Kubernetes Cluster Active Directory Integration and User Management Kubernetes, kubectl, and Active Directory Kerberos Integration Storage Integration Thinking about Kubeflow job bandwidth Common access storage patterns with Kubeflow jobs Options for Kubeflow storage Persistent volume claims and Kubeflow storage Container Management and Artifact Repositories Setting up an internal container repository Accessing and Interacting with Kubeflow Common Command-Line Operations Accessible Web UIs Installing Kubeflow System Requirements Set Up and Deploy Summary
5. Running Kubeflow on Google Cloud Overview of the Google Cloud Platform Storage Google Cloud Identity-Aware Proxy Google Cloud Security and the Cloud Identity-Aware Proxy Authentication Authorization GCP Projects for Application Deployments GCP Service Accounts Signing Up for Google Cloud Platform Installing the Google Cloud SDK Update Python Download and Install Google Cloud SDK Installing Kubeflow on Google Cloud Platform Create a Project in the GCP Console Enabling APIs for a Project Set Up OAuth for GCP Cloud IAP Set up the OAuth consent screen Configuring the Credentials tab Deploy Kubeflow Using the Command-Line Interface Creating user credentials Create required environment variables Set up kfctl Set up environment variables for kfctl Setting the ZONE environment variable Setting the PROJECT environment variable Set GCloud configuration variables Setting the CONFIG environment variable Setting the Kubeflow deployment environment variables Kubeflow deployment with kfctl Confirm Kubeflow deployment Accessing the Kubeflow UI Post-Installation Getting the ingress URI for your deployment Summary
6. Running Kubeflow on Amazon Web Services Overview of Amazon Web Services Storage Amazon Storage Pricing Amazon Cloud Security AWS Compute Services Managed Kubernetes on EKS Signing Up for Amazon Web Services Installing the AWS CLI Update Python Install the AWS CLI Configuring AWS CLI Kubeflow on Amazon Web Services Installing kubectl Install the eksctl CLI for Amazon EKS Install AWS IAM Authenticator Install jq Using Managed Kubernetes on Amazon EKS Create an EKS Service Role Create an AWS VPC Creating EKS Clusters Deploying an EKS Cluster with eksctl Understanding the Deployment Process Kubeflow Configuration and Deployment Download and configure kfctl Deploy Kubeflow to EKS Confirm EKS Deployment Customize the Kubeflow Deployment Customize Authentication Resizing EKS Clusters Deleting EKS Clusters Adding Logging Troubleshooting Deployments Summary
7. Running Kubeflow on Azure Overview of the Azure Cloud Platform Key Azure Components Storage on Azure File storage Disk storage Blob Storage Azure Data Lake Storage Gen2 Archive storage Avere vFXT The Azure Security Model Authentication and authorization Service Accounts Resources and Resource Groups Azure Virtual Machines Containers and Managed Azure Kubernetes Services The Azure CLI Installing the Azure CLI macOS install Windows install Debian and Ubuntu (x86_64) Installing Kubeflow on Azure Kubernetes Azure Login and Configuration Create an AKS Cluster for Kubeflow Creating an Azure resource group Creating an AKS cluster for Kubeflow Kubeflow Installation Get Azure credentials Download, install, and configure kfctl Set up environment variables for kfctl Setting the CONFIG environment variable Setting the Kubeflow deployment environment variables Kubeflow deployment with kfctl Confirm Kubeflow deployment Authorizing Network Access to Deployment Summary
8. Model Serving and Integration Basic Concepts of Model Management Understanding Training Models Versus Model Inference Building an Intuition for Model Integration Scaling Model Inference Throughput Developing example inference-per-second forecasts Model Management Introduction to KFServing Advantages of Using KFServing Core Concepts in KFServing InferenceService Endpoint Predictor Explainer Transformers Leveraging canarying with KFServing Outlier detection Concept drift Supported Pre-Built Model Servers InferenceService and storage provider support Google Cloud Storage S3-compatible object storage Azure Blob Storage Local container filesystem Persistent volume claim KFServing Security Model Managing Models with KFServing Installing KFServing on a Kubernetes Cluster Installing KFServing standalone on Minikube Deploying a Model on KFServing Deploying a Python TensorFlow model as an InferenceService Deploy InferenceService with custom model serving strategy Managing Model Traffic with Canarying Deploying a Custom Transformer Roll Back a Deployed Model Removing a Deployed Model Summary
A. Infrastructure Concepts Public Key Infrastructure Authentication Kubeflow and Authentication Authorization Authorization and Role-Based Access Control Lightweight Directory Access Protocol Kerberos Transport Layer Security X.509 Cert Webhook Active Directory Identity Providers Identity-Aware Proxy (IAP) IAP and Google Cloud Platform OAuth OpenID Connect End-User Authentication with JWT Simple and Protected GSS_API Negotiation Mechanism Dex: A Federated OpenID Connect Provider Dex and Kerberos Service Accounts The Control Plane Options for Securing the Control Plane
B. An Overview of Kubernetes Core Kubernetes Concepts Pod Object Spec and Status Describing a Kubernetes Object Submitting Containers to Kubernetes Kubernetes Resource Model Custom Resources, Controllers, and Operators Custom Controllers Custom Resource Definition
C. Istio Operations and Kubeflow Service Mesh Management with Istio Istio Architecture Control plane Data plane Traffic Management Virtual services Destination rules Gateways Istio Security Architecture Policies Istio identity Istio authentication Istio Authorization and Role-Based Access Control Authorization policies ServiceRole ServiceRoleBinding
Index
How to download source code?
1. Go to: https://www.oreilly.com/
2. Search for the book title: Kubeflow Operations Guide: Managing Cloud and On-Premise Deployment. Sometimes the full title returns no results; in that case, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.