Deep Learning in Production
- Length: 328 pages
- Edition: 1
- Language: English
- Publication Date: 2021-11-23
- ASIN: B09MJF24HZ
- Sales Rank: #687883
Build, train, deploy, scale and maintain deep learning models. Understand ML infrastructure and MLOps using hands-on examples.
Deep Learning research has advanced rapidly over the past few years, and frameworks and libraries are constantly being developed and updated. However, we still lack standardized solutions for serving, deploying, and scaling Deep Learning models: Deep Learning infrastructure is not yet very mature.
This book collects a set of best practices and approaches for building robust and scalable machine learning applications. It covers the entire lifecycle, from data processing and training to deployment and maintenance, and it will help you transfer methodologies that are generally accepted and applied in the software community into Deep Learning projects.
It’s an excellent choice for researchers with a minimal software background, software engineers with little experience in machine learning, or aspiring machine learning engineers.
What will you learn?
- Best practices to write Deep Learning code
- How to unit test and debug Machine Learning code
- How to build and deploy efficient data pipelines
- How to serve Deep Learning models
- How to deploy and scale your application
- What is MLOps and how to build end-to-end pipelines
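As a flavor of the unit-testing topic above, here is a deliberately minimal, pytest-style sketch (the `normalize` function and the test are illustrative placeholders, not material from the book) that checks a tiny preprocessing step:

```python
def normalize(pixels):
    """Scale 8-bit pixel values (0-255) to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]

def test_normalize_range_and_length():
    batch = [0, 51, 102, 255]
    out = normalize(batch)
    # Length is preserved and every value is rescaled into [0, 1].
    assert len(out) == len(batch)
    assert out[0] == 0.0 and out[-1] == 1.0
    assert all(0.0 <= v <= 1.0 for v in out)
```

Running `pytest` against a file of such tests catches shape and range regressions in data code before they silently corrupt a training run.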
Who is this book for?
- Software engineers who are starting out with deep learning
- Machine learning researchers with limited software engineering background
- Machine learning engineers who seek to strengthen their knowledge
- Data scientists who want to productionize their models and build customer-facing applications
What tools will you use?
Tensorflow, Flask, uWSGI, Nginx, Docker, Kubernetes, Tensorflow Extended, Google Cloud, Vertex AI
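The book's serving chapters wire several of these tools together around a Flask app. As a rough, framework-free illustration of the request/response pattern involved (the `predict` function and the endpoint name are hypothetical placeholders, not the book's code), a JSON "predict" endpoint might look like:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Placeholder inference function; a real service would invoke the model here."""
    return {"score": sum(features) / max(len(features), 1)}

class PredictHandler(BaseHTTPRequestHandler):
    """Accepts POST /predict with a JSON body like {"features": [1, 2, 3]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("features", []))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve():
    # Blocks forever; in production a process manager like uWSGI runs the workers.
    HTTPServer(("0.0.0.0", 8000), PredictHandler).serve_forever()

# serve()  # uncomment to start the server on port 8000
```

A client would POST JSON to `http://host:8000/predict`. In the book's actual setup, Flask plays the application role, uWSGI manages the worker processes, and Nginx sits in front as a reverse proxy.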
Table of Contents

Preface
Acknowledgements
1. About this Book
   1.1 Welcome to Deep Learning in Production
   1.2 Is this book for me?
       - Software engineers
       - Machine Learning researchers
       - Machine Learning engineers
       - Data Scientists
   1.3 What is the book's goal?
   1.4 Will this be difficult to learn?
   1.5 Why should you read this book?
   1.6 How to use this book?
   1.7 How is the book structured?
   1.8 Do I need to know anything else before I get started?
2. Designing a Machine Learning System
   2.1 Machine learning: phase zero
   2.2 Data engineering
   2.3 Model engineering
   2.4 DevOps engineering
   2.5 Putting it all together
   2.6 Tackling a real-life problem
3. Setting up a Deep Learning Workstation
   3.1 Laptop setup
       3.1.1 Laptop requirements
       3.1.2 Operating system
   3.2 Frameworks and libraries
   3.3 Development tools
       3.3.1 Terminal
       3.3.2 Version control
   3.4 Python package and environment management
       3.4.1 IDE / code editor
       3.4.2 Other tools
4. Writing and Structuring Deep Learning Code
   4.1 Best practices
       4.1.1 Project structure
           - Python modules and packages
       4.1.2 Object-oriented programming
           - Abstraction and Inheritance
           - Static and class methods
       4.1.3 Configuration
       4.1.4 Type checking
       4.1.5 Documentation
   4.2 Unit testing
       - Why do we need unit testing?
       4.2.1 Basics of unit testing
       4.2.2 Unit tests in Python
       4.2.3 Tests in Tensorflow
       4.2.4 Mocking
       4.2.5 Test coverage
       4.2.6 Test example cases
       4.2.7 Integration / acceptance tests
   4.3 Debugging
       4.3.1 How to debug a deep learning project?
       4.3.2 Python's debugger
       4.3.3 Debugging data with schema validation
       4.3.4 Logging
       4.3.5 Python's Logging module
       4.3.6 Useful Tensorflow debugging and logging functions
5. Data Processing
   5.1 ETL: Extract, Transform, Load
   5.2 Data reading
       5.2.1 Loading from multiple sources
       5.2.2 Parallel data extraction
   5.3 Processing
   5.4 Loading
       5.4.1 Iterators
   5.5 Optimizing a data pipeline
       5.5.1 Batching
       5.5.2 Prefetching
       5.5.3 Caching
       5.5.4 Streaming
6. Training
   6.1 Building a trainer
       6.1.1 Creating a custom training loop
       6.1.2 Training checkpoints
       6.1.3 Saving the trained model
       6.1.4 Visualizing the training with Tensorboard
       6.1.5 Model validation
   6.2 Training in the cloud
       6.2.1 Getting started with cloud computing
       6.2.2 Creating a VM instance
       6.2.3 Connecting to the VM instance
       6.2.4 Transferring files to the VM instance
       6.2.5 Running the training remotely
       6.2.6 Accessing training data from a remote environment
   6.3 Distributed training
       6.3.1 Data vs model parallelism
       6.3.2 Training in a single machine
       6.3.3 Synchronous training
           - Mirrored Strategy
           - Multi Worker Mirrored Strategy
           - Central Storage Strategy
       6.3.4 Asynchronous training
           - Parameter Server Strategy
       6.3.5 Model parallelism
7. Serving
   7.1 Preparing the model
       7.1.1 Building the model's inference function
   7.2 Creating a web application using Flask
       7.2.1 Basics of modern web applications
           - How does the client know in what format the server expects the request (data)?
           - What does an HTTP request look like?
           - How does the client know where to send the request?
           - How do we actually communicate with the server?
       7.2.2 Exposing the deep learning model using Flask
       7.2.3 Creating a client
   7.3 Serving with uWSGI and Nginx
       7.3.1 Basic terminology
           - What is uWSGI? Why do we need uWSGI? Isn't Flask adequate?
           - What is Nginx?
           - Nginx as a reverse proxy: what is a reverse proxy and why use it?
           - What is a web socket?
       7.3.2 Designing a serving system
       7.3.3 Setting up a uWSGI server with Flask
           - Analysing the uwsgi config file
       7.3.4 Setting up Nginx as a reverse proxy
   7.4 Serving with model servers
       7.4.1 Tensorflow Serving vs Flask
       7.4.2 Export a Tensorflow model
       7.4.3 Install Tensorflow Serving
       7.4.4 Load a model
       7.4.5 Multiple versions support
       7.4.6 Multiple models support
       7.4.7 Batching inferences
8. Deploying
   8.1 Containerizing using Docker and Docker Compose
       8.1.1 What is a container?
       8.1.2 What is Docker?
       8.1.3 Setting up Docker
           - Docker image
       8.1.4 Building a deep learning Docker image
           - Restructuring our app for deployment
           - Dockerfiles
       8.1.5 Running a deep learning Docker container
       8.1.6 Creating an Nginx container
       8.1.7 Defining multi-container Docker apps using Docker Compose
   8.2 Deploying in a production environment
       8.2.1 Using containers in Google Cloud
           1) Create a vanilla VM instance and then install Docker
           2) Use a Container-Optimized OS image
           3) Use a public VM image
       8.2.2 Allowing network traffic to the instance
       8.2.3 Deploying in Google Cloud
           1) Create a vanilla VM instance and then install Docker
           2) Use a Container-Optimized OS image
           3) Use a public VM image
   8.3 Continuous Integration and Delivery (CI / CD)
9. Scaling
   9.1 A journey from 1 to millions of users
       - Glossary
       9.1.1 First iterations of the machine learning app
           - First iteration
           - Second iteration: logs and CI/CD pipeline
           - Third iteration: Docker container
       9.1.2 Vertical vs horizontal scaling
           - Fourth iteration: scaling up
           - Fifth iteration: scaling out
       9.1.3 Autoscaling
       9.1.4 Cache mechanisms
       9.1.5 Monitoring alerts
       9.1.6 Retraining machine learning models
       9.1.7 Model A/B testing
       9.1.8 Offline inference
   9.2 Growing with Kubernetes
       9.2.1 What is Kubernetes?
       9.2.2 Getting started with Kubernetes
       9.2.3 Deploying with Google Kubernetes Engine
           1) Installing gcloud and kubectl
           2) Transferring local files and Dockerfile
           3) Setting up a Kubernetes Deployment
           4) Setting up a Kubernetes Service
       9.2.4 Scaling with Kubernetes
       9.2.5 Updating the application
       9.2.6 Monitoring the application
       9.2.7 Running a (re)training job
       9.2.8 Using Kubernetes with GPUs
       9.2.9 Model A/B testing
10. Building an End-to-End Pipeline
    10.1 MLOps
        10.1.1 Basic principles
        10.1.2 MLOps levels
            - Level 0: Manual process
            - Level 1: ML pipeline automation
            - Level 2: CI/CD pipeline automation
    10.2 Building a pipeline using TFX
        10.2.1 TFX glossary
        10.2.2 Data ingestion
        10.2.3 Data validation
        10.2.4 Feature engineering
        10.2.5 Train the model
        10.2.6 Validate model
        10.2.7 Push model
        10.2.8 Build a TFX pipeline
        10.2.9 Run a TFX pipeline
    10.3 MLOps with Vertex AI and Google Cloud
        10.3.1 Hands on Vertex AI
        10.3.2 Experimenting with notebooks
        10.3.3 Loading data
            - Managed Datasets
            - Custom datasets
        10.3.4 Training the model
        10.3.5 Deploying to Vertex AI
            - Creating a model
            - Creating an endpoint
        10.3.6 Creating a pipeline
    10.4 More end-to-end solutions
        - MLOps from big cloud providers
        - Open-source frameworks for ML workflows
        - Enterprise solutions
        - So how do I choose?
11. Where to Go from Here
Appendix
Table of Figures
About the Author
Notes