Deep Learning in Production
- Length: 328 pages
- Edition: 1
- Language: English
- Publication Date: 2021-11-23
- ASIN: B09MJF24HZ
- Sales Rank: #687883
Build, train, deploy, scale and maintain deep learning models. Understand ML infrastructure and MLOps using hands-on examples.
Deep Learning research has advanced rapidly over the past few years, and frameworks and libraries are constantly being developed and updated. However, we still lack standardized solutions for serving, deploying, and scaling Deep Learning models: Deep Learning infrastructure is not yet very mature.
This book collects a set of best practices and approaches for building robust and scalable machine learning applications. It covers the entire lifecycle, from data processing and training to deployment and maintenance, and it will help you transfer methodologies that are generally accepted and applied in the software community into Deep Learning projects.
It’s an excellent choice for researchers with a minimal software background, software engineers with little experience in machine learning, or aspiring machine learning engineers.
What will you learn?
- Best practices to write Deep Learning code
- How to unit test and debug Machine Learning code
- How to build and deploy efficient data pipelines
- How to serve Deep Learning models
- How to deploy and scale your application
- What is MLOps and how to build end-to-end pipelines
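As a flavor of the unit-testing topic above, here is a deliberately minimal, pytest-style sketch (the `normalize` function and the test are illustrative placeholders, not material from the book) that checks a tiny preprocessing step:

```python
def normalize(pixels):
    """Scale 8-bit pixel values (0-255) to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]

def test_normalize_range_and_length():
    batch = [0, 51, 102, 255]
    out = normalize(batch)
    # Length is preserved and every value is rescaled into [0, 1].
    assert len(out) == len(batch)
    assert out[0] == 0.0 and out[-1] == 1.0
    assert all(0.0 <= v <= 1.0 for v in out)
```

Running `pytest` against a file of such tests catches shape and range regressions in data code before they silently corrupt a training run.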
Who is this book for?
- Software engineers who are starting out with deep learning
- Machine learning researchers with limited software engineering background
- Machine learning engineers who seek to strengthen their knowledge
- Data scientists who want to productionize their models and build customer-facing applications
What tools will you use?
Tensorflow, Flask, uWSGI, Nginx, Docker, Kubernetes, Tensorflow Extended, Google Cloud, Vertex AI
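The book's serving chapters wire several of these tools together around a Flask app. As a rough, framework-free illustration of the request/response pattern involved (the `predict` function and the endpoint name are hypothetical placeholders, not the book's code), a JSON "predict" endpoint might look like:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Placeholder inference function; a real service would invoke the model here."""
    return {"score": sum(features) / max(len(features), 1)}

class PredictHandler(BaseHTTPRequestHandler):
    """Accepts POST /predict with a JSON body like {"features": [1, 2, 3]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("features", []))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve():
    # Blocks forever; in production a process manager like uWSGI runs the workers.
    HTTPServer(("0.0.0.0", 8000), PredictHandler).serve_forever()

# serve()  # uncomment to start the server on port 8000
```

A client would POST JSON to `http://host:8000/predict`. In the book's actual setup, Flask plays the application role, uWSGI manages the worker processes, and Nginx sits in front as a reverse proxy.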
Table of Contents

Preface
Acknowledgements
1. About this Book
   1.1 Welcome to Deep Learning in Production
   1.2 Is this book for me?
       - Software engineers
       - Machine Learning researchers
       - Machine Learning engineers
       - Data Scientists
   1.3 What is the book's goal?
   1.4 Will this be difficult to learn?
   1.5 Why should you read this book?
   1.6 How to use this book?
   1.7 How is the book structured?
   1.8 Do I need to know anything else before I get started?
2. Designing a Machine Learning System
   2.1 Machine learning: phase zero
   2.2 Data engineering
   2.3 Model engineering
   2.4 DevOps engineering
   2.5 Putting it all together
   2.6 Tackling a real-life problem
3. Setting up a Deep Learning Workstation
   3.1 Laptop setup
       3.1.1 Laptop requirements
       3.1.2 Operating system
   3.2 Frameworks and libraries
   3.3 Development tools
       3.3.1 Terminal
       3.3.2 Version control
   3.4 Python package and environment management
       3.4.1 IDE / code editor
       3.4.2 Other tools
4. Writing and Structuring Deep Learning Code
   4.1 Best practices
       4.1.1 Project structure
           - Python modules and packages
       4.1.2 Object-oriented programming
           - Abstraction and Inheritance
           - Static and class methods
       4.1.3 Configuration
       4.1.4 Type checking
       4.1.5 Documentation
   4.2 Unit testing
       - Why do we need unit testing?
       4.2.1 Basics of unit testing
       4.2.2 Unit tests in Python
       4.2.3 Tests in Tensorflow
       4.2.4 Mocking
       4.2.5 Test coverage
       4.2.6 Test example cases
       4.2.7 Integration / acceptance tests
   4.3 Debugging
       4.3.1 How to debug a deep learning project?
       4.3.2 Python's debugger
       4.3.3 Debugging data with schema validation
       4.3.4 Logging
       4.3.5 Python's Logging module
       4.3.6 Useful Tensorflow debugging and logging functions
5. Data Processing
   5.1 ETL: Extract, Transform, Load
   5.2 Data reading
       5.2.1 Loading from multiple sources
       5.2.2 Parallel data extraction
   5.3 Processing
   5.4 Loading
       5.4.1 Iterators
   5.5 Optimizing a data pipeline
       5.5.1 Batching
       5.5.2 Prefetching
       5.5.3 Caching
       5.5.4 Streaming
6. Training
   6.1 Building a trainer
       6.1.1 Creating a custom training loop
       6.1.2 Training checkpoints
       6.1.3 Saving the trained model
       6.1.4 Visualizing the training with Tensorboard
       6.1.5 Model validation
   6.2 Training in the cloud
       6.2.1 Getting started with cloud computing
       6.2.2 Creating a VM instance
       6.2.3 Connecting to the VM instance
       6.2.4 Transferring files to the VM instance
       6.2.5 Running the training remotely
       6.2.6 Accessing training data from a remote environment
   6.3 Distributed training
       6.3.1 Data vs model parallelism
       6.3.2 Training in a single machine
       6.3.3 Synchronous training
           - Mirrored Strategy
           - Multi Worker Mirrored Strategy
           - Central Storage Strategy
       6.3.4 Asynchronous training
           - Parameter Server Strategy
       6.3.5 Model parallelism
7. Serving
   7.1 Preparing the model
       7.1.1 Building the model's inference function
   7.2 Creating a web application using Flask
       7.2.1 Basics of modern web applications
           - How does the client know in what format the server expects the request (data)?
           - What does an HTTP request look like?
           - How does the client know where to send the request?
           - How do we actually communicate with the server?
       7.2.2 Exposing the deep learning model using Flask
       7.2.3 Creating a client
   7.3 Serving with uWSGI and Nginx
       7.3.1 Basic terminology
           - What is uWSGI? Why do we need uWSGI? Isn't Flask adequate?
           - What is Nginx?
           - Nginx as a reverse proxy: what is a reverse proxy and why use it?
           - What is a web socket?
       7.3.2 Designing a serving system
       7.3.3 Setting up a uWSGI server with Flask
           - Analysing the uwsgi config file
       7.3.4 Setting up Nginx as a reverse proxy
   7.4 Serving with model servers
       7.4.1 Tensorflow Serving vs Flask
       7.4.2 Export a Tensorflow model
       7.4.3 Install Tensorflow Serving
       7.4.4 Load a model
       7.4.5 Multiple versions support
       7.4.6 Multiple models support
       7.4.7 Batching inferences
8. Deploying
   8.1 Containerizing using Docker and Docker Compose
       8.1.1 What is a container?
       8.1.2 What is Docker?
       8.1.3 Setting up Docker
           - Docker image
       8.1.4 Building a deep learning Docker image
           - Restructuring our app for deployment
           - Dockerfiles
       8.1.5 Running a deep learning Docker container
       8.1.6 Creating an Nginx container
       8.1.7 Defining multi-container Docker apps using Docker Compose
   8.2 Deploying in a production environment
       8.2.1 Using containers in Google Cloud
           1) Create a vanilla VM instance and then install Docker
           2) Use a Container-Optimized OS image
           3) Use a public VM image
       8.2.2 Allowing network traffic to the instance
       8.2.3 Deploying in Google Cloud
           1) Create a vanilla VM instance and then install Docker
           2) Use a Container-Optimized OS image
           3) Use a public VM image
   8.3 Continuous Integration and Delivery (CI / CD)
9. Scaling
   9.1 A journey from 1 to millions of users
       - Glossary
       9.1.1 First iterations of the machine learning app
           - First iteration
           - Second iteration: logs and CI/CD pipeline
           - Third iteration: Docker container
       9.1.2 Vertical vs horizontal scaling
           - Fourth iteration: scaling up
           - Fifth iteration: scaling out
       9.1.3 Autoscaling
       9.1.4 Cache mechanisms
       9.1.5 Monitoring alerts
       9.1.6 Retraining machine learning models
       9.1.7 Model A/B testing
       9.1.8 Offline inference
   9.2 Growing with Kubernetes
       9.2.1 What is Kubernetes?
       9.2.2 Getting started with Kubernetes
       9.2.3 Deploying with Google Kubernetes Engine
           1) Installing gcloud and kubectl
           2) Transferring local files and Dockerfile
           3) Setting up a Kubernetes Deployment
           4) Setting up a Kubernetes Service
       9.2.4 Scaling with Kubernetes
       9.2.5 Updating the application
       9.2.6 Monitoring the application
       9.2.7 Running a (re)training job
       9.2.8 Using Kubernetes with GPUs
       9.2.9 Model A/B testing
10. Building an End-to-End Pipeline
    10.1 MLOps
        10.1.1 Basic principles
        10.1.2 MLOps levels
            - Level 0: Manual process
            - Level 1: ML pipeline automation
            - Level 2: CI/CD pipeline automation
    10.2 Building a pipeline using TFX
        10.2.1 TFX glossary
        10.2.2 Data ingestion
        10.2.3 Data validation
        10.2.4 Feature engineering
        10.2.5 Train the model
        10.2.6 Validate model
        10.2.7 Push model
        10.2.8 Build a TFX pipeline
        10.2.9 Run a TFX pipeline
    10.3 MLOps with Vertex AI and Google Cloud
        10.3.1 Hands on Vertex AI
        10.3.2 Experimenting with notebooks
        10.3.3 Loading data
            - Managed Datasets
            - Custom datasets
        10.3.4 Training the model
        10.3.5 Deploying to Vertex AI
            - Creating a model
            - Creating an endpoint
        10.3.6 Creating a pipeline
    10.4 More end-to-end solutions
        - MLOps from big cloud providers
        - Open-source frameworks for ML workflows
        - Enterprise solutions
        - So how do I choose?
11. Where to Go from Here
Appendix
Table of Figures
About the Author
Notes