Elements of Deep Learning for Computer Vision

by Bharat Sikka

Length: 208 pages
Edition: 1
Language: English
Publisher: BPB Publications
Publication Date: 2021-06-25
ISBN-10: 9390684684
ISBN-13: 9789390684687

Conceptualizing deep learning in computer vision applications using PyTorch and Python libraries.

Key Features

Covers a variety of computer vision projects, including face recognition and object detection with algorithms such as YOLO and Faster R-CNN.
Includes graphical representations and illustrations of neural networks and teaches how to program them.
Includes deep learning techniques and architectures introduced by Microsoft, Google, and the University of Oxford.

Description

Elements of Deep Learning for Computer Vision builds a thorough understanding of deep learning and shows how to develop accurate computer vision solutions with libraries such as PyTorch.

This book introduces you to deep learning and explains the concepts required to understand how a neural network works and how to develop and tune one using PyTorch. The book then addresses the field of computer vision using two libraries: OpenCV, through its Python wrapper, and PIL. Having established both sets of concepts, the book brings them together by explaining Convolutional Neural Networks (CNNs). CNNs are then elaborated using leading industry and research architectures to explain how they perform object detection in images and videos and how such models are evaluated. Towards the end, the book explains how to develop a fully functional object detection model, including its deployment over APIs.

By the end of this book, you will understand the role of deep learning in computer vision and have a guided process for designing deep learning solutions.

What you will learn

Get to know the mechanism of deep learning and how neural networks operate.
Learn to develop a highly accurate neural network model.
Use rich Python libraries to address computer vision challenges.
Build deep learning models using PyTorch and learn how to deploy them through an API.
Learn to develop Object Detection and Face Recognition models along with their deployment.

Who this book is for

This book is for readers who aspire to gain a strong fundamental understanding of how to apply deep learning to computer vision and image processing applications. Readers are expected to have intermediate Python skills. No previous knowledge of PyTorch or computer vision is required.

About the Author

Bharat Sikka is a data scientist based in Mumbai, India. Over the years, he has worked on implementing algorithms such as YOLOv3/v4, Faster R-CNN, and Mask R-CNN, among others. He is currently working as a data scientist at the State Bank of India.

He holds an MS degree in Data Science and Analytics from Royal Holloway, University of London, and a BTech degree in Information Technology from Symbiosis International University. He has also earned multiple certifications, including MOOCs in varied fields such as machine learning.

He is a science fiction fanatic, loves to travel, and is a great cook.

GitHub: https://github.com/bharatsikka

LinkedIn Profile: www.linkedin.com/in/bharat-sikka

Cover Page
Title Page
Copyright Page
Dedication Page
About the Author
About the Reviewer
Acknowledgement
Preface
Errata
Table of Contents
Section 1: Introductory Concepts
    1. An Introduction to Deep Learning
        Objectives
        1.1 Artificial intelligence
        1.2 Machine learning
        1.3 Deep learning
        1.4 Future of deep learning
    2. Supervised Learning
        Objectives
        2.1 Data and Supervised learning
        2.2 Tasks in supervised learning
        2.3 Neurons and layers
        2.4 Regression and classification output neurons
        2.5 Neural networks using PyTorch
            PyTorch requirements
            PyTorch installation
        2.6 Classification of Iris species using Iris dataset and PyTorch
    3. Gradient Descent
        Objectives
        3.1 Gradient descent
        3.2 Overfitting and underfitting
        3.3 Regularizations and learning rate
        3.4 Stochastic Gradient Descent
        3.5 Loss Functions and optimizers
        Conclusion
Section 2: Computer Vision
    4. OpenCV with Python
        Objectives
        4.1 Computer vision
        4.2 OpenCV
            Further operations on images
                Image properties and resizing
                Pixel manipulation
                Region of image and padding
            Face recognition
        Conclusion
    5. Python Imaging Library and Pillow
        Objectives
        5.1 Python Imaging Library
            Basic image operations
                Reading an image
                Displaying an image
                Writing/saving an image
                Image properties and resizing
            Pixel manipulation
            Region of image and padding
            Image enhancing
                A viral image
        Conclusion
Section 3: Convolutional Neural Networks for Vision
    6. Introduction to Convolutional Neural Networks
        Objectives
        6.1 Convolutional Neural Networks (CNNs)
            Weights and parameters
            Pooling
            Padding
            Transfer learning
            CNN classifier implementation using CIFAR 10 and PyTorch
        Conclusion
    7. GoogLeNet, VGGNet, and ResNet
        Objectives
        7.1 GoogLeNet
        7.2 VGGNet
        7.3 ResNet
        7.4 Torchvision
            Datasets
            IO
            Models
            Ops, Transforms, and Utils
        Conclusion
Section 4: Object Detection
    8. Understanding Object Detection
        Objectives
        8.1 Introduction to object detection
        8.2 Classification
        8.3 Localization
        8.4 Detection
        8.5 mean Average Precision (mAP)
        Conclusion
    9. Popular Algorithms for Object Detection
        Objectives
        9.1 OverFeat
            Working and implementation
        9.2 Region-based CNN
            Selective search
            Working and implementation
        9.3 Fast R-CNN
            Region of interest pooling
            Working and implementation
        9.4 Faster R-CNN
            Working and implementation
            Anchors
        9.5 You Only Look Once (YOLO)
            Working and implementation
        Conclusion
    10. Faster R-CNN with PyTorch and YOLOv4 with Darknet
        Objectives
        10.1 Torchvision libraries continued
            Transforms
                Transforms on PIL image
            Transforms on torch
            Utils
        10.2 Object Detection using PyTorch
        10.3 Object detection using YOLO
        Conclusion
    11. Comparing Algorithms and API Deployment with Flask
        Objectives
        11.1 Comparing mean Average Precision (mAP) of Faster R-CNN and YOLO
            Faster R-CNN performance in mAP
            YOLO performance
        11.2 Model deployment using Flask
            Installation
            Initialization and Hello World!
        Conclusion
Section 5: Further Usage and Applications in Real Life
    12. Applications in Real World
        Objectives
        12.1 Introduction to Detecto
            Installation
            Dataset
            Labelling/annotating a dataset
            Training a model using Detecto
        Conclusion
References
Index