Getting started with Deep Learning for Natural Language Processing: Learn how to build NLP applications with Deep Learning
- Length: 404 pages
- Edition: 1
- Language: English
- Publisher: BPB Publications
- Publication Date: 2021-01-13
- ISBN-10: 9389898110
- ISBN-13: 9789389898118
- Sales Rank: #2152098
Learn how to design NLP applications from scratch.
Key Features
- Get familiar with the basics of any Machine Learning or Deep Learning application.
- Understand how preprocessing works in an NLP pipeline.
- Use simple PyTorch snippets to create the basic building blocks of networks commonly used in NLP (see the sketch after this list).
- Learn how to build a complex NLP application.
- Get familiar with advanced embedding techniques, generative networks, and audio signal processing techniques.
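The book's own snippets are not reproduced here, but as a minimal sketch of the kind of PyTorch building block the features describe (the model, vocabulary size, and class count below are invented for the example, not taken from the book), a tiny embedding-plus-LSTM classifier looks like this:

```python
import torch
import torch.nn as nn

class TinyTextClassifier(nn.Module):
    """Embedding -> LSTM -> linear head: the kind of basic NLP
    building block assembled from simple PyTorch snippets.
    All sizes here are illustrative placeholders."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)   # hidden: (1, batch, hidden_dim)
        return self.fc(hidden[-1])             # (batch, num_classes)

# Smoke test on a random batch of token ids
model = TinyTextClassifier()
logits = model(torch.randint(0, 10000, (4, 32)))
print(logits.shape)  # torch.Size([4, 2])
```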
Description
Natural language processing (NLP) is one of the areas where many Machine Learning and Deep Learning techniques are applied.
This book covers a wide range of areas, including the fundamentals of Machine Learning, understanding and optimizing hyperparameters, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). It covers not only the classical concepts of text processing but also recent advancements, and it will empower readers to design networks with the least computational and time complexity. Beyond the basics of Natural Language Processing, the book helps in deciphering the logic behind advanced concepts and architectures such as Batch Normalization, Position Embedding, DenseNet, the Attention Mechanism, Highway Networks, Transformer models, and Siamese Networks, and covers recent advancements such as ELMo-BiLM, SkipThought, and BERT. It also provides practical implementations, with step-by-step explanations, of deep learning techniques for Topic Modelling, Text Generation, Named Entity Recognition, Text Summarization, and Language Translation. In addition, advanced and open research topics such as Generative Adversarial Networks and Speech Processing are covered.
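One example of the advanced concepts listed above is Position Embedding. The book's implementation is not reproduced here; the snippet below is only the standard sinusoidal positional encoding from the Transformer literature, written as a self-contained sketch:

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard sinusoidal position embedding (d_model assumed even)."""
    position = torch.arange(seq_len).unsqueeze(1).float()        # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float()
                         * (-math.log(10000.0) / d_model))       # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe

print(sinusoidal_positional_encoding(50, 512).shape)  # torch.Size([50, 512])
```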
What will you learn
- Learn how to leverage GPUs for Deep Learning (see the sketch after this list).
- Learn how to use complex embedding models such as BERT.
- Get familiar with common NLP applications.
- Learn how to use GANs in NLP.
- Learn how to process speech data and implement it in speech applications.
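For the GPU point in the list above, the usual PyTorch pattern is to move both the model and its inputs to the same device. A minimal sketch, assuming a placeholder linear layer and batch shape rather than anything from the book:

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(128, 2).to(device)   # move parameters to the device
batch = torch.randn(32, 128, device=device)  # allocate inputs on the same device
logits = model(batch)
print(logits.device)
```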
Who this book is for
This book is a must-read for everyone who wishes to start a career in Machine Learning and Deep Learning. It is also for those who want to use GPUs to develop Deep Learning applications.
About the Authors
Sunil Patel completed his master's in Information Technology at the Indian Institute of Information Technology, Allahabad, with a thesis focused on investigating 3D protein-protein interactions with deep learning. Sunil worked with TCS Innovation Labs, Excelra, and Innoplexus before joining Nvidia. His main areas of research involved applying Deep Learning and Natural Language Processing in the banking and healthcare domains.
Sunil started experimenting with deep learning by implementing the basic layers used in pipelines and then developing complex pipelines for real-life problems. Apart from this, he participated in CASP-2014 in collaboration with SCFBIO, IIT Delhi, to efficiently predict possible protein multimer formation and its impact on diseases using Deep Learning. Currently, Sunil works with Nvidia as a Data Scientist – III, where he has expanded his areas of interest to computer vision and simulated environments.
LinkedIn Profile: https://www.linkedin.com/in/linus1/
Table of Contents
Front Matter: Cover Page; Title Page; Copyright Page; Dedication Page; About the Author; About the Reviewer; Acknowledgements; Preface; Errata; Table of Contents
1. Understanding the Basics of Learning Process: Structure; Objective; Pre-requisites; Learning from Data; Implementing the Perceptron Model; Generating and Understanding "Fake Image Data" and Binary Labels; Understanding Our First Tiny Machine Learning Model; Coding the Model with PyTorch; Confirming the Convergence of the Model; Error/Noise Reduction; Understanding Confusion Matrix and Derived Measures; Defining Weighted Loss Function; BLEU Score; Bias-Variance Problem; Scikit-Learn Functions to Build Pipelines Quickly; Managing the Bias and Variance; Learning Curves; Loading Data, Pre-processing; Using Simple Regression; Using Random Forest Regression; Regularization; L1 Regularization (Lasso Regularization); L2 Regularization (Ridge Regularization); Implementing Lasso Regression; Implementing ElasticNet; Training and Inference; Software-based Accelerated Inferencing; Hardware-based Accelerated Inferencing; The Three Learning Principles; Model-related Concepts; Data-related Concepts; Conclusion
2. Text Processing Techniques: Structure; Objective; Pre-requisites; Understanding the Language Problem; Introduction to Data Retrieval and Processing; Scraping the Web Page; Parsing Data from XML and JSON Format; Understanding Stemming; Understanding Snowball Algorithm; Understanding Lemmatization; Understanding Tokenization; Using NLTK Tokenizer; Using spaCy Tokenizer; Getting Familiarized with PyTorch; Installation; Using TorchText; Visualizing Using TensorBoard; Showing Scalar Values on TensorboardX; Projecting Images to TensorboardX; Showing Text on TensorboardX; Projecting Embedding Values on TensorboardX; Conclusion
3. Representing Language Mathematically: Structure; Objective; Prerequisite; Encompassing Knowledge to Numbers; Understanding the Different Approaches of Converting a Word/Token to Its Embedding; Understanding Co-occurrence Matrix; Constructing a Co-occurrence Matrix; Understanding TF-IDF; Term Frequency; Inverse Document Frequency; Constructing TF-IDF Matrix; Understanding Word2Vec; Understanding Methods to Train Word2Vec; Implementation; Word2Vec Improved Version; Sub-sampling; Word Pairs and Phrases; Negative Sampling; Understanding GloVe; Defining Learnable Parameters; Defining Loss Function; Many Important Components; Understanding Character-based Embedding; Character-based Embedding Generation; Conclusion
4. Using RNN for NLP: Structure; Objective; Pre-requisites; Understanding Recurrent Units; Rolling and Unrolling; Implementing the Concept of Embeddings; Downloading Dataset; Pre-processing; Training; Understanding Advanced RNN Units; Gating Mechanism in LSTM; Modified LSTM Units; Understanding and Implementing GRU; GRU with PyTorch; Understanding the Sequence to Sequence Model; Implementing Sequence Encoder/Decoder; Encoder; Decoder; Actual Training; Evaluation; Understanding Batching with Seq2Seq; Decoder Phase; Encoder and Decoder with Batching; Decoder; The Loss Function for Sequence to Sequence; Translating in Batches with Seq2Seq; Implementing Encoder/Decoder Capable of Batch Processing; Encoder; Decoder; The Loss Function for Sequence to Sequence; Implementing Attention for Language Translation; Encoder; Attention Mechanism; Decoder; Conclusion
5. Applying CNN in NLP Tasks: Structure; Objective; Pre-requisites; Understanding CNN; Understanding Convolution Operations; Convolution Layers; Padding; Stride; Pooling Layers; Fully Connected Layers; Convolution 1D; Convolution 2D; Pool Layers; Rectified Linear Unit (ReLU); Using Word-Level CNN; Pre-processing; Embedding; Convolution Layers; Using Character-Level CNN; Understanding Character Representation; Network Architecture; Using Very Deep Convolution Network; The Convolution Block; Understanding the Network; Training Deeper Networks; ResNet; Highway Network; DenseNet; Fundamental Block of ResNet; Fundamental Block of Highway Network; DenseNet; Conclusion
6. Accelerating NLP with Transfer Learning: Structure; Objective; Pre-requisites; Introduction; Understanding the Transformer; Source and Target Masking; Positional Encoding; Converting Sentence to Vector; Sentence to Vector; Skip Thought; Getting to Know Contextual Vectors; Using the Pre-trained Model; Training Supervised Embedding; Playing with InferSent; Understanding and Using BERT; Conclusion
7. Applying Deep Learning to NLP Tasks: Structure; Objective; Technical Requirements; Topic Modeling; Applying LDA; Text Generation; Understanding the Network; Building Text Summarization Engine; Abstractive Text Summarization; Building Language Translation Using a Transformer; Using a Transformer; Advancing Sentiment Analysis; Understanding Attention Mechanism; Building Named Entity Recognition; Word-level NER; Character-level NER; Conclusion
8. Application of Complex Architectures in NLP: Structure; Objective; Technical Requirements; Understanding SentencePiece; Understanding Random Multi-Model; Creating Flexible Networks Using RMDL; Applying RMDL on Reuter Data; Ensembling by Taking a Snapshot; The Learning Rate Modifier; Recording Snapshots; Predicting Using Snapshots; Getting to Know Siamese Networks; Dataset Description; Loading and Pre-processing Data; Constructing a Sister Network; The Stem; Application of RCNN; Preparing the Dataset; Why Is It Difficult?; How Can It Be Solved?; Predicting Using CNN; Predicting Using RCNN; Understanding CTC Loss; The Simplest Choice; How Does CTC Work?; Loss Calculation; Understanding Decoding; Installation; Usage; Captioning Image; Downloading the Data; Implementation; Encoder Module; Decoder Module; Beam Search; Variants; Conclusion
9. Understanding Generative Networks: Structure; Objective; Technical Requirements; Understanding Unsupervised Pretraining; GAN Components; The Generator; The Discriminator; The GAN Architecture; The Loss Function; Implementing GAN for MNIST; Understanding the Theory behind GAN; Generating an Image from the Description; Conclusion
10. Techniques of Speech Processing: Structure; Objective; Technical Requirements; Learning about Docker; Getting to Know Phonemes; Loading an Audio File; Playing an Audio File; Visualizing the Signals; Feature Extraction; MFCC (Mel-Frequency Cepstral Coefficients); Spectral Centroid; Spectral Rolloff; Training a Small Network; Feature Extraction; Constructing the CNN Model; Training and Estimating Performance on the Test Set; Understanding Speech to Text; Installation; Datasets; Pretrained Model; Training; Visualizing Training; Dataset Augmentation; Checkpoints and Continuing from Checkpoint; Testing/Inference; Running a Server; Understanding Text to Speech; Grapheme to Phoneme Model; The Segmentation Model; Phoneme Duration and Fundamental Frequency Model; Audio Synthesis Model; Download Dataset; Installation; Preprocessing; Training; Monitoring Using TensorBoard; Using the Model for Synthesis; Conclusion
11. The Road Ahead: Structure; Objective; Efficient Training; Parallel Data Loading; Utilizing Hardware Resources; Efficient Deployment; Hardware-related Optimizations; Conclusion
Index