Real-World Natural Language Processing: Practical applications with deep learning
- Length: 322 pages
- Edition: 1
- Language: English
- Publisher: Manning
- Publication Date: 2021-12-07
- ISBN-10: 1617296422
- ISBN-13: 9781617296420
- Sales Rank: #1220524
Voice assistants, automated customer service agents, and other cutting-edge human-to-computer interactions rely on accurately interpreting language as it is written and spoken. In Real-World Natural Language Processing, you’ll explore the core tools and techniques required to build a huge range of powerful NLP apps.
Real-World Natural Language Processing teaches you how to create practical NLP applications using Python and open source NLP libraries such as AllenNLP and Fairseq—without getting bogged down in complex language theory and the mathematics of deep learning. You’ll begin by creating a complete sentiment analyzer, then dive deep into each component to unlock the building blocks you’ll use in all different kinds of NLP programs. By the time you’re done, you’ll have the skills to create named entity taggers, machine translation systems, spelling correctors, and language generation systems.
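To give a feel for the book's first project, here is a minimal sketch of chapter 2's opening step: loading the Stanford Sentiment Treebank with an AllenNLP dataset reader. It assumes the allennlp and allennlp-models packages are installed; the trees/dev.txt path is a placeholder for a local copy of the SST trees files, and the book's own pipeline differs in detail.

    # Minimal sketch (not the book's exact code): read SST with AllenNLP's
    # dataset reader and inspect a few instances.
    from allennlp_models.classification.dataset_readers.stanford_sentiment_tree_bank import (
        StanfordSentimentTreeBankDatasetReader,
    )

    reader = StanfordSentimentTreeBankDatasetReader()
    # Placeholder path: the SST "trees" files store one parse tree per line.
    instances = reader.read("trees/dev.txt")

    for instance in list(instances)[:3]:
        # Each Instance pairs a tokenized sentence with its sentiment label.
        print(instance)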
Contents

preface · acknowledgments · about this book (Who should read this book · How this book is organized: A roadmap · About the code · liveBook discussion forum · Other online resources) · about the author · about the cover illustration

Part 1 Basics

1 Introduction to natural language processing
  1.1 What is natural language processing (NLP)?
      1.1.1 What is NLP? · 1.1.2 What is not NLP? · 1.1.3 AI, ML, DL, and NLP · 1.1.4 Why NLP?
  1.2 How NLP is used
      1.2.1 NLP applications · 1.2.2 NLP tasks
  1.3 Building NLP applications
      1.3.1 Development of NLP applications · 1.3.2 Structure of NLP applications
  Summary

2 Your first NLP application
  2.1 Introducing sentiment analysis
  2.2 Working with NLP datasets
      2.2.1 What is a dataset? · 2.2.2 Stanford Sentiment Treebank · 2.2.3 Train, validation, and test sets · 2.2.4 Loading SST datasets using AllenNLP
  2.3 Using word embeddings
      2.3.1 What are word embeddings? · 2.3.2 Using word embeddings for sentiment analysis
  2.4 Neural networks
      2.4.1 What are neural networks? · 2.4.2 Recurrent neural networks (RNNs) and linear layers · 2.4.3 Architecture for sentiment analysis
  2.5 Loss functions and optimization
  2.6 Training your own classifier
      2.6.1 Batching · 2.6.2 Putting everything together
  2.7 Evaluating your classifier
  2.8 Deploying your application
      2.8.1 Making predictions · 2.8.2 Serving predictions
  Summary

3 Word and document embeddings
  3.1 Introducing embeddings
      3.1.1 What are embeddings? · 3.1.2 Why are embeddings important?
  3.2 Building blocks of language: Characters, words, and phrases
      3.2.1 Characters · 3.2.2 Words, tokens, morphemes, and phrases · 3.2.3 N-grams
  3.3 Tokenization, stemming, and lemmatization
      3.3.1 Tokenization · 3.3.2 Stemming · 3.3.3 Lemmatization
  3.4 Skip-gram and continuous bag of words (CBOW)
      3.4.1 Where word embeddings come from · 3.4.2 Using word associations · 3.4.3 Linear layers · 3.4.4 Softmax · 3.4.5 Implementing Skip-gram on AllenNLP · 3.4.6 Continuous bag of words (CBOW) model
  3.5 GloVe
      3.5.1 How GloVe learns word embeddings · 3.5.2 Using pretrained GloVe vectors
  3.6 fastText
      3.6.1 Making use of subword information · 3.6.2 Using the fastText toolkit
  3.7 Document-level embeddings
  3.8 Visualizing embeddings
  Summary

4 Sentence classification
  4.1 Recurrent neural networks (RNNs)
      4.1.1 Handling variable-length input · 4.1.2 RNN abstraction · 4.1.3 Simple RNNs and nonlinearity
  4.2 Long short-term memory units (LSTMs) and gated recurrent units (GRUs)
      4.2.1 Vanishing gradients problem · 4.2.2 Long short-term memory (LSTM) · 4.2.3 Gated recurrent units (GRUs)
  4.3 Accuracy, precision, recall, and F-measure
      4.3.1 Accuracy · 4.3.2 Precision and recall · 4.3.3 F-measure
  4.4 Building AllenNLP training pipelines
      4.4.1 Instances and fields · 4.4.2 Vocabulary and token indexers · 4.4.3 Token embedders and RNNs · 4.4.4 Building your own model · 4.4.5 Putting it all together
  4.5 Configuring AllenNLP training pipelines
  4.6 Case study: Language detection
      4.6.1 Using characters as input · 4.6.2 Creating a dataset reader · 4.6.3 Building the training pipeline · 4.6.4 Running the detector on unseen instances
  Summary

5 Sequential labeling and language modeling
  5.1 Introducing sequential labeling
      5.1.1 What is sequential labeling? · 5.1.2 Using RNNs to encode sequences · 5.1.3 Implementing a Seq2Seq encoder in AllenNLP
  5.2 Building a part-of-speech tagger
      5.2.1 Reading a dataset · 5.2.2 Defining the model and the loss · 5.2.3 Building the training pipeline
  5.3 Multilayer and bidirectional RNNs
      5.3.1 Multilayer RNNs · 5.3.2 Bidirectional RNNs
  5.4 Named entity recognition
      5.4.1 What is named entity recognition? · 5.4.2 Tagging spans · 5.4.3 Implementing a named entity recognizer
  5.5 Modeling a language
      5.5.1 What is a language model? · 5.5.2 Why are language models useful? · 5.5.3 Training an RNN language model
  5.6 Text generation using RNNs
      5.6.1 Feeding characters to an RNN · 5.6.2 Evaluating text using a language model · 5.6.3 Generating text using a language model
  Summary

Part 2 Advanced models

6 Sequence-to-sequence models
  6.1 Introducing sequence-to-sequence models
  6.2 Machine translation
  6.3 Building your first translator
      6.3.1 Preparing the datasets · 6.3.2 Training the model · 6.3.3 Running the translator
  6.4 How Seq2Seq models work
      6.4.1 Encoder · 6.4.2 Decoder · 6.4.3 Greedy decoding · 6.4.4 Beam search decoding
  6.5 Evaluating translation systems
      6.5.1 Human evaluation · 6.5.2 Automatic evaluation
  6.6 Case study: Building a chatbot
      6.6.1 Introducing dialogue systems · 6.6.2 Preparing a dataset · 6.6.3 Training and running a chatbot · 6.6.4 Next steps
  Summary

7 Convolutional neural networks
  7.1 Introducing convolutional neural networks (CNNs)
      7.1.1 RNNs and their shortcomings · 7.1.2 Pattern matching for sentence classification · 7.1.3 Convolutional neural networks (CNNs)
  7.2 Convolutional layers
      7.2.1 Pattern matching using filters · 7.2.2 Rectified linear unit (ReLU) · 7.2.3 Combining scores
  7.3 Pooling layers
  7.4 Case study: Text classification
      7.4.1 Review: Text classification · 7.4.2 Using CnnEncoder · 7.4.3 Training and running the classifier
  Summary

8 Attention and Transformer
  8.1 What is attention?
      8.1.1 Limitation of vanilla Seq2Seq models · 8.1.2 Attention mechanism
  8.2 Sequence-to-sequence with attention
      8.2.1 Encoder-decoder attention · 8.2.2 Building a Seq2Seq machine translation with attention
  8.3 Transformer and self-attention
      8.3.1 Self-attention · 8.3.2 Transformer · 8.3.3 Experiments
  8.4 Transformer-based language models
      8.4.1 Transformer as a language model · 8.4.2 Transformer-XL · 8.4.3 GPT-2 · 8.4.4 XLM
  8.5 Case study: Spell-checker
      8.5.1 Spell correction as machine translation · 8.5.2 Training a spell-checker · 8.5.3 Improving a spell-checker
  Summary

9 Transfer learning with pretrained language models
  9.1 Transfer learning
      9.1.1 Traditional machine learning · 9.1.2 Word embeddings · 9.1.3 What is transfer learning?
  9.2 BERT
      9.2.1 Limitations of word embeddings · 9.2.2 Self-supervised learning · 9.2.3 Pretraining BERT · 9.2.4 Adapting BERT
  9.3 Case study 1: Sentiment analysis with BERT
      9.3.1 Tokenizing input · 9.3.2 Building the model · 9.3.3 Training the model
  9.4 Other pretrained language models
      9.4.1 ELMo · 9.4.2 XLNet · 9.4.3 RoBERTa · 9.4.4 DistilBERT · 9.4.5 ALBERT
  9.5 Case study 2: Natural language inference with BERT
      9.5.1 What is natural language inference? · 9.5.2 Using BERT for sentence-pair classification · 9.5.3 Using Transformers with AllenNLP
  Summary

Part 3 Putting into production

10 Best practices in developing NLP applications
  10.1 Batching instances
      10.1.1 Padding · 10.1.2 Sorting · 10.1.3 Masking
  10.2 Tokenization for neural models
      10.2.1 Unknown words · 10.2.2 Character models · 10.2.3 Subword models
  10.3 Avoiding overfitting
      10.3.1 Regularization · 10.3.2 Early stopping · 10.3.3 Cross-validation
  10.4 Dealing with imbalanced datasets
      10.4.1 Using appropriate evaluation metrics · 10.4.2 Upsampling and downsampling · 10.4.3 Weighting losses
  10.5 Hyperparameter tuning
      10.5.1 Examples of hyperparameters · 10.5.2 Grid search vs. random search · 10.5.3 Hyperparameter tuning with Optuna
  Summary

11 Deploying and serving NLP applications
  11.1 Architecting your NLP application
      11.1.1 Before machine learning · 11.1.2 Choosing the right architecture · 11.1.3 Project structure · 11.1.4 Version control
  11.2 Deploying your NLP model
      11.2.1 Testing · 11.2.2 Train-serve skew · 11.2.3 Monitoring · 11.2.4 Using GPUs
  11.3 Case study: Serving and deploying NLP applications
      11.3.1 Serving models with TorchServe · 11.3.2 Deploying models with SageMaker
  11.4 Interpreting and visualizing model predictions
  11.5 Where to go from here
  Summary

index