Mastering spaCy: An end-to-end practical guide to implementing NLP applications using the Python ecosystem
- Length: 356 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2021-07-09
- ISBN-10: 1800563353
- ISBN-13: 9781800563353
- Sales Rank: #595930
Build end-to-end industrial-strength NLP models using advanced morphological and syntactic features in spaCy to create real-world applications with ease
Key Features
- Gain an overview of what spaCy offers for natural language processing
- Learn details of spaCy’s features and how to use them effectively
- Work through practical recipes using spaCy
Book Description
spaCy is an industrial-grade, efficient NLP Python library. It offers various pre-trained models and ready-to-use features. Mastering spaCy provides you with end-to-end coverage of spaCy’s features and real-world applications.
You’ll begin by installing spaCy and downloading models, before progressing to spaCy’s features and prototyping real-world NLP apps. Next, you’ll get familiar with displaCy, spaCy’s visualizer. The book also equips you with practical illustrations for pattern matching and helps you advance into the world of semantics with word vectors. Statistical information extraction methods are also explained in detail. Later, you’ll work through an interactive business case study that shows you how to combine all of spaCy’s features to create a real-world NLP pipeline. You’ll implement ML models for tasks such as sentiment analysis, intent recognition, and context resolution. The book further focuses on classification with popular frameworks such as TensorFlow’s Keras API together with spaCy. You’ll cover intent classification and sentiment analysis, apply them to popular datasets, and interpret the classification results.
By the end of this book, you’ll be able to confidently use spaCy, including its linguistic features, word vectors, and classifiers, to create your own NLP apps.
What you will learn
- Install spaCy, get started easily, and write your first Python script
- Understand core linguistic operations of spaCy
- Discover how to combine rule-based components with spaCy statistical models
- Become well-versed with named entity and keyword extraction
- Build your own ML pipelines using spaCy
- Apply all the knowledge you’ve gained to design a chatbot using spaCy
Who this book is for
This book is for data scientists and machine learners who want to excel in NLP as well as NLP developers who want to master spaCy and build applications with it. Language and speech professionals who want to get hands-on with Python and spaCy and software developers who want to quickly prototype applications with spaCy will also find this book helpful. Beginner-level knowledge of the Python programming language is required to get the most out of this book. A beginner-level understanding of linguistics such as parsing, POS tags, and semantic similarity will also be useful.
Table of Contents
- Getting Started with spaCy
- Core Operations with spaCy
- Linguistic Features
- Rule-Based Matching
- Working with Word Vectors and Semantic Similarity
- Putting Everything Together: Semantic Parsing with spaCy
- Customizing spaCy Models
- Text Classification with spaCy
- spaCy and Transformers
- Putting Everything Together: Designing Your Chatbot with spaCy
Mastering spaCy
Contributors
- About the author
- About the reviewers
Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
Section 1: Getting Started with spaCy
Chapter 1: Getting Started with spaCy
- Technical requirements
- Overview of spaCy
- Rise of NLP
- NLP with Python
- Reviewing some useful string operations
- Getting a high-level overview of the spaCy library
- Tips for the reader
- Installing spaCy
- Installing spaCy with pip
- Installing spaCy with conda
- Installing spaCy on macOS/OS X
- Installing spaCy on Windows
- Troubleshooting while installing spaCy
- Installing spaCy's statistical models
- Installing language models
- Visualization with displaCy
- Getting started with displaCy
- Entity visualizer
- Visualizing within Python
- Using displaCy in Jupyter notebooks
- Exporting displaCy graphics as an image file
- Summary
Chapter 2: Core Operations with spaCy
- Technical requirements
- Overview of spaCy conventions
- Introducing tokenization
- Customizing the tokenizer
- Debugging the tokenizer
- Sentence segmentation
- Understanding lemmatization
- Lemmatization in NLU
- Understanding the difference between lemmatization and stemming
- spaCy container objects
- Doc
- Token
- Span
- More spaCy features
- Summary
Section 2: spaCy Features
Chapter 3: Linguistic Features
- Technical requirements
- What is POS tagging?
- WSD
- Verb tense and aspect in NLU applications
- Understanding number, symbol, and punctuation tags
- Introduction to dependency parsing
- What is dependency parsing?
- Dependency relations
- Syntactic relations
- Introducing NER
- A real-world example
- Merging and splitting tokens
- Summary
Chapter 4: Rule-Based Matching
- Token-based matching
- Extended syntax support
- Regex-like operators
- Regex support
- Matcher online demo
- PhraseMatcher
- EntityRuler
- Combining spaCy models and matchers
- Extracting IBAN and account numbers
- Extracting phone numbers
- Extracting mentions
- Hashtag and emoji extraction
- Expanding named entities
- Combining linguistic features and named entities
- Summary
Chapter 5: Working with Word Vectors and Semantic Similarity
- Technical requirements
- Understanding word vectors
- One-hot encoding
- Word vectors
- Analogies and vector operations
- How word vectors are produced
- Using spaCy's pretrained vectors
- The similarity method
- Using third-party word vectors
- Advanced semantic similarity methods
- Understanding semantic similarity
- Categorizing text with semantic similarity
- Extracting key phrases
- Extracting and comparing named entities
- Summary
Chapter 6: Putting Everything Together: Semantic Parsing with spaCy
- Technical requirements
- Extracting named entities
- Getting to know the ATIS dataset
- Extracting named entities with Matcher
- Using dependency trees for extracting entities
- Using dependency relations for intent recognition
- Linguistic primer
- Extracting transitive verbs and their direct objects
- Extracting multiple intents with conjunction relation
- Recognizing the intent using wordlists
- Semantic similarity methods for semantic parsing
- Using synonyms lists for semantic similarity
- Using word vectors to recognize semantic similarity
- Putting it all together
- Summary
Section 3: Machine Learning with spaCy
Chapter 7: Customizing spaCy Models
- Technical requirements
- Getting started with data preparation
- Do spaCy models perform well enough on your data?
- Does your domain include many labels that are absent in spaCy models?
- Annotating and preparing data
- Annotating data with Prodigy
- Annotating data with Brat
- spaCy training data format
- Updating an existing pipeline component
- Disabling the other statistical models
- Model training procedure
- Evaluating the updated NER
- Saving and loading custom models
- Training a pipeline component from scratch
- Working with a real-world dataset
- Summary
Chapter 8: Text Classification with spaCy
- Technical requirements
- Understanding the basics of text classification
- Training the spaCy text classifier
- Getting to know TextCategorizer class
- Formatting training data for the TextCategorizer
- Defining the training loop
- Testing the new component
- Training TextCategorizer for multilabel classification
- Sentiment analysis with spaCy
- Exploring the dataset
- Training the TextClassifier component
- Text classification with spaCy and Keras
- What is a layer?
- Sequential modeling with LSTMs
- Keras Tokenizer
- Embedding words
- Neural network architecture for text classification
- Summary
- References
Chapter 9: spaCy and Transformers
- Technical requirements
- Transformers and transfer learning
- Understanding BERT
- BERT architecture
- BERT input format
- How is BERT trained?
- Transformers and TensorFlow
- HuggingFace Transformers
- Using the BERT tokenizer
- Obtaining BERT word vectors
- Using BERT for text classification
- Using Transformer pipelines
- Transformers and spaCy
- Summary
Chapter 10: Putting Everything Together: Designing Your Chatbot with spaCy
- Technical requirements
- Introduction to conversational AI
- NLP components of conversational AI products
- Getting to know the dataset
- Entity extraction
- Extracting city entities
- Extracting date and time entities
- Extracting phone numbers
- Extracting cuisine types
- Intent recognition
- Pattern-based text classification
- Classifying text with a character-level LSTM
- Differentiating subjects from objects
- Parsing the sentence type
- Anaphora resolution
- Summary
- References
Why subscribe?
Other Books You May Enjoy
- Packt is searching for authors like you
- Leave a review - let other readers know what you think
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the "Find a repository…" box, search for the book title: Mastering spaCy: An end-to-end practical guide to implementing NLP applications using the Python ecosystem. If the full title returns no results, search for the main title only (Mastering spaCy).
3. Click the book title in the search results.
4. Click Code to download.