Natural Language Processing with AWS AI Services: Derive strategic insights from unstructured data with Amazon Textract and Amazon Comprehend
- Length: 508 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2021-11-26
- ISBN-10: 1801812535
- ISBN-13: 9781801812535
- Sales Rank: #1531481
Work through interesting real-life business use cases to uncover valuable insights from unstructured text using AWS AI services
Key Features
- Get to grips with AWS AI services for NLP and find out how to use them to gain strategic insights
- Run Python code to use Amazon Textract and Amazon Comprehend to accelerate business outcomes
- Understand how you can integrate human-in-the-loop for custom NLP use cases with Amazon A2I
Book Description
Natural language processing (NLP) uses machine learning to extract information from unstructured data. This book will help you to move quickly from business questions to high-performance models in production.
To start with, you’ll understand the importance of NLP in today’s business applications and learn the features of Amazon Comprehend and Amazon Textract to build NLP models using Python and Jupyter Notebooks. The book then shows you how to integrate AI in applications for accelerating business outcomes with just a few lines of code. Throughout the book, you’ll cover use cases such as smart text search, setting up compliance and controls when processing confidential documents, real-time text analytics, and much more to understand various NLP scenarios. You’ll deploy and monitor scalable NLP models in production for real-time and batch requirements. As you advance, you’ll explore strategies for including humans in the loop for different purposes in a document processing workflow. Moreover, you’ll learn best practices for auto-scaling your NLP inference for enterprise traffic.
Whether you’re new to ML or an experienced practitioner, by the end of this NLP book, you’ll have the confidence to use AWS AI services to build powerful NLP applications.
What you will learn
- Automate various NLP workflows on AWS to accelerate business outcomes
- Use Amazon Textract for text, tables, and handwriting recognition from images and PDF files
- Gain insights from unstructured text in the form of sentiment analysis, topic modeling, and more using Amazon Comprehend
- Set up end-to-end document processing pipelines to understand the role of humans in the loop
- Develop NLP-based intelligent search solutions with just a few lines of code
- Create both real-time and batch document processing pipelines using Python
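As a rough illustration of what "a few lines of code" can look like for the Textract and Comprehend items above, here is a minimal Python sketch. It is not taken from the book. It assumes boto3 is installed and AWS credentials are configured; the function names, the default region, and the 5,000-byte truncation are illustrative assumptions, while `detect_document_text` and `detect_sentiment` are the standard boto3 client calls.

```python
from typing import Any, Dict, List, Optional


def extract_lines(textract_response: Dict[str, Any]) -> List[str]:
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


def analyze_document(bucket: str, key: str, region: str = "us-east-1") -> Dict[str, Any]:
    """OCR an image stored in S3, then score the sentiment of its text.

    bucket, key, and region are hypothetical placeholders supplied by the caller.
    """
    import boto3  # imported lazily so the helpers above work without the SDK

    textract = boto3.client("textract", region_name=region)
    comprehend = boto3.client("comprehend", region_name=region)

    # Synchronous OCR call on a single-page image stored in S3.
    ocr = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = extract_lines(ocr)

    # The synchronous sentiment API caps input size; truncating to 5,000 bytes
    # is a conservative assumption made for this sketch.
    text = truncate_utf8("\n".join(lines), 5000)

    sentiment: Optional[str] = None
    if text:  # the API rejects empty input
        result = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        sentiment = result["Sentiment"]

    return {"lines": lines, "sentiment": sentiment}
```

Calling `analyze_document("my-bucket", "scans/review.png")` (both names hypothetical) would return the extracted lines together with a sentiment label. Keeping `extract_lines` separate from the AWS calls makes the parsing step testable against a saved response without network access.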
Who this book is for
If you’re an NLP developer or data scientist looking to get started with AWS AI services to implement various NLP scenarios quickly, this book is for you. It will show you how easy it is to integrate AI in applications with just a few lines of code. A basic understanding of machine learning (ML) concepts is necessary to understand the concepts covered. Experience with Jupyter notebooks and Python will be helpful.
Table of Contents
- NLP in the Business Context and Introduction to AWS AI Services
- Introducing Amazon Textract
- Introducing Amazon Comprehend
- Automating Document Processing Workflows
- Creating NLP Search
- Using NLP to Improve Customer Service Efficiency
- Understanding the Voice of Your Customer Analytics
- Leveraging NLP to Monetize Your Media Content
- Extracting Metadata from Financial Documents
- Reducing Localization Costs with Machine Translation
- Using Chatbots for Querying Documents
- AI and NLP in Healthcare
- Improving the Accuracy of Document Processing Workflows
- Auditing Named Entity Recognition Workflows
- Classifying Documents and Setting up Human in the Loop for Active Learning
- Improving the Accuracy of PDF Batch Processing
- Visualizing Insights from Handwritten Content
- Building Secure, Reliable, and Efficient NLP Solutions
- Front matter
  Acknowledgments; Foreword; Contributors; About the authors; About the reviewers
- Preface
  Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Download the color images; Code in Action; Conventions used; Get in touch; Share Your Thoughts
Section 1: Introduction to AWS AI NLP Services
- Chapter 1: NLP in the Business Context and Introduction to AWS AI Services
  Introducing NLP; Overcoming the challenges in building NLP solutions; Understanding why NLP is becoming mainstream; Introducing the AWS ML stack; Summary; Further reading
- Chapter 2: Introducing Amazon Textract
  Technical requirements; Setting up your AWS environment; Signing up for an AWS account; Creating an Amazon S3 bucket and a folder and uploading objects; Creating an Amazon SageMaker Jupyter notebook instance; Changing IAM permissions and trust relationships for the Amazon SageMaker notebook execution role; Overcoming challenges with document processing; Understanding how Amazon Textract can help; Presenting Amazon Textract's product features; Uploading sample document(s); Raw text or text extraction; Form data and key/value pairs; Table extraction; Multiple language support; Handwriting detection; Human in the loop; Using Amazon Textract with your applications; Textract APIs; Textract API demo with a Jupyter notebook; Building applications using Amazon Textract APIs; Summary
- Chapter 3: Introducing Amazon Comprehend
  Technical requirements; Understanding Amazon Comprehend and Amazon Comprehend Medical; Challenges associated with setting up ML preprocessing for NLP; Exploring the benefits of Amazon Comprehend and Comprehend Medical; Detecting insights in text using Comprehend and Comprehend Medical without preprocessing; Using these services to gain insights from OCR documents from Amazon Textract; Exploring Amazon Comprehend and Amazon Comprehend Medical product features; Discovering Amazon Comprehend; Deriving diagnoses from a doctor-patient transcript with Comprehend Medical; Using Amazon Comprehend with your applications; Architecting applications with Amazon API Gateway, AWS Lambda, and Comprehend; Summary
Section 2: Using NLP to Accelerate Business Outcomes
- Chapter 4: Automating Document Processing Workflows
  Technical requirements; Automating document processing workflows; Setting up compliance and control; Setting up to solve the use case; Additional IAM prerequisites; Automating documents for control and compliance; Processing real-time document workflows versus batch document workflows; Summary; Further reading
- Chapter 5: Creating NLP Search
  Technical requirements; Creating NLP-powered smart search indexes; Building a search solution for scanned images using Amazon Elasticsearch; Prerequisites; Uploading documents to Amazon S3; Inspecting the AWS Lambda function; Searching for and discovering data in the Kibana console; Setting up an enterprise search solution using Amazon Kendra; Walking through the solution; Searching in Amazon Kendra with enriched filters from Comprehend; Summary; Further reading
- Chapter 6: Using NLP to Improve Customer Service Efficiency
  Technical requirements; Introducing the customer service use case; Building an NLP solution to improve customer service; Setting up to solve the use case; Additional IAM prerequisites; Preprocessing the customer service history data; Summary; Further reading
- Chapter 7: Understanding the Voice of Your Customer Analytics
  Technical requirements; Challenges of setting up a text analytics solution; Setting up a Yelp review text analytics workflow; Setting up to solve the use case; Walking through the solution using Jupyter Notebook; Summary; Further reading
- Chapter 8: Leveraging NLP to Monetize Your Media Content
  Technical requirements; Introducing the content monetization use case; Building the NLP solution for content monetization; Setting up to solve the use case; Additional IAM prerequisites; Uploading the sample video and converting it for broadcast; Running transcription, finding topics, and creating a VAST ad tag URL; Inserting ads and testing our video; Summary; Further reading
- Chapter 9: Extracting Metadata from Financial Documents
  Technical requirements; Extracting metadata from financial documents; Setting up the use case; Setting up the notebook code and S3 Bucket creation; Analyzing the output of Comprehend Events; Summary; Further reading
- Chapter 10: Reducing Localization Costs with Machine Translation
  Technical requirements; Introducing the localization use case; Building a multi-language web page using machine translation; Setting up to solve the use case; Running the notebook; Summary; Further reading
- Chapter 11: Using Chatbots for Querying Documents
  Technical requirements; Introducing the chatbot use case; Creating an Amazon Kendra index with Amazon S3 as a data source; Building an Amazon Lex chatbot; Deploying the solution with AWS CloudFormation; Summary; Further reading
- Chapter 12: AI and NLP in Healthcare
  Technical requirements; Introducing the automated claims processing use case; Understanding how to extract and validate data from medical intake forms; Understanding clinical data with Amazon Comprehend Medical; Understanding invalid medical form processing with notifications; Understanding how to create a serverless pipeline for medical claims; Summary; Further reading
Section 3: Improving NLP Models in Production
- Chapter 13: Improving the Accuracy of Document Processing Workflows
  Technical requirements; The need for setting up HITL processes with document processing; Seeing the benefits of using Amazon A2I for HITL workflows; Adding human reviews to your document processing pipelines; Creating an Amazon S3 bucket; Creating a private work team in the AWS Console; Creating a human review workflow in the AWS Console; Sending the document to Amazon Textract and Amazon A2I by calling the Amazon Textract API; Summary; Further reading
- Chapter 14: Auditing Named Entity Recognition Workflows
  Technical requirements; Authenticating loan applications; Building the loan authentication solution; Setting up to solve the use case; Additional IAM pre-requisites; Training an Amazon Comprehend custom entity recognizer; Creating a private team for the human loop; Extracting sample document contents using Amazon Textract; Detecting entities using the Amazon Comprehend custom entity recognizer; Setting up an Amazon A2I human workflow loop; Reviewing and modifying detected entities; Retraining Comprehend custom entity recognizer; Storing decisions for downstream processing; Summary; Further reading
- Chapter 15: Classifying Documents and Setting up Human in the Loop for Active Learning
  Technical requirements; Using Comprehend custom classification with human in the loop for active learning; Building the document classification workflow; Setting up to solve the use case; Creating an Amazon Comprehend classification training job; Creating Amazon Comprehend real-time endpoints and testing a sample document; Setting up active learning with a Comprehend real-time endpoint using human in the loop; Summary; Further reading
- Chapter 16: Improving the Accuracy of PDF Batch Processing
  Technical requirements; Introducing the PDF batch processing use case; Building the solution; Setting up for the solution build; Additional IAM prerequisites; Creating a private team for the human loop; Creating an Amazon S3 bucket; Extracting the registration document's contents using Amazon Textract; Setting up an Amazon A2I human workflow loop; Storing results for downstream processing; Summary; Further reading
- Chapter 17: Visualizing Insights from Handwritten Content
  Technical requirements; Extracting text from handwritten images; Creating the SageMaker Jupyter notebook; Additional IAM prerequisites; Creating an Amazon S3 bucket; Extracting text using Amazon Textract; Visualizing insights using Amazon QuickSight; Summary
- Chapter 18: Building Secure, Reliable, and Efficient NLP Solutions
  Technical requirements; Defining best practices for NLP solutions; Applying best practices for optimization; Using an AWS S3 data lake; Using AWS Glue for data processing and transformation tasks; Using Amazon SageMaker Ground Truth for annotations; Using Amazon Comprehend with PDF and Word formats directly; Enforcing least privilege access; Obfuscating sensitive data; Protecting data at rest and in transit; Using Amazon API Gateway for request throttling; Setting up auto scaling for Amazon Comprehend endpoints; Automating monitoring of custom training metrics; Using Amazon A2I to review predictions; Using Async APIs for loose coupling; Using Amazon Textract Response Parser; Persisting prediction results; Using AWS Step Function for orchestration; Using AWS CloudFormation templates; Summary; Further reading
- Back matter
  Why subscribe?; Other Books You May Enjoy; Packt is searching for authors like you; Share Your Thoughts
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the "Find a repository…" box, search for the book title: Natural Language Processing with AWS AI Services: Derive strategic insights from unstructured data with Amazon Textract and Amazon Comprehend. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code to download.