Human-in-the-Loop Machine Learning: Active learning and annotation for human-centered AI
- Length: 424 pages
- Edition: 1
- Language: English
- Publisher: Manning Publications
- Publication Date: 2021-07-20
- ISBN-10: 1617296740
- ISBN-13: 9781617296741
Summary
Most machine learning systems deployed in the world today learn from human feedback. However, most machine learning courses focus almost exclusively on the algorithms, not on the human-computer interaction side of these systems. This leaves a large knowledge gap for data scientists working on real-world machine learning, who typically spend more time managing data than building algorithms. Human-in-the-Loop Machine Learning is a practical guide to optimizing the entire machine learning process, including techniques for annotation, active learning, transfer learning, and using machine learning itself to optimize every step of the process.
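The core workflow the book describes, a model that samples the items it is least sure about and routes them to a person for labeling, can be sketched in a few lines. This is an illustrative sketch, not code from the book; `human_label` and `uncertainty_sample` are hypothetical stand-ins for a real annotation interface and a real model-based sampling strategy:

```python
import random

def human_label(item):
    # Stand-in for a real annotation interface; here, a trivial rule.
    return "positive" if "good" in item else "negative"

def uncertainty_sample(unlabeled, k):
    # Stand-in for model-based uncertainty sampling; random choice as a placeholder.
    return random.sample(sorted(unlabeled), min(k, len(unlabeled)))

def active_learning_loop(pool, batch_size=2, rounds=3):
    labeled = {}
    unlabeled = set(pool)
    for _ in range(rounds):
        if not unlabeled:
            break
        batch = uncertainty_sample(unlabeled, batch_size)
        for item in batch:
            labeled[item] = human_label(item)  # the human in the loop
            unlabeled.discard(item)
        # In a real system, retrain the model on `labeled` here before the next round.
    return labeled

pool = ["good movie", "bad movie", "good food", "bad food", "good book", "bad book"]
labels = active_learning_loop(pool)
```

In a real deployment, each round would retrain the model and re-score the remaining pool, so the sampled batch changes as the model improves.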
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Machine learning applications perform better with human feedback. Keeping the right people in the loop improves the accuracy of models, reduces errors in data, lowers costs, and helps you ship models faster.
About the book
Human-in-the-Loop Machine Learning lays out methods for humans and machines to work together effectively. You’ll find best practices on selecting sample data for human feedback, quality control for human annotations, and designing annotation interfaces. You’ll learn to create training data for labeling, object detection, semantic segmentation, sequence labeling, and more. The book starts with the basics and progresses to advanced techniques like transfer learning and self-supervision within annotation workflows.
What’s inside
Identifying the right training and evaluation data
Finding and managing people to annotate data
Selecting annotation quality control strategies
Designing interfaces to improve accuracy and efficiency
About the author
Robert (Munro) Monarch is a data scientist and engineer who has built machine learning data for companies such as Apple, Amazon, Google, and IBM. He holds a PhD from Stanford focused on human-in-the-loop machine learning for healthcare and disaster response, and he is a disaster response professional as well as a machine learning professional. A worked example throughout this book is classifying disaster-related messages from real disasters that Robert has helped respond to.
Table of Contents
PART 1 – FIRST STEPS
1 Introduction to human-in-the-loop machine learning
2 Getting started with human-in-the-loop machine learning
PART 2 – ACTIVE LEARNING
3 Uncertainty sampling
4 Diversity sampling
5 Advanced active learning
6 Applying active learning to different machine learning tasks
PART 3 – ANNOTATION
7 Working with the people annotating your data
8 Quality control for data annotation
9 Advanced data annotation and augmentation
10 Annotation quality for different machine learning tasks
PART 4 – HUMAN–COMPUTER INTERACTION FOR MACHINE LEARNING
11 Interfaces for data annotation
12 Human-in-the-loop machine learning products
Detailed contents
Front matter: foreword, preface, acknowledgments, about this book, about the author

Part 1 First steps
1 Introduction to human-in-the-loop machine learning
1.1 The basic principles of human-in-the-loop machine learning
1.2 Introducing annotation
  1.2.1 Simple and more complicated annotation strategies
  1.2.2 Plugging the gap in data science knowledge
  1.2.3 Quality human annotation: Why is it hard?
1.3 Introducing active learning: Improving the speed and reducing the cost of training data
  1.3.1 Three broad active learning sampling strategies: Uncertainty, diversity, and random
  1.3.2 What is a random selection of evaluation data?
  1.3.3 When to use active learning
1.4 Machine learning and human–computer interaction
  1.4.1 User interfaces: How do you create training data?
  1.4.2 Priming: What can influence human perception?
  1.4.3 The pros and cons of creating labels by evaluating machine learning predictions
  1.4.4 Basic principles for designing annotation interfaces
1.5 Machine-learning-assisted humans vs. human-assisted machine learning
1.6 Transfer learning to kick-start your models
  1.6.1 Transfer learning in computer vision
  1.6.2 Transfer learning in NLP
1.7 What to expect in this text
Summary

2 Getting started with human-in-the-loop machine learning
2.1 Beyond hacktive learning: Your first active learning algorithm
2.2 The architecture of your first system
2.3 Interpreting model predictions and data to support active learning
  2.3.1 Confidence ranking
  2.3.2 Identifying outliers
  2.3.3 What to expect as you iterate
2.4 Building an interface to get human labels
  2.4.1 A simple interface for labeling text
  2.4.2 Managing machine learning data
2.5 Deploying your first human-in-the-loop machine learning system
  2.5.1 Always get your evaluation data first
  2.5.2 Every data point gets a chance
  2.5.3 Select the right strategies for your data
  2.5.4 Retrain the model and iterate
Summary

Part 2 Active learning
3 Uncertainty sampling
3.1 Interpreting uncertainty in a machine learning model
  3.1.1 Why look for uncertainty in your model?
  3.1.2 Softmax and probability distributions
  3.1.3 Interpreting the success of active learning
3.2 Algorithms for uncertainty sampling
  3.2.1 Least confidence sampling
  3.2.2 Margin of confidence sampling
  3.2.3 Ratio sampling
  3.2.4 Entropy (classification entropy)
  3.2.5 A deep dive on entropy
3.3 Identifying when different types of models are confused
  3.3.1 Uncertainty sampling with logistic regression and MaxEnt models
  3.3.2 Uncertainty sampling with SVMs
  3.3.3 Uncertainty sampling with Bayesian models
  3.3.4 Uncertainty sampling with decision trees and random forests
3.4 Measuring uncertainty across multiple predictions
  3.4.1 Uncertainty sampling with ensemble models
  3.4.2 Query by Committee and dropouts
  3.4.3 The difference between aleatoric and epistemic uncertainty
  3.4.4 Multilabeled and continuous value classification
3.5 Selecting the right number of items for human review
  3.5.1 Budget-constrained uncertainty sampling
  3.5.2 Time-constrained uncertainty sampling
  3.5.3 When do I stop if I’m not time- or budget-constrained?
3.6 Evaluating the success of active learning
  3.6.1 Do I need new test data?
  3.6.2 Do I need new validation data?
3.7 Uncertainty sampling cheat sheet
3.8 Further reading
  3.8.1 Further reading for least confidence sampling
  3.8.2 Further reading for margin of confidence sampling
  3.8.3 Further reading for ratio of confidence sampling
  3.8.4 Further reading for entropy-based sampling
  3.8.5 Further reading for other machine learning models
  3.8.6 Further reading for ensemble-based uncertainty sampling
Summary

4 Diversity sampling
4.1 Knowing what you don’t know: Identifying gaps in your model’s knowledge
  4.1.1 Example data for diversity sampling
  4.1.2 Interpreting neural models for diversity sampling
  4.1.3 Getting information from hidden layers in PyTorch
4.2 Model-based outlier sampling
  4.2.1 Use validation data to rank activations
  4.2.2 Which layers should I use to calculate model-based outliers?
  4.2.3 The limitations of model-based outliers
4.3 Cluster-based sampling
  4.3.1 Cluster members, centroids, and outliers
  4.3.2 Any clustering algorithm in the universe
  4.3.3 K-means clustering with cosine similarity
  4.3.4 Reduced feature dimensions via embeddings or PCA
  4.3.5 Other clustering algorithms
4.4 Representative sampling
  4.4.1 Representative sampling is rarely used in isolation
  4.4.2 Simple representative sampling
  4.4.3 Adaptive representative sampling
4.5 Sampling for real-world diversity
  4.5.1 Common problems in training data diversity
  4.5.2 Stratified sampling to ensure diversity of demographics
  4.5.3 Represented and representative: Which matters?
  4.5.4 Per-demographic accuracy
  4.5.5 Limitations of sampling for real-world diversity
4.6 Diversity sampling with different types of models
  4.6.1 Model-based outliers with different types of models
  4.6.2 Clustering with different types of models
  4.6.3 Representative sampling with different types of models
  4.6.4 Sampling for real-world diversity with different types of models
4.7 Diversity sampling cheat sheet
4.8 Further reading
  4.8.1 Further reading for model-based outliers
  4.8.2 Further reading for cluster-based sampling
  4.8.3 Further reading for representative sampling
  4.8.4 Further reading for sampling for real-world diversity
Summary

5 Advanced active learning
5.1 Combining uncertainty sampling and diversity sampling
  5.1.1 Least confidence sampling with cluster-based sampling
  5.1.2 Uncertainty sampling with model-based outliers
  5.1.3 Uncertainty sampling with model-based outliers and clustering
  5.1.4 Representative sampling cluster-based sampling
  5.1.5 Sampling from the highest-entropy cluster
  5.1.6 Other combinations of active learning strategies
  5.1.7 Combining active learning scores
  5.1.8 Expected error reduction sampling
5.2 Active transfer learning for uncertainty sampling
  5.2.1 Making your model predict its own errors
  5.2.2 Implementing active transfer learning
  5.2.3 Active transfer learning with more layers
  5.2.4 The pros and cons of active transfer learning
5.3 Applying active transfer learning to representative sampling
  5.3.1 Making your model predict what it doesn’t know
  5.3.2 Active transfer learning for adaptive representative sampling
  5.3.3 The pros and cons of active transfer learning for representative sampling
5.4 Active transfer learning for adaptive sampling
  5.4.1 Making uncertainty sampling adaptive by predicting uncertainty
  5.4.2 The pros and cons of ATLAS
5.5 Advanced active learning cheat sheets
5.6 Further reading for active transfer learning
Summary

6 Applying active learning to different machine learning tasks
6.1 Applying active learning to object detection
  6.1.1 Accuracy for object detection: Label confidence and localization
  6.1.2 Uncertainty sampling for label confidence and localization in object detection
  6.1.3 Diversity sampling for label confidence and localization in object detection
  6.1.4 Active transfer learning for object detection
  6.1.5 Setting a low object detection threshold to avoid perpetuating bias
  6.1.6 Creating training data samples for representative sampling that are similar to your predictions
  6.1.7 Sampling for image-level diversity in object detection
  6.1.8 Considering tighter masks when using polygons
6.2 Applying active learning to semantic segmentation
  6.2.1 Accuracy for semantic segmentation
  6.2.2 Uncertainty sampling for semantic segmentation
  6.2.3 Diversity sampling for semantic segmentation
  6.2.4 Active transfer learning for semantic segmentation
  6.2.5 Sampling for image-level diversity in semantic segmentation
6.3 Applying active learning to sequence labeling
  6.3.1 Accuracy for sequence labeling
  6.3.2 Uncertainty sampling for sequence labeling
  6.3.3 Diversity sampling for sequence labeling
  6.3.4 Active transfer learning for sequence labeling
  6.3.5 Stratified sampling by confidence and tokens
  6.3.6 Create training data samples for representative sampling that are similar to your predictions
  6.3.7 Full-sequence labeling
  6.3.8 Sampling for document-level diversity in sequence labeling
6.4 Applying active learning to language generation
  6.4.1 Calculating accuracy for language generation systems
  6.4.2 Uncertainty sampling for language generation
  6.4.3 Diversity sampling for language generation
  6.4.4 Active transfer learning for language generation
6.5 Applying active learning to other machine learning tasks
  6.5.1 Active learning for information retrieval
  6.5.2 Active learning for video
  6.5.3 Active learning for speech
6.6 Choosing the right number of items for human review
  6.6.1 Active labeling for fully or partially annotated data
  6.6.2 Combining machine learning with annotation
6.7 Further reading
Summary

Part 3 Annotation
7 Working with the people annotating your data
7.1 Introduction to annotation
  7.1.1 Three principles of good data annotation
  7.1.2 Annotating data and reviewing model predictions
  7.1.3 Annotations from machine learning-assisted humans
7.2 In-house experts
  7.2.1 Salary for in-house workers
  7.2.2 Security for in-house workers
  7.2.3 Ownership for in-house workers
  7.2.4 Tip: Always run in-house annotation sessions
7.3 Outsourced workers
  7.3.1 Salary for outsourced workers
  7.3.2 Security for outsourced workers
  7.3.3 Ownership for outsourced workers
  7.3.4 Tip: Talk to your outsourced workers
7.4 Crowdsourced workers
  7.4.1 Salary for crowdsourced workers
  7.4.2 Security for crowdsourced workers
  7.4.3 Ownership for crowdsourced workers
  7.4.4 Tip: Create a path to secure work and career advancement
7.5 Other workforces
  7.5.1 End users
  7.5.2 Volunteers
  7.5.3 People playing games
  7.5.4 Model predictions as annotations
7.6 Estimating the volume of annotation needed
  7.6.1 The orders-of-magnitude equation for number of annotations needed
  7.6.2 Anticipate one to four weeks of annotation training and task refinement
  7.6.3 Use your pilot annotations and accuracy goal to estimate cost
  7.6.4 Combining types of workforces
Summary

8 Quality control for data annotation
8.1 Comparing annotations with ground truth answers
  8.1.1 Annotator agreement with ground truth data
  8.1.2 Which baseline should you use for expected accuracy?
8.2 Interannotator agreement
  8.2.1 Introduction to interannotator agreement
  8.2.2 Benefits from calculating interannotator agreement
  8.2.3 Dataset-level agreement with Krippendorff’s alpha
  8.2.4 Calculating Krippendorff’s alpha beyond labeling
  8.2.5 Individual annotator agreement
  8.2.6 Per-label and per-demographic agreement
  8.2.7 Extending accuracy with agreement for real-world diversity
8.3 Aggregating multiple annotations to create training data
  8.3.1 Aggregating annotations when everyone agrees
  8.3.2 The mathematical case for diverse annotators and low agreement
  8.3.3 Aggregating annotations when annotators disagree
  8.3.4 Annotator-reported confidences
  8.3.5 Deciding which labels to trust: Annotation uncertainty
8.4 Quality control by expert review
  8.4.1 Recruiting and training qualified people
  8.4.2 Training people to become experts
  8.4.3 Machine-learning-assisted experts
8.5 Multistep workflows and review tasks
8.6 Further reading
Summary

9 Advanced data annotation and augmentation
9.1 Annotation quality for subjective tasks
  9.1.1 Requesting annotator expectations
  9.1.2 Assessing viable labels for subjective tasks
  9.1.3 Trusting an annotator to understand diverse responses
  9.1.4 Bayesian Truth Serum for subjective judgments
  9.1.5 Embedding simple tasks in more complicated ones
9.2 Machine learning for annotation quality control
  9.2.1 Calculating annotation confidence as an optimization task
  9.2.2 Converging on label confidence when annotators disagree
  9.2.3 Predicting whether a single annotation is correct
  9.2.4 Predicting whether a single annotation is in agreement
  9.2.5 Predicting whether an annotator is a bot
9.3 Model predictions as annotations
  9.3.1 Trusting annotations from confident model predictions
  9.3.2 Treating model predictions as a single annotator
  9.3.3 Cross-validating to find mislabeled data
9.4 Embeddings and contextual representations
  9.4.1 Transfer learning from an existing model
  9.4.2 Representations from adjacent easy-to-annotate tasks
  9.4.3 Self-supervision: Using inherent labels in the data
9.5 Search-based and rule-based systems
  9.5.1 Data filtering with rules
  9.5.2 Training data search
  9.5.3 Masked feature filtering
9.6 Light supervision on unsupervised models
  9.6.1 Adapting an unsupervised model to a supervised model
  9.6.2 Human-guided exploratory data analysis
9.7 Synthetic data, data creation, and data augmentation
  9.7.1 Synthetic data
  9.7.2 Data creation
  9.7.3 Data augmentation
9.8 Incorporating annotation information into machine learning models
  9.8.1 Filtering or weighting items by confidence in their labels
  9.8.2 Including the annotator identity in inputs
  9.8.3 Incorporating uncertainty into the loss function
9.9 Further reading for advanced annotation
  9.9.1 Further reading for subjective data
  9.9.2 Further reading for machine learning for annotation quality control
  9.9.3 Further reading for embeddings/contextual representations
  9.9.4 Further reading for rule-based systems
  9.9.5 Further reading for incorporating uncertainty in annotations into the downstream models
Summary

10 Annotation quality for different machine learning tasks
10.1 Annotation quality for continuous tasks
  10.1.1 Ground truth for continuous tasks
  10.1.2 Agreement for continuous tasks
  10.1.3 Subjectivity in continuous tasks
  10.1.4 Aggregating continuous judgments to create training data
  10.1.5 Machine learning for aggregating continuous tasks to create training data
10.2 Annotation quality for object detection
  10.2.1 Ground truth for object detection
  10.2.2 Agreement for object detection
  10.2.3 Dimensionality and accuracy in object detection
  10.2.4 Subjectivity for object detection
  10.2.5 Aggregating object annotations to create training data
  10.2.6 Machine learning for object annotations
10.3 Annotation quality for semantic segmentation
  10.3.1 Ground truth for semantic segmentation annotation
  10.3.2 Agreement for semantic segmentation
  10.3.3 Subjectivity for semantic segmentation annotations
  10.3.4 Aggregating semantic segmentation to create training data
  10.3.5 Machine learning for aggregating semantic segmentation tasks to create training data
10.4 Annotation quality for sequence labeling
  10.4.1 Ground truth for sequence labeling
  10.4.2 Ground truth for sequence labeling in truly continuous data
  10.4.3 Agreement for sequence labeling
  10.4.4 Machine learning and transfer learning for sequence labeling
  10.4.5 Rule-based, search-based, and synthetic data for sequence labeling
10.5 Annotation quality for language generation
  10.5.1 Ground truth for language generation
  10.5.2 Agreement and aggregation for language generation
  10.5.3 Machine learning and transfer learning for language generation
  10.5.4 Synthetic data for language generation
10.6 Annotation quality for other machine learning tasks
  10.6.1 Annotation for information retrieval
  10.6.2 Annotation for multifield tasks
  10.6.3 Annotation for video
  10.6.4 Annotation for audio data
10.7 Further reading for annotation quality for different machine learning tasks
  10.7.1 Further reading for computer vision
  10.7.2 Further reading for annotation for natural language processing
  10.7.3 Further reading for annotation for information retrieval
Summary

Part 4 Human–computer interaction for machine learning
11 Interfaces for data annotation
11.1 Basic principles of human–computer interaction
  11.1.1 Introducing affordance, feedback, and agency
  11.1.2 Designing interfaces for annotation
  11.1.3 Minimizing eye movement and scrolling
  11.1.4 Keyboard shortcuts and input devices
11.2 Breaking the rules effectively
  11.2.1 Scrolling for batch annotation
  11.2.2 Foot pedals
  11.2.3 Audio inputs
11.3 Priming in annotation interfaces
  11.3.1 Repetition priming
  11.3.2 Where priming hurts
  11.3.3 Where priming helps
11.4 Combining human and machine intelligence
  11.4.1 Annotator feedback
  11.4.2 Maximizing objectivity by asking what other people would annotate
  11.4.3 Recasting continuous problems as ranking problems
11.5 Smart interfaces for maximizing human intelligence
  11.5.1 Smart interfaces for semantic segmentation
  11.5.2 Smart interfaces for object detection
  11.5.3 Smart interfaces for language generation
  11.5.4 Smart interfaces for sequence labeling
11.6 Machine learning to assist human processes
  11.6.1 The perception of increased efficiency
  11.6.2 Active learning for increased efficiency
  11.6.3 Errors can be better than absence to maximize completeness
  11.6.4 Keep annotation interfaces separate from daily work interfaces
11.7 Further reading
Summary

12 Human-in-the-loop machine learning products
12.1 Defining products for human-in-the-loop machine learning applications
  12.1.1 Start with the problem you are solving
  12.1.2 Design systems to solve the problem
  12.1.3 Connecting Python and HTML
12.2 Example 1: Exploratory data analysis for news headlines
  12.2.1 Assumptions
  12.2.2 Design and implementation
  12.2.3 Potential extensions
12.3 Example 2: Collecting data about food safety events
  12.3.1 Assumptions
  12.3.2 Design and implementation
  12.3.3 Potential extensions
12.4 Example 3: Identifying bicycles in images
  12.4.1 Assumptions
  12.4.2 Design and implementation
  12.4.3 Potential extensions
12.5 Further reading for building human-in-the-loop machine learning products
Summary

Appendix: Machine learning refresher
A.1 Interpreting predictions from a model
  A.1.1 Probability distributions
A.2 Softmax deep dive
  A.2.1 Converting the model output to confidences with softmax
  A.2.2 The choice of base/temperature for softmax
  A.2.3 The result from dividing exponentials
A.3 Measuring human-in-the-loop machine learning systems
  A.3.1 Precision, recall, and F-score
  A.3.2 Micro and macro precision, recall, and F-score
  A.3.3 Taking random chance into account: Chance-adjusted accuracy
  A.3.4 Taking confidence into account: Area under the ROC curve (AUC)
  A.3.5 Number of model errors spotted
  A.3.6 Human labor cost saved
  A.3.7 Other methods for calculating accuracy in this book

index
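To give a flavor of the uncertainty sampling algorithms covered in chapter 3, here is a minimal sketch of the standard scoring functions over a softmax output. This follows the textbook definitions in their simplest form (the book also discusses normalized variants), and is not code from the book:

```python
import math

def softmax(scores):
    """Convert raw model scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def least_confidence(probs):
    """Higher when the top prediction is less confident."""
    return 1.0 - max(probs)

def margin_of_confidence(probs):
    """Higher when the top two predictions are close together."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def entropy(probs):
    """Classification entropy in bits; maximal for a uniform distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# An item the model is unsure about scores high on all three measures.
probs = softmax([2.0, 1.0, 0.5])
scores = (least_confidence(probs), margin_of_confidence(probs), entropy(probs))
```

In an active learning loop, items from the unlabeled pool are ranked by one of these scores and the highest-scoring items are sent for human annotation.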
How to download the source code
1. Go to https://www.manning.com.
2. Search for the book title: Human-in-the-Loop Machine Learning: Active learning and annotation for human-centered AI. If you get no results, search for the main title only.
3. Click the book title in the search results.
4. In the Resources section, click Source Code.

1. Disable any ad-blocking plugin; otherwise, the download links may not appear.
2. Solve the CAPTCHA.
3. Click the download link.
4. You will be taken to the download server to download the file.