Artificial Intelligence for Human Computer Interaction: A Modern Approach
- Length: 615 pages
- Edition: 1
- Language: English
- Publisher: Springer
- Publication Date: 2021-12-11
- ISBN-10: 3030826805
- ISBN-13: 9783030826802
This edited book explores the many interesting questions that lie at the intersection of AI and HCI. It covers a comprehensive set of perspectives, methods, and projects that present the challenges and opportunities that modern AI methods bring to HCI researchers and practitioners. The chapters mark a clear departure from traditional HCI methods, leveraging data-driven and deep learning methods to tackle HCI problems that were previously challenging or impossible to address.
The book starts by addressing classic HCI topics, including human behavior modeling and input, and then dedicates a section to data and tools, two technical pillars of modern AI methods. These chapters exemplify how state-of-the-art deep learning methods open up new directions and allow researchers to tackle long-standing and newly emerging HCI problems alike. Artificial Intelligence for Human Computer Interaction: A Modern Approach concludes with a section on Specific Domains, which covers a set of emerging HCI areas where modern AI methods are starting to show real impact, such as personalized medicine, design, and UI automation.
Contents

Foreword
Introduction

Part I: Modeling

Human Performance Modeling with Deep Learning
- 1 Introduction
- 2 Modeling Visual Search Performance on Web Pages: 2.1 Problem Formulation and Datasets; 2.2 The Model Design and Learning; 2.3 Experiments; 2.4 Analysis
- 3 Predicting Human Performance in Vertical Menu Selection: 3.1 Problem Formulation and Datasets; 3.2 The Model Design and Learning; 3.3 Experiments; 3.4 Analysis
- 4 Modeling Grid Performance on Touchscreen Mobile Devices: 4.1 Problem Formulation and Datasets; 4.2 Model Design and Learning; 4.3 Experiments; 4.4 Analysis
- 5 Discussion
- 6 Conclusion
- References

Optimal Control to Support High-Level User Goals in Human-Computer Interaction
- 1 Introduction
- 2 Problem Setting
- 3 Related Work: 3.1 Model Predictive Control for Adaptive Systems; 3.2 Reinforcement Learning from Human Behavior
- 4 Background: 4.1 Optimal Control Problem; 4.2 Model Predictive Control; 4.3 Markov Decision Processes; 4.4 Reinforcement Learning
- 5 Model Predictive Control for Robotic Task Support: 5.1 Aesthetic Criteria of Aerial Film; 5.2 Method; 5.3 Evaluation
- 6 Reinforcement Learning for Adaptive Mixed-Reality UIs: 6.1 Data Collection; 6.2 Method; 6.3 Evaluation
- 7 Discussion
- 8 Future Research Challenges
- 9 Conclusion
- References

Modeling Mobile Interface Tappability Using Crowdsourcing and Deep Learning
- 1 Introduction
- 2 Background
- 3 Related Work: 3.1 Large-Scale Data Collection to Assess Interface Design & Usability; 3.2 Machine Learning Methods to Assess Interface Design & Usability
- 4 Understanding Tappability at Scale: 4.1 Crowdsourcing Data Collection; 4.2 Results; 4.3 Signifier Analysis
- 5 Tappability Prediction Model: 5.1 Feature Encoding; 5.2 Model Architecture & Learning; 5.3 Model Performance Results; 5.4 Human Consistency & Model Behaviors; 5.5 Usefulness of Individual Features
- 6 TapShoe Interface
- 7 Informal Feedback from Designers: 7.1 Visualizing Probabilities; 7.2 Exploring Variations; 7.3 Model Extension and Accuracy
- 8 Discussion
- 9 Future Work
- 10 Conclusion
- References

Part II: Input

Eye Gaze Estimation and Its Applications
- 1 Introduction
- 2 Background: 2.1 The Human Eye Gaze; 2.2 Gaze Estimation Methods; 2.3 Learning-Based Gaze Estimation Methods; 2.4 Person-Specific Gaze Estimator Calibration
- 3 Learning-Based Gaze Estimation Methods: 3.1 Gaze Estimation Method Pipeline; 3.2 3D and 2D Gaze Estimation; 3.3 Input for Gaze Estimation Methods; 3.4 Representation Learning for Gaze Estimation; 3.5 Gaze Estimation Datasets; 3.6 Comparison of Learning-Based and Commercial Gaze Estimation Methods
- 4 Making Gaze Tracking Practicable for Computer Interaction: 4.1 Personalizing Gaze Tracking Methods; 4.2 Design of Robust Interfaces; 4.3 Making Single-Webcam-Based Methods Accessible for HCI Researchers
- 5 Applications: 5.1 Gaze-Aware Real-Life Objects; 5.2 Adapting a UI to Improve Information Relevance
- 6 Discussion and Outlook
- 7 Conclusion
- References

AI-Driven Intelligent Text Correction Techniques for Mobile Text Entry
- 1 Introduction
- 2 Related Work: 2.1 Text Correction Behaviors on Touch Screens; 2.2 Mobile Text Correction Techniques; 2.3 Multi-Modal Text Input; 2.4 NLP Algorithms for Error Correction
- 3 Type, Then Correct: The Three Interactions: 3.1 Drag-n-Drop; 3.2 Drag-n-Throw; 3.3 Magic Key
- 4 Type, Then Correct: The Correction Algorithm: 4.1 Expected Correction Categories; 4.2 The Deep Neural Network Structure; 4.3 Data Collection and Processing; 4.4 Training Process; 4.5 Results; 4.6 Other Implementation Details
- 5 Type, Then Correct: Experiment: 5.1 Participants; 5.2 Apparatus; 5.3 Phrases Used in the Correction Task; 5.4 Procedure
- 6 Type, Then Correct: Results: 6.1 Correction Time; 6.2 Success Rate; 6.3 Subjective Preference
- 7 JustCorrect: Simplifying the Text Correction Based on TTC: 7.1 The Post hoc Correction Algorithm; 7.2 Substitution Score; 7.3 Insertion Score; 7.4 Combining Substitution and Insertion Candidates
- 8 JustCorrect: Experiment: 8.1 Participants; 8.2 Apparatus; 8.3 Design; 8.4 Procedure; 8.5 Results; 8.6 Discussion; 8.7 Future Work
- 9 Conclusion
- References

Deep Touch: Sensing Press Gestures from Touch Image Sequences
- 1 Introduction
- 2 Touch Sensing and Finger Interaction: 2.1 Touch-Sensing Hardware; 2.2 Finger-Surface Biomechanics; 2.3 Touch Interaction Design
- 3 Deep Touch Model: 3.1 Touch Gesture Patterns; 3.2 Model Design; 3.3 Data Set Development; 3.4 Training; 3.5 Results
- 4 System Integration and Evaluation: 4.1 Gesture Classification Algorithm; 4.2 Evaluation
- 5 Discussion
- 6 Conclusion
- References

Deep Learning-Based Hand Posture Recognition for Pen Interaction Enhancement
- 1 Introduction
- 2 Background: 2.1 Hand-Independent Pen Sensors; 2.2 Capacitive Touch Sensors; 2.3 Vision-Based Camera Sensing; 2.4 Physiological Sensors
- 3 Posture Recognition Using an EMG Armband: 3.1 Data Sampling; 3.2 CNN Classification; 3.3 Baseline SVM and RF Classification; 3.4 Model Evaluation; 3.5 Results
- 4 Posture Recognition Using Raw Capacitive Images: 4.1 Posture Set; 4.2 Classification; 4.3 Network Architecture; 4.4 Training and Validation; 4.5 Results; 4.6 Postures Using the Other Hand
- 5 Posture Detection with a Pen-Top Camera: 5.1 Posture and Gesture Detection; 5.2 Data Gathering; 5.3 Network Architecture; 5.4 Experiments with Training and Validation; 5.5 Results
- 6 Hand Postures for Pen Interaction in an Application Context: 6.1 Pen-Grip Detection for Touch Input; 6.2 Pointing at and Capturing Off-Tablet Content; 6.3 Discrete and Continuous Actions; 6.4 Posture Usability
- 7 Conclusion
- References

Part III: Data and Tools

An Early Rico Retrospective: Three Years of Uses for a Mobile App Dataset
- 1 Introduction
- 2 Collecting Rico: 2.1 Crowdsourced Exploration; 2.2 Automated Exploration; 2.3 Content-Agnostic Similarity Heuristic; 2.4 Coverage Benefits of Hybrid Exploration
- 3 The Rico Dataset: 3.1 Data Collection; 3.2 Design Data Organization
- 4 Our Uses of Rico: 4.1 Training a UI Layout Embedding; 4.2 Understanding Material Design Usage in the Wild
- 5 Rico in the World: 5.1 Mobile Ecosystem Explorations; 5.2 UI Automation; 5.3 Design Assistance; 5.4 Understanding UI Semantics; 5.5 Automated Design; 5.6 Enhancements to the Rico Approach and Dataset
- 6 Discussion
- 7 Conclusion
- References

Visual Intelligence through Human Interaction
- 1 Introduction
- 2 Data Annotation by Speeding up Human Interactions: 2.1 Related Work; 2.2 Error-Embracing Crowdsourcing; 2.3 Model; 2.4 Calibration: Baseline Worker Reaction Time; 2.5 Study 1: Image Verification; 2.6 Study 2: Non-visual Tasks; 2.7 Study 3: Multi-class Classification; 2.8 Application: Building ImageNet; 2.9 Discussion
- 3 Data Acquisition Through Social Interactions: 3.1 Related Work; 3.2 Social Strategies; 3.3 System Design; 3.4 Experiments; 3.5 Discussion
- 4 Model Evaluation Using Human Perception: 4.1 HYPE: A Benchmark for Human eYe Perceptual Evaluation; 4.2 Consistent and Reliable Design; 4.3 Experimental Setup; 4.4 Experiment 1: HYPE_time and HYPE_∞ on Human Faces; 4.5 Experiment 2: HYPE_∞ Beyond Faces; 4.6 Related Work; 4.7 Discussion
- 5 Conclusion
- References

ML Tools for the Web: A Way for Rapid Prototyping and HCI Research
- 1 Introduction
- 2 Related Work: 2.1 ML Use Cases in HCI; 2.2 ML Libraries; 2.3 Task-Specific Libraries; 2.4 ML Systems with a Graphical Interface; 2.5 Challenges for Non-ML Experts
- 3 The Positive Spiral Effect of Fast Prototyping and Research: 3.1 Releasing a New Research Model; 3.2 Using the Model; 3.3 Feedback
- 4 TensorFlow.js, an ML Tool for the Web: 4.1 TensorFlow.js Models, Example: Body Segmentation; 4.2 TensorFlow Models, Example: Converting an Existing ML Model
- 5 Transfer Learning Made Simple: 5.1 Teachable Machine; 5.2 Cloud AutoML
- 6 Deployment Considerations: 6.1 Model Optimization; 6.2 Hardware Acceleration; 6.3 Benchmarking
- 7 Discussion: 7.1 Limitations; 7.2 Advantages of Web-Based Machine Learning; 7.3 Challenges for Web-Based ML and Future Work
- 8 Conclusion
- References

Interactive Reinforcement Learning for Autonomous Behavior Design
- 1 Introduction: 1.1 Reinforcement Learning Basics; 1.2 Why Use Interactive Reinforcement Learning?; 1.3 Interactive Reinforcement Learning Testbeds
- 2 Design Guides for Interactive Reinforcement Learning: 2.1 Design Dimensions; 2.2 Feedback Types; 2.3 Typical Use of Feedback Input for the Design Dimensions
- 3 Design Example Using Interactive Reinforcement Learning
- 4 Recent Research Results: 4.1 Reward Shaping; 4.2 Policy Shaping; 4.3 Guided Exploration Process; 4.4 Augmented Value Function; 4.5 Inverse Reward Design
- 5 Design Principles for Interactive RL: 5.1 Feedback; 5.2 Typification of the End-User; 5.3 Fast Interaction Cycles; 5.4 Design Implications
- 6 Open Challenges: 6.1 Making Interactive RL Usable in High-Dimensional Environments; 6.2 Lack of User-Experience Evaluation; 6.3 Modeling Users' Preferences; 6.4 Debugging Interactive RL
- 7 Conclusion
- References

Part IV: Specific Domains

Sketch-Based Creativity Support Tools Using Deep Learning
- 1 Introduction
- 2 Role of Sketching in Supporting Creative Activities: 2.1 Sketch-Based Applications Supporting Artistic Expressions; 2.2 Sketch-Based Applications Supporting Design in Various Domains
- 3 Large-Scale Sketch Datasets and Sketch-Based Deep-Learning Applications: 3.1 Large-Scale Sketch Datasets; 3.2 Sketch-Based Image and 3D Model Retrieval; 3.3 Neural Sketch Generation
- 4 Developing a Paired Sketch/User Interface Dataset: 4.1 Designer Recruitment and Compensation; 4.2 Dataset Statistics; 4.3 Data Collection and Postprocessing Procedure
- 5 Developing Swire: A Sketch-Based User Interface Retrieval System: 5.1 Network Architecture; 5.2 Triplet Loss; 5.3 Data and Training Procedure; 5.4 Querying; 5.5 Results; 5.6 Applications
- 6 Developing Scones: A Conversational Sketching System: 6.1 System Architecture; 6.2 Datasets and Model Training; 6.3 Results; 6.4 Exploratory User Evaluation
- 7 Limitations and Future Research Opportunities: 7.1 Dataset Scale and Match; 7.2 Integration with Applications in Real Usage Scenarios
- 8 Conclusion
- References

Generative Ink: Data-Driven Computational Models for Digital Ink
- 1 Introduction
- 2 Related Work: 2.1 Understanding Handwriting; 2.2 Pen-Based Interaction; 2.3 Handwriting Beautification; 2.4 Handwriting Synthesis; 2.5 Free-Form Sketches
- 3 Background: 3.1 Data Representation; 3.2 Datasets
- 4 Editable Digital Ink via Deep Generative Modeling: 4.1 Method Overview; 4.2 Background; 4.3 Conditional Variational Recurrent Neural Network; 4.4 High-Quality Digital Ink Synthesis; 4.5 Application Scenarios; 4.6 Preliminary User Evaluation
- 5 Compositional Stroke Embeddings: 5.1 Method Overview; 5.2 Stroke Embeddings; 5.3 CoSE Relational Model (R_θ); 5.4 Training; 5.5 Experiments
- 6 Discussion and Outlook
- References

Bridging Natural Language and Graphical User Interfaces
- 1 Introduction
- 2 Natural Language Grounding in User Interfaces: 2.1 Problem Formulation; 2.2 Data; 2.3 Model Architectures; 2.4 Experiments; 2.5 Analysis
- 3 Natural Language Generation from UIs: 3.1 Data; 3.2 Model Architecture; 3.3 Experiments; 3.4 Analysis
- 4 Conclusion
- References

Demonstration + Natural Language: Multimodal Interfaces for GUI-Based Interactive Task Learning Agents
- 1 Introduction: 1.1 Interactive Task Learning for Smartphone Intelligent Agents; 1.2 Contributions
- 2 The Human-AI Collaboration Perspective
- 3 Related Work: 3.1 Programming by Demonstration; 3.2 Natural Language Programming; 3.3 Multi-modal Interfaces; 3.4 Understanding App Interfaces
- 4 System Overview
- 5 Key Features: 5.1 Using Demonstrations in Natural Language Instructions; 5.2 Spoken Intent Clarification for Demonstrated Actions; 5.3 Task Parameterization Through GUI Grounding; 5.4 Generalizing the Learned Concepts; 5.5 Breakdown Repairs in Task-Oriented Dialogs; 5.6 The Semantic Representation of GUIs
- 6 User Evaluations
- 7 Limitations: 7.1 Platform; 7.2 Runtime Efficiency; 7.3 Expressiveness; 7.4 Brittleness
- 8 Future Work: 8.1 Generalization in Programming by Demonstration; 8.2 Field Study of Sugilite
- 9 Conclusion
- References

Human-Centered AI for Medical Imaging
- 1 Introduction
- 2 Leveraging Data-Driven AI to Distill New Insights from Medical Imaging: 2.1 Advances in AI-Enabled Radiology; 2.2 Advances in AI-Enabled Digital Pathology; 2.3 Advances in Other AI-Enabled Medical Imaging Modalities; 2.4 Limitations and Challenges
- 3 Patient-Centered AI for Medical Imaging: 3.1 Enabling Patients to Perform Self-Assessment; 3.2 Involving Patients in Clinical Diagnosis and Treatment
- 4 Physician-Centered AI for Medical Imaging: 4.1 Enabling Physicians to Comprehend AI's Findings; 4.2 Enabling Physicians to Use AI as Tools; 4.3 Enabling Physicians to Collaborate with AI
- 5 Outlook and Summary
- References

3D Spatial Sound Individualization with Perceptual Feedback
- 1 Introduction: 1.1 User Modeling and System Adaptation; 1.2 3D Spatial Sound Individualization; 1.3 Adapting a Generative Model to a Specific User with Perceptual Feedback
- 2 Embedding Individualization Parameters into a Generative Model: 2.1 Variational AutoEncoder; 2.2 Decomposing the Individualities; 2.3 Blending the Individualizing Parameters; 2.4 Blending Example
- 3 Adaptation with Perceptual Feedback: 3.1 Optimizing the Blending Vector; 3.2 Estimating the Local Landscape of the Perceptual Function; 3.3 Optimization
- 4 Example: 3D Spatial Sound Individualization: 4.1 Generative Model for 3D Spatial Sound; 4.2 User Interface; 4.3 User Study
- 5 Discussions: 5.1 Tensor Decomposition for Adaptation; 5.2 Gradient Estimation from Relative Assessments; 5.3 Possible Applications in HCI
- 6 Conclusion
- References