Low-Power Computer Vision: Improve the Efficiency of Artificial Intelligence
- Length: 344 pages
- Edition: 1
- Language: English
- Publisher: Chapman and Hall/CRC
- Publication Date: 2022-02-23
- ISBN-10: 0367744708
- ISBN-13: 9780367744700
- Sales Rank: #8293573
Energy efficiency is critical for running computer vision on battery-powered systems such as mobile phones and UAVs (unmanned aerial vehicles, or drones). This book collects the winning methods from the IEEE Low-Power Computer Vision Challenges held annually since 2015. The winners share their solutions and offer insight into how to improve the efficiency of machine learning systems.
Table of Contents
- Cover
- Half Title
- Title Page
- Copyright Page
- Contents
- Foreword: Rebooting Computing and Low-Power Computer Vision
- Editors
- SECTION I: Introduction
- CHAPTER 1: Book Introduction
  - 1.1. About the Book
  - 1.2. Chapter Summaries
    - 1.2.1. History of Low-Power Computer Vision Challenge
    - 1.2.2. Survey on Energy-Efficient Deep Neural Networks for Computer Vision
    - 1.2.3. Hardware Design and Software Practices for Efficient Neural Network Inference
    - 1.2.4. Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search
    - 1.2.5. Fast Adjustable Threshold for Uniform Neural Network Quantization
    - 1.2.6. Power-Efficient Neural Network Scheduling on Heterogeneous Systems-on-Chip (SoCs)
    - 1.2.7. Efficient Neural Architecture Search
    - 1.2.8. Design Methodology for Low-Power Image Recognition Systems
    - 1.2.9. Guided Design for Efficient On-Device Object Detection Model
    - 1.2.10. Quantizing Neural Networks for Low-Power Computer Vision
    - 1.2.11. A Practical Guide to Designing Efficient Mobile Architectures
    - 1.2.12. A Survey of Quantization Methods for Efficient Neural Network Inference
- CHAPTER 2: History of Low-Power Computer Vision Challenge
  - 2.1. Rebooting Computing
  - 2.2. Low-Power Image Recognition Challenge (LPIRC): 2015–2019
  - 2.3. Low-Power Computer Vision Challenge (LPCVC)
  - 2.4. Winners
  - 2.5. Acknowledgments
- CHAPTER 3: Survey on Energy-Efficient Deep Neural Networks for Computer Vision
  - 3.1. Introduction
  - 3.2. Background
    - 3.2.1. Computation Intensity of Deep Neural Networks
    - 3.2.2. Low-Power Deep Neural Networks
  - 3.3. Parameter Quantization
  - 3.4. Deep Neural Network Pruning
  - 3.5. Deep Neural Network Layer and Filter Compression
  - 3.6. Parameter Matrix Decomposition Techniques
  - 3.7. Neural Architecture Search
  - 3.8. Knowledge Distillation
  - 3.9. Energy Consumption-Accuracy Tradeoff with Deep Neural Networks
  - 3.10. Guidelines for Low-Power Computer Vision
    - 3.10.1. Relationship between Low-Power Computer Vision Techniques
    - 3.10.2. Deep Neural Network and Resolution Scaling
  - 3.11. Evaluation Metrics
    - 3.11.1. Accuracy Measurements on Popular Datasets
    - 3.11.2. Memory Requirement and Number of Operations
    - 3.11.3. On-Device Energy Consumption and Latency
  - 3.12. Summary and Conclusions
- SECTION II: Competition Winners
- CHAPTER 4: Hardware Design and Software Practices for Efficient Neural Network Inference
  - 4.1. Hardware and Software Design Framework for Efficient Neural Network Inference
    - 4.1.1. Introduction
    - 4.1.2. From Model to Instructions
  - 4.2. ISA-Based CNN Accelerator: Angel-Eye
    - 4.2.1. Hardware Architecture
    - 4.2.2. Compiler
    - 4.2.3. Runtime Workflow
    - 4.2.4. Extension Support of Upsampling Layers
    - 4.2.5. Evaluation
    - 4.2.6. Practice on DAC-SDC Low-Power Object Detection Challenge
  - 4.3. Neural Network Model Optimization
    - 4.3.1. Pruning and Quantization
      - 4.3.1.1. Network Pruning
      - 4.3.1.2. Network Quantization
      - 4.3.1.3. Evaluation and Practices
    - 4.3.2. Pruning with Hardware Cost Model
      - 4.3.2.1. Iterative Search-Based Pruning Methods
      - 4.3.2.2. Local Programming-Based Pruning and the Practice in LPCVC'19
    - 4.3.3. Architecture Search Framework
      - 4.3.3.1. Framework Design
      - 4.3.3.2. Case Study Using the aw_nas Framework: Black-Box Search Space Tuning for Hardware-Aware NAS
  - 4.4. Summary
- CHAPTER 5: Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search
  - 5.1. Abstract
  - 5.2. Introduction
  - 5.3. Related Work
  - 5.4. Method
    - 5.4.1. Problem Formulation and Motivation
    - 5.4.2. Progressive Automatic Design of Search Space
  - 5.5. Experiments
    - 5.5.1. Dataset and Implementation Details
    - 5.5.2. Comparison with State-of-the-Art Methods
    - 5.5.3. Automatically Designed Search Space
    - 5.5.4. Ablation Studies
  - 5.6. Conclusion
- CHAPTER 6: Fast Adjustable Threshold for Uniform Neural Network Quantization
  - 6.1. Introduction
  - 6.2. Related Work
    - 6.2.1. Quantization with Knowledge Distillation
    - 6.2.2. Quantization without Fine-Tuning
    - 6.2.3. Quantization with Training/Fine-Tuning
  - 6.3. Method Description
    - 6.3.1. Quantization with Threshold Fine-Tuning
      - 6.3.1.1. Differentiable Quantization Threshold
      - 6.3.1.2. Batch Normalization Folding
      - 6.3.1.3. Threshold Scale
      - 6.3.1.4. Training of Asymmetric Thresholds
      - 6.3.1.5. Vector Quantization
    - 6.3.2. Training on Unlabeled Data
    - 6.3.3. Quantization of Depth-Wise Separable Convolution
      - 6.3.3.1. Scaling the Weights for MobileNet-V2 (with ReLU6)
  - 6.4. Experiments and Results
    - 6.4.1. Experiments Description
      - 6.4.1.1. Researched Architectures
      - 6.4.1.2. Training Procedure
    - 6.4.2. Results
  - 6.5. Conclusion
- CHAPTER 7: Power-Efficient Neural Network Scheduling
  - 7.1. Introduction to Neural Network Scheduling on Heterogeneous SoCs
    - 7.1.1. Heterogeneous SoC
    - 7.1.2. Network Scheduling
  - 7.2. Coarse-Grained Scheduling for Neural Network Tasks: A Case Study of the Champion Solution in LPIRC 2016
    - 7.2.1. Introduction to the LPIRC 2016 Mission and the Solutions
    - 7.2.2. Static Scheduling for the Image Recognition Task
    - 7.2.3. Manual Load Balancing for Pipelined Fast R-CNN
    - 7.2.4. The Result of Static Scheduling
  - 7.3. Fine-Grained Neural Network Scheduling on Power-Efficient Processors
    - 7.3.1. Network Scheduling on SUs: Compiler-Level Techniques
    - 7.3.2. Memory-Efficient Network Scheduling
    - 7.3.3. The Formulation of the Layer-Fusion Problem by Computational Graphs
    - 7.3.4. Cost Estimation of Fused Layer-Groups
    - 7.3.5. Hardware-Aware Network Fusion Algorithm (HaNF)
    - 7.3.6. Implementation of the Network Fusion Algorithm
    - 7.3.7. Evaluation of Memory Overhead
    - 7.3.8. Performance on Different Processors
  - 7.4. Scheduler-Friendly Network Quantizations
    - 7.4.1. The Problem of Layer Pipelining between CPU and Integer SUs
    - 7.4.2. Introduction to Neural Network Quantization for Integer Neural Accelerators
    - 7.4.3. Related Work on Neural Network Quantization
    - 7.4.4. Linear Symmetric Quantization for Low-Precision Integer Hardware
    - 7.4.5. Making Full Use of the Pre-Trained Parameters
    - 7.4.6. Low-Precision Representation and Quantization Algorithm
    - 7.4.7. BN Layer Fusion of Quantized Networks
    - 7.4.8. Bias and Scaling Factor Quantization for Low-Precision Integer Operation
    - 7.4.9. Evaluation Results
  - 7.5. Summary
- CHAPTER 8: Efficient Neural Network Architectures
  - 8.1. Standard Convolution Layer
  - 8.2. Efficient Convolution Layers
  - 8.3. Manually Designed Efficient CNN Models
  - 8.4. Neural Architecture Search
  - 8.5. Hardware-Aware Neural Architecture Search
    - 8.5.1. Latency Prediction
    - 8.5.2. Specialized Models for Different Hardware
    - 8.5.3. Handling Many Platforms and Constraints
  - 8.6. Conclusion
- CHAPTER 9: Design Methodology for Low-Power Image Recognition Systems
  - 9.1. Design Methodology Used in LPIRC
    - 9.1.1. Object Detection Networks
    - 9.1.2. Throughput Maximization by Pipelining
    - 9.1.3. Software Optimization Techniques
      - 9.1.3.1. Tucker Decomposition
      - 9.1.3.2. CPU Parallelization
      - 9.1.3.3. 16-bit Quantization
      - 9.1.3.4. Post-Processing
  - 9.2. Image Recognition Network Exploration
    - 9.2.1. Single-Stage Detectors
    - 9.2.2. Software Optimization Techniques
    - 9.2.3. Post-Processing
    - 9.2.4. Network Exploration
    - 9.2.5. LPIRC 2018 Solution
  - 9.3. Network Pipelining for Heterogeneous Processor Systems
    - 9.3.1. Network Pipelining Problem
    - 9.3.2. Network Pipelining Heuristic
    - 9.3.3. Software Framework for Network Pipelining
    - 9.3.4. Experimental Results
  - 9.4. Conclusion and Future Work
- CHAPTER 10: Guided Design for Efficient On-Device Object Detection Model
  - 10.1. Introduction
    - 10.1.1. LPIRC Track 1 in 2018
    - 10.1.2. Three Awards for the Amazon Team
  - 10.2. Background
  - 10.3. Award-Winning Methods
    - 10.3.1. Quantization-Friendly Model
    - 10.3.2. Network Architecture Optimization
    - 10.3.3. Training Hyper-Parameters
    - 10.3.4. Optimal Model Architecture
    - 10.3.5. Neural Architecture Search
    - 10.3.6. Dataset Filtering
    - 10.3.7. Non-Maximum Suppression Threshold
    - 10.3.8. Combination
  - 10.4. Conclusion
- SECTION III: Invited Articles
- CHAPTER 11: Quantizing Neural Networks
  - 11.1. Introduction
  - 11.2. Quantization Fundamentals
    - 11.2.1. Hardware Background
    - 11.2.2. Uniform Affine Quantization
      - 11.2.2.1. Symmetric Uniform Quantization
      - 11.2.2.2. Power-of-Two Quantizer
      - 11.2.2.3. Quantization Granularity
    - 11.2.3. Quantization Simulation
      - 11.2.3.1. Batch Normalization Folding
      - 11.2.3.2. Activation Function Fusing
      - 11.2.3.3. Other Layers and Quantization
    - 11.2.4. Practical Considerations
      - 11.2.4.1. Symmetric vs. Asymmetric Quantization
      - 11.2.4.2. Per-Tensor and Per-Channel Quantization
  - 11.3. Post-Training Quantization
    - 11.3.1. Quantization Range Setting
    - 11.3.2. Cross-Layer Equalization
    - 11.3.3. Bias Correction
    - 11.3.4. AdaRound
    - 11.3.5. Standard PTQ Pipeline
    - 11.3.6. Experiments
  - 11.4. Quantization-Aware Training
    - 11.4.1. Simulating Quantization for the Backward Path
    - 11.4.2. Batch Normalization Folding and QAT
    - 11.4.3. Initialization for QAT
    - 11.4.4. Standard QAT Pipeline
    - 11.4.5. Experiments
  - 11.5. Summary and Conclusions
- CHAPTER 12: Building Efficient Mobile Architectures
  - 12.1. Introduction
  - 12.2. Architecture Parameterizations
    - 12.2.1. Network Width Multiplier
    - 12.2.2. Input Resolution Multiplier
    - 12.2.3. Data and Internal Resolution
    - 12.2.4. Network Depth Multiplier
    - 12.2.5. Adjusting Multipliers for Multi-Criteria Optimizations
  - 12.3. Optimizing Early Layers
  - 12.4. Optimizing the Final Layers
    - 12.4.1. Adjusting the Resolution of the Final Spatial Layer
    - 12.4.2. Reducing the Size of the Embedding Layer
  - 12.5. Adjusting Non-Linearities: H-Swish and H-Sigmoid
  - 12.6. Putting It All Together
- CHAPTER 13: A Survey of Quantization Methods for Efficient Neural Network Inference
  - 13.1. Introduction
  - 13.2. General History of Quantization
  - 13.3. Basic Concepts of Quantization
    - 13.3.1. Problem Setup and Notations
    - 13.3.2. Uniform Quantization
    - 13.3.3. Symmetric and Asymmetric Quantization
    - 13.3.4. Range Calibration Algorithms: Static vs. Dynamic Quantization
    - 13.3.5. Quantization Granularity
    - 13.3.6. Non-Uniform Quantization
    - 13.3.7. Fine-Tuning Methods
      - 13.3.7.1. Quantization-Aware Training
      - 13.3.7.2. Post-Training Quantization
      - 13.3.7.3. Zero-Shot Quantization
    - 13.3.8. Stochastic Quantization
  - 13.4. Advanced Concepts: Quantization below 8 Bits
    - 13.4.1. Simulated and Integer-Only Quantization
    - 13.4.2. Mixed-Precision Quantization
    - 13.4.3. Hardware-Aware Quantization
    - 13.4.4. Distillation-Assisted Quantization
    - 13.4.5. Extreme Quantization
    - 13.4.6. Vector Quantization
  - 13.5. Quantization and Hardware Processors
  - 13.6. Future Directions for Research in Quantization
  - 13.7. Summary and Conclusions
- Bibliography
- Index