Machine Learning under Resource Constraints, Volume 2: Discovery in Physics
- Length: 349 pages
- Edition: 1
- Language: English
- Publisher: de Gruyter
- Publication Date: 2022-12-31
- ISBN-10: 3110785951
- ISBN-13: 9783110785951
Machine learning has been part of Artificial Intelligence since its beginnings. Arguably, only a perfect being could show intelligent behavior without learning; all others, be they humans or machines, need to learn in order to enhance their capabilities. In the 1980s, learning from examples and modeling human learning strategies were investigated in concert. The formal statistical basis of many learning methods was put forward later and is still an integral part of machine learning. Neural networks have always been in the toolbox of methods. Integrating all the pre-processing, kernel-function, and transformation steps of a machine learning process into the architecture of a deep neural network increased the performance of this model type considerably. Modern machine learning is challenged on the one hand by the amount of data and on the other hand by the demand for real-time inference. This leads to an interest in computing architectures and modern processors. For a long time, machine learning research could take the von Neumann architecture for granted: all algorithms were designed for the classical CPU, and issues of implementation on a particular architecture were ignored. This is no longer possible. The time for investigating machine learning and computer architecture independently is over.
Computing architecture has seen a similarly rapid development, from the mainframes and personal computers of the last century to today's very large compute clusters on the one hand and the ubiquitous embedded systems of the Internet of Things on the other. The sensors of cyber-physical systems produce huge amounts of streaming data that need to be stored and analyzed, and their actuators need to react in real time. This establishes a close connection with machine learning. Cyber-physical systems and systems in the Internet of Things consist of diverse components, heterogeneous in both hardware and software. Modern multi-core systems, graphics processors, memory technologies, and hardware-software co-design offer opportunities for better implementations of machine learning models.
Machine learning and embedded systems together now form a field of research that tackles leading-edge problems in machine learning, algorithm engineering, and embedded systems. Machine learning today needs to make the resource demands of learning and inference meet the resource constraints of the computer architectures and platforms used. A large variety of algorithms for the same learning method, and diverse implementations of an algorithm for particular computing architectures, optimize learning with respect to resource efficiency while keeping some guarantees of accuracy. The trade-off between decreased energy consumption and an increased error rate, to give just one example, needs to be shown theoretically both for training a model and for model inference. Pruning and quantization are ways of reducing the resource requirements by either compressing or approximating the model. In addition to memory and energy consumption, timeliness is an important issue, since many embedded systems are integrated into larger products that interact with the physical world; if results are delivered too late, they may have become useless. As a result, real-time guarantees are needed for such systems. To efficiently utilize the available resources, e.g., processing power, memory, and accelerators, with respect to response time, energy consumption, and power dissipation, different scheduling algorithms and resource management strategies need to be developed.
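To give a flavor of the compression techniques mentioned above, the following is a minimal sketch of unstructured magnitude pruning and uniform post-training quantization on a weight array. It uses NumPy only; the helper names are illustrative, not from any library or from this book, and real systems would prune and quantize per layer with calibration.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_uniform(weights, n_bits=8):
    """Map float weights to n-bit signed integers plus one float scale factor."""
    scale = np.abs(weights).max() / (2 ** (n_bits - 1) - 1)
    q = np.round(weights / scale).astype(np.int8)
    return q, scale  # store 1 byte per weight instead of 4

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)

w_pruned = prune_by_magnitude(w, sparsity=0.9)  # roughly 90% of entries become zero
q, scale = quantize_uniform(w, n_bits=8)        # 4x smaller than float32
w_restored = q.astype(np.float32) * scale       # dequantize; error bounded by scale/2
```

Both transformations trade accuracy for resources: the sparse array can be stored and multiplied cheaply, and the int8 representation cuts memory by a factor of four at the cost of a bounded rounding error.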
This book series addresses machine learning under resource constraints as well as the application of the described methods in various domains of science and engineering. Turning big data into smart data requires many steps of data analysis: methods for extracting and selecting features, filtering and cleaning the data, joining heterogeneous sources, aggregating the data, and learning predictions all need to scale up. The algorithms are challenged on the one hand by high-throughput, gigantic data sets, as in astrophysics, and on the other hand by high-dimensional data, as in genetics. Resource constraints are given by the relation between the demands of processing the data and the capacity of the computing machinery; the resources are runtime, memory, communication, and energy. Novel machine learning algorithms are optimized with regard to minimal resource consumption. Moreover, learned predictions are applied to program executions in order to save resources. The three volumes have the following subtopics:
Volume 1: Machine Learning under Resource Constraints – Fundamentals
Volume 2: Machine Learning and Physics under Resource Constraints – Discovery
Volume 3: Machine Learning under Resource Constraints – Applications
Volume 2 is about machine learning for knowledge discovery in particle and astroparticle physics. The instruments of these fields, e.g., particle accelerators or telescopes, gather petabytes of data. Here, machine learning is necessary not only to process the vast amounts of data and to detect the relevant examples efficiently, but also as part of the knowledge discovery process itself. The physical knowledge is encoded in simulations that are used to train the machine learning models. At the same time, the interpretation of the learned models serves to expand the physical knowledge. The result is a cycle of theory enhancement supported by machine learning.
1 Introduction
  1.1 Basics, Questions, and Motivation
  1.2 Machine Learning as Model of the Scientific Process
    1.2.1 Early Approaches to Scientific Discovery
    1.2.2 Knowledge Representation – What is a Theory?
    1.2.3 Towards Probabilities
    1.2.4 The Big Data Move
  1.3 From the Physical Theory to the Physical Observable
  1.4 From the Measured Variable to the Measuring Point
  1.5 The Epistemology of Physics in Concert with Machine Learning
    1.5.1 Further Reading in this Book
2 Challenges in Particle and Astroparticle Physics
  2.1 Physical Motivation, Problems, and Examples
  2.2 Astroparticle Physics
    2.2.1 Experiments
  2.3 Particle Physics
    2.3.1 Experiments
3 Key Concepts in Machine Learning and Data Analysis
  3.1 Overview of the Field of Machine Learning
    3.1.1 Learning Tasks
    3.1.2 Processing Paradigms of Machine Learning
    3.1.3 Machine Learning Pipelines
    3.1.4 Minimum Redundancy Maximum Relevance (MRMR)
  3.2 Optimization
    3.2.1 Stochastic Gradient Descent
    3.2.2 Newton-Raphson Optimization
  3.3 Theories of Machine Learning
    3.3.1 Computational Learning Theory
  3.4 Tree Models
    3.4.1 Ensemble Methods
    3.4.2 Implementations and Hardware Considerations
  3.5 Neural Networks
    3.5.1 Architectures of DNNs
    3.5.2 Robustness of DNNs
    3.5.3 Deep Learning Theory
    3.5.4 Explanations
    3.5.5 Hardware Considerations
4 Data Acquisition and Data Structure
  4.1 Introduction
  4.2 Data Acquisition
    4.2.1 Data Acquisition for LHCb
    4.2.2 Data Acquisition for Imaging Air Cherenkov Telescopes
    4.2.3 Data Acquisition for IceCube
  4.3 Data Structures
    4.3.1 Data Structures for LHCb
    4.3.2 Data Structures for IACTs
    4.3.3 Data Structures for IceCube
  4.4 GPU-Based Trigger Decisions
5 Monte Carlo Simulations
  5.1 LHCb: Monte Carlo Simulations and Libraries in Particle Physics
    5.1.1 Astro: Monte Carlo Simulations, Libraries
  5.2 Simulation Efficiency Studies
    5.2.1 Corsika – Active Learning
    5.2.2 Control of the Simulation – Active Sampling
    5.2.3 Corsika 8 New Modular Library
    5.2.4 ARM-Cluster for Corsika
  5.3 Validation of the Simulation
    5.3.1 Introduction
    5.3.2 Mismatches Between Observed and Simulated Data
    5.3.3 Detection of Mismatches
  5.4 Keynote: The Muon Puzzle
    5.4.1 Introduction
    5.4.2 Meta-Analysis of Muon Measurements in Air Showers
    5.4.3 Muon Production in Air Showers
    5.4.4 Related Measurements at the LHC at CERN
    5.4.5 Fixed-Target Experiments at SPS and LHC
    5.4.6 Summary and Outlook
6 Data Storage and Access
  6.1 Introduction
  6.2 Research Data Management
    6.2.1 Management of Large Amounts of Research Data
  6.3 The FACT Open Data Project as an Example of Public Data Access
    6.3.1 Available Data
  6.4 The DeLorean System Architecture as Example of Experiment-Internal Access
    6.4.1 The Data Volume Problem at LHCb
    6.4.2 DeLorean: Optimized Scans for Efficient Data Access
    6.4.3 Evaluating DeLorean
    6.4.4 Looking Ahead
7 Monitoring and Feature Extraction
  7.1 Introduction
  7.2 Feature Extraction and Selection in IceCube
  7.3 Feature Extraction for IACTs
    7.3.1 Introduction
    7.3.2 Image Extraction
    7.3.3 Image Cleaning
    7.3.4 Image Parametrization
  7.4 Monitoring the Telescope via Data Summarization
    7.4.1 Introduction
    7.4.2 Data Summarization with Submodular Functions
    7.4.3 Dimensionality Reduction with Autoencoders
    7.4.4 Submodular Autoencoders
    7.4.5 Experiments
    7.4.6 Discussion and Outlook
8 Event Property Estimation and Signal Background Separation
  8.1 Introduction
  8.2 Boosted Decision Trees at the LHC
  8.3 Event Selection in IceCube
    8.3.1 Muon Neutrino Selection
    8.3.2 Tau Neutrino Selection
  8.4 Estimation of Event Properties for IACTs
    8.4.1 Labeled Training Data
    8.4.2 Particle Classification
    8.4.3 Energy Estimation
    8.4.4 Origin Estimation
    8.4.5 Combining Multiple Telescopes
    8.4.6 Final Event Selection
  8.5 Keynote: Data Analysis at ATLAS
    8.5.1 Introduction
    8.5.2 Rare Top-Quark Processes: Searching for FCNC Processes
    8.5.3 Searching for New Heavy Particles: Vector-Like Quarks
    8.5.4 Conclusions
9 Deep Learning Applications
  9.1 Introduction
  9.2 Deep Learning for IceCube
    9.2.1 Domain Knowledge in IceCube
    9.2.2 Convolutional Neural Networks in IceCube
    9.2.3 Combining Deep Learning with Maximum-Likelihood
    9.2.4 Model Performance and Applications
  9.3 Flavor Tagging with Deep Learning at the LHC
    9.3.1 Neutral B Meson Oscillations and CP-Violation
    9.3.2 Flavor Tagging Technique
    9.3.3 Flavor Tagging Formalism
    9.3.4 Flavor Tagging Algorithms
    9.3.5 Inclusive Flavor Tagging
  9.4 A Deep Learning Analysis Pipeline for Gamma Ray Astronomy
    9.4.1 Introduction
    9.4.2 Event-Tagging Pipeline
    9.4.3 Multi-Task Deep Neural Networks
    9.4.4 Model Performance and Applications
    9.4.5 Discussion
10 Inverse Problems
  10.1 Introduction
  10.2 Keynote: Introduction to Inverse Problems
    10.2.1 Information
    10.2.2 The Effect of the Response Function
    10.2.3 Supplementary Remarks
    10.2.4 Mathematical Appendix: Integrals Over Cosine Functions
  10.3 Likelihood-Based Deconvolution
    10.3.1 Introduction
    10.3.2 Discretization of the Observable Quantities
    10.3.3 Optimization
    10.3.4 Regularization with a Fixed Number of Degrees of Freedom
    10.3.5 Regularization with Minimum Global Correlation
  10.4 Deconvolution as a Classification Task
    10.4.1 Introduction
    10.4.2 Quantification with Classify-and-Count Methods
    10.4.3 Accurate Estimates Through Iterative Reweighting
    10.4.4 Classifier Choice
    10.4.5 Leveraging Eventwise Contributions
    10.4.6 Excursion: Document Analysis in Political Science
  10.5 Deconvolution of IACT Data
    10.5.1 IACT Instrument Response Functions
    10.5.2 Application of the Regularized Likelihood Deconvolution
  10.6 Deconvolution of Atmospheric Neutrino Spectra
Bibliography
Index
List of Contributors
  Editors
  Contributors
  Technical Editors
Acknowledgment