Linear Algebra and Optimization for Machine Learning: A Textbook
- Length: 516 pages
- Edition: 1
- Language: English
- Publisher: Springer
- Publication Date: 2021-05-13
- ISBN-10: 3030403467
- ISBN-13: 9783030403461
- Sales Rank: #187606
This textbook introduces linear algebra and optimization in the context of machine learning. Examples and exercises are provided throughout the book, and a solution manual for the end-of-chapter exercises is available to instructors. The textbook targets graduate-level students and professors in computer science, mathematics, and data science, though advanced undergraduate students can also use it. The chapters are organized as follows:
1. Linear algebra and its applications: These chapters focus on the basics of linear algebra together with their common applications to singular value decomposition, matrix factorization, similarity matrices (kernel methods), and graph analysis. Numerous machine learning applications are used as examples, such as spectral clustering, kernel-based classification, and outlier detection. The tight integration of linear algebra methods with examples from machine learning differentiates this book from generic volumes on linear algebra. The focus is clearly on the most relevant aspects of linear algebra for machine learning and on teaching readers how to apply these concepts (see the SVD sketch after this list).
2. Optimization and its applications: Much of machine learning is posed as an optimization problem in which we try to maximize the accuracy of regression and classification models. The “parent problem” of optimization-centric machine learning is least-squares regression. Interestingly, this problem arises in both linear algebra and optimization and is one of the key problems connecting the two fields (see the least-squares sketch after this list). Least-squares regression is also the starting point for support vector machines, logistic regression, and recommender systems. Furthermore, the methods for dimensionality reduction and matrix factorization also require the development of optimization techniques. A general view of optimization in computational graphs is discussed together with its applications to backpropagation in neural networks.
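As a taste of the linear-algebra material, the following minimal sketch (assuming NumPy; the random data matrix and the truncation rank k = 2 are arbitrary illustrative choices, not examples from the book) uses the singular value decomposition to build a low-rank approximation, the operation underlying matrix factorization and dimensionality reduction:

```python
import numpy as np

# Illustrative data matrix: 6 points in 4 dimensions (values are arbitrary).
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))

# Thin SVD: A = U @ diag(s) @ Vt, with singular values s in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-2 approximation: keep only the two largest singular values.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: A_k is the best rank-k approximation in the Frobenius norm,
# and the error equals the root of the sum of the discarded squared s_i.
err = np.linalg.norm(A - A_k, "fro")
print(err, np.sqrt(np.sum(s[k:] ** 2)))  # the two printed values agree
```

Replacing A by A_k is exactly the compression step used in PCA-style dimensionality reduction.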
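To see concretely why least-squares regression connects the two fields, the sketch below (again a minimal NumPy illustration; the synthetic data, learning rate, and iteration count are arbitrary choices) solves the same problem twice: in closed form via the normal equations, which is the linear-algebra view, and by gradient descent on the squared error, which is the optimization view:

```python
import numpy as np

# Synthetic regression problem: y = X @ w_true + noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

# Linear-algebra view: solve the normal equations (X^T X) w = X^T y.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Optimization view: gradient descent on the mean squared error,
# whose gradient is (2/n) * X^T (X w - y).
w = np.zeros(3)
lr = 0.01
n = len(y)
for _ in range(2000):
    w -= lr * (2.0 / n) * X.T @ (X @ w - y)

print(w_closed)  # approximately [2, -1, 0.5]
print(w)         # converges to the same solution
```

Both routes recover essentially the same coefficient vector, which is the sense in which least-squares regression sits at the junction of linear algebra and optimization.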
A frequent challenge faced by beginners in machine learning is the extensive background required in linear algebra and optimization. One problem is that existing linear algebra and optimization courses are not specific to machine learning; therefore, one would typically have to complete more course material than is necessary to pick up machine learning. Furthermore, certain types of ideas and tricks from optimization and linear algebra recur more frequently in machine learning than in other application-centric settings. Therefore, there is significant value in developing a view of linear algebra and optimization that is better suited to the specific perspective of machine learning.
Table of Contents:
- Front matter: Cover Page; Half-Title Page; Series Page; Title Page; Copyright Page; Contents; Acknowledgments; Preface; Introduction
- 1 Graph Theory: 1.1 Basic Terminology; 1.2 The Power of the Adjacency Matrix; 1.3 Eigenvalues and Eigenvectors as Key Players; 1.4 CASE STUDY: Applications in Sport Ranking; 1.5 CASE STUDY: Gerrymandering; 1.6 Exercises
- 2 Stochastic Processes: 2.1 Markov Chain Basics; 2.2 Hidden Markov Models (2.2.1 The Likelihood Problem; 2.2.2 The Decoding Problem; 2.2.3 The Learning Problem); 2.3 CASE STUDY: Spread of Infectious Disease; 2.4 CASE STUDY: Text Analysis and Autocorrect; 2.5 CASE STUDY: Tweets and Time Series; 2.6 Exercises
- 3 SVD and PCA: 3.1 Vector and Inner Product Spaces; 3.2 Singular Values; 3.3 Singular Value Decomposition; 3.4 Compression of Data Using Principal Component Analysis (PCA); 3.5 PCA, Covariance, and Correlation; 3.6 Linear Discriminant Analysis; 3.7 CASE STUDY: Digital Humanities; 3.8 CASE STUDY: Facial Recognition Using PCA and LDA; 3.9 Exercises
- 4 Interpolation: 4.1 Lagrange Interpolation; 4.2 Orthogonal Families of Polynomials; 4.3 Newton's Divided Difference (4.3.1 Newton's Interpolation via Divided Difference; 4.3.2 Newton's Interpolation via the Vandermonde Matrix); 4.4 Chebyshev Interpolation; 4.5 Hermite Interpolation; 4.6 Least Squares Regression; 4.7 CASE STUDY: Chebyshev Polynomials and Cryptography; 4.8 CASE STUDY: Racial Disparities in Marijuana Arrests; 4.9 CASE STUDY: Interpolation in Higher Education Data; 4.10 Exercises
- 5 Optimization and Learning Techniques for Regression: 5.1 Basics of Probability Theory; 5.2 Introduction to Matrix Calculus (5.2.1 Matrix Differentiation; 5.2.2 Matrix Integration); 5.3 Maximum Likelihood Estimation; 5.4 Gradient Descent Method; 5.5 Introduction to Neural Networks (5.5.1 The Learning Process; 5.5.2 Sigmoid Activation Functions; 5.5.3 Radial Activation Functions); 5.6 CASE STUDY: Handwriting Digit Recognition; 5.7 CASE STUDY: Poisson Regression and COVID Counts; 5.8 Exercises
- 6 Decision Trees and Random Forests: 6.1 Decision Trees (6.1.1 Decision Trees Regression); 6.2 Regression Trees; 6.3 Random Decision Trees and Forests; 6.4 CASE STUDY: Entropy of Wordle; 6.5 CASE STUDY: Bird Call Identification; 6.6 Exercises
- 7 Random Matrices and Covariance Estimate: 7.1 Introduction to Random Matrices; 7.2 Stability; 7.3 Gaussian Orthogonal Ensemble; 7.4 Gaussian Unitary Ensemble; 7.5 Gaussian Symplectic Ensemble; 7.6 Random Matrices and the Relationship to the Covariance; 7.7 CASE STUDY: Finance and Brownian Motion; 7.8 CASE STUDY: Random Matrices in Gene Interaction; 7.9 Exercises
- 8 Sample Solutions to Exercises: 8.1 Chapter 1; 8.2 Chapter 2; 8.3 Chapter 3; 8.4 Chapter 4; 8.5 Chapter 5; 8.6 Chapter 6; 8.7 Chapter 7
- Back matter: Github Links; Bibliography; Index