Practical Mathematics for AI and Deep Learning: A Concise yet In-Depth Guide on Fundamentals of Computer Vision, NLP, Complex Deep Neural Networks and Machine Learning
- Length: 755 pages
- Edition: 1
- Language: English
- Publisher: BPB Publications
- Publication Date: 2022
- ASIN: B0BRCP4NX1
- Sales Rank: #650371
Mathematical Codebook to Navigate Through the Fast-changing AI Landscape
Key Features
- Industry-recognized AI methodology and deep learning mathematics, presented through simple-to-understand examples.
- Encompasses MDP Modeling, the Bellman Equation (see the illustrative equation after this list), Auto-regressive Models, BERT, and Transformers.
- Detailed, line-by-line diagrams of algorithms and the mathematical computations they perform.
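As a taste of the level of mathematics involved, the Bellman optimality equation for a Markov Decision Process, one of the topics listed above, can be written in its standard textbook form (the book's own notation may differ):

$$V^{*}(s) = \max_{a}\sum_{s'} P(s' \mid s, a)\,\bigl[R(s, a, s') + \gamma\, V^{*}(s')\bigr]$$

where $V^{*}(s)$ is the optimal value of state $s$, $P(s' \mid s, a)$ is the transition probability, $R(s, a, s')$ is the reward, and $\gamma \in [0, 1)$ is the discount factor.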
Description
To build a system that can genuinely be said to possess 'Artificial Intelligence,' you must be able to design algorithms that make automated, data-driven decisions under uncertainty. Accomplishing this requires an in-depth understanding of the more sophisticated parts of linear algebra, vector calculus, probability, and statistics. This book walks you through each mathematical algorithm, along with its architecture, operation, and design, so that you can understand how any artificial intelligence system operates.
This book teaches the common terminology used in artificial intelligence, such as models, data, model parameters, and dependent and independent variables. Bayesian linear regression, Gaussian mixture models, stochastic gradient descent, and the backpropagation algorithm are explored, with implementations built from scratch. The advanced mathematics behind complex AI computations, such as autoregressive models, CycleGANs, and CNN optimization, is explained and compared.
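To give a flavor of the from-scratch style described above, here is a minimal, illustrative sketch of stochastic gradient descent fitting a simple linear model. The synthetic data, variable names, and hyperparameters are assumptions made for this example; it is not code from the book.

```python
import numpy as np

# Illustrative sketch only: stochastic gradient descent fitting a
# linear model y = w*x + b from scratch. The synthetic data and
# hyperparameters are assumptions for this example, not the book's code.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=200)  # true w = 3.0, b = 0.5

w, b = 0.0, 0.0  # parameters to learn
lr = 0.1         # learning rate
for epoch in range(20):
    for i in rng.permutation(len(x)):    # visit samples in random order
        err = (w * x[i] + b) - y[i]      # prediction error on one sample
        w -= lr * err * x[i]             # gradient of 0.5*err**2 w.r.t. w
        b -= lr * err                    # gradient of 0.5*err**2 w.r.t. b

print(f"learned w = {w:.2f}, b = {b:.2f}")  # should approach 3.00 and 0.50
```

Each update steps the parameters along the negative gradient of the squared error on a single sample, which is the per-sample update rule behind the SGD and backpropagation material the book covers.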
Along the way, you will acquire knowledge that extends beyond the mathematics itself: you will become familiar with numerous AI training methods, a range of NLP tasks, and techniques for reducing the dimensionality of data.
What you will learn
- Learn to think like a professional data scientist by selecting the best-performing AI algorithm for a given problem.
- Expand your mathematical horizons to include the most cutting-edge AI methods.
- Learn about Transformer Networks, improving CNN performance, dimensionality reduction, and generative models.
- Explore several neural network designs as a starting point for constructing your own NLP and Computer Vision architectures.
- Create specialized loss functions and tailor-made AI algorithms for a given business application.
Who this book is for
Researchers and professionals interested in artificial intelligence and its computational foundations, including machine learning, data science, deep learning, computer vision, and natural language processing (NLP), will find this book an excellent companion. It also serves as a quick reference for practitioners who already apply a variety of mathematical techniques but do not fully understand the underlying principles.
Table of Contents
Front matter: Cover Page; Title Page; Copyright Page; Dedication Page; About the Authors; About the Reviewer; Acknowledgements; Preface; Errata; Table of Contents
1. Overview of AI: Structure; Objectives; AI systems; Machine Learning; How are ML Models created?; Data types; Learning from data; Types of ML algorithm; Unsupervised learning; Reinforcement learning; Supervised learning; Metrics for evaluating classification model; Metrics for evaluating regression model; Deep learning; Dataset preparation; Application of AI; Role of Mathematics in AI; Conclusion
2. Linear Algebra: Structure; Objectives; Linear equations; Solving system of equations analytically; Infinitely many solutions; Inconsistent system; Introducing matrix; Augmented matrix; Pseudocode forward substitution; Pseudocode back substitution; Basic matrix operations; Euclidean space; Vectors and basic properties; Representing vector; Norm; Direction; Scalar multiplication; Addition/subtraction of vectors; Distance between vectors; Dot product and orthogonality; Linear Combination of Vectors; Dimension and basis of the space; Orthogonal and orthonormal basis; Natural orthonormal basis of ℝⁿ; Subspaces; Dimension of subspace; Hyperplanes and Halfspaces; Defining vector space; Vector spaces; Normed vector space; Norm of real numbers; ℓp norm; Maximum norm; Matrix norm; Inner product; Application on real dataset; K-nearest neighbor; Representing vectors in matrix; Matrix rank; Matrix types; Identity matrix; Symmetric matrix; Skew-symmetric matrix; Invertible matrices; Properties of Matrix Inverse; Permutation matrix; Orthogonal matrix; Matrices in ML problem formulation; Feature/data matrix; One-hot encoding; Distance matrix; Gram matrix; Covariance matrix; Correlation matrix; Jacobian and Hessian matrix; Subspaces of matrix and orthogonality; Null space; Orthogonality among subspaces; Determinant; Inverse of Matrix; Orthonormalization; Applications of Orthonormalization; Linear transformation; Matrix associated with linear map; Composition of linear transformations; Eigenvalues and vectors; Eigen properties; Geometric analysis; Existence of zero eigenvalue; Eigen properties of symmetric matrices; Positive definite; Matrix decomposition; LU decomposition; By-product of Gauss-Jordan elimination; QR decomposition; Eigen decomposition; Real symmetric matrix; Singular value decomposition; Conclusion; Points to remember; Further Reading
3. Vector Calculus: Structure; Objectives; Analysis of real functions; Limit of a function; Continuous functions; Derivative of a function; Higher order derivatives; Taylor series expansion; Scalar and vector fields; Limits and continuity; Derivative of scalar fields w.r.t. vector; Directional derivative and partial derivatives; Total derivative; Geometry of gradient vector; Derivative of vector fields w.r.t. vector; Chain rule for derivatives of vector fields; Matrix form of the chain rule; Tensors; Einstein notation; Dot product of tensors; Tensor calculus; Total derivative of tensor; Mathematical optimization; Maxima, minima, and saddle point; Descent methods; Function optimization with constraints: Lagrange multipliers; Optimization with inequality constraints; The Lagrange dual function; Convex functions; Properties of convex functions; Convex optimization; Karush-Kuhn-Tucker (KKT) conditions; Conclusion; Points to remember; Further readings
4. Basic Statistics and Probability Theory: Structure; Objectives; Basic statistics; Measures of central tendency; Mean; Median; Mode; Partition values; Measures of dispersion; Range; Interquartile range; Mean deviation; Standard deviation; Coefficients of dispersion; Moments; Skewness and kurtosis; Correlation; Probability and odds; Random experiment; Events as sets; Conditional probability; Independent events; Conditional independence; Total probability theorem; Bayes theorem; Bayesian Decision Theory; Random variable; Discrete probability distributions; Bernoulli and categorical distribution; Binomial distribution; Poisson distribution; Continuous probability distributions; Cumulative Probability Distribution Function (CDF); Uniform distribution; Gaussian distribution or normal distribution; Exponential distribution; Mathematical expectation of a random variable; Joint probability distributions; Transformation of a random variable; Multivariate distributions; Multinomial distribution; Multivariate Gaussian distribution; Information theory; Entropy; Relative entropy or KL divergence; Mutual information; Decision tree; Conclusion; Points to remember; Further reading
5. Statistical Inference and Applications: Structure; Objectives; Large sample theory; Sample statistics; Sampling from known distributions; Hypothesis testing; Statistical inference; Estimator properties; Minimum Variance Unbiased (MVU) estimators; Likelihood function; Cramer-Rao inequality; Method of Maximum Likelihood Estimation (MLE); Bias-variance decomposition of estimator; Applications: Formulating ML problems as statistical inferencing; Data distribution; Classification; Naive Bayes classifier; Regression; Linear and curvilinear regression; Estimating model parameters; Iterative estimation of model parameters; Overfitting and underfitting; Bias-variance trade-off; Logistic regression; Multiclass logistic regression; Poisson regression; Interpretability of linear models; Conclusion; Points to remember; Further Reading
6. Neural Networks: Structure; Objectives; Artificial neuron: An adaptive basis function; Feed-forward neural network; Training neural network; Stochastic Gradient Descent; Computing error derivatives; Backpropagation algorithm; Challenges of training neural networks; Modifications of SGD; Momentum methods; Adaptive learning rate; Bias-variance trade-off in neural networks; Regularization of neural nets; Sensitivity of neural networks to small perturbations; Neural network architectures; Conclusion; Points to remember; Further Reading
7. Clustering: Structure; Objectives; Forming clusters; Distance and similarity; Cluster quality; Internal evaluation; Davies-Bouldin indicator; Dunn indicator; Silhouette coefficient; External evaluation; Rand index; F-measure; Fowlkes-Mallows index; Jaccard index; Clustering algorithms; Partition-based clustering; K-means; K-medoids; Density-based clustering; DBSCAN; Distribution-based clustering; Gaussian Mixture Model; Hierarchical-based clustering; Agglomerative clustering; Distance between clusters; BIRCH; Graph-based clustering; Fuzzy theory-based clustering; Fuzzy c-means; Conclusion; References
8. Dimensionality Reduction: Structure; Objectives; Reducing dimensionality; Principal Component Analysis; Loading the Iris dataset; Calculating covariance matrix; Decomposition of covariance matrix; Reducing with principal components; Variance retention; When to use PCA; Autoencoder; Iris autoencoder; t-SNE; Choosing σᵢ; PCA vs t-SNE; t-SNE on Iris dataset; Conclusion; Further reading; References
9. Computer Vision: Structure; Objectives; Digital image formation; Capture the light; Sampling and quantization; Pixels; Accessing pixels; Spatial filtering; Geometric spatial transformation; Neighbor pixel operation; Convolution properties; Separable kernels; Convolution with separable kernels; Gaussian kernel; Discrete approximation of Gaussian function; Application of Gaussian filter; Image derivative-based kernels; Laplacian kernel: Second-order derivative; Sobel kernel: First-order derivative; Non-linear filters; Learning filters; Convolutional Neural Networks; Convolution layer; Pooling layer; Spatially separable convolution; Depthwise separable convolution; Depthwise convolution; Pointwise convolution; Optimization; Upsampling: Transposed convolution; Development of CNN; AlexNet; TensorFlow Model; Counting trainable parameters; Inception; VGG; ResNet; Xception; Application of CNN models; Image classification; Object detection; R-CNN: Regions with CNN features; YOLO: You Only Look Once; Image segmentation; U-Net; Summary; Further reading; Points to remember; References
10. Sequence Learning Models: Structure; Objectives; Time series models; Decomposition of time series; Differencing; Time series forecasting; OLS model; Exponential smoothing; Autoregressive Integrated Moving Average; Probabilistic sequence models; Markov chain; Hidden Markov model; Recurrent neural networks; Training RNN; Long Short-Term Memory (LSTM); Gated Recurrent Unit (GRU); Stacked LSTM/RNN; Generative models for sequence; Handwriting generation; Mixture Density Network; Sequence classification; Bi-directional RNN; Sequence to Sequence; Connectionist Temporal Classification; Training CTC network: Maximum likelihood; DP formulation for CTC loss; Inferencing from CTC network; Encoder-Decoder architecture; Attention mechanism; Key-value-query formulation of attention; Language translation model; Speech recognition model; Self-attention and transformers; Computing self-attention; Transformer architecture; Conclusion; Points to remember; Further Reading
11. Natural Language Processing: Structure; Objectives; Natural language; Syntactic structure of language; Parts of Speech (POS); Phrases; Clause; Sentence; Document and text corpus; Semantic structure of language; WordNet; Text preprocessing; Models for text; Bag of Words (BoW) model; Vector Space Model; Count-based or Boolean; Term Frequency (TF)-Inverse Document Frequency (IDF); Latent Semantic Indexing (LSI) model; Probabilistic models of text; Topic models; Probabilistic generative models: Latent Dirichlet allocation; Neural language models; Contextual models; ELMo model; BERT; Position encoding; Pre-training BERT; Input representation for pre-training tasks of BERT; WordPiece tokenization; ERNIE; Generative Pre-Training by OpenAI; Conclusion; Points to remember; Further reading
12. Generative Models: Structure; Objectives; A simple generative model; Variational Autoencoders (VAE); Generative Adversarial Nets; Equilibrium state for GAN training; Implementing GAN; GAN training challenges; Solutions for mitigating GAN training issues; Wasserstein GAN (WGAN); Some properties of EM distance; WGAN training; Ensuring Lipschitz constraint in discriminator; Conditional GAN (cGAN); Cycle GAN (CycleGAN); Autoregressive generative models; Applying generative models; Conclusion; Points to remember; Further Reading
Index