# Data Scientist Pocket Guide: Over 600 Concepts, Terminologies, and Processes of Machine Learning and Deep Learning Assembled Together

- Length: 308 pages
- Edition: 1
- Language: English
- Publisher: BPB Publications
- Publication Date: 2021-06-28
- ISBN-10: 9390684978
- ISBN-13: 9789390684977
- Sales Rank: #1396311

**Discover one of the most complete dictionaries in data science.**

**Key Features**

- Simplified explanations of complex concepts, terminology, and techniques.
- Combined glossary of machine learning, mathematics, and statistics.
- Alphabetically arranged A-Z keywords with brief descriptions.

**Description**

This pocket guide is a must for all data professionals in their day-to-day work. It brings together a comprehensive set of glossaries covering machine learning, deep learning, mathematics, and statistics. The glossaries span concepts, processes, algorithms, data structures, techniques, and more. Each term is explained in the simplest words possible. This pocket guide will help you stay up to date with the most essential terms and references used in data analysis and machine learning.

**What you will learn**

- Get clarity on the concepts, processes, and algorithms used in data science operations.
- Stay technically sharp and confident during data science meetings.
- Strengthen your knowledge in the fields of big data and business intelligence.

**Who this book is for**

This book is for data professionals, data scientists, students, and newcomers to the field who wish to stay on top of the industry jargon and terminology used in data science.

**About the Author**

Mohamed Sabri, the author of this book, graduated in Mathematics and Economics from the University of Ottawa. He is a consultant in Data Science and MLOps, working with North American organizations in the banking, retail, and gaming sectors. With a strong passion for data science, he is involved in a range of innovative AI projects in which he delivers end-to-end solutions.

His professional strengths are his communication skills and his expertise in making the technology behind complex projects accessible. Building on his commitment to team cohesion, he has successfully executed several AI projects.

In his book, “Data Scientist Pocket Guide”, he shares his secrets for becoming a benevolent data scientist.

His passion for connecting and networking with people and professionals is channelled through this book, which aims to reach data scientists and make their everyday work easier and more enriching.

Blog link: https://www.datalyticsbusiness.ca/

LinkedIn Profile: https://www.linkedin.com/in/mohamed-sabri/

**Table of Contents**

Cover Page, Title Page, Copyright Page, Dedication Page, About the Author, About the Reviewer, Acknowledgements, Preface, Errata, Table of Contents

1. FAQ
    - How to fine tune a machine learning algorithm?
    - How to build deep neural network architecture?
    - How to train a machine learning algorithm faster?
    - Why do we normalize the input data in deep neural network?
    - When can we consider that we did a good job in a machine learning project?
    - When should we use deep learning instead of the traditional machine learning models?
    - How much time does it take to become a good data scientist?
    - How to evaluate the performance of a model?
    - In case of a large dataset, should I sample my data or use distributed computing?
    - How much time should I spend in data transformation?
    - How to select the right machine learning algorithm?
    - Should I learn R or Python?
    - What’s the trade-off between bias and variance?
    - What is the difference between supervised and unsupervised machine learning?
    - What is the difference between L1 and L2 regularizations?
    - What’s the difference between type I and type II error?
    - What’s the difference between probability and likelihood?
    - What’s the difference between a generative and discriminative model?
    - Which is more important model accuracy, or model performance?
    - How would you handle an imbalanced dataset?
    - How do you ensure that you’re not overfitting with a model?
    - What’s the “kernel trick” and how is it useful?
    - How do you handle missing data in a dataset?
    - What are the origins of machine learning?
    - What is the difference between a classifier and a model?
    - What is the difference between a parametric learning algorithm and a non-parametric learning algorithm?
    - What is the difference between a cost function and a loss function in machine learning?
    - What is the difference between covariance and correlation?
    - Why did it take so long for deep networks to be invented?
    - What are some good books/papers for learning machine learning?
    - What are the advantages of semi-supervised learning over supervised and unsupervised learning?
    - When should I apply data normalization/standardization?
    - How do you deal with a machine learning problem with a large number of features?
    - When should one use median as opposed to the mean or average?
    - Why is “Naive” Bayes called naive?
2. A: A/B testing, Accuracy, Action, Activation function, Active learning, AdaBoost, AdaDelta, AdaGrad, Adam, Adaptive learning rate, Affine layer, Agent, Agglomerative clustering, AlexNet, Algorithm, Anaconda, Anchor box, Annotator, ANOVA, Apache Spark, ARIMA, Artificial general intelligence (AGI), Artificial intelligence, Artificial narrow intelligence (ANI), Artificial super intelligence (ASI), Association learning, Association rules, Attention mechanism, Attribute, Area under the ROC Curve (AUC), Autocorrelation, Autoencoder, Automatic summarization, Automation bias, Autoregression, Average pooling, Average precision
3. B: Backpropagation, Backpropagation through time (BPTT), Bag of words, Bagging, Bar chart, Base learner, Baseline, Batch, Batch gradient descent, Batch normalization, Bayes’ theorem, Bayesian inference, Bayesian statistics, Bellman equation, Bernoulli distribution, Bias, Bias-variance trade-off, Bidirectional Recurrent Neural Network, Big Data, Big O notation, Binarization, Binary classification, Binary variables, Binning, Binomial distribution, Black box model, BLEU score, Boosting, Bootstrapping, Bottleneck layer, Bounding box, Box plot, Bucketing, Business analytics, Business intelligence
4. C: Caffe, Calibration, Candidate generation, Candidate sampling, Categorical cross-entropy, Categorical variable, Centroid, Centroid-based algorithm, Chain rule, Chainer, Channel, Checkpoints, Chi-square test, Chi-squared distribution, CIFAR, Classification, Classification threshold, Classifier, Clipping, Cloud, Clustering, CNN, CNTK, Co-adaptation, COCO, Coefficient of determination, Cohen’s kappa, Collaborative filtering, Complexity, Computer vision, Concordant-discordant ratio, Confidence interval, Confusion matrix, Connectivity-based algorithm, Continuous learning, Continuous variable, Contrastive divergence, Convenience sampling, Convergence, Convex function, Convolution, Convolutional layer, Convolutional neural network, Correlation, Cosine similarity, Cost function, Covariance, Coverage bias, CPU, Cross-entropy, Cross validation, CUDA
5. D: Dashboard, Data analysis, Data augmentation, Data engineering, Data mining, Data parallelism, Data preparation, Data science, Data transformation, Data wrangling, Database, Databricks, DataFrame, Dataset, Davies-Bouldin index, DBSCAN, Decile, Decision boundary, Decision tree, Deduction, Deep belief network, Deep dream, Deep learning, Deep Q-network, Deeplearning4j, Degree of freedom, Dense feature, Dense layer, Density-based algorithm, Dependent variable, Deployment as API, Deployment in batch, Depth, Depth-wise separable convolutional neural network, Descriptive statistics, Device, Dimensionality reduction, Discounted cumulative gain, Discrete variable, Discriminative model, Discriminator, Divisive clustering, Downpour stochastic gradient descent, Downsampling, Dplyr, DropConnect, Dropout regularization, Dummy variable, Dunn index, Dynamic model, Dynamic programming
6. E: Early stopping, EDA, ELU, Embedding space, Embeddings, Ensemble learning algorithm, Ensemble models, Entropy, Episode, Epoch, Epsilon greedy policy, ETL, Euclidean distance, Evaluation metric, Example, Experimentation, Expert system, Exploding gradient problem, Exploration vs. exploitation, Exponential family distribution, Exponential loss, Exponential smoothing, Extrapolation, Extreme values
7. F: F1 Score, Face recognition, Facet, Factor analysis, False negative, False positive, Feature, Feature cross, Feature engineering, Feature hashing, Feature learning, Feature reduction, Feature selection, Federated learning, Feedback loop, Feedforward, Few-shot learning, Fine-tuning, Flume, Focal loss, Forget gate, Frechet inception distance, Frequentist statistics, F-score, Full softmax, Fully connected layer
8. G: Gain and Lift Charts, Gated Recurrent Unit (GRU), Gaussian distribution, General AI, Generalization, Generalization curve, Generalized Linear Model (GLM), Generative adversarial neural network (GAN), Generative classification, Generator, Genetic algorithm, Ggplot2, Gini coefficient, GloVe, Go, Goodness of fit, GoogleNet, GPU, Gradient accumulation, Gradient descent, Greedy policy, Grid search, Ground truth
9. H: Hadoop, Hashing, Heuristic, Hidden layer, Hidden Markov model, Hierarchical clustering, Highway layer, Highway network, Hinge loss, Histogram, Hive, Holdout sample, Holt-Winters forecasting, Huber loss, Hyperparameter, Hyperparameter tuning, Hyperplane, Hypothesis
10. I: International Conference on Machine Learning (ICML), Integrated Development Environment (IDE), ImageNet Large Scale Visual Recognition Challenge (ILSVRC), Image recognition, ImageNet, Imbalanced dataset, Implicit bias, Imputation, Inception, Inception module, Independent and identically distributed (i.i.d.), Independent Component Analysis (ICA), Induction, Inferential statistics, Input gate, Input layer, Instance, Instance-based learning, Interpretability, Intersection over Union (IoU), Interquartile Range (IQR), Item matrix, Iteration
11. J: Jacobian, Julia, Jupyter notebook
12. K: Keras, Kernel, Kernel support vector machine, KL divergence, K-means, K-median, K-nearest neighbors (kNN), Kolmogorov Smirnov chart, Kurtosis
13. L: L1 Loss, L1 regularization, L2 loss, L2 regularization, Labeled data, Lasso regression, Latent variable, Layer, Leaky ReLU, Learning rate, Least squares regression, Line chart, Linear activation function, Linear discriminant analysis, Linear model, Linear regression, Log loss, Log-Cosh loss, Logistic regression, Logits, Log-odds, Long Short-Term Memory (LSTM), Loss curve, Loss function, Loss surface
14. M: Machine Learning, Machine translation, Magnet loss, Mahout, Majority class, Manhattan distance, MapReduce, Market basket analysis, Market mix modeling, Markov chain, Markov decision process, Markov property, Matplotlib, Matrix factorization, Max pooling, Maximum likelihood estimation, Mean, Mean absolute error, Mean reciprocal rank, Mean squared error, Median, Memory-based learning, Mini-batch, Mini-batch gradient descent, Minimax loss, Minority class, Management Information System (MIS), Machine learning (ML), ML-as-a-service (MLaaS), MLOps, MNIST, Mode, Model capacity, Model parallelism, Model selection, Model, Momentum, Monte Carlo simulation, Moving average, Multi-agent reinforcement learning, Multi-class classification, Multilayer perceptron, Multinomial classification, Multivariate analysis, Multivariate regression, MXNet
15. N: Naive Bayes, NaN, Nash equilibrium, Natural language generation, Natural language processing, Natural Language Understanding (NLU), Negative class, Negative log likelihood, Nesterov accelerated gradient, Neural Machine Translation (NMT), Neural network, Neural Turing machine (NTM), Neuron, N-gram, No free lunch theorem, Node, Noise, Noise contrastive estimation, Nominal variable, Nonlinear transform function, Normal distribution, Normalization, Normalized discounted cumulative gain, NoSQL, Notebook, Null, Null accuracy, Numerical data, Numpy, NVIDIA
16. O: Object Detection, Objective, Objective function, One hot encoding, One shot learning, One vs all, Online inference, Online learning, Oozie, OpenCV, Optimizer, Ordinal variable, Outlier, Output gate, Output layer, Overfitting
17. P: Pandas, Parallel processing, Parameter update, Parameters, Part of speech tagging, Partial derivative, Participation bias, Partitioning, Pattern recognition, Peak signal-to-noise ratio, Perceptron, Performance, Perplexity, Pie chart, Pig, Pipeline, Poisson distribution, Polynomial regression, Pooling, Population, Positive class, Post-processing, Precision and recall, Prediction, Predictive model, Predictor variable, Pre-processing, Pre-trained model, Principal Component Analysis (PCA), Prior belief, Probability density, Proxy label, P-value, Python, PyTorch
18. Q: Q-function, Q-learning, Quadratic loss, Quantile, Quantile loss, Quartile, Question answering (NLP)
19. R: R, Radial basis function network, Random-Access Memory (RAM), Random forest, Random initialization, Random policy, Random search, Range, Rank, Rater, Recommendation engine, Reconstruction entropy, Rectified linear unit, Recurrent neural network, Recursive neural network, Regression, Regression spline, Regularization, Reinforcement learning, Relationship extraction, Relative entropy, Rectified linear unit (ReLU), Replay buffer, Representation, Representation learning, Residual, ResNet, Response variable, Restricted Boltzmann Machine (RBM), Reward, Ridge regression, Ridge regularization, Risk, Root Mean Square Propagation (RMSProp), Recurrent Neural Network (RNN), Robotic Process Automation (RPA), ROC-AUC, Root Mean Squared Error (RMSE), Root Mean Squared Logarithmic Error (RMSLE), Rotational invariance, R-squared/Adjusted R-squared
20. S: Sampling, Sampling bias, SAS, Scala, Scalar, Scaling, Scikit-learn, Scoring, Seasonality, Selection bias, Self-supervised learning, Semi-supervised learning, Sensitivity, Sentiment analysis, Sequence to sequence, Serialization, Shape of a tensor, Siamese neural network, Sigmoid function, Signal processing, Silhouette coefficient, Similarity learning, Single shot object detector, Singularity, Skewness, Skipgram, Smooth mean absolute error, SMOTE, Softmax, Sparse feature, Sparse representation, Sparse vector, Sparsity, Spatial pooling, Spatial-temporal reasoning, Specificity, Speech recognition, Speech segmentation, Splitting data, SPSS, Structured Query Language (SQL), Squared hinge loss, Squared loss, Stacking, Standard deviation, Standard error, Standardization, Stata, State, State-action value function, Static model, Stationary, Statistical inference, Statistics, STD decomposition, Stochastic gradient descent, Stratified sampling, Stride, Strong AI, Strong classifier, Structural SIMilarity (SSIM), Structured data, Subsampling, Supervised learning, Support vector machine (SVM), SVM, Synthetic feature
21. T: Tanh, Target variable, T-distribution, Tensor, Tensorflow, Test set, Text-to-speech, Theano, Time series analysis, Tokenization, Topic modeling, Torch, Tensor Processing Unit (TPU), Training, Training set, Translational invariance, Transfer learning, Transformer, Trend analysis, Triplet loss, True negative, True positive, Truncated SVD, T-test, Turing test, Type I error, Type II error
22. U: Underfitting, Univariate analysis, Universal function approximation theorem, Unlabeled data, Unstructured data, Unsupervised learning, Upweighting, User matrix
23. V: Validation set, Vanishing gradient problem, Variance, Variational autoencoder, VC dimension, Vector, VGG
24. W: Wasserstein loss, Watson studio, Weak classifier, Weight decay, Weight sharing, Weighted alternating least squares, Weighting, Width, Word embedding, Word segmentation, Word2vec
25. X: Xavier initialization, Xception, XGboost
26. Y: You only look once (YOLO)
27. Z: Zero shot learning, Z-test

Index
