Data Scientist Pocket Guide: Over 600 Concepts, Terminologies, and Processes of Machine Learning and Deep Learning Assembled Together

by Mohamed Sabri

Length: 308 pages
Edition: 1
Language: English
Publisher: BPB Publications
Publication Date: 2021-06-28
ISBN-10: 9390684978
ISBN-13: 9789390684977
Sales Rank: #1396311

1 rating


Discover one of the most complete dictionaries in data science.

Key Features

Simplified explanations of complex concepts, terminology, and techniques.
Combined glossary of machine learning, mathematics, and statistics.
Alphabetically arranged A-Z keywords with brief descriptions.

Description

This pocket guide is a must for all data professionals in their day-to-day work. It brings together a comprehensive set of glossaries covering machine learning, deep learning, mathematics, and statistics. The entries span concepts, processes, algorithms, data structures, techniques, and more, each explained in the simplest words possible. This pocket guide will help you stay up to date with the most essential terms and references used in data analysis and machine learning.

What you will learn

Get clarity on the concepts, processes, and algorithms used in data science operations.
Stay technically sharp and confident during data science meetings.
Strengthen your knowledge of big data and business intelligence.

Who this book is for

This book is for data professionals, data scientists, students, and newcomers to the field who wish to stay on top of the jargon and terminology used in data science.

About the Author

Mohamed Sabri, the author of this book, graduated in Mathematics and Economics from the University of Ottawa. He is a consultant in data science and MLOps who works with North American organizations in the banking, retail, and gaming sectors. Passionate about data science, he contributes to the field through a range of innovative AI projects in which he delivers end-to-end solutions.

He brings to his work strong communication skills and expertise in popularising technology for complex projects. Building on his commitment to team cohesion, he has successfully executed several AI projects.

In his book, “Data Scientist Pocket Guide”, he shares his secrets to becoming a benevolent data scientist.

His passion for connecting and networking with people and professionals is channelled through this book, which aims to reach data scientists and make their everyday work easier and more enriching.

Blog link: https://www.datalyticsbusiness.ca/

LinkedIn Profile: https://www.linkedin.com/in/mohamed-sabri/

Cover Page
Title Page
Copyright Page
Dedication Page
About the Author
About the Reviewer
Acknowledgements
Preface
Errata
Table of Contents
1. FAQ
    How to fine tune a machine learning algorithm?
    How to build a deep neural network architecture?
    How to train a machine learning algorithm faster?
    Why do we normalize the input data in a deep neural network?
    When can we consider that we did a good job in a machine learning project?
    When should we use deep learning instead of the traditional machine learning models?
    How much time does it take to become a good data scientist?
    How to evaluate the performance of a model?
    In case of a large dataset, should I sample my data or use distributed computing?
    How much time should I spend in data transformation?
    How to select the right machine learning algorithm?
    Should I learn R or Python?
    What’s the trade-off between bias and variance?
    What is the difference between supervised and unsupervised machine learning?
    What is the difference between L1 and L2 regularizations?
    What’s the difference between type I and type II error?
    What’s the difference between probability and likelihood?
    What’s the difference between a generative and discriminative model?
    Which is more important: model accuracy or model performance?
    How would you handle an imbalanced dataset?
    How do you ensure that you’re not overfitting with a model?
    What’s the “kernel trick” and how is it useful?
    How do you handle missing data in a dataset?
    What are the origins of machine learning?
    What is the difference between a classifier and a model?
    What is the difference between a parametric learning algorithm and a non-parametric learning algorithm?
    What is the difference between a cost function and a loss function in machine learning?
    What is the difference between covariance and correlation?
    Why did it take so long for deep networks to be invented?
    What are some good books/papers for learning machine learning?
    What are the advantages of semi-supervised learning over supervised and unsupervised learning?
    When should I apply data normalization/standardization?
    How do you deal with a machine learning problem with a large number of features?
    When should one use median as opposed to the mean or average?
    Why is “Naive” Bayes called naive?
2. A
    A/B testing
    Accuracy
    Action
    Activation function
    Active learning
    AdaBoost
    AdaDelta
    AdaGrad
    Adam
    Adaptive learning rate
    Affine layer
    Agent
    Agglomerative clustering
    AlexNet
    Algorithm
    Anaconda
    Anchor box
    Annotator
    ANOVA
    Apache Spark
    ARIMA
    Artificial general intelligence (AGI)
    Artificial intelligence
    Artificial narrow intelligence (ANI)
    Artificial super intelligence (ASI)
    Association learning
    Association rules
    Attention mechanism
    Attribute
    Area under the ROC Curve (AUC)
    Autocorrelation
    Autoencoder
    Automatic summarization
    Automation bias
    Autoregression
    Average pooling
    Average precision
3. B
    Backpropagation
    Backpropagation through time (BPTT)
    Bag of words
    Bagging
    Bar chart
    Base learner
    Baseline
    Batch
    Batch gradient descent
    Batch normalization
    Bayes’ theorem
    Bayesian inference
    Bayesian statistics
    Bellman equation
    Bernoulli distribution
    Bias
    Bias-variance trade-off
    Bidirectional Recurrent Neural Network
    Big Data
    Big O notation
    Binarization
    Binary classification
    Binary variables
    Binning
    Binomial distribution
    Black box model
    BLEU score
    Boosting
    Bootstrapping
    Bottleneck layer
    Bounding box
    Box plot
    Bucketing
    Business analytics
    Business intelligence
4. C
    Caffe
    Calibration
    Candidate generation
    Candidate sampling
    Categorical cross-entropy
    Categorical variable
    Centroid
    Centroid-based algorithm
    Chain rule
    Chainer
    Channel
    Checkpoints
    Chi-square test
    Chi-squared distribution
    CIFAR
    Classification
    Classification threshold
    Classifier
    Clipping
    Cloud
    Clustering
    CNN
    CNTK
    Co-adaptation
    COCO
    Coefficient of determination
    Cohen’s kappa
    Collaborative filtering
    Complexity
    Computer vision
    Concordant-discordant ratio
    Confidence interval
    Confusion matrix
    Connectivity-based algorithm
    Continuous learning
    Continuous variable
    Contrastive divergence
    Convenience sampling
    Convergence
    Convex function
    Convolution
    Convolutional layer
    Convolutional neural network
    Correlation
    Cosine similarity
    Cost function
    Covariance
    Coverage bias
    CPU
    Cross-entropy
    Cross validation
    CUDA
5. D
    Dashboard
    Data analysis
    Data augmentation
    Data engineering
    Data mining
    Data parallelism
    Data preparation
    Data science
    Data transformation
    Data wrangling
    Database
    Databricks
    DataFrame
    Dataset
    Davies-Bouldin index
    DBSCAN
    Decile
    Decision boundary
    Decision tree
    Deduction
    Deep belief network
    Deep dream
    Deep learning
    Deep Q-network
    Deeplearning4j
    Degree of freedom
    Dense feature
    Dense layer
    Density-based algorithm
    Dependent variable
    Deployment as API
    Deployment in batch
    Depth
    Depth-wise separable convolutional neural network
    Descriptive statistics
    Device
    Dimensionality reduction
    Discounted cumulative gain
    Discrete variable
    Discriminative model
    Discriminator
    Divisive clustering
    Downpour stochastic gradient descent
    Downsampling
    Dplyr
    DropConnect
    Dropout regularization
    Dummy variable
    Dunn index
    Dynamic model
    Dynamic programming
6. E
    Early stopping
    EDA
    ELU
    Embedding space
    Embeddings
    Ensemble learning algorithm
    Ensemble models
    Entropy
    Episode
    Epoch
    Epsilon greedy policy
    ETL
    Euclidean distance
    Evaluation metric
    Example
    Experimentation
    Expert system
    Exploding gradient problem
    Exploration vs. exploitation
    Exponential family distribution
    Exponential loss
    Exponential smoothing
    Extrapolation
    Extreme values
7. F
    F1 Score
    Face recognition
    Facet
    Factor analysis
    False negative
    False positive
    Feature
    Feature cross
    Feature engineering
    Feature hashing
    Feature learning
    Feature reduction
    Feature selection
    Federated learning
    Feedback loop
    Feedforward
    Few-shot learning
    Fine-tuning
    Flume
    Focal loss
    Forget gate
    Frechet inception distance
    Frequentist statistics
    F-score
    Full softmax
    Fully connected layer
8. G
    Gain and Lift Charts
    Gated Recurrent Unit (GRU)
    Gaussian distribution
    General AI
    Generalization
    Generalization curve
    Generalized Linear Model (GLM)
    Generative adversarial neural network (GAN)
    Generative classification
    Generator
    Genetic algorithm
    Ggplot2
    Gini coefficient
    GloVe
    Go
    Goodness of fit
    GoogLeNet
    GPU
    Gradient accumulation
    Gradient descent
    Greedy policy
    Grid search
    Ground truth
9. H
    Hadoop
    Hashing
    Heuristic
    Hidden layer
    Hidden Markov model
    Hierarchical clustering
    Highway layer
    Highway network
    Hinge loss
    Histogram
    Hive
    Holdout sample
    Holt-Winters forecasting
    Huber loss
    Hyperparameter
    Hyperparameter tuning
    Hyperplane
    Hypothesis
10. I
    International Conference on Machine Learning (ICML)
    Integrated Development Environment (IDE)
    ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
    Image recognition
    ImageNet
    Imbalanced dataset
    Implicit bias
    Imputation
    Inception
    Inception module
    Independent and identically distributed (i.i.d.)
    Independent Component Analysis (ICA)
    Induction
    Inferential statistics
    Input gate
    Input layer
    Instance
    Instance-based learning
    Interpretability
    Intersection over Union (IoU)
    Interquartile Range (IQR)
    Item matrix
    Iteration
11. J
    Jacobian
    Julia
    Jupyter notebook
12. K
    Keras
    Kernel
    Kernel support vector machine
    KL divergence
    K-means
    K-median
    K-nearest neighbors (kNN)
    Kolmogorov Smirnov chart
    Kurtosis
13. L
    L1 Loss
    L1 regularization
    L2 loss
    L2 regularization
    Labeled data
    Lasso regression
    Latent variable
    Layer
    Leaky ReLU
    Learning rate
    Least squares regression
    Line chart
    Linear activation function
    Linear discriminant analysis
    Linear model
    Linear regression
    Log loss
    Log-Cosh loss
    Logistic regression
    Logits
    Log-odds
    Long Short-Term Memory (LSTM)
    Loss curve
    Loss function
    Loss surface
14. M
    Machine Learning
    Machine translation
    Magnet loss
    Mahout
    Majority class
    Manhattan distance
    MapReduce
    Market basket analysis
    Market mix modeling
    Markov chain
    Markov decision process
    Markov property
    Matplotlib
    Matrix factorization
    Max pooling
    Maximum likelihood estimation
    Mean
    Mean absolute error
    Mean reciprocal rank
    Mean squared error
    Median
    Memory-based learning
    Mini-batch
    Mini-batch gradient descent
    Minimax loss
    Minority class
    Management Information System (MIS)
    Machine learning (ML)
    ML-as-a-service (MLaaS)
    MLOps
    MNIST
    Mode
    Model capacity
    Model parallelism
    Model selection
    Model
    Momentum
    Monte Carlo simulation
    Moving average
    Multi-agent reinforcement learning
    Multi-class classification
    Multilayer perceptron
    Multinomial classification
    Multivariate analysis
    Multivariate regression
    MXNet
15. N
    Naive Bayes
    NaN
    Nash equilibrium
    Natural language generation
    Natural language processing
    Natural Language Understanding (NLU)
    Negative class
    Negative log likelihood
    Nesterov accelerated gradient
    Neural Machine Translation (NMT)
    Neural network
    Neural Turing machine (NTM)
    Neuron
    N-gram
    No free lunch theorem
    Node
    Noise
    Noise contrastive estimation
    Nominal variable
    Nonlinear transform function
    Normal distribution
    Normalization
    Normalized discounted cumulative gain
    NoSQL
    Notebook
    Null
    Null accuracy
    Numerical data
    NumPy
    NVIDIA
16. O
    Object Detection
    Objective
    Objective function
    One hot encoding
    One shot learning
    One vs all
    Online inference
    Online learning
    Oozie
    OpenCV
    Optimizer
    Ordinal variable
    Outlier
    Output gate
    Output layer
    Overfitting
17. P
    Pandas
    Parallel processing
    Parameter update
    Parameters
    Part of speech tagging
    Partial derivative
    Participation bias
    Partitioning
    Pattern recognition
    Peak signal-to-noise ratio
    Perceptron
    Performance
    Perplexity
    Pie chart
    Pig
    Pipeline
    Poisson distribution
    Polynomial regression
    Pooling
    Population
    Positive class
    Post-processing
    Precision and recall
    Prediction
    Predictive model
    Predictor variable
    Pre-processing
    Pre-trained model
    Principal Component Analysis (PCA)
    Prior belief
    Probability density
    Proxy label
    P-value
    Python
    PyTorch
18. Q
    Q-function
    Q-learning
    Quadratic loss
    Quantile
    Quantile loss
    Quartile
    Question answering (NLP)
19. R
    R
    Radial basis function network
    Random-Access Memory (RAM)
    Random forest
    Random initialization
    Random policy
    Random search
    Range
    Rank
    Rater
    Recommendation engine
    Reconstruction entropy
    Rectified linear unit
    Recurrent neural network
    Recursive neural network
    Regression
    Regression spline
    Regularization
    Reinforcement learning
    Relationship extraction
    Relative entropy
    Rectified linear unit (ReLU)
    Replay buffer
    Representation
    Representation learning
    Residual
    ResNet
    Response variable
    Restricted Boltzmann Machine (RBM)
    Reward
    Ridge regression
    Ridge regularization
    Risk
    Root Mean Square Propagation (RMSProp)
    Recurrent Neural Network (RNN)
    Robotic Process Automation (RPA)
    ROC-AUC
    Root Mean Squared Error (RMSE)
    Root Mean Squared Logarithmic Error (RMSLE)
    Rotational invariance
    R-squared/Adjusted R-squared
20. S
    Sampling
    Sampling bias
    SAS
    Scala
    Scalar
    Scaling
    Scikit-learn
    Scoring
    Seasonality
    Selection bias
    Self-supervised learning
    Semi-supervised learning
    Sensitivity
    Sentiment analysis
    Sequence to sequence
    Serialization
    Shape of a tensor
    Siamese neural network
    Sigmoid function
    Signal processing
    Silhouette coefficient
    Similarity learning
    Single shot object detector
    Singularity
    Skewness
    Skipgram
    Smooth mean absolute error
    SMOTE
    Softmax
    Sparse feature
    Sparse representation
    Sparse vector
    Sparsity
    Spatial pooling
    Spatial-temporal reasoning
    Specificity
    Speech recognition
    Speech segmentation
    Splitting data
    SPSS
    Structured Query Language (SQL)
    Squared hinge loss
    Squared loss
    Stacking
    Standard deviation
    Standard error
    Standardization
    Stata
    State
    State-action value function
    Static model
    Stationary
    Statistical inference
    Statistics
    STD decomposition
    Stochastic gradient descent
    Stratified sampling
    Stride
    Strong AI
    Strong classifier
    Structural SIMilarity (SSIM)
    Structured data
    Subsampling
    Supervised learning
    Support vector machine (SVM)
    SVM
    Synthetic feature
21. T
    Tanh
    Target variable
    T-distribution
    Tensor
    TensorFlow
    Test set
    Text-to-speech
    Theano
    Time series analysis
    Tokenization
    Topic modeling
    Torch
    Tensor Processing Unit (TPU)
    Training
    Training set
    Translational invariance
    Transfer learning
    Transformer
    Trend analysis
    Triplet loss
    True negative
    True positive
    Truncated SVD
    T-test
    Turing test
    Type I error
    Type II error
22. U
    Underfitting
    Univariate analysis
    Universal function approximation theorem
    Unlabeled data
    Unstructured data
    Unsupervised learning
    Upweighting
    User matrix
23. V
    Validation set
    Vanishing gradient problem
    Variance
    Variational autoencoder
    VC dimension
    Vector
    VGG
24. W
    Wasserstein loss
    Watson Studio
    Weak classifier
    Weight decay
    Weight sharing
    Weighted alternating least squares
    Weighting
    Width
    Word embedding
    Word segmentation
    Word2vec
25. X
    Xavier initialization
    Xception
    XGBoost
26. Y
    You only look once (YOLO)
27. Z
    Zero shot learning
    Z-test
Index
