# Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python

- Length: 770 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-02-25
- ISBN-10: 1801819319
- ISBN-13: 9781801819312
- Sales Rank: #10516

**This book, part of the bestselling and widely acclaimed Python Machine Learning series, is a comprehensive guide to machine learning and deep learning using PyTorch’s simple-to-code framework.**

#### Key Features

- Learn applied machine learning with a solid foundation in theory
- Clear, intuitive explanations take you deep into the theory and practice of Python machine learning
- Fully updated and expanded to cover PyTorch, transformers, XGBoost, graph neural networks, and best practices

#### Book Description

Machine Learning with PyTorch and Scikit-Learn is a comprehensive guide to machine learning and deep learning with PyTorch. It acts as both a step-by-step tutorial and a reference you’ll keep coming back to as you build your machine learning systems.

Packed with clear explanations, visualizations, and examples, the book covers all the essential machine learning techniques in depth. While some books teach you only to follow instructions, this book teaches the underlying principles so that you can build models and applications for yourself.

Why PyTorch?

PyTorch is the Pythonic way to learn machine learning, making it easier to learn and simpler to code with. This book explains the essential parts of PyTorch and how to create models using popular libraries, such as PyTorch Lightning and PyTorch Geometric.

You will also learn about generative adversarial networks (GANs) for generating new data and about training intelligent agents with reinforcement learning. Finally, this new edition is expanded to cover the latest trends in deep learning, including graph neural networks and large-scale transformers used for natural language processing (NLP).

This PyTorch book is your companion to machine learning with Python, whether you’re a Python developer new to machine learning or want to deepen your knowledge of the latest developments.

#### What you will learn

- Explore frameworks, models, and techniques for machines to ‘learn’ from data
- Use scikit-learn for machine learning and PyTorch for deep learning
- Train machine learning classifiers on images, text, and more
- Build and train neural networks, transformers, and boosting algorithms
- Discover best practices for evaluating and tuning models
- Predict continuous target outcomes using regression analysis
- Dig deeper into textual and social media data using sentiment analysis

#### Who this book is for

If you know some Python and you want to use machine learning and deep learning, pick up this book. Whether you want to start from scratch or extend your machine learning knowledge, this is an essential resource.

This book is written for developers and data scientists who want to create practical machine learning and deep learning code with Python and PyTorch. It is ideal for anyone who wants to teach computers how to learn from data.

Working knowledge of the Python programming language, along with a good understanding of calculus and linear algebra, is a must.

#### Table of Contents

- Preface
  - Who this book is for
  - What this book covers
  - To get the most out of this book
  - Get in touch
  - Share your thoughts
- Giving Computers the Ability to Learn from Data
  - Building intelligent machines to transform data into knowledge
  - The three different types of machine learning
  - Making predictions about the future with supervised learning
  - Classification for predicting class labels
  - Regression for predicting continuous outcomes
  - Solving interactive problems with reinforcement learning
  - Discovering hidden structures with unsupervised learning
  - Finding subgroups with clustering
  - Dimensionality reduction for data compression
  - Introduction to the basic terminology and notations
  - Notation and conventions used in this book
  - Machine learning terminology
  - A roadmap for building machine learning systems
  - Preprocessing – getting data into shape
  - Training and selecting a predictive model
  - Evaluating models and predicting unseen data instances
  - Using Python for machine learning
  - Installing Python and packages from the Python Package Index
  - Using the Anaconda Python distribution and package manager
  - Packages for scientific computing, data science, and machine learning
  - Summary
- Training Simple Machine Learning Algorithms for Classification
  - Artificial neurons – a brief glimpse into the early history of machine learning
  - The formal definition of an artificial neuron
  - The perceptron learning rule
  - Implementing a perceptron learning algorithm in Python
  - An object-oriented perceptron API
  - Training a perceptron model on the Iris dataset
  - Adaptive linear neurons and the convergence of learning
  - Minimizing loss functions with gradient descent
  - Implementing Adaline in Python
  - Improving gradient descent through feature scaling
  - Large-scale machine learning and stochastic gradient descent
  - Summary
- A Tour of Machine Learning Classifiers Using Scikit-Learn
  - Choosing a classification algorithm
  - First steps with scikit-learn – training a perceptron
  - Modeling class probabilities via logistic regression
  - Logistic regression and conditional probabilities
  - Learning the model weights via the logistic loss function
  - Converting an Adaline implementation into an algorithm for logistic regression
  - Training a logistic regression model with scikit-learn
  - Tackling overfitting via regularization
  - Maximum margin classification with support vector machines
  - Maximum margin intuition
  - Dealing with a nonlinearly separable case using slack variables
  - Alternative implementations in scikit-learn
  - Solving nonlinear problems using a kernel SVM
  - Kernel methods for linearly inseparable data
  - Using the kernel trick to find separating hyperplanes in a high-dimensional space
  - Decision tree learning
  - Maximizing IG – getting the most bang for your buck
  - Building a decision tree
  - Combining multiple decision trees via random forests
  - K-nearest neighbors – a lazy learning algorithm
  - Summary
- Building Good Training Datasets – Data Preprocessing
  - Dealing with missing data
  - Identifying missing values in tabular data
  - Eliminating training examples or features with missing values
  - Imputing missing values
  - Understanding the scikit-learn estimator API
  - Handling categorical data
  - Categorical data encoding with pandas
  - Mapping ordinal features
  - Encoding class labels
  - Performing one-hot encoding on nominal features
  - Optional: encoding ordinal features
  - Partitioning a dataset into separate training and test datasets
  - Bringing features onto the same scale
  - Selecting meaningful features
  - L1 and L2 regularization as penalties against model complexity
  - A geometric interpretation of L2 regularization
  - Sparse solutions with L1 regularization
  - Sequential feature selection algorithms
  - Assessing feature importance with random forests
  - Summary
- Compressing Data via Dimensionality Reduction
  - Unsupervised dimensionality reduction via principal component analysis
  - The main steps in principal component analysis
  - Extracting the principal components step by step
  - Total and explained variance
  - Feature transformation
  - Principal component analysis in scikit-learn
  - Assessing feature contributions
  - Supervised data compression via linear discriminant analysis
  - Principal component analysis versus linear discriminant analysis
  - The inner workings of linear discriminant analysis
  - Computing the scatter matrices
  - Selecting linear discriminants for the new feature subspace
  - Projecting examples onto the new feature space
  - LDA via scikit-learn
  - Nonlinear dimensionality reduction and visualization
  - Why consider nonlinear dimensionality reduction?
  - Visualizing data via t-distributed stochastic neighbor embedding
  - Summary
- Learning Best Practices for Model Evaluation and Hyperparameter Tuning
  - Streamlining workflows with pipelines
  - Loading the Breast Cancer Wisconsin dataset
  - Combining transformers and estimators in a pipeline
  - Using k-fold cross-validation to assess model performance
  - The holdout method
  - K-fold cross-validation
  - Debugging algorithms with learning and validation curves
  - Diagnosing bias and variance problems with learning curves
  - Addressing over- and underfitting with validation curves
  - Fine-tuning machine learning models via grid search
  - Tuning hyperparameters via grid search
  - Exploring hyperparameter configurations more widely with randomized search
  - More resource-efficient hyperparameter search with successive halving
  - Algorithm selection with nested cross-validation
  - Looking at different performance evaluation metrics
  - Reading a confusion matrix
  - Optimizing the precision and recall of a classification model
  - Plotting a receiver operating characteristic
  - Scoring metrics for multiclass classification
  - Dealing with class imbalance
  - Summary
- Combining Different Models for Ensemble Learning
  - Learning with ensembles
  - Combining classifiers via majority vote
  - Implementing a simple majority vote classifier
  - Using the majority voting principle to make predictions
  - Evaluating and tuning the ensemble classifier
  - Bagging – building an ensemble of classifiers from bootstrap samples
  - Bagging in a nutshell
  - Applying bagging to classify examples in the Wine dataset
  - Leveraging weak learners via adaptive boosting
  - How adaptive boosting works
  - Applying AdaBoost using scikit-learn
  - Gradient boosting – training an ensemble based on loss gradients
  - Comparing AdaBoost with gradient boosting
  - Outlining the general gradient boosting algorithm
  - Explaining the gradient boosting algorithm for classification
  - Illustrating gradient boosting for classification
  - Using XGBoost
  - Summary
- Applying Machine Learning to Sentiment Analysis
  - Preparing the IMDb movie review data for text processing
  - Obtaining the movie review dataset
  - Preprocessing the movie dataset into a more convenient format
  - Introducing the bag-of-words model
  - Transforming words into feature vectors
  - Assessing word relevancy via term frequency-inverse document frequency
  - Cleaning text data
  - Processing documents into tokens
  - Training a logistic regression model for document classification
  - Working with bigger data – online algorithms and out-of-core learning
  - Topic modeling with latent Dirichlet allocation
  - Decomposing text documents with LDA
  - LDA with scikit-learn
  - Summary
- Predicting Continuous Target Variables with Regression Analysis
  - Introducing linear regression
  - Simple linear regression
  - Multiple linear regression
  - Exploring the Ames Housing dataset
  - Loading the Ames Housing dataset into a DataFrame
  - Visualizing the important characteristics of a dataset
  - Looking at relationships using a correlation matrix
  - Implementing an ordinary least squares linear regression model
  - Solving regression for regression parameters with gradient descent
  - Estimating the coefficient of a regression model via scikit-learn
  - Fitting a robust regression model using RANSAC
  - Evaluating the performance of linear regression models
  - Using regularized methods for regression
  - Turning a linear regression model into a curve – polynomial regression
  - Adding polynomial terms using scikit-learn
  - Modeling nonlinear relationships in the Ames Housing dataset
  - Dealing with nonlinear relationships using random forests
  - Decision tree regression
  - Random forest regression
  - Summary
- Working with Unlabeled Data – Clustering Analysis
  - Grouping objects by similarity using k-means
  - k-means clustering using scikit-learn
  - A smarter way of placing the initial cluster centroids using k-means++
  - Hard versus soft clustering
  - Using the elbow method to find the optimal number of clusters
  - Quantifying the quality of clustering via silhouette plots
  - Organizing clusters as a hierarchical tree
  - Grouping clusters in a bottom-up fashion
  - Performing hierarchical clustering on a distance matrix
  - Attaching dendrograms to a heat map
  - Applying agglomerative clustering via scikit-learn
  - Locating regions of high density via DBSCAN
  - Summary
- Implementing a Multilayer Artificial Neural Network from Scratch
  - Modeling complex functions with artificial neural networks
  - Single-layer neural network recap
  - Introducing the multilayer neural network architecture
  - Activating a neural network via forward propagation
  - Classifying handwritten digits
  - Obtaining and preparing the MNIST dataset
  - Implementing a multilayer perceptron
  - Coding the neural network training loop
  - Evaluating the neural network performance
  - Training an artificial neural network
  - Computing the loss function
  - Developing your understanding of backpropagation
  - Training neural networks via backpropagation
  - About convergence in neural networks
  - A few last words about the neural network implementation
  - Summary
- Parallelizing Neural Network Training with PyTorch
  - PyTorch and training performance
  - Performance challenges
  - What is PyTorch?
  - How we will learn PyTorch
  - First steps with PyTorch
  - Installing PyTorch
  - Creating tensors in PyTorch
  - Manipulating the data type and shape of a tensor
  - Applying mathematical operations to tensors
  - Split, stack, and concatenate tensors
  - Building input pipelines in PyTorch
  - Creating a PyTorch DataLoader from existing tensors
  - Combining two tensors into a joint dataset
  - Shuffle, batch, and repeat
  - Creating a dataset from files on your local storage disk
  - Fetching available datasets from the torchvision.datasets library
  - Building an NN model in PyTorch
  - The PyTorch neural network module (torch.nn)
  - Building a linear regression model
  - Model training via the torch.nn and torch.optim modules
  - Building a multilayer perceptron for classifying flowers in the Iris dataset
  - Evaluating the trained model on the test dataset
  - Saving and reloading the trained model
  - Choosing activation functions for multilayer neural networks
  - Logistic function recap
  - Estimating class probabilities in multiclass classification via the softmax function
  - Broadening the output spectrum using a hyperbolic tangent
  - Rectified linear unit activation
  - Summary
- Going Deeper – The Mechanics of PyTorch
  - The key features of PyTorch
  - PyTorch’s computation graphs
  - Understanding computation graphs
  - Creating a graph in PyTorch
  - PyTorch tensor objects for storing and updating model parameters
  - Computing gradients via automatic differentiation
  - Computing the gradients of the loss with respect to trainable variables
  - Understanding automatic differentiation
  - Adversarial examples
  - Simplifying implementations of common architectures via the torch.nn module
  - Implementing models based on nn.Sequential
  - Choosing a loss function
  - Solving an XOR classification problem
  - Making model building more flexible with nn.Module
  - Writing custom layers in PyTorch
  - Project one – predicting the fuel efficiency of a car
  - Working with feature columns
  - Training a DNN regression model
  - Project two – classifying MNIST handwritten digits
  - Higher-level PyTorch APIs: a short introduction to PyTorch-Lightning
  - Setting up the PyTorch Lightning model
  - Setting up the data loaders for Lightning
  - Training the model using the PyTorch Lightning Trainer class
  - Evaluating the model using TensorBoard
  - Summary
- Classifying Images with Deep Convolutional Neural Networks
  - The building blocks of CNNs
  - Understanding CNNs and feature hierarchies
  - Performing discrete convolutions
  - Discrete convolutions in one dimension
  - Padding inputs to control the size of the output feature maps
  - Determining the size of the convolution output
  - Performing a discrete convolution in 2D
  - Subsampling layers
  - Putting everything together – implementing a CNN
  - Working with multiple input or color channels
  - Regularizing an NN with L2 regularization and dropout
  - Loss functions for classification
  - Implementing a deep CNN using PyTorch
  - The multilayer CNN architecture
  - Loading and preprocessing the data
  - Implementing a CNN using the torch.nn module
  - Configuring CNN layers in PyTorch
  - Constructing a CNN in PyTorch
  - Smile classification from face images using a CNN
  - Loading the CelebA dataset
  - Image transformation and data augmentation
  - Training a CNN smile classifier
  - Summary
- Modeling Sequential Data Using Recurrent Neural Networks
  - Introducing sequential data
  - Modeling sequential data – order matters
  - Sequential data versus time series data
  - Representing sequences
  - The different categories of sequence modeling
  - RNNs for modeling sequences
  - Understanding the dataflow in RNNs
  - Computing activations in an RNN
  - Hidden recurrence versus output recurrence
  - The challenges of learning long-range interactions
  - Long short-term memory cells
  - Implementing RNNs for sequence modeling in PyTorch
  - Project one – predicting the sentiment of IMDb movie reviews
  - Preparing the movie review data
  - Embedding layers for sentence encoding
  - Building an RNN model
  - Building an RNN model for the sentiment analysis task
  - Project two – character-level language modeling in PyTorch
  - Preprocessing the dataset
  - Building a character-level RNN model
  - Evaluation phase – generating new text passages
  - Summary
- Transformers – Improving Natural Language Processing with Attention Mechanisms
  - Adding an attention mechanism to RNNs
  - Attention helps RNNs with accessing information
  - The original attention mechanism for RNNs
  - Processing the inputs using a bidirectional RNN
  - Generating outputs from context vectors
  - Computing the attention weights
  - Introducing the self-attention mechanism
  - Starting with a basic form of self-attention
  - Parameterizing the self-attention mechanism: scaled dot-product attention
  - Attention is all we need: introducing the original transformer architecture
  - Encoding context embeddings via multi-head attention
  - Learning a language model: decoder and masked multi-head attention
  - Implementation details: positional encodings and layer normalization
  - Building large-scale language models by leveraging unlabeled data
  - Pre-training and fine-tuning transformer models
  - Leveraging unlabeled data with GPT
  - Using GPT-2 to generate new text
  - Bidirectional pre-training with BERT
  - The best of both worlds: BART
  - Fine-tuning a BERT model in PyTorch
  - Loading the IMDb movie review dataset
  - Tokenizing the dataset
  - Loading and fine-tuning a pre-trained BERT model
  - Fine-tuning a transformer more conveniently using the Trainer API
  - Summary
- Generative Adversarial Networks for Synthesizing New Data
  - Introducing generative adversarial networks
  - Starting with autoencoders
  - Generative models for synthesizing new data
  - Generating new samples with GANs
  - Understanding the loss functions of the generator and discriminator networks in a GAN model
  - Implementing a GAN from scratch
  - Training GAN models on Google Colab
  - Implementing the generator and the discriminator networks
  - Defining the training dataset
  - Training the GAN model
  - Improving the quality of synthesized images using a convolutional and Wasserstein GAN
  - Transposed convolution
  - Batch normalization
  - Implementing the generator and discriminator
  - Dissimilarity measures between two distributions
  - Using EM distance in practice for GANs
  - Gradient penalty
  - Implementing WGAN-GP to train the DCGAN model
  - Mode collapse
  - Other GAN applications
  - Summary
- Graph Neural Networks for Capturing Dependencies in Graph Structured Data
  - Introduction to graph data
  - Undirected graphs
  - Directed graphs
  - Labeled graphs
  - Representing molecules as graphs
  - Understanding graph convolutions
  - The motivation behind using graph convolutions
  - Implementing a basic graph convolution
  - Implementing a GNN in PyTorch from scratch
  - Defining the NodeNetwork model
  - Coding the NodeNetwork’s graph convolution layer
  - Adding a global pooling layer to deal with varying graph sizes
  - Preparing the DataLoader
  - Using the NodeNetwork to make predictions
  - Implementing a GNN using the PyTorch Geometric library
  - Other GNN layers and recent developments
  - Spectral graph convolutions
  - Pooling
  - Normalization
  - Pointers to advanced graph neural network literature
  - Summary
- Reinforcement Learning for Decision Making in Complex Environments
  - Introduction – learning from experience
  - Understanding reinforcement learning
  - Defining the agent-environment interface of a reinforcement learning system
  - The theoretical foundations of RL
  - Markov decision processes
  - The mathematical formulation of Markov decision processes
  - Visualization of a Markov process
  - Episodic versus continuing tasks
  - RL terminology: return, policy, and value function
  - The return
  - Policy
  - Value function
  - Dynamic programming using the Bellman equation
  - Reinforcement learning algorithms
  - Dynamic programming
  - Policy evaluation – predicting the value function with dynamic programming
  - Improving the policy using the estimated value function
  - Policy iteration
  - Value iteration
  - Reinforcement learning with Monte Carlo
  - State-value function estimation using MC
  - Action-value function estimation using MC
  - Finding an optimal policy using MC control
  - Policy improvement – computing the greedy policy from the action-value function
  - Temporal difference learning
  - TD prediction
  - On-policy TD control (SARSA)
  - Off-policy TD control (Q-learning)
  - Implementing our first RL algorithm
  - Introducing the OpenAI Gym toolkit
  - Working with the existing environments in OpenAI Gym
  - A grid world example
  - Implementing the grid world environment in OpenAI Gym
  - Solving the grid world problem with Q-learning
  - A glance at deep Q-learning
  - Training a DQN model according to the Q-learning algorithm
  - Replay memory
  - Determining the target values for computing the loss
  - Implementing a deep Q-learning algorithm
  - Chapter and book summary
- Other Books You May Enjoy
- Index

## How to download the source code

1. Go to `https://github.com/PacktPublishing`.

2. In the Find a repository… box, search for the book title: `Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python`. If the full title returns no results, search for the main title only.

3. Click the book title in the search results.

4. Click Code to download.
