# Python Machine Learning By Example: Implement machine learning algorithms and techniques to build intelligent systems, 2nd Edition

- Length: 382 pages
- Edition: 2
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2019-02-28
- ISBN-10: 1789616727
- ISBN-13: 9781789616729
- Sales Rank: #1934239

**Grasp machine learning concepts, techniques, and algorithms with the help of real-world examples using Python libraries such as TensorFlow and scikit-learn**

**Key Features**

- Exploit the power of Python to explore the world of data mining and data analytics
- Discover machine learning algorithms to solve complex challenges faced by data scientists today
- Use Python libraries such as TensorFlow and Keras to create smart cognitive actions for your projects

**Book Description**

The surge of interest in machine learning stems from the way it revolutionizes automation by learning patterns in data and using them to make predictions and decisions. Your ML journey starts with this book, the second edition of the bestseller Python Machine Learning By Example.

Hayden’s unique insights and expertise introduce you to important ML concepts and implementations of algorithms in Python, both from scratch and with libraries. Each chapter of the book walks you through an industry-adopted application. With the help of realistic examples, you will learn the mechanics of ML techniques in areas such as exploratory data analysis, feature engineering, classification, regression, clustering, and NLP, and find them less obscure than you thought.

This extensively expanded and updated edition now includes implementations with popular libraries including TensorFlow, gensim, and Keras. The scikit-learn code is also fully modernized. Even if you’ve read the previous edition, you’ll find plenty of new content, such as neural networks, dimensionality reduction, topic modeling, large-scale learning with Spark, and word embedding.

Toward the end, you will gain a broad picture of the ML ecosystem and of best practices for applying ML techniques to meet new opportunities in today’s world.

**What you will learn**

- Understand the important concepts in machine learning and data science
- Use Python to explore the world of data mining and analytics
- Scale up model training using varied data complexities with Apache Spark
- Delve deep into text and NLP using Python libraries such as NLTK and gensim
- Select and build an ML model and evaluate and optimize its performance
- Implement ML algorithms from scratch in Python, TensorFlow, and scikit-learn

**Who this book is for**

If you’re a machine learning aspirant, data analyst, or data engineer who is passionate about machine learning and wants to begin working on ML assignments, this book is for you. Prior knowledge of Python coding is assumed, and basic familiarity with statistical concepts will be beneficial, although not necessary.

**Table of Contents**

- Title Page; Copyright and Credits; Python Machine Learning By Example Second Edition; About Packt; Why subscribe?; Packt.com; Dedication; Foreword; Contributors; About the author; About the reviewer; Packt is searching for authors like you
- Preface: Who this book is for; What this book covers; To get the most out of this book; Download the example code files; Download the color images; Conventions used; Get in touch; Reviews
- Section 1: Fundamentals of Machine Learning
  - Getting Started with Machine Learning and Python: Defining machine learning and why we need it; A very high-level overview of machine learning technology; Types of machine learning tasks; A brief history of the development of machine learning algorithms; Core of machine learning – generalizing with data; Overfitting, underfitting, and the bias-variance trade-off; Avoiding overfitting with cross-validation; Avoiding overfitting with regularization; Avoiding overfitting with feature selection and dimensionality reduction; Preprocessing, exploration, and feature engineering; Missing values; Label encoding; One hot encoding; Scaling; Polynomial features; Power transform; Binning; Combining models; Voting and averaging; Bagging; Boosting; Stacking; Installing software and setting up; Setting up Python and environments; Installing the various packages; NumPy; SciPy; Pandas; Scikit-learn; TensorFlow; Summary; Exercises
- Section 2: Practical Python Machine Learning By Example
  - Exploring the 20 Newsgroups Dataset with Text Analysis Techniques: How computers understand language - NLP; Picking up NLP basics while touring popular NLP libraries; Corpus; Tokenization; PoS tagging; Named-entity recognition; Stemming and lemmatization; Semantics and topic modeling; Getting the newsgroups data; Exploring the newsgroups data; Thinking about features for text data; Counting the occurrence of each word token; Text preprocessing; Dropping stop words; Stemming and lemmatizing words; Visualizing the newsgroups data with t-SNE; What is dimensionality reduction?; t-SNE for dimensionality reduction; Summary; Exercises
  - Mining the 20 Newsgroups Dataset with Clustering and Topic Modeling Algorithms: Learning without guidance – unsupervised learning; Clustering newsgroups data using k-means; How does k-means clustering work?; Implementing k-means from scratch; Implementing k-means with scikit-learn; Choosing the value of k; Clustering newsgroups data using k-means; Discovering underlying topics in newsgroups; Topic modeling using NMF; Topic modeling using LDA; Summary; Exercises
  - Detecting Spam Email with Naive Bayes: Getting started with classification; Types of classification; Applications of text classification; Exploring Naïve Bayes; Learning Bayes' theorem by examples; The mechanics of Naïve Bayes; Implementing Naïve Bayes from scratch; Implementing Naïve Bayes with scikit-learn; Classification performance evaluation; Model tuning and cross-validation; Summary; Exercise
  - Classifying Newsgroup Topics with Support Vector Machines: Finding separating boundary with support vector machines; Understanding how SVM works through different use cases; Case 1 – identifying a separating hyperplane; Case 2 – determining the optimal hyperplane; Case 3 – handling outliers; Implementing SVM; Case 4 – dealing with more than two classes; The kernels of SVM; Case 5 – solving linearly non-separable problems; Choosing between linear and RBF kernels; Classifying newsgroup topics with SVMs; More example – fetal state classification on cardiotocography; A further example – breast cancer classification using SVM with TensorFlow; Summary; Exercise
  - Predicting Online Ad Click-Through with Tree-Based Algorithms: Brief overview of advertising click-through prediction; Getting started with two types of data – numerical and categorical; Exploring decision tree from root to leaves; Constructing a decision tree; The metrics for measuring a split; Implementing a decision tree from scratch; Predicting ad click-through with decision tree; Ensembling decision trees – random forest; Implementing random forest using TensorFlow; Summary; Exercise
  - Predicting Online Ad Click-Through with Logistic Regression: Converting categorical features to numerical – one-hot encoding and ordinal encoding; Classifying data with logistic regression; Getting started with the logistic function; Jumping from the logistic function to logistic regression; Training a logistic regression model; Training a logistic regression model using gradient descent; Predicting ad click-through with logistic regression using gradient descent; Training a logistic regression model using stochastic gradient descent; Training a logistic regression model with regularization; Training on large datasets with online learning; Handling multiclass classification; Implementing logistic regression using TensorFlow; Feature selection using random forest; Summary; Exercises
  - Scaling Up Prediction to Terabyte Click Logs: Learning the essentials of Apache Spark; Breaking down Spark; Installing Spark; Launching and deploying Spark programs; Programming in PySpark; Learning on massive click logs with Spark; Loading click logs; Splitting and caching the data; One-hot encoding categorical features; Training and testing a logistic regression model; Feature engineering on categorical variables with Spark; Hashing categorical features; Combining multiple variables – feature interaction; Summary; Exercises
  - Stock Price Prediction with Regression Algorithms: Brief overview of the stock market and stock prices; What is regression?; Mining stock price data; Getting started with feature engineering; Acquiring data and generating features; Estimating with linear regression; How does linear regression work?; Implementing linear regression; Estimating with decision tree regression; Transitioning from classification trees to regression trees; Implementing decision tree regression; Implementing regression forest; Estimating with support vector regression; Implementing SVR; Estimating with neural networks; Demystifying neural networks; Implementing neural networks; Evaluating regression performance; Predicting stock price with four regression algorithms; Summary; Exercise
- Section 3: Python Machine Learning Best Practices
  - Machine Learning Best Practices: Machine learning solution workflow; Best practices in the data preparation stage; Best practice 1 – completely understanding the project goal; Best practice 2 – collecting all fields that are relevant; Best practice 3 – maintaining the consistency of field values; Best practice 4 – dealing with missing data; Best practice 5 – storing large-scale data; Best practices in the training sets generation stage; Best practice 6 – identifying categorical features with numerical values; Best practice 7 – deciding on whether or not to encode categorical features; Best practice 8 – deciding on whether or not to select features, and if so, how to do so; Best practice 9 – deciding on whether or not to reduce dimensionality, and if so, how to do so; Best practice 10 – deciding on whether or not to rescale features; Best practice 11 – performing feature engineering with domain expertise; Best practice 12 – performing feature engineering without domain expertise; Best practice 13 – documenting how each feature is generated; Best practice 14 – extracting features from text data; Best practices in the model training, evaluation, and selection stage; Best practice 15 – choosing the right algorithm(s) to start with; Naïve Bayes; Logistic regression; SVM; Random forest (or decision tree); Neural networks; Best practice 16 – reducing overfitting; Best practice 17 – diagnosing overfitting and underfitting; Best practice 18 – modeling on large-scale datasets; Best practices in the deployment and monitoring stage; Best practice 19 – saving, loading, and reusing models; Best practice 20 – monitoring model performance; Best practice 21 – updating models regularly; Summary; Exercises
- Other Books You May Enjoy; Leave a review - let other readers know what you think


## How to download source code?

1. Go to: `https://github.com/PacktPublishing`

2. In the "Find a repository…" box, search for the book title: `Python Machine Learning By Example: Implement machine learning algorithms and techniques to build intelligent systems, 2nd Edition`. If the full title returns no results, search for the main title only.

3. Click the book title in the search results.

4. Click Code to download.
