Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python
- Length: 608 pages
- Edition: 1
- Language: English
- Publisher: Wiley
- Publication Date: 2019-11-05
- ISBN-10: 1119549841
- ISBN-13: 9781119549840
- Sales Rank: #194909
Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python presents an applied approach to data mining concepts and methods, using Python software for illustration.
Readers will learn how to implement a variety of popular data mining algorithms in Python (free and open-source software) to tackle business problems and opportunities.
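As a flavor of the workflow the book teaches (partition the data, fit a classifier, evaluate on the validation set), here is a minimal sketch. It uses scikit-learn and a synthetic dataset; both are illustrative assumptions, not examples drawn from the book itself.

```python
# Illustrative sketch of a supervised-learning workflow: partition, fit,
# evaluate. The data here is synthetic, not one of the book's datasets.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))               # two numeric predictors
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # a simple binary outcome

# Partition into training and validation sets (a core step the book stresses)
train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1)

# Fit a k-nearest-neighbors classifier and score it on held-out data
knn = KNeighborsClassifier(n_neighbors=3).fit(train_X, train_y)
accuracy = accuracy_score(valid_y, knn.predict(valid_X))
print(f"validation accuracy: {accuracy:.2f}")
```

Evaluating on a held-out validation partition, rather than on the training data, is the overfitting safeguard the book returns to throughout.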
This is the sixth edition of this successful text, and the first to use Python. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining, and network analysis. It also includes:
- A new co-author, Peter Gedeck, who brings both experience teaching business analytics courses using Python and expertise in applying machine learning methods to the drug-discovery process
- A new section on ethical issues in data mining
- Updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma, and executive courses, and from their students
- More than a dozen case studies demonstrating applications for the data mining techniques described
- End-of-chapter exercises that help readers gauge and expand their comprehension and mastery of the material presented
- A companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions
Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.
“This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject.”
—Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R
Contents

Foreword by Gareth James
Foreword by Ravi Bapna
Preface to the Python Edition
Acknowledgments

PART I PRELIMINARIES

CHAPTER 1 Introduction
- 1.1 What Is Business Analytics?
- 1.2 What Is Data Mining?
- 1.3 Data Mining and Related Terms
- 1.4 Big Data
- 1.5 Data Science
- 1.6 Why Are There So Many Different Methods?
- 1.7 Terminology and Notation
- 1.8 Road Maps to This Book (Order of Topics)

CHAPTER 2 Overview of the Data Mining Process
- 2.1 Introduction
- 2.2 Core Ideas in Data Mining (Classification; Prediction; Association Rules and Recommendation Systems; Predictive Analytics; Data Reduction and Dimension Reduction; Data Exploration and Visualization; Supervised and Unsupervised Learning)
- 2.3 The Steps in Data Mining
- 2.4 Preliminary Steps (Organization of Datasets; Predicting Home Values in the West Roxbury Neighborhood; Loading and Looking at the Data in Python; Python Imports; Sampling from a Database; Oversampling Rare Events in Classification Tasks; Preprocessing and Cleaning the Data)
- 2.5 Predictive Power and Overfitting (Overfitting; Creation and Use of Data Partitions)
- 2.6 Building a Predictive Model (Modeling Process)
- 2.7 Using Python for Data Mining on a Local Machine
- 2.8 Automating Data Mining Solutions
- 2.9 Ethical Practice in Data Mining
- Data Mining Software: The State of the Market (by Herb Edelstein)
- Problems

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization
- 3.1 Introduction
- 3.2 Data Examples (Example 1: Boston Housing Data; Example 2: Ridership on Amtrak Trains)
- 3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots (Distribution Plots: Boxplots and Histograms; Heatmaps: Visualizing Correlations and Missing Values)
- 3.4 Multidimensional Visualization (Adding Variables: Color, Size, Shape, Multiple Panels, and Animation; Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering; Reference: Trend Lines and Labels; Scaling Up to Large Datasets; Multivariate Plot: Parallel Coordinates Plot; Interactive Visualization)
- 3.5 Specialized Visualizations (Visualizing Networked Data; Visualizing Hierarchical Data: Treemaps; Visualizing Geographical Data: Map Charts)
- 3.6 Summary: Major Visualizations and Operations, by Data Mining Goal (Prediction; Classification; Time Series Forecasting; Unsupervised Learning)
- Problems

CHAPTER 4 Dimension Reduction
- 4.1 Introduction
- 4.2 Curse of Dimensionality
- 4.3 Practical Considerations (Example 1: House Prices in Boston)
- 4.4 Data Summaries (Summary Statistics; Aggregation and Pivot Tables)
- 4.5 Correlation Analysis
- 4.6 Reducing the Number of Categories in Categorical Variables
- 4.7 Converting a Categorical Variable to a Numerical Variable
- 4.8 Principal Components Analysis (Example 2: Breakfast Cereals; Principal Components; Normalizing the Data; Using Principal Components for Classification and Prediction)
- 4.9 Dimension Reduction Using Regression Models
- 4.10 Dimension Reduction Using Classification and Regression Trees
- Problems

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance
- 5.1 Introduction
- 5.2 Evaluating Predictive Performance (Naive Benchmark: The Average; Prediction Accuracy Measures; Comparing Training and Validation Performance; Cumulative Gains and Lift Charts)
- 5.3 Judging Classifier Performance (Benchmark: The Naive Rule; Class Separation; The Confusion (Classification) Matrix; Using the Validation Data; Accuracy Measures; Propensities and Cutoff for Classification; Performance in Case of Unequal Importance of Classes; Asymmetric Misclassification Costs; Generalization to More Than Two Classes)
- 5.4 Judging Ranking Performance (Gains and Lift Charts for Binary Data; Decile Lift Charts; Beyond Two Classes; Gains and Lift Charts Incorporating Costs and Benefits; Cumulative Gains as a Function of Cutoff)
- 5.5 Oversampling (Oversampling the Training Set; Evaluating Model Performance Using a Non-oversampled Validation Set; Evaluating Model Performance if Only Oversampled Validation Set Exists)
- Problems

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression
- 6.1 Introduction
- 6.2 Explanatory vs. Predictive Modeling
- 6.3 Estimating the Regression Equation and Prediction (Example: Predicting the Price of Used Toyota Corolla Cars)
- 6.4 Variable Selection in Linear Regression (Reducing the Number of Predictors; How to Reduce the Number of Predictors; Regularization (Shrinkage Models))
- Appendix: Using Statsmodels
- Problems

CHAPTER 7 k-Nearest Neighbors (kNN)
- 7.1 The k-NN Classifier (Categorical Outcome) (Determining Neighbors; Classification Rule; Example: Riding Mowers; Choosing k; Setting the Cutoff Value; k-NN with More Than Two Classes; Converting Categorical Variables to Binary Dummies)
- 7.2 k-NN for a Numerical Outcome
- 7.3 Advantages and Shortcomings of k-NN Algorithms
- Problems

CHAPTER 8 The Naive Bayes Classifier
- 8.1 Introduction (Cutoff Probability Method; Conditional Probability; Example 1: Predicting Fraudulent Financial Reporting)
- 8.2 Applying the Full (Exact) Bayesian Classifier (Using the “Assign to the Most Probable Class” Method; Using the Cutoff Probability Method; Practical Difficulty with the Complete (Exact) Bayes Procedure; Solution: Naive Bayes; The Naive Bayes Assumption of Conditional Independence; Using the Cutoff Probability Method; Example 2: Predicting Fraudulent Financial Reports, Two Predictors; Example 3: Predicting Delayed Flights)
- 8.3 Advantages and Shortcomings of the Naive Bayes Classifier
- Problems

CHAPTER 9 Classification and Regression Trees
- 9.1 Introduction (Tree Structure; Decision Rules; Classifying a New Record)
- 9.2 Classification Trees (Recursive Partitioning; Example 1: Riding Mowers; Measures of Impurity)
- 9.3 Evaluating the Performance of a Classification Tree (Example 2: Acceptance of Personal Loan; Sensitivity Analysis Using Cross Validation)
- 9.4 Avoiding Overfitting (Stopping Tree Growth; Fine-tuning Tree Parameters; Other Methods for Limiting Tree Size)
- 9.5 Classification Rules from Trees
- 9.6 Classification Trees for More Than Two Classes
- 9.7 Regression Trees (Prediction; Measuring Impurity; Evaluating Performance)
- 9.8 Improving Prediction: Random Forests and Boosted Trees (Random Forests; Boosted Trees)
- 9.9 Advantages and Weaknesses of a Tree
- Problems

CHAPTER 10 Logistic Regression
- 10.1 Introduction
- 10.2 The Logistic Regression Model
- 10.3 Example: Acceptance of Personal Loan (Model with a Single Predictor; Estimating the Logistic Model from Data: Computing Parameter Estimates; Interpreting Results in Terms of Odds (for a Profiling Goal))
- 10.4 Evaluating Classification Performance (Variable Selection)
- 10.5 Logistic Regression for Multi-class Classification (Ordinal Classes; Nominal Classes; Comparing Ordinal and Nominal Models)
- 10.6 Example of Complete Analysis: Predicting Delayed Flights (Data Preprocessing; Model Training; Model Interpretation; Model Performance; Variable Selection)
- Appendix: Using Statsmodels
- Problems

CHAPTER 11 Neural Nets
- 11.1 Introduction
- 11.2 Concept and Structure of a Neural Network
- 11.3 Fitting a Network to Data (Example 1: Tiny Dataset; Computing Output of Nodes; Preprocessing the Data; Training the Model; Example 2: Classifying Accident Severity; Avoiding Overfitting; Using the Output for Prediction and Classification)
- 11.4 Required User Input
- 11.5 Exploring the Relationship Between Predictors and Outcome
- 11.6 Deep Learning (Convolutional Neural Networks (CNNs); Local Feature Map; A Hierarchy of Features; The Learning Process; Unsupervised Learning; Conclusion)
- 11.7 Advantages and Weaknesses of Neural Networks
- Problems

CHAPTER 12 Discriminant Analysis
- 12.1 Introduction (Example 1: Riding Mowers; Example 2: Personal Loan Acceptance)
- 12.2 Distance of a Record from a Class
- 12.3 Fisher’s Linear Classification Functions
- 12.4 Classification Performance of Discriminant Analysis
- 12.5 Prior Probabilities
- 12.6 Unequal Misclassification Costs
- 12.7 Classifying More Than Two Classes (Example 3: Medical Dispatch to Accident Scenes)
- 12.8 Advantages and Weaknesses
- Problems

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling
- 13.1 Ensembles (Why Ensembles Can Improve Predictive Power; Simple Averaging; Bagging; Boosting; Bagging and Boosting in Python; Advantages and Weaknesses of Ensembles)
- 13.2 Uplift (Persuasion) Modeling (A–B Testing; Uplift; Gathering the Data; A Simple Model; Modeling Individual Uplift; Computing Uplift with Python; Using the Results of an Uplift Model)
- 13.3 Summary
- Problems

PART V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Association Rules and Collaborative Filtering
- 14.1 Association Rules (Discovering Association Rules in Transaction Databases; Example 1: Synthetic Data on Purchases of Phone Faceplates; Generating Candidate Rules; The Apriori Algorithm; Selecting Strong Rules; Data Format; The Process of Rule Selection; Interpreting the Results; Rules and Chance; Example 2: Rules for Similar Book Purchases)
- 14.2 Collaborative Filtering (Data Type and Format; Example 3: Netflix Prize Contest; User-Based Collaborative Filtering: “People Like You”; Item-Based Collaborative Filtering; Advantages and Weaknesses of Collaborative Filtering; Collaborative Filtering vs. Association Rules)
- 14.3 Summary
- Problems

CHAPTER 15 Cluster Analysis
- 15.1 Introduction (Example: Public Utilities)
- 15.2 Measuring Distance Between Two Records (Euclidean Distance; Normalizing Numerical Measurements; Other Distance Measures for Numerical Data; Distance Measures for Categorical Data; Distance Measures for Mixed Data)
- 15.3 Measuring Distance Between Two Clusters (Minimum Distance; Maximum Distance; Average Distance; Centroid Distance)
- 15.4 Hierarchical (Agglomerative) Clustering (Single Linkage; Complete Linkage; Average Linkage; Centroid Linkage; Ward’s Method; Dendrograms: Displaying Clustering Process and Results; Validating Clusters; Limitations of Hierarchical Clustering)
- 15.5 Non-Hierarchical Clustering: The k-Means Algorithm (Choosing the Number of Clusters (k))
- Problems

PART VI FORECASTING TIME SERIES

CHAPTER 16 Handling Time Series
- 16.1 Introduction
- 16.2 Descriptive vs. Predictive Modeling
- 16.3 Popular Forecasting Methods in Business (Combining Methods)
- 16.4 Time Series Components (Example: Ridership on Amtrak Trains)
- 16.5 Data-Partitioning and Performance Evaluation (Benchmark Performance: Naive Forecasts; Generating Future Forecasts)
- Problems

CHAPTER 17 Regression-Based Forecasting
- 17.1 A Model with Trend (Linear Trend; Exponential Trend; Polynomial Trend)
- 17.2 A Model with Seasonality
- 17.3 A Model with Trend and Seasonality
- 17.4 Autocorrelation and ARIMA Models (Computing Autocorrelation; Improving Forecasts by Integrating Autocorrelation Information; Evaluating Predictability)
- Problems

CHAPTER 18 Smoothing Methods
- 18.1 Introduction
- 18.2 Moving Average (Centered Moving Average for Visualization; Trailing Moving Average for Forecasting; Choosing Window Width (w))
- 18.3 Simple Exponential Smoothing (Choosing Smoothing Parameter; Relation Between Moving Average and Simple Exponential Smoothing)
- 18.4 Advanced Exponential Smoothing (Series with a Trend; Series with a Trend and Seasonality; Series with Seasonality (No Trend))
- Problems

PART VII DATA ANALYTICS

CHAPTER 19 Social Network Analytics
- 19.1 Introduction
- 19.2 Directed vs. Undirected Networks
- 19.3 Visualizing and Analyzing Networks (Plot Layout; Edge List; Adjacency Matrix; Using Network Data in Classification and Prediction)
- 19.4 Social Data Metrics and Taxonomy (Node-Level Centrality Metrics; Egocentric Network; Network Metrics)
- 19.5 Using Network Metrics in Prediction and Classification (Link Prediction; Entity Resolution; Collaborative Filtering)
- 19.6 Collecting Social Network Data with Python
- 19.7 Advantages and Disadvantages
- Problems

CHAPTER 20 Text Mining
- 20.1 Introduction
- 20.2 The Tabular Representation of Text: Term-Document Matrix and “Bag-of-Words”
- 20.3 Bag-of-Words vs. Meaning Extraction at Document Level
- 20.4 Preprocessing the Text (Tokenization; Text Reduction; Presence/Absence vs. Frequency; Term Frequency–Inverse Document Frequency (TF-IDF); From Terms to Concepts: Latent Semantic Indexing; Extracting Meaning)
- 20.5 Implementing Data Mining Methods
- 20.6 Example: Online Discussions on Autos and Electronics (Importing and Labeling the Records; Text Preprocessing in Python; Producing a Concept Matrix; Fitting a Predictive Model; Prediction)
- 20.7 Summary
- Problems

PART VIII CASES

CHAPTER 21 Cases
- 21.1 Charles Book Club (The Book Industry; Database Marketing at Charles; Data Mining Techniques; Assignment)
- 21.2 German Credit (Background; Data; Assignment)
- 21.3 Tayko Software Cataloger (Background; The Mailing Experiment; Data; Assignment)
- 21.4 Political Persuasion (Background; Predictive Analytics Arrives in US Politics; Political Targeting; Uplift; Data; Assignment)
- 21.5 Taxi Cancellations (Business Situation; Assignment)
- 21.6 Segmenting Consumers of Bath Soap (Business Situation; Key Problems; Data; Measuring Brand Loyalty; Assignment)
- 21.7 Direct-Mail Fundraising (Background; Data; Assignment)
- 21.8 Catalog Cross-Selling (Background; Assignment)
- 21.9 Time Series Case: Forecasting Public Transportation Demand (Background; Problem Description; Available Data; Assignment Goal; Assignment; Tips and Suggested Steps)

References
Data Files Used in the Book
Python Utilities Functions
Index
EULA