Modern Deep Learning for Tabular Data: Novel Approaches to Common Modeling Problems

Length: 870 pages
Edition: 1
Language: English
Publisher: Apress
Publication Date: 2022-12-30
ISBN-10: 148428691X
ISBN-13: 9781484286913
Sales Rank: #2213109

Deep learning is one of the most powerful tools in the modern artificial intelligence landscape. Although it has been applied predominantly to highly specialized image, text, and signal datasets, this book synthesizes and presents novel deep learning approaches for a seemingly unlikely domain – tabular data. Whether in finance, business, security, medicine, or countless other domains, deep learning can help mine and model complex patterns in tabular data – a ubiquitous form of structured data.

Part I of the book offers a rigorous overview of machine learning principles, algorithms, and implementation skills relevant to holistically modeling and manipulating tabular data. Part II studies five dominant deep learning model designs – Artificial Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Attention and Transformers, and Tree-Rooted Networks – through both their ‘default’ usage and their application to tabular data. Part III compounds the power of the previously covered methods by surveying strategies and techniques to supercharge deep learning systems: autoencoders, deep data generation, meta-optimization, multi-model arrangement, and neural network interpretability. Each chapter comes with extensive visualization, code, and relevant research coverage.

Modern Deep Learning for Tabular Data is one of the first of its kind – a wide exploration of deep learning theory and applications to tabular data, integrating and documenting novel methods and techniques in the field. This book provides a strong conceptual and theoretical toolkit to approach challenging tabular data problems.

What You Will Learn

Learn important concepts and developments in modern machine learning and deep learning, with a strong emphasis on tabular data applications.
Understand the promising links between deep learning and tabular data, and when a deep learning approach is or isn’t appropriate.
Apply promising research and unique modeling approaches in real-world data contexts.
Explore and engage with modern, research-backed theoretical advances in deep tabular modeling.
Use distinctive, effective preprocessing methods to prepare tabular data for successful modeling.

Who This Book Is For

This book is for data scientists and researchers of all levels, from beginner to advanced, who want to improve results on tabular data with deep learning or to understand the theoretical and practical aspects of deep tabular modeling research. It also suits readers seeking to apply deep learning to complex tabular data contexts of all sorts, including business, finance, medicine, education, and security.

Table of Contents
About the Authors
About the Technical Reviewer
Acknowledgments
Foreword
Foreword
Introduction
Part I: Machine Learning and Tabular Data
	Chapter 1: Classical Machine Learning Principles and Methods
		Fundamental Principles of Modeling
			What Is Modeling?
			Modes of Learning
			Quantitative Representations of Data: Regression and Classification
			The Machine Learning Data Cycle: Training, Validation, and Test Sets
			Bias-Variance Trade-Off
			Feature Space and the Curse of Dimensionality
			Optimization and Gradient Descent
		Metrics and Evaluation
			Mean Absolute Error
			Mean Squared Error (MSE)
			Confusion Matrix
			Accuracy
			Precision
			Recall
			F1 Score
			Area Under the Receiver Operating Characteristics Curve (ROC-AUC)
		Algorithms
			K-Nearest Neighbors
				Theory and Intuition
				Implementation and Usage
			Linear Regression
				Theory and Intuition
				Implementation and Usage
				Other Variations on Simple Linear Regression
			Logistic Regression
				Theory and Intuition
				Implementation and Usage
				Other Variations on Logistic Regression
			Decision Trees
				Theory and Intuition
				Implementation and Usage
				Random Forest
			Gradient Boosting
				Theory and Intuition
				AdaBoost
				XGBoost
				LightGBM
			Summary of Algorithms
		Thinking Past Classical Machine Learning
		Key Points
	Chapter 2: Data Preparation and Engineering
		Data Storage and Manipulation
			TensorFlow Datasets
			Creating a TensorFlow Dataset
			TensorFlow Sequence Datasets
			Handling Large Datasets
				Datasets That Fit in Memory
					Pickle
					SciPy and TensorFlow Sparse Matrices
				Datasets That Do Not Fit in Memory
					Pandas Chunker
					h5py
					NumPy Memory Map
		Data Encoding
			Discrete Data
				Label Encoding
				One-Hot Encoding
				Binary Encoding
				Frequency Encoding
				Target Encoding
				Leave-One-Out Encoding
				James-Stein Encoding
				Weight of Evidence
			Continuous Data
				Min-Max Scaling
				Robust Scaling
				Standardization
			Text Data
				Keyword Search
				Raw Vectorization
				Bag of Words
				N-Grams
				TF-IDF
				Sentiment Extraction
				Word2Vec
			Time Data
			Geographical Data
		Feature Extraction
			Single- and Multi-feature Transformations
			Principal Component Analysis
			t-SNE
			Linear Discriminant Analysis
			Statistics-Based Engineering
		Feature Selection
			Information Gain
			Variance Threshold
			High-Correlation Method
			Recursive Feature Elimination
			Permutation Importance
			LASSO Coefficient Selection
		Key Points
Part II: Applied Deep Learning Architectures
	Chapter 3: Neural Networks and Tabular Data
		What Exactly Are Neural Networks?
		Neural Network Theory
			Starting with a Single Neuron
			Feed-Forward Operation
		Introduction to Keras
			Modeling with Keras
				Defining the Architecture
				Compiling the Model
				Training and Evaluation
		Loss Functions
		Math Behind Feed-Forward Operation
			Activation Functions
				Sigmoid and Hyperbolic Tangent
				Rectified Linear Unit
				LeakyReLU
				Swish
				The Nonlinearity and Variability of Activation Functions
		The Math Behind Neural Network Learning
			Gradient Descent in Neural Networks
			The Backpropagation Algorithm
		Optimizers
			Mini-batch Stochastic Gradient Descent (SGD) and Momentum
			Nesterov Accelerated Gradient (NAG)
			Adaptive Moment Estimation (Adam)
		A Deeper Dive into Keras
			Training Callbacks and Validation
			Batch Normalization and Dropout
			The Keras Functional API
				Nonlinear Topologies
				Multi-input and Multi-output Models
				Embeddings
				Model Weight Sharing
		The Universal Approximation Theorem
		Selected Research
			Simple Modifications to Improve Tabular Neural Networks
				Ghost Batch Normalization
				Leaky Gates
			Wide and Deep Learning
			Self-Normalizing Neural Networks
			Regularization Learning Networks
		Key Points
	Chapter 4: Applying Convolutional Structures to Tabular Data
		Convolutional Neural Network Theory
			Why Do We Need Convolutions?
			The Convolution Operation
			The Pooling Operation
			Base CNN Architectures
				ResNet
				Inception v3
				EfficientNet
		Multimodal Image and Tabular Models
		1D Convolutions for Tabular Data
		2D Convolutions for Tabular Data
			DeepInsight
			IGTD (Image Generation for Tabular Data)
		Key Points
	Chapter 5: Applying Recurrent Structures to Tabular Data
		Recurrent Models Theory
			Why Are Recurrent Models Necessary?
			Recurrent Neurons and Memory Cells
				Backpropagation Through Time (BPTT) and Vanishing Gradients
			LSTMs and Exploding Gradients
			Gated Recurrent Units (GRUs)
			Bidirectionality
		Introduction to Recurrent Layers in Keras
			Return Sequences and Return State
		Standard Recurrent Model Applications
			Natural Language
			Time Series
			Multimodal Recurrent Modeling
		Direct Tabular Recurrent Modeling
			A Novel Modeling Paradigm
			Optimizing the Sequence
			Optimizing the Initial Memory State(s)
		Further Resources
		Key Points
	Chapter 6: Applying Attention to Tabular Data
		Attention Mechanism Theory
			The Attention Mechanism
			The Transformer Architecture
			BERT and Pretraining Language Models
			Taking a Step Back
		Working with Attention
			Simple Custom Bahdanau Attention
			Native Keras Attention
			Attention in Sequence-to-Sequence Tasks
			Improving Natural Language Models with Attention
		Direct Tabular Attention Modeling
		Attention-Based Tabular Modeling Research
			TabTransformer
			TabNet
			SAINT
			ARM-Net
		Key Points
	Chapter 7: Tree-Based Deep Learning Approaches
		Tree-Structured Neural Networks
			Deep Neural Decision Trees
			Soft Decision Tree Regressors
			NODE
			Tree-Based Neural Network Initialization
			Net-DNF
		Boosting and Stacking Neural Networks
			GrowNet
			XBNet
		Distillation
			DeepGBM
		Key Points
Part III: Deep Learning Design and Tools
	Chapter 8: Autoencoders
		The Concept of the Autoencoder
		Vanilla Autoencoders
		Autoencoders for Pretraining
		Multitask Autoencoders
		Sparse Autoencoders
		Denoising and Reparative Autoencoders
		Key Points
	Chapter 9: Data Generation
		Variational Autoencoders
			Theory
			Implementation
		Generative Adversarial Networks
			Theory
			Simple GAN in TensorFlow
			CTGAN
		Key Points
	Chapter 10: Meta-optimization
		Meta-optimization: Concepts and Motivations
		No-Gradient Optimization
		Optimizing Model Meta-parameters
		Optimizing Data Pipelines
		Neural Architecture Search
		Key Points
	Chapter 11: Multi-model Arrangement
		Average Weighting
		Input-Informed Weighting
		Meta-evaluation
		Key Points
	Chapter 12: Neural Network Interpretability
		SHAP
		LIME
		Activation Maximization
		Key Points
		Closing Remarks
Appendix: NumPy and Pandas
	NumPy Arrays
		NumPy Array Construction
		Simple NumPy Indexing
		Quantitative Manipulation
		Advanced NumPy Indexing
		NumPy Data Types
		Function Application and Vectorization
		NumPy Array Application: Image Manipulation
	Pandas DataFrames
		Constructing Pandas DataFrames
		Simple Pandas Mechanics
		Advanced Pandas Mechanics
		Pivot
		Melt
		Explode
		Stack
		Unstack
	Conclusion
Index



How to download source code?

1. Go to: https://github.com/Apress

2. In the Find a repository… box, search for the book title: Modern Deep Learning for Tabular Data: Novel Approaches to Common Modeling Problems. If the full title returns no results, search for the main title only.