Deep Learning in Bioinformatics: Techniques and Applications in Practice

Length: 380 pages
Edition: 1
Language: English
Publisher: Academic Press
Publication Date: 2022-02-02
ISBN-10: 0128238224
ISBN-13: 9780128238226
Sales Rank: #2771528

Deep Learning in Bioinformatics: Techniques and Applications in Practice introduces the topic in an easy-to-understand way, exploring how deep learning can be used to address important problems in bioinformatics, including drug discovery, de novo molecular design, sequence analysis, protein structure prediction, gene expression regulation, protein classification, biomedical image processing and diagnosis, biomolecule interaction prediction, and systems biology. The book also presents theoretical and practical successes of deep learning in bioinformatics, pointing out open problems and suggesting future research directions. Dr. Izadkhah provides valuable insights that will help researchers use deep learning techniques in their biological and bioinformatics studies.

Front Cover
Deep Learning in Bioinformatics
Copyright
Contents
Acknowledgments
Preface
1 Why life science?
	1.1 Introduction
	1.2 Why deep learning?
	1.3 Contemporary life science is about data
	1.4 Deep learning and bioinformatics
	1.5 What will you learn?
2 A review of machine learning
	2.1 Introduction
	2.2 What is machine learning?
	2.3 Challenge with machine learning
	2.4 Overfitting and underfitting
		2.4.1 Mitigating overfitting
		2.4.2 Adjusting parameters using cross-validation
		2.4.3 Cross-validation methods
	2.5 Types of machine learning
		2.5.1 Supervised learning
		2.5.2 Unsupervised learning
		2.5.3 Reinforcement learning
	2.6 The math behind deep learning
		2.6.1 Tensors
		2.6.2 Relevant mathematical operations
		2.6.3 The math behind machine learning: statistics
	2.7 TensorFlow and Keras
	2.8 Real-world tensors
	2.9 Summary
3 An introduction of Python ecosystem for deep learning
	3.1 Basic setup
	3.2 SciPy (scientific Python) ecosystem
	3.3 Scikit-learn
	3.4 A quick refresher in Python
		3.4.1 Identifier
		3.4.2 Comments
		3.4.3 Data type
		3.4.4 Control flow statements
		3.4.5 Data structure
		3.4.6 Functions
	3.5 NumPy
	3.6 Matplotlib crash course
	3.7 Pandas
	3.8 How to load dataset
		3.8.1 Considerations when loading CSV data
		3.8.2 Pima Indians diabetes dataset
		3.8.3 Loading CSV files in NumPy
		3.8.4 Loading CSV files in Pandas
	3.9 Dimensions of your data
	3.10 Correlations between features
	3.11 Techniques to understand each feature in the dataset
		3.11.1 Histograms
		3.11.2 Box-and-whisker plots
		3.11.3 Correlation matrix plot
	3.12 Prepare your data for deep learning
		3.12.1 Scaling features to a range
		3.12.2 Data normalizing
		3.12.3 Binarize data (make binary)
	3.13 Feature selection for machine learning
		3.13.1 Univariate selection
		3.13.2 Recursive feature elimination
		3.13.3 Principal component analysis
		3.13.4 Feature importance
	3.14 Split dataset into training and testing sets
	3.15 Summary
4 Basic structure of neural networks
	4.1 Introduction
	4.2 The neuron
	4.3 Layers of neural networks
	4.4 How a neural network is trained?
	4.5 Delta learning rule
	4.6 Generalized delta rule
	4.7 Gradient descent
		4.7.1 Stochastic gradient descent
		4.7.2 Batch gradient descent
		4.7.3 Mini-batch gradient descent
	4.8 Example: delta rule
		4.8.1 Implementation of the SGD method
		4.8.2 Implementation of the batch method
	4.9 Limitations of single-layer neural networks
	4.10 Summary
5 Training multilayer neural networks
	5.1 Introduction
	5.2 Backpropagation algorithm
	5.3 Momentum
	5.4 Neural network models in keras
	5.5 'Hello world!' of deep learning
	5.6 Tuning hyperparameters
	5.7 Data preprocessing
		5.7.1 Vectorization
		5.7.2 Value normalization
	5.8 Summary
6 Classification in bioinformatics
	6.1 Introduction
		6.1.1 Binary classification
		6.1.2 Pima indians onset of diabetes dataset
			6.1.2.1 Import libraries
			6.1.2.2 Load data
			6.1.2.3 Keras model
			6.1.2.4 Compile the model
			6.1.2.5 Fit the model
			6.1.2.6 Evaluate the model
			6.1.2.7 Tie it all together
			6.1.2.8 Make predictions
		6.1.3 Label encoding
	6.2 Multiclass classification
		6.2.1 Sigmoid and softmax activation functions
		6.2.2 Types of classification
	6.3 Summary
7 Introduction to deep learning
	7.1 Introduction
	7.2 Improving the performance of deep neural networks
		7.2.1 Vanishing gradient
		7.2.2 Overfitting
			7.2.2.1 Reducing the network's size
			7.2.2.2 Dropout
			7.2.2.3 Weight regularization
		7.2.3 Computational load
	7.3 Configuring the learning rate in keras
		7.3.1 Adaptive learning rate
		7.3.2 Layer weight initializers
	7.4 Imbalanced dataset
	7.5 Breast cancer detection
		7.5.1 Goals
		7.5.2 Introduction and task definition
		7.5.3 Implementation
			7.5.3.1 Loading, preprocessing, preparations for modeling
			7.5.3.2 Fully connected neural network (FCNN)
			7.5.3.3 Adding dropout to the network (FCNN + dropout)
			7.5.3.4 Adding L2 weight regularization (FCNN + L2)
			7.5.3.5 Adding L2 weight regularization and dropout (FCNN + L2 + dropout)
			7.5.3.6 Adding L1_L2 weight regularization (FCNN + L1_L2)
			7.5.3.7 Reducing the size of the network
			7.5.3.8 Summary
	7.6 Molecular classification of cancer by gene expression
		7.6.1 Goals
		7.6.2 Introduction and task definition
		7.6.3 Implementation
			7.6.3.1 Loading, preprocessing, preparations for modeling
			7.6.3.2 Dimension reduction using principal component analysis (PCA)
			7.6.3.3 Model
	7.7 Summary
8 Medical image processing: an insight to convolutional neural networks
	8.1 Convolutional neural network architecture
	8.2 Convolution layer
	8.3 Pooling layer
	8.4 Stride and padding
	8.5 Convolutional layer in keras
	8.6 Coronavirus (COVID-19) disease diagnosis
		8.6.1 Goals
		8.6.2 Introduction and task definition
		8.6.3 Implementation
			8.6.3.1 Importing required libraries
			8.6.3.2 Plotting some instances of the dataset
			8.6.3.3 Defining the model
			8.6.3.4 Discussing the relevance of deep learning for small-data problems
			8.6.3.5 Predicting covid-19
		8.6.4 Conclusion
	8.7 Predicting breast cancer
		8.7.1 Goals
		8.7.2 Introduction and task definition
		8.7.3 Implementation
			8.7.3.1 Importing required libraries
			8.7.3.2 Looking for all available directories in Kaggle account
			8.7.3.3 Plotting images using cv2 module
			8.7.3.4 Finding specific pattern in the name of images
			8.7.3.5 Preprocessing data
			8.7.3.6 Dealing with imbalanced data
			8.7.3.7 Defining the sequential model
		8.7.4 Conclusion
	8.8 Diabetic retinopathy detection
		8.8.1 Goals
		8.8.2 Introduction and task definition
		8.8.3 Implementation
			8.8.3.1 Importing required libraries and reading the data
			8.8.3.2 Preprocessing data
			8.8.3.3 Defining model based on functional API
			8.8.3.4 Defining another model using ResNet50 model
		8.8.4 Conclusion
	8.9 Summary
9 Popular deep learning image classifiers
	9.1 Introduction
	9.2 LeNet-5
	9.3 AlexNet
	9.4 ZFNet
	9.5 VGGNet
	9.6 GoogLeNet/inception
	9.7 ResNet
	9.8 DenseNet
	9.9 SE-Net
	9.10 Summary
10 Electrocardiogram (ECG) arrhythmia classification
	10.1 Introduction
	10.2 MIT-BIH arrhythmia database
	10.3 Preprocessing
	10.4 Data augmentation
	10.5 Architecture of the CNN model
	10.6 Summary
11 Autoencoders and deep generative models in bioinformatics
	11.1 Introduction
	11.2 Autoencoders
		11.2.1 Encoder
		11.2.2 Decoder
		11.2.3 Distance function
	11.3 Variant types of autoencoders
		11.3.1 Undercomplete autoencoders
		11.3.2 Deep autoencoders
		11.3.3 Convolutional autoencoders
		11.3.4 Sparse autoencoders
		11.3.5 Denoising autoencoders
		11.3.6 Variational autoencoders
			Intuition
			VAE is a generative model
			How does a variational autoencoder work?
			Creating decoder
			Building the architecture of the VAE: connecting the encoder and decoder
			Defining loss function and compiling model
		11.3.7 Contractive autoencoders
	11.4 An example of denoising autoencoders – bone suppression in chest radiographs
		11.4.1 Architecture
	11.5 Implementation of autoencoders for chest X-ray images (pneumonia)
		11.5.1 Undercompleted autoencoder
		11.5.2 Sparse autoencoder
		11.5.3 Denoising autoencoder
		11.5.4 Variational autoencoder
		11.5.5 Contractive autoencoder
	11.6 Generative adversarial network
		11.6.1 GAN network architecture
		11.6.2 GAN network cost function
		11.6.3 Cost function optimization process in GAN
		11.6.4 General GAN training process
	11.7 Convolutional generative adversarial network
		11.7.1 Deconvolution layer
		11.7.2 DCGAN network structure
	11.8 Summary
12 Recurrent neural networks: generating new molecules and proteins sequence classification
	12.1 Introduction
	12.2 Types of recurrent neural network
	12.3 The problem, short-term memory
	12.4 Bidirectional LSTM
	12.5 Generating new molecules
		12.5.1 Simplified molecular-input line-entry system
		12.5.2 A generative model for molecules
		12.5.3 Generating new SMILES
		12.5.4 Analyzing the generative model's output
	12.6 Protein sequence classification
		12.6.1 Protein structure
		12.6.2 Protein function
		12.6.3 Prediction of protein function
		12.6.4 LSTM with dropout
		12.6.5 LSTM with bidirectional and CNN
	12.7 Summary
13 Application, challenge, and suggestion
	13.1 Introduction
	13.2 Legendary deep learning architectures, CNN, and RNN
	13.3 Deep learning applications in bioinformatics
	13.4 Biological networks
		13.4.1 Learning tasks on graphs
		13.4.2 Graph neural networks
	13.5 Perspectives, limitations, and suggestions
	13.6 DeepChem, a powerful library for bioinformatics
	13.7 Summary
Index
Back Cover