Deep Learning for Genomics: Data-driven approaches for genomics applications in life sciences and biotechnology
- Length: 270 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2022-11-11
- ISBN-10: 1804615447
- ISBN-13: 9781804615447
- Sales Rank: #121274
Learn the concepts, methodologies, and applications of deep learning for building predictive models from complex genomics datasets to overcome challenges in the life sciences and biotechnology industries.
Key Features
- Apply deep learning algorithms to solve real-world problems in the field of genomics
- Extract biological insights from deep learning models built from genomic datasets
- Train, tune, evaluate, deploy, and monitor deep learning models for enabling predictions in genomics
Book Description
Deep learning has shown remarkable promise in the field of genomics; however, the discipline lacks a skilled deep learning workforce. This book helps researchers and data scientists stand out and solve real-world problems in genomics by developing the necessary skill set. Starting with an introduction to the essential concepts, it highlights the power of deep learning in handling big data in genomics. First, you’ll learn about conventional genomics analysis, then transition to state-of-the-art machine learning-based genomics applications, and finally dive into deep learning approaches for genomics. The book covers the deep learning algorithms commonly used by the research community, explaining what they are, how they work, and their practical applications in genomics. It also dedicates an entire section to operationalizing deep learning models, with hands-on tutorials that show researchers and deep learning practitioners how to build, tune, interpret, deploy, evaluate, and monitor deep learning models built from large genomics datasets.
By the end of this book, you’ll have learned about the challenges, best practices, and pitfalls of deep learning for genomics.
What you will learn
- Discover the machine learning applications for genomics
- Explore deep learning concepts and methodologies for genomics applications
- Understand supervised deep learning algorithms for genomics applications
- Get to grips with unsupervised deep learning with autoencoders
- Improve deep learning models using generative models
- Operationalize deep learning models from genomics datasets
- Visualize and interpret deep learning models
- Understand deep learning challenges, pitfalls, and best practices
Who this book is for
This deep learning book is for machine learning engineers, data scientists, and academics working in the field of genomics. It assumes that readers have intermediate Python programming knowledge; basic knowledge of Python libraries such as NumPy and Pandas for manipulating and parsing data, and Matplotlib and Seaborn for visualizing data; and a grounding in genomics and genomic analysis concepts.
Table of Contents
Cover; Title Page; Copyright and Credits; Contributors; Table of Contents; Preface
Part 1 – Machine Learning in Genomics
- Introducing Machine Learning for Genomics: What is machine learning?; Why machine learning for genomics?; Machine learning for genomics in life sciences and biotechnology; Exploring machine learning software; Python programming language; Visualization; Biopython; Scikit-learn; Summary
- Genomics Data Analysis: Technical requirements; Installing Biopython; Matplotlib; What is a genome?; Genome sequencing; Sanger sequencing of nucleic acids; Evolution of next-generation sequencing; Analysis of genomic data; Steps in genomics data analysis; Introduction to Biopython for genomic data analysis; What is Biopython?; Genomic data analysis use case – Sequence analysis of Covid-19; Calculating GC content; Calculating nucleotide content; Dinucleotide content; Modeling; Motif finder; Summary
- Machine Learning Methods for Genomic Applications: Technical requirements; Python packages; ML libraries; Genomics big data; Supervised and unsupervised ML; Supervised ML; Unsupervised ML; ML for genomics; The basic workflow of ML in genomics; An ML use case for genomics – Disease prediction; Data collection; Data preprocessing; EDA; Data transformation; Data splitting; Model training; Model evaluation; ML challenges in genomics; Summary
Part 2 – Deep Learning for Genomic Applications
- Deep Learning for Genomics: Understanding what deep learning is and how it works; Neural network definition; Anatomy of deep neural networks; Key concepts of DNNs; An example of how neural networks work; DNN architectures; DNNs for genomics; Deep learning workflow for genomics; Broad application of DNNs in genomics; Protein structure predictions; Regulatory genomics; Gene regulatory networks; Single-cell RNA sequencing; Introducing deep learning algorithms and Python libraries; General deep learning libraries; Deep learning libraries for genomics; Summary
- Introducing Convolutional Neural Networks for Genomics: Introduction to CNNs; What are CNNs?; Transfer Learning; CNNs for genomics; Applications of CNNs in genomics; DeepBind; DeepInsight; DeepChrome; DeepVariant; Summary
- Recurrent Neural Networks in Genomics: What are RNNs?; Introducing RNNs; How do RNNs work?; Different RNN architectures; Bidirectional RNNs (BiLSTM); LSTMs and GRUs; Different types of RNNs; Applications and use cases of RNNs in genomics; DeepNano; ProLanGo; DanQ; Understanding RNNs through Transcription Factor Binding Site (TFBS) predictions; Summary
- Unsupervised Deep Learning with Autoencoders: What is unsupervised DL?; Types of unsupervised DL; Clustering; Anomaly detection; Association; What are autoencoders?; Properties of autoencoders; How do autoencoders work?; Architecture of autoencoders; Types of autoencoders; Autoencoders for genomics; Gene expression; Use case – Predicting gene expression from TCGA pan-cancer RNA-Seq data using denoising autoencoders; Summary
- GANs for Improving Models in Genomics: What are GANs?; Differences between Discriminative and Generative models; Intuition about GANs; How do GANs work?; Challenges working with genomics datasets; What is synthetic data?; How can GANs help improve models?; Practical applications of GANs in genomics; Analysis of ScRNA-Seq data; Generation of DNA; Using GANs for augmenting population-scale genomics data; Summary
Part 3 – Operationalizing models
- Building and Tuning Deep Learning Models: Technical requirements; DL life cycle; Data processing; Data collection; Data wrangling; Feature engineering; Developing models; Selecting an appropriate algorithm; Model training; Tuning the models; Hyperparameter tuning; Hyperparameter tuning libraries; Classification metrics or performance statistics; Visualizing performance; Regression metrics; Use case – Predicting the binding site location of the JunD TF; Framing the TFBS prediction problem in terms of DL; Processing the data; Model training; Summary
- Model Interpretability in Genomics: What is model interpretability?; Black-box model interpretability; Unlocking business value from model interpretability; Better business decisions; Building trust; Profitability; Model interpretability methods in genomics; Partial dependence plot; Individual conditional expectation; Permuted feature importance; Global surrogate; LIME; Shapley value; ExSum; Saliency map; Use case – Model interpretability for genomics; Data collection; Feature extraction; Target labels; Train-test split; Creating a CNN architecture; Summary
- Model Deployment and Monitoring: Technical requirements; Streamlit; Hugging Face; Introducing model deployment; Steps in model deployment; Types of model deployment; Deploying models as services; A use case for deploying a DL model as a web service – building a Streamlit application of the CNN model; Monitoring models using advanced tools; Why monitor models?; Reasons for model degradation; How to monitor DL models; Advanced tools for model monitoring; Addressing drifts; Summary
- Challenges, Pitfalls, and Best Practices for Deep Learning in Genomics: Deep learning challenges regarding genomics; Lack of flexible tools; Fewer biological samples; Computational resource requirements; Expertise in DL frameworks; Lack of high-quality labeled data; Lack of model interpretability; Common pitfalls for applying deep learning to genomics; Confounding; Data leakage; Imbalanced data; Improper model comparisons; Best practices for applying deep learning to genomics; Understand the problem and know your data better; A simple model for a simple problem; Establish a baseline for your model; Ensure reproducibility; Using pre-existing models for genomics; Do not reinvent the rule; Tune hyperparameters automatically; Focus on feature engineering; Normalize the data; Always perform model interpretation; Avoid overfitting; Summary
Index; About Packt; Other Books You May Enjoy
How to download source code?
1. Go to: https://github.com/PacktPublishing
2. In the "Find a repository…" box, search for the book title: Deep Learning for Genomics: Data-driven approaches for genomics applications in life sciences and biotechnology. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. Click Code to download.