Practical Deep Reinforcement Learning with Python: Concise Implementation of Algorithms, Simplified Maths, and Effective Use of TensorFlow and PyTorch
Introducing Practical Smart Agents Development using Python, PyTorch, and TensorFlow
- Exposure to well-known RL techniques, including Monte Carlo, Deep Q-Learning, Policy Gradient, and Actor-Critic.
- Hands-on experience with TensorFlow and PyTorch on Reinforcement Learning projects.
- Everything is concise, up-to-date, and visually explained with simplified mathematics.
Reinforcement learning is a fascinating branch of AI that differs from standard machine learning in a key way: the agent must adapt and learn in an unpredictable environment. Reinforcement learning now has numerous real-world applications, including medicine, game playing, imitation of human behavior, and robotics.
This book introduces readers to reinforcement learning from a pragmatic point of view. The book does involve mathematics, but it does not attempt to overburden readers who are new to the field.
The book brings many practical methods to the reader's attention, including Monte Carlo, Deep Q-Learning, Policy Gradient, and Actor-Critic methods. Alongside detailed explanations of these techniques, the book provides working implementations of each using TensorFlow and PyTorch. It also covers several enticing projects that demonstrate the power of reinforcement learning, and everything is concise, up-to-date, and visually explained.
After finishing this book, readers will have a thorough, intuitive understanding of modern reinforcement learning and its applications, giving them a strong foundation for exploring the field further.
What you will learn
- Familiarize yourself with the fundamentals of Reinforcement Learning and Deep Reinforcement Learning.
- Make use of Python and Gym framework to model an external environment.
- Apply classical Q-learning, Monte Carlo, Policy Gradient, and Thompson sampling techniques.
- Explore TensorFlow and PyTorch to practice the fundamentals of deep reinforcement learning.
- Design a smart agent for a particular problem using a specific technique.
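To give a flavor of the techniques listed above, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, one of the problems the book covers. It is written in plain Python; the arm reward probabilities, step count, and epsilon value below are illustrative choices, not taken from the book:

```python
import random

def epsilon_greedy_bandit(true_means, steps=10_000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    true_means: hypothetical per-arm probabilities of a reward of 1.
    Returns the estimated value of each arm and the most-pulled arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental mean update: V <- V + (r - V) / n
        values[arm] += (reward - values[arm]) / counts[arm]

    best = max(range(n_arms), key=lambda a: counts[a])
    return values, best

values, best = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With enough steps, the agent's value estimates approach the true arm means and the most-pulled arm is the one with the highest true mean; the book's own treatment also compares this policy against Thompson sampling.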
Who this book is for
This book is for machine learning engineers, deep learning fanatics, AI software developers, data scientists, and other data professionals eager to learn and apply Reinforcement Learning to ongoing projects. No specialized knowledge of machine learning is necessary; however, proficiency in Python is desired.
Table of Contents

Front matter: Cover Page, Title Page, Copyright Page, Dedication Page, About the Author, About the Reviewer, Acknowledgement, Preface, Errata, Table of Contents

(Each chapter also includes Structure, Objectives, Conclusion, Points to Remember, Multiple Choice Questions, Answers, and Key Terms sections.)

Part I
1. Introducing Reinforcement Learning
   - What is reinforcement learning? Reinforcement learning mechanics; reinforcement learning vs. supervised learning; examples of reinforcement learning: stock trading, chess, Neural Architecture Search (NAS)
2. Playing Monopoly and Markov Decision Process
   - Choosing the best strategy for playing Monopoly; list of rules; Markov chain; Markov reward process; Markov decision process (state, probability, reward, actions, policy); examples: Blackjack, stock trading, video games; Monopoly as a Markov decision process
3. Training in Gym
   - Why do we need Gym? Installation; CartPole environment; interacting with Gym (list of environments, environment initialization, reproducible script, action space, reset, render, sending actions, closing the environment); Gym environments: Lunar Lander, Mountain Car, Phoenix; custom environment (initialization, step, reset, render); custom environment with PyGame
4. Struggling with Multi-Armed Bandits
   - Gambling with multi-armed bandits; online advertising; clinical trials; emulating multi-armed bandits in Gym; epsilon-greedy policy; Thompson sampling policy; visualization; epsilon-greedy versus Thompson sampling; exploration versus exploitation
5. Blackjack in Monte Carlo
   - Blackjack as a reinforcement learning problem; Q(s, a), the action-value function; the Monte Carlo method; Monte Carlo policy; exploration and greedy-policy exploitation; optimal policy for unbalanced Blackjack
6. Escaping Maze with Q-Learning
   - Maze; Q-learning; solving the maze problem; Q-learning vs. the Monte Carlo method; dense vs. sparse rewards
7. Discretization
   - Discretization of continuous variables; discretization of the Mountain Car state space; decayed epsilon-greedy policy; discrete Q-learning agent; applying the discrete Q-learning agent to the Mountain Car problem (training, testing, running live); coarse versus fine discretization; the Q-learning alpha parameter; hyperparameters in reinforcement learning; from the limits of discretization to deep reinforcement learning

Part II: Deep Reinforcement Learning
8. TensorFlow, PyTorch, and Your First Neural Network
   - Installation; derivative calculators; deep learning basics; tensors (creation, random tensors, reproducibility, common types, methods and attributes, math functions); deep learning layers (linear, convolution, pooling, dropout, flatten); activations (ReLU, sigmoid, tanh, softmax); neural network architecture; supervised learning and loss functions (classification, regression); training and optimizers; epoch and batch size; handwritten digit recognition; model
9. Deep Q-Network and Lunar Lander
   - Neural networks in reinforcement learning; convergence of temporal difference and the DQN training loss function; replay buffer; DQN implementation; Lunar Lander with a DQN agent (states, actions, environment, DQN application)
10. Defending Atlantis with Double Deep Q-Network
   - Atlantis gameplay; Atlantis environment; capturing motion; convolutional Q-network; Double Deep Q-Network; defending Atlantis using DDQN
11. From Q-Learning to Policy-Gradient
   - Stochastic policy; stochastic vs. deterministic policy; parametric policy; neural network as a parametric stochastic policy; the Policy Gradient method; Policy Gradient implementation; solving the CartPole problem
12. Stock Trading with Actor-Critic
   - Policy gradient training drawbacks; Actor-Critic theory; A2C implementation; A2C vs. policy gradient; the stock trading problem (environment, solution)
13. What Is Next?
   - Reinforcement learning overview; reread; deep learning; practice

Index