Deep Learning and the Game of Go

Length: 384 pages
Edition: 1
Language: English
Publisher: Manning
Publication Date: 2019-01-25
ISBN-10: 1617295329
ISBN-13: 9781617295324
Sales Rank: #953613

Summary

Deep Learning and the Game of Go teaches you how to apply the power of deep learning to complex reasoning tasks by building a Go-playing AI. After exposing you to the foundations of machine and deep learning, you’ll use Python to build a bot and then teach it the rules of the game.

Foreword by Thore Graepel, DeepMind

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

The ancient strategy game of Go is an incredible case study for AI. In 2016, a deep learning-based system shocked the Go world by defeating a world champion. Shortly after that, the upgraded AlphaGo Zero crushed the original bot by using deep reinforcement learning to master the game. Now, you can learn those same deep learning techniques by building your own Go bot!

About the Book

Deep Learning and the Game of Go introduces deep learning by teaching you to build a Go-winning bot. As you progress, you’ll apply increasingly complex training techniques and strategies using the Python deep learning library Keras. You’ll enjoy watching your bot master the game of Go, and along the way, you’ll discover how to apply your new deep learning skills to a wide range of other scenarios!

What’s inside

Build and teach a self-improving game AI
Enhance classical game AI systems with deep learning
Implement neural networks for deep learning

About the Reader

All you need are basic Python skills and high school-level math. No deep learning experience required.

About the Authors

Max Pumperla and Kevin Ferguson are experienced deep learning specialists skilled in distributed systems and data science. Together, Max and Kevin built the open source bot BetaGo.

Deep Learning and the Game of Go
brief contents
contents
foreword
preface
acknowledgments
about this book
	Who should read this book
	Roadmap
		Tree search
		Neural networks
		Reinforcement learning
	About the code
	Book forum
about the authors
about the cover illustration
Part 1: Foundations
	Chapter 1: Toward deep learning: a machine-learning introduction
		1.1 What is machine learning?
			1.1.1 How does machine learning relate to AI?
			1.1.2 What you can and can’t do with machine learning
		1.2 Machine learning by example
			1.2.1 Using machine learning in software applications
			1.2.2 Supervised learning
			1.2.3 Unsupervised learning
			1.2.4 Reinforcement learning
		1.3 Deep learning
		1.4 What you’ll learn in this book
		1.5 Summary
	Chapter 2: Go as a machine-learning problem
		2.1 Why games?
		2.2 A lightning introduction to the game of Go
			2.2.1 Understanding the board
			2.2.2 Placing and capturing stones
			2.2.3 Ending the game and counting
			2.2.4 Understanding ko
		2.3 Handicaps
		2.4 Where to learn more
		2.5 What can we teach a machine?
			2.5.1 Selecting moves in the opening
			2.5.2 Searching game states
			2.5.3 Reducing the number of moves to consider
			2.5.4 Evaluating game states
		2.6 How to measure your Go AI’s strength
			2.6.1 Traditional Go ranks
			2.6.2 Benchmarking your Go AI
		2.7 Summary
	Chapter 3: Implementing your first Go bot
		3.1 Representing a game of Go in Python
			3.1.1 Implementing the Go board
			3.1.2 Tracking connected groups of stones in Go: strings
			3.1.3 Placing and capturing stones on a Go board
		3.2 Capturing game state and checking for illegal moves
			3.2.1 Self-capture
			3.2.2 Ko
		3.3 Ending a game
		3.4 Creating your first bot: the weakest Go AI imaginable
		3.5 Speeding up game play with Zobrist hashing
		3.6 Playing against your bot
		3.7 Summary
Part 2: Machine learning and game AI
	Chapter 4: Playing games with tree search
		4.1 Classifying games
		4.2 Anticipating your opponent with minimax search
		4.3 Solving tic-tac-toe: a minimax example
		4.4 Reducing search space with pruning
			4.4.1 Reducing search depth with position evaluation
			4.4.2 Reducing search width with alpha-beta pruning
		4.5 Evaluating game states with Monte Carlo tree search
			4.5.1 Implementing Monte Carlo tree search in Python
			4.5.2 How to select which branch to explore
			4.5.3 Applying Monte Carlo tree search to Go
		4.6 Summary
	Chapter 5: Getting started with neural networks
		5.1 A simple use case: classifying handwritten digits
			5.1.1 The MNIST data set of handwritten digits
			5.1.2 MNIST data preprocessing
		5.2 The basics of neural networks
			5.2.1 Logistic regression as simple artificial neural network
			5.2.2 Networks with more than one output dimension
		5.3 Feed-forward networks
		5.4 How good are our predictions? Loss functions and optimization
			5.4.1 What is a loss function?
			5.4.2 Mean squared error
			5.4.3 Finding minima in loss functions
			5.4.4 Gradient descent to find minima
			5.4.5 Stochastic gradient descent for loss functions
			5.4.6 Propagating gradients back through your network
		5.5 Training a neural network step-by-step in Python
			5.5.1 Neural network layers in Python
			5.5.2 Activation layers in neural networks
			5.5.3 Dense layers in Python as building blocks for feed-forward networks
			5.5.4 Sequential neural networks with Python
			5.5.5 Applying your network to handwritten digit classification
		5.6 Summary
	Chapter 6: Designing a neural network for Go data
		6.1 Encoding a Go game position for neural networks
		6.2 Generating tree-search games as network training data
		6.3 Using the Keras deep-learning library
			6.3.1 Understanding Keras design principles
			6.3.2 Installing the Keras deep-learning library
			6.3.3 Running a familiar first example with Keras
			6.3.4 Go move prediction with feed-forward neural networks in Keras
		6.4 Analyzing space with convolutional networks
			6.4.1 What convolutions do intuitively
			6.4.2 Building convolutional neural networks with Keras
			6.4.3 Reducing space with pooling layers
		6.5 Predicting Go move probabilities
			6.5.1 Using the softmax activation function in the last layer
			6.5.2 Cross-entropy loss for classification problems
		6.6 Building deeper networks with dropout and rectified linear units
			6.6.1 Dropping neurons for regularization
			6.6.2 The rectified linear unit activation function
		6.7 Putting it all together for a stronger Go move-prediction network
		6.8 Summary
	Chapter 7: Learning from data: a deep-learning bot
		7.1 Importing Go game records
			7.1.1 The SGF file format
			7.1.2 Downloading and replaying Go game records from KGS
		7.2 Preparing Go data for deep learning
			7.2.1 Replaying a Go game from an SGF record
			7.2.2 Building a Go data processor
			7.2.3 Building a Go data generator to load data efficiently
			7.2.4 Parallel Go data processing and generators
		7.3 Training a deep-learning model on human game-play data
		7.4 Building more-realistic Go data encoders
		7.5 Training efficiently with adaptive gradients
			7.5.1 Decay and momentum in SGD
			7.5.2 Optimizing neural networks with Adagrad
			7.5.3 Refining adaptive gradients with Adadelta
		7.6 Running your own experiments and evaluating performance
			7.6.1 A guideline to testing architectures and hyperparameters
			7.6.2 Evaluating performance metrics for training and test data
		7.7 Summary
	Chapter 8: Deploying bots in the wild
		8.1 Creating a move-prediction agent from a deep neural network
		8.2 Serving your Go bot to a web frontend
			8.2.1 An end-to-end Go bot example
		8.3 Training and deploying a Go bot in the cloud
		8.4 Talking to other bots: the Go Text Protocol
		8.5 Competing against other bots locally
			8.5.1 When a bot should pass or resign
			8.5.2 Let your bot play against other Go programs
		8.6 Deploying a Go bot to an online Go server
			8.6.1 Registering a bot at the Online Go Server
		8.7 Summary
	Chapter 9: Learning by practice: reinforcement learning
		9.1 The reinforcement-learning cycle
		9.2 What goes into experience?
		9.3 Building an agent that can learn
			9.3.1 Sampling from a probability distribution
			9.3.2 Clipping a probability distribution
			9.3.3 Initializing an agent
			9.3.4 Loading and saving your agent from disk
			9.3.5 Implementing move selection
		9.4 Self-play: how a computer program practices
			9.4.1 Representing experience data
			9.4.2 Simulating games
		9.5 Summary
	Chapter 10: Reinforcement learning with policy gradients
		10.1 How random games can identify good decisions
		10.2 Modifying neural network policies with gradient descent
		10.3 Tips for training with self-play
			10.3.1 Evaluating your progress
			10.3.2 Measuring small differences in strength
			10.3.3 Tuning a stochastic gradient descent (SGD) optimizer
		10.4 Summary
	Chapter 11: Reinforcement learning with value methods
		11.1 Playing games with Q-learning
		11.2 Q-learning with Keras
			11.2.1 Building two-input networks in Keras
			11.2.2 Implementing the ε-greedy policy with Keras
			11.2.3 Training an action-value function
		11.3 Summary
	Chapter 12: Reinforcement learning with actor-critic methods
		12.1 Advantage tells you which decisions are important
			12.1.1 What is advantage?
			12.1.2 Calculating advantage during self-play
		12.2 Designing a neural network for actor-critic learning
		12.3 Playing games with an actor-critic agent
		12.4 Training an actor-critic agent from experience data
		12.5 Summary
Part 3: Greater than the sum of its parts
	Chapter 13: AlphaGo: Bringing it all together
		13.1 Training deep neural networks for AlphaGo
			13.1.1 Network architectures in AlphaGo
			13.1.2 The AlphaGo board encoder
			13.1.3 Training AlphaGo-style policy networks
		13.2 Bootstrapping self-play from policy networks
		13.3 Deriving a value network from self-play data
		13.4 Better search with policy and value networks
			13.4.1 Using neural networks to improve Monte Carlo rollouts
			13.4.2 Tree search with a combined value function
			13.4.3 Implementing AlphaGo’s search algorithm
		13.5 Practical considerations for training your own AlphaGo
		13.6 Summary
	Chapter 14: AlphaGo Zero: Integrating tree search with reinforcement learning
		14.1 Building a neural network for tree search
		14.2 Guiding tree search with a neural network
			14.2.1 Walking down the tree
			14.2.2 Expanding the tree
			14.2.3 Selecting a move
		14.3 Training
		14.4 Improving exploration with Dirichlet noise
		14.5 Modern techniques for deeper neural networks
			14.5.1 Batch normalization
			14.5.2 Residual networks
		14.6 Exploring additional resources
		14.7 Wrapping up
		14.8 Summary
Appendix A: Mathematical foundations
	Vectors, matrices, and beyond: a linear algebra primer
		Vectors: one-dimensional data
		Matrices: two-dimensional data
		Rank 3 tensors
		Rank 4 tensors
	Calculus in five minutes: derivatives and finding maxima
Appendix B: The backpropagation algorithm
	A bit of notation
	The backpropagation algorithm for feed-forward networks
	Backpropagation for sequential neural networks
	Backpropagation for neural networks in general
	Computational challenges with backpropagation
Appendix C: Go programs and servers
	Go programs
		GNU Go
		Pachi
	Go servers
		OGS
		IGS
		Tygem
Appendix D: Training and deploying bots by using Amazon Web Services
	Model training on AWS
	Hosting a bot on AWS over HTTP
Appendix E: Submitting a bot to the Online Go Server
	Registering and activating your bot at OGS
	Testing your OGS bot locally
	Deploying your OGS bot on AWS
index