Kevin Ferguson, co-author of Deep Learning and the Game of Go, was our latest Data Speaker Series guest. He talked about how AlphaGo Zero combines tree search and reinforcement learning in a novel way.
For the purposes of the talk, Kevin assumed that the audience was already familiar with the basic concepts of machine learning but had no prior knowledge of game artificial intelligence.
In this clip, Kevin uses tic-tac-toe to introduce the concept of a game tree, and talks about how quickly the tree grows for games like chess and Go. He goes on to discuss ways to bring this complexity down to a level where computation becomes tractable.
Reading out a Go game tree is like if you filled up an entire galaxy with chessboards and tried to read out every variation on every chessboard in the galaxy.
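To make the idea of a game tree concrete, here is a minimal Python sketch that reads out the full tic-tac-toe tree by brute force. It is not code from the talk; the board representation and the names `winner` and `count_tree` are assumptions made for illustration.

```python
# Enumerate the complete tic-tac-toe game tree by brute force.
# The board is a list of 9 cells holding "X", "O", or None.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def count_tree(board, player):
    """Return (positions, finished_games) for the subtree rooted at board."""
    if winner(board) is not None or all(cell is not None for cell in board):
        return 1, 1  # a leaf: the game is over
    positions, games = 1, 0
    other = "O" if player == "X" else "X"
    for i in range(9):
        if board[i] is None:
            board[i] = player          # try the move
            p, g = count_tree(board, other)
            board[i] = None            # undo it
            positions += p
            games += g
    return positions, games


positions, games = count_tree([None] * 9, "X")
print(positions, games)
```

This visits 549,946 positions and finds 255,168 distinct complete games, which a laptop handles in a few seconds. The same exhaustive approach breaks down for Go, where the first move alone has 361 options on a 19x19 board instead of 9.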
In the next clip, Kevin describes how two functions make it feasible to program a computer to play Go: one narrows the width of the game tree, and the other shrinks its height. The clip starts by introducing some terminology, then describes how AlphaZero, a more general system that can also be trained on other games such as chess and shogi, uses reinforcement learning to learn both functions.
Once you can complete a task at all, reinforcement learning provides you with a way to get better at it, just by repeating it.
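The two-function idea can be sketched on a toy game. The example below is an illustration under stated assumptions, not AlphaZero itself: it uses a simple stone-taking game (take 1 to 3 stones, whoever takes the last stone wins), and `policy` and `value` are hypothetical placeholders standing in for the learned functions. The `width` parameter keeps only the moves the policy ranks highest, and the `depth` parameter stops the search early and asks the value function for an estimate.

```python
def legal_moves(stones):
    """A player may take 1, 2, or 3 stones, but no more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def policy(stones):
    """Placeholder for a learned policy: a uniform prior over legal moves."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}


def value(stones):
    """Placeholder for a learned value function.

    Returns an estimate in [-1, 1] from the point of view of the player
    to move. A trained model would return something more informative.
    """
    return 0.0


def search(stones, depth, width):
    """Return (score, best_move) for the player to move."""
    if stones == 0:
        return -1.0, None          # the opponent took the last stone
    if depth == 0:
        return value(stones), None  # cut the tree's height
    priors = policy(stones)
    # Cut the tree's width: keep only the top-ranked candidate moves.
    candidates = sorted(priors, key=priors.get, reverse=True)[:width]
    best_score, best_move = -float("inf"), None
    for move in candidates:
        child_score, _ = search(stones - move, depth - 1, width)
        score = -child_score       # what is good for the opponent is bad for us
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move


print(search(5, depth=3, width=3))
```

With `width` candidate moves per position and `depth` moves of lookahead, the search examines on the order of `width ** depth` positions, however large the full tree is. The quality of the result then depends on how good the two functions are, which is where reinforcement learning comes in.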
Next, Kevin walks through the unique form of tree search that AlphaZero uses, outlining the process by which AlphaZero collects experience.
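This post does not spell out the selection rule used in the search, so the sketch below follows the rule described in the published AlphaGo Zero work: each candidate move is scored by its average value so far plus an exploration bonus that is proportional to the policy's prior and shrinks as the move is visited more often. The data layout, the function names, and the constant `c_puct=1.5` are assumptions for illustration.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Average value so far plus a prior-weighted exploration bonus."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def select_move(children, c_puct=1.5):
    """Pick the move with the highest score.

    children maps each move to a dict with keys "prior" (float),
    "visits" (int), and "value_sum" (float). This layout is hypothetical.
    """
    parent_visits = sum(ch["visits"] for ch in children.values())

    def score(move):
        ch = children[move]
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        return puct_score(q, ch["prior"], parent_visits, ch["visits"], c_puct)

    return max(children, key=score)


children = {
    "a": {"prior": 0.6, "visits": 10, "value_sum": 2.0},
    "b": {"prior": 0.3, "visits": 0, "value_sum": 0.0},
    "c": {"prior": 0.1, "visits": 0, "value_sum": 0.0},
}
print(select_move(children))
```

In this example move `a` has the highest prior but has already been visited ten times, so the exploration bonus steers the next visit toward the unvisited move `b`. Repeating this selection many times is what builds up the visit counts that the search uses as its experience.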
He goes on to describe the training and evaluation process, and closes with some resources for learning more about AlphaZero, encouraging us to learn to play Go!
- Kevin on GitHub: https://github.com/macfergus
- Leela Zero: http://zero.sjeng.org/
- Minigo: https://github.com/tensorflow/minigo
- Kevin’s book: https://maxpumperla.github.io/deep_learning_and_the_game_of_go/
- Play Go: https://online-go.com/learn-to-play-go
Kevin even gave us a discount code for those interested in purchasing his book from Manning! Use the code Automatticdl and get 40% off of any of their products in any format.