Google DeepMind's AlphaGo: operations research's unheralded role in the path-breaking achievement.

From: OR/MS Today (Vol. 43, Issue 5)
Publisher: Institute for Operations Research and the Management Sciences
Document Type: Article
Length: 3,238 words
Lexile Measure: 1650L

Article Preview:

AlphaGo's shockingly dominant victory over reigning world Go champion Lee Sedol in Seoul, Korea, in March 2016 signaled another great leap in the seemingly relentless advance of machines toward true "intelligence," in the sense of being able to learn and to outsmart humans. When IBM's Deep Blue defeated world chess champion Garry Kasparov in 1997, it was thought at the time to mark the ultimate pinnacle of computer game playing: chess had long been considered (at least in the Western world) the game demanding the most human brainpower, so a computer finally surpassing humankind's best seemed to indicate that feats previously found only in science fiction might at last be approaching reality.

However, the "primitive brute force-based" [1] Deep Blue appeared to be more a showcase of hardware than of software, since it relied on raw computational firepower rather than on intuition or any actual learning in its approach to the game. For the game of Go, such an approach is simply infeasible for any foreseeable future: the much larger board (19x19 vs. 8x8 for chess) produces a mind-boggling number of possible configurations. A paradigm shift was therefore required, and at the heart of every successful Go-playing computer program of the past decade has been intelligent sampling (Monte Carlo simulation) rather than "intelligent" enumeration, which essentially means storing an immense library of games and enumerating more cleverly (e.g., by pruning). What all of these game-playing programs do have in common is the concept of "learning," using algorithms under the general umbrella of machine-learning techniques (a catchall that now appears to have absorbed concepts such as branching, bounding, and pruning).
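To make the sampling-versus-enumeration contrast concrete, here is a minimal sketch of Monte Carlo position evaluation in Python. The generic game interface (legal_moves, apply_move, is_terminal, result) is a hypothetical placeholder for illustration, not code from AlphaGo or any actual Go program: instead of enumerating continuations, the value of a position is estimated by averaging the outcomes of many random playouts.

```python
import random

def random_playout(state, legal_moves, apply_move, is_terminal, result):
    # Play uniformly random moves from `state` until the game ends,
    # then return the outcome (e.g., 1 for a win, 0 for a loss).
    while not is_terminal(state):
        state = apply_move(state, random.choice(legal_moves(state)))
    return result(state)

def monte_carlo_value(state, legal_moves, apply_move, is_terminal, result,
                      num_playouts=1000):
    # Estimate the value of `state` as the average outcome of many
    # random playouts: sampling in place of exhaustive enumeration.
    total = sum(random_playout(state, legal_moves, apply_move,
                               is_terminal, result)
                for _ in range(num_playouts))
    return total / num_playouts
```

The estimate sharpens as num_playouts grows, and crucially the cost scales with the number of samples rather than with the astronomical size of the game tree.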

According to a Google blog posting [2]:

"The game of Go originated in China more than 2,500 years ago. The rules of the game are simple: Players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory. As simple as the rules are, Go is a game of profound complexity. There are more possible positions in Go than there are atoms in the universe. That makes Go a googol times more complex than chess.

"The objective of the game--as the translation of its name (weiqi in Mandarin) implies--is to have surrounded a larger total area of the board with one's stones than the opponent by the end of the game."

This article covers several aspects of AlphaGo's success. We begin with an introduction to DeepMind and a brief, high-level description of the main parts of the AlphaGo program: specifically, the two "deep" neural networks and the technique of Monte Carlo tree search with upper confidence bounds (UCBs) that is used in all modern computer Go-playing programs (as well as in other computer game-playing programs). The main ideas of the latter can be found in a paper that...
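As a rough illustration of the upper-confidence-bound rule that drives Monte Carlo tree search, here is a minimal sketch in Python. The (wins, visits) statistics interface and the function names are illustrative assumptions, not code from AlphaGo or from the paper:

```python
import math

def ucb1(wins, visits, parent_visits, c=1.414):
    # UCB1 score: observed win rate plus an exploration bonus that
    # shrinks as this move is sampled more often.
    if visits == 0:
        return float("inf")  # always try an unvisited move first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_move(stats):
    # stats: list of (wins, visits) pairs, one per candidate move.
    # Returns the index of the move with the highest UCB1 score.
    parent_visits = sum(visits for _, visits in stats)
    return max(range(len(stats)),
               key=lambda i: ucb1(*stats[i], parent_visits))
```

The constant c trades off exploiting the move with the best observed win rate against sampling less-visited alternatives; the common UCT choice sets it near sqrt(2).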

Source Citation

Gale Document Number: GALE|A471000820