Neurogammon

Neurogammon is a computer backgammon program written by Gerald Tesauro at IBM's Thomas J. Watson Research Center. It was the first viable computer backgammon program implemented as a neural net, and it set a new standard in computer backgammon play. It won the backgammon tournament at the 1st Computer Olympiad in London in 1989, handily defeating all opponents.[1] It played at the level of an intermediate human player.[2]

Neurogammon contains seven separate neural networks, each with a single hidden layer. One network makes doubling-cube decisions; the other six choose moves at different stages of the game. The networks were trained by backpropagation on transcripts of 400 games that Tesauro played against himself. In each position, the move Tesauro chose was presented to the network as the best move.
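The arrangement described above can be illustrated with a minimal sketch. It is not Tesauro's code and does not reproduce his input encoding or exact training procedure. The assumptions are: positions arrive as plain feature vectors, each network outputs a single score, the expert's move is trained toward 1 and the alternatives toward 0, and the stage labels (stage_0 to stage_5) are hypothetical placeholders, since the text does not name the stages. All class and function names are invented for illustration.

```python
import math
import random


class OneHiddenLayerNet:
    """Tiny network: inputs -> sigmoid hidden layer -> one sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row holds n_in weights plus a bias as its last element.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        # Output unit: n_hidden weights plus a bias as the last element.
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def forward(self, x):
        """Return (hidden activations, output score) for feature vector x."""
        hidden = [self._sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w1]
        out = self._sigmoid(
            sum(w * h for w, h in zip(self.w2, hidden)) + self.w2[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One backpropagation step on squared error for a single example."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * self.w2[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.w2[-1] -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]
        return 0.5 * (out - target) ** 2


# Hypothetical labels: the article says six networks cover different
# stages of the game but does not name those stages.
STAGES = ["stage_%d" % i for i in range(6)]


def build_networks(n_in, n_hidden):
    """Six move networks plus one doubling-cube network, seven in total."""
    nets = {stage: OneHiddenLayerNet(n_in, n_hidden, seed=i)
            for i, stage in enumerate(STAGES)}
    nets["doubling"] = OneHiddenLayerNet(n_in, n_hidden, seed=len(STAGES))
    return nets


def choose_move(net, candidates):
    """Return the index of the candidate the network scores highest."""
    scores = [net.forward(x)[1] for x in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def train_on_expert_choice(net, candidates, expert_index, epochs=500):
    """Teach the expert's move as best (assumed targets: 1 versus 0)."""
    for _ in range(epochs):
        for k, x in enumerate(candidates):
            net.train_step(x, 1.0 if k == expert_index else 0.0)
```

In this sketch a caller would select the network for the current stage, call train_on_expert_choice once per recorded position, and later call choose_move on the feature vectors of the legal moves. Keeping a separate network per stage means each one only has to model the positions typical of that phase of the game.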

In 1992, Tesauro completed TD-Gammon, which combined a form of reinforcement learning with the human-designed input features of Neurogammon, and played at the level of a world-class human tournament player.

References

[1] Tesauro, Gerald (1989). "Neurogammon Wins Computer Olympiad" (PDF). Neural Computation. 1 (3): 321–323. doi:10.1162/neco.1989.1.3.321. Retrieved 2010-02-20.
[2] Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3). doi:10.1145/203330.203343. Retrieved 2010-02-08.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
