nips2006-13-reference knowledge graph by maker-knowledge-mining
Author: Lin Wu, Pierre F. Baldi
Abstract: Go is an ancient board game that poses unique opportunities and challenges for AI and machine learning. Here we develop a machine learning approach to Go, and to related board games, focusing primarily on the problem of learning a good evaluation function in a scalable way. Scalability is essential at multiple levels, from the library of local tactical patterns, to the integration of patterns across the board, to the size of the board itself. The system we propose is capable of automatically learning the propensity of local patterns from a library of games. Propensity and other local tactical information are fed into a recursive neural network, derived from a Bayesian network architecture. The network integrates local information across the board and produces local outputs that represent local territory ownership probabilities. The aggregation of these probabilities provides an effective strategic evaluation function that is an estimate of the expected area at the end (or at other stages) of the game. Local area targets for training can be derived from datasets of human games. A system trained using only 9 × 9 amateur game data performs surprisingly well on a test set derived from 19 × 19 professional game data. Possible directions for further improvement are briefly discussed.
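The abstract's evaluation function aggregates per-intersection territory-ownership probabilities into an expected-area estimate. A minimal sketch of that aggregation step is given below; the function name, array shapes, and the uniform toy probabilities are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def expected_area(ownership_probs: np.ndarray) -> float:
    """Aggregate per-intersection ownership probabilities into an
    expected-area evaluation (hypothetical helper, for illustration).

    ownership_probs[i, j] is the estimated probability that one player
    (say Black) owns intersection (i, j) at the end of the game; by
    linearity of expectation, the sum over all intersections is that
    player's expected final area.
    """
    return float(ownership_probs.sum())

# Toy 9 x 9 position: if every intersection is a coin flip (p = 0.5),
# the expected area is 81 * 0.5 = 40.5 points.
probs = np.full((9, 9), 0.5)
print(expected_area(probs))  # 40.5
```

In the paper's setting, these probabilities would come from the recursive neural network's local outputs rather than being set by hand; the sketch only shows why summing them yields an expected-area score.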
[1] P. Baldi and G. Pollastri. The principled design of large-scale recursive neural network architectures–DAG-RNNs and the protein structure prediction problem. Journal of Machine Learning Research, 4:575–602, 2003.
[2] E. Berlekamp and D. Wolfe. Mathematical Go: Chilling Gets the Last Point. A K Peters, Wellesley, MA, 1994.
[3] B. Brügmann. Monte Carlo Go. 1993. URL: ftp://www.joy.ne.jp/welcome/igs/Go/computer/mcgo.tex.Z.
[4] Zhixing Chen. Semi-empirical quantitative theory of Go part 1: Estimation of the influence of a wall. ICGA Journal, 25(4):211–218, 2002.
[5] W. S. Cobb. The Book of Go. Sterling Publishing Co., New York, NY, 2002.
[6] K. Iwamoto. Go for Beginners. Pantheon Books, New York, NY, 1972.
[7] Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin. Exploiting graph properties of game trees. In 13th National Conference on Artificial Intelligence (AAAI’96), pages 234–239. 1996.
[8] G. Pollastri and P. Baldi. Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics, 18:S62–S70, 2002.
[9] Liva Ralaivola, Lin Wu, and Pierre Baldi. SVM and pattern-enriched common fate graphs for the game of Go. In Proceedings of ESANN 2005, pages 485–490, 2005.
[10] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2nd edition, 2002.
[11] N. N. Schraudolph, P. Dayan, and T. J. Sejnowski. Temporal difference learning of position evaluation in the game of Go. In Advances in Neural Information Processing Systems 6, pages 817–824. 1994.
[12] David H. Stern, Thore Graepel, and David J. C. MacKay. Modelling uncertainty in the game of Go. In Advances in Neural Information Processing Systems 17, pages 1353–1360. 2005.
[13] E. van der Werf, H. J. van den Herik, and J. Uiterwijk. Learning to score final positions in the game of Go. In Advances in Computer Games: Many Games, Many Challenges, pages 143–158. 2003.
[14] Albert L. Zobrist. A new hashing method with application for game playing. Technical Report 88, University of Wisconsin, April 1970. Reprinted in ICCA Journal, 13(2):69–73, 1990.