165 nips-2007-Regret Minimization in Games with Incomplete Information

Source: pdf

Author: Martin Zinkevich, Michael Johanson, Michael Bowling, Carmelo Piccione

Abstract: Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold’em with as many as 10^12 states, two orders of magnitude larger than previous methods.
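The abstract's claim that regret minimization in self-play can be used to compute a Nash equilibrium can be illustrated on a much smaller problem. The sketch below is not the paper's counterfactual regret algorithm: it applies plain regret matching to a single-shot zero-sum matrix game, which has no information sets or sequential structure. The game (`PAYOFF`, a rock-paper-scissors variant in which rock beating scissors pays double) and all function names are hypothetical choices made for illustration. For this particular game, playing (1/4, 1/2, 1/4) gives every opposing action an expected payoff of zero, so the averaged strategies should approach that point.

```python
# Minimal sketch: regret matching in self-play on a zero-sum matrix game.
# Assumption: this is a toy normal-form analogue, NOT counterfactual regret
# minimization on an extensive game as described in the abstract.

# Row player's payoffs; the column player receives the negation.
# Hypothetical game: rock-paper-scissors where rock beating scissors pays 2.
PAYOFF = [
    [0, -1, 2],   # rock
    [1, 0, -1],   # paper
    [-2, 1, 0],   # scissors
]


def negated_transpose(payoff):
    """Column player's payoff matrix, indexed [column action][row action]."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    return [[-payoff[i][j] for i in range(n_rows)] for j in range(n_cols)]


def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)


def action_values(payoff, opp_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [sum(p * q for p, q in zip(row, opp_strategy)) for row in payoff]


def self_play(payoff, iterations):
    """Run regret matching for both players; return their average strategies."""
    col_payoff = negated_transpose(payoff)
    regrets = [[0.0] * len(payoff), [0.0] * len(col_payoff)]
    strategy_sums = [[0.0] * len(payoff), [0.0] * len(col_payoff)]
    for _ in range(iterations):
        row_strat = regret_matching_strategy(regrets[0])
        col_strat = regret_matching_strategy(regrets[1])
        updates = (
            (0, row_strat, action_values(payoff, col_strat)),
            (1, col_strat, action_values(col_payoff, row_strat)),
        )
        for player, strat, values in updates:
            expected = sum(s * v for s, v in zip(strat, values))
            for a, v in enumerate(values):
                # Regret: how much better action a would have done.
                regrets[player][a] += v - expected
                strategy_sums[player][a] += strat[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(payoff, row_avg, col_avg):
    """Sum of both players' best-response values; zero at a Nash equilibrium."""
    best_row = max(action_values(payoff, col_avg))
    best_col = max(action_values(negated_transpose(payoff), row_avg))
    return best_row + best_col


if __name__ == "__main__":
    row_avg, col_avg = self_play(PAYOFF, 50000)
    print("row average strategy:", row_avg)
    print("column average strategy:", col_avg)
    print("exploitability:", exploitability(PAYOFF, row_avg, col_avg))
```

Note that it is the average strategy over all iterations, not the final iterate, that is reported; the averaging is what carries the equilibrium guarantee in this kind of self-play argument.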

reference text

[1] D. Koller and N. Megiddo. The complexity of two-person zero-sum games in extensive form. Games and Economic Behavior, pages 528–552, 1992.

[2] D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, and D. Szafron. Approximating game-theoretic optimal strategies for full-scale poker. In International Joint Conference on Artificial Intelligence, pages 661–668, 2003.

[3] A. Gilpin and T. Sandholm. Finding equilibria in large sequential games of imperfect information. In ACM Conference on Electronic Commerce, 2006.

[4] A. Gilpin and T. Sandholm. A competitive Texas Hold’em poker player via automated abstraction and real-time equilibrium computation. In National Conference on Artificial Intelligence, 2006.

[5] G. Gordon. No-regret algorithms for online convex programs. In Neural Information Processing Systems 19, 2007.

[6] M. Zinkevich, M. Bowling, and N. Burch. A new algorithm for generating strong strategies in massive zero-sum games. In Proceedings of the Twenty-Seventh Conference on Artificial Intelligence (AAAI), 2007. To Appear.

[7] A. Gilpin, S. Hoda, J. Pena, and T. Sandholm. Gradient-based algorithms for finding Nash equilibria in extensive form games. In Proceedings of the Eighteenth International Conference on Game Theory, 2007.

[8] M. Osborne and A. Rubinstein. A Course in Game Theory. The MIT Press, Cambridge, Massachusetts, 1994.

[9] M. Zinkevich and M. Littman. The AAAI computer poker competition. Journal of the International Computer Games Association, 29, 2006. News item.