nips nips2010 nips2010-226 nips2010-226-reference knowledge-graph by maker-knowledge-mining

226 nips-2010-Repeated Games against Budgeted Adversaries


Source: pdf

Author: Jacob D. Abernethy, Manfred K. Warmuth

Abstract: We study repeated zero-sum games against an adversary on a budget. Given that an adversary has some constraint on the sequence of actions that he plays, we consider what ought to be the player’s best mixed strategy with knowledge of this budget. We show that, for a general class of normal-form games, the minimax strategy is indeed efficiently computable and relies on a “random playout” technique. We give three diverse applications of this new algorithmic template: a cost-sensitive “Hedge” setting, a particular problem in Metrical Task Systems, and the design of combinatorial prediction markets. 1


reference text

[1] J. Abernethy, M. K. Warmuth, and J. Yellin. Optimal strategies from random walks. In Proceedings of the 21st Annual Conference on Learning Theory (COLT 08), pages 437–445, July 2008.

[2] Nikhil Bansal, Niv Buchbinder, and Joseph (Seffi) Naor. A Primal-Dual randomized algorithm for weighted paging. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, pages 507–517. IEEE Computer Society, 2007.

[3] Y. Bartal, A. Blum, C. Burch, and A. Tomkins. A polylog (n)-competitive algorithm for metrical task systems. In Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, page 711719, 1997.

[4] A. Borodin, N. Linial, and M. E Saks. An optimal on-line algorithm for metrical task system. Journal of the ACM (JACM), 39(4):745763, 1992.

[5] B. Br¨ gmann. Monte carlo go. Master’s Thesis, Unpublished, 1993. u

[6] Y. Chen, L. Fortnow, N. Lambert, D. M Pennock, and J. Wortman. Complexity of combinatorial market makers. In Proceedings of the ACM Conference on Electronic Commerce (EC), 2008.

[7] Y. Chen and D. M Pennock. A utility framework for bounded-loss market makers. In Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, page 4956, 2007.

[8] Y. Chen and J. W Vaughan. A new understanding of prediction markets via No-Regret learning. Arxiv preprint arXiv:1003.0034, 2010.

[9] A. Fiat and M. Mendel. Better algorithms for unfair metrical task systems and applications. In Proceedings of the thirty-second annual ACM symposium on Theory of computing, page 725734, 2000.

[10] Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to Boosting. J. Comput. Syst. Sci., 55(1):119–139, 1997. Special Issue for EuroCOLT ’95.

[11] S. Gelly and D. Silver. Combining online and offline knowledge in UCT. In Proceedings of the 24th international conference on Machine learning, page 280, 2007.

[12] S. Gelly, Y. Wang, R. Munos, and O. Teytaud. Modification of UCT with patterns in MonteCarlo go. 2006.

[13] R. Hanson. Combinatorial information market design. 5(1):107119, 2003. Information Systems Frontiers,

[14] N. Littlestone and M. K. Warmuth. The Weighted Majority algorithm. Inform. Comput., 108(2):212–261, 1994. Preliminary version in FOCS 89. 9