nips nips2013 nips2013-191 nips2013-191-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Brendan McMahan, Jacob Abernethy
Abstract: We design and analyze minimax-optimal algorithms for online linear optimization games where the player’s choice is unconstrained. The player strives to minimize regret, the difference between his loss and the loss of a post-hoc benchmark strategy. While the standard benchmark is the loss of the best strategy chosen from a bounded comparator set, we consider a very broad range of benchmark functions. The problem is cast as a sequential multi-stage zero-sum game, and we give a thorough analysis of the minimax behavior of the game, providing characterizations for the value of the game, as well as both the player’s and the adversary’s optimal strategy. We show how these objects can be computed efficiently under certain circumstances, and by selecting an appropriate benchmark, we construct a novel hedging strategy for an unconstrained betting game.
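To make the regret notion in the abstract concrete, the standard benchmark can be written as a short formula. This is a minimal sketch using conventional online linear optimization notation, which the record itself does not fix: the player plays x_t, the adversary answers with a loss vector g_t over T rounds, and \(\mathcal{X}\) denotes a bounded comparator set; all of these symbols are notational assumptions here.

\[ \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \langle g_t, x_t \rangle \;-\; \min_{\mathring{x} \in \mathcal{X}} \sum_{t=1}^{T} \langle g_t, \mathring{x} \rangle \]

The broader range of benchmarks mentioned in the abstract can then be read, roughly, as replacing the second term, the loss of the best fixed comparator from \(\mathcal{X}\), with a more general benchmark function \(B(g_1, \dots, g_T)\) of the adversary's play; the precise class of admissible benchmarks and the resulting minimax analysis are given in the paper itself, not in this record.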
References:
Jacob Abernethy and Manfred K. Warmuth. Repeated games against budgeted adversaries. In NIPS, 2010.
Jacob Abernethy, Peter L. Bartlett, Alexander Rakhlin, and Ambuj Tewari. Optimal strategies and minimax lower bounds for online convex games. In COLT, 2008a.
Jacob Abernethy, Manfred K. Warmuth, and Joel Yellin. Optimal strategies from random walks. In Proceedings of the 21st Annual Conference on Learning Theory, pages 437–446. Citeseer, 2008b.
Jacob Abernethy, Alekh Agarwal, Peter Bartlett, and Alexander Rakhlin. A stochastic view of optimal regret through minimax duality. In COLT, 2009.
Jacob Abernethy, Rafael M. Frongillo, and Andre Wibisono. Minimax option pricing meets Black-Scholes in the limit. In STOC, 2012.
Amit Agarwal, Elad Hazan, Satyen Kale, and Robert E. Schapire. Algorithms for portfolio management based on the Newton method. In ICML, 2006.
Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
A. de Moivre. The Doctrine of Chances: or, A Method of Calculating the Probabilities of Events in Play. 1718.
Ofer Dekel, Ambuj Tewari, and Raman Arora. Online bandit learning against an adaptive adversary: from regret to policy regret. In ICML, 2012.
Peter DeMarzo, Ilan Kremer, and Yishay Mansour. Online trading algorithms and robust option pricing. In Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, pages 477–486. ACM, 2006.
Persi Diaconis and Sandy Zabell. Closed form summation for classical distributions: Variations on a theme of de Moivre. Statistical Science, 6(3), 1991.
Elad Hazan and Satyen Kale. On stochastic and worst-case models for investing. In NIPS, 2009.
J. L. Kelly Jr. A new interpretation of information rate. Bell System Technical Journal, 1956.
Wouter Koolen, Dmitry Adamskiy, and Manfred Warmuth. Putting Bayes to sleep. In NIPS, 2012.
N. Merhav, E. Ordentlich, G. Seroussi, and M. J. Weinberger. On sequential strategies for loss functions with memory. IEEE Transactions on Information Theory, 48(7), September 2006.
Alexander Rakhlin, Ohad Shamir, and Karthik Sridharan. Relax and randomize: From value to algorithms. In NIPS, 2012.
Ralph T. Rockafellar. Convex Analysis (Princeton Landmarks in Mathematics and Physics). Princeton University Press, 1997.
Shai Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 2012.
Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter. Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming, 127(1):3–30, 2011.
Gilles Stoltz. Contributions to the sequential prediction of arbitrary sequences: applications to the theory of repeated games and empirical studies of the performance of the aggregation of experts. Habilitation à diriger des recherches, Université Paris-Sud, 2011.
Matthew Streeter and H. Brendan McMahan. No-regret algorithms for unconstrained online convex optimization. In NIPS, 2012.
Volodya Vovk. Competitive on-line statistics. International Statistical Review, 69, 2001.
Martin Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML, 2003.