
10 jmlr-2005-Adaptive Online Prediction by Following the Perturbed Leader



Author: Marcus Hutter, Jan Poland

Abstract: When applying aggregating strategies to Prediction with Expert Advice (PEA), the learning rate must be adaptively tuned. The natural choice of √(complexity/current loss) renders the analysis of Weighted Majority (WM) derivatives quite complicated. In particular, for arbitrary weights there have been no results proven so far. The analysis of the alternative Follow the Perturbed Leader (FPL) algorithm from Kalai and Vempala (2003), based on Hannan's algorithm, is easier. We derive loss bounds for adaptive learning rate and both finite expert classes with uniform weights and countable expert classes with arbitrary weights. For the former setup, our loss bounds match the best known results so far, while for the latter our results are new.

Keywords: prediction with expert advice, follow the perturbed leader, general weights, adaptive learning rate, adaptive adversary, hierarchy of experts, expected and high probability bounds, general alphabet and loss, online sequential prediction
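To make the FPL scheme referenced in the abstract concrete, here is a minimal, illustrative Python sketch: at each round the learner adds an exponentially distributed perturbation (scaled by 1/η_t) to each expert's cumulative loss and follows the minimizer, as in Kalai and Vempala's formulation. The simple time-based rate η_t = √(ln n / t) used below is an assumption for illustration; the paper analyzes the loss-based √(complexity/current loss) tuning instead. All function names are hypothetical.

```python
import math
import random


def fpl_predict(cum_loss, eta, rng):
    """Follow the perturbed leader: subtract an Exp(1) perturbation
    scaled by 1/eta from each expert's cumulative loss, then pick
    the expert with the smallest perturbed loss."""
    perturbed = [loss - rng.expovariate(1.0) / eta for loss in cum_loss]
    return min(range(len(cum_loss)), key=lambda i: perturbed[i])


def run_fpl(loss_matrix, seed=0):
    """Run FPL over a sequence of loss vectors (one per round) for n
    uniformly weighted experts, with the illustrative time-based rate
    eta_t = sqrt(ln n / t). Returns (learner's total loss,
    final cumulative losses of the experts)."""
    rng = random.Random(seed)
    n = len(loss_matrix[0])
    cum_loss = [0.0] * n
    total = 0.0
    for t, losses in enumerate(loss_matrix, start=1):
        eta = math.sqrt(math.log(n) / t)  # adaptive rate, decreasing in t
        choice = fpl_predict(cum_loss, eta, rng)
        total += losses[choice]
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return total, cum_loss
```

The exponential (rather than, say, uniform) perturbation is what makes the follow-the-leader analysis tractable: the probability of switching away from the current leader decays geometrically in the leader's margin.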


References

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proc. 36th Annual Symposium on Foundations of Computer Science (FOCS 1995), pages 322–331, Los Alamitos, CA, 1995. IEEE Computer Society Press.

P. Auer, N. Cesa-Bianchi, and C. Gentile. Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64:48–75, 2002.

P. Auer and C. Gentile. Adaptive and self-confident on-line learning algorithms. In Proc. 13th Conference on Computational Learning Theory, pages 107–117. Morgan Kaufmann, San Francisco, 2000.

N. Cesa-Bianchi, Y. Freund, D. Haussler, D. Helmbold, R. Schapire, and M. K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427–485, 1997.

Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.

C. Gentile. The robustness of the p-norm algorithm. Machine Learning, 53(3):265–299, 2003.

J. Hannan. Approximation to Bayes risk in repeated plays. In M. Dresher, A. W. Tucker, and P. Wolfe, editors, Contributions to the Theory of Games 3, pages 97–139. Princeton University Press, 1957.

M. Hutter. Convergence and loss bounds for Bayesian sequence prediction. IEEE Trans. on Information Theory, 49(8):2061–2067, 2003a. URL http://arxiv.org/abs/cs.LG/0301014.

M. Hutter. Optimality of universal Bayesian prediction for general loss and alphabet. Journal of Machine Learning Research, 4:971–1000, 2003b. URL http://arxiv.org/abs/cs.LG/0311014.

M. Hutter. Online prediction – Bayes versus experts. Technical report, July 2004a. URL http://www.idsia.ch/~marcus/ai/bayespea.htm. Presented at the EU PASCAL Workshop on Learning Theoretic and Bayesian Inductive Principles (LTBIP-2004).

M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin, 2004b. 300 pages. URL http://www.idsia.ch/~marcus/ai/uaibook.htm.

M. Hutter and J. Poland. Prediction with expert advice by following the perturbed leader for general weights. In Proc. 15th International Conf. on Algorithmic Learning Theory (ALT-2004), volume 3244 of LNAI, pages 279–293, Padova, 2004. Springer, Berlin. URL http://arxiv.org/abs/cs.LG/0405043.

A. Kalai and S. Vempala. Efficient algorithms for online decision. In Proc. 16th Annual Conference on Learning Theory (COLT-2003), Lecture Notes in Artificial Intelligence, pages 506–521, Berlin, 2003. Springer.

N. Littlestone and M. K. Warmuth. The weighted majority algorithm. In 30th Annual Symposium on Foundations of Computer Science, pages 256–261, Research Triangle Park, North Carolina, 1989. IEEE.

N. Littlestone and M. K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212–261, 1994.

C. McDiarmid. On the method of bounded differences. In Surveys in Combinatorics, London Mathematical Society Lecture Notes Series 141, pages 148–188, 1989.

H. B. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In 17th Annual Conference on Learning Theory (COLT), volume 3120 of LNCS, pages 109–123. Springer, 2004.

J. Poland and M. Hutter. Master algorithms for active experts problems based on increasing loss values. In Annual Machine Learning Conference of Belgium and the Netherlands (Benelearn-2005), Enschede, 2005. URL http://arxiv.org/abs/cs.LG/0502067.

V. G. Vovk. Aggregating strategies. In Proc. Third Annual Workshop on Computational Learning Theory, pages 371–383, Rochester, New York, 1990. ACM Press.

V. G. Vovk. A game of prediction with expert advice. In Proc. 8th Annual Conference on Computational Learning Theory, pages 51–60. ACM Press, New York, NY, 1995.

R. Yaroshinsky, R. El-Yaniv, and S. Seiden. How to better use expert advice. Machine Learning, 55(3):271–309, 2004.