nips nips2010 nips2010-274 nips2010-274-reference knowledge-graph by maker-knowledge-mining

274 nips-2010-Trading off Mistakes and Don't-Know Predictions

Source: pdf

Author: Amin Sayedi, Morteza Zadimoghaddam, Avrim Blum

Abstract: We discuss an online learning framework in which the agent is allowed to say “I don’t know” as well as making incorrect predictions on given examples. We analyze the trade off between saying “I don’t know” and making mistakes. If the number of don’t-know predictions is required to be zero, the model reduces to the well-known mistake-bound model introduced by Littlestone [Lit88]. On the other hand, if no mistakes are allowed, the model reduces to KWIK framework introduced by Li et. al. [LLW08]. We propose a general, though inefﬁcient, algorithm for general ﬁnite concept classes that minimizes the number of don’t-know predictions subject to a given bound on the number of allowed mistakes. We then present speciﬁc polynomial-time algorithms for the concept classes of monotone disjunctions and linear separators with a margin.

reference text

[DF88] Martin E. Dyer and Alan M. Frieze. On the complexity of computing the volume of a polyhedron. SIAM J. Comput., 17(5):967–974, 1988. [DFK91] Martin E. Dyer, Alan M. Frieze, and Ravi Kannan. A random polynomial time algorithm for approximating the volume of convex bodies. J. ACM, 38(1):1–17, 1991. [DLL09] C. Diuk, L. Li, and B.R. Lefﬂer. The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 249–256. ACM, 2009. [GF] Gasarch and Fletcher. The Egg Game. www.cs.umd.edu/~gasarch/BLOGPAPERS/egg.pdf. [KK99] [KS02] [Lit88] M. Kearns and D. Koller. Efﬁcient reinforcement learning in factored MDPs. In International Joint Conference on Artiﬁcial Intelligence, volume 16, pages 740–747. Citeseer, 1999. M. Kearns and S. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49(2):209–232, 2002. N. Littlestone. Learning quickly when irrelevant attributes abound: A new linearthreshold algorithm. Machine learning, 2(4):285–318, 1988. [LLW08] L. Li, M.L. Littman, and T.J. Walsh. Knows what it knows: a framework for self-aware learning. In Proceedings of the 25th international conference on Machine learning, pages 568–575. ACM, 2008. [LV06] L´ szl´ Lov´ sz and Santosh Vempala. Hit-and-run from a corner. SIAM J. Comput., a o a 35(4):985–1005, 2006. [RS88] R.L. Rivest and R. Sloan. Learning complicated concepts reliably and usefully. In Proceedings AAAI-88, pages 635–639, 1988. [SL08] A.L. Strehl and M.L. Littman. Online linear regression and its application to model-based reinforcement learning. Advances in Neural Information Processing Systems, 20, 2008. [WSDL] T.J. Walsh, I. Szita, C. Diuk, and M.L. Littman. Exploring compact reinforcementlearning representations with linear regression. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artiﬁcial Intelligence (UAI-09), 2009b. 9