Author: Varun Kanade, Adam Kalai
Abstract: We prove strong noise-tolerance properties of a potential-based boosting algorithm, similar to MadaBoost (Domingo and Watanabe, 2000) and SmoothBoost (Servedio, 2003). Our analysis is in the agnostic framework of Kearns, Schapire and Sellie (1994), giving polynomial-time guarantees in the presence of arbitrary noise. A remarkable feature of our algorithm is that it can be implemented without reweighting examples, by randomly relabeling them instead. Our boosting theorem gives, as easy corollaries, alternative derivations of two recent nontrivial results in computational learning theory: agnostically learning decision trees (Gopalan et al., 2008) and agnostically learning halfspaces (Kalai et al., 2005). Experiments suggest that the algorithm performs similarly to MadaBoost.
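The relabeling idea mentioned in the abstract can be illustrated with a short sketch. This is not the paper's algorithm: the clipped-exponential weights (MadaBoost-style), the fixed step size, the number of rounds, and the decision-stump weak learner are assumptions chosen for concreteness, and the function names are hypothetical. What the sketch does show is that an example weight w_i in [0, 1] can be simulated by keeping the label y_i with probability (1 + w_i)/2 and flipping it otherwise, so the weak learner sees an unweighted but relabeled sample whose expected correlation with any hypothesis equals the w-weighted correlation on the original labels.

import numpy as np
from sklearn.tree import DecisionTreeClassifier  # decision stumps as weak learners

def boost_by_relabeling(X, y, rounds=50, step=0.1, seed=0):
    """Sketch of boosting without reweighting: X is an (n, d) array, y holds labels
    in {-1, +1}. Returns a list of (stump, step) pairs forming the ensemble."""
    rng = np.random.default_rng(seed)
    n = len(y)
    score = np.zeros(n)          # running ensemble score F(x_i) on the training set
    ensemble = []
    for _ in range(rounds):
        # Assumed MadaBoost-style weights: exponential of the negative margin, clipped at 1.
        w = np.minimum(1.0, np.exp(-y * score))
        # Simulate weight w_i by random relabeling: keep y_i with prob (1 + w_i)/2,
        # flip it otherwise; in expectation the relabeled label equals w_i * y_i.
        keep = rng.random(n) < (1.0 + w) / 2.0
        y_relabeled = np.where(keep, y, -y)
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y_relabeled)  # unweighted fit
        ensemble.append((stump, step))
        score += step * stump.predict(X)  # fixed small step; the paper's update differs
    return ensemble

def predict(ensemble, X):
    # Sign of the weighted vote of the stumps.
    return np.sign(sum(step * clf.predict(X) for clf, step in ensemble))

Running boost_by_relabeling on a small labeled sample and then predict on held-out points behaves much like an AdaBoost-style ensemble, except that the weak learner is never passed per-example weights.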
[1] T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, 40(2):139–158, 2000.
[2] R. Servedio. Smooth boosting and learning with malicious noise. Journal of Machine Learning Research, 4:633–648, 2003.
[3] D. Gavinsky. Optimally-smooth adaptive boosting and application to agnostic learning. Journal of Machine Learning Research, 4:101–117, 2003.
[4] C. Domingo and O. Watanabe. MadaBoost: A modification of AdaBoost. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, pages 180–189, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
[5] A. Kalai and R. Servedio. Boosting in the presence of noise. In Proceedings of the 35th Annual Symposium on Theory of Computing (STOC), pages 196–205, 2003.
[6] A. T. Kalai, Y. Mansour, and E. Verbin. On agnostic boosting and parity learning. In STOC ’08: Proceedings of the 40th annual ACM symposium on Theory of computing, pages 629–638, New York, NY, USA, 2008. ACM.
[7] Y. Freund and R. Schapire. Game theory, on-line prediction and boosting. In Proceedings of the Ninth Annual Conference on Computational Learning Theory, pages 325–332, 1996.
[8] D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inf. Comput., 100(1):78–150, 1992.
[9] M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2):115–141, 1994.
[10] P. Gopalan, A. T. Kalai, and A. R. Klivans. Agnostically learning decision trees. In Proceedings of the 40th annual ACM symposium on Theory of computing, pages 527–536, New York, NY, USA, 2008. ACM.
[11] A. T. Kalai, A. R. Klivans, Y. Mansour, and R. Servedio. Agnostically learning halfspaces. In Proc. 46th IEEE Symp. on Foundations of Computer Science (FOCS’05), 2005.
[12] P. M. Long and R. A. Servedio. Random classification noise defeats all convex potential boosters. In ICML, pages 608–615, 2008.
[13] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. Annals of Statistics, 28(2):337–407, 2000.
[14] Y. Freund. An adaptive version of the boost-by-majority algorithm. In Proceedings of the Twelfth Annual Conference on Computational Learning Theory, pages 102–113, 1999.
[15] M. Nakamura, H. Nomiya, and K. Uehara. Improvement of boosting algorithm by modifying the weighting rule. Annals of Mathematics and Artificial Intelligence, 41(1):95–109, 2004.
[16] T. Bylander and L. Tate. Using validation sets to avoid overfitting in AdaBoost. In G. Sutcliffe and R. Goebel, editors, FLAIRS Conference, pages 544–549. AAAI Press, 2006.
[17] S. Ben-David, P. M. Long, and Y. Mansour. Agnostic boosting. In Proceedings of the 14th Annual Conference on Computational Learning Theory, COLT 2001, volume 2111 of Lecture Notes in Artificial Intelligence, pages 507–516. Springer, 2001.
[18] R. A. McDonald, D. J. Hand, and I. A. Eckley. An empirical comparison of three boosting algorithms on real data sets with artificial class noise. In T. Windeatt and F. Roli, editors, Multiple Classifier Systems, volume 2709 of Lecture Notes in Computer Science, pages 35–44. Springer, 2003.
[19] J. K. Bradley and R. Schapire. Filterboost: Regression and classification on large datasets. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 185–192. MIT Press, Cambridge, MA, 2008.
[20] Y. Mansour and D. McAllester. Boosting using branching programs. Journal of Computer and System Sciences, 64(1):103–112, 2002.
[21] P. M. Long and R. A. Servedio. Adaptive martingale boosting. In NIPS, pages 977–984, 2008.
[22] A. T. Kalai, V. Kanade, and Y. Mansour. Reliable agnostic learning. In COLT ’09: Proceedings of the 22nd Annual Conference on Learning Theory, 2009.
[23] M. Kearns and L. Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1):67–95, 1994.
[24] E. Kushilevitz and Y. Mansour. Learning decision trees using the Fourier spectrum. SIAM J. on Computing, 22(6):1331–1348, 1993.
[25] A. Klivans, R. O’Donnell, and R. Servedio. Learning intersections and thresholds of halfspaces. Journal of Computer and System Sciences, 68(4):808–840, 2004.
[26] A. Asuncion and D. J. Newman. UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science, 2007.