nips nips2007 nips2007-38 nips2007-38-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Gunnar Rätsch, Manfred K. Warmuth, Karen A. Glocer
Abstract: We present a novel boosting algorithm, called SoftBoost, designed for sets of binary labeled examples that are not necessarily separable by convex combinations of base hypotheses. Our algorithm achieves robustness by capping the distributions on the examples. Our update of the distribution is motivated by minimizing a relative entropy subject to the capping constraints and constraints on the edges of the obtained base hypotheses. The capping constraints imply a soft margin in the dual optimization problem. Our algorithm produces a convex combination of hypotheses whose soft margin is within δ of its maximum. We employ relative entropy projection methods to prove an O(ln N / δ²) iteration bound for our algorithm, where N is the number of examples. We compare our algorithm with other approaches including LPBoost, BrownBoost, and SmoothBoost. We show that there exist cases where the number of iterations required by LPBoost grows linearly in N, in contrast to the logarithmic growth for SoftBoost. In simulation studies we show that our algorithm converges about as fast as LPBoost, faster than BrownBoost, and much faster than SmoothBoost. In a benchmark comparison we illustrate the competitiveness of our approach.
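The distribution update sketched in the abstract can be pictured as one small optimization problem per iteration. Below is a minimal Python sketch, assuming a generic constrained solver (SciPy's SLSQP) in place of the relative entropy projection methods the paper actually analyzes; the function name softboost_update, the (T, N) edge matrix U, and the per-weight cap of 1/nu are our illustrative choices based on the abstract, not the authors' code or exact formulation.

```python
import numpy as np
from scipy.optimize import minimize

def softboost_update(U, gamma_hat, delta, nu):
    """One capped relative-entropy projection step (a sketch, not the paper's method).

    U         -- (T, N) array; U[q, i] = y_i * h_q(x_i) for past hypothesis q (assumed layout).
    gamma_hat -- current estimate of the achievable edge.
    delta     -- accuracy parameter from the O(ln N / delta^2) bound.
    nu        -- capping parameter; each example weight is capped at 1/nu.
    """
    T, N = U.shape
    d0 = np.full(N, 1.0 / N)  # uniform reference distribution

    def rel_entropy(d):
        # Relative entropy Delta(d, d0); the clip keeps the log defined at zero weights.
        d = np.clip(d, 1e-12, None)
        return float(np.sum(d * np.log(d / d0)))

    constraints = [{"type": "eq", "fun": lambda d: np.sum(d) - 1.0}]  # probability simplex
    for q in range(T):
        # Edge constraint d . u^q <= gamma_hat - delta for each past hypothesis.
        constraints.append(
            {"type": "ineq", "fun": lambda d, q=q: gamma_hat - delta - U[q] @ d}
        )

    # Capping constraints 0 <= d_i <= 1/nu; these give the soft margin in the dual.
    bounds = [(0.0, 1.0 / nu)] * N

    res = minimize(rel_entropy, d0, method="SLSQP",
                   bounds=bounds, constraints=constraints)
    return res.x, res.success  # success == False signals an infeasible constraint set
```

The generic solver above only mirrors the constraint structure; a natural stopping signal in this sketch is infeasibility of the constraint set (res.success == False), which is why the flag is returned alongside the updated weights.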
[1] L. Breiman. Prediction games and arcing algorithms. Neural Computation, 11(7):1493–1518, 1999. Also Technical Report 504, Statistics Department, University of California Berkeley.
[2] Y. Censor and S.A. Zenios. Parallel Optimization. Oxford University Press, New York, 1997.
[3] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
[4] A. Demiriz, K.P. Bennett, and J. Shawe-Taylor. Linear programming boosting via column generation. Machine Learning, 46(1–3):225–254, 2002.
[5] C. Domingo and O. Watanabe. MadaBoost: A modification of AdaBoost. In Proc. COLT ’00, pages 180–189, 2000.
[6] Y. Freund. An adaptive version of the boost by majority algorithm. Machine Learning, 43(3):293–318, 2001.
[7] Y. Freund and R.E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.
[8] A.J. Grove and D. Schuurmans. Boosting in the limit: Maximizing the margin of learned ensembles. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, 1998.
[9] M. Herbster and M.K. Warmuth. Tracking the best linear predictor. Journal of Machine Learning Research, 1:281–309, 2001.
[10] R. Hettich and K.O. Kortanek. Semi-infinite programming: Theory, methods and applications. SIAM Review, 35(3):380–429, September 1993.
[11] J. Kivinen and M.K. Warmuth. Boosting as entropy projection. In Proc. 12th Annual Conference on Computational Learning Theory, pages 134–144. ACM Press, New York, NY, 1999.
[12] J. Liao. Totally Corrective Boosting Algorithms that Maximize the Margin. PhD thesis, University of California at Santa Cruz, December 2006.
[13] R. Meir and G. Rätsch. An introduction to boosting and leveraging. In S. Mendelson and A. Smola, editors, Proc. 1st Machine Learning Summer School, Canberra, LNCS, pages 119–184. Springer, 2003.
[14] G. Rätsch. Robust Boosting via Convex Optimization: Theory and Applications. PhD thesis, University of Potsdam, Germany, December 2001.
[15] G. Rätsch, T. Onoda, and K.-R. Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287–320, 2001.
[16] G. Rätsch, B. Schölkopf, A.J. Smola, S. Mika, T. Onoda, and K.-R. Müller. Robust ensemble learning. In A.J. Smola, P.L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 207–219. MIT Press, Cambridge, MA, 2000.
[17] G. Rätsch and M.K. Warmuth. Efficient margin maximizing with boosting. Journal of Machine Learning Research, 6:2131–2152, December 2005.
[18] C. Rudin, I. Daubechies, and R.E. Schapire. The dynamics of AdaBoost: Cyclic behavior and convergence of margins. Journal of Machine Learning Research, 5:1557–1595, 2004.
[19] R.E. Schapire, Y. Freund, P.L. Bartlett, and W.S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651–1686, 1998.
[20] B. Schölkopf, A.J. Smola, R.C. Williamson, and P.L. Bartlett. New support vector algorithms. Neural Computation, 12(5):1207–1245, 2000.
[21] R.A. Servedio. Smooth boosting and learning with malicious noise. Journal of Machine Learning Research, 4:633–648, 2003.
[22] M.K. Warmuth, J. Liao, and G. Rätsch. Totally corrective boosting algorithms that maximize the margin. In Proc. ICML ’06, pages 1001–1008. ACM Press, 2006.