Authors: Phil Long, Rocco Servedio
Abstract: In recent work, Long and Servedio [LS05] presented a “martingale boosting” algorithm that works by constructing a branching program over weak classifiers and has a simple analysis based on elementary properties of random walks. [LS05] showed that this martingale booster can tolerate random classification noise when run with a noise-tolerant weak learner; however, a drawback of the algorithm is that it is not adaptive, i.e., it cannot effectively exploit variation in the quality of the weak classifiers it receives. We present an adaptive variant of the martingale boosting algorithm. Adaptiveness is achieved by modifying the original algorithm so that the random walks that arise in its analysis have different step sizes depending on the quality of the weak learner at each stage. The new algorithm inherits the desirable properties of the original [LS05] algorithm, such as tolerance of random classification noise, and has further advantages besides adaptiveness: it requires polynomially fewer calls to the weak learner than the original algorithm, and it can be used with confidence-rated weak hypotheses that output real values rather than Boolean predictions.
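To make the adaptive step-size idea concrete, the following Python sketch illustrates it under simplifying assumptions; it is not the paper's construction. The actual algorithm trains a separate weak hypothesis at each node of a branching program, whereas this sketch trains a single weak hypothesis per level and scales that level's step by the hypothesis's empirical advantage. The names adaptive_martingale_boost and weak_learner are hypothetical.

import numpy as np

def adaptive_martingale_boost(X, y, weak_learner, n_levels=10):
    # Hypothetical weak-learner interface: weak_learner(X, y) returns a
    # callable h with h(X) giving predictions in {-1, +1}.
    positions = np.zeros(len(y))   # each training example's random-walk position
    ensemble = []                  # (weak hypothesis, step size) for each level

    for level in range(n_levels):
        h = weak_learner(X, y)
        preds = h(X)                                    # values in {-1, +1}
        gamma = max(np.mean(preds == y) - 0.5, 1e-6)    # empirical advantage over random guessing
        ensemble.append((h, gamma))
        positions += gamma * preds                      # adaptive step: better hypothesis, larger step

    def classify(X_new):
        # Final label is the sign of the advantage-weighted walk position.
        score = sum(gamma * h(X_new) for h, gamma in ensemble)
        return np.sign(score)

    return classify

For instance, weak_learner could fit a decision stump and return its prediction function; the booster then takes larger steps at levels where the stump's empirical advantage over random guessing is larger.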
[AD98] J. Aslam and S. Decatur. Specification and simulation of statistical query algorithms for efficiency and noise tolerance. J. Comput. Syst. Sci., 56:191–208, 1998.
[AL88] D. Angluin and P. Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
[ASE92] N. Alon, J. Spencer, and P. Erdős. The Probabilistic Method (1st ed.). Wiley-Interscience, New York, 1992.
[BKW03] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. J. ACM, 50(4):506–519, 2003.
[Die00] T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, 40(2):139–158, 2000.
[Fre95] Y. Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256–285, 1995.
[FS96] Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In ICML, pages 148–156, 1996.
[FS97] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. JCSS, 55(1):119–139, 1997.
[Hay05] T. P. Hayes. A large-deviation inequality for vector-valued martingales. 2005.
[Kea98] M. Kearns. Efficient noise-tolerant learning from statistical queries. JACM, 45(6):983–1006, 1998.
[KS05] A. Kalai and R. Servedio. Boosting in the presence of noise. JCSS, 71(3):266–290, 2005.
[LS05] P. Long and R. Servedio. Martingale boosting. In Proc. 18th Annual COLT, pages 79–94, 2005.
[LS08] P. Long and R. Servedio. Random classification noise defeats all convex potential boosters. In ICML, 2008.
[MO97] R. Maclin and D. Opitz. An empirical evaluation of bagging and boosting. In AAAI/IAAI, pages 546–551, 1997.
[MR03] R. Meir and G. Rätsch. An introduction to boosting and leveraging. In LNAI Advanced Lectures on Machine Learning, pages 118–183, 2003.
[RDM06] L. Ralaivola, F. Denis, and C. Magnan. CN = CPCN. In ICML, pages 265–272, 2006.
[Sch90] R. Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990.
[Sch03] R. Schapire. The boosting approach to machine learning: An overview. Springer, 2003.
[SS99] R. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37:297–336, 1999.