Authors: Phil Long, Rocco Servedio
Abstract: In recent work, Long and Servedio [LS05] presented a “martingale boosting” algorithm that works by constructing a branching program over weak classifiers and has a simple analysis based on elementary properties of random walks. [LS05] showed that this martingale booster can tolerate random classification noise when run with a noise-tolerant weak learner; however, a drawback of the algorithm is that it is not adaptive, i.e., it cannot effectively exploit variation in the quality of the weak classifiers it receives. We present an adaptive variant of the martingale boosting algorithm. Adaptiveness is achieved by modifying the original algorithm so that the random walks that arise in its analysis have different step sizes depending on the quality of the weak learner at each stage. The new algorithm inherits the desirable properties of the original [LS05] algorithm, such as tolerance of random classification noise, and has further advantages besides adaptiveness: it requires polynomially fewer calls to the weak learner than the original algorithm, and it can be used with confidence-rated weak hypotheses that output real values rather than Boolean predictions.
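To make the adaptive step-size idea concrete, the following Python sketch illustrates it under simplifying assumptions; it is not the paper's construction. The actual algorithm trains a separate weak hypothesis at each node of a branching program, whereas this sketch trains a single weak hypothesis per level and scales that level's step by the hypothesis's empirical advantage. The names adaptive_martingale_boost and weak_learner are hypothetical.

import numpy as np

def adaptive_martingale_boost(X, y, weak_learner, n_levels=10):
    # Hypothetical weak-learner interface: weak_learner(X, y) returns a
    # callable h with h(X) giving predictions in {-1, +1}.
    positions = np.zeros(len(y))   # each training example's random-walk position
    ensemble = []                  # (weak hypothesis, step size) for each level

    for level in range(n_levels):
        h = weak_learner(X, y)
        preds = h(X)                                    # values in {-1, +1}
        gamma = max(np.mean(preds == y) - 0.5, 1e-6)    # empirical advantage over random guessing
        ensemble.append((h, gamma))
        positions += gamma * preds                      # adaptive step: better hypothesis, larger step

    def classify(X_new):
        # Final label is the sign of the advantage-weighted walk position.
        score = sum(gamma * h(X_new) for h, gamma in ensemble)
        return np.sign(score)

    return classify

For instance, weak_learner could fit a decision stump and return its prediction function; the booster then takes larger steps at levels where the stump's empirical advantage over random guessing is larger.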
[AD98] J. Aslam and S. Decatur. Specification and simulation of statistical query algorithms for efficiency and noise tolerance. J. Comput. Syst. Sci., 56:191–208, 1998.
[AL88] D. Angluin and P. Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
[ASE92] N. Alon, J. Spencer, and P. Erdős. The Probabilistic Method (1st ed.). Wiley-Interscience, New York, 1992.
[BKW03] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. J. ACM, 50(4):506–519, 2003.
[Die00] T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, 40(2):139–158, 2000.
[Fre95] Y. Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256–285, 1995.
[FS96] Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In ICML, pages 148–156, 1996.
[FS97] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. JCSS, 55(1):119–139, 1997.
[Hay05] T. P. Hayes. A large-deviation inequality for vector-valued martingales. 2005.
[Kea98] M. Kearns. Efficient noise-tolerant learning from statistical queries. JACM, 45(6):983–1006, 1998.
[KS05] A. Kalai and R. Servedio. Boosting in the presence of noise. JCSS, 71(3):266–290, 2005.
[LS05] P. Long and R. Servedio. Martingale boosting. In Proc. 18th Annual COLT, pages 79–94, 2005.
[LS08] P. Long and R. Servedio. Random classification noise defeats all convex potential boosters. In ICML, 2008.
[MO97] R. Maclin and D. Opitz. An empirical evaluation of bagging and boosting. In AAAI/IAAI, pages 546–551, 1997.
[MR03] R. Meir and G. Rätsch. An introduction to boosting and leveraging. In LNAI Advanced Lectures on Machine Learning, pages 118–183, 2003.
[RDM06] L. Ralaivola, F. Denis, and C. Magnan. CN = CPCN. In ICML, pages 265–272, 2006.
[Sch90] R. Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990.
[Sch03] R. Schapire. The boosting approach to machine learning: An overview. Springer, 2003.
[SS99] R. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37:297–336, 1999.