
Weak Learners and Improved Rates of Convergence in Boosting


Source: pdf

Author: Shie Mannor, Ron Meir

Abstract: The problem of constructing weak classifiers for boosting algorithms is studied. We present an algorithm that produces a linear classifier that is guaranteed to achieve an error better than random guessing for any distribution on the data. While this weak learner is not useful for learning in general, we show that under reasonable conditions on the distribution it yields an effective weak learner for one-dimensional problems. Preliminary simulations suggest that similar behavior can be expected in higher dimensions, a result which is corroborated by some recent theoretical bounds. Additionally, we provide improved convergence rate bounds for the generalization error in situations where the empirical error can be made small, which is exactly the situation that arises when weak learners guaranteed to perform better than random guessing can be established.
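
To make the notion of a one-dimensional weak learner concrete, the sketch below fits a decision stump, i.e. a single-threshold linear classifier, by minimizing the weighted training error over candidate thresholds. This is a standard construction offered for illustration, not necessarily the specific algorithm analyzed in the paper; the function name best_stump_1d and the toy data are assumptions of the sketch. Because flipping the stump's sign turns an error of e into 1 - e, the returned stump always attains weighted error at most 1/2, i.e. no worse than random guessing, and strictly better whenever some threshold separates part of the weight.

import numpy as np

def best_stump_1d(x, y, w):
    """Return (threshold, sign, weighted_error) for the stump
    h(z) = sign if z > threshold else -sign, with labels y in {-1, +1}
    and a distribution w over the sample (w >= 0, sum(w) == 1).
    Illustrative sketch; not the construction from the paper."""
    xs = np.sort(x)
    # Candidate thresholds: one below the smallest point (the constant
    # classifier) plus the midpoints between consecutive sorted points.
    thresholds = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    best_t, best_sign, best_err = thresholds[0], 1.0, 1.0
    for t in thresholds:
        pred = np.where(x > t, 1.0, -1.0)
        err = float(np.sum(w[pred != y]))
        sign = 1.0
        if err > 0.5:            # flipping the sign maps err to 1 - err
            err, sign = 1.0 - err, -1.0
        if err < best_err:
            best_t, best_sign, best_err = t, sign, err
    return best_t, best_sign, best_err

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = np.where(x + 0.3 * rng.normal(size=200) > 0, 1.0, -1.0)
    w = np.full(200, 1.0 / 200)   # uniform weights, as in a first boosting round
    t, s, e = best_stump_1d(x, y, w)
    print(f"threshold={t:.3f}  sign={s:+.0f}  weighted error={e:.3f}")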


reference text

[1] M. Anthony and P.L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.

[2] P. Bartlett and S. Ben-David. On the hardness of learning with neural networks. In Proceedings of the Fourth European Conference on Computational Learning Theory, 1999.

[3] P. Bartlett and G. Lugosi. An inequality for uniform deviations of sample averages from their means. Statistics and Probability Letters, 44:55-62, 1999.

[4] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. The Annals of Statistics, 2000. To appear.

[5] D.S. Johnson and F.P. Preparata. The densest hemisphere problem. Theoretical Computer Science, 6:93-107, 1978.

[6] S. Mallat and Z. Zhang. Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Processing, 41(12):3397-3415, December 1993.

[7] S. Mannor and R. Meir. On the existence of weak learners and applications to boosting. Submitted to Machine Learning.

[8] L. Mason, P. Bartlett and J. Baxter. Improved generalization through explicit optimization of margins. Machine Learning, 2000. To appear.

[9] L. Mason, P. Bartlett, J. Baxter, and M. Frean. Functional gradient techniques for combining hypotheses. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers. MIT Press, 2000.

[10] R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651-1686, 1998.

[11] V. N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer Verlag, New York, 1982.