nips nips2003 nips2003-151 nips2003-151-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jean-yves Audibert, Olivier Bousquet
Abstract: There exist many different generalization error bounds for classification. Each of these bounds contains an improvement over the others for certain situations. Our goal is to combine these different improvements into a single bound. In particular we combine the PAC-Bayes approach introduced by McAllester [1], which is interesting for averaging classifiers, with the optimal union bound provided by the generic chaining technique developed by Fernique and Talagrand [2]. This combination is quite natural since the generic chaining is based on the notion of majorizing measures, which can be considered as priors on the set of classifiers, and such priors also arise in the PAC-bayesian setting. 1
[1] D. A. McAllester. Some PAC-Bayesian theorems. In Proceedings of the 11th Annual Conference on Computational Learning Theory, pages 230–234. ACM Press, 1998.
[2] M. Talagrand. Majorizing measures: The generic chaining. Annals of Probability, 24(3):1049– 1103, 1996.
[3] S. Boucheron, G. Lugosi, and S. Massart. A sharp concentration inequality with applications. Random Structures and Algorithms, 16:277–292, 2000.
[4] D. A. McAllester. PAC-Bayesian model averaging. In Proceedings of the 12th Annual Conference on Computational Learning Theory. ACM Press, 1999.
[5] V. Vapnik and A. Chervonenkis. Theory of Pattern Recognition [in Russian]. Nauka, Moscow, 1974. (German Translation: W. Wapnik & A. Tscherwonenkis, Theorie der Zeichenerkennung, Akademie–Verlag, Berlin, 1979).
[6] R. M. Dudley. A course on empirical processes. Lecture Notes in Mathematics, 1097:2–142, 1984.
[7] L. Devroye and G. Lugosi. Combinatorial Methods in Density Estimation. Springer Series in Statistics. Springer Verlag, New York, 2001.
[8] P. Bartlett, O. Bousquet, and S. Mendelson. Local rademacher complexities. Preprint, 2003.
[9] D. A. McAllester. Simplified pac-bayesian margin bounds. In Proceedings of Computational Learning Theory (COLT), 2003.
[10] M. Ledoux and M. Talagrand. Probability in Banach spaces. Springer-Verlag, Berlin, 1991.
[11] M. Talagrand. The Glivenko-Cantelli problem. Annals of Probability, 6:837–870, 1987.
[12] O. Catoni. Localized empirical complexity bounds and randomized estimators, 2003. Preprint.
[13] J.-Y. Audibert. Data-dependent generalization error bounds for (noisy) classification: a PACbayesian approach. 2003. Work in progress.