
11 jmlr-2008-Aggregation of SVM Classifiers Using Sobolev Spaces


Source: pdf

Author: Sébastien Loustau

Abstract: This paper investigates the statistical performance of Support Vector Machines (SVM) and considers the problem of adaptation to the margin parameter and to complexity. In particular, we provide a classifier with no tuning parameter, built as a combination of SVM classifiers. Our contribution is two-fold: (1) we propose learning rates for SVM using Sobolev spaces and build a numerically realizable aggregate that converges at the same rate; (2) we present practical experiments with this aggregation method for SVM using both Sobolev spaces and Gaussian kernels.

Keywords: classification, support vector machines, learning rates, approximation, aggregation of classifiers
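
The abstract does not spell out how the combination of SVM classifiers is formed. As a rough illustration only, the sketch below uses exponential weights, a common choice in the aggregation literature; that this matches the paper's construction is an assumption. The function names `exponential_weights` and `aggregate_predict` are hypothetical, and the base classifiers are represented simply by their ±1 predictions on a held-out sample rather than by trained SVMs.

```python
import math


def exponential_weights(predictions, labels):
    """Return one weight per base classifier.

    predictions: list of M lists, each holding n predictions in {-1, +1}
    labels:      list of n true labels in {-1, +1}

    Assumed rule (not taken from the abstract): weight_j is proportional
    to exp(sum_i y_i * f_j(x_i)), so classifiers that agree with more
    held-out labels receive exponentially more weight.
    """
    scores = [sum(y * p for y, p in zip(labels, preds)) for preds in predictions]
    top = max(scores)  # subtract the max before exponentiating, for stability
    unnormalized = [math.exp(s - top) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def aggregate_predict(weights, votes):
    """Classify a new point from the M base votes (each in {-1, +1})."""
    combined = sum(w * v for w, v in zip(weights, votes))
    return 1 if combined >= 0 else -1


if __name__ == "__main__":
    labels = [1, -1, 1, 1, -1]
    predictions = [
        [1, -1, 1, 1, -1],   # agrees with every held-out label
        [1, 1, -1, 1, -1],   # two mistakes
        [-1, 1, -1, -1, 1],  # wrong on every held-out label
    ]
    weights = exponential_weights(predictions, labels)
    print(weights)
    print(aggregate_predict(weights, [1, -1, -1]))
```

In practice the rows of `predictions` would come from SVMs trained with different kernels or regularization parameters; because the weights are computed from data, the aggregate itself has no parameter to tune by hand.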


reference text

R.A. Adams. Sobolev Spaces. Academic Press, 1975.
N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68:337–404, 1950.
S. Arora, L. Babai, J. Stern, and Z. Sweedyk. The hardness of approximate optima in lattices, codes, and systems of linear equations. Journal of Computer and System Sciences, 54(2):317–331, 1997.
P.L. Bartlett. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2):525–536, 1998.
P.L. Bartlett and S. Mendelson. Empirical minimization. Probability Theory and Related Fields, 135(3):311–334, 2006.
P.L. Bartlett, M.I. Jordan, and J.D. McAuliffe. Convexity, classification, and risk bounds. J. Amer. Statist. Assoc., 101(473):138–156, 2006.
C. Bennett and R. Sharpley. Interpolation of Operators. Academic Press, 1988.
G. Blanchard, O. Bousquet, and P. Massart. Statistical performance of support vector machines. To appear in The Annals of Statistics, 2006.
B.E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Computational Learning Theory, pages 144–152, 1992.
S. Boucheron, O. Bousquet, and G. Lugosi. Theory of classification: a survey of some recent advances. ESAIM: Probability and Statistics, 9:323–375, 2005.
D.R. Chen, Q. Wu, Y. Ying, and D.X. Zhou. Support vector machine soft margin classifiers: error analysis. Journal of Machine Learning Research, 5:1143–1175, 2004.
N. Cristianini and H. Shawe-Taylor. Introduction to Support Vector Machines, and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
L. Devroye. Necessary and sufficient conditions for the pointwise convergence of nearest neighbor regression function estimates. Z. Wahrsch. Verw. Gebiete, 61(4):467–481, 1982.
Y. Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256–285, 1995.
A. Karatzoglou, A. Smola, and K. Hornik. An S4 package for kernel methods in R. Reference manual, 2007.
G. Lecué. Simultaneous adaptation to the margin and to complexity in classification. The Annals of Statistics, 35(4):1698–1721, 2007a.
G. Lecué. Optimal rates of aggregation in classification under low noise assumption. Bernoulli, 13(4):1000–1022, 2007b.
P. Malliavin. Analyse de Fourier-Analyses spectrales. Ecole Polytechnique, 1974.
E. Mammen and A.B. Tsybakov. Smooth discrimination analysis. The Annals of Statistics, 27(6):1808–1829, 1999.
P. Massart and E. Nédélec. Risk bounds for statistical learning. The Annals of Statistics, 34(5):2326–2366, 2006.
M. Matache and V. Matache. Hilbert spaces induced by Toeplitz covariance kernels. Lecture Notes in Control and Information Sciences, 280:319–334, 2002.
A. Nemirovski. Topics in Nonparametric Statistics. Ecole d’été de Saint-Flour XXVIII, Springer, N.Y., 1998.
D. Rätsch, T. Onoda, and K.R. Müller. Soft margin for adaboost. Esprit Working Group in Neural and Computational Learning II, 1998.
S. Smale and D.X. Zhou. Estimating the approximation error in learning theory. Analysis and Applications, 1(1):17–41, 2003.
I. Steinwart. On the influence of the kernel on the consistency of support vector machines. Journal of Machine Learning Research, 2:67–93, 2001.
I. Steinwart. Consistency of support vector machines and other regularized kernel classifiers. IEEE Transactions on Information Theory, 51(1):128–142, 2005.
I. Steinwart and C. Scovel. Fast rates for support vector machines. In Proc. 18th Annu. Conference on Comput. Learning Theory, volume 3559, pages 279–294, 2005.
I. Steinwart and C. Scovel. Fast rates for support vector machines using Gaussian kernels. The Annals of Statistics, 35(2):575–607, 2007.
I. Steinwart, D. Hush, and C. Scovel. An oracle inequality for clipped regularized risk minimizers. Neural Information Processing Systems, 19:1321–1328, 2007.
H. Triebel. Theory of Function Spaces II. Birkhäuser, 1992.
H. Triebel. Interpolation Theory, Function Spaces, Differential Operators. North-Holland Publishing Company, 1978.
A.B. Tsybakov. Optimal aggregation of classifiers in statistical learning. The Annals of Statistics, 32(1):135–166, 2004.
V.N. Vapnik and A.Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16(2):264–280, 1971.
V.N. Vapnik and A.Ya. Chervonenkis. Theory of Pattern Recognition. Nauka, Moscow, 1974.
R.C. Williamson, A.J. Smola, and B. Schölkopf. Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators. IEEE Transactions on Information Theory, 47(6):2516–2532, 2001.
Q. Wu and D.X. Zhou. Analysis of support vector machine classification. J. Comput. Anal. Appl., 8(2):99–119, 2006.
Q. Wu, Y. Ying, and D.X. Zhou. Multi-kernel regularized classifiers. Journal of Complexity, 23(1):108–134, 2007.
Y. Yang. Mixing strategies for density estimation. The Annals of Statistics, 28(1):75–87, 2000.
T. Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization. The Annals of Statistics, 32(1):56–85, 2004.