
Classification with Gaussians and Convex Loss


Source: pdf

Author: Dao-Hong Xiang, Ding-Xuan Zhou

Abstract: This paper considers binary classification algorithms generated from Tikhonov regularization schemes associated with general convex loss functions and varying Gaussian kernels. Our main goal is to provide fast convergence rates for the excess misclassification error. Allowing the Gaussian kernels in the algorithms to vary improves the learning rates, as measured by the regularization error and the sample error. Special structures of Gaussian kernels enable us to construct, by an approximation scheme based on a Fourier analysis technique, uniformly bounded regularizing functions that achieve polynomial decay of the regularization error under a Sobolev smoothness condition. The sample error is estimated by means of a projection operator and a tight bound for the covering numbers of reproducing kernel Hilbert spaces generated by Gaussian kernels. The convexity of the general loss function plays a central role in our analysis.

Keywords: reproducing kernel Hilbert space, binary classification, general convex loss, varying Gaussian kernels, covering number, approximation
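The type of algorithm the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the least-squares loss stands in for a general convex loss (the paper's analysis covers hinge and other convex losses as well), and the bandwidth `sigma` and regularization parameter `lam` are illustrative placeholders, not values from the paper.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma):
    # K[i, j] = exp(-||x_i - z_j||^2 / (2 * sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit(X, y, sigma=1.0, lam=0.1):
    # Tikhonov regularization in the Gaussian RKHS: by the representer
    # theorem the minimizer is f(x) = sum_i c_i K(x_i, x); with the
    # least-squares loss the coefficients solve (K + lam * n * I) c = y.
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def predict(X_train, c, X_new, sigma=1.0):
    # Binary classification: take the sign of the regression function.
    return np.sign(gaussian_kernel(X_new, X_train, sigma) @ c)

# Toy data: two well-separated Gaussian blobs with labels -1 / +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])
c = fit(X, y)
acc = (predict(X, c, X) == y).mean()
```

The paper's rates concern how the excess misclassification error decays as the sample size grows when `sigma` and `lam` are chosen appropriately with the sample size; the sketch above fixes both for simplicity.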


reference text

A. Argyriou, R. Hauser, C. A. Micchelli, and M. Pontil. A DC-programming algorithm for kernel selection. In Proceedings of the Twenty-Third International Conference on Machine Learning, 2006.

N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68:337–404, 1950.

P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe. Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101:138–156, 2006.

O. Chapelle, V. N. Vapnik, O. Bousquet, and S. Mukherjee. Choosing multiple parameters for support vector machines. Machine Learning, 46:131–159, 2002.

N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, 2000.

D. R. Chen, Q. Wu, Y. Ying, and D. X. Zhou. Support vector machine soft margin classifiers: error analysis. Journal of Machine Learning Research, 5:1143–1175, 2004.

F. Cucker and D. X. Zhou. Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press, 2007.

E. De Vito, A. Caponnetto, and L. Rosasco. Model selection for regularized least-squares algorithm in learning theory. Foundations of Computational Mathematics, 5:59–85, 2006.

E. De Vito, L. Rosasco, A. Caponnetto, M. Piana, and A. Verri. Some properties of regularized kernel methods. Journal of Machine Learning Research, 5:1363–1390, 2004.

D. Edmunds and H. Triebel. Function Spaces, Entropy Numbers, Differential Operators. Cambridge University Press, 1996.

T. Evgeniou, M. Pontil, and T. Poggio. Regularization networks and support vector machines. Advances in Computational Mathematics, 13:1–50, 2000.

D. Hardin, I. Tsamardinos, and C. F. Aliferis. A theoretical characterization of linear SVM-based feature selection. In Proceedings of the Twenty-First International Conference on Machine Learning, Banff, Canada, 2004.

S. Mukherjee, P. Niyogi, T. Poggio, and R. Rifkin. Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Advances in Computational Mathematics, 25:161–193, 2006.

Z. W. Pan, D. H. Xiang, Q. W. Xiao, and D. X. Zhou. Parzen windows for multi-class classification. Journal of Complexity, 24:606–618, 2008.

R. Schaback and J. Werner. Linearly constrained reconstruction of functions by kernels, with applications to machine learning. Advances in Computational Mathematics, 25:237–258, 2006.

S. Smale and D. X. Zhou. Learning theory estimates via integral operators and their approximations. Constructive Approximation, 26:153–172, 2007.

S. Smale and D. X. Zhou. Estimating the approximation error in learning theory. Analysis and Applications, 1:17–41, 2003.

S. Smale and D. X. Zhou. Shannon sampling and function reconstruction from point values. Bulletin of the American Mathematical Society, 41:279–305, 2004.

I. Steinwart and C. Scovel. Fast rates for support vector machines using Gaussian kernels. Annals of Statistics, 35:575–607, 2007.

I. Steinwart, D. Hush, and C. Scovel. An explicit description of the reproducing kernel Hilbert spaces of Gaussian RBF kernels. IEEE Transactions on Information Theory, 52:4635–4643, 2006.

A. B. Tsybakov. Optimal aggregation of classifiers in statistical learning. Annals of Statistics, 32:135–166, 2004.

G. Wahba. Spline Models for Observational Data. SIAM, 1990.

Q. Wu, Y. Ying, and D. X. Zhou. Multi-kernel regularized classifiers. Journal of Complexity, 23:108–134, 2007.

Q. Wu and D. X. Zhou. Analysis of support vector machine classification. Journal of Computational Analysis and Applications, 8:99–119, 2006.

Q. Wu and D. X. Zhou. SVM soft margin classifiers: linear programming versus quadratic programming. Neural Computation, 17:1160–1187, 2005.

Y. Yao. On complexity issues of online learning algorithms. IEEE Transactions on Information Theory, to appear.

G. B. Ye and D. X. Zhou. Fully online classification by regularization. Applied and Computational Harmonic Analysis, 23:198–214, 2007.

G. B. Ye and D. X. Zhou. Learning and approximation by Gaussians on Riemannian manifolds. Advances in Computational Mathematics, 29:291–310, 2008.

Y. Ying. Convergence analysis of online algorithms. Advances in Computational Mathematics, 27:273–291, 2007.

Y. Ying and D. X. Zhou. Learnability of Gaussians with flexible variances. Journal of Machine Learning Research, 8:249–276, 2007.

T. Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization. Annals of Statistics, 32:56–85, 2004.

D. X. Zhou. The covering number in learning theory. Journal of Complexity, 18:739–767, 2002.

D. X. Zhou. Capacity of reproducing kernel spaces in learning theory. IEEE Transactions on Information Theory, 49:1743–1752, 2003.

D. X. Zhou and K. Jetter. Approximation with polynomial kernels and SVM classifiers. Advances in Computational Mathematics, 25:323–344, 2006.