
Sparsity of SVMs that use the epsilon-insensitive loss



Authors: Ingo Steinwart, Andreas Christmann

Abstract: In this paper, lower and upper bounds on the number of support vectors are derived for support vector machines (SVMs) based on the ε-insensitive loss function. It turns out that these bounds are asymptotically tight under mild assumptions on the data-generating distribution. Finally, we briefly discuss a trade-off between sparsity and accuracy when the SVM is used to estimate the conditional median.
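Since the abstract does not spell out the loss, a brief illustration may help: the ε-insensitive loss is L_ε(y, t) = max(0, |y − t| − ε), so residuals inside the ε-tube incur no loss, and the corresponding training points tend not to become support vectors; widening the tube therefore yields sparser but typically less accurate estimates. The sketch below is a hypothetical illustration of that effect, not code from the paper: it uses scikit-learn's SVR on synthetic data (the sin curve, noise level, RBF kernel, and C=1.0 are all illustrative assumptions) to show the fraction of support vectors shrinking as ε grows.

# A minimal sketch, assuming scikit-learn is available: count the support
# vectors of epsilon-SVR for several values of epsilon on synthetic data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=500)

for eps in (0.01, 0.1, 0.3):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points that are support vectors
    frac_sv = len(model.support_) / len(X)
    print(f"epsilon={eps:.2f}: {frac_sv:.1%} of training points are support vectors")

On data like this, the printed fraction drops sharply with ε, which mirrors the sparsity-accuracy trade-off the abstract discusses for estimating the conditional median.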


References

[1] A. Christmann and I. Steinwart. Consistency and robustness of kernel based regression. Bernoulli, 13:799–819, 2007.

[2] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, 2000.

[3] E. De Vito, L. Rosasco, A. Caponnetto, M. Piana, and A. Verri. Some properties of regularized kernel methods. J. Mach. Learn. Res., 5:1363–1390, 2004.

[4] L. Devroye, L. Györfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, New York, 1996.

[5] I. Steinwart. Sparseness of support vector machines. J. Mach. Learn. Res., 4:1071–1105, 2003.

[6] I. Steinwart. How to compare different loss functions. Constr. Approx., 26:225–287, 2007.

[7] I. Steinwart and A. Christmann. Support Vector Machines. Springer, New York, 2008.

[8] I. Steinwart and A. Christmann. How SVMs can estimate quantiles and the median. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 305–312. MIT Press, Cambridge, MA, 2008.

[9] I. Steinwart, D. Hush, and C. Scovel. Function classes that approximate the Bayes risk. In G. Lugosi and H. U. Simon, editors, Proceedings of the 19th Annual Conference on Learning Theory, pages 79–93. Springer, New York, 2006.

[10] V. Vapnik, S. Golowich, and A. Smola. Support vector method for function approximation, regression estimation, and signal processing. In M. Mozer, M. Jordan, and T. Petsche, editors, Advances in Neural Information Processing Systems 9, pages 281–287. MIT Press, Cambridge, MA, 1997.

[11] V. N. Vapnik. Statistical Learning Theory. John Wiley & Sons, New York, 1998.

[12] V. Yurinsky. Sums and Gaussian Vectors. Lecture Notes in Math. 1617. Springer, Berlin, 1995.

[13] T. Zhang. Convergence of large margin separable linear classification. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 357–363. MIT Press, Cambridge, MA, 2001.