nips2010-270 reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sivan Sabato, Nathan Srebro, Naftali Tishby
Abstract: We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L2 regularization: We introduce the γ-adapted-dimension, which is a simple function of the spectrum of a distribution’s covariance matrix, and show distribution-specific upper and lower bounds on the sample complexity, both governed by the γ-adapted-dimension of the source distribution. We conclude that this new quantity tightly characterizes the true sample complexity of large-margin classification. The bounds hold for a rich family of sub-Gaussian distributions.
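For intuition, here is a minimal sketch (in Python) of how a margin-adapted dimension of this flavor could be computed from a covariance spectrum. The threshold rule used below (the smallest k such that k·γ² is at least the sum of the eigenvalues beyond the k largest) is an assumption introduced only for illustration; the paper's own definition of the γ-adapted-dimension should be taken from the text itself.

import numpy as np

def gamma_adapted_dimension(eigenvalues, gamma):
    # Illustrative sketch only: return the smallest k such that
    # k * gamma^2 >= sum of the eigenvalues beyond the k largest.
    # This rule is an assumption for illustration, not necessarily the
    # paper's exact definition of the gamma-adapted-dimension.
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # spectrum in descending order
    for k in range(len(lam) + 1):
        if k * gamma ** 2 >= lam[k:].sum():
            return k

# Example: an isotropic distribution in R^100 (all eigenvalues equal to 1)
# with gamma = 1 gives 50, i.e. roughly d / (gamma^2 + 1).
print(gamma_adapted_dimension(np.ones(100), 1.0))

Under this illustrative rule an isotropic spectrum yields roughly d/(γ² + 1), so the effective dimension shrinks as the margin γ grows, matching the intuition that a larger margin reduces the dimension-like quantity governing the sample complexity.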
[1] I. Steinwart and C. Scovel. Fast rates for support vector machines using Gaussian kernels. Annals of Statistics, 35(2):575–607, 2007.
[2] P. Liang and N. Srebro. On the interaction between norm and dimensionality: Multiple regimes in learning. In ICML, 2010.
[3] A.Y. Ng. Feature selection, L1 vs. L2 regularization, and rotational invariance. In ICML, 2004.
[4] A. Antos and G. Lugosi. Strong minimax lower bounds for learning. Machine Learning, 30(1):31–56, 1998.
[5] A. Ehrenfeucht, D. Haussler, M. Kearns, and L. Valiant. A general lower bound on the number of examples needed for learning. In Proceedings of the First Annual Workshop on Computational Learning Theory, pages 139–154, August 1988.
[6] C. Gentile and D.P. Helmbold. Improved lower bounds for learning from noisy examples: an information-theoretic approach. In COLT, pages 104–115, 1998.
[7] V.N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.
[8] G.M. Benedek and A. Itai. Learnability with respect to fixed distributions. Theoretical Computer Science, 86(2):377–389, September 1991.
[9] S. Ben-David, T. Lu, and D. Pál. Does unlabeled data provably help? In Proceedings of the Twenty-First Annual Conference on Computational Learning Theory, pages 33–44, 2008.
[10] N. Vayatis and R. Azencott. Distribution-dependent Vapnik-Chervonenkis bounds. In EuroCOLT ’99, pages 230–240, London, UK, 1999. Springer-Verlag.
[11] D.J.H. Garling. Inequalities: A Journey into Linear Analysis. Cambridge University Press, 2007.
[12] V.V. Buldygin and Yu. V. Kozachenko. Metric Characterization of Random Variables and Random Processes. American Mathematical Society, 1998.
[13] P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. In COLT 2001, volume 2111, pages 224–240. Springer, Berlin, 2001.
[14] O. Bousquet. Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms. PhD thesis, École Polytechnique, 2002.
[15] B. Schölkopf, J. Shawe-Taylor, A. J. Smola, and R.C. Williamson. Generalization bounds via eigenvalues of the Gram matrix. Technical Report NC2-TR-1999-035, NeuroCOLT2, 1999.
[16] M. Anthony and P. L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
[17] N. Christianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, 2000.
[18] Z. Bai and J.W. Silverstein. Spectral Analysis of Large Dimensional Random Matrices. Springer, second edition, 2010.
[19] M. Rudelson and R. Vershynin. The smallest singular value of a random rectangular matrix. Communications on Pure and Applied Mathematics, 62:1707–1739, 2009.
[20] M. Rudelson and R. Vershynin. The Littlewood–Offord problem and invertibility of random matrices. Advances in Mathematics, 218(2):600–633, 2008.
[21] T. Zhang. Covering number bounds of certain regularized linear function classes. Journal of Machine Learning Research, 2:527–550, 2002.
[22] G. Bennett, V. Goodman, and C. M. Newman. Norms of random matrices. Pacific Journal of Mathematics, 59(2):359–365, 1975.
[23] F.L. Nazarov and A. Podkorytov. Ball, Haagerup, and distribution functions. Operator Theory: Advances and Applications, 113 (Complex analysis, operators, and related topics):247–267, 2000.
[24] R.E.A.C. Paley and A. Zygmund. A note on analytic functions in the unit circle. Proceedings of the Cambridge Philosophical Society, 28:266–272, 1932.