Author: Lorenzo Rosasco, Andrea Caponnetto, Ernesto De Vito, Francesca Odone, Umberto De Giovannini
Abstract: Many works have shown strong connections between learning from examples and regularization techniques for ill-posed inverse problems. Nevertheless, until now there has been no formal evidence either that learning from examples can be seen as an inverse problem or that theoretical results in learning theory can be independently derived using tools from regularization theory. In this paper we provide a positive answer to both questions. Indeed, considering the square loss, we translate the learning problem into the language of regularization theory and show that consistency results and the optimal regularization parameter choice can be derived by discretizing the corresponding inverse problem.
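To make the translation concrete: with the square loss, the regularized learning problem is Tikhonov regularization in a reproducing kernel Hilbert space, min over f in H of (1/n) Σᵢ (f(xᵢ) − yᵢ)² + λ‖f‖²_H, and by the standard representer theorem its minimizer has the form f(x) = Σᵢ cᵢ K(x, xᵢ) with coefficients solving the n × n linear system (K + nλI)c = y. The sketch below implements this regularized least-squares estimator; the Gaussian kernel, its width, and the value of λ are illustrative assumptions, not choices prescribed by the paper.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_rls(X, y, lam=0.1, sigma=1.0):
    """Tikhonov-regularized least squares in the RKHS of the kernel:
    minimize (1/n) sum_i (f(x_i) - y_i)^2 + lam * ||f||_H^2.
    By the representer theorem the minimizer is f = sum_i c_i K(., x_i),
    with coefficients solving (K + n * lam * I) c = y."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def predict_rls(c, X_train, X_test, sigma=1.0):
    """Evaluate f(x) = sum_i c_i K(x, x_i) at the test points."""
    return gaussian_kernel(X_test, X_train, sigma) @ c

# Illustrative usage on a noisy 1-d regression problem (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
c = fit_rls(X, y, lam=0.01, sigma=0.5)
X_test = np.linspace(-3, 3, 200)[:, None]
y_hat = predict_rls(c, X, X_test, sigma=0.5)
```

The regularization parameter λ trades the fit to the data against the RKHS norm penalty; the paper's contribution is that its optimal choice, and the consistency of the resulting estimator, can be derived from regularization theory for the corresponding inverse problem.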
[1] N. Aronszajn. Theory of reproducing kernels. Trans. Amer. Math. Soc., 68:337–404, 1950.
[2] Felipe Cucker and Steve Smale. On the mathematical foundations of learning. Bull. Amer. Math. Soc. (N.S.), 39(1):1–49 (electronic), 2002.
[3] E. De Vito, A. Caponnetto, and L. Rosasco. Discretization error analysis for Tikhonov regularization. Submitted to Inverse Problems, 2004. Available at http://www.disi.unige.it/person/RosascoL/publications/discre_iop.pdf.
[4] E. De Vito, A. Caponnetto, and L. Rosasco. Model selection for regularized least-squares algorithm in learning theory. To appear in Journal of Machine Learning Research, 2004.
[5] L. Devroye, L. Györfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Number 31 in Applications of Mathematics. Springer, New York, 1996.
[6] E. Schock and Sergei V. Pereverzev. On the adaptive selection of the parameter in regularization of ill-posed problems. Technical report, University of Kaiserslautern, 2004.
[7] Heinz W. Engl, Martin Hanke, and Andreas Neubauer. Regularization of inverse problems, volume 375 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht, 1996.
[8] Theodoros Evgeniou, Massimiliano Pontil, and Tomaso Poggio. Regularization networks and support vector machines. Adv. Comput. Math., 13(1):1–50, 2000.
[9] Vera Kurkova. Supervised learning as an inverse problem. Technical Report 960, Institute of Computer Science, Academy of Sciences of the Czech Republic, April 2004.
[10] Colin McDiarmid. On the method of bounded differences. In Surveys in combinatorics, 1989 (Norwich, 1989), volume 141 of London Math. Soc. Lecture Note Ser., pages 148–188. Cambridge Univ. Press, Cambridge, 1989.
[11] S. Mukherjee, P. Niyogi, T. Poggio, and R. Rifkin. Statistical learning: Stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Technical Report CBCL Paper 223, Massachusetts Institute of Technology, January 2004 revision.
[12] T. Poggio and F. Girosi. Networks for approximation and learning. Proc. IEEE, 78:1481–1497, 1990.
[13] Cynthia Rudin. A different type of convergence for statistical learning algorithms. Technical report, Program in Applied and Computational Mathematics, Princeton University, 2004.
[14] I. Steinwart. Consistency of support vector machines and other regularized kernel machines. IEEE Transactions on Information Theory, 2004. (Accepted).
[15] Andrey N. Tikhonov and Vasiliy Y. Arsenin. Solutions of ill-posed problems. V. H. Winston & Sons, Washington, D.C.: John Wiley & Sons, New York, 1977. Translated from the Russian, Preface by translation editor Fritz John, Scripta Series in Mathematics.
[16] Vladimir N. Vapnik. Statistical learning theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control. John Wiley & Sons Inc., New York, 1998. A Wiley-Interscience Publication.