
150 nips-2006-On Transductive Regression


Source: pdf

Author: Corinna Cortes, Mehryar Mohri

Abstract: In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the transductive setting, where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents explicit VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closed-form solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.
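The abstract only outlines the algorithm, so the following is a minimal illustrative sketch of that two-stage idea, not the authors' actual method: the choice of local estimator (a Nadaraya-Watson kernel average), the function name transductive_krr, and the parameters gamma, lam, and mu are all assumptions made here for illustration; the paper's bound-inspired objective differs in its details.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def transductive_krr(X_lab, y_lab, X_unl, gamma=1.0, lam=1e-2, mu=0.5):
    """Two-stage sketch (assumed form, not the paper's exact algorithm):
    (1) locally estimate pseudo-labels for the unlabeled points,
    (2) solve one global kernel ridge regression over all points,
    down-weighting the pseudo-labels by mu < 1."""
    # Stage 1: local estimates via a kernel-weighted (Nadaraya-Watson)
    # average of the labeled targets around each unlabeled point.
    K_ul = rbf_kernel(X_unl, X_lab, gamma)
    y_unl = (K_ul @ y_lab) / np.maximum(K_ul.sum(axis=1), 1e-12)

    # Stage 2: global closed-form solution. Stack all points, weight the
    # squared errors (1 on labeled points, mu on pseudo-labels), and solve
    # the standard weighted-KRR system (K + lam * C^{-1}) alpha = y.
    X = np.vstack([X_lab, X_unl])
    y = np.concatenate([y_lab, y_unl])
    c = np.concatenate([np.ones(len(y_lab)), mu * np.ones(len(y_unl))])
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.diag(1.0 / c), y)
    return lambda Z: rbf_kernel(Z, X, gamma) @ alpha

# Usage on synthetic data: fit with 20 labeled and 200 unlabeled points,
# then predict on the (known) unlabeled test points.
rng = np.random.default_rng(0)
X_lab = rng.uniform(-3, 3, size=(20, 1))
y_lab = np.sin(X_lab[:, 0]) + 0.1 * rng.normal(size=20)
X_unl = rng.uniform(-3, 3, size=(200, 1))
f = transductive_krr(X_lab, y_lab, X_unl)
print(f(X_unl)[:5])
```

The closed form follows because the weighted objective is quadratic in alpha: setting its gradient to zero gives a single linear system, which is what lets this style of algorithm scale to large unlabeled sets.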


reference text

Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. Manifold regularization: a geometric framework for learning from examples. Technical Report TR-2004-06, University of Chicago, 2004.

Kristin Bennett and Ayhan Demiriz. Semi-supervised support vector machines. NIPS 11, pages 368–374, 1998.

Olivier Chapelle, Vladimir Vapnik, and Jason Weston. Transductive inference for estimating values of functions. NIPS 12, pages 421–427, 1999.

Adrian Corduneanu and Tommi Jaakkola. On information regularization. In Christopher Meek and Uffe Kjærulff, editors, Proceedings of the Nineteenth Annual Conference on Uncertainty in Artificial Intelligence, pages 151–158, 2003.

Corinna Cortes and Mehryar Mohri. On transductive regression. Technical Report TR2006-883, Courant Institute of Mathematical Sciences, New York University, November 2006.

Philip Derbeko, Ran El-Yaniv, and Ron Meir. Explicit learning curves for transduction and application to clustering and compression algorithms. Journal of Artificial Intelligence Research (JAIR), 22:117–142, 2004.

Thore Graepel, Ralf Herbrich, and Klaus Obermayer. Bayesian transduction. NIPS 12, 1999.

Thorsten Joachims. Transductive inference for text classification using support vector machines. In Ivan Bratko and Saso Dzeroski, editors, Proceedings of ICML-99, 16th International Conference on Machine Learning, pages 200–209. Morgan Kaufmann Publishers, San Francisco, US, 1999.

Gert R. G. Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, and Michael I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.

Bernhard Schölkopf and Alex Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.

Dale Schuurmans and Finnegan Southey. Metric-based methods for adaptive model selection and regularization. Machine Learning, 48:51–84, 2002.

Luís Torgo. Regression data sets, 2006. http://www.liacc.up.pt/~ltorgo/Regression/DataSets.html.

Vladimir N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer, Berlin, 1982.

Vladimir N. Vapnik. Statistical Learning Theory. Wiley-Interscience, New York, 1998.

Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf. Learning from labeled and unlabeled data on a directed graph. In L. De Raedt and S. Wrobel, editors, Proceedings of ICML-05, pages 1041–1048, 2005.

Xiaojin Zhu, Jaz Kandola, Zoubin Ghahramani, and John Lafferty. Nonparametric transforms of graph kernels for semi-supervised learning. NIPS 17, 2004.