
42 nips-2005-Combining Graph Laplacians for Semi-Supervised Learning



Author: Andreas Argyriou, Mark Herbster, Massimiliano Pontil

Abstract: A foundational problem in semi-supervised learning is the construction of a graph underlying the data. We propose a method which optimally combines a number of differently constructed graphs. With each of these graphs we associate a basic graph kernel. We then compute an optimal combined kernel. This kernel solves an extended regularization problem which requires a joint minimization over both the data and the set of graph kernels. We present encouraging results on different OCR tasks where the optimal combined kernel is computed from graphs constructed with a variety of distance functions and values of k in k-nearest neighbors.
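The abstract's pipeline can be sketched in a few lines: build several graphs over the same data (here, k-nearest-neighbor graphs for different k), associate a basic graph kernel with each (a standard choice, assumed here, is the pseudoinverse of the graph Laplacian), and combine the kernels convexly. This is only a minimal illustration: the Euclidean metric, the Laplacian-pseudoinverse kernel, and the fixed weights `c` are assumptions for the sketch; the paper instead optimizes the combination weights jointly with the regularization problem.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric (unweighted) k-nearest-neighbor adjacency matrix."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    n = len(X)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]  # skip the point itself
        W[i, nbrs] = 1.0
    return np.maximum(W, W.T)  # symmetrize: keep an edge if either end has it

def laplacian_kernel(W):
    """Basic graph kernel: pseudoinverse of the graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.pinv(L)

# Toy data: 20 points in the plane.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2))

# Basic kernels from graphs built with different values of k.
kernels = [laplacian_kernel(knn_graph(X, k)) for k in (3, 5, 7)]

# Convex combination of the basic kernels. The weights c are fixed here;
# the paper's method would learn them via the joint minimization.
c = np.array([0.2, 0.3, 0.5])
K = sum(ci * Ki for ci, Ki in zip(c, kernels))
```

Since each basic kernel is symmetric positive semidefinite and the weights are nonnegative, the combined matrix `K` is again a valid kernel.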


References

[1] A. Argyriou, C.A. Micchelli and M. Pontil. Learning convex combinations of continuously parameterized basic kernels. Proc. 18th Conf. on Learning Theory (COLT), 2005.

[2] M. Belkin, I. Matveeva and P. Niyogi. Regularization and semi-supervised learning on large graphs. Proc. 17th Conf. on Learning Theory (COLT), 2004.

[3] M. Belkin and P. Niyogi. Semi-supervised learning on Riemannian manifolds. Mach. Learn., 56: 209–239, 2004.

[4] A. Blum and S. Chawla. Learning from labeled and unlabeled data using graph mincuts. Proc. 18th Int. Conf. on Machine Learning, 2001.

[5] F.R.K. Chung. Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, Vol. 92, American Mathematical Society, 1997.

[6] T. Hastie and P. Simard. Models and Metrics for Handwritten Character Recognition. Statistical Science, 13(1): 54–65, 1998.

[7] M. Herbster, M. Pontil and L. Wainer. Online learning over graphs. Proc. 22nd Int. Conf. on Machine Learning, 2005.

[8] T. Joachims. Transductive learning via spectral graph partitioning. Proc. 20th Int. Conf. on Machine Learning (ICML), 2003.

[9] R.I. Kondor and J. Lafferty. Diffusion kernels on graphs and other discrete input spaces. Proc. 19th Int. Conf. on Machine Learning, 2002.

[10] G.R.G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui and M.I. Jordan. Learning the kernel matrix with semidefinite programming. J. Machine Learning Research, 5: 27–72, 2004.

[11] Y. Lin and H.H. Zhang. Component selection and smoothing in smoothing spline analysis of variance models – COSSO. Institute of Statistics Mimeo Series 2556, NCSU, January 2003.

[12] C.A. Micchelli and M. Pontil. Learning the kernel function via regularization. J. Machine Learning Research, 6: 1099–1125, 2005.

[13] C.S. Ong, A.J. Smola, and R.C. Williamson. Hyperkernels. Advances in Neural Information Processing Systems, 15, S. Becker et al. (Eds.), MIT Press, Cambridge, MA, 2003.

[14] A.J. Smola and R.I. Kondor. Kernels and regularization on graphs. Proc. 16th Conf. on Learning Theory (COLT), 2003.

[15] V.N. Vapnik. Statistical Learning Theory. Wiley, New York, 1998.

[16] D. Zhou, O. Bousquet, T.N. Lal, J. Weston and B. Schölkopf. Learning with local and global consistency. Advances in Neural Information Processing Systems, 16, S. Thrun et al. (Eds.), MIT Press, Cambridge, MA, 2004.

[17] X. Zhu, Z. Ghahramani and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. Proc. 20th Int. Conf. on Machine Learning, 2003.

[18] X. Zhu, J. Kandola, Z. Ghahramani and J. Lafferty. Nonparametric transforms of graph kernels for semi-supervised learning. Advances in Neural Information Processing Systems, 17, L.K. Saul et al. (Eds.), MIT Press, Cambridge, MA, 2005.