nips nips2012 nips2012-309 nips2012-309-reference knowledge-graph by maker-knowledge-mining

309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning


Source: pdf

Author: Toke Hansen, Michael W. Mahoney

Abstract: In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks “nearby” that pre-specified target region. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes that is assumed to be provided in a semi-supervised manner. We also provide several empirical examples demonstrating how these semi-supervised eigenvectors can be used to perform locally-biased learning. 1


reference text

[1] R. Andersen, F.R.K. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In FOCS ’06: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, pages 475–486, 2006.

[2] R. Andersen and K. Lang. Communities from seed sets. In WWW ’06: Proceedings of the 15th International Conference on World Wide Web, pages 223–232, 2006.

[3] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373–1396, 2003.

[4] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK, 2004.

[5] O. Chapelle, J. Weston, and B. Sch¨ lkopf. Cluster Kernels for Semi-Supervised Learning. In o Becker, editor, NIPS 2002, volume 15, pages 585–592, Cambridge, MA, USA, 2003.

[6] R.R. Coifman, S. Lafon, A.B. Lee, M. Maggioni, B. Nadler, F. Warner, and S.W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition in data: Diffusion maps. Proc. Natl. Acad. Sci. USA, 102(21):7426–7431, 2005.

[7] A. P. Eriksson, C. Olsson, and F. Kahl. Normalized cuts revisited: A reformulation for segmentation with linear grouping constraints. In Proceedings of th 11th International Conference on Computer Vision, pages 1–8, 2007.

[8] W. Gander, G. H. Golub, and U. von Matt. A constrained eigenvalue problem. Linear Algebra and its Applications, 114/115:815–839, 1989.

[9] T.H. Haveliwala. Topic-sensitive PageRank: A context-sensitive ranking algorithm for web search. IEEE Transactions on Knowledge and Data Engineering, 15(4):784–796, 2003.

[10] G. Jeh and J. Widom. Scaling personalized web search. In WWW ’03: Proceedings of the 12th International Conference on World Wide Web, pages 271–279, 2003.

[11] T. Joachims. Transductive learning via spectral graph partitioning. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), 2003.

[12] Y. Lecun and C. Cortes. The MNIST database of handwritten digits.

[13] J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney. Statistical properties of community structure in large social and information networks. In WWW ’08: Proceedings of the 17th International Conference on World Wide Web, pages 695–704, 2008.

[14] M. W. Mahoney, L. Orecchia, and N. K. Vishnoi. A local spectral method for graphs: with applications to improving graph partitions and exploring data graphs locally. Technical report, 2009. Preprint: arXiv:0912.0681.

[15] S. Maji, N. K. Vishnoi, and J. Malik. Biased normalized cuts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2057–2064, 2011.

[16] K.A. Norman, S.M. Polyn, G.J. Detre, and J.V. Haxby. Beyond mind-reading: multi-voxel pattern analysis of fmri data. Trends in Cognitive Sciences, 10(9):424–30, 2006.

[17] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.

[18] B. Scholkopf and A.J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, USA, 2001.

[19] D.A. Spielman and S.-H. Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In STOC ’04: Proceedings of the 36th annual ACM Symposium on Theory of Computing, pages 81–90, 2004.

[20] S.-H. Teng. The Laplacian paradigm: Emerging algorithms for massive graphs. In Proceedings of the 7th Annual Conference on Theory and Applications of Models of Computation, pages 2–14, 2010.

[21] S. Vigna. Spectral ranking. Technical report. Preprint: arXiv:0912.0238 (2009).

[22] D.J. Watts and S.H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440– 442, 1998.

[23] S. X. Yu and J. Shi. Grouping with bias. In Annual Advances in Neural Information Processing Systems 14: Proceedings of the 2001 Conference, pages 1327–1334, 2002. 9