Colored Maximum Variance Unfolding (NIPS 2007)

Source: pdf

Author: Le Song, Arthur Gretton, Karsten M. Borgwardt, Alex J. Smola

Abstract: Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical dependence measure which aims to retain the identity of individual observations under the distance-preserving constraints. This general view allows us to design "colored" variants of MVU, which produce low-dimensional representations for a given task, e.g. subject to class labels or other side information.
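
The dependence view in the abstract can be checked numerically on a small example. The sketch below assumes the biased empirical Hilbert-Schmidt Independence Criterion estimate tr(KHLH)/(m-1)^2 described in [3], with a linear kernel on the embedding and an identity kernel matrix standing in for "the identity of individual observations". The function names `centering_matrix` and `empirical_hsic` and the random data are illustrative assumptions, not the authors' code, and the sketch does not include the distance-preserving constraints or the semidefinite program that MVU actually solves.

```python
import numpy as np


def centering_matrix(m):
    """Return H = I - (1/m) 11^T, which centers data in feature space."""
    return np.eye(m) - np.ones((m, m)) / m


def empirical_hsic(K, L):
    """Biased empirical HSIC estimate tr(K H L H) / (m - 1)^2.

    K and L are m x m kernel (Gram) matrices on the two variables.
    """
    m = K.shape[0]
    H = centering_matrix(m)
    return np.trace(K @ H @ L @ H) / (m - 1) ** 2


# Hypothetical embedding: 6 points in 2 dimensions.
rng = np.random.default_rng(0)
m = 6
X = rng.normal(size=(m, 2))

# Linear kernel on the embedding.
K = X @ X.T

# Identity kernel: each observation is similar only to itself.
L_identity = np.eye(m)

hsic_identity = empirical_hsic(K, L_identity)

# Scaled total variance of the centered embedding, for comparison.
X_centered = X - X.mean(axis=0)
scaled_variance = np.sum(X_centered ** 2) / (m - 1) ** 2

# The two quantities agree up to floating-point error.
print(np.isclose(hsic_identity, scaled_variance))
```

With the identity matrix as the second kernel, the estimate reduces to the (scaled) variance of the centered embedding, which is the quantity MVU maximizes. Replacing `L_identity` with a kernel matrix computed from class labels or other side information gives the kind of objective that the "colored" variants are built around, under the same assumptions.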

reference text

[1] K. Q. Weinberger, F. Sha, and L. K. Saul. Learning a kernel matrix for nonlinear dimensionality reduction. In Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, 2004.

[2] J. Sun, S. Boyd, L. Xiao, and P. Diaconis. The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem. SIAM Review, 48(4):681–699, 2006.

[3] A. Gretton, O. Bousquet, A. J. Smola, and B. Schölkopf. Measuring statistical dependence with Hilbert-Schmidt norms. In S. Jain, H. U. Simon, and E. Tomita, editors, Proceedings Algorithmic Learning Theory, pages 63–77, Berlin, Germany, 2005. Springer-Verlag.

[4] K. Weinberger, F. Sha, Q. Zhu, and L. Saul. Graph Laplacian regularization for large-scale semidefinite programming. In Neural Information Processing Systems, 2006.

[5] K. Fukumizu, F. R. Bach, and M. I. Jordan. Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J. Mach. Learn. Res., 5:73–99, 2004.

[6] J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov. Neighbourhood components analysis. In Advances in Neural Information Processing Systems 17, 2004.

[7] L. Song, A. Smola, A. Gretton, K. Borgwardt, and J. Bedo. Supervised feature selection via dependence estimation. In Proc. Intl. Conf. Machine Learning, 2007.

[8] L. Song, A. Smola, A. Gretton, and K. Borgwardt. A dependence maximization view of clustering. In Proc. Intl. Conf. Machine Learning, 2007.