nips nips2009 nips2009-106 nips2009-106-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zhirong Yang, Irwin King, Zenglin Xu, Erkki Oja
Abstract: Stochastic Neighbor Embedding (SNE) has been shown to be quite promising for data visualization. Currently, the most popular implementation, t-SNE, is restricted to a particular Student t-distribution as its embedding distribution. Moreover, it uses a gradient descent algorithm that may require users to tune parameters such as the learning step size, momentum, etc., to find its optimum. In this paper, we propose the Heavy-tailed Symmetric Stochastic Neighbor Embedding (HSSNE) method, which generalizes t-SNE to accommodate various heavy-tailed embedding similarity functions. This generalization presents two difficulties: how to select the best embedding similarity among all heavy-tailed functions, and how to optimize the objective function once the heavy-tailed function has been selected. Our contributions are: (1) we point out that various heavy-tailed embedding similarities can be characterized by their negative score functions, and based on this finding we present a parameterized subset of similarity functions for choosing the best tail-heaviness for HSSNE; (2) we present a fixed-point optimization algorithm that can be applied to all heavy-tailed functions and does not require the user to set any parameters; and (3) we present two empirical studies, one for unsupervised visualization showing that our optimization algorithm runs as fast as, and performs as well as, the best known t-SNE implementation, and the other for semi-supervised visualization showing quantitative superiority in the homogeneity measure as well as a qualitative advantage in cluster separation over t-SNE.
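To make the abstract's first contribution concrete, below is a minimal Python sketch (not the authors' code) of a one-parameter heavy-tailed similarity family. It assumes the family h_alpha(tau) = (alpha * tau + 1)^(-1/alpha), whose alpha -> 0 limit is the Gaussian kernel used by SNE and whose alpha = 1 case is the Student-t (Cauchy) kernel used by t-SNE; the function name and the exact parameterization are illustrative assumptions, not necessarily the paper's notation.

    # Sketch of a one-parameter heavy-tailed embedding similarity family.
    # Assumption: h_alpha(tau) = (alpha * tau + 1) ** (-1 / alpha), which
    # recovers exp(-tau) (SNE's Gaussian) as alpha -> 0 and 1 / (1 + tau)
    # (t-SNE's Student-t kernel) at alpha = 1.
    import numpy as np

    def heavy_tailed_similarities(Y, alpha=1.0, eps=1e-12):
        """Symmetric embedding similarities q_ij for embedding points Y.

        Y     : (n, d) array of low-dimensional coordinates.
        alpha : tail-heaviness; alpha -> 0 approximates SNE, alpha = 1 gives t-SNE.
        """
        # Squared pairwise Euclidean distances tau_ij = ||y_i - y_j||^2.
        sq = np.sum(Y ** 2, axis=1)
        tau = sq[:, None] + sq[None, :] - 2.0 * (Y @ Y.T)
        tau = np.maximum(tau, 0.0)          # guard against rounding noise
        np.fill_diagonal(tau, np.inf)       # exclude self-similarities

        if alpha < 1e-8:                    # Gaussian limit of the family
            w = np.exp(-tau)
        else:
            w = (alpha * tau + 1.0) ** (-1.0 / alpha)

        # Normalize over all pairs to a joint distribution, as in symmetric SNE.
        return w / max(w.sum(), eps)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        Y = rng.normal(size=(5, 2))
        q_tsne = heavy_tailed_similarities(Y, alpha=1.0)   # t-SNE's q-matrix
        q_sne = heavy_tailed_similarities(Y, alpha=1e-9)   # ~ Gaussian kernel
        print(q_tsne.sum(), q_sne.sum())                   # both sum to 1

With alpha = 1 the family reduces to the normalized Student-t kernel of t-SNE [10], so a single scalar interpolates between SNE-like and t-SNE-like (and heavier) tails.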
[1] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. Advances in Neural Information Processing Systems, 14:585–591, 2002.
[2] M. Á. Carreira-Perpiñán. Gaussian mean-shift is an EM algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5):767–776, 2007.
[3] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5):603–619, 2002.
[4] J. A. Cook, I. Sutskever, A. Mnih, and G. E. Hinton. Visualizing similarity data with a mixture of maps. In Proceedings of the 11th International Conference on Artificial Intelligence and Statistics, volume 2, pages 67–74, 2007.
[5] M. Gashler, D. Ventura, and T. Martinez. Iterative non-linear dimensionality reduction with manifold sculpting. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 513–520. MIT Press, Cambridge, MA, 2008.
[6] G. Hinton and S. Roweis. Stochastic neighbor embedding. Advances in Neural Information Processing Systems, 15:833–840, 2003.
[7] G. J. McLachlan and D. Peel. Finite Mixture Models. Wiley, 2000.
[8] J. A. K. Suykens. Data visualization and dimensionality reduction using kernel maps with a reference point. IEEE Transactions on Neural Networks, 19(9):1501–1517, 2008.
[9] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323, Dec. 2000.
[10] L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008.