190 nips-2002-Stochastic Neighbor Embedding

Source: pdf

Author: Geoffrey E. Hinton, Sam T. Roweis

Abstract: We describe a probabilistic approach to the task of placing objects, described by high-dimensional vectors or by pairwise dissimilarities, in a low-dimensional space in a way that preserves neighbor identities. A Gaussian is centered on each object in the high-dimensional space and the densities under this Gaussian (or the given dissimilarities) are used to define a probability distribution over all the potential neighbors of the object. The aim of the embedding is to approximate this distribution as well as possible when the same operation is performed on the low-dimensional “images” of the objects. A natural cost function is a sum of Kullback-Leibler divergences, one per object, which leads to a simple gradient for adjusting the positions of the low-dimensional images. Unlike other dimensionality reduction methods, this probabilistic framework makes it easy to represent each object by a mixture of widely separated low-dimensional images. This allows ambiguous objects, like the document count vector for the word “bank”, to have versions close to the images of both “river” and “finance” without forcing the images of outdoor concepts to be located close to those of corporate concepts.
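
The cost and gradient described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation. It assumes a single fixed Gaussian width `sigma` shared by all objects (the abstract does not say how the widths are chosen) and a unit-width Gaussian in the low-dimensional space. The function names are hypothetical. The gradient is what results from differentiating the sum of Kullback-Leibler divergences under these assumptions.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distance between every pair of rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return (diff ** 2).sum(axis=-1)

def log_neighbor_probs(sq_dists):
    """Log-probability that row i picks column j as its neighbor.

    Each row is a softmax over negative squared distances; the diagonal
    is set to -inf so that an object never picks itself.
    """
    logits = -np.asarray(sq_dists, dtype=float)
    np.fill_diagonal(logits, -np.inf)
    m = logits.max(axis=1, keepdims=True)
    lse = m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    return logits - lse

def sne_cost_and_grad(logP, Y):
    """Sum of per-object KL(P_i || Q_i) and its gradient w.r.t. Y."""
    logQ = log_neighbor_probs(pairwise_sq_dists(Y))
    P, Q = np.exp(logP), np.exp(logQ)
    off_diag = ~np.eye(len(Y), dtype=bool)
    cost = np.sum(P[off_diag] * (logP[off_diag] - logQ[off_diag]))
    # dC/dy_i = 2 * sum_j (y_i - y_j) * (p_ij - q_ij + p_ji - q_ji)
    M = (P - Q) + (P - Q).T
    grad = 2.0 * (M.sum(axis=1, keepdims=True) * Y - M @ Y)
    return cost, grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 10))          # toy high-dimensional data
    sigma = 1.0                            # assumed shared Gaussian width
    logP = log_neighbor_probs(pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    Y = 1e-2 * rng.normal(size=(30, 2))    # small random initial images
    for step in range(100):
        cost, grad = sne_cost_and_grad(logP, Y)
        Y -= 0.05 * grad                   # plain gradient descent step
    print("final cost:", cost)
```

Plain gradient descent with a fixed step size is used here only to keep the sketch short; the step size and iteration count are arbitrary choices for the toy data.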

reference text

[1] T. Cox and M. Cox. Multidimensional Scaling. Chapman & Hall, London, 1994.

[2] J. Tenenbaum. Mapping a manifold of perceptual observations. In Advances in Neural Information Processing Systems, volume 10, pages 682–688. MIT Press, 1998.

[3] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290:2319–2323, 2000.

[4] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323–2326, 2000.

[5] T. Kohonen. Self-organization and Associative Memory. Springer-Verlag, Berlin, 1988.

[6] C. Bishop, M. Svensen, and C. Williams. GTM: The generative topographic mapping. Neural Computation, 10:215, 1998.

[7] J. J. Hull. A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):550–554, May 1994.

[8] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.

[9] Yann LeCun. Nips online web site. http://nips.djvuzone.org, 2001.

[10] Andrew Kachites McCallum. Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering. http://www.cs.cmu.edu/~mccallum/bow, 1996.

[11] A. Paccanaro and G.E. Hinton. Learning distributed representations of concepts from relational data using linear relational embedding. IEEE Transactions on Knowledge and Data Engineering, 13:232–245, 2000.