nips nips2003 nips2003-71 nips2003-71-reference knowledge-graph by maker-knowledge-mining

71 nips-2003-Fast Embedding of Sparse Similarity Graphs


Source: pdf

Author: John C. Platt

Abstract: This paper applies fast sparse multidimensional scaling (MDS) to a large graph of music similarity, with 267K vertices that represent artists, albums, and tracks; and 3.22M edges that represent similarity between those entities. Once vertices are assigned locations in a Euclidean space, the locations can be used to browse music and to generate playlists. MDS on very large sparse graphs can be effectively performed by a family of algorithms called Rectangular Dijsktra (RD) MDS algorithms. These RD algorithms operate on a dense rectangular slice of the distance matrix, created by calling Dijsktra a constant number of times. Two RD algorithms are compared: Landmark MDS, which uses the Nyström approximation to perform MDS; and a new algorithm called Fast Sparse Embedding, which uses FastMap. These algorithms compare favorably to Laplacian Eigenmaps, both in terms of speed and embedding quality. 1


reference text

[1] C. Baker. The numerical treatment of integral equations. Clarendon Press, Oxford, 1977.

[2] Y. Bengio, J.-F. Paiement, and P. Vincent. Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps and spectral clustering. In S. Thrun, L. Saul, and B. Schø

[3] A. P. Bradley. The user of area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30:1145–1159, 1997.

[4] T. F. Cox and M. A. A. Cox. Multidimensional Scaling. Number 88 in Monographs on Statistics and Applied Probability. Chapman & Hall/CRC, 2nd edition, 2001.

[5] V. de Silva and J. B. Tenenbaum. Global versus local methods in nonlinear dimensionality reduction. In S. Becker, S. Thrun, and K. Obermayer, editors, Proc. NIPS, volume 15, pages 721–728, 2003.

[6] E. W. Dijkstra. A note on two problems in connexion with graphs. Numerical Mathematics, 1:269–271, 1959.

[7] D. P. W. Ellis, B. Whitman, A. Berenzweig, and S. Lawrence. The quest for ground truth in musical artist similarity. In Proc. International Conference on Music Information Retrieval (ISMIR), 2002.

[8] C. Faloutsos and K.-I. Lin. Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia databases. In Proc. ACM SIGMOD, pages 163–174, 1995.

[9] R. Floyd. Algorithm 97 (shortest path). Communications of the ACM, 7:345, 1962.

[10] C. Fowlkes, S. Belongie, and J. Malik. Efficient spatiotemporal grouping using the Nyström method. In Proc. CVPR, volume 1, pages I–231–I–238, 2001.

[11] D. B. Johnson. Efficient algorithms for shortest paths in sparse networks. JACM, 24:1–13, 1977.

[12] J. C. Platt, C. J. C. Burges, S. Swenson, C. Weare, and A. Zheng. Learning a gaussian process prior for automatically generating music playlists. In T. Dietterich, S. Becker, and Z. Ghahramani, editors, Proc. NIPS, volume 14, pages 1425–1432, 2002.

[13] Y. Takane, F. W. Young, and J. de Leeuw. Nonmetric individual differences multidimensional scaling: an alternating least squares method with optimal scaling features. Psychometrika, 42:7–67, 1977.

[14] J. B. Tenenbaum. Mapping a manifold of perceptual observations. In M. Jordan, M. Kearns, and S. Solla, editors, Proc. NIPS, volume 10, pages 682–688, 1998.