nips2010-221: Random Projections for $k$-means Clustering


Authors: Christos Boutsidis, Anastasios Zouzias, Petros Drineas

Abstract: This paper discusses dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows of a matrix $A \in \mathbb{R}^{n \times d}$) can be projected into $t = \Omega(k/\varepsilon^2)$ dimensions, for any $\varepsilon \in (0, 1/3)$, in $O(nd\lceil \varepsilon^{-2} k / \log(d) \rceil)$ time, such that with constant probability the optimal $k$-partition of the point set is preserved within a factor of $2 + \varepsilon$. The projection is done by post-multiplying $A$ with a $d \times t$ random matrix $R$ having entries $+1/\sqrt{t}$ or $-1/\sqrt{t}$ with equal probability. A numerical implementation of our technique and experiments on a large dataset of face images verify the speed and the accuracy of our theoretical results.
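
The projection the abstract describes is simple to realize: draw a $d \times t$ matrix of random signs, scale it by $1/\sqrt{t}$, post-multiply the data matrix, and run $k$-means on the projected points. Below is a minimal sketch in Python, assuming NumPy and scikit-learn; the data matrix and the sizes n, d, k, and eps are illustrative placeholders, and the fast Mailman-algorithm multiplication of [15] is replaced by an ordinary matrix product.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Illustrative sizes: n points in d dimensions, k clusters, accuracy eps.
n, d, k, eps = 1000, 500, 5, 0.33
A = rng.standard_normal((n, d))      # stand-in for the data matrix A

# Target dimension t = O(k / eps^2), per the theorem.
t = int(np.ceil(k / eps**2))

# Random d x t matrix with entries +1/sqrt(t) or -1/sqrt(t), equal probability.
R = rng.choice([-1.0, 1.0], size=(d, t)) / np.sqrt(t)

# Project the points and cluster in t dimensions instead of d.
A_proj = A @ R
labels = KMeans(n_clusters=k, n_init=10).fit_predict(A_proj)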


References

[1] D. Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4):671–687, 2003.

[2] N. Ailon and B. Chazelle. Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform. In ACM Symposium on Theory of Computing (STOC), pages 557–563, 2006.

[3] D. Aloise, A. Deshpande, P. Hansen, and P. Popat. NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75(2):245–248, 2009.

[4] E. Bingham and H. Mannila. Random projection in dimensionality reduction: applications to image and text data. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 245–250, 2001.

[5] C. Boutsidis, M. W. Mahoney, and P. Drineas. Unsupervised feature selection for the k-means clustering problem. In Advances in Neural Information Processing Systems (NIPS), 2009.

[6] P. Drineas, A. Frieze, R. Kannan, S. Vempala, and V. Vinay. Clustering in large graphs and matrices. In ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 291–299, 1999.

[7] D. Foley and J. Sammon. An optimal set of discriminant vectors. IEEE Transactions on Computers, C-24(3):281–289, March 1975.

[8] I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, 2003.

[9] I. Guyon, S. Gunn, A. Ben-Hur, and G. Dror. Result analysis of the NIPS 2003 feature selection challenge. In Advances in Neural Information Processing Systems (NIPS), pages 545–552, 2005.

[10] X. He, D. Cai, and P. Niyogi. Laplacian score for feature selection. In Advances in Neural Information Processing Systems (NIPS) 18, pages 507–514, 2006.

[11] P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In ACM Symposium on Theory of Computing (STOC), pages 604–613, 1998.

[12] W. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189–206, 1984.

[13] E. Kokiopoulou, J. Chen, and Y. Saad. Trace optimization and eigenproblems in dimension reduction methods. Numerical Linear Algebra with Applications, to appear.

[14] A. Kumar, Y. Sabharwal, and S. Sen. A simple linear time (1+ε)-approximation algorithm for k-means clustering in any dimensions. In IEEE Symposium on Foundations of Computer Science (FOCS), pages 454–462, 2004.

[15] E. Liberty and S. Zucker. The Mailman algorithm: A note on matrix-vector multiplication. Information Processing Letters, 109(3):179–182, 2009.

[16] S. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.

[17] R. Ostrovsky, Y. Rabani, L. J. Schulman, and C. Swamy. The effectiveness of Lloyd-type methods for the k-means problem. In IEEE Symposium on Foundations of Computer Science (FOCS), pages 165–176, 2006.

[18] S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.

[19] T. Sarlos. Improved approximation algorithms for large matrices via random projections. In IEEE Symposium on Foundations of Computer Science (FOCS), pages 329–337, 2006.

[20] X. Wu et al. Top 10 algorithms in data mining. Knowledge and Information Systems, 14(1):1–37, 2008.

[21] http://www.cs.uiuc.edu/~dengcai2/Data/FaceData.html

[22] http://www.cs.uiuc.edu/~dengcai2/Data/data.html

[23] http://www.cs.nyu.edu/~roweis/lle/