
Fast Large-scale Mixture Modeling with Component-specific Data Partitions (NIPS 2010)


Source: pdf

Author: Bo Thiesson, Chong Wang

Abstract: Remarkably easy implementation and guaranteed convergence have made the EM algorithm one of the most used algorithms for mixture modeling. On the downside, the E-step is linear in both the sample size and the number of mixture components, making it impractical for large-scale data. Based on the variational EM framework, we propose a fast alternative that uses component-specific data partitions to obtain an E-step that is sub-linear in the sample size, while the algorithm still maintains provable convergence. Our approach builds on previous work, but is significantly faster and scales much better in the number of mixture components. We demonstrate this speedup through experiments on large-scale synthetic and real data.
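To make the abstract's key idea concrete, here is a minimal sketch (in Python, assuming NumPy) of EM over a coarse data partition. This is not the paper's algorithm, which uses component-specific partitions and a variational lower bound to retain provable convergence; it illustrates the simpler shared-partition variant in the spirit of [10, 16, 17], where every point in a block shares one responsibility vector, so the E-step cost scales with the number of blocks rather than the sample size. The names build_blocks and em_on_blocks, and the spherical-Gaussian simplification, are illustrative assumptions.

```python
# Sketch only: shared-partition variational EM for a spherical Gaussian
# mixture, not the component-specific scheme of the paper.
import numpy as np

def build_blocks(X, depth=4):
    """Recursively split X along its widest dimension, keeping only each
    block's sufficient statistics (count, mean). Returns <= 2**depth blocks."""
    if depth == 0 or len(X) <= 1:
        return [(len(X), X.mean(axis=0))]
    d = int(np.argmax(X.max(axis=0) - X.min(axis=0)))  # widest dimension
    median = np.median(X[:, d])
    left, right = X[X[:, d] <= median], X[X[:, d] > median]
    if len(left) == 0 or len(right) == 0:              # degenerate split
        return [(len(X), X.mean(axis=0))]
    return build_blocks(left, depth - 1) + build_blocks(right, depth - 1)

def em_on_blocks(blocks, K, iters=50, seed=0):
    """EM where all points in a block share one responsibility vector
    (a variational approximation). This sketch keeps only block means and
    ignores within-block scatter; a faithful version would also carry
    per-block second moments."""
    rng = np.random.default_rng(seed)
    counts = np.array([n for n, _ in blocks], dtype=float)  # block sizes
    means = np.array([m for _, m in blocks])                # block means
    D = means.shape[1]
    mu = means[rng.choice(len(blocks), K, replace=False)]   # init from blocks
    var, pi = np.ones(K), np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: one responsibility vector per block, cost O(#blocks * K),
        # independent of the number of underlying data points.
        d2 = ((means[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        logp = np.log(pi) - 0.5 * d2 / var - 0.5 * D * np.log(var)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weight each block's statistics by its point count.
        w = r * counts[:, None]
        Nk = np.maximum(w.sum(axis=0), 1e-12)
        pi = Nk / Nk.sum()
        mu = (w.T @ means) / Nk[:, None]
        d2_new = ((means[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        var = np.maximum((w * d2_new).sum(axis=0) / (Nk * D), 1e-6)
    return pi, mu, var

# Example: 20,000 points summarized by at most 2**6 = 64 blocks.
X = np.random.default_rng(1).normal(size=(20000, 2))
pi, mu, var = em_on_blocks(build_blocks(X, depth=6), K=5)
```

Carrying only (count, mean) per block keeps the sketch short; the works cited above additionally propagate per-block covariances, and the present paper further lets each mixture component refine its own partition where the shared-responsibility assumption is too coarse.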


References

[1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[2] P. S. Bradley, U. M. Fayyad, and C. A. Reina. Scaling EM (expectation maximization) clustering to large databases. Technical Report MSR-TR-98-3, Microsoft Research, 1998.

[3] S. Dasgupta. Learning mixtures of Gaussians. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pages 634–644, 1999.

[4] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.

[5] G. Hamerly. Making k-means even faster. In SIAM International Conference on Data Mining (SDM), 2010.

[6] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):881–892, 2002.

[7] G. J. McLachlan and D. Peel. Finite Mixture Models. Wiley-Interscience, New York, USA, 2000.

[8] A. Moore. The anchors hierarchy: Using the triangle inequality to survive high-dimensional data. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, pages 397–405. Morgan Kaufmann, 2000.

[9] A. W. Moore. A tutorial on kd-trees. Technical Report 209, University of Cambridge, 1991.

[10] A. W. Moore. Very fast EM-based mixture model clustering using multiresolution kd-trees. In Advances in Neural Information Processing Systems, pages 543–549. Morgan Kaufmann, 1999.

[11] R. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models, pages 355–368, 1998.

[12] L. E. Ortiz and L. P. Kaelbling. Accelerating EM: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pages 512–521, 1999.

[13] D. Pelleg and A. Moore. Accelerating exact k-means algorithms with geometric reasoning. In S. Chaudhuri and D. Madigan, editors, Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pages 277–281. AAAI Press, 1999.

[14] B. Thiesson, C. Meek, and D. Heckerman. Accelerating EM for large databases. Machine Learning, 45(3):279–299, 2001.

[15] J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Information Processing Letters, 40(4):175–179, 1991.

[16] J. J. Verbeek, J. R. Nunnink, and N. Vlassis. Accelerated EM-based clustering of large data sets. Data Mining and Knowledge Discovery, 13(3):291–307, 2006.

[17] J. J. Verbeek, N. Vlassis, and J. R. J. Nunnink. A variational EM algorithm for large-scale mixture modeling. In Proceedings of the 8th Annual Conference of the Advanced School for Computing and Imaging (ASCI), 2003.