
344 nips-2013: Using multiple samples to learn mixture models



Author: Jason Lee, Ran Gilad-Bachrach, Rich Caruana

Abstract: In the mixture model problem it is assumed that there are K distributions θ1, . . . , θK and one observes a sample from a mixture of these distributions with unknown mixing coefficients. The goal is to associate instances with their generating distributions, or to identify the parameters of the hidden distributions. In this work we assume access to several samples drawn from the same K underlying distributions, but with different mixing weights. As in topic modeling, having multiple samples is often a reasonable assumption. Instead of pooling the data into one sample, we prove that it is possible to use the differences between the samples to better recover the underlying structure. We present algorithms that recover the underlying structure under milder assumptions than the current state of the art when either the dimensionality or the separation is high. The methods, when applied to topic modeling, allow generalization to words not present in the training data.
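A minimal Python sketch of the data-generation setting the abstract describes, not of the paper's algorithm: two samples share the same K Gaussian components but are drawn with different mixing weights. The names draw_sample, means, and the specific weight vectors are hypothetical choices for illustration.

# Toy sketch of the multi-sample mixture setting (illustration only,
# not the paper's algorithm). Two samples share the same K Gaussian
# components but use different, hypothetical mixing weights.
import numpy as np

rng = np.random.default_rng(0)
K, d, n = 3, 2, 1000                         # components, dimension, sample size
means = rng.normal(scale=5.0, size=(K, d))   # shared component means

def draw_sample(weights, n):
    """Draw n points from the mixture with the given mixing weights."""
    labels = rng.choice(K, size=n, p=weights)
    return means[labels] + rng.normal(size=(n, d)), labels

# Same components, different mixing weights in each sample.
X1, y1 = draw_sample(np.array([0.6, 0.3, 0.1]), n)
X2, y2 = draw_sample(np.array([0.1, 0.3, 0.6]), n)

# Pooling X1 and X2 discards the weight differences. The contrast between
# the samples carries structure: with equal sample sizes, the fraction of
# sample-1 points inside component k concentrates around
# w1_k / (w1_k + w2_k), which differs across components.
for k in range(K):
    n1, n2 = np.sum(y1 == k), np.sum(y2 == k)
    print(f"component {k}: sample-1 fraction = {n1 / (n1 + n2):.2f}")

Because these per-component fractions differ whenever the weight vectors differ, a statistic that estimates "which sample did x come from" varies by component, which is the kind of extra signal the abstract refers to when it argues against pooling the samples.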


References

[1] Mikhail Belkin and Kaushik Sinha, Polynomial learning of distribution families, Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), IEEE, 2010, pp. 103–112.

[2] Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira, Analysis of representations for domain adaptation, Advances in Neural Information Processing Systems 19 (2007), 137.

[3] David M. Blei, Andrew Y. Ng, and Michael I. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research 3 (2003), 993–1022.

[4] Kamalika Chaudhuri and Satish Rao, Learning mixtures of product distributions using correlations and independence, Proceedings of COLT, 2008.

[5] Sanjoy Dasgupta, Learning mixtures of Gaussians, Proceedings of the 40th Annual Symposium on Foundations of Computer Science (FOCS), IEEE, 1999, pp. 634–644.

[6] Adam Tauman Kalai, Ankur Moitra, and Gregory Valiant, Efficiently learning mixtures of two Gaussians, Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), ACM, 2010, pp. 553–562.

[7] Ravindran Kannan, Hadi Salmasian, and Santosh Vempala, The spectral method for general mixture models, Learning Theory, Springer, 2005, pp. 444–457.

[8] Ankur Moitra and Gregory Valiant, Settling the polynomial learnability of mixtures of Gaussians, Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), IEEE, 2010, pp. 93–102.

[9] Christos H. Papadimitriou, Prabhakar Raghavan, Hisao Tamaki, and Santosh Vempala, Latent semantic indexing: A probabilistic analysis, Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), ACM, 1998, pp. 159–168.

[10] Sanjeev Arora and Ravi Kannan, Learning mixtures of arbitrary Gaussians, Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing (STOC), ACM, 2001, pp. 247–257.

[11] Santosh Vempala and Grant Wang, A spectral algorithm for learning mixtures of distributions, Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), IEEE, 2002, pp. 113–122.