169 nips-2004-Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes


Source: pdf

Author: Yee W. Teh, Michael I. Jordan, Matthew J. Beal, David M. Blei

Abstract: We propose the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data. Each group of data is modeled with a mixture, with the number of components being open-ended and inferred automatically by the model. Further, components can be shared across groups, allowing dependencies across groups to be modeled effectively as well as conferring generalization to new groups. Such grouped clustering problems occur often in practice, e.g. in the problem of topic discovery in document corpora. We report experimental results on three text corpora showing the effective and superior performance of the HDP over previous models.


reference text

[1] D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.

[2] M.D. Escobar and M. West. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90:577–588, 1995.

[3] S.N. MacEachern and P. Müller. Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7:223–238, 1998.

[4] T.S. Ferguson. A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1(2):209–230, 1973.

[5] D. Aldous. Exchangeability and related topics. In École d’été de probabilités de Saint-Flour XIII–1983, pages 1–198. Springer, Berlin, 1985.

[6] J. Sethuraman. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.

[7] R.M. Neal. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9:249–265, 2000.

[8] C.E. Rasmussen. The infinite Gaussian mixture model. In NIPS, volume 12, 2000.

[9] D.M. Blei, T.L. Griffiths, M.I. Jordan, and J.B. Tenenbaum. Hierarchical topic models and the nested Chinese restaurant process. In NIPS, 2004.

[10] Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. Technical Report 653, Department of Statistics, University of California at Berkeley, 2004.

[11] M.J. Beal, Z. Ghahramani, and C.E. Rasmussen. The infinite hidden Markov model. In NIPS, volume 14, 2002.

[12] M.J. Beal. Variational Algorithms for Approximate Bayesian Inference. PhD thesis, Gatsby Unit, University College London, 2004.