
Hierarchical Topic Models and the Nested Chinese Restaurant Process



Authors: Thomas L. Griffiths, Michael I. Jordan, Joshua B. Tenenbaum, David M. Blei

Abstract: We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
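The nested Chinese restaurant process described in the abstract can be summarized as follows: each document draws a root-to-leaf path of fixed depth through an infinite tree, choosing a branch at every node via a Chinese restaurant process over that node's children. Below is a minimal Python sketch of this path-sampling step, assuming a single concentration parameter; the function names and the parameter `gamma` are illustrative, not taken from the paper.

```python
import random
from collections import defaultdict

def crp_draw(counts, gamma):
    """Draw a table from a Chinese restaurant process.

    counts: customer counts for existing tables.
    gamma:  concentration parameter; larger values favor new tables.
    Returns an index into counts, or len(counts) for a new table.
    """
    total = sum(counts) + gamma
    r = random.uniform(0, total)
    for i, c in enumerate(counts):
        r -= c
        if r < 0:
            return i
    return len(counts)  # seat the customer at a new table

def nested_crp_paths(num_documents, depth, gamma):
    """Sample a root-to-leaf path of length `depth` for each document.

    At every node, a CRP over that node's children picks the next
    branch, so popular subtrees attract more documents.
    """
    child_counts = defaultdict(list)  # node path -> per-child counts
    paths = []
    for _ in range(num_documents):
        node = ()  # start at the root
        for _ in range(depth):
            k = crp_draw(child_counts[node], gamma)
            if k == len(child_counts[node]):
                child_counts[node].append(0)  # create a new branch
            child_counts[node][k] += 1
            node = node + (k,)
        paths.append(node)
    return paths

if __name__ == "__main__":
    random.seed(0)
    for path in nested_crp_paths(num_documents=5, depth=3, gamma=1.0):
        print(path)
```

Earlier documents' choices make well-traveled branches more probable, while `gamma` controls how often a new branch (a new subtopic) is created; this is what lets the prior accommodate arbitrarily large branching factors and growing collections.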


References

[1] D. Aldous. Exchangeability and related topics. In École d'été de probabilités de Saint-Flour XIII—1983, pages 1–198. Springer, Berlin, 1985.

[2] E. Segal, D. Koller, and D. Ormoneit. Probabilistic abstraction hierarchies. In Advances in Neural Information Processing Systems 14. MIT Press, 2002.

[3] T. Hofmann. The cluster-abstraction model: Unsupervised learning of topic hierarchies from text data. In IJCAI, pages 682–687, 1999.

[4] T. Ferguson. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1:209–230, 1973.

[5] J. Pitman. Combinatorial Stochastic Processes. Notes for the St. Flour Summer School, 2002.

[6] H. Ishwaran and L. James. Generalized weighted Chinese restaurant processes for species sampling mixture models. Statistica Sinica, 13:1211–1235, 2003.

[7] R. Neal. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2):249–265, June 2000.

[8] M. West, P. Müller, and M. Escobar. Hierarchical priors and mixture models, with application in regression and density estimation. In Aspects of Uncertainty. John Wiley, 1994.

[9] M. Beal, Z. Ghahramani, and C. Rasmussen. The infinite hidden Markov model. In Advances in Neural Information Processing Systems 14. MIT Press, 2002.

[10] C. Rasmussen and Z. Ghahramani. Infinite mixtures of Gaussian process experts. In Advances in Neural Information Processing Systems 14. MIT Press, 2002.

[11] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.

[12] T. Griffiths and M. Steyvers. A probabilistic approach to semantic representation. In Proceedings of the 24th Annual Conference of the Cognitive Science Society, 2002.

[13] R. Kass and A. Raftery. Bayes factors. Journal of the American Statistical Association, 90(430):773–795, 1995.

[14] S. Roweis. NIPS abstracts, 1987–1999. http://www.cs.toronto.edu/~roweis/data.html.