
255 nips-2009-Variational Inference for the Nested Chinese Restaurant Process


Source: pdf

Author: Chong Wang, David M. Blei

Abstract: The nested Chinese restaurant process (nCRP) is a powerful nonparametric Bayesian model for learning tree-based hierarchies from data. Since its posterior distribution is intractable, current inference methods have all relied on MCMC sampling. In this paper, we develop an alternative inference technique based on variational methods. To employ variational methods, we derive a tree-based stick-breaking construction of the nCRP mixture model, and a novel variational algorithm that efficiently explores a posterior over a large set of combinatorial structures. We demonstrate the use of this approach for text and handwritten digit modeling, where we show that the nCRP can also be adapted to continuous data.
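
To make the stick-breaking idea mentioned in the abstract concrete, here is a minimal sketch, not the paper's construction or its variational algorithm. It assumes a generic truncated stick-breaking draw at every tree node with Beta(1, alpha) stick fractions, and samples one root-to-leaf path by choosing a child at each level. The names `stick_breaking_weights`, `sample_path`, `node_weights`, `truncation`, and `depth` are hypothetical and chosen only for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights (assumed generic form, not the paper's).

    Each step breaks a Beta(1, alpha) fraction off the remaining stick.
    The last weight takes all remaining mass so the weights sum to 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_path(node_weights, alpha, truncation, depth, rng):
    """Sample one path of length `depth` through a lazily built tree.

    `node_weights` maps a node (a tuple of child indices from the root)
    to that node's child weights; entries are created on first visit.
    """
    path = ()
    for _ in range(depth):
        if path not in node_weights:
            node_weights[path] = stick_breaking_weights(alpha, truncation, rng)
        child = rng.choices(range(truncation), weights=node_weights[path])[0]
        path = path + (child,)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    tree = {}
    # Paths drawn from the same tree share weights, so popular branches recur.
    for _ in range(5):
        print(sample_path(tree, alpha=1.0, truncation=5, depth=3, rng=rng))
```

Because `node_weights` is shared across calls, repeated draws reuse the same per-node weights, which is what lets several data points cluster along common branches in this toy setting.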


reference text

[1] Blei, D. M., T. L. Griffiths, M. I. Jordan, et al. Hierarchical topic models and the nested Chinese restaurant process. In NIPS. 2003.

[2] Bart, E., I. Porteous, P. Perona, et al. Unsupervised learning of visual taxonomies. In CVPR. 2008.

[3] Sivic, J., B. C. Russell, A. Zisserman, et al. Unsupervised discovery of visual object class hierarchies. In CVPR. 2008.

[4] Aldous, D. Exchangeability and related topics. In École d’Été de Probabilités de Saint-Flour XIII, 1983, pages 1–198. Springer, 1985.

[5] Ferguson, T. S. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1(2):209–230, 1973.

[6] Neal, R. Probabilistic inference using Markov chain Monte Carlo methods. Tech. Rep. CRG-TR-93-1, Department of Computer Science, University of Toronto, 1993.

[7] Robert, C., G. Casella. Monte Carlo Statistical Methods. Springer-Verlag, New York, NY, 2004.

[8] Jordan, M. I., Z. Ghahramani, T. S. Jaakkola, et al. An introduction to variational methods for graphical models. Learning in Graphical Models, 1999.

[9] Blei, D. M., M. I. Jordan. Variational methods for the Dirichlet process. In ICML. 2004.

[10] Kurihara, K., M. Welling, N. A. Vlassis. Accelerated variational Dirichlet process mixtures. In NIPS. 2006.

[11] Kurihara, K., M. Welling, Y. W. Teh. Collapsed variational Dirichlet process mixture models. In IJCAI. 2007.

[12] Teh, Y. W., K. Kurihara, M. Welling. Collapsed variational inference for HDP. In NIPS. 2008.

[13] Sudderth, E. B., M. I. Jordan. Shared segmentation of natural scenes using dependent Pitman-Yor processes. In NIPS. 2008.

[14] Doshi, F., K. T. Miller, J. Van Gael, et al. Variational inference for the Indian buffet process. In AISTATS, vol. 12. 2009.

[15] Escobar, M. D., M. West. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90:577–588, 1995.

[16] Tipping, M. E., C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society, Series B, 61:611–622, 1999.

[17] Bishop, C. M. Variational principal components. In ICANN. 1999.

[18] Collins, M., S. Dasgupta, R. E. Schapire. A generalization of principal components analysis to the exponential family. In NIPS. 2001.

[19] Mohamed, S., K. A. Heller, Z. Ghahramani. Bayesian exponential family PCA. In NIPS. 2008.

[20] Bach, F. R., M. I. Jordan. Beyond independent components: Trees and clusters. JMLR, 4:1205–1233, 2003.

[21] Antoniak, C. E. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics, 2(6):1152–1174, 1974.

[22] Sethuraman, J. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.

[23] Wainwright, M., M. Jordan. Variational inference in graphical models: The view from the marginal polytope. In Allerton Conference on Control, Communication and Computation. 2003.

[24] Ueda, N., R. Nakano, Z. Ghahramani, et al. SMEM algorithm for mixture models. Neural Computation, 12(9):2109–2128, 2000.

[25] Griffiths, T. L., M. Steyvers. Finding scientific topics. Proc Natl Acad Sci USA, 101 Suppl 1:5228–5235, 2004.

[26] Tipping, M. E., C. M. Bishop. Mixtures of probabilistic principal component analysers. Neural Computation, 11(2):443–482, 1999.