nips nips2009 nips2009-65 nips2009-65-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Chong Wang, David M. Blei
Abstract: We present a nonparametric hierarchical Bayesian model of document collections that decouples sparsity and smoothness in the component distributions (i.e., the “topics”). In the sparse topic model (sparseTM), each topic is represented by a bank of selector variables that determine which terms appear in the topic. Thus each topic is associated with a subset of the vocabulary, and topic smoothness is modeled on this subset. We develop an efficient Gibbs sampler for the sparseTM that includes a general-purpose method for sampling from a Dirichlet mixture with a combinatorial number of components. We demonstrate the sparseTM on four real-world datasets. Compared to traditional approaches, the empirical results show that sparseTMs give better predictive performance with simpler inferred models.
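The selector-variable idea in the abstract can be illustrated with a minimal sketch. It assumes each term is switched on independently with a fixed probability and that the topic is a symmetric Dirichlet draw over the selected terms only; the abstract does not state these priors, so the function name and the parameters `include_prob` and `gamma` are hypothetical, and the paper's actual model may differ.

```python
import random

def sample_sparse_topic(vocab_size, include_prob, gamma, rng):
    """Draw one topic whose support is a random subset of the vocabulary."""
    # Selector variables: selectors[v] is True when term v may appear in the topic.
    # Assumption: independent Bernoulli(include_prob) draws, not taken from the source.
    selectors = [rng.random() < include_prob for _ in range(vocab_size)]
    if not any(selectors):
        # Assumption of this sketch: keep at least one term so the topic is a valid distribution.
        selectors[rng.randrange(vocab_size)] = True
    # Symmetric Dirichlet(gamma) over the selected terms, drawn as normalized Gamma variates.
    # Unselected terms get exactly zero probability (sparsity); gamma controls smoothness.
    weights = [rng.gammavariate(gamma, 1.0) if on else 0.0 for on in selectors]
    total = sum(weights)
    topic = [w / total for w in weights]
    return selectors, topic

rng = random.Random(0)
selectors, topic = sample_sparse_topic(vocab_size=20, include_prob=0.3, gamma=0.5, rng=rng)
```

In this sketch, sparsity is governed by `include_prob` alone and smoothness by `gamma` alone, which is the decoupling the abstract describes: shrinking `gamma` makes the topic peakier without changing which terms it can contain.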
[1] Teh, Y. W., M. I. Jordan, M. J. Beal, et al. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.
[2] Blei, D., A. Ng, M. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, 2003.
[3] Griffiths, T., M. Steyvers. Probabilistic topic models. In Latent Semantic Analysis: A Road to Meaning. 2006.
[4] Saund, E. A multiple cause mixture model for unsupervised learning. Neural Comput., 7(1):51–71, 1995.
[5] Kabán, A., E. Bingham, T. Hirsimäki. Learning to read between the lines: The aspect Bernoulli model. In SDM. 2004.
[6] Ishwaran, H., J. S. Rao. Spike and slab variable selection: Frequentist and Bayesian strategies. The Annals of Statistics, 33(2):730–773, 2005.
[7] Friedman, N., Y. Singer. Efficient Bayesian parameter estimation in large discrete domains. In NIPS. 1999.
[8] Pitman, J. Poisson–Dirichlet and GEM invariant distributions for split-and-merge transformations of an interval partition. Comb. Probab. Comput., 11(5):501–514, 2002.
[9] Sethuraman, J. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.
[10] Escobar, M. D., M. West. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90:577–588, 1995.