
274 nips-2012-Priors for Diversity in Generative Latent Variable Models


Source: pdf

Author: James Y. Zou, Ryan P. Adams

Abstract: Probabilistic latent variable models are one of the cornerstones of machine learning. They offer a convenient and coherent way to specify prior distributions over unobserved structure in data, so that these unknown properties can be inferred via posterior inference. Such models are useful for exploratory analysis and visualization, for building density models of data, and for providing features that can be used for later discriminative tasks. A significant limitation of these models, however, is that draws from the prior are often highly redundant due to i.i.d. assumptions on internal parameters. For example, there is no preference in the prior of a mixture model to make components non-overlapping, or in a topic model to ensure that co-occurring words only appear in a small number of topics. In this work, we revisit these independence assumptions for probabilistic latent variable models, replacing the underlying i.i.d. prior with a determinantal point process (DPP). The DPP allows us to specify a preference for diversity in our latent variables using a positive definite kernel function. Using a kernel between probability distributions, we are able to define a DPP on probability measures. We show how to perform MAP inference with DPP priors in latent Dirichlet allocation and in mixture models, leading to better intuition for the latent variable representation and quantitatively improved unsupervised feature extraction, without compromising the generative aspects of the model.
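
The paper itself contains no code; the snippet below is a minimal sketch of the idea described in the abstract, under the assumption of spherical Gaussian mixture components with a shared variance sigma^2. In that special case the probability product (Bhattacharyya) kernel of [10] between two components reduces to an RBF kernel on the component means, exp(-||mu_i - mu_j||^2 / (8 sigma^2)), and the DPP prior scores a set of components by the log-determinant of the resulting kernel matrix. Function and variable names here are illustrative, not the authors'.

```python
# Sketch (not the authors' code): a DPP log-prior over Gaussian mixture
# component means. Assumes spherical components with shared variance
# sigma^2, so the Bhattacharyya kernel between components is an RBF
# kernel on the means: k(i, j) = exp(-||mu_i - mu_j||^2 / (8 sigma^2)).
import numpy as np

def dpp_log_prior(means, sigma=1.0):
    """Return log det of the kernel matrix over component means.

    Well-separated means give a kernel matrix close to the identity and
    hence a log-determinant near zero; overlapping means drive the
    matrix toward singularity and are strongly penalized.
    """
    sq_dists = np.sum((means[:, None, :] - means[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (8.0 * sigma ** 2))
    # slogdet is numerically safer than log(det(K)) for near-singular K.
    sign, logdet = np.linalg.slogdet(K)
    return logdet if sign > 0 else -np.inf

# The diversity preference in action: separated components score higher.
separated = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
clustered = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
print(dpp_log_prior(separated) > dpp_log_prior(clustered))  # True
```

In a MAP inference scheme of the kind the abstract describes, a term like this log-determinant would be added to the usual data log-likelihood, so that overlapping components are discouraged without changing the generative story for the data.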


reference text

[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.

[2] J. F. C. Kingman. Poisson Processes. Oxford University Press, Oxford, United Kingdom, 1993.

[3] David J. Strauss. A model for clustering. Biometrika, 62(2):467–475, August 1975.

[4] Jesper Møller and Rasmus Plenge Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes. Monographs on Statistics and Applied Probability. Chapman and Hall/CRC, Boca Raton, FL, 2004.

[5] J. Ben Hough, Manjunath Krishnapur, Yuval Peres, and Bálint Virág. Determinantal processes and independence. Probability Surveys, 3:206–229, 2006.

[6] Antonello Scardicchio, Chase E. Zachary, and Salvatore Torquato. Statistical properties of determinantal point processes in high-dimensional Euclidean spaces. Physical Review E, 79(4), 2009.

[7] Frédéric Lavancier, Jesper Møller, and Ege Rubak. Statistical aspects of determinantal point processes. http://arxiv.org/abs/1205.4818, 2012.

[8] Alex Kulesza and Ben Taskar. Structured determinantal point processes. In Advances in Neural Information Processing Systems 23, 2011.

[9] Alex Kulesza and Ben Taskar. Learning determinantal point processes. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, 2011.

[10] Tony Jebara, Risi Kondor, and Andrew Howard. Probability product kernels. Journal of Machine Learning Research, 5:819–844, 2004.

[11] Adam Coates, Honglak Lee, and Andrew Y. Ng. An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, 2011.