
320 nips-2013-Summary Statistics for Partitionings and Feature Allocations

Source: pdf

Author: Isik B. Fidaner, Taylan Cemgil

Abstract: Infinite mixture models are commonly used for clustering. One can sample from the posterior of mixture assignments by Monte Carlo methods or find its maximum a posteriori solution by optimization. However, in some problems the posterior is diffuse and it is hard to interpret the sampled partitionings. In this paper, we introduce novel statistics based on block sizes for representing sample sets of partitionings and feature allocations. We develop an element-based definition of entropy to quantify segmentation among their elements. Then we propose a simple algorithm called entropy agglomeration (EA) to summarize and visualize this information. Experiments on various infinite mixture posteriors as well as a feature allocation dataset demonstrate that the proposed statistics are useful in practice.
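The abstract names an entropy based on block sizes and an agglomerative procedure built on it, but does not give their formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the entropy of a partitioning is the Shannon entropy of its relative block sizes, and that agglomeration greedily merges the pair of element groups whose union has the lowest entropy averaged over the sampled partitionings. All function names (`partition_entropy`, `project`, `mean_projection_entropy`, `entropy_agglomeration`) are hypothetical.

```python
import math
from itertools import combinations


def partition_entropy(blocks, n):
    """Assumed form: Shannon entropy of the relative block sizes |B|/n."""
    return sum((len(b) / n) * math.log(n / len(b)) for b in blocks if b)


def project(blocks, subset):
    """Restrict a partitioning to `subset`, dropping blocks that become empty."""
    subset = set(subset)
    return [set(b) & subset for b in blocks if set(b) & subset]


def mean_projection_entropy(samples, subset):
    """Average entropy of the sampled partitionings projected onto `subset`."""
    n = len(subset)
    total = sum(partition_entropy(project(z, subset), n) for z in samples)
    return total / len(samples)


def entropy_agglomeration(samples, elements):
    """Greedy sketch: repeatedly merge the two groups whose union has the
    lowest mean projection entropy; returns the merges in order."""
    groups = [frozenset([e]) for e in elements]
    merges = []
    while len(groups) > 1:
        a, b = min(
            combinations(groups, 2),
            key=lambda pair: mean_projection_entropy(samples, pair[0] | pair[1]),
        )
        score = mean_projection_entropy(samples, a | b)
        groups = [g for g in groups if g not in (a, b)] + [a | b]
        merges.append((set(a), set(b), score))
    return merges


# Three made-up sampled partitionings of the elements 1..4 (illustrative only).
samples = [
    [{1, 2}, {3, 4}],
    [{1, 2}, {3}, {4}],
    [{1, 2, 3}, {4}],
]
merges = entropy_agglomeration(samples, [1, 2, 3, 4])
```

In this toy input, elements 1 and 2 share a block in every sample, so their union has zero entropy and they are merged first. The recorded scores could be used as merge heights in a dendrogram to visualize the result.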

reference text

[1] Ferguson, T. S. (1973) A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1(2):209–230.

[2] Teh, Y. W. (2010) Dirichlet Processes. In Encyclopedia of Machine Learning. Springer.

[3] Kingman, J. F. C. (1992). Poisson processes. Oxford University Press.

[4] Pitman, J., & Yor, M. (1997) The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator. Annals of Probability, 25:855-900.

[5] Pitman, J. (2006) Combinatorial Stochastic Processes. Lecture Notes in Mathematics. Springer-Verlag.

[6] Sethuraman, J. (1994) A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650.

[7] Neal, R. M. (2000) Markov chain sampling methods for Dirichlet process mixture models, Journal of Computational and Graphical Statistics, 9:249–265.

[8] Meeds, E., Ghahramani, Z., Neal, R., & Roweis, S. (2007) Modelling dyadic data with binary latent factors. In Advances in Neural Information Processing Systems 19.

[9] Teh, Y. W., Jordan, M. I., Beal, M. J., & Blei, D. M. (2006) Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581.

[10] Griffiths, T. L. & Ghahramani, Z. (2011) The Indian buffet process: An introduction and review. Journal of Machine Learning Research, 12:1185–1224.

[11] Broderick, T., Pitman, J., & Jordan, M. I. (2013). Feature allocations, probability functions, and paintboxes. arXiv preprint arXiv:1301.6647.

[12] Teh, Y. W., Blundell, C., & Elliott, L. T. (2011). Modelling genetic variations with fragmentation-coagulation processes. In Advances in Neural Information Processing Systems 23.

[13] Orbanz, P. & Teh, Y. W. (2010). Bayesian Nonparametric Models. In Encyclopedia of Machine Learning. Springer.

[14] Medvedovic, M. & Sivaganesan, S. (2002) Bayesian infinite mixture model based clustering of gene expression profiles. Bioinformatics, 18:1194–1206.

[15] Medvedovic, M., Yeung, K. and Bumgarner, R. (2004) Bayesian mixture model based clustering of replicated microarray data. Bioinformatics 20:1222–1232.

[16] Liu, X., Sivaganesan, S., Yeung, K. Y., Guo, J., Bumgarner, R. E. & Medvedovic, M. (2006) Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset. Bioinformatics, 22:1737–1744.

[17] Shannon, C. E. (1948) A Mathematical Theory of Communication. Bell System Technical Journal, 27(3):379–423.

[18] I. Nemenman, F. Shafee, & W. Bialek. (2002) Entropy and inference, revisited. In Advances in Neural Information Processing Systems, 14.

[19] Archer, E., Park, I. M., & Pillow, J. (2013) Bayesian Entropy Estimation for Countable Discrete Distributions. arXiv preprint arXiv:1302.0328.

[20] Simovici, D. (2007) On Generalized Entropy and Entropic Metrics. Journal of Multiple Valued Logic and Soft Computing, 13(4/6):295.

[21] Ellerman, D. (2009) Counting distinctions: on the conceptual foundations of Shannon’s information theory. Synthese, 168(1):119-149.

[22] Neal, R. M. (1992) Bayesian mixture modeling, in Maximum Entropy and Bayesian Methods: Proceedings of the 11th International Workshop on Maximum Entropy and Bayesian Methods of Statistical Analysis, Seattle, 1991, eds, Smith, Erickson, & Neudorfer, Dordrecht: Kluwer Academic Publishers, 197-211.

[23] Eisen, M. B., Spellman, P. T., Brown, P. O., & Botstein, D. (1998) Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences, 95(25):14863–14868.

[24] Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2):179-188.

[25] Ideker, T., Thorsson, V., Ranish, J. A., Christmas, R., Buhler, J., Eng, J. K., Bumgarner, R., Goodlett, D. R., Aebersold, R. & Hood, L. (2001) Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science, 292(5518):929-934.

[26] Pevehouse, J. C., Nordstrom, T. & Warnke, K. (2004) The COW-2 International Organizations Dataset Version 2.0. Conflict Management and Peace Science, 21(2):101–119. http://www.correlatesofwar.org/COW2%20Data/IGOs/IGOv2-1.htm