nips nips2001 nips2001-149 nips2001-149-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Eran Segal, Daphne Koller, Dirk Ormoneit
Abstract: Many domains are naturally organized in an abstraction hierarchy or taxonomy, where the instances in “nearby” classes in the taxonomy are similar. In this paper, we provide a general probabilistic framework for clustering data into a set of classes organized as a taxonomy, where each class is associated with a probabilistic model from which the data was generated. The clustering algorithm simultaneously optimizes three things: the assignment of data instances to clusters, the models associated with the clusters, and the structure of the abstraction hierarchy. A unique feature of our approach is that it utilizes global optimization algorithms for both of the last two steps, reducing the sensitivity to noise and the propensity to local maxima that are characteristic of algorithms such as hierarchical agglomerative clustering that only take local steps. We provide a theoretical analysis for our algorithm, showing that it converges to a local maximum of the joint likelihood of model and data. We present experimental results on synthetic data, and on real data in the domains of gene expression and text.
[1] P. Cheeseman and J. Stutz. Bayesian Classification (AutoClass): Theory and Results. AAAI Press, 1995.
[2] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, B 39:1–39, 1977.
[3] M. Eisen, P. Spellman, P. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. PNAS, 95:14863–68, 1998.
[4] N. Friedman. The Bayesian structural EM algorithm. In Proc. UAI, 1998.
[5] N. Friedman, M. Ninio, I. Pe’er, and T. Pupko. A structural EM algorithm for phylogentic inference. In Proc. RECOMB, 2001.
[6] A.P. Gasch et al. Genomic expression program in the response of yeast cells to environmental changes. Mol. Bio. Cell, 11:4241–4257, 2000.
[7] T. Hofmann. The cluster-abstraction model: Unsupervised learning of topic hierarchies from text data. In Proc. IJCAI, 1999.
[8] T. Hofmann. The cluster-abstraction model: Unsupervised learning of topic hierarchies from text data. In Proc. International Joint Conference on Artificial Intelligence, 1999.
[9] T. R. Hughes et al. Functional discovery via a compendium of expression profiles. Cell, 102(1):109–26, 2000.
[10] F.K. Hwang, D.S.Richards, and P. Winter. The Steiner Tree Problem. Annals of Discrete Mathematics, Vol. 53, North-Holland, 1992.
[11] A. Krogh, M. Brown, S. Mian, K. Sjolander, and D. Haussler. Hidden markov models in computational biology: Applications to protein modeling. Mol. Biology, 235:1501–1531, 1994.
[12] A. McCallum, R. Rosenfeld, T. Mitchell, and A. Ng. Improving text classification by shrinkage in a hierarchy of classes. In Proc. ICML, 1998.
[13] M. Meila and M.I. Jordan. Learning with mixtures of trees. Machine Learning, 1:1–48, 2000.
[14] E. Segal and D. Koller. Probabilistic hierarchical clustering for biological data. In RECOMB, 2002.