
Computing Gaussian Mixture Models with EM Using Equivalence Constraints


Source: pdf

Authors: Noam Shental, Aharon Bar-Hillel, Tomer Hertz, Daphna Weinshall

Abstract: Density estimation with Gaussian Mixture Models is a popular generative technique that is also used for clustering. We develop a framework for incorporating side information, in the form of equivalence constraints, into the model estimation procedure. Equivalence constraints are defined on pairs of data points and indicate whether the points arise from the same source (positive constraints) or from different sources (negative constraints). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. For the estimation of model parameters we present a closed-form EM procedure that handles positive constraints, and a Generalized EM procedure using a Markov network that handles negative constraints. Using publicly available data sets, we demonstrate that such side information can lead to considerable improvement in clustering tasks, and that our algorithm is preferable to two other suggested methods that use the same type of side information.
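
In the positive-constraint case the estimation stays in closed form: taking the transitive closure of the positive constraints yields groups of points ("chunklets") that must share one hidden source label, so the E-step computes a single posterior per chunklet instead of per point, and the M-step lets every point inherit its chunklet's responsibility. The Python sketch below is a minimal illustration of this idea under our own assumptions (the function name em_gmm_positive_constraints, the initialization, and the covariance regularization are ours); it is not the authors' implementation, and it omits the Generalized EM treatment of negative constraints.

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm_positive_constraints(X, chunklets, K, n_iter=50, seed=0):
    """EM for a GMM in which all points of a chunklet (transitive
    closure of positive constraints) come from the same source.

    X         : (n, d) data matrix
    chunklets : list of index lists; unconstrained points appear
                as singleton chunklets
    K         : number of Gaussian components
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                       # mixing weights
    mu = X[rng.choice(n, K, replace=False)]        # means from random points
    cov = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])

    for _ in range(n_iter):
        # E-step: one posterior per chunklet; in log space the
        # component log-likelihoods of its points simply add up.
        log_r = np.zeros((len(chunklets), K))
        for j, idx in enumerate(chunklets):
            for k in range(K):
                log_r[j, k] = np.log(pi[k]) + multivariate_normal.logpdf(
                    X[idx], mu[k], cov[k]).sum()
        log_r -= log_r.max(axis=1, keepdims=True)  # stabilize the exp
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: each point inherits its chunklet's responsibility.
        r_point = np.zeros((n, K))
        for j, idx in enumerate(chunklets):
            r_point[idx] = r[j]
        Nk = r_point.sum(axis=0)
        pi = Nk / n
        mu = (r_point.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            cov[k] = (r_point[:, k, None] * diff).T @ diff / Nk[k]
            cov[k] += 1e-6 * np.eye(d)             # keep covariances stable
    return pi, mu, cov

For example, chunklets = [[0, 1], [2], [3, 4]] forces points 0 and 1 (and likewise 3 and 4) to share a source. Negative constraints break this factorization: the constrained posterior no longer decomposes over chunklets, which is why the paper turns to a Generalized EM step with inference over a Markov network.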


References

[1] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.

[2] A. Georghiades, P. N. Belhumeur, and D. J. Kriegman. From few to many: Generative models for recognition under variable pose and illumination. In IEEE International Conference on Automatic Face and Gesture Recognition, pages 277–284, 2000.

[3] D. Klein, S. D. Kamvar, and C. D. Manning. From instance-level constraints to space-level constraints: Making the most of prior knowledge in data clustering. In ICML, 2002.

[4] D. Miller and S. Uyar. A mixture of experts classifier with learning based on both labelled and unlabelled data. In M. C. Mozer, M. I. Jordan, and T. Petsche, editors, NIPS 9, pages 571–578. MIT Press, 1997.

[5] K. Nigam, A. K. McCallum, S. Thrun, and T. M. Mitchell. Learning to classify text from labeled and unlabeled documents. In Proceedings of AAAI-98, pages 792–799, Madison, WI, 1998. AAAI Press, Menlo Park, CA.

[6] P. J. Phillips. Support vector machines applied to face recognition. In M. J. Kearns, S. A. Solla, and D. A. Cohn, editors, NIPS 11, page 803ff. MIT Press, 1998.

[7] N. Shental, T. Hertz, D. Weinshall, and M. Pavel. Adjustment learning and relevant component analysis. In A. Heyden, G. Sparr, M. Nielsen, and P. Johansen, editors, Computer Vision – ECCV 2002, volume 4, page 776ff. Springer, 2002.

[8] M. Szummer and T. Jaakkola. Partially labeled classification with Markov random walks. In NIPS, volume 14. MIT Press, 2001.

[9] K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl. Constrained K-means clustering with background knowledge. In Proc. 18th International Conf. on Machine Learning, pages 577–584. Morgan Kaufmann, San Francisco, CA, 2001.

[10] E. P. Xing, A. Y. Ng, M. I. Jordan, and S. Russell. Distance metric learning with application to clustering with side-information. In Advances in Neural Information Processing Systems, volume 15. MIT Press, 2002.