
64 jmlr-2006-Noisy-OR Component Analysis and its Application to Link Analysis


Source: pdf

Author: Tomáš Šingliar, Miloš Hauskrecht

Abstract: We develop a new component analysis framework, the Noisy-OR Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model that assumes the expression of observed high-dimensional binary data is driven by a small number of hidden binary sources combined via noisy-or units. The component analysis procedure is equivalent to learning the parameters of NOCA. Since the classical EM formulation of the NOCA learning problem is intractable, we develop its variational approximation. We test the NOCA framework on two problems: (1) a synthetic image-decomposition problem and (2) a co-citation data analysis problem for thousands of CiteSeer documents. We demonstrate good performance of the new model on both problems. In addition, we contrast the model with two mixture-based latent-factor models: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA). The differing assumptions underlying these models cause them to discover different types of structure in co-citation data, thus illustrating the benefit of NOCA in building our understanding of high-dimensional data sets.

Keywords: component analysis, vector quantization, variational learning, link analysis
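
The abstract states the generative assumption but not its functional form. For orientation, the standard noisy-or combination it refers to (as used in QMR-DT-style networks) sets P(x_j = 1 | s) = 1 - (1 - leak_j) * prod_i (1 - lam_ij)^{s_i}, where s_i are the hidden binary sources, lam_ij is the probability that an active source i alone turns on observation j, and leak_j is a background noise term. The sketch below samples from such a model; it is a minimal illustration under these assumptions, not the paper's implementation, and the names pi, lam, and leak are hypothetical placeholders rather than NOCA's notation.

    import numpy as np

    def sample_noisy_or(pi, lam, leak, n_samples, seed=None):
        # pi:   (K,)   prior activation probability of each hidden source
        # lam:  (K, D) lam[i, j] = P(observation j on | only source i active)
        # leak: (D,)   P(observation j on | no source active)
        rng = np.random.default_rng(seed)
        s = rng.random((n_samples, pi.shape[0])) < pi  # hidden binary sources
        # Noisy-or: P(x_j = 0 | s) = (1 - leak_j) * prod_i (1 - lam_ij)^{s_i}
        p_off = (1.0 - leak) * np.prod(np.where(s[:, :, None], 1.0 - lam, 1.0), axis=1)
        x = rng.random(p_off.shape) < (1.0 - p_off)    # observed binary vectors
        return s.astype(int), x.astype(int)

    # Example: two hypothetical sources driving three binary observations.
    pi = np.array([0.3, 0.2])
    lam = np.array([[0.9, 0.0, 0.8],
                    [0.0, 0.7, 0.6]])
    leak = np.array([0.01, 0.01, 0.01])
    s, x = sample_noisy_or(pi, lam, leak, n_samples=5, seed=0)

Learning runs this process in reverse: given only samples of x, the paper's variational EM procedure estimates the corresponding parameters and infers posteriors over the hidden sources, which is what makes the hidden sources usable as components.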


Reference text

Hagai Attias. Independent factor analysis. Neural Computation, 11(4):803–851, 1999. URL citeseer.nj.nec.com/attias99independent.html.
David Bartholomew and Martin Knott. Latent Variable Models and Factor Analysis, volume 7 of Kendall's Library of Statistics. Oxford University Press, 1999.
Christopher Bishop. Latent variable models. In Michael Jordan, editor, Learning in Graphical Models, pages 371–403. MIT Press, 1999a.
Christopher Bishop. Variational principal components. In Proceedings of the 9th International Conference on Artificial Neural Networks, volume 1, pages 509–514, 1999b.
David Blei, Andrew Ng, and Michael Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003. URL http://www.cs.berkeley.edu/~blei/papers/blei03a.ps.gz.
Wray Buntine. Variational extensions to EM and multinomial PCA. In Proceedings of the 13th European Conference on Machine Learning, 2002. URL citeseer.nj.nec.com/buntine02variational.html.
David Cohn and Huan Chang. Learning to probabilistically identify authoritative documents. In Proceedings of the 17th International Conference on Machine Learning, pages 167–174. Morgan Kaufmann, San Francisco, CA, 2000. URL citeseer.ist.psu.edu/cohn00learning.html.
David Cohn and Thomas Hofmann. The missing link - a probabilistic model of document content and hypertext connectivity. In Neural Information Processing Systems 13, 2001. URL citeseer.ist.psu.edu/cohn01missing.html.
Gregory Cooper and Edward Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347, 1992.
Thomas Cover and Joy Thomas. Elements of Information Theory. John Wiley & Sons, 1991.
Arthur Dempster, Nan Laird, and Donald Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.
Francisco Díez and Severino Galán. Efficient computation for the Noisy-Max. International Journal of Intelligent Systems, 2003.
Zoubin Ghahramani and Michael Jordan. Factorial hidden Markov models. In David Touretzky, Michael Mozer, and Michael Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8, pages 472–478. MIT Press, 1995. ISBN 0262201070. URL citeseer.nj.nec.com/article/ghahramani97factorial.html.
Zoubin Ghahramani and Michael Jordan. Factorial hidden Markov models. Machine Learning, 29:245–273, 1997.
David Heckerman. Causal independence for knowledge acquisition and inference. In Proceedings of the 9th Conference on Uncertainty in Artificial Intelligence (UAI'93), San Francisco, CA, 1993. Morgan Kaufmann Publishers.
Thomas Hofmann. Probabilistic latent semantic analysis. In Proceedings of Uncertainty in Artificial Intelligence, Stockholm, 1999a. URL citeseer.nj.nec.com/hofmann99probabilistic.html.
Thomas Hofmann. Probabilistic latent semantic indexing. In Proceedings of the 22nd Annual ACM Conference on Research and Development in Information Retrieval, pages 50–57, Berkeley, California, August 1999b. URL citeseer.nj.nec.com/article/hofmann99probabilistic.html.
Tommi Jaakkola and Michael Jordan. Variational probabilistic inference and the QMR-DT network. Journal of Artificial Intelligence Research, 10:291–322, 1999. URL citeseer.nj.nec.com/article/jaakkola99variational.html.
Tommi Jaakkola, Lawrence Saul, and Michael Jordan. Fast learning by bounding likelihoods in sigmoid type belief networks. In David S. Touretzky, Michael C. Mozer, and Michael E. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8, pages 528–534. MIT Press, 1996. URL citeseer.ist.psu.edu/jaakkola96fast.html.
Ian Jolliffe. Principal Component Analysis. Springer, 1986.
Michael Jordan, Zoubin Ghahramani, Tommi Jaakkola, and Lawrence Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183–233, 1999.
Michael Kearns and Yishay Mansour. Exact inference of hidden structure from sample data in noisy-or networks. In Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, pages 304–310, 1998. URL citeseer.nj.nec.com/383491.html.
Xinghua Lu, Miloš Hauskrecht, and Roger Day. Modeling cellular processes with variational Bayesian cooperative vector quantizer. In Proceedings of the Pacific Symposium on Biocomputing, 2004.
David MacKay. Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems, 6(3):469–505, 1995.
James Miskin. Ensemble Learning for Independent Component Analysis. PhD thesis, Selwyn College, University of Cambridge, 2000.
David Ross and Richard Zemel. Multiple cause vector quantization. In Advances in Neural Information Processing Systems 16, 2002. URL citeseer.ist.psu.edu/ross02multiple.html.
Lawrence Saul, Tommi Jaakkola, and Michael Jordan. Mean field theory for sigmoid belief networks. Journal of Artificial Intelligence Research, 4:61–76, 1996.
Andrew Schein, Lawrence Saul, and Lyle Ungar. A generalized linear model for principal component analysis of binary data. In Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics, 2003. URL citeseer.nj.nec.com/546431.html.
Gideon Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–464, 1978.
Michael Shwe, Blackford Middleton, David Heckerman, Max Henrion, Eric Horvitz, Harold Lehmann, and Gregory Cooper. Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base: Part I. The probabilistic model and inference algorithms. Methods of Information in Medicine, 30:241–255, 1991.
Tomáš Šingliar and Miloš Hauskrecht. Variational learning for noisy-or component analysis. In Proceedings of the SIAM International Conference on Data Mining, pages 370–379, 2005.
Michael Tipping and Christopher Bishop. Probabilistic principal component analysis. Technical Report NCRG/97/010, Neural Computing Research Group, Aston University, September 1997. URL citeseer.nj.nec.com/article/tipping97probabilistic.html.
Jiří Vomlel. Noisy-or classifier. In Proceedings of the 6th Workshop on Uncertainty Processing, pages 291–302, 2003. URL http://lisp.vse.cz/wupes2003/.