
10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

Source: pdf

Author: Alan Ritter ; Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional class-based approaches, it produces human-interpretable classes describing each relation’s preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-of-the-art methods, achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP’s effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al.’s system (Pantel et al., 2007).
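The headline number in the abstract is stated as recall at 0.9 precision. As a rough illustration of what that metric measures, the sketch below takes a list of model scores with gold labels and returns the highest recall reachable at any score threshold whose precision is at least 0.9. The function name `recall_at_precision`, its parameters, and the toy data are hypothetical and do not come from the paper; the authors' exact evaluation protocol may differ.

```python
from typing import Sequence


def recall_at_precision(scores: Sequence[float],
                        labels: Sequence[bool],
                        min_precision: float = 0.9) -> float:
    """Best recall over all score thresholds with precision >= min_precision.

    Hypothetical helper for illustration only, not the paper's evaluation code.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best_recall = 0.0
    true_pos = 0
    for i, (score, is_pos) in enumerate(ranked):
        true_pos += is_pos
        # Tied scores cannot be separated by a threshold, so only
        # evaluate once the whole run of ties has been accepted.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = true_pos / (i + 1)
        if precision >= min_precision:
            best_recall = max(best_recall, true_pos / total_pos)
    return best_recall


# Toy data (made up): five candidate arguments, three of them correct.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [True, True, False, True, False]
print(recall_at_precision(toy_scores, toy_labels))
```

On the toy data, only the top two items can be accepted before precision drops below 0.9, so the result is two of the three correct items.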

reference text

Michele Banko and Oren Etzioni. 2008. The tradeoffs between open and traditional relation extraction. In ACL-08: HLT.
Shane Bergsma, Dekang Lin, and Randy Goebel. 2008. Discriminative learning of selectional preference from unlabeled text. In EMNLP.
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. J. Mach. Learn. Res.
Samuel Brody and Mirella Lapata. 2009. Bayesian word sense induction. In EACL, pages 103–111, Morristown, NJ, USA. Association for Computational Linguistics.
Andrew Carlson, Justin Betteridge, Richard C. Wang, Estevam R. Hruschka Jr., and Tom M. Mitchell. 2010. Coupled semi-supervised learning for information extraction. In WSDM 2010.
Harr Chen, S. R. K. Branavan, Regina Barzilay, and David R. Karger. 2009. Global models of document structure using latent permutations. In NAACL.
Stephen Clark and David Weir. 2002. Class-based probability estimation using a semantic hierarchy. Comput. Linguist.
Ido Dagan, Lillian Lee, and Fernando C. N. Pereira. 1999. Similarity-based models of word cooccurrence probabilities. In Machine Learning.
Hal Daumé III and Daniel Marcu. 2006. Bayesian query-focused summarization. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics.
Hal Daumé III. 2007. hbc: Hierarchical bayes compiler. http://hal3.name/hbc.
Katrin Erk. 2007. A simple, similarity-based model for selectional preferences. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics.
Elena Erosheva, Stephen Fienberg, and John Lafferty. 2004. Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences of the United States of America.
Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alex Yates. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence.
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Comput. Linguist.
T. L. Griffiths and M. Steyvers. 2004. Finding scientific topics. Proc Natl Acad Sci U S A.
Frank Keller and Mirella Lapata. 2003. Using the web to obtain frequencies for unseen bigrams. Comput. Linguist.
Zornitsa Kozareva, Ellen Riloff, and Eduard Hovy. 2008. Semantic class learning from the web with hyponym pattern linkage graphs. In ACL-08: HLT.
Hang Li and Naoki Abe. 1998. Generalizing case frames using a thesaurus and the mdl principle. Comput. Linguist.
Dekang Lin and Patrick Pantel. 2001. Dirt-discovery of inference rules from text. In KDD.
Dekang Lin. 1998. Dependency-based evaluation of minipar. In Proc. Workshop on the Evaluation of Parsing Systems.
Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai. 2007. Automatic labeling of multinomial topic models. In KDD.
David Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, and Andrew McCallum. 2009. Polylingual topic models. In EMNLP.
David Newman, Arthur Asuncion, Padhraic Smyth, and Max Welling. 2009. Distributed algorithms for topic models. JMLR.
David Newman, Jey Han Lau, Karl Grieser, and Timothy Baldwin. 2010. Automatic evaluation of topic coherence. In NAACL-HLT.
Diarmuid Ó Séaghdha. 2010. Latent variable models of selectional preference. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics.
Patrick Pantel, Rahul Bhagat, Bonaventura Coppola, Timothy Chklovski, and Eduard H. Hovy. 2007. Isp: Learning inferential selectional preferences. In HLT-NAACL.
Patrick Andre Pantel. 2003. Clustering by committee. Ph.D. thesis, University of Alberta, Edmonton, Alta., Canada.
Joseph Reisinger and Marius Pasca. 2009. Latent variable models of concept-attribute attachment. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.
P. Resnik. 1996. Selectional constraints: an information-theoretic model and its computational realization. Cognition.
Philip Resnik. 1997. Selectional preference and sense disambiguation. In Proc. of the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How?
Mats Rooth, Stefan Riezler, Detlef Prescher, Glenn Carroll, and Franz Beil. 1999. Inducing a semantically annotated lexicon via em-based clustering. In Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics.
Lenhart Schubert and Matthew Tong. 2003. Extracting and evaluating general world knowledge from the brown corpus. In Proc. of the HLT-NAACL Workshop on Text Meaning, pages 7–13.
Benjamin Van Durme and Daniel Gildea. 2009. Topic models for corpus-centric knowledge generalization. In Technical Report TR-946, Department of Computer Science, University of Rochester, Rochester.
Tae Yano, William W. Cohen, and Noah A. Smith. 2009. Predicting response to political blog posts with topic models. In NAACL.
L. Yao, D. Mimno, and A. McCallum. 2009. Efficient methods for topic model inference on streaming document collections. In KDD.