acl acl2011 acl2011-170 acl2011-170-reference knowledge-graph by maker-knowledge-mining

170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization

Source: pdf

Author: Harr Chen ; Edward Benson ; Tahira Naseem ; Regina Barzilay

Abstract: We present a novel approach to discovering relations and their instantiations from a collection of documents in a single domain. Our approach learns relation types by exploiting meta-constraints that characterize the general qualities of a good relation in any domain. These constraints state that instances of a single relation should exhibit regularities at multiple levels of linguistic structure, including lexicography, syntax, and document-level context. We capture these regularities via the structure of our probabilistic model as well as a set of declaratively-specified constraints enforced during posterior inference. Across two domains our approach successfully recovers hidden relation structure, comparable to or outperforming previous state-of-the-art approaches. Furthermore, we find that a small , set of constraints is applicable across the domains, and that using domain-specific constraints can further improve performance. 1

reference text

Eugene Agichtein and Luis Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of DL. Michele Banko and Oren Etzioni. 2008. The tradeoffs between open and traditional relation extraction. In Proceedings of ACL. Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of IJCAI. Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of HLT/NAACL. Kedar Bellare and Andrew McCallum. 2009. Generalized expectation criteria for bootstrapping extractors using record-text alignment. In Proceedings of EMNLP. Jordan Boyd-Graber and David M. Blei. 2008. Syntactic topic models. In Advances in NIPS. Razvan C. Bunescu and Raymond J. Mooney. 2007. Learning to extract relations from the web using minimal supervision. In Proceedings of ACL. Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. 1995. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5): 1190–1208. Ming-Wei Chang, Lev Ratinov, and Dan Roth. 2007. Guiding semi-supervision with constraintdriven learning. In Proceedings of ACL. 2006. Modeling general and specific aspects of documents with a probabilistic topic model. In Advances in NIPS. Jinxiu Chen, Dong-Hong Ji, Chew Lim Tan, and ZhengYu Niu. 2005. Automatic relation extraction with model order selection and discriminative label identification. In Proceedings of IJCNLP. Harr Chen, S.R.K. Branavan, Regina Barzilay, and David R. Karger. 2009. Content modeling using latent permutations. Journal of Artificial Intelligence Research, 36: 129–163. Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The stanford typed dependencies representation. In Proceedings of the COLING Workshop on Cross-framework and Cross-domain Parser Evaluation. Jo˜ ao Gra ¸ca, Kuzman Ganchev, and Ben Taskar. 2007. Expectation maximization and posterior constraints. In Advances in NIPS. Takaaki Hasegawa, Satoshi Sekine, and Ralph Grishman. 2004. Discovering relations among named entities from large corpora. In Proceedings of ACL. 539 Richard Johansson and Pierre Nugues. 2007. Extended constituent-to-dependency conversion for english. In Proceedings of NODALIDA. Mark Johnson. 2007. Why doesn’t EM find good HMM POS-taggers? In Proceedings of EMNLP. Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of ACL. Mirella Lapata. 2006. Automatic evaluation of information ordering: Kendall’s tau. Computational Linguistics, 32(4):471–484. Dekang Lin and Patrick Pantel. 2001. DIRT - discovery of inference rules from text. In Proceedings of SIGKDD. Gideon S. Mann and Andrew McCallum. 2008. Generalized expectation criteria for semi-supervised learning of conditional random fields. In Proceedings of ACL. Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of ACL/IJCNLP. Hoifung Poon and Pedro Domingos. 2009. Unsupervised semantic parsing. In Proceedings of EMNLP. Ellen Riloff. 1996. Automatically generating extraction patterns from untagged texts. In Proceedings of AAAI. Benjamin Rosenfeld and Ronen Feldman. 2007. Clustering for unsupervised relation identification. In Proceedings of CIKM. Dan Roth and Wen-tau Yih. 2004. A linear programming formulation for global inference in natural language tasks. In Proceedings of CoNLL. Yusuke Shinyama and Satoshi Sekine. 2006. Preemptive information extraction using unrestricted relation discovery. In Proceedings of HLT/NAACL. Kiyoshi Sudo, Satoshi Sekine, and Ralph Grishman. 2003. An improved extraction pattern representation model for automatic IE pattern acquisition. In Proceedings of ACL. Roman Yangarber, Ralph Grishman, Pasi Tapanainen, and Silja Huttunen. 2000. Automatic acquisition of domain knowledge for information extraction. In Proceedings of COLING. Limin Yao, Sebastian Riedel, and Andrew McCallum. 2010. Cross-document relation extraction without labelled data. In Proceedings of EMNLP. Alexander Yates and Oren Etzioni. 2009. Unsupervised methods for determining object and relation synonyms on the web. Journal ofArtificial Intelligence Research, 34:255–296. Min Zhang, Jian Su, Danmei Wang, Guodong Zhou, and Chew Lim Tan. 2005. Discovering relations between named entities from a large raw corpus using tree similarity-based clustering. In Proceedings of IJCNLP. Jun Zhu, Zaiqing Nie, Xiaojing Liu, Bo Zhang, and Ji- 2009. StatSnowball: a statistical approach extracting entity relationships. In Proceedings of Rong Wen. to WWW. 540