emnlp emnlp2012 emnlp2012-93 emnlp2012-93-reference knowledge-graph by maker-knowledge-mining

93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction

Source: pdf

Author: Mihai Surdeanu ; Julie Tibshirani ; Ramesh Nallapati ; Christopher D. Manning

Abstract: Distant supervision for relation extraction (RE) gathering training data by aligning a database of facts with text – is an efficient approach to scale RE to thousands of different relations. However, this introduces a challenging learning scenario where the relation expressed by a pair of entities found in a sentence is unknown. For example, a sentence containing Balzac and France may express BornIn or Died, an unknown relation, or no relation at all. Because of this, traditional supervised learning, which assumes that each example is explicitly mapped to a label, is not appropriate. We propose a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables. Our model performs competitively on two difficult domains. –

reference text

Kedar Bellare and Andrew McCallum. 2007. Learning extractors from unlabeled text using relevant databases. In Proceedings of the Sixth International Workshop on Information Extraction on the Web. Carla Brodley and Mark Friedl. 1999. Identifying mislabeled training data. Journal of Artificial Intelligence Research (JAIR). Razvan Bunescu and Raymond Mooney. 2007. Learning to extract relations from the web using minimal supervision. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. Mark Craven and Johan Kumlien. 1999. Constructing biological knowledge bases by extracting information from text sources. In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology. Jenny Rose Finkel, Trond Grenager, and Christopher D. Manning. 2005. Incorporating non-local information into information extraction systems by gibbs sampling. In Proceedings of the 43nd Annual Meeting of the Association for Computational Linguistics. Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, and Daniel S. Weld. 2011. Knowledgebased weak supervision for information extraction of overlapping relations. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL). Heng Ji, Ralph Grishman, Hoa T. Dang, Kira Griffitt, and Joe Ellis. 2010. Overview of the TAC 2010 knowledge base population track. In Proceedings of the Text Analytics Conference. Heng Ji, Ralph Grishman, and Hoa T. Dang. 2011. Overview of the TAC 2011 knowledge base population track. In Proceedings of the Text Analytics Conference. Mike Mintz, Steven Bills, Rion Snow, and Daniel Juraf- sky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics. Truc Vien T. Nguyen and Alessandro Moschitti. 2011. End-to-end relation extraction using distant supervision from external semantic repositories. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Sebastian Riedel, Limin Yao, and Andrew McCallum. 2010. Modeling relations and their mentions without labeled text. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD ’10). 465 Ang Sun, Ralph Grishman, Wei Xu, and Bonan Min. 2011. New York University 2011 system for KBP slot filling. In Proceedings of the Text Analytics Conference. Mihai Surdeanu, Sonal Gupta, John Bauer, David McClosky, Angel X. Chang, Valentin I. Spitkovsky, and Christopher D. Manning. 2011a. Stanford’s distantlysupervised slot-filling system. In Proceedings of the Text Analytics Conference. Mihai Surdeanu, David McClosky, Mason R. Smith, An- drey Gusev, and Christopher D. Manning. 2011b. Customizing an information extraction system to a new domain. In Proceedings of the Workshop on Relational Models of Semantics, Portland, Oregon, June. Fei Wu and Dan Weld. 2007. Autonomously semantifying Wikipedia. In Proceedings of the International Conference on Information and Knowledge Management (CIKM). Z.H. Zhou and M.L. Zhang. 2007. Multi-instance multilabel learning with application to scene classification. In Advances in Neural Information Processing Systems (NIPS).