emnlp emnlp2013 emnlp2013-160 emnlp2013-160-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xiao Cheng ; Dan Roth
Abstract: Wikification, commonly referred to as Disambiguation to Wikipedia (D2W), is the task of identifying concepts and entities in text and disambiguating them into the most specific corresponding Wikipedia pages. Previous approaches to D2W focused on the use of local and global statistics over the given text, Wikipedia articles, and Wikipedia's link structure to evaluate context compatibility among a list of probable candidates. However, these methods fail (often embarrassingly) when some level of text understanding is needed to support Wikification. In this paper we introduce a novel approach to Wikification by incorporating, along with statistical methods, richer relational analysis of the text. We provide an extensible, efficient and modular Integer Linear Programming (ILP) formulation of Wikification that incorporates the entity-relation inference problem, and show that the ability to identify relations in text considerably helps both candidate generation and the ranking of Wikipedia titles. Our results show significant improvements in both Wikification and the TAC Entity Linking task.
R. Bunescu and M. Pasca. 2006. Using encyclopedic knowledge for named entity disambiguation. In Proceedings of the European Chapter of the ACL (EACL).
Y. Chan and D. Roth. 2011. Exploiting syntactico-semantic structures for relation extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Portland, Oregon.
M. Chang, L. Ratinov, and D. Roth. 2012. Structured learning with constrained conditional models. Machine Learning, 88(3):399–431, 6.
Silviu Cucerzan. 2007. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of the 2007 Joint Conference of EMNLP-CoNLL, pages 708–716.
Silviu Cucerzan. 2011. TAC entity linking by performing full-document entity extraction and disambiguation. In Proceedings of the Text Analysis Conference.
Q. Do, D. Roth, M. Sammons, Y. Tu, and V. Vydiswaran. 2009. Robust, light-weight approaches to compute lexical similarity. Technical report, Computer Science Department, University of Illinois.
Joe Ellis, Xuansong Li, Kira Griffitt, Stephanie M. Strassel, and Jonathan Wright. 2011. Linguistic resources for 2012 knowledge base population evaluations.
Paolo Ferragina and Ugo Scaiella. 2010. Tagme: on-the-fly annotation of short text fragments (by Wikipedia entities). In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM '10, pages 1625–1628, New York, NY, USA. ACM.
Gurobi Optimization, Inc. 2013. Gurobi optimizer reference manual.
Heng Ji, Ralph Grishman, Hoa Trang Dang, Kira Griffitt, and Joe Ellis. 2010. Overview of the TAC 2010 knowledge base population track. In Third Text Analysis Conference (TAC 2010).
Heng Ji, Ralph Grishman, and Hoa Trang Dang. 2011. Overview of the TAC 2011 knowledge base population track. In Fourth Text Analysis Conference (TAC 2011).
R. Mihalcea and A. Csomai. 2007. Wikify!: linking documents to encyclopedic knowledge. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pages 233–242.
D. Milne and I. H. Witten. 2008. Learning to link with Wikipedia. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pages 509–518.
Sean Monahan, John Lehmann, Timothy Nyberg, Jesse Plymale, and Arnold Jung. 2011. Cross-lingual cross-document coreference with entity linking. In Proceedings of the Text Analysis Conference.
L. Ratinov and D. Roth. 2011. Glow TAC-KBP 2011 entity linking system. In TAC. Text Analysis Conference, 11.
L. Ratinov and D. Roth. 2012. Learning-based multi-sieve co-reference resolution with knowledge. In EMNLP.
L. Ratinov, D. Roth, D. Downey, and M. Anderson. 2011. Local and global algorithms for disambiguation to Wikipedia. In ACL.
D. Roth and W. Yih. 2004. A linear programming formulation for global inference in natural language tasks. In Hwee Tou Ng and Ellen Riloff, editors, Proceedings of the Annual Conference on Computational Natural Language Learning (CoNLL), pages 1–8. Association for Computational Linguistics.