acl acl2011 acl2011-128 acl2011-128-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Danuta Ploch
Abstract: Named entity disambiguation is the task of linking an entity mention in a text to the correct real-world referent predefined in a knowledge base, and is a crucial subtask in many areas like information retrieval or topic detection and tracking. Named entity disambiguation is challenging because entity mentions can be ambiguous and an entity can be referenced by different surface forms. We present an approach that exploits Wikipedia relations between entities co-occurring with the ambiguous form to derive a range of novel features for classifying candidate referents. We find that our features improve disambiguation results significantly over a strong popularity baseline, and are especially suitable for recognizing entities not contained in the knowledge base. Our system achieves state-of-the-art results on the TAC-KBP 2009 dataset.
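As a rough illustration of the ideas named in the abstract, the following minimal sketch contrasts a popularity baseline with a score based on Wikipedia links between a candidate referent and the entities co-occurring with the mention. Everything in it is an assumption made for illustration: the toy data, the function names, the single link-count feature, and the threshold used to return no referent (NIL). The paper itself derives a range of features and feeds them to a classifier; this sketch swaps that for a plain count and threshold.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy data (not from the paper): candidate referents per surface
# form, a popularity count per entity, and outgoing Wikipedia links per entity.
CANDIDATES: Dict[str, List[str]] = {
    "Jaguar": ["Jaguar_Cars", "Jaguar_(animal)"],
}
POPULARITY: Dict[str, int] = {"Jaguar_Cars": 900, "Jaguar_(animal)": 1200}
LINKS: Dict[str, Set[str]] = {
    "Jaguar_Cars": {"Ford_Motor_Company", "Coventry"},
    "Jaguar_(animal)": {"Amazon_rainforest", "Felidae"},
    "Ford_Motor_Company": {"Jaguar_Cars", "Detroit"},
}


def popularity_baseline(mention: str) -> Optional[str]:
    """Pick the most popular candidate, ignoring the mention's context."""
    candidates = CANDIDATES.get(mention, [])
    if not candidates:
        return None
    return max(candidates, key=lambda c: POPULARITY.get(c, 0))


def relation_score(candidate: str, context_entities: List[str]) -> int:
    """Count context entities that link to, or are linked from, the candidate."""
    outgoing = LINKS.get(candidate, set())
    return sum(
        1
        for entity in context_entities
        if entity in outgoing or candidate in LINKS.get(entity, set())
    )


def disambiguate(
    mention: str, context_entities: List[str], min_score: int = 1
) -> Optional[str]:
    """Rank candidates by relation score, breaking ties by popularity.

    Returns None (NIL) when no candidate reaches min_score, an assumed
    stand-in for deciding that the entity is not in the knowledge base.
    """
    candidates = CANDIDATES.get(mention, [])
    if not candidates:
        return None
    scored = [
        (relation_score(c, context_entities), POPULARITY.get(c, 0), c)
        for c in candidates
    ]
    best_score, _, best_candidate = max(scored)
    return best_candidate if best_score >= min_score else None


if __name__ == "__main__":
    print(popularity_baseline("Jaguar"))
    print(disambiguate("Jaguar", ["Ford_Motor_Company"]))
    print(disambiguate("Jaguar", ["Detroit"]))
```

In this toy setting the baseline always picks the more popular referent, while the relation score can override it when a co-occurring entity is linked to a less popular candidate, and returns no referent when the context offers no supporting link.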
Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. In WWW7: Proceedings of the seventh international conference on World Wide Web 7, pages 107–117, Amsterdam, The Netherlands. Elsevier Science Publishers B. V.
Razvan Bunescu and Marius Pasca. 2006. Using encyclopedic knowledge for named entity disambiguation. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL-06), pages 9–16, Trento, Italy.
Razvan Constantin Bunescu. 2007. Learning for Information Extraction: From Named Entity Recognition and Disambiguation To Relation Extraction. Ph.D. thesis, University of Texas at Austin, Department of Computer Sciences.
Silviu Cucerzan. 2007. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 708–716, Prague, Czech Republic. Association for Computational Linguistics.
Mark Dredze, Paul McNamee, Delip Rao, Adam Gerber, and Tim Finin. 2010. Entity disambiguation for knowledge base population. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 277–285, Beijing, China. Coling 2010 Organizing Committee.
Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005. Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 363–370, Ann Arbor, Michigan. Association for Computational Linguistics.
Xianpei Han and Jun Zhao. 2009. Named entity disambiguation by leveraging Wikipedia semantic knowledge. In Proceeding of the 18th ACM conference on Information and knowledge management, pages 215–224, Hong Kong, China. ACM.
Paul McNamee and Hoa Trang Dang. 2009. Overview of the TAC 2009 knowledge base population track. In Text Analysis Conference (TAC).
Paul McNamee, Hoa Trang Dang, Heather Simpson, Patrick Schone, and Stephanie M. Strassel. 2010. An evaluation of technologies for knowledge base population. In Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10). European Language Resources Association (ELRA).
Vladimir N. Vapnik. 1995. The nature of statistical learning theory. Springer-Verlag New York, Inc., New York, NY, USA.
Wei Zhang, Jian Su, Chew Lim Tan, and Wen Ting Wang. 2010. Entity linking leveraging automatically generated annotation. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 1290–1298, Beijing, China. Coling 2010 Organizing Committee.
Zhicheng Zheng, Fangtao Li, Minlie Huang, and Xiaoyan Zhu. 2010. Learning to link entities with knowledge base. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, pages 483–491, Stroudsburg, PA, USA. Association for Computational Linguistics.