
19 emnlp-2012-An Entity-Topic Model for Entity Linking


Source: pdf

Author: Xianpei Han ; Le Sun

Abstract: Entity Linking (EL) has received considerable attention in recent years. Given many name mentions in a document, the goal of EL is to predict their referent entities in a knowledge base. Traditionally, there have been two distinct directions of EL research: one focusing on the effects of a mention’s context compatibility, assuming that “the referent entity of a mention is reflected by its context”; the other dealing with the effects of a document’s topic coherence, assuming that “a mention’s referent entity should be coherent with the document’s main topics”. In this paper, we propose a generative model, called the entity-topic model, to effectively join these two complementary directions. By jointly modeling and exploiting the context compatibility, the topic coherence and the correlation between them, our model can accurately link all mentions in a document using both the local information (including the words and the mentions in a document) and the global knowledge (including the topic knowledge, the entity context knowledge and the entity name knowledge). Experimental results demonstrate the effectiveness of the proposed model.


reference text

Adafre, S. F. & de Rijke, M. 2005. Discovering missing links in Wikipedia. In: Proceedings of the 3rd international workshop on Link discovery.
Bhattacharya, I. and L. Getoor. 2006. A latent dirichlet model for unsupervised entity resolution. In: Proceedings of SIAM International Conference on Data Mining.
Blei, D. M. and A. Y. Ng, et al. 2003. Latent dirichlet allocation. In: The Journal of Machine Learning Research 3: 993-1022.
Bunescu, R. & Pasca, M. 2006. Using encyclopedic knowledge for named entity disambiguation. In: Proceedings of EACL, vol. 6.
Brown, P., Pietra, S. D., Pietra, V. D., and Mercer, R. 1993. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2), 263-311.
Chen, S. F. & Goodman, J. 1999. An empirical study of smoothing techniques for language modeling. In Computer Speech and Language, London; Orlando: Academic Press, c1986-, pp. 359-394.
Cucerzan, S. 2007. Large-scale named entity disambiguation based on Wikipedia data. In: Proceedings of EMNLP-CoNLL, pp. 708-716.
De Beaugrande, R. A. and W. U. Dressler. 1981. Introduction to text linguistics, Chapter V, Longman London.
Dredze, M., McNamee, P., Rao, D., Gerber, A. & Finin, T. 2010. Entity Disambiguation for Knowledge Base Population. In: Proceedings of the 23rd International Conference on Computational Linguistics.
Fader, A., Soderland, S., Etzioni, O. & Center, T. 2009. Scaling Wikipedia-based named entity disambiguation to arbitrary web text. In: Proceedings of Wiki-AI Workshop at IJCAI, vol. 9.
Gottipati, S., Jiang, J. 2011. Linking Entities to a Knowledge Base with Query Expansion. In: Proceedings of EMNLP.
Griffiths, T. L. and M. Steyvers. 2004. Finding scientific topics. In: Proceedings of the National Academy of Sciences of the United States of America.
Han, X., Sun, L. and Zhao J. 2011. Collective Entity Linking in Web Text: A Graph-Based Method. In: Proceedings of 34th Annual ACM SIGIR Conference.
Han, X. and Sun, L. 2011. A Generative Entity-Mention Model for Linking Entities with Knowledge Base. In: Proceedings of ACL-HLT.
Hoffart, J., Yosef, M. A., et al. 2011. Robust Disambiguation of Named Entities in Text. In: Proceedings of EMNLP.
Jelinek, Frederick and Robert L. Mercer. 1980. Interpolated estimation of Markov source parameters from sparse data. In: Proceedings of the Workshop on Pattern Recognition in Practice.
Kataria, S. S., Kumar, K. S. and Rastogi, R. 2011. Entity Disambiguation with Hierarchical Topic Models. In: Proceedings of KDD.
Kulkarni, S., Singh, A., Ramakrishnan, G. & Chakrabarti, S. 2009. Collective annotation of Wikipedia entities in web text. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 457-466.
Li, X., Morie, P. & Roth, D. 2004. Identification and tracing of ambiguous names: Discriminative and generative approaches. In: Proceedings of the National Conference on Artificial Intelligence, pp. 419-424.
McNamee, P. & Dang, H. T. 2009. Overview of the TAC 2009 Knowledge Base Population Track. In: Proceedings of Text Analysis Conference.
Ji, H., et al. 2010. Overview of the TAC 2010 knowledge base population track. In: Proceedings of Text Analysis Conference.
Ji, H. and Chen, Z. 2011. Collaborative Ranking: A Case Study on Entity Linking. In: Proceedings of EMNLP.
Milne, D. & Witten, I. H. 2008. Learning to link with Wikipedia. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management.
Milne, D., et al. 2006. Mining Domain-Specific Thesauri from Wikipedia: A case study. In Proc. of IEEE/WIC/ACM WI.
Medelyan, O., Witten, I. H. & Milne, D. 2008. Topic indexing with Wikipedia. In: Proceedings of the AAAI WikiAI workshop.
Mihalcea, R. & Csomai, A. 2007. Wikify!: linking documents to encyclopedic knowledge. In: Proceedings of the sixteenth ACM Conference on Information and Knowledge Management, pp. 233-242.
Pedersen, T., Purandare, A. & Kulkarni, A. 2005. Name discrimination by clustering similar contexts. Computational Linguistics and Intelligent Text Processing, pp. 226-237.
Ratinov, L. and D. Roth, et al. 2011. Local and Global Algorithms for Disambiguation to Wikipedia. In: Proceedings of ACL.
Sen, P. 2012. Collective context-aware topic models for entity disambiguation. In Proceedings of WWW '12, New York, NY, USA, ACM.
Zhang, W., Su, J., Tan, Chew Lim & Wang, W. T. 2010. Entity Linking Leveraging Automatically Generated Annotation. In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010).
Zheng, Z., Li, F., Huang, M. & Zhu, X. 2010. Learning to Link Entities with Knowledge Base. In: The Proceedings of the Annual Conference of the North American Chapter of the ACL.
Zhou, Y., Nie, L., Rouhani-Kalleh, O., Vasile, F. & Gaffney, S. 2010. Resolving Surface Forms to Wikipedia Topics. In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pp. 1335-1343.
Zhang, W. and Sim, Y. C., et al. 2011. Entity Linking with Effective Acronym Expansion, Instance Selection and Topic Modeling. In: Proceedings of IJCAI.