emnlp emnlp2010 emnlp2010-79 emnlp2010-79-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Gae-won You ; Seung-won Hwang ; Young-In Song ; Long Jiang ; Zaiqing Nie
Abstract: This paper studies the problem of mining entity translation, specifically, mining English and Chinese name pairs. Existing efforts can be categorized into (a) a transliteration-based approach leveraging phonetic similarity and (b) a corpus-based approach exploiting bilingual co-occurrences, each of which suffers from inaccuracy and scarcity respectively. In clear contrast, we use unleveraged resources of monolingual entity co-occurrences, crawled from entity search engines, represented as two entity-relationship graphs extracted from two language corpora respectively. Our problem is then abstracted as finding correct mappings across two graphs. To achieve this goal, we propose a holistic approach, of exploiting both transliteration similarity and monolingual co-occurrences. This approach, building upon monolingual corpora, complements existing corpus-based work, requiring scarce resources of parallel or comparable corpus, while significantly boosting the accuracy of transliteration-based work. We validate our proposed system using real-life datasets.
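The abstract frames the task as mapping nodes across two monolingual co-occurrence graphs while also using transliteration similarity. The sketch below is one possible way to combine the two signals, not the authors' algorithm: it assumes a simple similarity-propagation update followed by greedy one-to-one matching. The function names, the mixing weight `alpha`, the placeholder entities (`e1`..`e4`, `c1`..`c4`), and all scores are hypothetical toy values chosen for illustration.

```python
from itertools import product


def normalize(graph):
    """Scale each node's co-occurrence counts so its neighbor weights sum to 1."""
    out = {}
    for node, nbrs in graph.items():
        total = sum(nbrs.values())
        out[node] = {n: w / total for n, w in nbrs.items()} if total else {}
    return out


def propagate(g_en, g_zh, translit, alpha=0.5, iters=20):
    """Mix transliteration scores with agreement between graph neighborhoods.

    Assumed update (illustrative only):
        score(e, c) = alpha * weighted mean of score(e', c') over neighbor pairs
                      + (1 - alpha) * translit(e, c)
    """
    en, zh = normalize(g_en), normalize(g_zh)
    score = dict(translit)
    for _ in range(iters):
        new = {}
        for e, c in product(en, zh):
            context = sum(
                we * wc * score[(e2, c2)]
                for e2, we in en[e].items()
                for c2, wc in zh[c].items()
            )
            new[(e, c)] = alpha * context + (1 - alpha) * translit[(e, c)]
        score = new
    return score


def greedy_match(score):
    """Pick the highest-scoring pairs, using each entity at most once."""
    pairs, used_e, used_c = [], set(), set()
    for (e, c), _ in sorted(score.items(), key=lambda kv: -kv[1]):
        if e not in used_e and c not in used_c:
            pairs.append((e, c))
            used_e.add(e)
            used_c.add(c)
    return pairs


# Toy monolingual graphs (hypothetical): e1-e2 and e3-e4 co-occur,
# and likewise c1-c2 and c3-c4.
g_en = {"e1": {"e2": 1}, "e2": {"e1": 1}, "e3": {"e4": 1}, "e4": {"e3": 1}}
g_zh = {"c1": {"c2": 1}, "c2": {"c1": 1}, "c3": {"c4": 1}, "c4": {"c3": 1}}

# Toy transliteration scores (hypothetical): two confident anchors, plus
# two pairs where phonetic similarity alone prefers the wrong candidate.
translit = {(e, c): 0.1 for e in g_en for c in g_zh}
translit.update({
    ("e1", "c1"): 0.9, ("e4", "c4"): 0.9,
    ("e2", "c3"): 0.5, ("e2", "c2"): 0.4,
    ("e3", "c2"): 0.5, ("e3", "c3"): 0.4,
})

baseline = greedy_match(translit)      # transliteration only
score = propagate(g_en, g_zh, translit)
pairs = greedy_match(score)            # transliteration + co-occurrence
print("baseline:", baseline)
print("combined:", pairs)
```

In this toy setup the transliteration-only baseline pairs `e2` with `c3`, while the combined score pairs `e2` with `c2`, because `e2` and `c2` both co-occur with the confidently matched anchors `e1` and `c1`.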
Yaser Al-Onaizan and Kevin Knight. 2002. Translating Named Entities Using Monolingual and Bilingual Resources. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL’02), pages 400–408. Association for Computational Linguistics.
Donghui Feng, Yajuan Lü, and Ming Zhou. 2004. A New Approach for English-Chinese Named Entity Alignment. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’04), pages 372–379. Association for Computational Linguistics.
Pascale Fung and Lo Yuen Yee. 1998. An IR Approach for Translating New Words from Nonparallel, Comparable Texts. In Proceedings of the 17th International Conference on Computational Linguistics (COLING’98), pages 414–420. Association for Computational Linguistics.
Long Jiang, Ming Zhou, Lee-Feng Chien, and Cheng Niu. 2007. Named Entity Translation with Web Mining and Transliteration. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI’07), pages 1629–1634. Morgan Kaufmann Publishers Inc.
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, and Qingsheng Zhu. 2009. Mining Bilingual Data from the Web with Adaptively Learnt Patterns. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics (ACL’09), pages 870–878. Association for Computational Linguistics.
Kevin Knight and Jonathan Graehl. 1998. Machine Transliteration. Computational Linguistics, 24(4):599–612.
Julian Kupiec. 1993. An Algorithm for Finding Noun Phrase Correspondences in Bilingual Corpora. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (ACL’93), pages 17–22. Association for Computational Linguistics.
Haizhou Li, Zhang Min, and Su Jian. 2004. A Joint Source-Channel Model for Machine Transliteration. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics (ACL’04), pages 159–166. Association for Computational Linguistics.
Dekang Lin, Shaojun Zhao, Benjamin Van Durme, and Marius Pasca. 2008. Mining Parenthetical Translations from the Web by Word Alignment. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL’08), pages 994–1002. Association for Computational Linguistics.
Li Shao and Hwee Tou Ng. 2004. Mining New Word Translations from Comparable Corpora. In Proceedings of the 20th International Conference on Computational Linguistics (COLING’04), pages 618–624. Association for Computational Linguistics.
Ellen M. Voorhees. 2001. The TREC Question Answering Track. Natural Language Engineering, 7(4):361–378.
Stephen Wan and Cornelia Maria Verspoor. 1998. Automatic English-Chinese Name Transliteration for Development of Multilingual Resources. In Proceedings of the 17th International Conference on Computational Linguistics (COLING’98), pages 1352–1356. Association for Computational Linguistics.
Douglas Brent West. 2000. Introduction to Graph Theory. Prentice Hall, second edition.