
79 emnlp-2013-Exploiting Multiple Sources for Open-Domain Hypernym Discovery

Source: pdf

Author: Ruiji Fu ; Bing Qin ; Ting Liu

Abstract: Hypernym discovery aims to extract noun pairs in which one noun is a hypernym of the other. Most previous methods are based on lexical patterns but perform badly on open-domain data. Other work extracts hypernym relations from encyclopedias but has limited coverage. This paper proposes a simple yet effective distant supervision framework for Chinese open-domain hypernym discovery. Given an entity name, we try to discover its hypernyms by leveraging knowledge from multiple sources, i.e., search engine results, encyclopedias, and the morphology of the entity name. First, we extract candidate hypernyms from the above sources. Then, we apply a statistical ranking model to select the correct hypernyms. A set of novel features is proposed for the ranking model. We also present a heuristic strategy to build large-scale noisy training data for the model without human annotation. Experimental results demonstrate that our approach outperforms the state-of-the-art methods on a manually labeled test dataset.

reference text

Sharon A. Caraballo. 1999. Automatic construction of a hypernym-labeled noun hierarchy from text. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 120–126, College Park, Maryland, USA, June.

Wanxiang Che, Zhenghua Li, and Ting Liu. 2010. LTP: A Chinese language technology platform. In Coling 2010: Demonstrations, pages 13–16, Beijing, China, August.

Massimiliano Ciaramita and Mark Johnson. 2003. Supersense tagging of unknown nouns in WordNet. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pages 168–175.

Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S Weld, and Alexander Yates. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence, 165(1):91–134.

Oren Etzioni, Michele Banko, and Michael J Cafarella. 2006. Machine reading. In AAAI, volume 6, pages 1517–1519.

Richard Evans. 2004. A framework for named entity recognition in the open domain. Recent Advances in Natural Language Processing III: Selected Papers from RANLP 2003, 260:267–274.

Marti A Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th Conference on Computational Linguistics, Volume 2, pages 539–545.

Johannes Hoffart, Fabian M Suchanek, Klaus Berberich, and Gerhard Weikum. 2012. Yago2: A spatially and temporally enhanced knowledge base from Wikipedia. Artificial Intelligence, pages 1–63.

Thomas Lin, Mausam, and Oren Etzioni. 2012. No noun phrase left behind: Detecting and typing unlinkable entities. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 893–903, Jeju Island, Korea, July.

Paul McNamee, Rion Snow, Patrick Schone, and James Mayfield. 2008. Learning named entity hyponyms for question answering. In Proceedings of the Third International Joint Conference on Natural Language Processing, pages 799–804.

Marius Pasca. 2004. Acquisition of categorized named entities for web search. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, pages 137–145.

John Platt. 1999. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3):61–74.

Alan Ritter, Stephen Soderland, and Oren Etzioni. 2009. What is this, anyway: Automatic hypernym discovery. In Proceedings of the 2009 AAAI Spring Symposium on Learning by Reading and Learning to Read, pages 88–93.

Cederberg Scott and Widdows Dominic. 2003. Using LSA and noun coordination information to improve the precision and recall of automatic hyponymy extraction. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, Volume 4, pages 111–118.

Sidney Siegel and N John Castellan Jr. 1988. Nonparametric statistics for the behavioral sciences. McGraw-Hill, New York.

Rion Snow, Daniel Jurafsky, and Andrew Y. Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems 17, pages 1297–1304.

Fabian M Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2008. Yago: A large ontology from Wikipedia and WordNet. Web Semantics: Science, Services and Agents on the World Wide Web, 6(3):203–217.

Peter Turney, Michael L Littman, Jeffrey Bigham, and Victor Shnayder. 2003. Combining independent modules to solve multiple-choice synonym and analogy problems. In Proceedings of the International Conference RANLP-2003, pages 482–489.

Fan Zhang, Shuming Shi, Jing Liu, Shuqi Sun, and Chin-Yew Lin. 2011. Nonlinear evidence fusion and propagation for hyponymy relation mining. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 1159–1168, Portland, Oregon, USA, June.