acl acl2010 acl2010-248 acl2010-248-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hoifung Poon ; Pedro Domingos
Abstract: Extracting knowledge from unstructured text is a long-standing goal of NLP. Although learning approaches to many of its subtasks have been developed (e.g., parsing, taxonomy induction, information extraction), all end-to-end solutions to date require heavy supervision and/or manual engineering, limiting their scope and scalability. We present OntoUSP, a system that induces and populates a probabilistic ontology using only dependency-parsed text as input. OntoUSP builds on the USP unsupervised semantic parser by jointly forming ISA and IS-PART hierarchies of lambda-form clusters. The ISA hierarchy allows more general knowledge to be learned and enables the use of smoothing for parameter estimation. We evaluate OntoUSP by using it to extract a knowledge base from biomedical abstracts and answer questions. OntoUSP improves on the recall of USP by 47% and greatly outperforms previous state-of-the-art approaches.
References:
Hiyan Alshawi. 1990. Resolving quasi logical forms. Computational Linguistics, 16:133–144.
G. Bakir, T. Hofmann, B. Schölkopf, A. Smola, B. Taskar, and S. Vishwanathan (eds.). 2007. Predicting Structured Data. MIT Press, Cambridge, MA.
Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, pages 2670–2676, Hyderabad, India. AAAI Press.
Philipp Cimiano. 2006. Ontology learning and population from text. Springer.
Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, pages 449–454, Genoa, Italy. ELRA.
Pedro Domingos and Daniel Lowd. 2009. Markov Logic: An Interface Layer for Artificial Intelligence. Morgan & Claypool, San Rafael, CA.
Miroslav Dudik, David Blei, and Robert Schapire. 2007. Hierarchical maximum entropy density estimation. In Proceedings of the Twenty Fourth International Conference on Machine Learning.
Christiane Fellbaum, editor. 1998. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA.
Andrew Gelman and Jennifer Hill. 2006. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Lise Getoor and Ben Taskar, editors. 2007. Introduction to Statistical Relational Learning. MIT Press, Cambridge, MA.
Marti Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th International Conference on Computational Linguistics.
Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, and Jun'ichi Tsujii. 2003. GENIA corpus - a semantically annotated corpus for bio-textmining. Bioinformatics, 19:180–82.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the Forty First Annual Meeting of the Association for Computational Linguistics, pages 423–430.
Dekang Lin and Patrick Pantel. 2001. DIRT - discovery of inference rules from text. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 323–328, San Francisco, CA. ACM Press.
Alexander Maedche. 2002. Ontology learning for the semantic Web. Kluwer Academic Publishers, Boston, Massachusetts.
Andrew McCallum, Ronald Rosenfeld, Tom Mitchell, and Andrew Ng. 1998. Improving text classification by shrinkage in a hierarchy of classes. In Proceedings of the Fifteenth International Conference on Machine Learning.
Hoifung Poon and Pedro Domingos. 2008. Joint unsupervised coreference resolution with Markov logic. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 649–658, Honolulu, HI. ACL.
Hoifung Poon and Pedro Domingos. 2009. Unsupervised semantic parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1–10, Singapore. ACL.
Rion Snow, Daniel Jurafsky, and Andrew Ng. 2006. Semantic taxonomy induction from heterogenous evidence. In Proceedings of COLING/ACL 2006.
S. Staab and R. Studer. 2004. Handbook on ontologies. Springer.
Fabian Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2008. Yago - a large ontology from Wikipedia and WordNet. Journal of Web Semantics.
Fabian Suchanek, Mauro Sozio, and Gerhard Weikum. 2009. Sofie: A self-organizing framework for information extraction. In Proceedings of the Eighteenth International Conference on World Wide Web.
Jun-ichi Tsujii. 2004. Thesaurus or logical ontology, which do we need for mining text? In Proceedings of the Language Resources and Evaluation Conference.
Fei Wu and Daniel S. Weld. 2008. Automatically refining the Wikipedia infobox ontology. In Proceedings of the Seventeenth International Conference on World Wide Web, Beijing, China.
Alexander Yates and Oren Etzioni. 2009. Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research, 34:255–296.