262 acl-2011-Relation Guided Bootstrapping of Semantic Lexicons

Source: pdf

Author: Tara McIntosh ; Lars Yencken ; James R. Curran ; Timothy Baldwin

Abstract: State-of-the-art bootstrapping systems rely on expert-crafted semantic constraints such as negative categories to reduce semantic drift. Unfortunately, their use introduces a substantial amount of supervised knowledge. We present the Relation Guided Bootstrapping (RGB) algorithm, which simultaneously extracts lexicons and open relationships to guide lexicon growth and reduce semantic drift. This removes the necessity for manually crafting category and relationship constraints, and manually generating negative categories.

reference text

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 2670–2676, Hyderabad, India.

Andrew Carlson, Justin Betteridge, Richard C. Wang, Estevam R. Hruschka, Jr., and Tom M. Mitchell. 2010. Coupled semi-supervised learning for information extraction. In Proceedings of the Third ACM International Conference on Web Search and Data Mining, pages 101–110, New York, USA.

Janara Christensen, Mausam, Stephen Soderland, and Oren Etzioni. 2010. Semantic role labeling for open information extraction. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, pages 52–60, Los Angeles, California, USA, June.

Stephen Clark and James R. Curran. 2007. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4):493–552.

James R. Curran, Tara Murphy, and Bernhard Scholz. 2007. Minimising semantic drift with mutual exclusion bootstrapping. In Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics, pages 172–180, Melbourne, Australia.

Mark A. Greenwood, Mark Stevenson, Yikun Guo, Henk Harkema, and Angus Roberts. 2005. Automatically acquiring a linguistically motivated genic interaction extraction system. In Proceedings of the 4th Learning Language in Logic Workshop, pages 46–52, Bonn, Germany.

Claire Grover, Michael Matthews, and Richard Tobin. 2006. Tools to address the interdependence between tokenisation and standoff annotation. In Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing, pages 19–26, Trento, Italy.

Tara McIntosh and James R. Curran. 2008. Weighted mutual exclusion bootstrapping for domain independent lexicon and template acquisition. In Proceedings of the Australasian Language Technology Association Workshop, pages 97–105, Hobart, Australia.

Tara McIntosh and James R. Curran. 2009. Reducing semantic drift with bagging and distributional similarity. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pages 396–404, Suntec, Singapore, August.

Tara McIntosh. 2010. Unsupervised discovery of negative categories in lexicon bootstrapping. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 356–365, Boston, USA.

Ellen Riloff and Rosie Jones. 1999. Learning dictionaries for information extraction by multi-level bootstrapping. In Proceedings of the 16th National Conference on Artificial Intelligence and the 11th Innovative Applications of Artificial Intelligence Conference, pages 474–479, Orlando, USA.

Ellen Riloff and Jessica Shepherd. 1997. A corpus-based approach for building semantic lexicons. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pages 117–124, Providence, USA.

Laura Rimell and Stephen Clark. 2009. Porting a lexicalized-grammar parser to the biomedical domain. Journal of Biomedical Informatics, pages 852–865.

Fei Wu and Daniel S. Weld. 2010. Open information extraction using Wikipedia. In Proceedings of the 48th Annual Meeting of the Association of Computational Linguistics, pages 118–127, Uppsala, Sweden.

Roman Yangarber, Winston Lin, and Ralph Grishman. 2002. Unsupervised learning of generalized names. In Proceedings of the 19th International Conference on Computational Linguistics, pages 1135–1141, Taipei, Taiwan.