acl acl2010 acl2010-203 acl2010-203-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Matthew Honnibal ; James R. Curran ; Johan Bos
Abstract: Once released, treebanks tend to remain unchanged despite any shortcomings in their depth of linguistic analysis or coverage of specific phenomena. Instead, separate resources are created to address such problems. In this paper we show how to improve the quality of a treebank by integrating resources and implementing improved analyses for specific constructions. We demonstrate this rebanking process by creating an updated version of CCGbank that includes the predicate-argument structure of both verbs and nouns, baseNP brackets, verb-particle constructions, and restrictive and non-restrictive nominal modifiers; and evaluate the impact of these changes on a statistical parser.
Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre. 1995. Bracketing guidelines for Treebank II style Penn Treebank project. Technical report, MS-CIS-95-06, University of Pennsylvania, Philadelphia, PA, USA.
Stephen Boxwell and Michael White. 2008. Projecting propbank roles onto the CCGbank. In Proceedings of the Sixth International Language Resources and Evaluation (LREC’08), pages 3112–3117. European Language Resources Association (ELRA), Marrakech, Morocco.
Miriam Butt, Mary Dalrymple, and Tracy H. King, editors. 2006. Lexical Semantics in LFG. CSLI Publications, Stanford, CA.
Aoife Cahill, Michael Burke, Ruth O’Donovan, Stefan Riezler, Josef van Genabith, and Andy Way. 2008. Wide-coverage deep statistical parsing using automatic dependency structure annotation. Computational Linguistics, 34(1):81–124.
Stephen Clark and James R. Curran. 2007. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4):493–552.
James Constable and James Curran. 2009. Integrating verb-particle constructions into CCG parsing. In Proceedings of the Australasian Language Technology Association Workshop 2009, pages 114–118. Sydney, Australia.
Christy Doran, Dania Egedi, Beth Ann Hockey, B. Srinivas, and Martin Zaidel. 1994. Xtag system: a wide coverage grammar for english. In Proceedings of the 15th conference on Computational linguistics, pages 922–928. ACL, Morristown, NJ, USA.
Dan Flickinger. 2000. On building a more efficient grammar by exploiting types. Natural Language Engineering, 6(1):15–28.
Daniel Gildea and Julia Hockenmaier. 2003. Identifying semantic roles using combinatory categorial grammar. In Proceedings of the 2003 conference on Empirical methods in natural language processing, pages 57–64. ACL, Morristown, NJ, USA.
Donald Hindle. 1983. User manual for fidditch, a deterministic parser. Technical Memorandum 7590-142, Naval Research Laboratory.
Julia Hockenmaier. 2003. Data and Models for Statistical Parsing with Combinatory Categorial Grammar. Ph.D. thesis, University of Edinburgh, Edinburgh, UK.
Julia Hockenmaier and Mark Steedman. 2002. Acquiring compact lexicalized grammars from a cleaner treebank. In Proceedings of the Third Conference on Language Resources and Evaluation Conference, pages 1974–1981. Las Palmas, Spain.
Julia Hockenmaier and Mark Steedman. 2007. CCGbank: a corpus of CCG derivations and dependency structures extracted from the Penn Treebank. Computational Linguistics, 33(3):355–396.
Matthew Honnibal and James R. Curran. 2007. Improving the complement/adjunct distinction in CCGBank. In Proceedings of the Conference of the Pacific Association for Computational Linguistics, pages 210–217. Melbourne, Australia.
Ronald M. Kaplan and Joan Bresnan. 1982. Lexical-Functional Grammar: A formal system for grammatical representation. In Joan Bresnan, editor, The mental representation of grammatical relations, pages 173–281. MIT Press, Cambridge, MA, USA.
Mitchell Marcus, Beatrice Santorini, and Mary Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330.
Paul Martin. 2002. The Wall Street Journal Guide to Business Style and Usage. Free Press, New York.
Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young, and Ralph Grishman. 2004. The NomBank project: An interim report. In Frontiers in Corpus Annotation: Proceedings of the Workshop, pages 24–31. Boston, MA, USA.
Yusuke Miyao, Takashi Ninomiya, and Jun’ichi Tsujii. 2004. Corpus-oriented grammar development for acquiring a head-driven phrase structure grammar from the Penn Treebank. In Proceedings of the First International Joint Conference on Natural Language Processing (IJCNLP-04), pages 684–693. Hainan Island, China.
Stepan Oepen, Daniel Flickenger, Kristina Toutanova, and Christopher D. Manning. 2004. LinGO Redwoods. a rich and dynamic treebank for HPSG. Research on Language and Computation, 2(4):575–596.
Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The proposition bank: An annotated corpus of semantic roles. Computational Linguistics, 31(1):71–106.
Carl Pollard and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar. The University of Chicago Press, Chicago.
Libin Shen, Lucas Champollion, and Aravind K. Joshi. 2008. LTAG-spinal and the treebank: A new resource for incremental, dependency and semantic parsing. Language Resources and Evaluation, 42(1):1–19.
Stuart M. Shieber. 1986. An Introduction to Unification-Based Approaches to Grammar, volume 4 of CSLI Lecture Notes. CSLI Publications, Stanford, CA.
Mark Steedman. 2000. The Syntactic Process. The MIT Press, Cambridge, MA, USA.
Daniel Tse and James R. Curran. 2008. Punctuation normalisation for cleaner treebanks and parsers. In Proceedings of the Australian Language Technology Workshop, volume 6, pages 151–159. ALTW, Hobart, Australia.
David Vadas and James Curran. 2007. Adding noun phrase structure to the Penn Treebank. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 240–247. ACL, Prague, Czech Republic.
David Vadas and James R. Curran. 2008. Parsing noun phrase structure with CCG. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, pages 335–343. ACL, Columbus, Ohio, USA.