
71 acl-2012-Dependency Hashing for n-best CCG Parsing


Author: Dominick Ng ; James R. Curran

Abstract: Optimising for one grammatical representation but evaluating over a different one is a particular challenge for parsers and n-best CCG parsing. We find that this mismatch causes many n-best CCG parses to be semantically equivalent, and describe a hashing technique that eliminates this problem, improving oracle n-best F-score by 0.7% and reranking accuracy by 0.4%. We also present a comprehensive analysis of errors made by the C&C CCG parser, providing the first breakdown of the impact of implementation decisions, such as supertagging, on parsing accuracy.
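The abstract states that many n-best parses are semantically equivalent and that hashing removes the duplicates. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes each parse can be summarised as a collection of dependency triples, and the `Dependency` type, the `dedupe_nbest` function, and the example sentence are all hypothetical names introduced for illustration.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical dependency representation: (head, relation, dependent).
Dependency = Tuple[str, str, str]
ScoredParse = Tuple[float, Iterable[Dependency]]


def dedupe_nbest(parses: Iterable[ScoredParse]) -> List[Tuple[float, FrozenSet[Dependency]]]:
    """Keep the first parse seen for each distinct set of dependencies.

    Assumes `parses` is already sorted best-first, so the survivor of
    each group of equivalent parses is the highest-ranked one.
    """
    seen = set()
    unique = []
    for score, deps in parses:
        key = frozenset(deps)  # order-independent and hashable
        if key in seen:
            continue  # same dependencies as an earlier parse
        seen.add(key)
        unique.append((score, key))
    return unique


# Illustrative n-best list: the second entry yields the same dependencies
# as the first, listed in a different order.
nbest = [
    (-1.2, [("saw", "subj", "John"), ("saw", "obj", "Mary")]),
    (-1.5, [("saw", "obj", "Mary"), ("saw", "subj", "John")]),
    (-2.0, [("saw", "subj", "Mary"), ("saw", "obj", "John")]),
]
unique = dedupe_nbest(nbest)
print(len(unique))
```

Using a `frozenset` as the key makes the comparison independent of the order in which dependencies are produced, which is the property needed when different derivations lead to the same dependency structure.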

reference text

Michael Auli and Adam Lopez. 2011. Training a Log-Linear Parser with Loss Functions via Softmax-Margin. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP-11), pages 333–343. Edinburgh, Scotland, UK.

Forrest Brennan. 2008. k-best Parsing Algorithms for a Natural Language Parser. Master’s thesis, University of Oxford.

Ted Briscoe and John Carroll. 2006. Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 41–48. Sydney, Australia.

Eugene Charniak and Mark Johnson. 2005. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), pages 173–180. Ann Arbor, Michigan, USA.

Stephen Clark and James R. Curran. 2004a. Parsing the WSJ Using CCG and Log-Linear Models. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), pages 103–110. Barcelona, Spain.

Stephen Clark and James R. Curran. 2004b. The Importance of Supertagging for Wide-Coverage CCG Parsing. In Proceedings of the 20th International Conference on Computational Linguistics (COLING-04), pages 282–288. Geneva, Switzerland.

Stephen Clark and James R. Curran. 2007. Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models. Computational Linguistics, 33(4):493–552.

Stephen Clark, Julia Hockenmaier, and Mark Steedman. 2002. Building Deep Dependency Structures using a Wide-Coverage CCG Parser. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-02), pages 327–334. Philadelphia, Pennsylvania, USA.

Michael Collins. 2000. Discriminative Reranking for Natural Language Parsing. In Proceedings of the 17th International Conference on Machine Learning (ICML-00), pages 175–182. Palo Alto, California, USA.

Jason Eisner. 1996. Efficient Normal-Form Parsing for Combinatory Categorial Grammar. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics (ACL-96), pages 79–86. Santa Cruz, California, USA.

Julia Hockenmaier. 2003. Parsing with Generative Models of Predicate-Argument Structure. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-03), pages 359–366. Sapporo, Japan.

Julia Hockenmaier. 2006. Creating a CCGbank and a Wide-Coverage CCG Lexicon for German. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL-06), pages 505–512. Sydney, Australia.

Julia Hockenmaier and Mark Steedman. 2007. CCGbank: A Corpus of CCG Derivations and Dependency Structures Extracted from the Penn Treebank. Computational Linguistics, 33(3):355–396.

Liang Huang. 2008. Forest Reranking: Discriminative Parsing with Non-Local Features. In Proceedings of the Human Language Technology Conference at the 45th Annual Meeting of the Association for Computational Linguistics (HLT/ACL-08), pages 586–594. Columbus, Ohio.

Liang Huang and David Chiang. 2005. Better k-best Parsing. In Proceedings of the Ninth International Workshop on Parsing Technology (IWPT-05), pages 53–64. Vancouver, British Columbia, Canada.

Liang Huang, Kevin Knight, and Aravind K. Joshi. 2006. Statistical Syntax-Directed Translation with Extended Domain of Locality. In Proceedings of the 7th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-06), pages 66–73. Boston, Massachusetts, USA.

Tracy Holloway King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ronald M. Kaplan. 2003. The PARC 700 Dependency Bank. In Proceedings of the 4th International Workshop on Linguistically Interpreted Corpora, pages 1–8. Budapest, Hungary.

Adwait Ratnaparkhi. 1996. A Maximum Entropy Model for Part-of-Speech Tagging. In Proceedings of the 1996 Conference on Empirical Methods in Natural Language Processing (EMNLP-96), pages 133–142. Philadelphia, Pennsylvania, USA.

Mark Steedman. 2000. The Syntactic Process. MIT Press, Cambridge, Massachusetts, USA.

Daniel Tse and James R. Curran. 2010. Chinese CCGbank: extracting CCG derivations from the Penn Chinese Treebank. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING-2010), pages 1083–1091. Beijing, China.

Aline Villavicencio. 2002. Learning to Distinguish PP Arguments from Adjuncts. In Proceedings of the 6th Conference on Natural Language Learning (CoNLL-2002), pages 84–90. Taipei, Taiwan.