acl acl2011 acl2011-333 acl2011-333-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Mohit Bansal ; Dan Klein
Abstract: Counts from large corpora (like the web) can be powerful syntactic cues. Past work has used web counts to help resolve isolated ambiguities, such as binary noun-verb PP attachments and noun compound bracketings. In this work, we first present a method for generating web count features that address the full range of syntactic attachments. These features encode both surface evidence of lexical affinities as well as paraphrase-based cues to syntactic structure. We then integrate our features into full-scale dependency and constituent parsers. We show relative error reductions of 7.0% over the second-order dependency parser of McDonald and Pereira (2006), 9.2% over the constituent parser of Petrov et al. (2006), and 3.4% over a non-local constituent reranker.
References:
M. Atterer and H. Schütze. 2007. Prepositional phrase attachment without oracles. Computational Linguistics, 33(4):469–476.
Thorsten Brants and Alex Franz. 2006. The Google Web 1T 5-gram corpus version 1.1. LDC2006T13.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of ACL.
Michael Collins and Terry Koo. 2005. Discriminative reranking for natural language parsing. Computational Linguistics, 31(1):25–70.
Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.
Michael Collins. 2000. Discriminative reranking for natural language parsing. In Proceedings of ICML.
Michael Collins. 2002. Discriminative training methods for Hidden Markov Models: Theory and experiments with perceptron algorithms. In Proceedings of EMNLP.
Jenny Rose Finkel, Alex Kleeman, and Christopher D. Manning. 2008. Efficient, feature-based, conditional random field parsing. In Proceedings of ACL.
Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL.
Adam Kilgarriff. 2007. Googleology is bad science. Computational Linguistics, 33(1).
Terry Koo and Michael Collins. 2010. Efficient third-order dependency parsers. In Proceedings of ACL.
Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple semi-supervised dependency parsing. In Proceedings of ACL.
Mirella Lapata and Frank Keller. 2004. The Web as a baseline: Evaluating the performance of unsupervised Web-based models for a range of NLP tasks. In Proceedings of HLT-NAACL.
M. Lauer. 1995. Corpus statistics meet the noun compound: some empirical results. In Proceedings of ACL.
André F. T. Martins, Noah A. Smith, and Eric P. Xing. 2009. Concise integer linear programming formulations for dependency parsing. In Proceedings of ACL-IJCNLP.
Ryan McDonald and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL.
Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Online large-margin training of dependency parsers. In Proceedings of ACL.
Preslav Nakov and Marti Hearst. 2005a. Search engine statistics beyond the n-gram: Application to noun compound bracketing. In Proceedings of CoNLL.
Preslav Nakov and Marti Hearst. 2005b. Using the web as an implicit training set: Application to structural ambiguity resolution. In Proceedings of EMNLP.
Preslav Nakov and Marti Hearst. 2008. Solving relational similarity problems using the web as a corpus. In Proceedings of ACL.
Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings of COLING-ACL.
Emily Pitler, Shane Bergsma, Dekang Lin, and Kenneth Church. 2010. Using web-scale n-grams to improve base NP parsing performance. In Proceedings of COLING.
Adwait Ratnaparkhi. 1996. A maximum entropy model for part-of-speech tagging. In Proceedings of EMNLP.
David A. Smith and Jason Eisner. 2008. Dependency parsing by belief propagation. In Proceedings of EMNLP.
David Vadas and James R. Curran. 2007. Adding noun phrase structure to the Penn Treebank. In Proceedings of ACL.
Martin Volk. 2001. Exploiting the WWW as a corpus to resolve PP attachment ambiguities. In Proceedings of Corpus Linguistics.
Alexander Yates, Stefan Schoenmackers, and Oren Etzioni. 2006. Detecting parser errors using web-based semantic filters. In Proceedings of EMNLP.