emnlp emnlp2010 emnlp2010-106 emnlp2010-106-reference knowledge-graph by maker-knowledge-mining

106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing


Source: pdf

Author: Eugene Charniak

Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.


reference text

Stephen Abney and Mark Johnson. 1991. Memory requirements and local ambiguities of parsing strategies. Journal of Psycholinguistic Research, 20(3) :233–250. Y. Amit and D. Geman. 1997. Shape quantization and recognition with randomized trees. Neural Computation, 9:1545–1588. Dan Bikel. 2004. On the Parameter Space of Lexicalized Statistical Parsing Models. Ph.D. thesis, University of Pennsylvania. Leo Breiman. 2004. Random forests. Machine Learning, 45(1) :5–32. Eugene Charniak and Mark Johnson. 2005. Coarseto-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pages 173–180, Ann Arbor, Michigan, June. Association for Computational Linguistics. Eugene Charniak. 2000. A maximum-entropyinspired parser. In The Proceedings of the North American Chapter of the Association for Computational Linguistics, pages 132–139. Stanley F. Chen and Joshua Goodman. 1996. An empirical study of smoothing techniques for language modeling. In Arivind Joshi and Martha Palmer, editors, Proceedings of the Thirty-Fourth 682 Annual Meeting of the Association for Computational Linguistics, pages 310–318, San Francisco. Morgan Kaufmann Publishers. Michael Collins and Brian Roark. 2004. Incremental parsing with the perceptron algorithm. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (A CL ’04), Main Volume, pages 111–118, Barcelona, Spain, July. Michael Collins. 2003. Head-driven statistical models for natural language parsing. Computational Linguistics, 29(4) :589–638. A. Demers. 1977. Generalized left-corner parsing. In Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, 1977 ACM SIGACT/SIGPLAN, pages 170–182. Jay Earley. 1970. An efficient contex-free parsing algorithm. Communications of the A CM, 6(8) :451– 445. Suzan L. Graham, Michael Harrison, and Walter L. Ruzzo. 1980. An improved context-free recognizer. A CM Transations on Programming Languages and Systems, 2(3) :415–462. James Henderson. 2003. Inducing history representations for broad coverage statistical parsing. In Proceedings of HLT-NAACL 2003. Mark Johnson and Brian Roark. 2000. Compact non-left-recursive grammars using the selective left-corner transform and factoring. In Proceedings of COLING-2000. Dan Klein and Christopher Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. Michell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2) :313– 330. Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing. Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 433–440, Sydney, Australia, July. Association for Computational Linguistics. Brian Roark. 2001. Probabilistic top-down parsing and language modeling. Computational Linguis- tics, 27(2):249–276. Roger van Gompel and Martin J. Pickering. 2007. Syntactic parsing. In G. Gatskil, editor, The Oxford handbook of psycholinguistics. Oxford University Press. Peng Xu and Frederick Jelinek. 2004. Random forests in language modeling. In Proceedings of the 2004 Empirical Methods in Natural Language Processing Conference. The Association for Computational Linguistics. 683