acl acl2011 acl2011-58 acl2011-58-reference knowledge-graph by maker-knowledge-mining

58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

Source: pdf

Author: Nathan Bodenstab ; Aaron Dunlop ; Keith Hall ; Brian Roark

Abstract: Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log linear model, we learn the optimal beam-search pruning parameters for each CYK chart cell, effectively predicting the most promising areas of the model space to explore. We demonstrate that our method is faster than coarse-to-fine pruning, exemplified in both the Charniak and Berkeley parsers, by empirically comparing our parser to the Berkeley parser using the same grammar and under identical operating conditions.

reference text

Robert J. Bobrow. 1990. Statistical agenda parsing. In DARPA Speech and Language Workshop, pages 222– 224. Nathan Bodenstab, Kristy Hollingshead, and Brian Roark. 2011. Unary constraints for efficient contextfree parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon. Sharon A Caraballo and Eugene Charniak. 1998. New figures of merit for best-first probabilistic chart parsing. Computational Linguistics, 24:275–298. Eugene Charniak and Mark Johnson. 2005. Coarse-tofine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 173– 180, Ann Arbor, Michigan. Eugene Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of the 1st North American chapter of the Association for Computational Linguis- tics conference, pages 132–139, Seattle, Washington. David Chiang. 2010. Learning to translate with source and target syntax. In Proceedings of the 48rd Annual Meeting on Association for Computational Linguistics, pages 1443–1452. Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. PhD dissertation, University of Pennsylvania. Michael Collins. 2002. Discriminative training methods for hidden markov models: theory and experiments with perceptron algorithms. In Proceedings of the ACL-02 conference on Empirical Methods in Natural Language Processing, volume 10, pages 1–8, Philadelphia. Hal Daum e´, III and Daniel Marcu. 2005. Learning as search optimization: approximate large margin methods for structured prediction. In Proceedings of the 22nd international conference on Machine learning, ICML ’05, pages 169–176, New York, NY, USA. Joshua Goodman. 1997. Global thresholding and Multiple-Pass parsing. Proceedings of the Second Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 11–25. Mark Johnson. 1998. PCFG models of linguistic tree representations. Computational Linguistics, 24(4):613–632. Dan Klein and Christopher D. Manning. 2003a. A* parsing. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL ’03), pages 40–47, Edmonton, Canada. Dan Klein and Christopher D. Manning. 2003b. Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, pages 423–430, Sapporo, Japan. Mitchell P Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz, and Ann Taylor. 1999. Treebank-3, Philadelphia. Takuya Matsuzaki, Yusuke Miyao, and Jun’ichi Tsujii. 2005. Probabilistic CFG with latent annotations. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL ’05, pages 75–82, Ann Arbor, Michigan. Joakim Nivre. 2008. Algorithms for deterministic incremental dependency parsing. Comput. Linguist., 34:513–553. Adam Pauls, Dan Klein, and Chris Quirk. 2010. Topdown k-best a* parsing. In In proceedings of the Annual Meeting on Association for Computational Linguistics Short Papers, ACLShort ’ 10, pages 200–204, Morristown, NJ, USA. 449 Slav Petrov and Dan Klein. 2007a. Improved inference for unlexicalized parsing. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 404– 411, Rochester, New York. Slav Petrov and Dan Klein. 2007b. Learning and inference for hierarchically split PCFGs. In AAAI 2007 (Nectar Track). Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pages 433–440, Sydney, Australia. Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, and Hiyan Alshawi. 2010. Uptraining for accurate deterministic question parsing. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 705–713, Cambridge, MA, October. Vasin Punyakanok, Dan Roth, and Wen tau Yih. 2008. The importance of syntactic parsing and inference in semantic role labeling. Computational Linguistics, 34(2):257–287. Brian Roark and Kristy Hollingshead. 2008. Classifying chart cells for quadratic complexity context-free inference. In Donia Scott and Hans Uszkoreit, editors, Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pages 745– 752, Manchester, UK. Brian Roark and Kristy Hollingshead. 2009. Linear complexity Context-Free parsing pipelines via chart constraints. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 647–655, Boulder, Colorado. Tzong-Han Tsai, Chia-Wei Wu, Yu-Chun Lin, and WenLian Hsu. 2005. Exploiting full parsing information to label semantic roles using an ensemble of ME and SVM via integer linear programming. In Proceedings of the Ninth Conference on Computational Natural Language Learning, CONLL ’05, pages 233–236, Morristown, NJ, USA. Yue Zhang, Byung gyu Ahn, Stephen Clark, Curt Van Wyk, James R. Curran, and Laura Rimell. 2010. Chart pruning for fast Lexicalised-Grammar parsing. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 1472–1479, Beijing, China.