emnlp emnlp2013 emnlp2013-66 emnlp2013-66-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: He He ; Hal Daume III ; Jason Eisner
Abstract: Feature computation and exhaustive search have significantly restricted the speed of graph-based dependency parsing. We propose a faster framework of dynamic feature selection, where features are added sequentially as needed, edges are pruned early, and decisions are made online for each sentence. We model this as a sequential decision-making problem and solve it by imitation learning techniques. We test our method on 7 languages. Our dynamic parser can achieve accuracies comparable or even superior to parsers using a full set of features, while computing fewer than 30% of the feature templates.
P. Abbeel and A. Y. Ng. 2004. Apprenticeship learning via inverse reinforcement learning. In Proceedings of ICML. D. Benbouzid, R. Busa-Fekete, and B. K e´gl. 2012. Fast classification using space decision DAGs. In Proceedings of ICML. S. Bergsma and C. Cherry. 2010. Fast and accurate arc filtering for dependency parsing. In Proceedings of COLING. S. Buchholz and E. Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In CoNLL. Xavier Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL. Eugene Charniak, Mark Johnson, Micha Elsner, Joseph Austerweil, David Ellis, Isaac Haxton, Catherine Hill, R. Shrivaths, Jeremy Moore, Michael Pozar, and Theresa Vu. 2006. Multilevel coarse-to-fine PCFG parsing. In Proceedings of ACL. Y. J. Chu and T. H. Liu. 1965. On the shortest arborescence of a directed graph. Science Sinica, 14. Koby Crammer and Yoram Singer. 2003. Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research, 3:951–991. Koby Crammer, Ofer Dekel, Joseph Keshet, Shai ShalevShwartz, and Yoram Singer. 2006. Online passive- aggressive algorithms. Journal of Machine Learning Research, 7:551–585. Hal Daum e´ III, John Langford, and Daniel Marcu. 2009. Search-based structured prediction. Machine Learning Journal (MLJ). J. Edmonds. 1967. Optimum branchings. Journal of Research of the National Bureau of Standards, (71B):233–240. Jason Eisner. 1996. Three new probabilistic models for dependency parsing: an exploration. In Proceedings of COLING. Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9: 1871–1874. Tianshi Gao and Daphne Koller. 2010. Active classification based on value of classifier. In Proceedings of NIPS. Alexander Grubb and J. Andrew Bagnell. 2012. SpeedBoost: Anytime prediction with uniform nearoptimality. In AISTATS. He He, Hal Daum e´ III, and Jason Eisner. 2012. Costsensitive dynamic feature selection. In ICML Inferning Workshop. 1464 Matti K a¨ a¨ri a¨inen. 2006. Lower bounds for reductions. Talk at the Atomic Learning Workshop (TTI-C), March. Terry Koo and Michael Collins. 2010. Efficient thirdorder dependency parsers. In Proceedings of ACL. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):3 13–330. Andr e´ F. T. Martins, Dipanjan Das, Noah A. Smith, and Eric P. Xing. 2008. Stacking dependency parsers. In Proceedings of EMNLP. Ryan McDonald and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL, pages 81–88. Ryan McDonald, K. Crammer, and Fernando Pereira. 2005a. Online large-margin training of dependency parsers. In Proceedings of ACL. Ryan McDonald, Fernando Pereira, Kiril Ribarov, and Jan Haji cˇ. 2005b. Non-projective dependency parsing using spanning tree algorithms. In Proc. of EMNLP. N. Ratliff, D. Bradley, J. A. Bagnell, and J. Chestnutt. 2004. Boosting structured prediction for imitation learning. In Proceedings of ICML. B. Roark and K. Hollingshead. 2008. Classifying chart cells for quadratic complexity context-free inference. In Proceedings of COLING. St´ ephane. Ross, Geoffrey J. Gordon, and J. Andrew. Bagnell. 2011. A reduction of imitation learning and structured prediction to no-regret online learning. In Proceedings of AISTATS. Alexander Rush and Slav Petrov. 2012. Vine pruning for efficient multi-pass dependency parsing. In Proceedings of NAACL. David A. Smith and Jason Eisner. 2008. Dependency parsing by belief propagation. In EMNLP. Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning :An Introduction. MIT Press. R. E. Tarjan. 1977. Finding optimum branchings. Networks, 7(1):25–35. Kristina Toutanova, Dan Klein, Christopher Manning, and Yoram Singer. 2003. Feature-rich part-of-speech tagging with a cyclic dependency network. In NAACL. Paul Viola and Michael Jones. 2004. Robust feal-time face detection. International Journal of Computer Vision, 57: 137–154. Qin Iris Wang, Dekang Lin, and Dale Schuurmans. 2007. Simple training of dependency parsers via structured boosting. In Proceedings of IJCAI. David Weiss and Ben Taskar. 2010. Structured prediction cascades. In Proceedings of AISTATS. H. Yamada and Y. Matsumoto. 2003. Statistical dependency analysis with Support Vector Machines. In Proceedings of IWPT.