83 acl-2012-Error Mining on Dependency Trees


Author: Claire Gardent ; Shashi Narayan

Abstract: In recent years, error mining approaches have been developed to help identify the most likely sources of parsing failures in parsing systems that use handcrafted grammars and lexicons. However, the techniques they use to enumerate and count n-grams build on the sequential nature of a text corpus and do not easily extend to structured data. In this paper, we propose an algorithm for mining trees and apply it to detect the most likely sources of generation failure. We show that this tree mining algorithm permits identifying not only errors in the generation system (grammar, lexicon) but also mismatches between the structures contained in the input and the input structures expected by our generator, as well as a few idiosyncrasies/errors in the input data.


reference text

Anja Belz, Michael White, Dominic Espinosa, Eric Kow, Deirdre Hogan, and Amanda Stent. 2011. The first surface realisation shared task: Overview and evaluation results. In Proceedings of the 13th European Workshop on Natural Language Generation (ENLG), Nancy, France.

Charles B. Callaway. 2003. Evaluating coverage for large symbolic NLG grammars. In Proceedings of the 18th International Joint Conference on Artificial Intelligence, pages 811–817, Acapulco, Mexico.

John Carroll, Ann Copestake, Dan Flickinger, and Viktor Paznański. 1999. An efficient chart generator for (semi-)lexicalist grammars. In Proceedings of the 7th European Workshop on Natural Language Generation, pages 86–95, Toulouse, France.

Yun Chi, Yirong Yang, and Richard R. Muntz. 2004. Hybridtreeminer: An efficient algorithm for mining frequent rooted trees and free trees using canonical form. In Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM), pages 11–20, Santorini Island, Greece. IEEE Computer Society.

Daniël de Kok, Jianqiang Ma, and Gertjan van Noord. 2009. A generalized method for iterative error mining in parsing results. In Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009), pages 71–79, Suntec, Singapore. Association for Computational Linguistics.

Claire Gardent and Eric Kow. 2007. Spotting overgeneration suspect. In Proceedings of the 11th European Workshop on Natural Language Generation (ENLG), pages 41–48, Schloss Dagstuhl, Germany.

Claire Gardent and Laura Perez-Beltrachini. 2010. Rtg based surface realisation for tag. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING), pages 367–375, Beijing, China.

Claire Gardent, Benjamin Gottesman, and Laura Perez-Beltrachini. 2010. Comparing the performance of two TAG-based Surface Realisers using controlled Grammar Traversal. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING - Poster session), pages 338–346, Beijing, China.

Richard Johansson and Pierre Nugues. 2007. Extended constituent-to-dependency conversion for english. In Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA), pages 105–112, Tartu, Estonia.

Rajakrishnan Rajkumar, Dominic Espinosa, and Michael White. 2011. The osu system for surface realization at generation challenges 2011. In Proceedings of the 13th European Workshop on Natural Language Generation (ENLG), pages 236–238, Nancy, France.

Benoît Sagot and Éric de la Clergerie. 2006. Error mining in parsing results. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (ACL), pages 329–336, Sydney, Australia.

Gertjan van Noord. 2004. Error mining for wide-coverage grammar engineering. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL), pages 446–453, Barcelona, Spain.