86 emnlp-2010-Non-Isomorphic Forest Pair Translation


Author: Hui Zhang ; Min Zhang ; Haizhou Li ; Eng Siong Chng

Abstract: This paper studies two issues, non-isomorphic structure translation and target syntactic structure usage, for statistical machine translation in the context of forest-based tree to tree sequence translation. For the first issue, we propose a novel non-isomorphic translation framework to capture more non-isomorphic structure mappings than traditional tree-based and tree-sequence-based translation methods. For the second issue, we propose a parallel space searching method to generate hypotheses using a tree-to-string model and evaluate their syntactic goodness using a tree-to-tree/tree sequence model. This not only reduces the search complexity by merging spurious-ambiguity translation paths and solves the data sparseness issue in training, but also serves as a syntax-based target language model for better grammatical generation. Experimental results on the benchmark data show that our two proposed solutions are very effective, achieving significant performance improvement over baselines when applied to different translation models.


reference text

Eugene Charniak. 2000. A maximum-entropy inspired parser. NAACL-00.
Eugene Charniak, Kevin Knight, and Kenji Yamada. 2003. Syntax-based language models for statistical machine translation. MT Summit IX. 40-46.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2).
Steve DeNeefe, Kevin Knight. 2009. Synchronous Tree Adjoining Machine Translation. EMNLP-2009. 727-736.
Jason Eisner. 2003. Learning non-isomorphic tree mappings for MT. ACL-03 (companion volume).
Michel Galley, Mark Hopkins, Kevin Knight and Daniel Marcu. 2004. What’s in a translation rule? HLT-NAACL-04. 273-280.
Liang Huang. 2008. Forest Reranking: Discriminative Parsing with Non-Local Features. ACL-HLT-08. 586-594.
Liang Huang and David Chiang. 2005. Better k-best Parsing. IWPT-05. 53-64.
Liang Huang and David Chiang. 2007. Forest rescoring: Faster decoding with integrated language models. ACL-07. 144-151.
Dan Klein and Christopher D. Manning. 2001. Parsing and Hypergraphs. IWPT-2001.
Reinhard Kneser and Hermann Ney. 1995. Improved backing-off for M-gram language modeling. ICASSP-95. 181-184.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin and Evan Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. ACL-07 (poster). 177-180.
Yang Liu, Qun Liu and Shouxun Lin. 2006. Tree-to-String Alignment Template for Statistical Machine Translation. COLING-ACL-06. 609-616.
Yang Liu, Yun Huang, Qun Liu and Shouxun Lin. 2007. Forest-to-String Statistical Translation Rules. ACL-07. 704-711.
Yang Liu, Yajuan Lü, Qun Liu. 2009. Improving Tree-to-Tree Translation with Packed Forests. ACL-09. 558-566.
Haitao Mi, Liang Huang, and Qun Liu. 2008. Forest-based translation. ACL-HLT-08. 192-199.
Haitao Mi and Liang Huang. 2008. Forest-based Translation Rule Extraction. EMNLP-08. 206-214.
Franz J. Och. 2003. Minimum error rate training in statistical machine translation. ACL-03. 160-167.
Franz Josef Och and Hermann Ney. 2003. A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics, 29(1): 19-51.
Kishore Papineni, Salim Roukos, Todd Ward and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. ACL-02. 311-318.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. ICSLP-02. 901-904.
Masaru Tomita. 1987. An Efficient Augmented-Context-Free Parsing Algorithm. Computational Linguistics, 13(1-2): 31-46.
Xavier Carreras and Michael Collins. 2009. Non-projective Parsing for Statistical Machine Translation. EMNLP-2009. 200-209.
K. Yamada and K. Knight. 2001. A Syntax-Based Statistical Translation Model. ACL-01. 523-530.
Hui Zhang, Min Zhang, Haizhou Li, Aiti Aw and Chew Lim Tan. 2009a. Forest-based Tree Sequence to String Translation Model. ACL-IJCNLP-09. 172-180.
Hui Zhang, Min Zhang, Haizhou Li, and Chew Lim Tan. 2009b. Fast Translation Rule Matching for Syntax-based Statistical Machine Translation. EMNLP-09. 1037-1045.
Min Zhang, Hongfei Jiang, Ai Ti Aw, Jun Sun, Chew Lim Tan and Sheng Li. 2007. A Tree-to-Tree Alignment-based model for statistical Machine translation. MT-Summit-07. 535-542.
Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lim Tan, Sheng Li. 2008. A Tree Sequence Alignment-based Tree-to-Tree Translation Model. ACL-HLT-08. 559-567.
Ying Zhang, Stephan Vogel, Alex Waibel. 2004. Interpreting BLEU/NIST scores: How much improvement do we need to have a better system? LREC-04. 2051-2054.