
320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation



Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti

Abstract: We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German translation task. As an additional contribution, we make the developed software and complete tool-chain publicly available for further experimentation.


reference text

André Arnold and Max Dauchet. 1982. Morphismes et bimorphismes d’arbres. Theoret. Comput. Sci., 20(1):33–93.
Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The WaCky Wide Web: A collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation, 43(3):209–226.
Chris Callison-Burch, Philipp Koehn, Christof Monz, and Josh Schroeder. 2009. Findings of the 2009 Workshop on Statistical Machine Translation. In Proc. 4th Workshop on Statistical Machine Translation, pages 1–28.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proc. 43rd ACL, pages 173–180.
David Chiang. 2007. Hierarchical phrase-based translation. Computat. Linguist., 33(2):201–228.
David Chiang. 2010. Learning to translate with source and target syntax. In Proc. 48th ACL, pages 1443–1452.
Jason Eisner. 2003. Learning non-isomorphic tree mappings for machine translation. In Proc. 41st ACL, pages 205–208.
Michel Galley, Mark Hopkins, Kevin Knight, and Daniel Marcu. 2004. What’s in a translation rule? In Proc. HLT-NAACL, pages 273–280.
Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proc. 44th ACL, pages 961–968.
Hieu Hoang, Philipp Koehn, and Adam Lopez. 2009. A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation. In Proc. 6th Int. Workshop Spoken Language Translation, pages 152–159.
Liang Huang, Kevin Knight, and Aravind Joshi. 2006. Statistical syntax-directed translation with extended domain of locality. In Proc. 7th Conf. Association for Machine Translation of the Americas, pages 66–73.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proc. HLT-NAACL, pages 127–133.
Philipp Koehn, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne, and David Talbot. 2005. Edinburgh system description for the 2005 IWSLT Speech Translation Evaluation. In Proc. 2nd Int. Workshop Spoken Language Translation.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proc. ACL, pages 177–180.
Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proc. EMNLP, pages 388–395.
Philipp Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. In Proc. 10th Machine Translation Summit, pages 79–86.
Alon Lavie, Alok Parlikar, and Vamshi Ambati. 2008. Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora. In Proc. 2nd ACL Workshop on Syntax and Structure in Statistical Translation, pages 87–95.
Eric Lilin. 1978. Une généralisation des transducteurs d’états finis d’arbres: les S-transducteurs. Thèse 3ème cycle, Université de Lille.
Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-string alignment template for statistical machine translation. In Proc. 44th ACL, pages 609–616.
Yang Liu, Yajuan Lü, and Qun Liu. 2009. Improving tree-to-tree translation with packed forests. In Proc. 47th ACL, pages 558–566.
Andreas Maletti. 2010. Why synchronous tree substitution grammars? In Proc. HLT-NAACL, pages 876–884.
Andreas Maletti. 2011. How to train your multi bottom-up tree transducer. In Proc. 49th ACL, pages 825–834.
Andreas Maletti. 2012. Every sensible extended top-down tree transducer is a multi bottom-up tree transducer. In Proc. HLT-NAACL, pages 263–273.
Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computat. Linguist., 29(1):19–51.
Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proc. 41st ACL, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proc. 40th ACL, pages 311–318.
Jean-Claude Raoult. 1997. Rational tree relations. Bull. Belg. Math. Soc. Simon Stevin, 4(1):149–176.
Helmut Schmid. 2004. Efficient parsing of highly ambiguous context-free grammars with bit vectors. In Proc. 20th COLING, pages 162–168.
Jun Sun, Min Zhang, and Chew Lim Tan. 2009. A non-contiguous tree sequence alignment-based model for statistical machine translation. In Proc. 47th ACL, pages 914–922.
Web-as-Corpus Consortium. 2008. SDeWaC - a 0.88 billion word corpus for German. Website: http://wacky.sslmit.unibo.it/doku.php.
Dekai Wu. 1997. Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computat. Linguist., 23(3):377–403.
Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lim Tan, and Sheng Li. 2008a. A tree sequence alignment-based tree-to-tree translation model. In Proc. 46th ACL, pages 559–567.
Min Zhang, Hongfei Jiang, Haizhou Li, Aiti Aw, and Sheng Li. 2008b. Grammar comparison study for translational equivalence modeling and statistical machine translation. In Proc. 22nd International Conference on Computational Linguistics, pages 1097–1104.