emnlp emnlp2011 emnlp2011-51 emnlp2011-51-reference knowledge-graph by maker-knowledge-mining

51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation


Source: pdf

Author: Yin-Wen Chang ; Michael Collins

Abstract: This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.


reference text

Graeme Blackwood, Adri a` de Gispert, Jamie Brunning, and William Byrne. 2009. Large-scale statistical machine translation with weighted finite state transducers. In Proceeding of the 2009 conference on Finite-State Methods and Natural Language Processing: Post-proceedings of the 7th International Workshop FSMNLP 2008, pages 39–49, Amsterdam, The Netherlands, The Netherlands. IOS Press. Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263–3 11, June. Ulrich Germann, Michael Jahr, Kevin Knight, Daniel Marcu, and Kenji Yamada. 2001 . Fast decoding and optimal decoding for machine translation. In Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, ACL ’01, pages 228–235. Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, NAACL ’03, pages 48–54. Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ond ˇrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL ’07, pages 177–180. Philipp Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. In Proceedings of the MT Summit. Nikos Komodakis, Nikos Paragios, and Georgios Tziritas. 2007. MRF optimization via dual decomposition: Message-passing revisited. In Proceedings of the 11th International Conference on Computer Vision. Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, and David Sontag. 2010. Dual decomposition for parsing with non-projective head automata. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1288–1298, Cambridge, MA, October. Association for Computational Linguistics. Bernhard Korte and Jens Vygen. 2008. Combinatorial Optimization: Theory and Application. Springer Verlag. Shankar Kumar and William Byrne. 2005. Local phrase reordering models for statistical machine translation. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT ’05, pages 161–168. Claude Lemar ´echal. 2001. Lagrangian Relaxation. In Computational Combinatorial Optimization, Optimal or Provably Near-Optimal Solutions [based on a Spring School], pages 112–156, London, UK. Springer-Verlag. Angelia Nedi c´ and Asuman Ozdaglar. 2009. Approximate primal solutions and rate analysis for dual subgradient methods. SIAM Journal on Optimization, 19(4): 1757–1780. Franz Josef Och, Christoph Tillmann, Hermann Ney, and Lehrstuhl Fiir Informatik. 1999. Improved alignment models for statistical machine translation. In Proceedings of the Joint SIGDAT Conference on Empiri- cal Methods in Natural Language Processing and Very Large Corpora, pages 20–28. 36 Franz Josef Och, Nicola Ueffing, and Hermann Ney. 2001. An efficient A* search algorithm for statistical machine translation. In Proceedings of the workshop on Data-driven methods in machine translation Volume 14, DMMT ’01, pages 1–8, Stroudsburg, PA, USA. Association for Computational Linguistics. Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, ACL ’03, pages 160–167. Kishore Papineni, Salim Roukos, Todd Ward, and Wei Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of ACL 2002. Sebastian Riedel and James Clarke. 2006. Incremental integer linear programming for non-projective dependency parsing. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP ’06, pages 129–137, Stroudsburg, PA, USA. Association for Computational Linguistics. Sebastian Riedel and James Clarke. 2009. Revisiting optimal decoding for machine translation IBM model 4. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers, NAACLShort ’09, pages 5–8, Stroudsburg, PA, USA. Association for Computational Linguistics. Alexander M. Rush and Michael Collins. 2011. Exact decoding of syntactic translation models through Lagrangian relaxation. In Proceedings of ACL. Alexander M Rush, David Sontag, Michael Collins, and Tommi Jaakkola. 2010. On dual decomposition and linear programming relaxations for natural language processing. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1–1 1, Cambridge, MA, October. Association for Computational Linguistics. David A. Smith and Jason Eisner. 2008. Dependency parsing by belief propagation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’08, pages 145–156. David Sontag, Talya Meltzer, Amir Globerson, Tommi Jaakkola, and Yair Weiss. 2008. Tightening LP relaxations for MAP using message passing. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, pages 503–5 10. Christoph Tillmann and Hermann Ney. 2003. Word reordering and a dynamic programming beam search algorithm for statistical machine translation. Computational Linguistics, 29:97–133, March. Christoph Tillmann. 2006. Efficient dynamic programming search algorithms for phrase-based SMT. In Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, CHSLP ’06, pages 9–16. Roy W. Tromble and Jason Eisner. 2006. A fast finite-state relaxation method for enforcing global constraints on sequence decoding. In Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, HLT-NAACL ’06, pages 423–430, Stroudsburg, PA, USA. Association for Computational Linguistics. Martin Wainwright, Tommi Jaakkola, and Alan Willsky. 2005. MAP estimation via agreement on trees: Message-passing and linear programming. In IEEE Transactions on Information Theory, volume 51, pages 3697–3717. Mikhail Zaslavskiy, Marc Dymetman, and Nicola Cancedda. 2009. Phrase-based statistical machine translation as a traveling salesman problem. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1- Volume 1, ACL ’09, pages 333–341, Stroudsburg, PA, USA. Association for Computational Linguistics. 37