emnlp emnlp2013 emnlp2013-156 emnlp2013-156-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nal Kalchbrenner; Phil Blunsom
Abstract: We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences and do not rely on alignments or phrasal translation units. The models have a generation and a conditioning aspect. The generation of the translation is modelled with a target Recurrent Language Model, whereas the conditioning on the source sentence is modelled with a Convolutional Sentence Model. Through various experiments, we first show that our models obtain a perplexity with respect to gold translations that is > 43% lower than that of state-of-the-art alignment-based translation models. Second, we show that they are remarkably sensitive to the word order, syntax, and meaning of the source sentence despite lacking alignments. Finally, we show that they match a state-of-the-art system when rescoring n-best lists of translations.
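The abstract only names the two components at a high level. The sketch below is a minimal illustration (not the authors' implementation) of that pairing: a convolutional sentence model (CSM) compresses the source sentence into one vector, which then conditions each step of a recurrent language model (RLM) over the target. The toy dimensions, the width-3 kernel with mean pooling, and the additive conditioning term are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

V, d, h = 1000, 64, 128              # assumed toy sizes: vocab, embedding dim, hidden dim
rng = np.random.default_rng(0)

E_src = rng.normal(0, 0.1, (V, d))   # source word embeddings
E_tgt = rng.normal(0, 0.1, (V, d))   # target word embeddings
K = rng.normal(0, 0.1, (3, d))       # width-3 convolution weights (illustrative)
W_cs = rng.normal(0, 0.1, (h, d))    # projects the source vector into hidden space
W_hh = rng.normal(0, 0.1, (h, h))    # recurrent weights
W_ih = rng.normal(0, 0.1, (h, d))    # input-to-hidden weights
W_ho = rng.normal(0, 0.1, (V, h))    # hidden-to-output weights

def csm(src_ids):
    """Stand-in Convolutional Sentence Model: narrow width-3 convolution
    over source embeddings, mean-pooled to a single sentence vector."""
    X = E_src[src_ids]                                                   # (n, d)
    if len(X) >= 3:
        X = np.stack([(K * X[i:i + 3]).sum(axis=0) for i in range(len(X) - 2)])
    return np.tanh(X.mean(axis=0))                                       # (d,)

def rlm_step(prev_word, hidden, src_vec):
    """One step of the recurrent language model, conditioned on the
    pooled source vector via an additive term in the hidden update."""
    hidden = np.tanh(W_ih @ E_tgt[prev_word] + W_hh @ hidden + W_cs @ src_vec)
    logits = W_ho @ hidden
    probs = np.exp(logits - logits.max())
    return probs / probs.sum(), hidden

# Usage: score a toy target sequence given a toy source sequence.
src, tgt = [4, 17, 250, 9], [7, 42, 3]
src_vec = csm(src)
hid, logp, prev = np.zeros(h), 0.0, 0    # word id 0 assumed to be start-of-sentence
for w in tgt:
    p, hid = rlm_step(prev, hid, src_vec)
    logp += np.log(p[w])
    prev = w
print("log-probability of target given source:", logp)
```

Because the conditioning enters only through a fixed-size source vector, the model scores translations without any word alignments, which is the property the abstract's perplexity and word-order experiments probe.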
Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155.
Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263–311.
R. Collobert and J. Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In International Conference on Machine Learning, ICML.
John Duchi, Elad Hazan, and Yoram Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12:2121–2159, July.
Chris Dyer, Jonathan Weese, Hendra Setiawan, Adam Lopez, Ferhan Ture, Vladimir Eidelman, Juri Ganitkevitch, Phil Blunsom, and Philip Resnik. 2010. cdec: A decoder, alignment, and learning framework for finite-state and context-free translation models. In Proceedings of the ACL 2010 System Demonstrations, pages 7–12. Association for Computational Linguistics.
Chris Dyer, Victor Chahuneau, and Noah A. Smith. 2013. A simple, fast, and effective reparameterization of IBM Model 2. In Proc. of NAACL.
Edward Grefenstette, Mehrnoosh Sadrzadeh, Stephen Clark, Bob Coecke, and Stephen Pulman. 2011. Concrete sentence spaces for compositional distributional models of meaning. CoRR, abs/1101.0309.
Karl Moritz Hermann and Phil Blunsom. 2013. The Role of Syntax in Vector Space Models of Compositional Semantics. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Sofia, Bulgaria, August. Association for Computational Linguistics. Forthcoming.
Nal Kalchbrenner and Phil Blunsom. 2013. Recurrent Convolutional Neural Networks for Discourse Compositionality. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, Sofia, Bulgaria, August. Association for Computational Linguistics.
Hai Son Le, Alexandre Allauzen, and François Yvon. 2012. Continuous space translation models with neural networks. In HLT-NAACL, pages 39–48.
Tomas Mikolov and Geoffrey Zweig. 2012. Context dependent recurrent neural network language model. In SLT, pages 234–239.
Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In Takao Kobayashi, Keikichi Hirose, and Satoshi Nakamura, editors, INTERSPEECH, pages 1045–1048. ISCA.
Tomas Mikolov, Stefan Kombrink, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. 2011. Extensions of recurrent neural network language model. In ICASSP, pages 5528–5531. IEEE.
Holger Schwenk, Daniel Déchelotte, and Jean-Luc Gauvain. 2006. Continuous space language models for statistical machine translation. In ACL.
Holger Schwenk. 2012. Continuous space translation models for phrase-based statistical machine translation. In COLING (Posters), pages 1071–1080.
Richard Socher, Eric H. Huang, Jeffrey Pennin, Andrew Y. Ng, and Christopher D. Manning. 2011. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In J. Shawe-Taylor, R.S. Zemel, P. Bartlett, F.C.N. Pereira, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems 24, pages 801–809.
Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. 2012. Semantic Compositionality Through Recursive Matrix-Vector Spaces. In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Ilya Sutskever, James Martens, and Geoffrey E. Hinton. 2011. Generating text with recurrent neural networks. In Lise Getoor and Tobias Scheffer, editors, ICML, pages 1017–1024. Omnipress.