acl acl2012 acl2012-22 acl2012-22-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xinyan Xiao ; Deyi Xiong ; Min Zhang ; Qun Liu ; Shouxun Lin
Abstract: Previous work using topic models for statistical machine translation (SMT) explores topic information at the word level. However, SMT has advanced from the word-based paradigm to the phrase/rule-based paradigm. We therefore propose a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation. We associate each synchronous rule with a topic distribution, and select desirable rules according to the similarity of their topic distributions with given documents. We show that our model significantly improves translation performance over the baseline on NIST Chinese-to-English translation experiments. Our model also achieves better performance and faster speed than previous approaches that work at the word level.
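The rule-selection idea in the abstract (each synchronous rule carries a topic distribution, and rules whose distribution is close to that of the document are preferred) can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so the Hellinger distance used here is an assumption, and the function names, rule labels, and topic distributions are all hypothetical illustration values rather than anything taken from the paper.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    Assumed similarity measure (not stated in the abstract).
    Returns a value in [0, 1]; 0 means identical distributions.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * total)


def rank_rules(rule_topics, doc_topics):
    """Sort rule names from most to least similar to the document.

    rule_topics: dict mapping a (hypothetical) rule name to its topic distribution.
    doc_topics:  topic distribution of the document being translated.
    """
    return sorted(
        rule_topics,
        key=lambda name: hellinger_distance(rule_topics[name], doc_topics),
    )


# Hypothetical two-topic example: the document is mostly about topic 0.
rules = {
    "rule_a": [0.9, 0.1],  # rule mostly observed in topic-0 text
    "rule_b": [0.2, 0.8],  # rule mostly observed in topic-1 text
}
document = [0.8, 0.2]
ranking = rank_rules(rules, document)
```

In a decoder, a distance like this would more plausibly enter as one feature score alongside the usual translation features than as a hard filter; the ranking above only shows how topic similarity separates candidate rules.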
References:
Nicola Bertoldi and Marcello Federico. 2009. Domain adaptation for statistical machine translation with monolingual resources. In Proc. of WMT 2009.
David M. Blei and John D. Lafferty. 2007. A correlated topic model of science. AAS, 1(1):17–35.
David M. Blei, Andrew Ng, and Michael Jordan. 2003. Latent Dirichlet allocation. JMLR, 3:993–1022.
Marine Carpuat and Dekai Wu. 2007. Context-dependent phrasal translation lexicons for statistical machine translation. In Proceedings of the MT Summit XI.
David Chiang, Yuval Marton, and Philip Resnik. 2008. Online large-margin training of syntactic and structural translation features. In Proc. EMNLP 2008.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228.
George Foster and Roland Kuhn. 2007. Mixture-model adaptation for SMT. In Proc. of the Second Workshop on Statistical Machine Translation, pages 128–135, Prague, Czech Republic, June.
Zhengxian Gong, Yu Zhang, and Guodong Zhou. 2010. Statistical machine translation based on LDA. In Proc. IUCS 2010, pages 286–290, Oct.
Zhongjun He, Qun Liu, and Shouxun Lin. 2008. Improving statistical machine translation using lexicalized rule selection. In Proc. EMNLP 2008.
Thomas Hofmann. 1999. Probabilistic latent semantic analysis. In Proc. of UAI 1999, pages 289–296.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proc. HLT-NAACL 2003.
Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proc. EMNLP 2004.
David Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, and Andrew McCallum. 2009. Polylingual topic models. In Proc. of EMNLP 2009.
Franz J. Och and Hermann Ney. 2002. Discriminative training and maximum entropy models for statistical machine translation. In Proc. ACL 2002.
Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.
Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proc. ACL 2003.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proc. ACL 2002.
Nick Ruiz and Marcello Federico. 2011. Topic adaptation for lecture translation through bilingual latent semantic models. In Proceedings of the Sixth Workshop on Statistical Machine Translation, July.
Libin Shen, Jinxi Xu, Bing Zhang, Spyros Matsoukas, and Ralph Weischedel. 2009. Effective use of linguistic and contextual information for statistical machine translation. In Proc. EMNLP 2009.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proc. ICSLP 2002.
Yik-Cheung Tam, Ian R. Lane, and Tanja Schultz. 2007. Bilingual LSA-based adaptation for statistical machine translation. Machine Translation, 21(4):187–207.
Hua Wu, Haifeng Wang, and Chengqing Zong. 2008. Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora. In Proc. Coling 2008.
Bing Zhao and Eric P. Xing. 2006. BiTAM: Bilingual topic admixture models for word alignment. In Proc. ACL 2006.
Bing Zhao and Eric P. Xing. 2007. HM-BiTAM: Bilingual topic exploration, word alignment, and translation. In Proc. NIPS 2007.