103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

Source: pdf

Author: Xiaoning Zhu ; Zhongjun He ; Hua Wu ; Haifeng Wang ; Conghui Zhu ; Tiejun Zhao

Abstract: This paper proposes a novel approach that utilizes a machine learning method to improve pivot-based statistical machine translation (SMT). For language pairs with few bilingual data, a possible solution in pivot-based SMT using another language as a

reference text

Colin Bannard and Chris Callison-Burch. 2005. Paraphrasing with Bilingual Parallel Corpora. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pages 597-604.
Sergey Brin and Lawrence Page. 1998. The Anatomy of a Large-Scale Hypertextual Web Search Engine. In Proceedings of the Seventh International World Wide Web Conference.
Trevor Cohn and Mirella Lapata. 2007. Machine Translation by Triangulation: Make Effective Use of Multi-Parallel Corpora. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pages 728-735.
Marta R. Costa-jussà, Carlos Henríquez, and Rafael E. Banchs. 2011. Enhancing Scarce-Resource Language Translation through Pivot Combinations. In Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 1361-1365.
Nick Craswell and Martin Szummer. 2007. Random Walks on the Click Graph. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 239-246.
Yiming Cui, Conghui Zhu, Xiaoning Zhu, Tiejun Zhao, and Dequan Zheng. 2013. Phrase Table Combination Deficiency Analyses in Pivot-based SMT. In Proceedings of the 18th International Conference on Application of Natural Language to Information Systems, pages 355-358.
Kevin Duh, Katsuhito Sudoh, Xianchao Wu, Hajime Tsukada, and Masaaki Nagata. 2011. Generalized Minimum Bayes Risk System Combination. In Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 1356-1360.
Jesús González-Rubio, Alfons Juan, and Francisco Casacuberta. 2011. Minimum Bayes-risk System Combination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 1268-1277.
Zhongjun He, Yao Meng, Yajuan Lü, Hao Yu, and Qun Liu. 2009. Reducing SMT Rule Table with Monolingual Key Phrase. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 121-124.
Howard Johnson, Joel Martin, George Foster, and Roland Kuhn. 2007. Improving translation quality by discarding most of the phrase table. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 967-975.
Philipp Koehn, Franz J. Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In HLT-NAACL: Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pages 127-133.
Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 388-395.
Philipp Koehn. 2005. Europarl: A Parallel Corpus for Statistical Machine Translation. In Proceedings of MT Summit X, pages 79-86.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, demonstration session, pages 177-180.
Franz Josef Och and Hermann Ney. 2000. A comparison of alignment models for statistical machine translation. In Proceedings of the 18th International Conference on Computational Linguistics, pages 1086-1090.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311-318.
Karl Pearson. 1905. The Problem of the Random Walk. Nature, 72(1865):294.
Seth Mydans. 2011. Across cultures, English is the word. New York Times.
Martin Szummer and Tommi Jaakkola. 2002. Partially Labeled Classification with Markov Random Walks. In Advances in Neural Information Processing Systems, pages 945-952.
Kristina Toutanova, Christopher D. Manning, and Andrew Y. Ng. 2004. Learning Random Walk Models for Inducing Word Dependency Distributions. In Proceedings of the 21st International Conference on Machine Learning.
Masao Utiyama and Hitoshi Isahara. 2007. A Comparison of Pivot Methods for Phrase-Based Statistical Machine Translation. In Proceedings of Human Language Technology: the Conference of the North American Chapter of the Association for Computational Linguistics, pages 484-491.
Masao Utiyama, Andrew Finch, Hideo Okuma, Michael Paul, Hailong Cao, Hirofumi Yamamoto, Keiji Yasuda, and Eiichiro Sumita. 2008. The NICT/ATR speech Translation System for IWSLT 2008. In Proceedings of the International Workshop on Spoken Language Translation, pages 77-84.
Haifeng Wang, Hua Wu, Xiaoguang Hu, Zhanyi Liu, Jianfeng Li, Dengjun Ren, and Zhengyu Niu. 2008. The TCH Machine Translation System for IWSLT 2008. In Proceedings of the International Workshop on Spoken Language Translation, pages 124-131.
Hua Wu and Haifeng Wang. 2007. Pivot Language Approach for Phrase-Based Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pages 856-863.
Hua Wu and Haifeng Wang. 2009. Revisiting Pivot Language Approach for Machine Translation. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th IJCNLP of the AFNLP, pages 154-162.
Ying Zhang, Fei Huang, and Stephan Vogel. 2005. Mining translations of OOV terms from the web through cross-lingual query expansion. In Proceedings of the 27th ACM SIGIR, pages 524-525.