
15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment


Author: Qun Liu ; Zhaopeng Tu ; Shouxun Lin

Abstract: In this paper, we propose a novel compact representation called weighted bipartite hypergraph to exploit the fertility model, which plays a critical role in word alignment. However, estimating the probabilities of rules extracted from hypergraphs is an NP-complete problem, which is computationally infeasible. Therefore, we propose a divide-and-conquer strategy by decomposing a hypergraph into a set of independent subhypergraphs. The experiments show that our approach outperforms both 1-best and n-best alignments.
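As a rough illustration of the divide-and-conquer step mentioned in the abstract, the sketch below splits a weighted bipartite hypergraph into sub-hypergraphs that share no source or target word. It assumes that "independent" means vertex-disjoint, which the abstract does not spell out. The hyperedge layout `(source_positions, target_positions, weight)` and the function name `decompose` are hypothetical choices made for this example, not taken from the paper.

```python
from collections import defaultdict


def decompose(hyperedges):
    """Group hyperedges into vertex-disjoint sub-hypergraphs.

    Each hyperedge is a (source_positions, target_positions, weight) triple
    (a hypothetical layout) and must cover at least one word. Two hyperedges
    end up in the same group when a chain of shared words links them.
    """
    parent = {}

    def find(v):
        # Return the root of v's set, shortening the path along the way.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        parent[find(a)] = find(b)

    for src, tgt, _ in hyperedges:
        # Tag vertices by side so source 0 and target 0 stay distinct.
        vertices = [("s", i) for i in src] + [("t", j) for j in tgt]
        for v in vertices[1:]:
            union(vertices[0], v)

    groups = defaultdict(list)
    for edge in hyperedges:
        src, tgt, _ = edge
        anchor = ("s", min(src)) if src else ("t", min(tgt))
        groups[find(anchor)].append(edge)
    return list(groups.values())


# Toy input: the first three hyperedges are linked through source word 0
# and target word 1; the fourth touches neither.
edges = [
    ({0}, {0}, 0.9),
    ({0}, {0, 1}, 0.4),
    ({1}, {1}, 0.6),
    ({2}, {2}, 1.0),
]
parts = decompose(edges)
print([len(p) for p in parts])
```

Under this assumption, each group can be processed separately, so any exponential-time computation is bounded by the size of the largest group rather than by the whole sentence pair.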

reference text

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–311.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228.
M. Collins, P. Koehn, and I. Kučerová. 2005. Clause restructuring for statistical machine translation. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 531–540.
Adrià de Gispert, Juan Pino, and William Byrne. 2010. Hierarchical phrase-based translation grammars extracted from alignment posterior probabilities. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 545–554.
Christopher Dyer, Smaranda Muresan, and Philip Resnik. 2008. Generalizing word lattice translation. In Proceedings of ACL-08: HLT, pages 1012–1020.
Reinhard Kneser and Hermann Ney. 1995. Improved backing-off for m-gram language modeling. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 181–184.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics, pages 48–54.
Shankar Kumar and William Byrne. 2002. Minimum Bayes-risk word alignments of bilingual texts. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, pages 140–147.
Yang Liu, Tian Xia, Xinyan Xiao, and Qun Liu. 2009. Weighted alignment matrices for statistical machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1017–1026.
Yang Liu, Qun Liu, and Shouxun Lin. 2010. Discriminative word alignment by linear modeling. Computational Linguistics, 36(3):303–339.
Haitao Mi and Liang Huang. 2008. Forest-based translation rule extraction. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 206–214.
Robert C. Moore. 2005. A discriminative framework for bilingual word alignment. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 81–88, October.
Franz J. Och and Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics, 30(4):417–449.
Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318.
Jason Riesa and Daniel Marcu. 2010. Hierarchical search for word alignment. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 157–166.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proceedings of Seventh International Conference on Spoken Language Processing, volume 3, pages 901–904. Citeseer.
Zhaopeng Tu, Yang Liu, Young-Sook Hwang, Qun Liu, and Shouxun Lin. 2010. Dependency forest for statistical machine translation. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 1092–1100.
Zhaopeng Tu, Yang Liu, Qun Liu, and Shouxun Lin. 2011. Extracting hierarchical rules from a weighted alignment matrix. In Proceedings of 5th International Joint Conference on Natural Language Processing, pages 1294–1303.
Zhaopeng Tu, Wenbin Jiang, Qun Liu, and Shouxun Lin. 2012a. Dependency forest for sentiment analysis. In Springer-Verlag Berlin Heidelberg, pages 69–77.
Zhaopeng Tu, Yang Liu, Yifan He, Josef van Genabith, Qun Liu, and Shouxun Lin. 2012b. Combining multiple alignments to improve machine translation. In Proceedings of the 24th International Conference on Computational Linguistics, pages 1249–1260.
Leslie G Valiant. 1979. The complexity of computing the permanent. Theoretical Computer Science, 8(2):189–201.
Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. 2008. Wider pipelines: n-best alignments and parses in MT training. In Proceedings of AMTA, pages 192–201.