262 acl-2010-Word Alignment with Synonym Regularization

Author: Hiroyuki Shindo; Akinori Fujino; Masaaki Nagata

Abstract: We present a novel framework for word alignment that incorporates synonym knowledge collected from monolingual linguistic resources in a bilingual probabilistic model. Synonym information is helpful for word alignment because we can expect a synonym to correspond to the same word in a different language. We design a generative model for word alignment that uses synonym information as a regularization term. The experimental results show that our proposed method significantly improves word alignment quality.

reference text

C. Bannard and C. Callison-Burch. 2005. Paraphrasing with bilingual parallel corpora. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 597–604. Association for Computational Linguistics, Morristown, NJ, USA.

J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith, and M. West. 2003. The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. In Bayesian Statistics 7: Proceedings of the 7th Valencia International Meeting, June 2-6, 2002, page 453. Oxford University Press, USA.

Y. Deng and Y. Gao. 2007. Guiding statistical word alignment models with prior knowledge. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 1–8, Prague, Czech Republic, June. Association for Computational Linguistics.

A. Fraser and D. Marcu. 2007. Getting the structure right for word alignment: LEAF. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 51–60, Prague, Czech Republic, June. Association for Computational Linguistics.

Y. Ma, S. Ozdowska, Y. Sun, and A. Way. 2008. Improving word alignment using syntactic dependencies. In Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2), pages 69–77, Columbus, Ohio, June. Association for Computational Linguistics.

R. Mihalcea and T. Pedersen. 2003. An evaluation exercise for word alignment. In Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, Volume 3, page 10. Association for Computational Linguistics.

G. A. Miller. 1995. WordNet: a lexical database for English. Communications of the ACM, 38(11):41.

F. J. Och and H. Ney. 2000. Improved statistical alignment models. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, pages 440–447. Association for Computational Linguistics, Morristown, NJ, USA.

F. J. Och and H. Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.

B. Sagot and D. Fiser. 2008. Building a free French wordnet from multilingual resources. In Proceedings of Ontolex.

S. Vogel, H. Ney, and C. Tillmann. 1996. HMM-based word alignment in statistical translation. In Proceedings of the 16th Conference on Computational Linguistics, Volume 2, pages 836–841. Association for Computational Linguistics, Morristown, NJ, USA.

B. Zhao and E. P. Xing. 2006. BiTAM: Bilingual topic admixture models for word alignment. In Proceedings of the COLING/ACL on Main Conference Poster Sessions, page 976. Association for Computational Linguistics.

B. Zhao and E. P. Xing. 2008. HM-BiTAM: Bilingual topic exploration, word alignment, and translation. In Advances in Neural Information Processing Systems 20, pages 1689–1696, Cambridge, MA. MIT Press.