
122 emnlp-2011-Simple Effective Decipherment via Combinatorial Optimization


Source: pdf

Author: Taylor Berg-Kirkpatrick ; Dan Klein

Abstract: We present a simple objective function that, when optimized, yields accurate solutions to both decipherment and cognate pair identification problems. The objective simultaneously scores a matching between two alphabets and a matching between two lexicons, each in a different language. We introduce a simple coordinate descent procedure that efficiently finds effective solutions to the resulting combinatorial optimization problem. Our system requires only a list of words in both languages as input, yet it competes with and surpasses several state-of-the-art systems that are substantially more complex and make use of more information.
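The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' objective or implementation: the cost function (position-wise mismatches plus length difference), the brute-force assignment solver, and all names (`word_cost`, `best_assignment`, `total_cost`, `decipher`) are assumptions chosen for brevity. It assumes equal-sized alphabets and lexicons, and it is only practical for very small inputs because it enumerates permutations.

```python
from itertools import permutations


def word_cost(src, tgt, char_map):
    """Assumed toy cost: mismatches at aligned positions after mapping
    src through char_map, plus the difference in word lengths."""
    mapped = [char_map[c] for c in src]
    mismatches = sum(a != b for a, b in zip(mapped, tgt))
    return mismatches + abs(len(src) - len(tgt))


def best_assignment(cost):
    """Brute-force minimum-cost one-to-one assignment (tiny inputs only)."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))


def total_cost(src_words, tgt_words, char_map, word_match):
    """Objective value for a given alphabet matching and lexicon matching."""
    return sum(word_cost(src_words[i], tgt_words[j], char_map)
               for i, j in enumerate(word_match))


def decipher(src_words, tgt_words, src_alpha, tgt_alpha, iters=10):
    # Arbitrary starting point: pair the alphabets in the order given.
    char_map = dict(zip(src_alpha, tgt_alpha))
    word_match = None
    for _ in range(iters):
        # Step 1: hold the alphabet matching fixed, re-match the lexicons.
        cost = [[word_cost(s, t, char_map) for t in tgt_words]
                for s in src_words]
        new_match = best_assignment(cost)
        # Step 2: hold the lexicon matching fixed, re-match the alphabets by
        # maximising agreement counts at aligned positions of matched words.
        counts = {a: {b: 0 for b in tgt_alpha} for a in src_alpha}
        for i, j in enumerate(new_match):
            for a, b in zip(src_words[i], tgt_words[j]):
                counts[a][b] += 1
        neg = [[-counts[a][b] for b in tgt_alpha] for a in src_alpha]
        perm = best_assignment(neg)
        new_map = {a: tgt_alpha[perm[k]] for k, a in enumerate(src_alpha)}
        if new_match == word_match and new_map == char_map:
            break  # neither coordinate changed: a fixed point
        word_match, char_map = new_match, new_map
    return char_map, word_match


if __name__ == "__main__":
    src = ["abc", "ca", "bbac"]
    tgt = ["xy", "zzyx", "yzx"]
    mapping, matching = decipher(src, tgt, "abc", "xyz")
    print(mapping, matching, total_cost(src, tgt, mapping, matching))
```

Under this toy cost, each step exactly minimizes the objective over one matching while the other is held fixed, so the objective never increases. Like any coordinate descent, the loop can stop at a local optimum that depends on the starting alphabet matching. For larger inputs, a polynomial-time assignment solver such as the Hungarian method (Kuhn, 1955) would replace the brute-force search.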


reference text

A. Bouchard-Côté, P. Liang, T.L. Griffiths, and D. Klein. 2007. A probabilistic approach to diachronic phonology. In Proc. of EMNLP.
A. Bouchard-Côté, T.L. Griffiths, and D. Klein. 2009. Improved reconstruction of protolanguage word forms. In Proc. of NAACL.
A. Haghighi, P. Liang, T. Berg-Kirkpatrick, and D. Klein. 2008. Learning bilingual lexicons from monolingual corpora. In Proc. of ACL.
D. Hall and D. Klein. 2010. Finding cognate groups using phylogenies. In Proc. of ACL.
K. Knight and K. Yamada. 1999. A computational approach to deciphering unknown scripts. In Proc. of ACL Workshop on Unsupervised Learning in Natural Language Processing.
K. Knight, A. Nair, N. Rathod, and K. Yamada. 2006. Unsupervised analysis for decipherment problems. In Proc. of COLING/ACL.
P. Koehn and K. Knight. 2002. Learning a translation lexicon from monolingual corpora. In Proc. of ACL Workshop on Unsupervised Lexical Acquisition.
P. Koehn. 2005. Europarl: A parallel corpus for statistical machine translation. In Proc. of Machine Translation Summit.
G. Kondrak. 2001. Identifying cognates by phonetic and semantic similarity. In Proc. of NAACL.
H.W. Kuhn. 1955. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly.
P. Liang, D. Klein, and M.I. Jordan. 2008. Agreement-based learning. In Proc. of NIPS.
S. Ravi and K. Knight. 2011. Bayesian inference for Zodiac and other homophonic ciphers. In Proc. of ACL.
B. Snyder, R. Barzilay, and K. Knight. 2010. A statistical model for lost language decipherment. In Proc. of ACL.