acl acl2013 acl2013-109 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Malte Nuhn ; Hermann Ney
Abstract: In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before.
Reference: text
sentIndex sentText sentNum sentScore
1 We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). [sent-2, score-1.478]
2 To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before. [sent-3, score-0.771]
3 1 Introduction The decipherment approach for MT has recently gained popularity for training and adapting translation models using only monolingual data. [sent-4, score-0.648]
4 In general, the process of translation involves a wide range of phenomena like substitution and reordering of words and phrases. [sent-6, score-0.083]
5 In this paper we only study models that substitute tokens, i.e., 1:1 substitution ciphers. [sent-7, score-0.031]
6 It therefore serves as a very basic case for decipherment and machine translation. [sent-10, score-0.648]
7 Multiple techniques like integer linear programming (ILP), A∗ search, genetic algorithms, and Bayesian inference have been used to tackle the decipherment problem for 1:1 substitution ciphers. [sent-11, score-0.791]
8 The existence of such a variety of different approaches for solving the same problem already shows that there is no obvious way to solve the problem optimally. [sent-12, score-0.189]
9 In this paper we show that decipherment of 1:1 substitution ciphers is indeed NP-hard. [sent-13, score-0.807]
10 The literature on decipherment provides surprisingly little analysis of the complexity of the decipherment problem. [sent-16, score-1.296]
11 This might be related to the fact that a statistical formulation of the decipherment problem has not been analyzed with respect to n-gram language models: This paper shows the close relationship of the decipherment problem to the quadratic assignment problem. [sent-17, score-2.095]
12 To the best of our knowledge the connection between the decipherment problem and the quadratic assignment problem was not known. [sent-18, score-1.494]
13 Section 3 introduces the decipherment problem and describes the notation and definitions used throughout this paper. [sent-20, score-0.765]
14 In Section 4 we show that decipherment using a unigram language model corresponds to solving a linear sum assignment problem (LSAP). [sent-21, score-1.253]
15 Section 5 shows the connection between the quadratic assignment problem and decipherment using a bigram language model. [sent-22, score-1.507]
16 Here we also give a reduction of the traveling salesman problem (TSP) to the decipherment problem to highlight the additional complexity in the decipherment problem. [sent-23, score-1.634]
17 2 Related Work In recent years a large number of publications on the automatic decipherment of substitution ciphers have appeared. [sent-24, score-0.828]
18 These publications were mostly dominated by rather heuristic methods and did not provide a theoretical analysis of the complexity of the decipherment problem: (Knight and Yamada, 1999) and [sent-25, score-0.669]
19 (Knight et al., 2006) use the EM algorithm for various decipherment problems. [sent-26, score-0.648]
20 (Nuhn et al., 2012) and (Dou and Knight, 2012) treat natural language translation as a deciphering problem including phenomena like reordering, insertion, and deletion, and are able to train translation models using only monolingual data. [sent-35, score-0.149]
21 In this paper we will show the connection between the decipherment problem and the linear sum assignment problem as well as the quadratic assignment problem: Regarding the linear sum assignment problem we will make use of definitions presented in (Burkard and Çela, 1999). [sent-36, score-2.457]
22 Concerning the quadratic assignment problem we will use basic definitions from (Beckmann and Koopmans, 1957). [sent-37, score-0.764]
23 (Çela, 1998) gives a good overview of the quadratic assignment problem, including different formulations, solution methods, and an analysis of computational complexity. [sent-39, score-0.647]
24 3 Definitions In the following we will use the machine translation notation and denote the ciphertext with $f_1^N = f_1 \dots f_N$. [sent-41, score-0.088]
25 We define $e_0 = f_0 = e_{N+1} = f_{N+1} = \$$ (1) with “$” being a special sentence boundary token. [sent-55, score-0.025]
26 A general substitution cipher uses a table s(e|f) which contains for each cipher token f a probability that the token f is substituted with the plaintext token e. [sent-57, score-0.446]
27 Such a table for substituting cipher tokens {A, B, C, D} with plaintext tokens {a, b, c, d} could for example look like the probability table shown in the paper (rows A–D over columns a–d; the numeric entries are lost in this extraction). [sent-58, score-0.195]
28 The 1:1 substitution cipher encrypts a given plaintext into a ciphertext by replacing each plaintext token with a unique substitute: This means that the table s(e|f) contains all zeroes, except for one “1” per f. [sent-74, score-0.393]
29 For example the text abadcab would be enciphered to BCBADBC when using the substitution table with rows A: 0 0 0 1, B: 1 0 0 0, C: 0 1 0 0, D: 0 0 1 0 (columns a, b, c, d). We formalize the 1:1 substitutions with a bijective function φ : Vf → Ve. [sent-77, score-0.083]
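To make the running example concrete, here is a minimal sketch (in Python, ours rather than from the paper) of applying such a 1:1 substitution; the dictionary below is exactly the mapping encoded by the table above.

```python
def encipher(plaintext, subst):
    """Apply a 1:1 substitution cipher: every plaintext symbol is
    replaced by its unique cipher substitute."""
    return "".join(subst[ch] for ch in plaintext)

# The substitution from the example table: a->B, b->C, c->D, d->A.
subst = {"a": "B", "b": "C", "c": "D", "d": "A"}
assert encipher("abadcab", subst) == "BCBADBC"
```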
30 The general decipherment goal is to obtain a mapping φ such that the probability of the deciphered text is maximal: $\hat{\phi} = \arg\max_{\phi} p(\phi(f_1)\phi(f_2)\dots\phi(f_N))$ (2). [sent-78, score-0.648]
31 Given a ciphertext $f_1^N$, we define the unigram count $N_f$ of f ∈ Vf as $N_f = \sum_{i=0}^{N+1} \delta(f, f_i)$ (3). This implies that the $N_f$ are integer counts ≥ 0. [sent-86, score-0.291]
32 We similarly define the bigram count $N_{ff'}$ of f, f' ∈ Vf as $N_{ff'} = \sum_{i=1}^{N+1} \delta(f, f_{i-1}) \cdot \delta(f', f_i)$ (4). This definition implies that (a) the $N_{ff'}$ are integer counts ≥ 0 of bigrams found in the ciphertext $f_1^N$. [sent-87, score-0.311]
33 (b) Given the first and last token of the cipher, f1 and fN, the bigram counts involving the sentence boundary token $ need to fulfill $N_{\$f} = \delta(f, f_1)$ (5) and $N_{f\$} = \delta(f, f_N)$ (6). (c) For all f ∈ Vf, $\sum_{f' \in V_f} N_{ff'} = \sum_{f' \in V_f} N_{f'f}$ (7) must hold. [sent-88, score-0.333]
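A small sketch of Equations (3)–(7) in code may help; this is an illustration of the definitions, not code from the paper (the helper name and the toy cipher are ours).

```python
from collections import Counter

def ngram_counts(ciphertext, boundary="$"):
    """Unigram counts N_f (Eq. 3) and bigram counts N_ff' (Eq. 4),
    with the sequence padded by the sentence boundary token."""
    seq = [boundary] + list(ciphertext) + [boundary]
    unigrams = Counter(seq)               # sums over i = 0 .. N+1
    bigrams = Counter(zip(seq, seq[1:]))  # pairs (f_{i-1}, f_i), i = 1 .. N+1
    return unigrams, bigrams

N_f, N_ff = ngram_counts("BCBADBC")
# Properties (5) and (6): exactly one bigram into and out of "$".
assert N_ff[("$", "B")] == 1 and N_ff[("C", "$")] == 1
# Property (7): in-degree equals out-degree for every symbol.
for f in set("BCBADBC") | {"$"}:
    assert sum(c for (a, b), c in N_ff.items() if a == f) == \
           sum(c for (a, b), c in N_ff.items() if b == f)
```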
34 Similarly, we define language model matrices S for the unigram and the bigram case. [sent-90, score-0.207]
35 Figure 1: Linear sum assignment problem for a cipher with Ve = {a, b, c}, Vf = {A, B, C}, unigram counts Nf, and unigram probabilities p(e); the assignment costs are of the form $c_{fe} = N_f \log p(e)$. [sent-93, score-0.823]
36 4.2 The Linear Sum Assignment Problem The practical problem behind the linear sum assignment problem can be described as follows: Given jobs {j1, . . . , jn} and workers {w1, . . . , wn}, [sent-95, score-0.627]
37 the task is to assign each job ji to a worker wj. [sent-101, score-0.025]
38 Each assignment incurs a cost cij and the total cost for assigning all jobs and workers is to be minimized. [sent-102, score-0.527]
39 This can be formalized as finding the assignment $\hat{\phi} = \arg\min_{\phi} \sum_{i=1}^{n} c_{i\phi(i)}$ (18). The general LSAP can be solved in polynomial time using the Hungarian algorithm (Kuhn, 1955). [sent-103, score-0.393]
40 Sorting and then assigning the elements can be done in O(n log n). [sent-105, score-0.036]
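As a hedged illustration of Section 4 (not the authors' code), the unigram case can be handed directly to an off-the-shelf polynomial-time LSAP solver such as scipy's linear_sum_assignment. The toy vocabularies and probabilities below are invented for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def unigram_decipher(N_f, log_p, cipher_vocab, plain_vocab):
    """Maximize sum_f N_f * log p(phi(f)) over bijections phi by
    minimizing the LSAP cost matrix C[i, j] = -N_f[i] * log p(e_j)."""
    C = np.array([[-N_f[f] * log_p[e] for e in plain_vocab]
                  for f in cipher_vocab])
    rows, cols = linear_sum_assignment(C)  # polynomial-time LSAP solver
    return {cipher_vocab[i]: plain_vocab[j] for i, j in zip(rows, cols)}

phi = unigram_decipher(
    N_f={"A": 1, "B": 3, "C": 2},
    log_p={"a": np.log(0.5), "b": np.log(0.3), "c": np.log(0.2)},
    cipher_vocab=["A", "B", "C"], plain_vocab=["a", "b", "c"])
assert phi["B"] == "a"  # most frequent cipher symbol -> most probable letter
```

Because the cost matrix factors as $N_f \cdot \log p(e)$, sorting cipher symbols by frequency and plaintext symbols by probability already yields the optimum (rearrangement inequality), which is the O(n log n) shortcut mentioned above.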
41 5.1 Problem Definition When using a 2-gram language model, Equation 2 simplifies to $\hat{\phi} = \arg\max_{\phi} \prod_{j=1}^{N+1} p(\phi(f_j) \mid \phi(f_{j-1}))$ (19). Figure 2: Hypothetical quadratic assignment problem with locations l1 . . . [sent-107, score-0.716]
42 l4 and facilities f1 . . . f4, with all flows being zero except f1 ↔ f2 and f3 ↔ f4. [sent-113, score-0.049]
43 The distance between the locations l1 . . . l4 is implicitly given by their positions in the plane, implying a Euclidean metric. [sent-117, score-0.045]
44 However, he does not mention the close connection to the quadratic assignment problem. [sent-120, score-0.694]
45 5.2 The Quadratic Assignment Problem The quadratic assignment problem was introduced by (Beckmann and Koopmans, 1957) for the following real-life problem: Given a set of facilities {f1, . . . , fn}, a set of locations {l1, . . . , ln} with a distance for each pair of locations, and a flow for each pair of facilities (e.g. [sent-122, score-0.796]
46 the amount of supplies to be transported between a pair of facilities), the problem is to assign the facilities to locations such that the sum of the distances multiplied by the corresponding flows (which can be interpreted as total transportation costs) is minimized. [sent-130, score-0.284]
47 Following (Beckmann and Koopmans, 1957) we can express the quadratic assignment problem as finding $\hat{\phi} = \arg\min_{\phi} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{\phi(i)\phi(j)} + \sum_{i=1}^{n} c_{i\phi(i)}$ (21) where A = (aij), B = (bij), C = (cij) ∈ ℕ^{n×n} and φ is a permutation φ : {1, . . . , n} → {1, . . . , n}. [sent-132, score-0.723]
48 The so-called pure or homogeneous QAP $\hat{\phi} = \arg\min_{\phi} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{\phi(i)\phi(j)}$ (23) is obtained by setting cij = 0, and is often denoted as QAP(A, B). [sent-140, score-0.076]
49 In terms of the real-life problem presented in (Beckmann and Koopmans, 1957), the matrix A can be interpreted as the distance matrix for locations {l1, . . . , ln} and B as the flow matrix for facilities {f1, . . . , fn}. [sent-141, score-0.261]
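Since the QAP is NP-hard, there is no polynomial-time solver in general, but the objective of Equations (21) and (23) is easy to state; the following brute-force sketch (ours, feasible only for tiny n) makes the definition concrete.

```python
from itertools import permutations

def qap_bruteforce(A, B, C=None):
    """Exhaustively minimize Eq. (21):
    sum_ij A[i][j] * B[phi(i)][phi(j)]  (+ sum_i C[i][phi(i)] if C given).
    Exponential in n -- for illustration only."""
    n = len(A)
    best_cost, best_phi = float("inf"), None
    for phi in permutations(range(n)):
        cost = sum(A[i][j] * B[phi[i]][phi[j]]
                   for i in range(n) for j in range(n))
        if C is not None:
            cost += sum(C[i][phi[i]] for i in range(n))
        if cost < best_cost:
            best_cost, best_phi = cost, phi
    return best_phi, best_cost

# Pure QAP(A, B): high-flow facility pairs should land on nearby locations.
A = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]  # distances between locations
B = [[0, 9, 0], [9, 0, 1], [0, 1, 0]]  # flows between facilities
phi, cost = qap_bruteforce(A, B)
```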
50 (Sahni and Gonzalez, 1976) show that the quadratic assignment problem is strongly NP-hard. [sent-146, score-0.723]
51 We will now show the relation between the quadratic assignment problem and the decipherment problem. [sent-147, score-0.723]
52 5.3 Decipherment Problem → Quadratic Assignment Problem: Every decipherment problem is directly a quadratic assignment problem, since the matrices $N_{ff'}$ and $S_{ff'}$ are just special cases of the general matrices A and B required for the quadratic assignment problem. [sent-153, score-2.116]
53 Thus a reduction from the decipherment problem to the quadratic assignment problem is trivial. [sent-154, score-1.471]
54 This means that all algorithms capable of solving QAPs can directly be used to solve the decipherment problem. [sent-155, score-0.685]
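A minimal sketch of this trivial reduction (our illustration, reusing the qap_bruteforce helper above; a real system would plug in trained LM scores) is shown below: the bigram counts play the role of A and the negated log LM scores the role of B.

```python
def decipher_as_qap(ciphertext, lm_logprob, plain_vocab, qap_solver):
    """Bigram decipherment as a QAP (Section 5.3): maximizing
    sum_ff' N_ff' * S_phi(f)phi(f') equals minimizing QAP(A, B) with
    A = N_ff' and B = -S_ee'. For simplicity this sketch lets the
    solver permute the boundary token "$" as well, instead of pinning it."""
    seq = ["$"] + list(ciphertext) + ["$"]
    cipher_vocab = sorted(set(seq))
    plain = sorted(set(["$"] + list(plain_vocab)))
    idx = {f: i for i, f in enumerate(cipher_vocab)}
    n = len(cipher_vocab)
    A = [[0] * n for _ in range(n)]          # A = bigram counts N_ff'
    for f, g in zip(seq, seq[1:]):
        A[idx[f]][idx[g]] += 1
    B = [[-lm_logprob(e, e2) for e2 in plain] for e in plain]  # B = -S_ee'
    phi, _ = qap_solver(A, B)
    return {f: plain[phi[idx[f]]] for f in cipher_vocab}

# lm_logprob(e, e2) is assumed to return log p(e2 | e) under a bigram LM;
# the sketch also assumes equal cipher and plaintext vocabulary sizes,
# matching the 1:1 setting.
```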
55 5.4 Quadratic Assignment Problem → Decipherment Problem: Given QAP(A, B) with integer matrices A = (aij), B = (bij), i, j ∈ {1, . . . , n}, [sent-158, score-0.087]
56 we construct the count matrix $N_{ff'}$ and language model matrix $S_{ff'}$ in such a way that the solution for the decipherment problem implies the solution to the quadratic assignment problem, and vice versa. [sent-161, score-1.514]
57 Show that $S_{ff'}$ is a valid bigram language model matrix. [sent-173, score-0.112]
58 Show that the decipherment problem and the newly constructed quadratic assignment problem are equivalent. [sent-175, score-1.447]
59 We start by showing that $N_{ff'}$ is a valid count matrix: (a) By construction, $N_{ff'}$ has integer counts that are greater than or equal to 0. [sent-176, score-0.121]
60 (b) For all f ∈ {1, . . . , n} the count properties are equivalent to $a_{f*} + \sum_{f'} \tilde{a}_{ff'} = \tilde{a}_{*f} + \sum_{f'} \tilde{a}_{f'f} + \delta(f, 1)$ (24), which holds by construction of $a_{*f}$ and $a_{f*}$. [sent-180, score-0.126]
61 For f = n+1 the count property is equivalent to $1 + \sum_{f'} \tilde{a}_{f'*} = 2 + \sum_{f'} \tilde{a}_{*f'}$ (25), which follows from Equation (24) by summing over all f ∈ {1, . . . , n}. [sent-181, score-0.037]
62 We now show that $S_{ff'}$ is a valid bigram language model matrix: (a) By construction, $S_{ff'} \in [-\infty, 0]$ holds. [sent-186, score-0.112]
63 (c) By the construction of the values $\tilde{b}_{f*}$, the $S_{ff'}$ fulfill $\sum_{f'} \exp(S_{ff'}) = 1$ for all f. [sent-188, score-0.037]
64 We now show the equivalence of the quadratic assignment problem and the newly constructed decipherment problem. [sent-190, score-1.371]
65 For this we will use the definitions $\tilde{A} = \{1, . . . , n\}$ (26) [sent-191, score-0.041]
66 and $\tilde{B} = \{n + 1, n + 2, n + 3\}$ (27). We first show that solutions of the constructed decipherment problem with score > −∞ fulfill φ(f) = f for all f ∈ $\tilde{B}$: all mappings φ with φ(f) = f' for any f ∈ $\tilde{B}$ and f' ∈ $\tilde{A}$ will induce a score of −∞. [sent-194, score-0.752]
67 Figure 3: Construction of matrices $N_{ff'}$ and $S_{ff'}$ of the decipherment problem from matrices A = (aij) and B = (bij) of the quadratic assignment problem QAP(A, B), using $\tilde{b}_{ff'} = b_{ff'} - \max_{\tilde{f}\tilde{f}'}\{b_{\tilde{f}\tilde{f}'}\} - \log(n + 2)$, $\tilde{b}_{f*} = \log\big(1 - \sum_{f'=1}^{n} \exp(\tilde{b}_{ff'}) - \frac{2}{n + 2}\big)$, and $\varepsilon_i = -\log(n + i)$. [sent-214, score-1.545]
68 Assuming that calculating elementary functions can be done in O(1), setting up $N_{ff'}$ and $S_{ff'}$ can be done in polynomial time. [sent-216, score-0.04]
69 Thus we have given a polynomial time reduction from the quadratic assignment problem to [footnote 2: this holds if we only require a fixed number of digits of precision for the log and exp operations] [sent-217, score-0.823]
70 the decipherment problem: Since the quadratic assignment problem is NP-hard, it follows that the decipherment problem is NP-hard, too. [sent-218, score-2.095]
71 5.5 Traveling Salesman Problem → Decipherment Problem: Using the above construction we can immediately construct a decipherment problem that is equivalent to the traveling salesman problem by using the quadratic assignment problem formulation of the traveling salesman problem. [sent-221, score-1.915]
72 Without loss of generality³ we assume that the TSP’s distance matrix fulfills the constraints of a bigram language model matrix $S_{ff'}$. [sent-222, score-0.261]
73 Then the count matrix $N_{ff'}$ needs to be chosen as the path matrix with $N_{ff'} = 1$ exactly when f' directly follows f in the sequence $abcd$, and $N_{ff'} = 0$ otherwise (35), [sent-223, score-0.107]
74 which fulfills the constraints of a bigram count matrix. [sent-230, score-0.158]
75 ³The general case can be covered using the reduction shown in Section 5.4. [sent-231, score-0.024]
76 This matrix corresponds to a ciphertext of the form $abcd$ (36) and represents the tour of the traveling salesman in an intuitive way. [sent-232, score-0.32]
77 The mapping φ then only decides in which order the cities are visited, and only costs between two successive cities are counted. [sent-233, score-0.056]
78 This shows that the TSP is only a special case of the decipherment problem. [sent-234, score-0.648]
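To see the collapse of the QAP objective into a tour cost, note that with the path count matrix of Equation (35) only pairs of successively visited cities contribute. The following sketch (ours) checks this on a toy distance matrix standing in for $-S_{ff'}$.

```python
def tour_cost(distance, order):
    """With N_ff' = 1 exactly when f' directly follows f in $abcd$,
    the QAP sum over all symbol pairs reduces to the cost along the tour."""
    return sum(distance[order[i]][order[i + 1]] for i in range(len(order) - 1))

D = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]  # three cities on a line
assert tour_cost(D, [0, 1, 2]) == 2    # visiting in order 0-1-2
assert tour_cost(D, [0, 2, 1]) == 3    # a longer tour
```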
79 For a bigram language model, the decipherment problem is equivalent to the quadratic assignment problem and is NP-hard. [sent-236, score-1.567]
80 We also pointed out that all available algorithms for the quadratic assignment problem can be directly used to solve the decipherment problem. [sent-237, score-1.371]
81 To the best of our knowledge, this correspondence between the decipherment problem and the quadratic assignment problem has not been known previous to our work. [sent-238, score-1.447]
wordName wordTfidf (topN-words)
[('decipherment', 0.648), ('assignment', 0.353), ('quadratic', 0.294), ('vf', 0.188), ('xf', 0.16), ('qap', 0.141), ('cipher', 0.114), ('beckmann', 0.1), ('nf', 0.094), ('bigram', 0.089), ('ciphertext', 0.088), ('fx', 0.086), ('substitution', 0.083), ('plaintext', 0.081), ('salesman', 0.081), ('traveling', 0.081), ('koopmans', 0.08), ('ciphers', 0.076), ('cij', 0.076), ('problem', 0.076), ('fn', 0.073), ('facilities', 0.073), ('deciphering', 0.073), ('burkard', 0.07), ('matrix', 0.07), ('unigram', 0.069), ('xn', 0.068), ('bij', 0.06), ('sf', 0.054), ('af', 0.054), ('arg', 0.054), ('jobs', 0.052), ('equation', 0.05), ('matrices', 0.049), ('tsp', 0.049), ('flows', 0.049), ('fj', 0.048), ('connection', 0.047), ('knight', 0.047), ('workers', 0.046), ('nuhn', 0.046), ('locations', 0.045), ('ravi', 0.044), ('aij', 0.044), ('sum', 0.041), ('definitions', 0.041), ('eranda', 0.04), ('lsap', 0.04), ('sahni', 0.04), ('vfexp', 0.04), ('polynomial', 0.04), ('ve', 0.038), ('integer', 0.038), ('construction', 0.037), ('count', 0.037), ('solving', 0.037), ('implies', 0.036), ('log', 0.036), ('argmax', 0.036), ('corlett', 0.035), ('bz', 0.035), ('rainer', 0.035), ('bb', 0.032), ('fulfills', 0.032), ('logp', 0.032), ('substitute', 0.031), ('equivalent', 0.031), ('ela', 0.031), ('aachen', 0.031), ('malte', 0.031), ('xi', 0.029), ('linear', 0.029), ('az', 0.029), ('kevin', 0.028), ('cities', 0.028), ('dou', 0.028), ('fulfill', 0.028), ('token', 0.027), ('mf', 0.027), ('xz', 0.027), ('lo', 0.026), ('sujith', 0.026), ('kluwer', 0.025), ('hungarian', 0.025), ('ji', 0.025), ('boundary', 0.025), ('bauer', 0.024), ('reduction', 0.024), ('simplifies', 0.024), ('counts', 0.023), ('valid', 0.023), ('lms', 0.023), ('rewritten', 0.023), ('nx', 0.023), ('ln', 0.023), ('bj', 0.022), ('sorting', 0.022), ('max', 0.021), ('holds', 0.021), ('publications', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999911 109 acl-2013-Decipherment Complexity in 1:1 Substitution Ciphers
Author: Malte Nuhn ; Hermann Ney
Abstract: In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before.
2 0.45453888 66 acl-2013-Beam Search for Solving Substitution Ciphers
Author: Malte Nuhn ; Julian Schamper ; Hermann Ney
Abstract: In this paper we address the problem of solving substitution ciphers using a beam search approach. We present a conceptually consistent and easy to implement method that improves the current state of the art for decipherment of substitution ciphers and is able to use high order n-gram language models. We show experiments with 1:1 substitution ciphers in which the guaranteed optimal solution for 3-gram language models has 38.6% decipherment error, while our approach achieves 4.13% decipherment error in a fraction of time by using a 6-gram language model. We also apply our approach to the famous Zodiac-408 cipher and obtain slightly better (and near to optimal) results than previously published. Unlike the previous state-of-the-art approach that uses additional word lists to evaluate possible decipherments, our approach only uses a letter-based 6-gram language model. Furthermore we use our algorithm to solve large vocabulary substitution ciphers and improve the best published decipherment error rate based on the Gigaword corpus of 7.8% to 6.0% error rate.
3 0.31208256 108 acl-2013-Decipherment
Author: Kevin Knight
Abstract: The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. This tutorial explores connections between early decipherment research and today’s NLP work. We cover classic military and diplomatic ciphers, automatic decipherment algorithms, unsolved ciphers, language translation as decipherment, and analyzing ancient writing as decipherment. 1 Tutorial Overview The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. Sixty years later, we have many more applications, including web search, question answering, summarization, speech recognition, and language translation. This tutorial explores connections between early decipherment research and today’s NLP work. We find that many ideas from the earlier era have become core to the field, while others still remain to be picked up and developed. We first cover classic military and diplomatic cipher types, including complex substitution ciphers implemented in the first electro-mechanical encryption machines. We look at mathematical tools (language recognition, frequency counting, smoothing) developed to decrypt such ciphers on proto-computers. We show algorithms and extensive empirical results for solving different types of ciphers, and we show the role of algorithms in recent decipherments of historical documents. We then look at how foreign language can be viewed as a code for English, a concept developed by Alan Turing and Warren Weaver. We describe recently published work on building automatic translation systems from non-parallel data. We also demonstrate how some of the same algorithmic tools can be applied to natural language tasks like part-of-speech tagging and word alignment. Turning back to historical ciphers, we explore a number of unsolved ciphers, giving results of initial computer experiments on several of them. Finally, we look briefly at writing as a way to encipher phoneme sequences, covering ancient scripts and modern applications. 2 Outline 1. Classical military/diplomatic ciphers (15 minutes) • 60 cipher types (ACA) • Ciphers vs. codes • Enigma cipher: the mother of natural language processing – computer analysis of text – language recognition – Good-Turing smoothing 2. Foreign language as a code (10 minutes) • Alan Turing’s ”Thinking Machines” • Warren Weaver’s Memorandum 3. Automatic decipherment (55 minutes) • Cipher type detection • Substitution ciphers (simple, homophonic, polyalphabetic, etc) – plaintext language recognition ∗ how much plaintext knowledge is needed ∗ index of coincidence, unicity distance, and other measures – navigating a difficult search space ∗ frequencies of letters and words ∗ pattern words and cribs ∗ EM, ILP, Bayesian models, sampling – recent decipherments ∗ Jefferson cipher, Copiale cipher, civil war ciphers, naval Enigma • Application to part-of-speech tagging, word alignment • Application to machine translation without parallel text • Parallel development of cryptography and translation • Recently released NSA internal newsletter (1974-1997) 4. *** Break *** (30 minutes) 5.
Unsolved ciphers (40 minutes) • Zodiac 340 (1969), including computational work • Voynich Manuscript (early 1400s), including computational work • Beale (1885) • Dorabella (1897) • Taman Shud (1948) • Kryptos (1990), including computational work • McCormick (1999) • Shoeboxes in attics: DuPonceau journal, Finnerana, SYP, Mopse, diptych 6. Writing as a code (20 minutes) • Does writing encode ideas, or does it encode phonemes? • Ancient script decipherment – Egyptian hieroglyphs – Linear B – Mayan glyphs – Ugaritic, including computational work – Chinese Nüshu, including computational work • Automatic phonetic decipherment • Application to transliteration 7. Undeciphered writing systems (15 minutes) • Indus Valley Script (3300BC) • Linear A (1900BC) • Phaistos disc (1700BC?) • Rongorongo (1800s?) 8. Conclusion and further questions (15 minutes) 3 About the Presenter Kevin Knight is a Senior Research Scientist and Fellow at the Information Sciences Institute of the University of Southern California (USC), and a Research Professor in USC’s Computer Science Department. He received a PhD in computer science from Carnegie Mellon University and a bachelor’s degree from Harvard University. Professor Knight’s research interests include natural language processing, machine translation, automata theory, and decipherment. In 2001, he co-founded Language Weaver, Inc., and in 2011, he served as President of the Association for Computational Linguistics. Dr. Knight has taught computer science courses at USC for more than fifteen years and co-authored the widely adopted textbook Artificial Intelligence.
4 0.30118826 307 acl-2013-Scalable Decipherment for Machine Translation via Hash Sampling
Author: Sujith Ravi
Abstract: In this paper, we propose a new Bayesian inference method to train statistical machine translation systems using only nonparallel corpora. Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models. In order to perform fast, efficient Bayesian inference in this framework, we then derive a hash sampling strategy that is inspired by the work of Ahmed et al. (2012). The new translation hash sampler enables us to scale elegantly to complex models (for the first time) and large vocabulary/corpora sizes. We show empirical results on the OPUS data—our method yields the best BLEU scores compared to existing approaches, while achieving significant computational speedups (several orders faster). We also report for the first time—BLEU score results for a large-scale MT task using only non-parallel data (EMEA corpus).
5 0.13942493 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages
Author: Young-Bum Kim ; Benjamin Snyder
Abstract: In this paper, we present a solution to one aspect of the decipherment task: the prediction of consonants and vowels for an unknown language and alphabet. Adopting a classical Bayesian perspective, we perform posterior inference over hundreds of languages, leveraging knowledge of known languages and alphabets to uncover general linguistic patterns of typologically coherent language clusters. We achieve average accuracy in the unsupervised consonant/vowel prediction task of 99% across 503 languages. We further show that our methodology can be used to predict more fine-grained phonetic distinctions. On a three-way classification task between vowels, nasals, and non-nasal consonants, our model yields unsupervised accuracy of 89% across the same set of languages.
6 0.085682541 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
7 0.082204908 304 acl-2013-SEMILAR: The Semantic Similarity Toolkit
8 0.061365202 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
9 0.059708834 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
10 0.056196686 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization
11 0.055365007 37 acl-2013-Adaptive Parser-Centric Text Normalization
12 0.048574876 263 acl-2013-On the Predictability of Human Assessment: when Matrix Completion Meets NLP Evaluation
13 0.046850454 237 acl-2013-Margin-based Decomposed Amortized Inference
14 0.039976198 247 acl-2013-Modeling of term-distance and term-occurrence information for improving n-gram language model performance
15 0.039975762 370 acl-2013-Unsupervised Transcription of Historical Documents
16 0.039273556 362 acl-2013-Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
17 0.039051399 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
18 0.037901103 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
19 0.033798415 388 acl-2013-Word Alignment Modeling with Context Dependent Deep Neural Network
20 0.033658698 6 acl-2013-A Java Framework for Multilingual Definition and Hypernym Extraction
topicId topicWeight
[(0, 0.099), (1, -0.037), (2, 0.047), (3, 0.006), (4, -0.002), (5, -0.028), (6, 0.047), (7, -0.003), (8, -0.132), (9, -0.054), (10, -0.022), (11, -0.135), (12, -0.108), (13, -0.306), (14, 0.033), (15, -0.422), (16, -0.06), (17, -0.147), (18, -0.088), (19, 0.272), (20, 0.023), (21, 0.064), (22, -0.171), (23, -0.146), (24, 0.064), (25, 0.048), (26, 0.104), (27, -0.029), (28, -0.123), (29, -0.035), (30, 0.002), (31, 0.049), (32, 0.035), (33, -0.018), (34, -0.027), (35, -0.026), (36, 0.014), (37, -0.041), (38, 0.098), (39, -0.015), (40, 0.038), (41, 0.0), (42, 0.036), (43, 0.003), (44, -0.006), (45, -0.009), (46, -0.009), (47, -0.037), (48, -0.016), (49, -0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.97214884 109 acl-2013-Decipherment Complexity in 1:1 Substitution Ciphers
Author: Malte Nuhn ; Hermann Ney
Abstract: In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before.
2 0.94045103 108 acl-2013-Decipherment
Author: Kevin Knight
Abstract: The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. This tutorial explores connections between early decipherment research and today’s NLP work. We cover classic military and diplomatic ciphers, automatic decipherment algorithms, unsolved ciphers, language translation as decipherment, and analyzing ancient writing as decipherment. 1 Tutorial Overview The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. Sixty years later, we have many more applications, including web search, question answering, summarization, speech recognition, and language translation. This tutorial explores connections between early decipherment research and today’s NLP work. We find that many ideas from the earlier era have become core to the field, while others still remain to be picked up and developed. We first cover classic military and diplomatic cipher types, including complex substitution ciphers implemented in the first electro-mechanical encryption machines. We look at mathematical tools (language recognition, frequency counting, smoothing) developed to decrypt such ciphers on proto-computers. We show algorithms and extensive empirical results for solving different types of ciphers, and we show the role of algorithms in recent decipherments of historical documents. We then look at how foreign language can be viewed as a code for English, a concept developed by Alan Turing and Warren Weaver. We describe recently published work on building automatic translation systems from non-parallel data. We also demonstrate how some of the same algorithmic tools can be applied to natural language tasks like part-of-speech tagging and word alignment. Turning back to historical ciphers, we explore a number of unsolved ciphers, giving results of initial computer experiments on several of them. Finally, we look briefly at writing as a way to encipher phoneme sequences, covering ancient scripts and modern applications. 2 Outline 1. Classical military/diplomatic ciphers (15 minutes) • 60 cipher types (ACA) • Ciphers vs. codes • Enigma cipher: the mother of natural language processing – computer analysis of text – language recognition – Good-Turing smoothing 2. Foreign language as a code (10 minutes) • Alan Turing’s ”Thinking Machines” • Warren Weaver’s Memorandum 3. Automatic decipherment (55 minutes) • Cipher type detection • Substitution ciphers (simple, homophonic, polyalphabetic, etc) – plaintext language recognition ∗ how much plaintext knowledge is needed ∗ index of coincidence, unicity distance, and other measures – navigating a difficult search space ∗ frequencies of letters and words ∗ pattern words and cribs ∗ EM, ILP, Bayesian models, sampling – recent decipherments ∗ Jefferson cipher, Copiale cipher, civil war ciphers, naval Enigma • Application to part-of-speech tagging, word alignment • Application to machine translation without parallel text • Parallel development of cryptography and translation • Recently released NSA internal newsletter (1974-1997) 4. *** Break *** (30 minutes) 5.
Unsolved ciphers (40 minutes) • Zodiac 340 (1969), including computational work • Voynich Manuscript (early 1400s), including computational work • Beale (1885) • Dorabella (1897) • Taman Shud (1948) • Kryptos (1990), including computational work • McCormick (1999) • Shoeboxes in attics: DuPonceau journal, Finnerana, SYP, Mopse, diptych 6. Writing as a code (20 minutes) • Does writing encode ideas, or does it encode phonemes? • Ancient script decipherment – Egyptian hieroglyphs – Linear B – Mayan glyphs – Ugaritic, including computational work – Chinese Nüshu, including computational work • Automatic phonetic decipherment • Application to transliteration 7. Undeciphered writing systems (15 minutes) • Indus Valley Script (3300BC) • Linear A (1900BC) • Phaistos disc (1700BC?) • Rongorongo (1800s?) 8. Conclusion and further questions (15 minutes) 3 About the Presenter Kevin Knight is a Senior Research Scientist and Fellow at the Information Sciences Institute of the University of Southern California (USC), and a Research Professor in USC’s Computer Science Department. He received a PhD in computer science from Carnegie Mellon University and a bachelor’s degree from Harvard University. Professor Knight’s research interests include natural language processing, machine translation, automata theory, and decipherment. In 2001, he co-founded Language Weaver, Inc., and in 2011, he served as President of the Association for Computational Linguistics. Dr. Knight has taught computer science courses at USC for more than fifteen years and co-authored the widely adopted textbook Artificial Intelligence.
3 0.90794593 66 acl-2013-Beam Search for Solving Substitution Ciphers
Author: Malte Nuhn ; Julian Schamper ; Hermann Ney
Abstract: In this paper we address the problem of solving substitution ciphers using a beam search approach. We present a conceptually consistent and easy to implement method that improves the current state of the art for decipherment of substitution ciphers and is able to use high order n-gram language models. We show experiments with 1:1 substitution ciphers in which the guaranteed optimal solution for 3-gram language models has 38.6% decipherment error, while our approach achieves 4.13% decipherment error in a fraction of time by using a 6-gram language model. We also apply our approach to the famous Zodiac-408 cipher and obtain slightly better (and near to optimal) results than previously published. Unlike the previous state-of-the-art approach that uses additional word lists to evaluate possible decipherments, our approach only uses a letter-based 6-gram language model. Furthermore we use our algorithm to solve large vocabulary substitution ciphers and improve the best published decipherment error rate based on the Gigaword corpus of 7.8% to 6.0% error rate.
4 0.4282777 307 acl-2013-Scalable Decipherment for Machine Translation via Hash Sampling
Author: Sujith Ravi
Abstract: In this paper, we propose a new Bayesian inference method to train statistical machine translation systems using only nonparallel corpora. Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models. In order to perform fast, efficient Bayesian inference in this framework, we then derive a hash sampling strategy that is inspired by the work of Ahmed et al. (2012). The new translation hash sampler enables us to scale elegantly to complex models (for the first time) and large vocabulary/corpora sizes. We show empirical results on the OPUS data—our method yields the best BLEU scores compared to existing approaches, while achieving significant computational speedups (several orders faster). We also report for the first time—BLEU score results for a large-scale MT task using only non-parallel data (EMEA corpus).
5 0.33084983 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages
Author: Young-Bum Kim ; Benjamin Snyder
Abstract: In this paper, we present a solution to one aspect of the decipherment task: the prediction of consonants and vowels for an unknown language and alphabet. Adopting a classical Bayesian perspective, we perform posterior inference over hundreds of languages, leveraging knowledge of known languages and alphabets to uncover general linguistic patterns of typologically coherent language clusters. We achieve average accuracy in the unsupervised consonant/vowel prediction task of 99% across 503 languages. We further show that our methodology can be used to predict more fine-grained phonetic distinctions. On a three-way classification task between vowels, nasals, and non-nasal consonants, our model yields unsupervised accuracy of 89% across the same set of languages.
6 0.28186971 370 acl-2013-Unsupervised Transcription of Historical Documents
7 0.22773798 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
8 0.2053792 382 acl-2013-Variational Inference for Structured NLP Models
9 0.19990014 171 acl-2013-Grammatical Error Correction Using Integer Linear Programming
10 0.19422358 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
11 0.1852854 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures
12 0.18140072 237 acl-2013-Margin-based Decomposed Amortized Inference
13 0.18065643 354 acl-2013-Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment
14 0.17374611 1 acl-2013-"Let Everything Turn Well in Your Wife": Generation of Adult Humor Using Lexical Constraints
15 0.1724041 308 acl-2013-Scalable Modified Kneser-Ney Language Model Estimation
16 0.16639216 279 acl-2013-PhonMatrix: Visualizing co-occurrence constraints of sounds
17 0.15587273 334 acl-2013-Supervised Model Learning with Feature Grouping based on a Discrete Constraint
18 0.1449679 349 acl-2013-The mathematics of language learning
19 0.14403147 381 acl-2013-Variable Bit Quantisation for LSH
20 0.14345707 54 acl-2013-Are School-of-thought Words Characterizable?
topicId topicWeight
[(0, 0.087), (6, 0.038), (8, 0.212), (11, 0.037), (15, 0.032), (24, 0.026), (26, 0.035), (28, 0.012), (35, 0.047), (42, 0.039), (48, 0.057), (64, 0.012), (70, 0.037), (88, 0.019), (95, 0.211)]
simIndex simValue paperId paperTitle
same-paper 1 0.85286438 109 acl-2013-Decipherment Complexity in 1:1 Substitution Ciphers
Author: Malte Nuhn ; Hermann Ney
Abstract: In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before.
2 0.75076509 66 acl-2013-Beam Search for Solving Substitution Ciphers
Author: Malte Nuhn ; Julian Schamper ; Hermann Ney
Abstract: In this paper we address the problem of solving substitution ciphers using a beam search approach. We present a conceptually consistent and easy to implement method that improves the current state of the art for decipherment of substitution ciphers and is able to use high order n-gram language models. We show experiments with 1:1 substitution ciphers in which the guaranteed optimal solution for 3-gram language models has 38.6% decipherment error, while our approach achieves 4.13% decipherment error in a fraction of time by using a 6-gram language model. We also apply our approach to the famous Zodiac-408 cipher and obtain slightly better (and near to optimal) results than previously published. Unlike the previous state-of-the-art approach that uses additional word lists to evaluate possible decipherments, our approach only uses a letter-based 6-gram language model. Furthermore we use our algorithm to solve large vocabulary substitution ciphers and improve the best published decipherment error rate based on the Gigaword corpus of 7.8% to 6.0% error rate.
3 0.74579298 37 acl-2013-Adaptive Parser-Centric Text Normalization
Author: Congle Zhang ; Tyler Baldwin ; Howard Ho ; Benny Kimelfeld ; Yunyao Li
Abstract: Text normalization is an important first step towards enabling many Natural Language Processing (NLP) tasks over informal text. While many of these tasks, such as parsing, perform the best over fully grammatically correct text, most existing text normalization approaches narrowly define the task in the word-to-word sense; that is, the task is seen as that of mapping all out-of-vocabulary non-standard words to their in-vocabulary standard forms. In this paper, we take a parser-centric view of normalization that aims to convert raw informal text into grammatically correct text. To understand the real effect of normalization on the parser, we tie normalization performance directly to parser performance. Additionally, we design a customizable framework to address the often overlooked concept of domain adaptability, and illustrate that the system allows for transfer to new domains with a minimal amount of data and effort. Our experimental study over datasets from three domains demonstrates that our approach outperforms not only the state-of-the-art word-to-word normalization techniques, but also manual word-to-word annotations.
4 0.74514562 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
Author: Silvana Hartmann ; Iryna Gurevych
Abstract: We present a new bilingual FrameNet lexicon for English and German. It is created through a simple, but powerful approach to construct a FrameNet in any language using Wiktionary as an interlingual representation. Our approach is based on a sense alignment of FrameNet and Wiktionary, and subsequent translation disambiguation into the target language. We perform a detailed evaluation of the created resource and a discussion of Wiktionary as an interlingual connection for the cross-language transfer of lexical-semantic resources. The created resource is publicly available at http://www.ukp.tu-darmstadt.de/fnwkde/.
5 0.73917896 289 acl-2013-QuEst - A translation quality estimation framework
Author: Lucia Specia ; Kashif Shah ; Jose G.C. de Souza ; Trevor Cohn
Abstract: We describe QUEST, an open source framework for machine translation quality estimation. The framework allows the extraction of several quality indicators from source segments, their translations, external resources (corpora, language models, topic models, etc.), as well as language tools (parsers, part-of-speech tags, etc.). It also provides machine learning algorithms to build quality estimation models. We benchmark the framework on a number of datasets and discuss the efficacy of features and algorithms.
6 0.73603982 336 acl-2013-Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
7 0.72902131 256 acl-2013-Named Entity Recognition using Cross-lingual Resources: Arabic as an Example
9 0.72388899 326 acl-2013-Social Text Normalization using Contextual Graph Random Walks
10 0.72285587 359 acl-2013-Translating Dialectal Arabic to English
11 0.71484035 195 acl-2013-Improving machine translation by training against an automatic semantic frame based evaluation metric
12 0.71090227 374 acl-2013-Using Context Vectors in Improving a Machine Translation System with Bridge Language
13 0.71060616 255 acl-2013-Name-aware Machine Translation
14 0.70855534 135 acl-2013-English-to-Russian MT evaluation campaign
15 0.70738846 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art
16 0.70526773 97 acl-2013-Cross-lingual Projections between Languages from Different Families
17 0.70497751 13 acl-2013-A New Syntactic Metric for Evaluation of Machine Translation
18 0.70481277 240 acl-2013-Microblogs as Parallel Corpora
19 0.70100623 317 acl-2013-Sentence Level Dialect Identification in Arabic
20 0.6988979 120 acl-2013-Dirt Cheap Web-Scale Parallel Text from the Common Crawl