acl acl2012 acl2012-148 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Arianna Bisazza ; Marcello Federico
Abstract: This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. The proposed techniques are tested on Arabic-English and German-English using well-known SMT benchmarks.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. [sent-2, score-0.364]
2 We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. [sent-3, score-1.006]
3 Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. [sent-4, score-0.296]
4 Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. [sent-5, score-0.794]
5 In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. [sent-6, score-0.418]
6 Lying between these two extremes are language pairs where most of the reordering happens locally, and where long reorderings can be isolated and described by a handful of linguistic rules. [sent-13, score-0.917]
7 Interestingly, on these pairs, PSMT generally prevails over tree-based SMT (see footnote 1), producing overall high-quality outputs and isolated but critical reordering errors that undermine the global sentence meaning. [sent-15, score-0.553]
8 Previous work on this type of language pair has mostly focused on source reordering prior to translation (Xia and McCord, 2004; Collins et al. [sent-16, score-0.671]
9 , 2005), or on sophisticated reordering models integrated into decoding (Koehn et al. [sent-17, score-0.601]
10 Added to the usual space of local permutations defined by a low distortion limit (DL), this results in a linguistically informed definition of the search space that simplifies the task of the in-decoder reordering model, besides decreasing its complexity. [sent-20, score-1.097]
11 After reviewing a selection of relevant works, we analyze salient reordering patterns in Arabic-English and German-English, and describe the corresponding chunk-based reordering rule sets. [sent-22, score-1.202]
12 In the following sections we present a reordering selection technique based on reordered n-gram LMs (footnote 1: a good comparison of phrase-based and tree-based approaches across language pairs with different reordering levels can be found in Zollmann et al.). [sent-23, score-1.132]
13 Finally, we explain the notion of modified distortion matrices. [sent-27, score-0.577]
14 , 2005; Habash, 2007); non-deterministic reordering encodes multiple alternative reorderings into a word lattice and lets a monotonic decoder find the best path according to its models (Zhang et al. [sent-30, score-0.916]
15 The latter approaches are ideally conceived as an alternative to in-decoding reordering, and therefore require an exhaustive reordering rule set. [sent-32, score-0.609]
16 This yields sparse reordering lattices that can be translated with a regular decoder performing additional reordering. [sent-35, score-0.678]
17 Similarly to hybrid approaches, in this work we use a few linguistically informed rules to generate multiple reorderings for selected phenomena but, unlike them, we do not employ lattices to represent them. [sent-39, score-0.423]
18 We also include a competitive in-decoding reordering model in all the systems used to evaluate our methods. [sent-40, score-0.553]
19 Another large body of work is devoted to the modeling of reordering decisions inside decoding, based on a decomposition of the problem into a sequence of basic reordering steps. [sent-41, score-1.159]
20 Existing approaches range from basic linear distortion to more complex models that are conditioned on the words being translated. [sent-42, score-0.376]
21 (2010) tried to improve it with a future distortion cost estimate. [sent-47, score-0.412]
22 These models are known to handle local reordering well and are widely adopted by the PSMT community. [sent-51, score-0.553]
23 However, they are unsuitable for modeling long reordering, as they classify as "discontinuous" every phrase that does not immediately follow or precede the last translated one. [sent-52, score-0.689]
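To make the "discontinuous" limitation concrete, here is a minimal sketch (not from the paper; the function name and span representation are illustrative) of how a phrase orientation model typically classifies a new phrase relative to the last translated one:

```python
def orientation(last_span, next_span):
    """Classify a newly translated source phrase relative to the
    previously translated one, in the standard monotone/swap/
    discontinuous scheme. Spans are (start, end) source-word
    indices, end exclusive."""
    last_start, last_end = last_span
    next_start, next_end = next_span
    if next_start == last_end:   # immediately follows -> monotone
        return "monotone"
    if next_end == last_start:   # immediately precedes -> swap
        return "swap"
    return "discontinuous"       # any other jump, short or long

# A 3-word jump and a 10-word jump fall into the same class,
# which is exactly why long reordering is poorly modeled:
assert orientation((0, 2), (5, 7)) == "discontinuous"
assert orientation((0, 2), (12, 14)) == "discontinuous"
```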
24 Lexicalized distortion models predict the jump from the last translated word to the next one, with a class for each possible jump length (Al-Onaizan and Papineni, 2006), or bin of lengths (Green et al. [sent-53, score-0.607]
25 This method does not directly model reordering decisions, but rather word sequences produced by them. [sent-58, score-0.553]
26 Attempting to improve the reordering space definition, Yahyaei and Monz (2010) train a classifier to guess the most likely jump length at each source position, then use its predictions to dynamically set the DL. [sent-61, score-0.666]
27 Modifying the distortion function, as proposed in this paper, makes it possible to expand the permutation search space by a much finer degree than varying the DL does. [sent-63, score-0.418]
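A rough way to see the "finer degree" claim is to count the word permutations admitted by a distortion limit, with and without a sentence-specific shortcut. This is an illustrative brute-force sketch, not the decoder's actual search; the function name and the shortcut costs (0 forward, 2 backward) follow the matrix-modification scheme described later in the paper.

```python
from itertools import permutations

def count_admitted(n, dl, shortcuts=frozenset()):
    """Count permutations of n source positions in which every jump,
    including the initial one from before the sentence, respects the
    distortion limit. Pairs in `shortcuts` are charged the minimal
    cost instead (0 forward, 2 backward), mimicking modified entries
    of the distortion matrix."""
    def cost(a, b):
        if (a, b) in shortcuts:
            return 0 if a < b else 2
        return abs(b - a - 1)
    total = 0
    for p in permutations(range(n)):
        steps = [(-1, p[0])] + list(zip(p, p[1:]))
        if all(cost(a, b) <= dl for a, b in steps):
            total += 1
    return total

print(count_admitted(6, dl=2))   # small local space
print(count_admitted(6, dl=5))   # raising the DL: coarse, wholesale expansion
print(count_admitted(6, dl=2, shortcuts={(0, 4), (5, 1)}))  # one targeted long jump
```

Raising the DL admits whole bands of new permutations at once, whereas a shortcut admits only the specific long jump it encodes.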
28 3 Long reordering patterns Our study focuses on Arabic-English and German-English: two language pairs characterized by uneven distributions of word-reordering phenomena, with long-range movements concentrating on a few patterns. [sent-64, score-0.553]
29 In Arabic-English, the internal order of most noun phrases needs to be reversed during translation, which is generally well handled by phrase-internal reordering or local distortion. [sent-65, score-0.553]
30 Thanks to sophisticated reordering models, state-of-the-art PSMT systems are generally good at handling local reordering phenomena that are not captured by phrase-internal reordering. [sent-72, score-1.106]
31 We believe this is not primarily the fault of the reordering models, but rather of an overly coarse definition of the search space. [sent-74, score-0.553]
32 We will now describe two rule sets aimed at capturing these reordering phenomena. [sent-81, score-0.582]
33 4 Shallow syntax reordering rules To compute the source reorderings, we use chunkbased rules following Bisazza and Federico (2010). [sent-82, score-0.769]
34 Shallow syntax chunking is indeed a lighter and simpler task compared to full parsing, and it can be used to constrain the number of reorderings in a softer way. [sent-83, score-0.349]
35 Besides defining a unique segmentation of the sentence, chunk annotation provides other useful information that can be used by the rules, namely chunk type and POS tags. [sent-85, score-0.308]
36 5%) of the verb reorderings observed in a parallel news corpus, including those where the verb must be moved along with an adverbial or a complement. [sent-90, score-0.455]
37 Footnote 4: A similar rule set was previously used to produce chunk reordering lattices in (Hardmeier et al. [sent-94, score-0.753]
38 Figure 1: Examples of chunk permutations generated by shallow syntax reordering rules. [sent-126, score-0.91]
39 The application of chunk reordering rules is illustrated by Fig. [sent-130, score-0.736]
40 Here, the rules generate 3 permutations for the chunk sequence 2 to 5, corresponding to likely locations of the merged verb phrase, the first being optimal. [sent-134, score-0.447]
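A toy sketch of this kind of rule application (the paper's actual rules also condition on chunk types and POS tags; the rule below only moves the chunk and all names are illustrative):

```python
def verb_chunk_permutations(chunks, verb_idx, max_candidates=3):
    """Toy chunk-based rule: propose moving the verb chunk at
    `verb_idx` into each of the few preceding slots of its clause,
    yielding one chunk permutation per candidate landing site."""
    perms = []
    for target in range(max(0, verb_idx - max_candidates), verb_idx):
        p = list(range(len(chunks)))
        p.remove(verb_idx)
        p.insert(target, verb_idx)
        perms.append(p)
    return perms

chunks = ["[C weil]", "[NC der Mann]", "[NC das Buch]",
          "[ADV heute]", "[VC gelesen hat]"]
for p in verb_chunk_permutations(chunks, verb_idx=4):
    print(" ".join(chunks[i] for i in p))
# one of the three outputs:
# [C weil] [NC der Mann] [VC gelesen hat] [NC das Buch] [ADV heute]
```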
41 Empirically, this yields on average 22 reorderings per sentence in the NIST-MT Arabic benchmark dev06-NW and 3 on the WMT German benchmark test08. [sent-140, score-0.319]
42 Arabic rules are indeed noisier, which is not surprising, as reordering is triggered by any verb chunk. [sent-141, score-0.679]
43 5 Reordering selection The number of chunk-based reorderings per sentence varies according to the rule set, the size of the chunks, and the context. [sent-142, score-0.44]
44 A high degree of fuzziness can complicate the decoding process, leaving too much work to the in-decoding reordering model. [sent-143, score-0.601]
45 A solution to this problem is to use an external model to score the rule-generated reorderings and discard the less probable ones. [sent-144, score-0.319]
46 In this way, a further part of the reordering complexity is taken out of decoding. [sent-145, score-0.553]
47 Chunk-based reordering rules are applied deterministically to the source side of the parallel training data, using word alignment to choose the optimal permutation ("oracle reordering", footnote 6). [sent-153, score-0.64]
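A minimal sketch of the oracle criterion (footnote 6: fewer crossing links and lower summed distortion mean a better permutation). The function names are hypothetical, and combining the two terms by lexicographic tuple comparison is an assumption; the summary does not say how they are weighted.

```python
def oracle_score(perm, links):
    """Score a candidate source permutation against a word alignment.
    `perm` lists original source indices in their new order;
    `links` is a list of (src, tgt) alignment points."""
    new_pos = {orig: new for new, orig in enumerate(perm)}
    # source positions visited in target order, after reordering:
    srcs = [new_pos[s] for s, _ in sorted(links, key=lambda st: st[1])]
    swaps = sum(1 for i in range(len(srcs))
                for j in range(i + 1, len(srcs)) if srcs[i] > srcs[j])
    distortion = sum(abs(b - a - 1) for a, b in zip(srcs, srcs[1:]))
    return (swaps, distortion)

def oracle_reordering(candidates, links):
    """Hypothetical helper: pick the rule-generated permutation that
    best monotonizes the alignment."""
    return min(candidates, key=lambda p: oracle_score(p, links))

# illustrative alignment: moving positions 4-5 to the front monotonizes it
links = [(0, 2), (1, 3), (2, 4), (4, 0), (5, 1)]
print(oracle_reordering([[0, 1, 2, 3, 4, 5], [4, 5, 0, 1, 2, 3]], links))
```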
48 The n-best reorderings of each rule-matching sequence are selected for translation. [sent-158, score-0.347]
49 In experiments not reported here, we obtained accurate rankings by scoring source permutations with a uniformly weighted combination of two LMs trained on chunk types and on chunk-type+headword, respectively. [sent-159, score-0.322]
50 In particular, 3-best reorderings of each rule-matching sequence yield reordering recalls of 77. [sent-160, score-0.9]
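A minimal sketch of the selection step described in the two sentences above: score each candidate permutation with a uniformly weighted combination of a chunk-type LM and a chunk-type+headword LM, and keep the n best. The `logprob(tokens)` interface and the `UniformLM` stub are assumed stand-ins for whatever reordered n-gram LM wrapper is actually used.

```python
def select_nbest(candidates, chunk_seq, lm_type, lm_type_head, n=3):
    """Rank rule-generated permutations with a uniformly weighted
    combination of two reordered n-gram LM scores and keep the n best.
    `chunk_seq[i]` is a (chunk_type, headword) pair."""
    def score(perm):
        types = [chunk_seq[i][0] for i in perm]
        type_heads = ["%s_%s" % chunk_seq[i] for i in perm]
        return 0.5 * lm_type.logprob(types) + 0.5 * lm_type_head.logprob(type_heads)
    return sorted(candidates, key=score, reverse=True)[:n]

class UniformLM:
    """Trivial stand-in for a real reordered n-gram LM wrapper."""
    def logprob(self, tokens):
        return -1.0 * len(tokens)

chunk_seq = [("C", "weil"), ("NC", "Mann"), ("NC", "Buch"),
             ("ADV", "heute"), ("VC", "gelesen")]
perms = [[0, 1, 2, 3, 4], [0, 1, 4, 2, 3], [0, 4, 1, 2, 3]]
print(select_nbest(perms, chunk_seq, UniformLM(), UniformLM(), n=3))
```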
51 6 Modified distortion matrices We present here a novel technique to encode likely long reorderings of an input sentence, which can be seamlessly integrated into the PSMT framework. [sent-163, score-0.895]
52 During decoding, the distance between source positions is used for two main purposes: (i) generating a distortion penalty for the current hypothesis and (ii) determining the set of source positions that can be covered at the next hypothesis expansion. [sent-164, score-0.522]
53 We can then tackle the coarseness of both the distortion penalty and the reordering constraints by replacing the distance function with a function defined ad hoc for each input sentence. [sent-165, score-0.929]
54 In the linear distortion model this is [...]. (Footnote 6: Following Bisazza and Federico (2010), the optimal reordering for a source sentence is the one that minimizes distortion in the word alignment to a target translation, measured by the number of swaps and the sum of distortion costs.) [sent-167, score-1.71]
55 At the level of phrases, distortion is computed between the last word of the last translated phrase and the first word of the next phrase. [sent-170, score-0.495]
56 We retain this equation as the core distortion function for our model. [sent-171, score-0.376]
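For reference, the phrase-level linear distortion described in the previous sentence is usually written as D = |next_start - last_end - 1|; this is the standard Moses-style definition, shown here as a sketch (the function name is illustrative):

```python
def linear_distortion(last_phrase_end, next_phrase_start):
    """Linear distortion between the last word of the last translated
    phrase and the first word of the next phrase:
    D = |next_start - last_end - 1|, i.e. 0 for a monotone step."""
    return abs(next_phrase_start - last_phrase_end - 1)

assert linear_distortion(3, 4) == 0   # monotone continuation
assert linear_distortion(3, 9) == 5   # forward jump over five words
assert linear_distortion(6, 2) == 5   # backward jump
```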
57 Then, we modify entries in the matrix such that the distortion cost is minimized for the decoding paths pre-computed with the reordering rules. [sent-172, score-1.066]
58 Given a source sentence and its set of rule-generated permutations, the linear distortion matrix is modified as follows: 1. [sent-173, score-0.531]
59 then, for each extracted pair, the corresponding point in the matrix is assigned the lowest possible distortion cost, that is, 0 if s_i < s_{i+1} and 2 if s_i > s_{i+1}. [sent-177, score-0.519]
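A sketch of the two steps just described, under the assumption that the matrix is indexed by source word positions (the function name is illustrative):

```python
def modified_distortion_matrix(n, rule_permutations):
    """Start from the sentence's linear distortion matrix, then lower
    the entries lying on rule-generated decoding paths.
    D[i][j] is the cost of jumping from source position i to j."""
    # 1. core linear distortion for every ordered pair of positions
    D = [[abs(j - i - 1) for j in range(n)] for i in range(n)]
    # 2. each pair of consecutively visited positions in a permutation
    #    gets the lowest possible cost: 0 forward, 2 backward
    #    (2 is the minimum any backward jump can cost under |j - i - 1|)
    for perm in rule_permutations:
        for a, b in zip(perm, perm[1:]):
            D[a][b] = 0 if a < b else 2
    return D

# one long reordering: translate positions 4-5 right after position 0
D = modified_distortion_matrix(8, [[0, 4, 5, 1, 2, 3, 6, 7]])
print(D[0][4], D[5][1])   # 0 and 2 instead of 3 and 5
```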
60 This makes modified distortion matrices particularly suitable to encode just those reorderings that are typically missed by phrase-based decoders (see Sect. [sent-180, score-0.825]
61 We propose two ways to do this, given an ordered pair of chunks (cx, cy): mode A×A: create a shortcut from each word of cx to each word of cy; mode L×F: create only one shortcut from the last word of cx to the first word of cy. [sent-183, score-0.297]
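A small sketch of the two chunk-to-word conversion modes, representing chunks as (start, end) spans; the function name and mode labels are illustrative:

```python
def chunk_shortcuts(cx, cy, mode="AxA"):
    """Expand an ordered chunk pair into word-level shortcut pairs.
    Chunks are (start, end) spans over source positions, end exclusive.
    mode "AxA": one shortcut from every word of cx to every word of cy;
    mode "LxF": a single shortcut from the last word of cx to the
    first word of cy."""
    if mode == "AxA":
        return [(i, j) for i in range(*cx) for j in range(*cy)]
    return [(cx[1] - 1, cy[0])]

print(chunk_shortcuts((2, 4), (6, 8), "AxA"))  # [(2, 6), (2, 7), (3, 6), (3, 7)]
print(chunk_shortcuts((2, 4), (6, 8), "LxF"))  # [(3, 6)]
```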
62 The former solution admits more chunk-internal permutations with the same minimal distortion cost, whereas the latter implies that the first word of a reordered chunk is covered first and the last is covered last. [sent-184, score-0.85]
63 Footnote 7: In fact, any decoding path that includes a jump marked as a shortcut benefits from the same distortion discount at that point. [sent-185, score-0.555]
64 Figure 2: Modified distortion matrix (mode A×A) of the German sentence given in Fig. [sent-193, score-0.429]
65 The chunk reordering shown on top generates three shortcuts corresponding to the 0's and 2's highlighted in the matrix. [sent-195, score-0.584]
66 2 shows the distortion matrix of the German sentence of Fig. [sent-197, score-0.429]
67 Suppose we want to encode the reordering shown on top of Fig. [sent-199, score-0.587]
68 The desired reordering is now attainable within a DL of 2 words instead of 5. [sent-206, score-0.553]
69 If compared to the word reordering lattices used by Bisazza and Federico (2010) and Andreas et al. [sent-208, score-0.599]
70 (2011), modified distortion matrices provide a more compact, implicit way to encode likely reorderings in a sentence-specific fashion. [sent-209, score-0.898]
71 [...] source word and is naturally compatible with the PSMT decoder's standard reordering mechanisms. [sent-211, score-0.582]
72 7 Evaluation In this section we evaluate the impact of modified distortion matrices on two news translation tasks. [sent-212, score-0.634]
73 The list of word shortcuts for each sentence is provided as an XML tag that is parsed by the decoder to modify the distortion matrix just before starting the search. [sent-215, score-0.504]
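The text only says the shortcut list is passed "as an XML tag"; the element and attribute names in the sketch below are invented for illustration and are not the actual decoder format.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-sentence input line:
line = '<seg shortcuts="0-4 5-1">heute hat der Mann das Buch gelesen</seg>'

def parse_shortcuts(xml_line):
    """Read the per-sentence word-shortcut list the decoder would use
    to patch its distortion matrix before starting the search."""
    elem = ET.fromstring(xml_line)
    pairs = [tuple(int(x) for x in item.split("-"))
             for item in elem.get("shortcuts", "").split()]
    return elem.text, pairs

text, shortcuts = parse_shortcuts(line)
print(shortcuts)   # [(0, 4), (5, 1)]
```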
74 As usual, the distortion matrix is queried by the distortion penalty generator and by the hypothesis expander (footnote 9). [sent-216, score-0.805]
75 German tokenization [...] Footnote 9: Note that lexicalized reordering models use real word distances to compute the orientation class of a new hypothesis; thus they are not affected by changes in the matrix. [sent-229, score-0.635]
76 The decoder is based on the log-linear combination of a phrase translation model, a lexicalized reordering model, a 6-gram target language model, distortion cost, word and phrase penalties. [sent-237, score-1.148]
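The log-linear combination mentioned here is the standard PSMT scoring scheme; a minimal sketch (the feature values and weights below are purely illustrative):

```python
def loglinear_score(features, weights):
    """Log-linear model score: the weighted sum of feature-function
    values (translation model, lexicalized reordering, LM, distortion
    cost, word and phrase penalties)."""
    return sum(weights[name] * value for name, value in features.items())

features = {"tm": -4.2, "lex_reo": -1.1, "lm": -23.5,
            "distortion": -6.0, "word_pen": -9.0, "phrase_pen": -4.0}
weights = {"tm": 0.2, "lex_reo": 0.1, "lm": 0.5,
           "distortion": 0.1, "word_pen": 0.05, "phrase_pen": 0.05}
print(loglinear_score(features, weights))
```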
77 The reordering model is a hierarchical phrase orientation model (Tillmann, 2004; Koehn et al. [sent-238, score-0.633]
78 Finally, for German, we enable the Moses option monotone-at-punctuation, which forbids reordering across punctuation marks. [sent-241, score-0.553]
79 To evaluate the reordering selection technique, we also compare the encoding of all rule-generated reorderings against only the 3 best per rule-matching sequence, as ranked by our best performing reordered LM (see end of Sect. [sent-263, score-1.026]
80 Results in the row "allReo" are obtained by encoding all the rule-generated reorderings in L×F chunk-to-word conversion mode. [sent-271, score-0.319]
81 Finally, we arrive at the performance of 3-best reorderings per sequence. [sent-275, score-0.319]
82 Looking at run times, we can say that modified distortion matrices are a very efficient way to address long reordering. [sent-287, score-0.59]
83 Even when all the generated reorderings are encoded, translation time increases only by 4%. [sent-288, score-0.408]
84 Table 1: Impact of modified distortion matrices on translation quality, measured with BLEU, METEOR and KRS (all in percentage form; higher scores mean higher quality). [sent-308, score-0.634]
85 Looking at the rest of the table, we see that reordering selection is not as crucial as in Arabic-English. [sent-317, score-0.579]
86 This is in line with the properties of the more precise German reordering rule set (two rules out of three generate at most 3 reorderings per sequence). [sent-318, score-0.959]
87 Considering all scores, the last setting (3-best reordering and A×A) appears as the best one, achieving the following gains over the baseline: +. [sent-319, score-0.581]
88 8 Conclusions In Arabic-English and German-English, long reordering concentrates on specific patterns describable by a small number of linguistic rules. [sent-329, score-0.598]
89 By means of non-deterministic chunk reordering rules, we have generated likely permutations of the test sentences and ranked them with n-gram LMs trained on pre-reordered data. [sent-330, score-0.846]
90 We have then introduced the notion of modified distortion matrices to naturally encode a set of likely reorderings in the decoder input. [sent-331, score-0.942]
91 Modified distortion allows for a finer and more linguistically informed definition of the search space, which is reflected in better translation outputs and more efficient decoding. [sent-332, score-0.507]
92 We expect that further improvements may be achieved by refining the Arabic reordering rules with specific POS tags and lexical cues. [sent-333, score-0.611]
93 We also plan to evaluate modified distortion matrices in conjunction with a different type of in-decoding reorder- ing model such as the one proposed by Green et al. [sent-334, score-0.545]
94 We thank Christian Hardmeier for helping us define the German reordering rules, and the anonymous reviewers for valuable suggestions. [sent-338, score-0.553]
95 Chunkbased verb reordering in VSO sentences for ArabicEnglish statistical machine translation. [sent-360, score-0.655]
96 Chunk-lattices for verb reordering in ArabicEnglish statistical machine translation. [sent-365, score-0.655]
97 Using shallow syntax information to improve word alignment and reordering for smt. [sent-390, score-0.617]
98 Improved models of distortion cost for statistical machine translation. [sent-416, score-0.446]
99 Dynamic distortion in a discriminative reordering model for statistical machine translation. [sent-509, score-0.963]
100 Chunk-level reordering of source language sentences with automatically learned rules for statistical machine translation. [sent-533, score-0.674]
wordName wordTfidf (topN-words)
[('reordering', 0.553), ('distortion', 0.376), ('reorderings', 0.319), ('bisazza', 0.173), ('permutations', 0.168), ('vc', 0.16), ('krs', 0.157), ('dl', 0.139), ('reordered', 0.128), ('chunk', 0.125), ('arabic', 0.107), ('matrices', 0.096), ('translation', 0.089), ('jump', 0.084), ('psmt', 0.082), ('lms', 0.077), ('german', 0.075), ('modified', 0.073), ('jumps', 0.07), ('infinitive', 0.068), ('verb', 0.068), ('habash', 0.067), ('chunks', 0.066), ('arianna', 0.063), ('rules', 0.058), ('meteor', 0.056), ('matrix', 0.053), ('orientation', 0.052), ('zens', 0.049), ('decoding', 0.048), ('crego', 0.047), ('hardmeier', 0.047), ('niehues', 0.047), ('participle', 0.047), ('shortcut', 0.047), ('lattices', 0.046), ('federico', 0.045), ('koehn', 0.045), ('si', 0.045), ('long', 0.045), ('positions', 0.044), ('decoder', 0.044), ('green', 0.042), ('broken', 0.042), ('finer', 0.042), ('chunkbased', 0.041), ('elming', 0.041), ('mccord', 0.041), ('nizar', 0.04), ('af', 0.039), ('association', 0.038), ('sy', 0.037), ('cost', 0.036), ('translated', 0.035), ('bleu', 0.035), ('mode', 0.035), ('michigan', 0.035), ('xia', 0.035), ('statistical', 0.034), ('encode', 0.034), ('shallow', 0.034), ('svo', 0.033), ('sx', 0.033), ('arbor', 0.033), ('costs', 0.033), ('pages', 0.032), ('fuzzy', 0.031), ('arabicenglish', 0.031), ('gertwol', 0.031), ('haapalainen', 0.031), ('kolss', 0.031), ('matr', 0.031), ('shortcuts', 0.031), ('yahyaei', 0.031), ('lexicalized', 0.03), ('syntax', 0.03), ('subordinate', 0.03), ('source', 0.029), ('rule', 0.029), ('modifying', 0.029), ('sequence', 0.028), ('marcello', 0.028), ('phrase', 0.028), ('last', 0.028), ('conceived', 0.027), ('amira', 0.027), ('germanenglish', 0.027), ('koskenniemi', 0.027), ('andreas', 0.027), ('birch', 0.027), ('galley', 0.027), ('diab', 0.027), ('papineni', 0.026), ('selection', 0.026), ('moses', 0.025), ('tillmann', 0.025), ('devoted', 0.025), ('seamlessly', 0.025), ('admits', 0.025), ('sigf', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
Author: Arianna Bisazza ; Marcello Federico
Abstract: This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. The proposed techniques are tested on Arabic-English and German-English using well-known SMT benchmarks.
2 0.39305165 19 acl-2012-A Ranking-based Approach to Word Reordering for Statistical Machine Translation
Author: Nan Yang ; Mu Li ; Dongdong Zhang ; Nenghai Yu
Abstract: Long distance word reordering is a major challenge in statistical machine translation research. Previous work has shown using source syntactic trees is an effective way to tackle this problem between two languages with substantial word order difference. In this work, we further extend this line of exploration and propose a novel but simple approach, which utilizes a ranking model based on word order precedence in the target language to reposition nodes in the syntactic parse tree of a source sentence. The ranking model is automatically derived from word aligned parallel data with a syntactic parser for source language based on both lexical and syntactical features. We evaluated our approach on largescale Japanese-English and English-Japanese machine translation tasks, and show that it can significantly outperform the baseline phrase- based SMT system.
3 0.26345769 147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT
Author: Deyi Xiong ; Min Zhang ; Haizhou Li
Abstract: Predicate-argument structure contains rich semantic information of which statistical machine translation hasn’t taken full advantage. In this paper, we propose two discriminative, feature-based models to exploit predicateargument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. The predicate translation model explores lexical and semantic contexts surrounding a verbal predicate to select desirable translations for the predicate. The argument reordering model automatically predicts the moving direction of an argument relative to its predicate after translation using semantic features. The two models are integrated into a state-of-theart phrase-based machine translation system and evaluated on Chinese-to-English transla- , tion tasks with large-scale training data. Experimental results demonstrate that the two models significantly improve translation accuracy.
4 0.23688602 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
Author: Tong Xiao ; Jingbo Zhu ; Hao Zhang ; Qiang Li
Abstract: We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierachical phrase-based model, and various syntaxbased models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers different choices of decoding algrithms, such as phrase-based decoding, decoding as parsing/tree-parsing and forest-based decoding. Moreover, several useful utilities were distributed with the toolkit, including a discriminative reordering model, a simple and fast language model, and an implementation of minimum error rate training for weight tuning. 1
5 0.18360353 162 acl-2012-Post-ordering by Parsing for Japanese-English Statistical Machine Translation
Author: Isao Goto ; Masao Utiyama ; Eiichiro Sumita
Abstract: Reordering is a difficult task in translating between widely different languages such as Japanese and English. We employ the postordering framework proposed by (Sudoh et al., 2011b) for Japanese to English translation and improve upon the reordering method. The existing post-ordering method reorders a sequence of target language words in a source language word order via SMT, while our method reorders the sequence by: 1) parsing the sequence to obtain syntax structures similar to a source language structure, and 2) transferring the obtained syntax structures into the syntax structures of the target language.
6 0.15867557 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
7 0.1569604 108 acl-2012-Hierarchical Chunk-to-String Translation
8 0.13699295 105 acl-2012-Head-Driven Hierarchical Phrase-based Translation
9 0.11689532 141 acl-2012-Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
10 0.10653869 140 acl-2012-Machine Translation without Words through Substring Alignment
11 0.10363259 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
12 0.09587384 123 acl-2012-Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT
13 0.091942884 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation
14 0.089874662 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
15 0.08757443 97 acl-2012-Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
16 0.085451961 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
17 0.081631318 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
18 0.081100836 42 acl-2012-Bootstrapping via Graph Propagation
19 0.080893315 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation
20 0.079659812 25 acl-2012-An Exploration of Forest-to-String Translation: Does Translation Help or Hurt Parsing?
topicId topicWeight
[(0, -0.249), (1, -0.219), (2, 0.035), (3, 0.015), (4, 0.067), (5, -0.099), (6, -0.01), (7, 0.05), (8, 0.042), (9, 0.019), (10, -0.036), (11, -0.061), (12, 0.118), (13, -0.128), (14, -0.171), (15, -0.032), (16, -0.096), (17, -0.07), (18, 0.006), (19, -0.321), (20, 0.121), (21, 0.076), (22, -0.164), (23, 0.058), (24, -0.249), (25, 0.154), (26, -0.076), (27, 0.088), (28, 0.101), (29, 0.146), (30, -0.066), (31, 0.055), (32, 0.017), (33, -0.048), (34, 0.051), (35, -0.102), (36, 0.033), (37, -0.005), (38, 0.093), (39, 0.127), (40, -0.086), (41, 0.101), (42, -0.07), (43, -0.064), (44, 0.018), (45, 0.004), (46, 0.061), (47, 0.049), (48, -0.009), (49, -0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.94359893 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
Author: Arianna Bisazza ; Marcello Federico
Abstract: This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. The proposed techniques are tested on Arabic-English and German-English using well-known SMT benchmarks.
2 0.82575518 19 acl-2012-A Ranking-based Approach to Word Reordering for Statistical Machine Translation
Author: Nan Yang ; Mu Li ; Dongdong Zhang ; Nenghai Yu
Abstract: Long distance word reordering is a major challenge in statistical machine translation research. Previous work has shown using source syntactic trees is an effective way to tackle this problem between two languages with substantial word order difference. In this work, we further extend this line of exploration and propose a novel but simple approach, which utilizes a ranking model based on word order precedence in the target language to reposition nodes in the syntactic parse tree of a source sentence. The ranking model is automatically derived from word aligned parallel data with a syntactic parser for source language based on both lexical and syntactical features. We evaluated our approach on largescale Japanese-English and English-Japanese machine translation tasks, and show that it can significantly outperform the baseline phrase- based SMT system.
3 0.6833986 162 acl-2012-Post-ordering by Parsing for Japanese-English Statistical Machine Translation
Author: Isao Goto ; Masao Utiyama ; Eiichiro Sumita
Abstract: Reordering is a difficult task in translating between widely different languages such as Japanese and English. We employ the postordering framework proposed by (Sudoh et al., 2011b) for Japanese to English translation and improve upon the reordering method. The existing post-ordering method reorders a sequence of target language words in a source language word order via SMT, while our method reorders the sequence by: 1) parsing the sequence to obtain syntax structures similar to a source language structure, and 2) transferring the obtained syntax structures into the syntax structures of the target language.
4 0.55241078 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
Author: Tong Xiao ; Jingbo Zhu ; Hao Zhang ; Qiang Li
Abstract: We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierachical phrase-based model, and various syntaxbased models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers different choices of decoding algrithms, such as phrase-based decoding, decoding as parsing/tree-parsing and forest-based decoding. Moreover, several useful utilities were distributed with the toolkit, including a discriminative reordering model, a simple and fast language model, and an implementation of minimum error rate training for weight tuning. 1
5 0.54941058 105 acl-2012-Head-Driven Hierarchical Phrase-based Translation
Author: Junhui Li ; Zhaopeng Tu ; Guodong Zhou ; Josef van Genabith
Abstract: This paper presents an extension of Chiang’s hierarchical phrase-based (HPB) model, called Head-Driven HPB (HD-HPB), which incorporates head information in translation rules to better capture syntax-driven information, as well as improved reordering between any two neighboring non-terminals at any stage of a derivation to explore a larger reordering search space. Experiments on Chinese-English translation on four NIST MT test sets show that the HD-HPB model significantly outperforms Chiang’s model with average gains of 1.91 points absolute in BLEU. 1
6 0.52653992 147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT
7 0.44654408 108 acl-2012-Hierarchical Chunk-to-String Translation
8 0.43103892 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
9 0.36147818 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
10 0.34026754 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
11 0.33931184 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
12 0.3067812 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
13 0.30198425 178 acl-2012-Sentence Simplification by Monolingual Machine Translation
14 0.28157309 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation
15 0.27875987 123 acl-2012-Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT
16 0.27817649 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence
17 0.27012959 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
18 0.26970014 140 acl-2012-Machine Translation without Words through Substring Alignment
19 0.26829299 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction
20 0.26514775 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
topicId topicWeight
[(25, 0.026), (26, 0.056), (28, 0.035), (30, 0.084), (37, 0.039), (39, 0.04), (40, 0.196), (57, 0.041), (59, 0.019), (74, 0.058), (82, 0.024), (84, 0.022), (85, 0.045), (90, 0.107), (92, 0.056), (94, 0.027), (99, 0.043)]
simIndex simValue paperId paperTitle
1 0.77550459 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence
Author: Lea Frermann ; Francis Bond
Abstract: We present a system for cross-lingual parse disambiguation, exploiting the assumption that the meaning of a sentence remains unchanged during translation and the fact that different languages have different ambiguities. We simultaneously reduce ambiguity in multiple languages in a fully automatic way. Evaluation shows that the system reliably discards dispreferred parses from the raw parser output, which results in a pre-selection that can speed up manual treebanking.
same-paper 2 0.7659198 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
Author: Arianna Bisazza ; Marcello Federico
Abstract: This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. The proposed techniques are tested on Arabic-English and German-English using well-known SMT benchmarks.
3 0.66268069 83 acl-2012-Error Mining on Dependency Trees
Author: Claire Gardent ; Shashi Narayan
Abstract: In recent years, error mining approaches were developed to help identify the most likely sources of parsing failures in parsing systems using handcrafted grammars and lexicons. However the techniques they use to enumerate and count n-grams builds on the sequential nature of a text corpus and do not easily extend to structured data. In this paper, we propose an algorithm for mining trees and apply it to detect the most likely sources of generation failure. We show that this tree mining algorithm permits identifying not only errors in the generation system (grammar, lexicon) but also mismatches between the structures contained in the input and the input structures expected by our generator as well as a few idiosyncrasies/error in the input data.
4 0.65011054 75 acl-2012-Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
Author: Matthieu Constant ; Anthony Sigogne ; Patrick Watrin
Abstract: and Parsing Anthony Sigogne Universit e´ Paris-Est LIGM, CNRS France s igogne @univ-mlv . fr Patrick Watrin Universit e´ de Louvain CENTAL Belgium pat rick .wat rin @ ucl ouvain .be view, their incorporation has also been considered The integration of multiword expressions in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly pre-identified. This paper evaluates two empirical strategies to integrate multiword units in a real constituency parsing context and shows that the results are not as promising as has sometimes been suggested. Firstly, we show that pregrouping multiword expressions before parsing with a state-of-the-art recognizer improves multiword recognition accuracy and unlabeled attachment score. However, it has no statistically significant impact in terms of F-score as incorrect multiword expression recognition has important side effects on parsing. Secondly, integrating multiword expressions in the parser grammar followed by a reranker specific to such expressions slightly improves all evaluation metrics.
5 0.63014352 19 acl-2012-A Ranking-based Approach to Word Reordering for Statistical Machine Translation
Author: Nan Yang ; Mu Li ; Dongdong Zhang ; Nenghai Yu
Abstract: Long distance word reordering is a major challenge in statistical machine translation research. Previous work has shown using source syntactic trees is an effective way to tackle this problem between two languages with substantial word order difference. In this work, we further extend this line of exploration and propose a novel but simple approach, which utilizes a ranking model based on word order precedence in the target language to reposition nodes in the syntactic parse tree of a source sentence. The ranking model is automatically derived from word aligned parallel data with a syntactic parser for source language based on both lexical and syntactical features. We evaluated our approach on largescale Japanese-English and English-Japanese machine translation tasks, and show that it can significantly outperform the baseline phrase- based SMT system.
6 0.61222363 175 acl-2012-Semi-supervised Dependency Parsing using Lexical Affinities
7 0.61181003 144 acl-2012-Modeling Review Comments
8 0.61139739 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
10 0.60989493 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
11 0.60539013 136 acl-2012-Learning to Translate with Multiple Objectives
12 0.60433936 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
13 0.6041227 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
14 0.60381311 110 acl-2012-Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model
15 0.60047013 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models
16 0.59946847 97 acl-2012-Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
17 0.59868699 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
18 0.59855264 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction
19 0.59797317 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
20 0.59522563 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling