acl acl2013 acl2013-40 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Minwei Feng ; Jan-Thorsten Peter ; Hermann Ney
Abstract: In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average. Results of comparative study with other seven widely used reordering models will also be reported.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper, we propose a novel reordering model based on sequence labeling techniques. [sent-3, score-0.572]
2 Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. [sent-4, score-0.611]
3 Results of comparative study with other seven widely used reordering models will also be reported. [sent-10, score-0.443]
4 As shown in (Knight, 1999), if arbitrary reordering is allowed, the search problem is NP-hard. [sent-14, score-0.409]
5 Many ideas have been proposed to address the reordering problem. [sent-15, score-0.409]
6 Within the phrase-based SMT framework there are mainly three stages where improved reordering can be integrated. In the preprocessing: the source sentence is reordered by heuristics so that the word order of the source and target sentences is similar. [sent-16, score-0.901]
7 , 2007) use rules to reorder the source sentences on the chunk level and provide a source-reordering lattice instead of a single reordered source sentence as input to the SMT system. [sent-20, score-0.455]
8 Designing rules to reorder the source sentence is conceptually clear and usually easy to implement. [sent-21, score-0.232]
9 In the decoder: we can add constraints or models into the decoder to reward good reordering options or penalize bad ones. [sent-24, score-0.476]
10 For reordering constraints, early work includes ITG constraints (Wu, 1995) and IBM constraints (Berger et al. [sent-25, score-0.465]
11 (Zens and Ney, 2003) did a comparative study of different reordering constraints. [sent-27, score-0.443]
12 For reordering models, we can further divide the existing methods roughly into three genres: • The reordering is a classification problem. [sent-29, score-0.818]
13 The classifier can be trained with maximum likelihood like Moses lexicalized reordering (Koehn et al. [sent-32, score-0.447]
14 , 2007) and the hierarchical lexicalized reordering model (Galley and Manning, 2008), or be trained under a maximum entropy framework (Zens and Ney, 2006). [sent-33, score-0.484]
15 From the reordering point of view, the idea is that the correct reordering is a suitable order of translation units. [sent-38, score-0.909]
16 , 2006)’s model which utilizes only source words to model the decoding order. [sent-41, score-0.384]
17 • The reordering can be solved by outside heuristics. [sent-42, score-0.409]
18 For example, the simple jump model using linear distance tells the decoder that long-range reordering should usually be avoided. [sent-44, score-0.587]
19 One disadvantage of carrying out reordering in reranking is that the representativeness of the N-best list is often questionable. [sent-54, score-0.443]
20 In this paper, we propose a novel tagging-style reordering model which falls under the category “The reordering is a decoding order problem”. [sent-55, score-1.053]
21 Our model converts the decoding order problem into a sequence labeling problem, i.e. a tagging task. [sent-56, score-0.367]
22 Section 4 briefly describes several reordering models with which we compare our method. [sent-61, score-0.409]
23 The objective is to translate the source into a target language sentence e_1^I = e_1 ... e_I. [sent-71, score-0.239]
24 given the source sentence f_1^7, the system translates it into the target sentence e_1^7, then the alignment link set {a2 = 3, a3 = 1, a4 = 4, a5 = 4, a6 = 6, a6 = 7, a7 = 5} reveals the decoding process, i.e. [sent-97, score-0.522]
25 the first generated target word e1 has no alignment, we can regard it as a translation from a NULL source word; then the second generated target word e2 is translated from f3. [sent-101, score-0.479]
26 We reorder the source side of the alignment to get Figure 1(b). [sent-102, score-0.299]
27 Figure 1(b) implies the source sentence decoding sequence information, which is depicted in Figure 1(c). [sent-103, score-0.428]
28 • the unaligned source word should follow its preceding word; the unaligned feature is kept with a ∗ symbol, e.g. [sent-106, score-0.347]
29 f2∗ is after f1 • when one source word is aligned to multiple target words, only keep the alignment that links the source word to the first target word, e.g. [sent-108, score-0.568]
30 In other words, we use this strategy to guarantee that every source word appears only once in the source decoding sequence. [sent-111, score-0.485]
31 • when multiple source words are aligned to one target word, put together the source words according to their original relative positions, e.g. [sent-112, score-0.417]
32 Now Figure 1(c) shows the original source sentence and its decoding sequence. [sent-116, score-0.352]
33 A positive function value means that, compared to the original position in the source sentence, the word's position in the decoding sequence moves rightwards. [sent-121, score-0.540]
34 If the function value is 0, the word’s position in the original source sentence and in the decoding sequence is the same. [sent-122, score-0.505]
35 For example, f1 is the first word in the source sentence but it is the second word in the decoding sequence. [sent-123, score-0.412]
36 Now Figure 1(d) converts the reordering problem into a sequence labeling or tagging problem. [sent-125, score-0.607]
37 Suppose the longest sentence length is 100; then, according to Figure 1(d), there are 200 tags (from -99 to +99 plus the unalign tag). [sent-127, score-0.225]
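To make the conversion concrete, the following is a minimal Python sketch of the construction described above, using the worked example from Figure 1. The helper names, the exact sign convention of the jump values, and the use of a single unalign label for unaligned words are assumptions based on the description here, not the authors' code.

```python
# Minimal sketch: turn a word alignment into the source decoding sequence and
# the per-source-word jump tags described above. Assumption: 'links' is a list
# of 1-based (target_position, source_position) alignment links; unaligned
# source words at the very start of a sentence are not handled in this sketch.

def source_decoding_sequence(num_source, links):
    # keep, for every source word, only the link to its first target word
    first_target = {}
    for tgt, src in links:
        if src not in first_target or tgt < first_target[src]:
            first_target[src] = tgt
    # aligned words ordered by (first target position, original source position);
    # the second key keeps many-to-one groups in their original relative order
    aligned = sorted(first_target, key=lambda s: (first_target[s], s))
    sequence, placed = [], set()
    for src in aligned:
        sequence.append((src, False))
        placed.add(src)
        # an unaligned source word follows its preceding word
        nxt = src + 1
        while nxt <= num_source and nxt not in first_target and nxt not in placed:
            sequence.append((nxt, True))   # True marks an unaligned word
            placed.add(nxt)
            nxt += 1
    return sequence

def jump_tags(num_source, links):
    tags = [None] * num_source
    for new_pos, (src, unaligned) in enumerate(source_decoding_sequence(num_source, links), 1):
        if unaligned:
            tags[src - 1] = "unalign"                 # the extra 200th tag
        else:
            jump = new_pos - src                      # > 0: the word moves rightwards
            tags[src - 1] = f"+{jump}" if jump > 0 else str(jump)
    return tags

# Figure 1 example: e2-f3, e3-f1, e4-f4, e5-f4, e6-f6, e6-f7, e7-f5 (e1 is unaligned)
links = [(2, 3), (3, 1), (4, 4), (5, 4), (6, 6), (6, 7), (7, 5)]
print([f"f{s}" for s, _ in source_decoding_sequence(7, links)])
# ['f3', 'f1', 'f2', 'f4', 'f6', 'f7', 'f5']
print(jump_tags(7, links))   # tags for f1 ... f7
```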
38 Secondly, we add source sentence part-of-speech (POS) tags to the input. [sent-141, score-0.222]
39 The start and end positions of the kth source phrase are b_k and j_k. [sent-158, score-0.320]
40 We have access to the old coverage vector, from which we know whether the new phrase’s left neighboring source word f_{b_k−1} and right neighboring source word f_{j_k+1} have been translated. [sent-166, score-0.350]
41 We also have the word alignment within the new phrase pair, which is stored during the phrase extraction process. [sent-167, score-0.236]
42 The added model will then check the consistency between the calculated labels and the labels predicted by the reordering model. [sent-169, score-0.472]
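A rough sketch of how such a check could be added to the decoder is given below. It simplifies the description above by ignoring the stored within-phrase word alignment and the neighboring-word information from the coverage vector; the function name and the match-counting score are assumptions, not the paper's exact feature.

```python
# Simplified sketch: compare the jump labels implied by the current hypothesis
# with the labels predicted by the trained tagger, and use the number of matches
# as an extra feature in the log-linear model (Equation 2). Source words inside
# the new phrase are assumed to be covered monotonically, which the paper refines
# with the stored within-phrase word alignment.

def consistency_feature(b_k, j_k, num_covered_before, predicted_tags):
    """b_k, j_k           : 1-based start/end source positions of the new phrase
    num_covered_before : number of source words translated before this phrase
    predicted_tags     : labels predicted by the CRF/RNN for the source sentence"""
    matches = 0
    for offset, src in enumerate(range(b_k, j_k + 1)):
        decoding_pos = num_covered_before + offset + 1    # position in decoding order
        hypothesis_jump = decoding_pos - src              # same encoding as in training
        predicted = predicted_tags[src - 1]
        if predicted != "unalign" and int(predicted) == hypothesis_jump:
            matches += 1
    return matches
```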
43 4 Comparative Study The second part of this paper is a comparative study of reordering models. [sent-171, score-0.443]
44 1 Moses lexicalized reordering model Moses (Koehn et al. [sent-174, score-0.484]
45 The definitions of the reordering types are as follows: monotone for the current phrase, if a word alignment point to the bottom left (point A) exists and there is no word alignment point at the bottom right position (point B). [sent-177, score-0.732]
46 swap for the current phrase, if a word alignment point to the bottom right (point B) exists and there is no word alignment point at the bottom left position (point A). [sent-178, score-0.323]
47 discontinuous for all other cases. Our implementation is the same as the default behavior of the Moses lexicalized reordering model. [sent-179, score-0.447]
48 We count how often each extracted phrase pair is found with each of the three reordering types. [sent-180, score-0.474]
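A small sketch of this orientation classification and the maximum-likelihood estimation is shown below; the coordinate convention for "bottom left" and "bottom right" (target axis pointing up) and the function names are assumptions.

```python
from collections import Counter

# Sketch: classify the orientation of a phrase pair from the word alignment points,
# then turn per-phrase-pair counts into relative-frequency (maximum likelihood)
# probabilities. Alignment points are 1-based (source_position, target_position) pairs.

def orientation(alignment_points, src_start, src_end, tgt_start):
    align = set(alignment_points)
    point_a = (src_start - 1, tgt_start - 1) in align   # bottom left of the phrase
    point_b = (src_end + 1, tgt_start - 1) in align     # bottom right of the phrase
    if point_a and not point_b:
        return "monotone"
    if point_b and not point_a:
        return "swap"
    return "discontinuous"

def orientation_probs(counts: Counter):
    total = sum(counts.values())
    return {o: counts[o] / total for o in ("monotone", "swap", "discontinuous")}

counts = Counter(monotone=7, swap=1, discontinuous=2)    # counts for one phrase pair
print(orientation_probs(counts))
```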
49 2 Maximum entropy reordering model Figure 3 is an illustration of (Zens and Ney, 2006). [sent-185, score-0.446]
50 j is the source word position which is aligned to the last target word of the current phrase. [sent-186, score-0.406]
51 j' is the last source word position of the current phrase. [sent-187, score-0.286]
52 j'' is the source word position which is aligned to the first target word of the next phrase. [sent-188, score-0.449]
53 The whole model is: p_{\lambda_1^N}(c_{j,j',j''} | f_1^J, e_1^I, i, j) = \frac{\exp(\sum_{n=1}^{N} \lambda_n h_n(f_1^J, e_1^I, i, j, c_{j,j',j''}))}{\sum_{c'} \exp(\sum_{n=1}^{N} \lambda_n h_n(f_1^J, e_1^I, i, j, c'))} (5) Different features can be used; we use the source and target word features to train the model. [sent-194, score-0.264]
54 j is the source word position aligned to the last target word of the current phrase. [sent-196, score-0.406]
55 j' is the last source word position of the current phrase. [sent-197, score-0.286]
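A minimal sketch of scoring Equation (5) is given below; the class set, the feature templates and the weight values are placeholders for illustration, not the features actually used by (Zens and Ney, 2006).

```python
import math

# Sketch of the log-linear (maximum entropy) reordering model in Equation (5):
# a softmax over reordering classes with binary source/target word features.

def maxent_reorder_prob(features, weights, classes):
    scores = {c: sum(weights.get((f, c), 0.0) for f in features) for c in classes}
    z = sum(math.exp(s) for s in scores.values())
    return {c: math.exp(s) / z for c, s in scores.items()}

classes = ("left", "right")                               # assumed orientation classes
features = ["src_word(j')=f4", "tgt_word(i)=e5"]          # assumed feature templates
weights = {("src_word(j')=f4", "right"): 0.8, ("tgt_word(i)=e5", "left"): 0.3}
print(maxent_reorder_prob(features, weights, classes))
```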
56 Bilingual LM The previous two models belong to “The reordering is a classification problem”. [sent-202, score-0.409]
57 Now we turn to “The reordering is a decoding order problem”. [sent-203, score-0.574]
58 e1, e2f3, e3f1, e4f4, e5f4, e6f6f7, e7f5 Notice the bilingual units have been ordered according to the target side, as the decoder writes the translation in a left-to-right way. [sent-209, score-0.268]
59 • when one source word is aligned to multiple target words, duplicate the source word for each target word, e.g. [sent-214, score-0.499]
60 e4f4, e5f4 • when multiple source words are aligned to one target word, put together the source words for that target word, e.g. [sent-216, score-0.432]
61 e6f6f7 After the operation in Figure 4 is done for all bilingual sentence pairs, we get a decoding sequence corpus. [sent-218, score-0.369]
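The bilingual-unit extraction can be sketched as follows, again with the Figure 1 example; treating unaligned source words as simply dropped (f2 does not appear in any unit above) is an assumption about the rule not fully spelled out here.

```python
# Sketch: build bilingual units ordered by the target side. 'links' is a list of
# 1-based (target_position, source_position) pairs; unaligned target words (e1)
# form units of their own, unaligned source words are dropped here (assumption).

def bilingual_units(links, src_words, tgt_words):
    src_of = {}
    for tgt, src in links:
        src_of.setdefault(tgt, []).append(src)
    units = []
    for t in range(1, len(tgt_words) + 1):
        srcs = sorted(src_of.get(t, []))            # keep original source order
        units.append(tgt_words[t - 1] + "".join(src_words[s - 1] for s in srcs))
    return units

links = [(2, 3), (3, 1), (4, 4), (5, 4), (6, 6), (6, 7), (7, 5)]
src = [f"f{i}" for i in range(1, 8)]
tgt = [f"e{i}" for i in range(1, 8)]
print(bilingual_units(links, src, tgt))
# ['e1', 'e2f3', 'e3f1', 'e4f4', 'e5f4', 'e6f6f7', 'e7f5']
```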
62 To use the bilingual LM, the search state must be augmented to keep the bilingual unit decoding sequence. [sent-221, score-0.709]
63 The bilingual sequence of phrase pairs will be extracted using the same strategy as in Figure 4. [sent-223, score-0.227]
64 F˜ is the bilingual sequence for the new phrase pair (f˜, e˜) and F˜i is the ith unit of F˜. [sent-225, score-0.227]
65 , 2010) present a simpler version of the above bilingual LM where they use only the source side to model the decoding order. [sent-231, score-0.466]
66 The source word decoding sequence in Figure 4 is then f3, f1, f2, f4, f6, f7, f5. [sent-232, score-0.416]
67 We also build a 9-gram LM based on the source word decoding sequences. [sent-233, score-0.34]
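A toy illustration of such a source-side decoding sequence LM is given below. The paper trains a real 9-gram LM on the full decoding sequence corpus; the plain count-based model here is only meant to show what the training data looks like.

```python
from collections import defaultdict

# Toy illustration: collect n-gram counts over source word decoding sequences
# such as f3 f1 f2 f4 f6 f7 f5; a real system would train a smoothed 9-gram LM.

def ngram_counts(sequences, order):
    counts = defaultdict(int)
    for seq in sequences:
        padded = ["<s>"] * (order - 1) + list(seq) + ["</s>"]
        for i in range(len(padded) - order + 1):
            counts[tuple(padded[i:i + order])] += 1
    return counts

corpus = [["f3", "f1", "f2", "f4", "f6", "f7", "f5"]]
counts = ngram_counts(corpus, 3)
print(counts[("f3", "f1", "f2")])   # 1
```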
68 5 Syntactic cohesion model The previous two models belong to “The reordering is a decoding order problem”. [sent-236, score-0.677]
69 Now we turn to “The reordering can be solved by outside heuristics”. [sent-237, score-0.409]
70 6 of (Zhang, 2013), instead of making a hard reordering decision, the author uses the rules as soft constraints in the decoder. [sent-253, score-0.466]
71 The leaf nodes in the revised tree constitute the reordered source sentence. [sent-258, score-0.223]
72 Finally, in the log-linear framework (Equation 2) a new jump model is added which uses the reordered source sentence to calculate the cost. [sent-259, score-0.328]
73 Suppose the previously translated source phrase is f1 and the current phrase is f5. [sent-262, score-0.388]
74 Then the standard jump model gives cost qDist = 4 and the new tree-based jump model will return a cost qDist_new = 1. [sent-263, score-0.278]
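A small sketch of the two jump costs in this example is shown below; the distance convention (absolute difference of source positions) is an assumption chosen so that it reproduces the qDist = 4 and qDist_new = 1 values above, and the reordered word order is invented for illustration.

```python
# Sketch: linear distance jump cost vs. the tree-based variant that measures the
# same distance in the rule-reordered source sentence.

def jump_cost(prev_pos, cur_pos):
    return abs(cur_pos - prev_pos)

def tree_jump_cost(reordered_pos, prev_pos, cur_pos):
    """reordered_pos[j]: position of source word j in the rule-reordered sentence."""
    return abs(reordered_pos[cur_pos] - reordered_pos[prev_pos])

# Hypothetical reordered sentence in which the rules move f5 next to f1:
reordered_pos = {1: 1, 5: 2, 2: 3, 3: 4, 4: 5, 6: 6, 7: 7}
print(jump_cost(1, 5), tree_jump_cost(reordered_pos, 1, 5))   # 4 1
```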
75 1 Experimental Setup Our baseline is a phrase-based decoder, which includes the following models: an n-gram target-side language model (LM), a phrase translation model and a word-based lexicon model. [sent-266, score-0.291]
76 The reordering model for the baseline system is the distance-based jump model which uses linear distance. [sent-270, score-0.646]
77 Table 1 (translation model and LM training data statistics, Chinese/English): Sentences 5 384 856; Running Words 115 172 748 (Chinese) / 129 820 318 (English); Vocabulary 1 125 437 (Chinese) / 739 251 (English). Table 1 contains the data statistics used for the translation model and LM. [sent-281, score-0.256]
78 For the reordering model, we take two further filtering steps. [sent-282, score-0.409]
79 Firstly, we delete the sentence pairs if the source sentence length is one. [sent-283, score-0.229]
80 When the source sentence has only one word, the translation will always be monotonic and the reordering model does not need to learn this. [sent-284, score-0.760]
81 Secondly, we delete the sentence pairs if the source sentence contains more than three contiguous unaligned words. [sent-285, score-0.315]
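These two filtering steps can be sketched as a simple predicate over a sentence pair; the representation of the alignment (a set of aligned source positions) and the function name are assumptions.

```python
# Sketch of the two filtering steps for the reordering model training data.
# 'aligned_src' is the set of 1-based source positions that have at least one link.

def keep_for_reordering_training(src_len, aligned_src, max_unaligned_run=3):
    if src_len == 1:        # a one-word source sentence is always translated monotonically
        return False
    run = 0
    for pos in range(1, src_len + 1):
        run = run + 1 if pos not in aligned_src else 0
        if run > max_unaligned_run:   # more than three contiguous unaligned words
            return False
    return True

print(keep_for_reordering_training(7, {1, 3, 4, 5, 6, 7}))   # True (only f2 is unaligned)
```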
82 The source-side data statistics for the reordering model training are given in Table 2 (the target side has only nine labels). [sent-295, score-0.686]
83 From Table 7 we see that the proposed reordering model using CRFs improves the baseline by 0. [sent-382, score-0.507]
84 21 TER on average, while the proposed reordering model using RNN improves the baseline by 1. [sent-384, score-0.507]
85 To investigate why the RNN has lower performance on the tagging task but achieves better BLEU, we build a 3-gram LM on the source side of the training corpus in Table 2; the perplexity values are listed in Table 8. [sent-389, score-0.248]
86 The perplexity of the test corpus for the reordering model comparison is much lower than that of the NIST corpora used for the translation experiments. [sent-390, score-0.574]
87 In other words, there is a mismatch between the data used for reordering model training and the actual MT data. [sent-391, score-0.446]
88 CRFs and RNN mean the tagging-style model trained with CRFs or RNN; LRM for lexicalized reordering model (Koehn et al. [sent-542, score-0.521]
89 , 2007); MERO for the maximum entropy reordering model (Zens and Ney, 2006); BILM for the bilingual language model (Mariño et al. [sent-543, score-0.569]
90 , 2006) and SRCLM for its simpler version, the source decoding sequence model (Feng et al. [sent-544, score-0.423]
91 6 Conclusion In this paper, a novel tagging-style reordering model has been proposed. [sent-555, score-0.479]
92 By our method, the reordering problem is converted into a sequence labeling problem so that the whole source sentence is taken into consideration for the reordering decision. [sent-556, score-1.131]
93 By adding an unaligned word tag, the unaligned word phenomenon is automatically captured by the proposed model. [sent-557, score-0.232]
94 The CRFs model achieves a lower error rate on the tagging task but the RNN-trained model is better for the translation task. [sent-561, score-0.233]
95 We also compare our method with several other popular reordering models. [sent-568, score-0.409]
96 However, the tree-based jump model relies on manually designed reordering rules, which do not exist for many language pairs, while our model can be easily adapted to other translation tasks. [sent-570, score-0.676]
97 The main contributions of the paper are: proposing the tagging-style reordering model and improving the translation quality; comparing two sequence labeling techniques, CRFs and RNN; and comparing our method with seven other reordering models. [sent-572, score-1.072]
98 A source-side decoding sequence model for statistical machine translation. [sent-603, score-0.278]
99 A comparative study on reordering constraints in statistical machine translation. [sent-746, score-0.437]
100 Chunk-level reordering of source language sentences with automatically learned rules for statistical machine translation. [sent-762, score-0.554]
wordName wordTfidf (topN-words)
[('rnn', 0.454), ('reordering', 0.409), ('fj', 0.31), ('crfs', 0.181), ('decoding', 0.165), ('jumptree', 0.148), ('unalign', 0.148), ('source', 0.145), ('bilm', 0.129), ('mero', 0.129), ('srclm', 0.129), ('lrm', 0.114), ('lm', 0.111), ('jump', 0.102), ('translation', 0.091), ('bilingual', 0.086), ('srl', 0.086), ('unaligned', 0.086), ('zens', 0.082), ('ter', 0.081), ('translated', 0.079), ('reordered', 0.078), ('position', 0.077), ('alignment', 0.076), ('sequence', 0.076), ('lavergne', 0.074), ('feng', 0.073), ('bleu', 0.067), ('cohesion', 0.066), ('phrase', 0.065), ('sc', 0.061), ('baseline', 0.061), ('mari', 0.057), ('lmono', 0.055), ('lreorder', 0.055), ('rmono', 0.055), ('rreorder', 0.055), ('rumelhart', 0.055), ('wapiti', 0.055), ('hermann', 0.055), ('target', 0.052), ('labeling', 0.05), ('neural', 0.05), ('lstm', 0.049), ('minwei', 0.049), ('reorder', 0.045), ('och', 0.045), ('recurrent', 0.044), ('memory', 0.043), ('aachen', 0.043), ('sentence', 0.042), ('ney', 0.042), ('rate', 0.039), ('decoder', 0.039), ('converts', 0.039), ('aligned', 0.038), ('lexicalized', 0.038), ('penalty', 0.037), ('alei', 0.037), ('hbilm', 0.037), ('hhhh', 0.037), ('hochreiter', 0.037), ('qdist', 0.037), ('referhenchehhhprehdicthion', 0.037), ('rnnlib', 0.037), ('yuqi', 0.037), ('perplexity', 0.037), ('model', 0.037), ('koehn', 0.036), ('monotonic', 0.036), ('tags', 0.035), ('comparative', 0.034), ('orientation', 0.034), ('reranking', 0.034), ('current', 0.034), ('moses', 0.033), ('usa', 0.033), ('error', 0.033), ('tagging', 0.033), ('bk', 0.033), ('eik', 0.033), ('fjk', 0.033), ('schuster', 0.033), ('side', 0.033), ('cherry', 0.03), ('word', 0.03), ('pr', 0.03), ('franz', 0.029), ('soft', 0.029), ('nine', 0.029), ('constraints', 0.028), ('finished', 0.028), ('violates', 0.028), ('pos', 0.028), ('network', 0.028), ('rwth', 0.027), ('suppose', 0.027), ('added', 0.026), ('confusion', 0.026), ('agency', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999928 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
Author: Minwei Feng ; Jan-Thorsten Peter ; Hermann Ney
Abstract: In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average. Results of comparative study with other seven widely used reordering models will also be reported.
2 0.31390634 101 acl-2013-Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation
Author: Karthik Visweswariah ; Mitesh M. Khapra ; Ananthakrishnan Ramanathan
Abstract: Preordering of a source language sentence to match target word order has proved to be useful for improving machine translation systems. Previous work has shown that a reordering model can be learned from high quality manual word alignments to improve machine translation performance. In this paper, we focus on further improving the performance of the reordering model (and thereby machine translation) by using a larger corpus of sentence aligned data for which manual word alignments are not available but automatic machine generated alignments are available. The main challenge we tackle is to generate quality data for training the reordering model in spite of the machine alignments being noisy. To mitigate the effect of noisy machine alignments, we propose a novel approach that improves reorderings produced given noisy alignments and also improves word alignments using information from the reordering model. This approach generates alignments that are 2.6 f-Measure points better than a baseline supervised aligner. The data generated allows us to train a reordering model that gives an improvement of 1.8 BLEU points on the NIST MT-08 Urdu-English evaluation set over a reordering model that only uses manual word alignments, and a gain of 5.2 BLEU points over a standard phrase-based baseline.
3 0.27302003 166 acl-2013-Generalized Reordering Rules for Improved SMT
Author: Fei Huang ; Cezar Pendus
Abstract: We present a simple yet effective approach to syntactic reordering for Statistical Machine Translation (SMT). Instead of solely relying on the top-1 best-matching rule for source sentence preordering, we generalize fully lexicalized rules into partially lexicalized and unlexicalized rules to broaden the rule coverage. Furthermore, we consider multiple permutations of all the matching rules, and select the final reordering path based on the weighted sum of reordering probabilities of these rules. Our experiments in English-Chinese and English-Japanese translations demonstrate the effectiveness of the proposed approach: we observe consistent and significant improvement in translation quality across multiple test sets in both language pairs judged by both humans and automatic metric. 1
4 0.20971103 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
Author: ThuyLinh Nguyen ; Stephan Vogel
Abstract: Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. We propose an extension of Hiero called PhrasalHiero to address Hiero’s second problem. Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder. The work consists of two parts: 1) for each Hiero translation derivation, find its corresponding discontinuous phrase-based path. 2) Extend the chart decoder to incorporate features from the phrase-based path. We achieve significant improvement over both Hiero and phrase-based baselines for Arabic-English, Chinese-English and German-English translation.
5 0.19012286 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
Author: Jiajun Zhang ; Chengqing Zong
Abstract: Currently, almost all of the statistical machine translation (SMT) models are trained with the parallel corpora in some specific domains. However, when it comes to a language pair or a different domain without any bilingual resources, the traditional SMT loses its power. Recently, some research works study the unsupervised SMT for inducing a simple word-based translation model from the monolingual corpora. It successfully bypasses the constraint of bitext for SMT and obtains a relatively promising result. In this paper, we take a step forward and propose a simple but effective method to induce a phrase-based model from the monolingual corpora given an automatically-induced translation lexicon or a manually-edited translation dictionary. We apply our method for the domain adaptation task and the extensive experiments show that our proposed method can substantially improve the translation quality. 1
6 0.17054038 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
7 0.16528866 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
8 0.16311084 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
9 0.15738101 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
10 0.15127379 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
11 0.14711507 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation
12 0.1457506 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
13 0.13685043 388 acl-2013-Word Alignment Modeling with Context Dependent Deep Neural Network
14 0.13273284 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
15 0.12582542 275 acl-2013-Parsing with Compositional Vector Grammars
16 0.12462708 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
17 0.12130176 314 acl-2013-Semantic Roles for String to Tree Machine Translation
18 0.11902806 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk
19 0.11350355 255 acl-2013-Name-aware Machine Translation
20 0.11335694 307 acl-2013-Scalable Decipherment for Machine Translation via Hash Sampling
topicId topicWeight
[(0, 0.265), (1, -0.227), (2, 0.173), (3, 0.153), (4, -0.001), (5, 0.047), (6, -0.014), (7, -0.001), (8, -0.035), (9, 0.081), (10, -0.009), (11, -0.022), (12, 0.007), (13, -0.029), (14, 0.059), (15, 0.09), (16, 0.109), (17, 0.039), (18, 0.024), (19, -0.061), (20, -0.107), (21, -0.029), (22, -0.003), (23, -0.153), (24, 0.079), (25, 0.008), (26, -0.008), (27, -0.164), (28, -0.182), (29, -0.075), (30, -0.101), (31, 0.019), (32, -0.01), (33, 0.055), (34, -0.043), (35, 0.005), (36, 0.068), (37, -0.014), (38, -0.023), (39, 0.048), (40, 0.04), (41, -0.081), (42, -0.022), (43, -0.045), (44, -0.052), (45, -0.022), (46, 0.01), (47, 0.012), (48, 0.004), (49, -0.042)]
simIndex simValue paperId paperTitle
1 0.94589651 101 acl-2013-Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation
Author: Karthik Visweswariah ; Mitesh M. Khapra ; Ananthakrishnan Ramanathan
Abstract: Preordering of a source language sentence to match target word order has proved to be useful for improving machine translation systems. Previous work has shown that a reordering model can be learned from high quality manual word alignments to improve machine translation performance. In this paper, we focus on further improving the performance of the reordering model (and thereby machine translation) by using a larger corpus of sentence aligned data for which manual word alignments are not available but automatic machine generated alignments are available. The main challenge we tackle is to generate quality data for training the reordering model in spite of the machine alignments being noisy. To mitigate the effect of noisy machine alignments, we propose a novel approach that improves reorderings produced given noisy alignments and also improves word alignments using information from the reordering model. This approach generates alignments that are 2.6 f-Measure points better than a baseline supervised aligner. The data generated allows us to train a reordering model that gives an improvement of 1.8 BLEU points on the NIST MT-08 Urdu-English evaluation set over a reordering model that only uses manual word alignments, and a gain of 5.2 BLEU points over a standard phrase-based baseline.
same-paper 2 0.93513191 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
Author: Minwei Feng ; Jan-Thorsten Peter ; Hermann Ney
Abstract: In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average. Results of comparative study with other seven widely used reordering models will also be reported.
3 0.89681917 166 acl-2013-Generalized Reordering Rules for Improved SMT
Author: Fei Huang ; Cezar Pendus
Abstract: We present a simple yet effective approach to syntactic reordering for Statistical Machine Translation (SMT). Instead of solely relying on the top-1 best-matching rule for source sentence preordering, we generalize fully lexicalized rules into partially lexicalized and unlexicalized rules to broaden the rule coverage. Furthermore, we consider multiple permutations of all the matching rules, and select the final reordering path based on the weighted sum of reordering probabilities of these rules. Our experiments in English-Chinese and English-Japanese translations demonstrate the effectiveness of the proposed approach: we observe consistent and significant improvement in translation quality across multiple test sets in both language pairs judged by both humans and automatic metric. 1
4 0.8776353 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
Author: Nadir Durrani ; Alexander Fraser ; Helmut Schmid ; Hieu Hoang ; Philipp Koehn
Abstract: The phrase-based and N-gram-based SMT frameworks complement each other. While the former is better able to memorize, the latter provides a more principled model that captures dependencies across phrasal boundaries. Some work has been done to combine insights from these two frameworks. A recent successful attempt showed the advantage of using phrasebased search on top of an N-gram-based model. We probe this question in the reverse direction by investigating whether integrating N-gram-based translation and reordering models into a phrase-based decoder helps overcome the problematic phrasal independence assumption. A large scale evaluation over 8 language pairs shows that performance does significantly improve.
5 0.87345833 125 acl-2013-Distortion Model Considering Rich Context for Statistical Machine Translation
Author: Isao Goto ; Masao Utiyama ; Eiichiro Sumita ; Akihiro Tamura ; Sadao Kurohashi
Abstract: This paper proposes new distortion models for phrase-based SMT. In decoding, a distortion model estimates the source word position to be translated next (NP) given the last translated source word position (CP). We propose a distortion model that can consider the word at the CP, a word at an NP candidate, and the context of the CP and the NP candidate simultaneously. Moreover, we propose a further improved model that considers richer context by discriminating label sequences that specify spans from the CP to NP candidates. It enables our model to learn the effect of relative word order among NP candidates as well as to learn the effect of distances from the training data. In our experiments, our model improved 2.9 BLEU points for Japanese-English and 2.6 BLEU points for Chinese-English translation compared to the lexical reordering models.
6 0.86847526 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
7 0.85824096 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
8 0.61330271 201 acl-2013-Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
9 0.59443504 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
10 0.58840507 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
11 0.57275748 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
12 0.56046724 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
13 0.55702764 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
14 0.55416578 354 acl-2013-Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment
15 0.54209167 388 acl-2013-Word Alignment Modeling with Context Dependent Deep Neural Network
16 0.54048342 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
17 0.54005587 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation
18 0.53603256 9 acl-2013-A Lightweight and High Performance Monolingual Word Aligner
19 0.53125483 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk
20 0.52431154 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
topicId topicWeight
[(0, 0.033), (6, 0.057), (11, 0.041), (24, 0.029), (26, 0.05), (28, 0.01), (35, 0.052), (42, 0.406), (48, 0.039), (67, 0.011), (70, 0.044), (88, 0.016), (90, 0.043), (95, 0.087)]
simIndex simValue paperId paperTitle
1 0.98413259 125 acl-2013-Distortion Model Considering Rich Context for Statistical Machine Translation
Author: Isao Goto ; Masao Utiyama ; Eiichiro Sumita ; Akihiro Tamura ; Sadao Kurohashi
Abstract: This paper proposes new distortion models for phrase-based SMT. In decoding, a distortion model estimates the source word position to be translated next (NP) given the last translated source word position (CP). We propose a distortion model that can consider the word at the CP, a word at an NP candidate, and the context of the CP and the NP candidate simultaneously. Moreover, we propose a further improved model that considers richer context by discriminating label sequences that specify spans from the CP to NP candidates. It enables our model to learn the effect of relative word order among NP candidates as well as to learn the effect of distances from the training data. In our experiments, our model improved 2.9 BLEU points for Japanese-English and 2.6 BLEU points for Chinese-English translation compared to the lexical reordering models.
Author: Sina Zarriess ; Jonas Kuhn
Abstract: We suggest a generation task that integrates discourse-level referring expression generation and sentence-level surface realization. We present a data set of German articles annotated with deep syntax and referents, including some types of implicit referents. Our experiments compare several architectures varying the order of a set of trainable modules. The results suggest that a revision-based pipeline, with intermediate linearization, significantly outperforms standard pipelines or a parallel architecture.
3 0.97469515 64 acl-2013-Automatically Predicting Sentence Translation Difficulty
Author: Abhijit Mishra ; Pushpak Bhattacharyya ; Michael Carl
Abstract: In this paper we introduce Translation Difficulty Index (TDI), a measure of difficulty in text translation. We first define and quantify translation difficulty in terms of TDI. We realize that any measure of TDI based on direct input by translators is fraught with subjectivity and adhocism. We, rather, rely on cognitive evidences from eye tracking. TDI is measured as the sum of fixation (gaze) and saccade (rapid eye movement) times of the eye. We then establish that TDI is correlated with three properties of the input sentence, viz. length (L), degree of polysemy (DP) and structural complexity (SC). We train a Support Vector Regression (SVR) system to predict TDIs for new sentences using these features as input. The prediction done by our framework is well correlated with the empirical gold standard data, which is a repository of < L, DP, SC > and TDI pairs for a set of sentences. The primary use of our work is a way of “binning” sentences (to be translated) in “easy”, “medium” and “hard” categories as per their predicted TDI. This can decide pricing of any translation task, especially useful in a scenario where parallel corpora for Machine Translation are built through translation crowdsourcing/outsourcing. This can also provide a way of monitoring progress of second language learners.
4 0.97448444 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation
Author: Rico Sennrich ; Holger Schwenk ; Walid Aransa
Abstract: While domain adaptation techniques for SMT have proven to be effective at improving translation quality, their practicality for a multi-domain environment is often limited because of the computational and human costs of developing and maintaining multiple systems adapted to different domains. We present an architecture that delays the computation of translation model features until decoding, allowing for the application of mixture-modeling techniques at decoding time. We also describe a method for unsupervised adaptation with development and test data from multiple domains. Experimental results on two language pairs demonstrate the effectiveness of both our translation model architecture and automatic clustering, with gains of up to 1 BLEU over unadapted systems and single-domain adaptation.
5 0.97136354 372 acl-2013-Using CCG categories to improve Hindi dependency parsing
Author: Bharat Ram Ambati ; Tejaswini Deoskar ; Mark Steedman
Abstract: We show that informative lexical categories from a strongly lexicalised formalism such as Combinatory Categorial Grammar (CCG) can improve dependency parsing of Hindi, a free word order language. We first describe a novel way to obtain a CCG lexicon and treebank from an existing dependency treebank, using a CCG parser. We use the output of a supertagger trained on the CCGbank as a feature for a state-of-the-art Hindi dependency parser (Malt). Our results show that using CCG categories improves the accuracy of Malt on long distance dependencies, for which it is known to have weak rates of recovery.
same-paper 6 0.96341556 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
7 0.96078473 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features
8 0.95070279 302 acl-2013-Robust Automated Natural Language Processing with Multiword Expressions and Collocations
9 0.82466197 166 acl-2013-Generalized Reordering Rules for Improved SMT
10 0.80609357 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
11 0.78184056 281 acl-2013-Post-Retrieval Clustering Using Third-Order Similarity Measures
12 0.77988112 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
13 0.77040601 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
14 0.76199871 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation
15 0.74227959 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation
16 0.73978823 101 acl-2013-Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation
17 0.73939794 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk
18 0.73575473 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
19 0.73365664 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
20 0.73162776 181 acl-2013-Hierarchical Phrase Table Combination for Machine Translation