acl acl2011 acl2011-313 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David Chiang ; Steve DeNeefe ; Michael Pust
Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.
Reference: text
sentIndex sentText sentNum sentScore
1 These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems. [sent-3, score-0.127] [sent-5, score-0.212]
3 Lexical weighting features (Koehn et al., 2003) estimate the probability of a phrase pair or translation rule word-by-word. [sent-7, score-0.324]
4 In this paper, we introduce two simple improvements to these features: one which smooths the probability of translating word f to word e using English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. [sent-8, score-0.185]
5 These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems. [sent-9, score-0.127] [sent-11, score-0.212]
7 2 Background Since there are slight variations in how the lexical weighting features are computed, we begin by defining the baseline lexical weighting features. [sent-12, score-0.758]
8 If f = f1 · · · fn and e = e1 · · · em are a training sentence pair, let ai (1 ≤ i ≤ m) be the (possibly empty) set of positions in f that ei is aligned to. [sent-13, score-0.149]
9 First, compute a word translation table from the word-aligned parallel text: for each sentence pair and each i, let c(fj, ei) ← c(fj, ei) + 1/|ai| for j ∈ ai (1), and c(NULL, ei) ← c(NULL, ei) + 1 if |ai| = 0 (2). Then t(e | f) = c(f, e) / Σe′ c(f, e′) (3), where f can be NULL. [sent-14, score-0.462]
10 Second, during phrase-pair extraction, store with each phrase pair the alignments between the words in the phrase pair. [sent-15, score-0.093]
11 If it is observed with more than one word alignment pattern, store the most frequent pattern. [sent-16, score-0.084]
12 Third, for each phrase pair (f̄, ē, a), compute t(ē | f̄) = ∏i=1..|ē| [ (1/|ai|) Σj∈ai t(ēi | f̄j) if |ai| > 0; t(ēi | NULL) otherwise ] (4). This generalizes to synchronous CFG rules in the obvious way. [sent-17, score-0.096]
13 Similarly, compute the reverse probability t(f¯ | ¯e ). [sent-18, score-0.043]
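The counting in (1)–(3) and the phrase-level product in (4) can be sketched in code. This is an illustrative reimplementation, not the authors' code; the corpus layout (triples of f-words, e-words, and per-e alignment sets) is an assumption:

```python
from collections import defaultdict

def word_translation_table(corpus):
    """Estimate t(e | f), eqs. (1)-(3): fractional counts 1/|a_i| for each
    aligned f_j, and a NULL count when e_i is unaligned. `corpus` holds
    (f_words, e_words, alignments) triples, where alignments[i] is the set
    of f-positions that e_i is aligned to."""
    count = defaultdict(float)           # count[(f, e)]
    for f_words, e_words, alignments in corpus:
        for i, e in enumerate(e_words):
            a_i = alignments[i]
            if not a_i:                  # eq. (2): unaligned e_i
                count[("NULL", e)] += 1.0
            else:                        # eq. (1): split count over links
                for j in a_i:
                    count[(f_words[j], e)] += 1.0 / len(a_i)
    # eq. (3): normalize over e for each f (including NULL)
    totals = defaultdict(float)
    for (f, e), c in count.items():
        totals[f] += c
    return {(f, e): c / totals[f] for (f, e), c in count.items()}

def phrase_lex_weight(t, f_bar, e_bar, alignments):
    """Eq. (4): lexical weight of one phrase pair under table t."""
    p = 1.0
    for i, e in enumerate(e_bar):
        a_i = alignments[i]
        if a_i:
            p *= sum(t.get((f_bar[j], e), 0.0) for j in a_i) / len(a_i)
        else:
            p *= t.get(("NULL", e), 0.0)
    return p
```

The reverse table t(f | e) is obtained the same way with the roles of f and e swapped.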
14 Then add two new model features −log t(ē | f̄) and −log t(f̄ | ē). [sent-19, score-0.094]
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, pages 455–460, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics
15 [Table 1 residue: feature values for translations (7) and (8) under the small and large LMs.] [sent-21, score-0.212]
16 Table 1: Although the language models prefer translation (8), which translates 朋友 and 伙伴 as singular nouns, the lexical weighting features prefer translation (7), which incorrectly generates plural nouns. [sent-29, score-1.241]
17 All features are negative log-probabilities, so lower numbers indicate preference. [sent-30, score-0.094]
18 3 Morphological smoothing Consider the following example Chinese sentence: (5) 温家宝 (Wēn Jiābǎo, Wen Jiabao) 表示 (biǎoshì, said) 科特迪瓦 (Kētèdíwǎ, Côte d’Ivoire) 是 (shì, is) 中国 (Zhōngguó, China) 在 (zài, in) 非洲 (Fēizhōu, Africa) 的 (de, ’s) 好 (hǎo, good) 朋友 (péngyǒu, friend) , 好 (hǎo, good) 伙伴 (huǒbàn, partner) . [sent-31, score-0.128] [sent-33, score-0.715]
20 (6) Human: Wen Jiabao said that Côte d’Ivoire is a good friend and a good partner of China’s in Africa. [sent-34, score-0.79]
21 (7) MT (baseline): Wen Jiabao said that Cote d’Ivoire is China’s good friends, and good partners in Africa. [sent-35, score-0.366]
22 (8) MT (better): Wen Jiabao said that Cote d’Ivoire is China’s good friend and good partner in Africa. [sent-36, score-0.79]
23 The baseline machine translation (7) incorrectly generates plural nouns. [sent-37, score-0.367]
24 Even though the language models (LMs) prefer singular nouns, the lexical weighting features prefer plural nouns (Table 1). [sent-38, score-0.818]
25 Therefore the information needed to mark friend and partner for number must come from the context. [sent-40, score-0.523]
26 The LMs are able to capture this context: the 5-gram “is China ’s good friend” is observed in our large LM, and the 4-gram “China ’s good friend” in our small LM, but “China ’s good friends” is not observed in either LM. [sent-41, score-0.418] [sent-49, score-1.039]
Footnote 1: The presence of an extra comma in translation (7) affects the LM scores only slightly; removing the comma would make them 26.
27 Table 2: The smoothed lexical weighting features weaken the preference for singular or plural translations, with the exception of t(friends | 朋友). [sent-48, score-0.353]
29 Likewise, the 5-grams “good friend and good partner” and “good friends and good partners” are both observed in our LMs, but neither “good friend and good partners” nor “good friends and good partner” is. [sent-50, score-2.318]
30 By contrast, the lexical weighting tables (Table 2, columns 3–4), which ignore context, have a strong preference for plural translations, except in the case of t(朋友 | friend). [sent-51, score-0.492]
31 Therefore we hypothesize that, for Chinese-English translation, we should weaken the lexical weighting features’ morphological preferences so that more contextual features can do their work. [sent-52, score-0.59]
32 Running a morphological stemmer (Porter, 1980) on the English side of the parallel data gives a three-way parallel text: for each sentence, we have French f, English e, and stemmed English e′. [sent-53, score-0.274]
33 We can then build two word translation tables, t(e′ | f) and t(e | e′), and form their product tm(e | f) = Σe′ t(e′ | f) t(e | e′) (9). Similarly, we can compute tm(f | e) in the opposite direction. [sent-54, score-0.255]
34 These tables can then be extended to phrase pairs or synchronous CFG rules as before and added as two new features of the model: −log tm(ē | f̄) and −log tm(f̄ | ē). The feature tm(ē | f̄) does still prefer certain wordforms, as can be seen in Table 2. [sent-56, score-0.383]
35 Footnote 2: Since the Porter stemmer is deterministic, we always have t(e′ | e) = 1. [sent-58, score-0.066]
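Equation (9) amounts to summing over stems, i.e. a product of the two tables. A minimal sketch; the dict layouts and the toy friend/friends probabilities are illustrative assumptions, not numbers from the paper:

```python
from collections import defaultdict

def morph_smoothed_table(t_stem, t_word):
    """Eq. (9): tm(e | f) = sum over stems e' of t(e' | f) * t(e | e').

    t_stem maps (f, e_stem) -> t(e_stem | f);
    t_word maps (e_stem, e) -> t(e | e_stem), estimated from the stemmed
    and unstemmed English sides of the same parallel text."""
    # index t_word by stem so the inner sum is a single lookup
    by_stem = defaultdict(list)
    for (s, e), p in t_word.items():
        by_stem[s].append((e, p))
    tm = defaultdict(float)
    for (f, s), p_sf in t_stem.items():
        for e, p_es in by_stem[s]:
            tm[(f, e)] += p_sf * p_es
    return dict(tm)
```

With a single stem "friend" mapping to both wordforms, the smoothed table redistributes probability between "friend" and "friends" according to how often each surfaces for that stem, which is exactly the weakening of morphological preference described above.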
36 Perhaps this is not surprising, because in ArabicEnglish translation (unlike Chinese-English translation), the source language is morphologically richer than the target language. [sent-62, score-0.254]
37 So we may benefit from features that preserve this information, while smoothing over morphological differences blurs important distinctions. [sent-63, score-0.265]
38 4 Conditioning on provenance Typical machine translation systems are trained on a fixed set of training data ranging over a variety of genres, and if the genre of an input sentence is known in advance, it is usually advantageous to use model parameters tuned for that genre. [sent-64, score-0.472]
39 Consider the following Arabic sentence, from a weblog (words written left-to-right): (10) وﻟﻌﻞ (wlEl, perhaps) ھﺬا (h*A, this) اﺣﺪ (AHd, one) اھﻢ (Ahm, main) اﻟﻔﺮوق (Alfrwq, differences) ﺑﯿﻦ (byn, between) صور (Swr, images) اﻧﻈﻤﺔ (AnZmp, systems) اﻟﺤﻜﻢ (AlHkm, ruling) اﻟﻤﻘﺘﺮﺣﺔ (AlmqtrHp, proposed) . [sent-65, score-0.197]
40 (11) Human: Perhaps this is one of the most important differences between the images of the proposed ruling systems. [sent-68, score-0.208]
41 (12) MT (baseline): This may be one of the most important differences between pictures of the proposed ruling regimes. [sent-69, score-0.143]
42 (13) MT(better): Perhaps this is one of the most important differences between the images of the proposed regimes. [sent-70, score-0.109]
43 But some genres favor “perhaps” more or less strongly. [sent-72, score-0.228]
44 Thus, both translations (12) and (13) are good, but the latter uses a slightly more informal register appropriate to the genre. [sent-73, score-0.048]
45 Following Matsoukas et al. (2009), we assign each training sentence pair a set of binary features which we call s-features: [sent-75, score-0.321] [Table 3 residue: t(e | f) vs. ts(e | f) for وﻟﻌﻞ → may across the nw, web, bn, and un genres.]
46 19 Table 3: Different genres have different preferences for word translations. [sent-85, score-0.173]
47 Key: nw = newswire, web = Web, bn = broadcast news, un = United Nations proceedings. [sent-86, score-0.193]
48 • Whether the sentence pair came from a particular genre, for example, newswire or web
• Whether the sentence pair came from a particular collection, for example, FBIS or UN [sent-87, score-0.443]
49 Matsoukas et al. (2009) use these s-features to compute weights for each training sentence pair, which are in turn used for computing various model features. [sent-88, score-0.182]
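Assigning s-features to a sentence pair is then a matter of reading off its training metadata; a tiny sketch in which the metadata keys and feature names are hypothetical:

```python
def s_features(meta):
    """Binary s-features for one training sentence pair: one feature for
    its genre and one for its collection (illustrative naming)."""
    return {"genre=" + meta["genre"], "collection=" + meta["collection"]}
```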
50 They found that the sentence-level weights were most helpful for computing the lexical weighting features (p. [sent-89, score-0.495]
51 The mapping from s-features to sentence weights was chosen to optimize expected TER on held-out data. [sent-92, score-0.139]
52 For each s-feature s, we compute new word translation tables ts(e | f) and ts(f | e) estimated from only those sentence pairs on which s fires, and extend them to phrases/rules as before. [sent-95, score-0.367]
53 The idea is to use these probabilities as new features in the model. [sent-96, score-0.094]
54 However, two challenges arise: first, many word pairs are unseen for a given s, resulting in zero or undefined probabilities; second, this adds many new features for each rule, which requires a lot of space. [sent-97, score-0.094]
55 For each s-feature s, we add two model features −log t̂s(ē | f̄) and −log t̂s(f̄ | ē). [sent-99, score-0.094] [Table 4 residue: string-to-string and string-to-tree results, baseline vs. new features.]
56 Table 4: Our variations on lexical weighting improve translation quality significantly across 16 different test conditions. [sent-107, score-0.574]
57 Improvements are significant at the p < 0.01 level, except where marked with an asterisk (∗), indicating p < 0.05. [sent-109, score-0.037]
58 In order to address the space problem, we use the following heuristic: for any given rule, if the absolute value of one of these features is less than log 2, we discard it for that rule. [sent-111, score-0.151]
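The per-provenance tables and the log 2 pruning heuristic can be sketched as follows. This is an illustrative reading, not the authors' implementation: ts counting mirrors equations (1)–(3) restricted to the pairs where s fires, and any per-rule feature whose absolute value falls below log 2 is discarded:

```python
import math
from collections import defaultdict

def provenance_tables(corpus, s_of_pair):
    """Estimate ts(e | f) separately for each s-feature, counting only the
    sentence pairs on which s fires. `corpus` holds (f_words, e_words,
    alignments) triples; `s_of_pair[k]` is the set of s-features firing on
    the k-th pair."""
    counts = defaultdict(lambda: defaultdict(float))   # counts[s][(f, e)]
    for k, (f_words, e_words, alignments) in enumerate(corpus):
        for s in s_of_pair[k]:
            for i, e in enumerate(e_words):
                a_i = alignments[i]
                if not a_i:
                    counts[s][("NULL", e)] += 1.0
                else:
                    for j in a_i:
                        counts[s][(f_words[j], e)] += 1.0 / len(a_i)
    tables = {}
    for s, cnt in counts.items():
        totals = defaultdict(float)
        for (f, e), c in cnt.items():
            totals[f] += c
        tables[s] = {(f, e): c / totals[f] for (f, e), c in cnt.items()}
    return tables

def prune_rule_features(features, threshold=math.log(2)):
    """Space heuristic from the text: for a given rule, drop any per-s
    feature whose absolute value is less than log 2."""
    return {name: v for name, v in features.items() if abs(v) >= threshold}
```

Pruning near-neutral values keeps only the provenance features that actually move a rule away from the baseline table, which is what makes storing one feature pair per s-feature affordable.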
59 5 Experiments Setup We tested these features on two machine translation systems: a hierarchical phrasebased (string-to-string) system (Chiang, 2005) and a syntax-based (string-to-tree) system (Galley et al. [sent-112, score-0.372]
60 For Arabic-English translation, both systems were trained on 190+220 million words of parallel data; for Chinese-English, the string-to-string system was trained on 240+260 million words of parallel data, and the string-to-tree system, 58+65 million words. [sent-115, score-0.134]
61 The baseline string-to-string system already incorporates some simple provenance features: for each s-feature s, there is a feature P(s | rule). [sent-117, score-0.161]
62 Both baselines also include a variety of other features (Chiang et al. [sent-118, score-0.094]
63 Feature weights were trained with MIRA (Chiang et al., 2008) on a held-out set, then tested on two more sets (Dev and Test) disjoint from the data used for rule extraction and for MIRA training. [sent-124, score-0.125]
64 Individual tests We first tested morphological smoothing using the string-to-string system on Chinese-English translation. [sent-126, score-0.193]
65 The morphologically smoothed system generated the improved translation (8) above, and generally gave a small improvement: [sent-127, score-0.348] [table residue: task Chi-Eng nw; features baseline vs. morphology; Dev scores]
66 We then tested the provenance-conditioned features on both Arabic-English and Chinese-English, again using the string-to-string system: [sent-129, score-0.254] [table residue: tasks Ara-Eng nw and Chi-Eng nw; features baseline vs. provenance; Dev scores]
67 The translations (12) and (13) come from the Arabic-English baseline and provenance systems. [sent-135, score-0.209]
68 For Arabic-English, we also compared against lexical weighting features that use sentence weights kindly provided to us by Matsoukas et al. [sent-136, score-0.575]
69 Our features performed better, although it should be noted that those sentence weights had been optimized for a different translation model. [sent-137, score-0.445]
70 Combined tests Finally, we tested the features across a wider range of tasks. [sent-138, score-0.16]
71 For Chinese-English translation, we combined the morphologically smoothed and provenance-conditioned lexical weighting features; for Arabic-English, we continued to use only the provenance-conditioned features. [sent-139, score-0.302]
72 We tested using both systems, and on both newswire and web genres. [sent-140, score-0.225]
73 The features produce statistically significant improvements across all 16 conditions. [sent-142, score-0.161]
74 Figure 1: Feature weights for provenance-conditioned features: string-to-string, Chinese-English, web versus newswire. [sent-145, score-0.158]
75 The diagonal line indicates where the two weights would be equal relative to the original t(e | f) feature weight. [sent-148, score-0.18]
76 Figure 1 shows the feature weights obtained for the provenance-conditioned features ts(f | e) in the string-to-string Chinese-English system, trained on newswire and web data. [sent-149, score-0.352]
77 On the diagonal are corpora that were equally useful in either genre. [sent-150, score-0.081]
78 Surprisingly, the UN data received strong positive weights, indicating usefulness in both genres. [sent-151, score-0.046]
79 Two lists of named entities received large weights: the LDC list (LDC2005T34) in the positive direction and the NewsExplorer list in the negative direction, suggesting that there are noisy entries in the latter. [sent-152, score-0.046]
80 The corpus LDC2007E08, which contains parallel data mined from comparable corpora (Munteanu and Marcu, 2005), received strong negative weights. [sent-153, score-0.113]
81 Off the diagonal are corpora favored in only one genre or the other: above, we see that the wl (weblog) and ng (newsgroup) genres are more helpful for web translation, as expected (although web oddly seems less helpful), as well as LDC2006G05 (LDC/FBIS/NVTC Parallel Text V2. [sent-154, score-0.392]
82 Below are corpora more helpful for newswire translation, like LDC2005T06 (Chinese News Translation Text Part 1). [sent-156, score-0.1]
83 6 Conclusion Many different approaches to morphology and provenance in machine translation are possible. [sent-157, score-0.429]
84 We have chosen to implement our approach as extensions to lexical weighting (Koehn et al., 2003). [sent-158]
85 For this reason, the features we have introduced should be easily applicable to a wide range of phrase-based, hierarchical phrase-based, and syntax-based systems. [sent-160, score-0.094]
86 While the improvements obtained using them are not enormous, we have demonstrated that they help significantly across many different conditions, and over very strong baselines. [sent-161, score-0.067]
87 We therefore fully expect that these new features would yield similar improvements in other systems as well. [sent-162, score-0.161]
88 Acknowledgements We would like to thank Spyros Matsoukas and colleagues at BBN for providing their sentence-level weights and important insights into their corpusweighting work. [sent-163, score-0.099]
89 Online large-margin training of syntactic and structural translation features. [sent-167, score-0.212]
90 Scalable inference and training of context-rich syntactic translation models. [sent-196, score-0.212]
wordName wordTfidf (topN-words)
[('friend', 0.314), ('weighting', 0.247), ('translation', 0.212), ('partner', 0.209), ('matsoukas', 0.19), ('ivoire', 0.183), ('jiabao', 0.183), ('tm', 0.165), ('provenance', 0.161), ('chiang', 0.142), ('logt', 0.137), ('genres', 0.134), ('friends', 0.131), ('plural', 0.118), ('wen', 0.111), ('prefer', 0.103), ('china', 0.1), ('newswire', 0.1), ('weights', 0.099), ('partners', 0.099), ('ruling', 0.099), ('good', 0.096), ('perhaps', 0.094), ('ts', 0.094), ('features', 0.094), ('cote', 0.092), ('un', 0.091), ('diagonal', 0.081), ('lms', 0.081), ('weaken', 0.081), ('pust', 0.081), ('galley', 0.076), ('said', 0.075), ('smooths', 0.074), ('morphological', 0.074), ('tables', 0.072), ('improvements', 0.067), ('parallel', 0.067), ('porter', 0.066), ('stemmer', 0.066), ('watanabe', 0.066), ('tested', 0.066), ('lm', 0.066), ('images', 0.065), ('dev', 0.064), ('spyros', 0.063), ('munteanu', 0.063), ('tts', 0.063), ('ei', 0.062), ('mt', 0.061), ('variations', 0.06), ('singular', 0.06), ('weblog', 0.059), ('genre', 0.059), ('rule', 0.059), ('web', 0.059), ('bleu', 0.057), ('arabic', 0.057), ('witten', 0.057), ('log', 0.057), ('morphology', 0.056), ('comma', 0.055), ('lexical', 0.055), ('mira', 0.054), ('pair', 0.053), ('smoothing', 0.053), ('deneefe', 0.051), ('fs', 0.05), ('bbn', 0.05), ('came', 0.049), ('koehn', 0.049), ('translations', 0.048), ('ai', 0.047), ('marcu', 0.047), ('cfg', 0.046), ('received', 0.046), ('knight', 0.045), ('differences', 0.044), ('observed', 0.044), ('conditions', 0.044), ('compute', 0.043), ('nw', 0.043), ('morphologically', 0.042), ('crammer', 0.042), ('nfs', 0.04), ('kindly', 0.04), ('fsts', 0.04), ('arabicenglish', 0.04), ('africa', 0.04), ('sentence', 0.04), ('store', 0.04), ('steve', 0.04), ('preferences', 0.039), ('kevin', 0.038), ('nouns', 0.038), ('michel', 0.038), ('incorrectly', 0.037), ('asterisk', 0.037), ('ined', 0.037), ('newsgroup', 0.037)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 313 acl-2011-Two Easy Improvements to Lexical Weighting
Author: David Chiang ; Steve DeNeefe ; Michael Pust
Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.
2 0.15230195 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
Author: Markos Mylonakis ; Khalil Sima'an
Abstract: While it is generally accepted that many translation phenomena are correlated with linguistic structures, employing linguistic syntax for translation has proven a highly non-trivial task. The key assumption behind many approaches is that translation is guided by the source and/or target language parse, employing rules extracted from the parse tree or performing tree transformations. These approaches enforce strict constraints and might overlook important translation phenomena that cross linguistic constituents. We propose a novel flexible modelling approach to introduce linguistic information of varying granularity from the source side. Our method induces joint probability synchronous grammars and estimates their parameters, by select- ing and weighing together linguistically motivated rules according to an objective function directly targeting generalisation over future data. We obtain statistically significant improvements across 4 different language pairs with English as source, mounting up to +1.92 BLEU for Chinese as target.
3 0.14762053 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation
Author: Lane Schwartz ; Chris Callison-Burch ; William Schuler ; Stephen Wu
Abstract: This paper describes a novel technique for incorporating syntactic knowledge into phrasebased machine translation through incremental syntactic parsing. Bottom-up and topdown parsers typically require a completed string as input. This requirement makes it difficult to incorporate them into phrase-based translation, which generates partial hypothesized translations from left-to-right. Incremental syntactic language models score sentences in a similar left-to-right fashion, and are therefore a good mechanism for incorporat- ing syntax into phrase-based translation. We give a formal definition of one such lineartime syntactic language model, detail its relation to phrase-based decoding, and integrate the model with the Moses phrase-based translation system. We present empirical results on a constrained Urdu-English translation task that demonstrate a significant BLEU score improvement and a large decrease in perplexity.
4 0.14519374 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach
Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith
Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.
5 0.13138732 90 acl-2011-Crowdsourcing Translation: Professional Quality from Non-Professionals
Author: Omar F. Zaidan ; Chris Callison-Burch
Abstract: Naively collecting translations by crowdsourcing the task to non-professional translators yields disfluent, low-quality results if no quality control is exercised. We demonstrate a variety of mechanisms that increase the translation quality to near professional levels. Specifically, we solicit redundant translations and edits to them, and automatically select the best output among them. We propose a set of features that model both the translations and the translators, such as country of residence, LM perplexity of the translation, edit rate from the other translations, and (optionally) calibration against professional translators. Using these features to score the collected translations, we are able to discriminate between acceptable and unacceptable translations. We recreate the NIST 2009 Urdu-toEnglish evaluation set with Mechanical Turk, and quantitatively show that our models are able to select translations within the range of quality that we expect from professional trans- lators. The total cost is more than an order of magnitude lower than professional translation.
6 0.1311442 44 acl-2011-An exponential translation model for target language morphology
7 0.13106127 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation
8 0.12909354 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction
9 0.12805501 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
10 0.12790175 256 acl-2011-Query Weighting for Ranking Model Adaptation
11 0.12546876 152 acl-2011-How Much Can We Gain from Supervised Word Alignment?
12 0.12146467 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words
13 0.12131053 259 acl-2011-Rare Word Translation Extraction from Aligned Comparable Documents
14 0.11831298 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers
15 0.11827252 100 acl-2011-Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation
16 0.11637684 146 acl-2011-Goodness: A Method for Measuring Machine Translation Confidence
17 0.11201399 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation
18 0.10885396 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules
19 0.10846673 177 acl-2011-Interactive Group Suggesting for Twitter
20 0.10671363 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation
topicId topicWeight
[(0, 0.254), (1, -0.145), (2, 0.121), (3, 0.085), (4, 0.02), (5, 0.014), (6, 0.038), (7, -0.058), (8, 0.051), (9, 0.066), (10, -0.022), (11, -0.064), (12, -0.051), (13, -0.065), (14, 0.074), (15, -0.05), (16, -0.033), (17, 0.029), (18, -0.034), (19, -0.042), (20, 0.017), (21, -0.028), (22, 0.043), (23, 0.03), (24, -0.036), (25, 0.002), (26, -0.002), (27, -0.014), (28, 0.03), (29, 0.007), (30, -0.053), (31, -0.017), (32, -0.055), (33, 0.018), (34, -0.006), (35, 0.017), (36, 0.049), (37, -0.081), (38, 0.062), (39, 0.027), (40, -0.044), (41, 0.017), (42, 0.085), (43, -0.045), (44, -0.039), (45, 0.084), (46, 0.009), (47, -0.003), (48, 0.009), (49, -0.0)]
simIndex simValue paperId paperTitle
same-paper 1 0.95528275 313 acl-2011-Two Easy Improvements to Lexical Weighting
Author: David Chiang ; Steve DeNeefe ; Michael Pust
Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.
2 0.82600778 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach
Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith
Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.
3 0.78764766 146 acl-2011-Goodness: A Method for Measuring Machine Translation Confidence
Author: Nguyen Bach ; Fei Huang ; Yaser Al-Onaizan
Abstract: State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely translated correctly and how confident it is about the whole sentence. We propose a novel framework to predict wordlevel and sentence-level MT errors with a large number of novel features. Experimental results show that the MT error prediction accuracy is increased from 69.1 to 72.2 in F-score. The Pearson correlation between the proposed confidence measure and the human-targeted translation edit rate (HTER) is 0.6. Improve- ments between 0.4 and 0.9 TER reduction are obtained with the n-best list reranking task using the proposed confidence measure. Also, we present a visualization prototype of MT errors at the word and sentence levels with the objective to improve post-editor productivity.
4 0.78154373 90 acl-2011-Crowdsourcing Translation: Professional Quality from Non-Professionals
Author: Omar F. Zaidan ; Chris Callison-Burch
Abstract: Naively collecting translations by crowdsourcing the task to non-professional translators yields disfluent, low-quality results if no quality control is exercised. We demonstrate a variety of mechanisms that increase the translation quality to near professional levels. Specifically, we solicit redundant translations and edits to them, and automatically select the best output among them. We propose a set of features that model both the translations and the translators, such as country of residence, LM perplexity of the translation, edit rate from the other translations, and (optionally) calibration against professional translators. Using these features to score the collected translations, we are able to discriminate between acceptable and unacceptable translations. We recreate the NIST 2009 Urdu-toEnglish evaluation set with Mechanical Turk, and quantitatively show that our models are able to select translations within the range of quality that we expect from professional trans- lators. The total cost is more than an order of magnitude lower than professional translation.
5 0.78139752 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation
Author: Sankaranarayanan Ananthakrishnan ; Rohit Prasad ; Prem Natarajan
Abstract: The language model (LM) is a critical component in most statistical machine translation (SMT) systems, serving to establish a probability distribution over the hypothesis space. Most SMT systems use a static LM, independent of the source language input. While previous work has shown that adapting LMs based on the input improves SMT performance, none of the techniques has thus far been shown to be feasible for on-line systems. In this paper, we develop a novel measure of cross-lingual similarity for biasing the LM based on the test input. We also illustrate an efficient on-line implementation that supports integration with on-line SMT systems by transferring much of the computational load off-line. Our approach yields significant reductions in target perplexity compared to the static LM, as well as consistent improvements in SMT performance across language pairs (English-Dari and English-Pashto).
6 0.76523894 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction
7 0.75843519 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers
8 0.75317234 44 acl-2011-An exponential translation model for target language morphology
9 0.74519074 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation
10 0.72699368 310 acl-2011-Translating from Morphologically Complex Languages: A Paraphrase-Based Approach
11 0.72416556 100 acl-2011-Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation
12 0.72136974 60 acl-2011-Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability
13 0.7176519 151 acl-2011-Hindi to Punjabi Machine Translation System
14 0.70232761 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words
15 0.70051938 247 acl-2011-Pre- and Postprocessing for Statistical Machine Translation into Germanic Languages
16 0.68400937 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
17 0.66963041 2 acl-2011-AM-FM: A Semantic Framework for Translation Quality Assessment
18 0.64670968 240 acl-2011-ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation
19 0.64207017 259 acl-2011-Rare Word Translation Extraction from Aligned Comparable Documents
20 0.63144791 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
topicId topicWeight
[(17, 0.065), (26, 0.035), (37, 0.085), (39, 0.045), (41, 0.023), (53, 0.012), (55, 0.036), (59, 0.025), (72, 0.021), (91, 0.434), (96, 0.144)]
simIndex simValue paperId paperTitle
1 0.92372471 80 acl-2011-ConsentCanvas: Automatic Texturing for Improved Readability in End-User License Agreements
Author: Oliver Schneider ; Alex Garnett
Abstract: We present ConsentCanvas, a system which structures and “texturizes” End-User License Agreement (EULA) documents to be more readable. The system aims to help users better understand the terms under which they are providing their informed consent. ConsentCanvas receives unstructured text documents as input and uses unsupervised natural language processing methods to embellish the source document using a linked stylesheet. Unlike similar usable security projects which employ summarization techniques, our system preserves the contents of the source document, minimizing the cognitive and legal burden for both the end user and the licensor. Our system does not require a corpus for training. 1
2 0.90777493 140 acl-2011-Fully Unsupervised Word Segmentation with BVE and MDL
Author: Daniel Hewlett ; Paul Cohen
Abstract: Several results in the word segmentation literature suggest that description length provides a useful estimate of segmentation quality in fully unsupervised settings. However, since the space of potential segmentations grows exponentially with the length of the corpus, no tractable algorithm follows directly from the Minimum Description Length (MDL) principle. Therefore, it is necessary to generate a set of candidate segmentations and select between them according to the MDL principle. We evaluate several algorithms for generating these candidate segmentations on a range of natural language corpora, and show that the Bootstrapped Voting Experts algorithm consistently outperforms other methods when paired with MDL.
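The abstract above relies on a two-part description length: the cost of encoding the lexicon plus the cost of encoding the corpus given that lexicon. A minimal sketch of such a score (an illustrative two-part code with empirical unigram and character models, not the paper's exact formulation):

```python
import math
from collections import Counter

def description_length(segmented_corpus):
    """Two-part MDL score, in bits, for a candidate segmentation.

    segmented_corpus: the corpus after segmentation, as a list of
    word tokens. Lower is better under the MDL principle.
    """
    counts = Counter(segmented_corpus)
    n = sum(counts.values())
    # Corpus cost: tokens coded with their empirical unigram probabilities.
    corpus_bits = -sum(c * math.log2(c / n) for c in counts.values())
    # Lexicon cost: characters of each unique word, under a character model.
    chars = Counter(ch for w in counts for ch in w)
    m = sum(chars.values())
    lexicon_bits = -sum(c * math.log2(c / m) for c in chars.values())
    return lexicon_bits + corpus_bits
```

A candidate-generation algorithm (here, Bootstrapped Voting Experts) proposes segmentations, and this score selects between them: a segmentation that reuses whole words compresses the corpus better than one of single characters.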
3 0.90041935 148 acl-2011-HITS-based Seed Selection and Stop List Construction for Bootstrapping
Author: Tetsuo Kiso ; Masashi Shimbo ; Mamoru Komachi ; Yuji Matsumoto
Abstract: In bootstrapping (seed set expansion), selecting good seeds and creating stop lists are two effective ways to reduce semantic drift, but these methods generally need human supervision. In this paper, we propose a graphbased approach to helping editors choose effective seeds and stop list instances, applicable to Pantel and Pennacchiotti’s Espresso bootstrapping algorithm. The idea is to select seeds and create a stop list using the rankings of instances and patterns computed by Kleinberg’s HITS algorithm. Experimental results on a variation of the lexical sample task show the effectiveness of our method.
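The abstract above ranks instances and patterns with Kleinberg's HITS algorithm on their co-occurrence graph. A minimal sketch of HITS power iteration on a bipartite instance-pattern matrix (the matrix layout and iteration count are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def hits(M, iters=50):
    """Kleinberg's HITS on a bipartite instance-pattern graph.

    M[i, j] > 0 iff instance i co-occurs with pattern j.
    Returns (instance scores, pattern scores): mutually reinforcing
    hub/authority scores, normalized to unit length.
    """
    inst = np.ones(M.shape[0])
    pat = np.ones(M.shape[1])
    for _ in range(iters):
        pat = M.T @ inst            # patterns endorsed by good instances
        pat /= np.linalg.norm(pat)
        inst = M @ pat              # instances endorsed by good patterns
        inst /= np.linalg.norm(inst)
    return inst, pat
```

High-ranked instances would serve as seed candidates; low-ranked, drift-prone ones as stop-list candidates.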
4 0.87877923 79 acl-2011-Confidence Driven Unsupervised Semantic Parsing
Author: Dan Goldwasser ; Roi Reichart ; James Clarke ; Dan Roth
Abstract: Current approaches for semantic parsing take a supervised approach requiring a considerable amount of training data which is expensive and difficult to obtain. This supervision bottleneck is one of the major difficulties in scaling up semantic parsing. We argue that a semantic parser can be trained effectively without annotated data, and introduce an unsupervised learning algorithm. The algorithm takes a self training approach driven by confidence estimation. Evaluated over Geoquery, a standard dataset for this task, our system achieved 66% accuracy, compared to 80% of its fully supervised counterpart, demonstrating the promise of unsupervised approaches for this task.
same-paper 5 0.84402114 313 acl-2011-Two Easy Improvements to Lexical Weighting
Author: David Chiang ; Steve DeNeefe ; Michael Pust
Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.
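The baseline these improvements build on is the word translation table t(e | f) defined by equations (1)-(3) above: each English word's count is split evenly over the source positions it is aligned to, unaligned English words count against NULL, and counts are normalized per source word. A minimal sketch of that baseline table (the corpus encoding is an illustrative assumption):

```python
from collections import defaultdict

NULL = None  # stand-in for the empty source word

def word_translation_table(corpus):
    """Baseline lexical-weighting table t(e | f) of Koehn et al. (2003).

    corpus: iterable of (f_words, e_words, align) triples, where
    align[i] is the (possibly empty) set of source positions that
    e_words[i] is aligned to.
    """
    c = defaultdict(float)                   # fractional counts c(f, e)
    for f_words, e_words, align in corpus:
        for i, e in enumerate(e_words):
            a = align[i]
            if not a:                        # |a_i| = 0: count with NULL
                c[(NULL, e)] += 1.0
            else:                            # split 1/|a_i| over links
                for j in a:
                    c[(f_words[j], e)] += 1.0 / len(a)
    totals = defaultdict(float)
    for (f, _e), v in c.items():
        totals[f] += v
    return {(f, e): v / totals[f] for (f, e), v in c.items()}
```

The paper's two variations then smooth these counts with simplified English morphology and condition them on the kind of training data the pair co-occurred in.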
6 0.73349416 108 acl-2011-EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar
7 0.64814615 145 acl-2011-Good Seed Makes a Good Crop: Accelerating Active Learning Using Language Modeling
8 0.62977779 262 acl-2011-Relation Guided Bootstrapping of Semantic Lexicons
9 0.62755942 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters
10 0.6078527 239 acl-2011-P11-5002 k2opt.pdf
11 0.60342675 200 acl-2011-Learning Dependency-Based Compositional Semantics
12 0.60254616 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation
13 0.59207648 304 acl-2011-Together We Can: Bilingual Bootstrapping for WSD
14 0.58813775 177 acl-2011-Interactive Group Suggesting for Twitter
15 0.57657456 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning
16 0.55367577 284 acl-2011-Simple Unsupervised Grammar Induction from Raw Text with Cascaded Finite State Models
17 0.55238998 117 acl-2011-Entity Set Expansion using Topic information
18 0.55000883 258 acl-2011-Ranking Class Labels Using Query Sessions
19 0.54771382 222 acl-2011-Model-Portability Experiments for Textual Temporal Analysis
20 0.5456351 147 acl-2011-Grammatical Error Correction with Alternating Structure Optimization