acl acl2011 acl2011-81 knowledge-graph by maker-knowledge-mining

81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach


Source: pdf

Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith

Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. [sent-5, score-0.308]

2 We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. [sent-7, score-0.497]

3 Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. [sent-8, score-1.222]

4 We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. [sent-9, score-0.287]

5 1 Introduction: Translation consistency is an important factor for large-scale translation, especially for domain-specific translations in an industrial environment. [sent-12, score-0.267]

6 For example, in the translation of technical documents, lexical as well as structural consistency is essential to produce a fluent target-language sentence. [sent-13, score-0.378]

7 Moreover, even in the case of translation errors, consistency in the errors (e.g. [sent-14, score-0.378]

8 In phrase-based SMT, translation models and language models are automatically learned and/or generalised from the training data, and a translation is produced by maximising a weighted combination of these models. [sent-18, score-0.546]

9 On the other hand, TM systems, widely used by translators in industrial environments for enterprise localisation, can shed some light on mitigating this limitation. [sent-20, score-0.308]

10 TM systems can assist translators by retrieving and displaying previously translated similar ‘example’ sentences (displayed as source-target pairs, widely called ‘fuzzy matches’ in the localisation industry (Sikes, 2007)). [sent-21, score-0.3]

11 In TM systems, fuzzy matches are retrieved by calculating the similarity or the so-called ‘fuzzy match score’ (ranging from 0 to 1 with 0 indicating no matches and 1 indicating a full match) between the input sentence and sentences in the source side of the translation memory. [sent-22, score-1.625]
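
To make this concrete, here is a minimal sketch of how such a score and the corresponding TM retrieval might be computed, assuming a plain token-level Levenshtein distance normalised by the longer sentence; commercial TM products use proprietary variants, and all names below are illustrative, not the authors' code.

```python
def edit_distance(a, b):
    """Token-level Levenshtein distance between two sentences (lists of tokens)."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

def fuzzy_match_score(src, tm_src):
    """Similarity in [0, 1]: 1 indicates a full match, 0 indicates no match."""
    if not src and not tm_src:
        return 1.0
    return 1.0 - edit_distance(src, tm_src) / max(len(src), len(tm_src))

def retrieve_best_match(src, tm):
    """Return the (source, target) TM pair whose source side is most similar."""
    return max(tm, key=lambda pair: fuzzy_match_score(src, pair[0]))
```

A call such as retrieve_best_match(input_tokens, tm_pairs) then plays the role of the fuzzy-match lookup described above.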

12 When presented with fuzzy matches, translators can then avail of useful chunks in previous translations while composing the translation of a new sentence. [sent-23, score-1.216]

13 Previous research (discussed in Section 2) has focused on using fuzzy match score as a threshold when using the target side of the fuzzy matches to constrain the translation of the input sentence. [sent-27, score-2.195]

14 In our approach, we use a more fine-grained discriminative learning method to determine whether the target side of the fuzzy matches should be used as a constraint in translating the input sentence. [sent-28, score-1.06]

15 We demonstrate that our method can consistently improve translation quality. [sent-29, score-0.26]

16 We present our discriminative learning method for consistent translation in Section 3 and our feature design in Section 4. [sent-31, score-0.366]

17 , 2010), or simply using fuzzy match score or MT confidence measures (Specia et al. [sent-35, score-1.015]

18 If matched chunks between the input sentence and the fuzzy matches can be detected, we can directly re-use the corresponding parts of the translation in the fuzzy matches, and use an MT system to translate the remaining chunks. [sent-39, score-2.104]

19 In fact, implementing this idea is straightforward: a TM system can easily detect the word alignment between the input sentence and the source side of the fuzzy match by retracing the paths used in calculating the fuzzy match score. [sent-40, score-2.045]

20 To obtain the translation for the matched chunks, we just require the word alignment between source and target TM matches, which can be addressed using state-of-the-art word alignment techniques. [sent-41, score-0.453]

21 More importantly, albeit not explicitly spelled out in previous work, this method can potentially increase the consistency of translation, as the translation of new input sentences is closely informed and guided (or constrained) by previously translated sentences. [sent-42, score-0.565]

22 It is worth mentioning that translation consistency was not explicitly regarded as their primary motivation in this previous work. [sent-44, score-0.378]

23 However, categorically reusing the translations of matched chunks without any differentiation could generate inferior translations, given that the context of these matched chunks in the input sentence could be completely different from the source side of the fuzzy match. [sent-46, score-1.545]

24 To address this problem, both Koehn and Senellart (2010) and Zhechev and van Genabith (2010) used fuzzy match score as a threshold to determine whether to reuse the translations of the matched chunks. [sent-47, score-1.229]

25 Despite being an informative measure, using fuzzy match score as a threshold has a number of limitations. [sent-49, score-0.935]

26 Given the fact that fuzzy match score is normally calculated based on Edit Distance (Levenshtein, 1966), a low score does not necessarily imply that the fuzzy match is harmful when used to constrain an input sentence. [sent-50, score-2.054]

27 For example, in longer sentences where fuzzy match scores tend to be low, some chunks and the corresponding translations within the sentences can still be useful. [sent-51, score-1.156]

28 1 Formulation of the Problem: Given a sentence e to translate, we retrieve the most similar sentence e′ from the translation memory, together with its associated target translation f′. [sent-55, score-0.665]

29 This process can derive a number of “phrase pairs” <ēm, f̄′m>, which can be used to specify the translations of the matched phrases in the input sentence. [sent-58, score-0.349]

30 For example, given an input sentence e1e2 ··· eI and a phrase pair <ē, f̄′>, ē = ei ei+1, f̄′ = f′j f′j+1, derived from the fuzzy match, we can mark up the input sentence as: e1e2 ··· <tm translation="f′j f′j+1"> ei ei+1 </tm> ··· eI. [sent-60, score-1.148]
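
A minimal sketch of this markup step is given below. It assumes the matched spans and the TM word alignment are already available and, for brevity, that matched input positions coincide with TM source positions (in general the fuzzy-match alignment maps between them); the <tm translation="..."> tag syntax follows the convention above, and the function names are illustrative.

```python
def mark_up(sentence, matched_spans, alignment, tm_target):
    """Wrap matched input spans in <tm translation="..."> tags.

    sentence:      input tokens e_1 ... e_I
    matched_spans: list of (start, end) token spans matched against the TM source
    alignment:     set of (src_idx, tgt_idx) links between TM source and target
    tm_target:     target-side tokens f'_1 ... f'_J of the fuzzy match
    """
    replacements = {}
    for start, end in matched_spans:
        tgt_idx = sorted({j for (i, j) in alignment if start <= i < end})
        if not tgt_idx:
            continue  # no aligned target words: leave the span to the SMT system
        replacements[start] = (end, " ".join(tm_target[j] for j in tgt_idx))

    out, i = [], 0
    while i < len(sentence):
        if i in replacements:
            end, translation = replacements[i]
            out.append('<tm translation="%s"> %s </tm>'
                       % (translation, " ".join(sentence[i:end])))
            i = end
        else:
            out.append(sentence[i])
            i += 1
    return " ".join(out)
```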

31 Our method to constrain the translations using TM fuzzy matches is similar to (Koehn and Senellart, 2010), except that the word alignment between e′ and f′ is the intersection of bidirectional GIZA++ (Och and Ney, 2003) posterior alignments. [sent-61, score-0.976]

32 2 Discriminative Learning: Whether the translation information from the fuzzy matches should be used or not (i.e. [sent-64, score-1.042]

33 whether the input sentence should be marked up) is determined using a discriminative learning procedure. [sent-66, score-0.312]

34 The translation information refers to the “phrase pairs” derived using the method described in Section 3. [sent-67, score-0.291]

35 The SVM classifier will thus be able to predict the usefulness of the TM fuzzy match, and determine whether the input sentence should be marked up using relevant phrase pairs derived from the fuzzy match before sending it to the SMT system for translation. [sent-80, score-2.014]

36 The classifier uses features such as the fuzzy match score, the phrase and lexical translation probabilities of these relevant phrase pairs, and additional syntactic dependency features. [sent-81, score-1.389]
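
A minimal sketch of such a classifier is shown below, using scikit-learn's SVC as a stand-in for LIBSVM: an RBF kernel with Platt-scaled probability outputs. The feature layout and all names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

def feature_vector(fm_score, phrase_pair_feats, in_phrase_table, dep_feats):
    """One instance for the markup classifier.

    fm_score:          fuzzy match score in [0, 1]
    phrase_pair_feats: per-pair [p(f|e), p(e|f), p_lex(f|e), p_lex(e|f)],
                       averaged when several pairs mark up one sentence
    in_phrase_table:   1 if at least one marked-up pair is in the phrase table
    dep_feats:         dependency coverage/position/consistency features
    """
    avg_trans = np.mean(phrase_pair_feats, axis=0)
    return np.concatenate(([fm_score], avg_trans, [in_phrase_table], dep_feats))

# X: one feature vector per sentence; y: 1 if marking up improved the
# translation (lower TER) under cross-fold translation, else 0.
clf = SVC(kernel="rbf", probability=True)  # Platt scaling for probabilities
# clf.fit(X, y)

def should_mark_up(clf, x, threshold=0.5):
    """Mark up the input sentence only if the classifier is confident enough."""
    return clf.predict_proba([x])[0, 1] >= threshold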

37 Ideally, the classifier will decide to mark up the input sentence if the translations of the marked phrases are accurate when contextual information is taken into account. [sent-82, score-0.407]

38 1 The TM Feature: The TM feature is the fuzzy match score, which indicates the overall similarity between the input sentence and the source side of the TM output. [sent-100, score-1.128]

39 If the input sentence is similar to the source side of the matching segment, it is more likely that the matching segment can be used to mark up the input sentence. [sent-101, score-0.384]

40 The calculation of the fuzzy match score itself is one of the core technologies in TM systems, and varies among different vendors. [sent-102, score-0.935]

41 We compute the fuzzy match cost as the minimum Edit Distance (Levenshtein, 1966) between the source and the TM entry, normalised by the length of the source as in (6), as most of the current implementations are based on edit distance while allowing some additional flexible matching. [sent-103, score-0.96]

42 For fuzzy match scores F, h_fm roughly corresponds to 1 − F. [sent-105, score-0.945]

43 2 Translation Features: We use four features related to translation probabilities, i.e. [sent-107, score-0.285]

44 the phrase translation and lexical probabilities for the phrase pairs <ēm, f̄′m> derived using the method in Section 3. [sent-109, score-0.468]

45 Specifically, we use the phrase translation probabilities p(f̄′m | ēm) and p(ēm | f̄′m), as well as the lexical translation probabilities plex(f̄′m | ēm) and plex(ēm | f̄′m), as calculated in (Koehn et al. [sent-111, score-0.4]

46 In cases where multiple phrase pairs are used to mark up a single input sentence e, we use a unified score for each of the four features, which is an average over the corresponding feature in each phrase pair. [sent-113, score-0.418]
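
A small sketch of this averaging step, with assumed feature names:

```python
import numpy as np

def unified_translation_features(phrase_pairs):
    """Average the four translation features over all phrase pairs
    marking up one input sentence.

    phrase_pairs: list of dicts with keys
        'p_f_given_e', 'p_e_given_f', 'plex_f_given_e', 'plex_e_given_f'
    """
    keys = ("p_f_given_e", "p_e_given_f", "plex_f_given_e", "plex_e_given_f")
    return {k: float(np.mean([pp[k] for pp in phrase_pairs])) for k in keys}
```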

47 The intuition behind these features is as follows: phrase pairs <ēm, f̄′m> derived from the fuzzy match should also be reliable with respect to statistically produced models. [sent-114, score-1.045]

48 whether the phrase table contains at least one phrase pair <ēm, f̄′m> that is used to mark up the input sentence. [sent-119, score-0.311]

49 3 Dependency Features: Given the phrase pairs <ēm, f̄′m> derived from the fuzzy match and used to translate the corresponding chunks of the input sentence (cf. [sent-121, score-1.1]

50 1), these translations are more likely to be coherent in the context of the particular input sentence if the matched parts on the input side are syntactically and semantically related. [sent-123, score-0.539]

51 For matched phrases ēm between the input sentence and the source side of the fuzzy match, we define the contextual information of the input side using dependency relations between the words em in ēm and the remaining words ej in the input sentence e. [sent-124, score-1.65]

52 The dependency features designed to capture the context of the matched input phrases ēm are as follows: Coverage features measure the coverage of dependency labels on the input sentence in order to obtain a bigger picture of the matched parts in the input. [sent-127, score-0.704]

53 For each dependency label L, we consider its head or modifier as covered if the corresponding input word em is covered by a matched phrase ēm. [sent-128, score-0.393]

54 Position features identify whether the head and the tail of a sentence are matched, as these are the cases in which the matched translation is not affected by the preceding words (when it is the head) or following words (when it is the tail), and is therefore more reliable. [sent-130, score-0.496]

55 The consistency feature is a single feature which determines whether the matched phrases ēm belong to a consistent dependency structure, instead of being distributed discontinuously in the input sentence. [sent-134, score-0.457]
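
The following sketch reconstructs these three feature groups under simplifying assumptions: the parse is given as (head, label, modifier) index triples, and the consistency check uses simple contiguity as a stand-in for the paper's dependency-structure test; all names are illustrative.

```python
def dependency_features(n_tokens, deps, matched_spans, labels):
    """Coverage, position, and consistency features for matched input phrases.

    n_tokens:      length of the input sentence
    deps:          list of (head_idx, label, mod_idx) triples
    matched_spans: list of (start, end) spans covered by TM phrase pairs
    labels:        dependency labels to track coverage for
    """
    covered = set()
    for start, end in matched_spans:
        covered.update(range(start, end))

    # Coverage: for each label, is its head or modifier inside a matched phrase?
    coverage = {lab: 0.0 for lab in labels}
    for head, lab, mod in deps:
        if lab in coverage and (head in covered or mod in covered):
            coverage[lab] = 1.0

    # Position: are the head (first word) and tail (last word) matched?
    head_matched = float(0 in covered)
    tail_matched = float(n_tokens - 1 in covered)

    # Consistency: do the matched words form one contiguous block rather than
    # being scattered discontinuously across the sentence?
    consistent = float(bool(covered) and
                       max(covered) - min(covered) + 1 == len(covered))

    return coverage, head_matched, tail_matched, consistent
```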

56 1 Experimental Setup: Our data set is an English–Chinese translation memory with technical translation from Symantec, consisting of 87K sentence pairs. [sent-138, score-0.615]

57 The composition of test subsets based on fuzzy match scores is shown in Table 2. [sent-142, score-0.908]

58 Table 2: Composition of test subsets based on fuzzy match scores. Training and validation is on the same training sentences as the SMT system, with 5-fold cross-validation. [sent-168, score-0.96]

59 5% of the input sentences, our MT system produces the same translation irrespective of whether the input sentence is marked up or not. [sent-182, score-0.603]

60 Let A be the set of predicted markup input sentences, and B be the set of input sentences where the markup version has a lower TER score than the plain version. [sent-185, score-0.618]
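
Under this definition, markup precision and recall follow directly from the two sets; a small sketch (sentences identified by id, names assumed):

```python
def markup_precision_recall(predicted, improved):
    """predicted: set A of sentence ids the classifier marked up;
    improved:  set B of sentence ids where the marked-up version
               scores a lower (better) TER than the plain version."""
    correct = predicted & improved
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(improved) if improved else 0.0
    return precision, recall
```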

61 3 Cross-fold translation: In order to obtain training samples for the classifier, we need to label each sentence in the SMT training data as to whether marking up the sentence can produce better translations. [sent-187, score-0.469]
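
A hedged sketch of this cross-fold labelling loop follows; train_smt, translate, and ter stand in for the SMT training pipeline, the decoder, and the TER metric, and the 5-fold split ensures no sentence is labelled by a system trained on it. All names are assumptions.

```python
def crossfold_labels(corpus, train_smt, translate, ter, k=5):
    """Label each (marked, plain, reference) training triple with 1 if the
    marked-up input yields a lower TER than the plain input, else 0.

    corpus:    list of (marked_src, plain_src, reference) triples
    train_smt: callable training an SMT system on a list of triples
    translate: callable (system, src) -> hypothesis
    ter:       callable (hypothesis, reference) -> edit rate
    """
    folds = [corpus[i::k] for i in range(k)]
    labels = []
    for i, fold in enumerate(folds):
        held_out = [ex for j, f in enumerate(folds) if j != i for ex in f]
        system = train_smt(held_out)  # system never sees the fold it labels
        for marked, plain, ref in fold:
            hyp_marked = translate(system, marked)
            hyp_plain = translate(system, plain)
            labels.append(int(ter(hyp_marked, ref) < ter(hyp_plain, ref)))
    return labels
```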

62 1 Translation Results: Table 3 contains the translation results of the SMT system when we use discriminative learning to mark up the input sentence (MARKUP-DL). [sent-198, score-0.528]

63 Table 3: Performance of Discriminative Learning (%). We mark up input sentences using phrase pairs derived from fuzzy matches. [sent-211, score-0.873]

64 Our discriminative learning method (MARKUP-DL), which automatically classifies whether an input sentence should be marked up, leads to an increase of 0. [sent-217, score-0.312]

65 Despite there being much room for further improvement when compared to the Oracle score, the discriminative learning method appears to be effective not only in maintaining translation consistency, but also in achieving a statistically significant improvement in translation quality. [sent-220, score-0.339]

66 Table 4 shows the classification and translation results when we use different confidence thresholds. [sent-227, score-0.41]

67 The default confidence threshold is 0.50, and the corresponding translation results were described in Section 5. [sent-229, score-0.26]

68 We investigate the impact of increasing classification confidence on the performance of the classifier and the translation results. [sent-232, score-0.445]

69 The fluctuation in classification performance has an impact on the translation results as measured by BLEU and TER. [sent-235, score-0.33]

70 3 Comparison with Previous Work: As discussed in Section 2, both Koehn and Senellart (2010) and Zhechev and van Genabith (2010) used fuzzy match score to determine whether the input sentences should be marked up. [sent-281, score-1.211]

71 The input sentences are only marked up when the fuzzy match score is above a certain threshold. [sent-282, score-1.127]
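
For contrast with the classifier sketched earlier, this baseline reduces to a one-line rule (the cut-off value is illustrative only):

```python
def should_mark_up_baseline(fm_score, threshold=0.8):
    """Previous work: mark up whenever the fuzzy match score clears a fixed cut-off."""
    return fm_score >= threshold
```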

72 Table 5: Performance using fuzzy match score for classification. From the table, we can see an inferior performance compared to the BASELINE results (cf. [sent-299, score-1.038]

73 A modest gain can only be achieved when the fuzzy match score is above 0. [sent-302, score-0.935]

74 This is slightly different from the conclusions drawn in Koehn and Senellart (2010), where gains are observed when the fuzzy match score is above 0. [sent-304, score-0.935]

75 Table 6: Percentage of training sentences with markup vs. without markup, grouped by fuzzy match (FM) score ranges. To further validate our assumption, we analyse the training sentences by grouping them according to their fuzzy match score ranges. [sent-333, score-2.28]

76 We can see that for sentences with fuzzy match scores lower than 0.8, more sentences can be better translated with markup. [sent-336, score-0.947]

77 For sentences where fuzzy match scores fall within the higher ranges, however, [sent-337, score-1.024]

78 surprisingly, actually more sentences receive better translations without markup. [sent-343, score-0.299]

79 This indicates that fuzzy match score is not a good measure to predict whether fuzzy matches are beneficial when used to constrain the translation of an input sentence. [sent-344, score-2.166]

80 These examples also reflect whether translation consistency can be captured using syntactic knowledge. [sent-353, score-0.548]

81 Table 7: Contribution of Features (%). The classification and translation results using different features are reported in Table 7. [sent-362, score-0.285]

82 We observe a significant improvement in both classification precision and recall by adding dependency (DEP) features on top of TM and translation features. [sent-363, score-0.399]

83 As a result, the translation quality also significantly improves. [sent-364, score-0.26]

84 This indicates that dependency features which can capture structural and semantic similarities are effective in gauging the usefulness of the phrase pairs derived from the fuzzy matches. [sent-365, score-0.932]

85 We observe that the improvements can broadly be attributed to two reasons: 1) the use of long phrase pairs which are missing in the phrase table, and 2) deterministically using highly reliable phrase pairs. [sent-370, score-0.276]

86 Differently from his approach, our method directly translates part of the input sentence using fuzzy matches retrieved on the fly, with the rest of the sentence translated by the pre-trained MT system. [sent-374, score-1.058]

87 Example 1 shows translation improvements by using long phrase pairs. [sent-376, score-0.331]

88 Compared to the reference translation, we can see that for the underlined phrase, the translation without markup contains (i) word ordering errors and (ii) a missing right quotation mark. [sent-377, score-0.4]

89 The translation of this relative clause is missing when translating the input without markup. [sent-379, score-0.37]

90 This improvement can be partly attributed to the reduction in search errors by specifying the highly reliable translations for phrases in an input sentence. [sent-380, score-0.278]

91 6 Conclusions and Future Work: In this paper, we introduced a discriminative learning method to tightly integrate fuzzy matches retrieved using translation memory technologies with phrase-based SMT systems to improve translation consistency. [sent-381, score-1.454]

92 We used an SVM classifier to predict whether phrase pairs derived from fuzzy matches could be used to constrain the translation of an input sentence. [sent-382, score-1.403]

93 Experiments demonstrated that discriminative learning is effective in improving translation quality and is more informative than the fuzzy match score used in previous research. [sent-384, score-1.274]

94 We reported a 0.9 absolute improvement in BLEU score using a procedure to promote translation consistency. [sent-386, score-0.312]

95 However, it is worth noting that the level of gains in translation consistency is also dependent on the nature of the TM itself; a self-contained, coherent TM would facilitate consistent translations. [sent-388, score-0.405]

96 In the future, we plan to investigate the impact of TM quality on translation consistency when using our approach. [sent-389, score-0.378]

97 Furthermore, we will explore methods to promote translation consistency at document level. [sent-390, score-0.378]

98 Dynamic translation memory: Using statistical machine translation to improve translation memory. [sent-404, score-0.78]

99 A study of translation edit rate with targeted human annotation. [sent-494, score-0.285]

100 Seeding statistical machine translation with translation memory output through tree-based structural alignment. [sent-506, score-0.565]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('fuzzy', 0.697), ('translation', 0.26), ('tm', 0.217), ('match', 0.186), ('markup', 0.14), ('senellart', 0.128), ('consistency', 0.118), ('genabith', 0.112), ('translations', 0.111), ('input', 0.11), ('zhechev', 0.11), ('smt', 0.105), ('matched', 0.099), ('localisation', 0.092), ('translators', 0.089), ('matches', 0.085), ('confidence', 0.08), ('discriminative', 0.079), ('ter', 0.074), ('phrase', 0.071), ('classification', 0.07), ('em', 0.069), ('mt', 0.068), ('koehn', 0.064), ('bleu', 0.063), ('chunks', 0.059), ('side', 0.059), ('policy', 0.058), ('eiei', 0.055), ('van', 0.054), ('score', 0.052), ('sentence', 0.05), ('constrain', 0.049), ('translate', 0.047), ('minimise', 0.045), ('memory', 0.045), ('dependency', 0.044), ('marked', 0.043), ('industry', 0.042), ('dublin', 0.04), ('sentences', 0.039), ('industrial', 0.038), ('translated', 0.038), ('hfm', 0.037), ('teerr', 0.037), ('yifan', 0.037), ('andy', 0.037), ('platt', 0.035), ('pairs', 0.035), ('classifier', 0.035), ('alignment', 0.034), ('hi', 0.034), ('governor', 0.034), ('summit', 0.033), ('inferior', 0.033), ('tms', 0.032), ('baidu', 0.032), ('categorically', 0.032), ('integrity', 0.032), ('owczarzak', 0.032), ('plex', 0.032), ('tail', 0.032), ('derived', 0.031), ('whether', 0.03), ('specia', 0.03), ('cortes', 0.03), ('simard', 0.03), ('dep', 0.03), ('host', 0.03), ('phrases', 0.029), ('usefulness', 0.029), ('mark', 0.029), ('retrieved', 0.028), ('yanjun', 0.028), ('ontario', 0.028), ('reusing', 0.028), ('attributed', 0.028), ('plain', 0.027), ('marking', 0.027), ('consistent', 0.027), ('svm', 0.027), ('xj', 0.027), ('ottawa', 0.027), ('levenshtein', 0.027), ('xii', 0.027), ('svms', 0.026), ('snover', 0.026), ('training', 0.026), ('och', 0.026), ('source', 0.026), ('pi', 0.025), ('fly', 0.025), ('rbf', 0.025), ('edit', 0.025), ('normally', 0.025), ('scores', 0.025), ('features', 0.025), ('libsvm', 0.024), ('josef', 0.024), ('scan', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.000001 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith

Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.

2 0.18189716 90 acl-2011-Crowdsourcing Translation: Professional Quality from Non-Professionals

Author: Omar F. Zaidan ; Chris Callison-Burch

Abstract: Naively collecting translations by crowdsourcing the task to non-professional translators yields disfluent, low-quality results if no quality control is exercised. We demonstrate a variety of mechanisms that increase the translation quality to near professional levels. Specifically, we solicit redundant translations and edits to them, and automatically select the best output among them. We propose a set of features that model both the translations and the translators, such as country of residence, LM perplexity of the translation, edit rate from the other translations, and (optionally) calibration against professional translators. Using these features to score the collected translations, we are able to discriminate between acceptable and unacceptable translations. We recreate the NIST 2009 Urdu-to-English evaluation set with Mechanical Turk, and quantitatively show that our models are able to select translations within the range of quality that we expect from professional translators. The total cost is more than an order of magnitude lower than professional translation.

3 0.14771135 146 acl-2011-Goodness: A Method for Measuring Machine Translation Confidence

Author: Nguyen Bach ; Fei Huang ; Yaser Al-Onaizan

Abstract: State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely translated correctly and how confident it is about the whole sentence. We propose a novel framework to predict word-level and sentence-level MT errors with a large number of novel features. Experimental results show that the MT error prediction accuracy is increased from 69.1 to 72.2 in F-score. The Pearson correlation between the proposed confidence measure and the human-targeted translation edit rate (HTER) is 0.6. Improvements between 0.4 and 0.9 TER reduction are obtained with the n-best list reranking task using the proposed confidence measure. Also, we present a visualization prototype of MT errors at the word and sentence levels with the objective to improve post-editor productivity.

4 0.14519374 313 acl-2011-Two Easy Improvements to Lexical Weighting

Author: David Chiang ; Steve DeNeefe ; Michael Pust

Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.

5 0.1447247 152 acl-2011-How Much Can We Gain from Supervised Word Alignment?

Author: Jinxi Xu ; Jinying Chen

Abstract: Word alignment is a central problem in statistical machine translation (SMT). In recent years, supervised alignment algorithms, which improve alignment accuracy by mimicking human alignment, have attracted a great deal of attention. The objective of this work is to explore the performance limit of supervised alignment under the current SMT paradigm. Our experiments used a manually aligned Chinese-English corpus with 280K words recently released by the Linguistic Data Consortium (LDC). We treated the human alignment as the oracle of supervised alignment. The result is surprising: the gain of human alignment over a state of the art unsupervised method (GIZA++) is less than 1 point in BLEU. Furthermore, we showed the benefit of improved alignment becomes smaller with more training data, implying the above limit also holds for large training conditions.

6 0.1397388 216 acl-2011-MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility based on semantic roles

7 0.13737266 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

8 0.13717082 49 acl-2011-Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?

9 0.13036165 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

10 0.12960242 2 acl-2011-AM-FM: A Semantic Framework for Translation Quality Assessment

11 0.1270327 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words

12 0.12693593 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules

13 0.124141 247 acl-2011-Pre- and Postprocessing for Statistical Machine Translation into Germanic Languages

14 0.11769862 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation

15 0.11624727 245 acl-2011-Phrase-Based Translation Model for Question Retrieval in Community Question Answer Archives

16 0.11172064 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction

17 0.1116014 259 acl-2011-Rare Word Translation Extraction from Aligned Comparable Documents

18 0.11053101 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation

19 0.1050994 43 acl-2011-An Unsupervised Model for Joint Phrase Alignment and Extraction

20 0.10288899 16 acl-2011-A Joint Sequence Translation Model with Integrated Reordering


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.243), (1, -0.161), (2, 0.112), (3, 0.126), (4, 0.048), (5, 0.041), (6, 0.039), (7, -0.023), (8, 0.056), (9, 0.015), (10, 0.011), (11, -0.103), (12, 0.011), (13, -0.154), (14, -0.04), (15, 0.03), (16, -0.033), (17, -0.021), (18, 0.023), (19, -0.052), (20, -0.017), (21, 0.011), (22, 0.012), (23, -0.012), (24, -0.009), (25, 0.004), (26, -0.015), (27, 0.018), (28, -0.002), (29, 0.029), (30, -0.02), (31, 0.015), (32, -0.031), (33, 0.042), (34, 0.001), (35, 0.034), (36, 0.02), (37, -0.069), (38, 0.041), (39, 0.0), (40, -0.01), (41, 0.022), (42, 0.075), (43, -0.021), (44, -0.054), (45, -0.002), (46, -0.021), (47, -0.037), (48, 0.068), (49, -0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97193581 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith

Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.

2 0.88965517 90 acl-2011-Crowdsourcing Translation: Professional Quality from Non-Professionals

Author: Omar F. Zaidan ; Chris Callison-Burch

Abstract: Naively collecting translations by crowdsourcing the task to non-professional translators yields disfluent, low-quality results if no quality control is exercised. We demonstrate a variety of mechanisms that increase the translation quality to near professional levels. Specifically, we solicit redundant translations and edits to them, and automatically select the best output among them. We propose a set of features that model both the translations and the translators, such as country of residence, LM perplexity of the translation, edit rate from the other translations, and (optionally) calibration against professional translators. Using these features to score the collected translations, we are able to discriminate between acceptable and unacceptable translations. We recreate the NIST 2009 Urdu-to-English evaluation set with Mechanical Turk, and quantitatively show that our models are able to select translations within the range of quality that we expect from professional translators. The total cost is more than an order of magnitude lower than professional translation.

3 0.85908002 313 acl-2011-Two Easy Improvements to Lexical Weighting

Author: David Chiang ; Steve DeNeefe ; Michael Pust

Abstract: We introduce two simple improvements to the lexical weighting features of Koehn, Och, and Marcu (2003) for machine translation: one which smooths the probability of translating word f to word e by simplifying English morphology, and one which conditions it on the kind of training data that f and e co-occurred in. These new variations lead to improvements of up to +0.8 BLEU, with an average improvement of +0.6 BLEU across two language pairs, two genres, and two translation systems.

4 0.838117 146 acl-2011-Goodness: A Method for Measuring Machine Translation Confidence

Author: Nguyen Bach ; Fei Huang ; Yaser Al-Onaizan

Abstract: State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely translated correctly and how confident it is about the whole sentence. We propose a novel framework to predict word-level and sentence-level MT errors with a large number of novel features. Experimental results show that the MT error prediction accuracy is increased from 69.1 to 72.2 in F-score. The Pearson correlation between the proposed confidence measure and the human-targeted translation edit rate (HTER) is 0.6. Improvements between 0.4 and 0.9 TER reduction are obtained with the n-best list reranking task using the proposed confidence measure. Also, we present a visualization prototype of MT errors at the word and sentence levels with the objective to improve post-editor productivity.

5 0.82564145 2 acl-2011-AM-FM: A Semantic Framework for Translation Quality Assessment

Author: Rafael E. Banchs ; Haizhou Li

Abstract: This work introduces AM-FM, a semantic framework for machine translation evaluation. Based upon this framework, a new evaluation metric, which is able to operate without the need for reference translations, is implemented and evaluated. The metric is based on the concepts of adequacy and fluency, which are independently assessed by using a cross-language latent semantic indexing approach and an n-gram based language model approach, respectively. Comparative analyses with conventional evaluation metrics are conducted on two different evaluation tasks (overall quality assessment and comparative ranking) over a large collection of human evaluations involving five European languages. Finally, the main pros and cons of the proposed framework are discussed along with future research directions.

6 0.8154242 60 acl-2011-Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability

7 0.7916249 247 acl-2011-Pre- and Postprocessing for Statistical Machine Translation into Germanic Languages

8 0.78668201 151 acl-2011-Hindi to Punjabi Machine Translation System

9 0.7822929 264 acl-2011-Reordering Metrics for MT

10 0.78079599 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

11 0.77206576 216 acl-2011-MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility based on semantic roles

12 0.75444871 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

13 0.74991769 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation

14 0.73383701 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words

15 0.72092789 49 acl-2011-Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?

16 0.71714795 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

17 0.71257895 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction

18 0.70959991 16 acl-2011-A Joint Sequence Translation Model with Integrated Reordering

19 0.69387865 69 acl-2011-Clause Restructuring For SMT Not Absolutely Helpful

20 0.68246478 100 acl-2011-Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.017), (5, 0.028), (17, 0.058), (26, 0.023), (37, 0.084), (39, 0.044), (41, 0.058), (53, 0.013), (55, 0.038), (59, 0.038), (72, 0.041), (73, 0.177), (91, 0.07), (96, 0.221)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93602192 160 acl-2011-Identifying Sarcasm in Twitter: A Closer Look

Author: Roberto Gonzalez-Ibanez ; Smaranda Muresan ; Nina Wacholder

Abstract: Sarcasm transforms the polarity of an apparently positive or negative utterance into its opposite. We report on a method for constructing a corpus of sarcastic Twitter messages in which determination of the sarcasm of each message has been made by its author. We use this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm. We investigate the impact of lexical and pragmatic factors on machine learning effectiveness for identifying sarcastic utterances and we compare the performance of machine learning techniques and human judges on this task. Perhaps unsurprisingly, neither the human judges nor the machine learning techniques perform very well. 1

2 0.91122639 310 acl-2011-Translating from Morphologically Complex Languages: A Paraphrase-Based Approach

Author: Preslav Nakov ; Hwee Tou Ng

Abstract: We propose a novel approach to translating from a morphologically complex language. Unlike previous research, which has targeted word inflections and concatenations, we focus on the pairwise relationship between morphologically related words, which we treat as potential paraphrases and handle using paraphrasing techniques at the word, phrase, and sentence level. An important advantage of this framework is that it can cope with derivational morphology, which has so far remained largely beyond the capabilities of statistical machine translation systems. Our experiments translating from Malay, whose morphology is mostly derivational, into English show significant improvements over rivaling approaches based on five automatic evaluation measures (for 320,000 sentence pairs; 9.5 million English word tokens).

3 0.89076656 37 acl-2011-An Empirical Evaluation of Data-Driven Paraphrase Generation Techniques

Author: Donald Metzler ; Eduard Hovy ; Chunliang Zhang

Abstract: Paraphrase generation is an important task that has received a great deal of interest recently. Proposed data-driven solutions to the problem have ranged from simple approaches that make minimal use of NLP tools to more complex approaches that rely on numerous language-dependent resources. Despite all of the attention, there have been very few direct empirical evaluations comparing the merits of the different approaches. This paper empirically examines the tradeoffs between simple and sophisticated paraphrase harvesting approaches to help shed light on their strengths and weaknesses. Our evaluation reveals that very simple approaches fare surprisingly well and have a number of distinct advantages, including strong precision, good coverage, and low redundancy.

4 0.88532436 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

Author: Nathan Bodenstab ; Aaron Dunlop ; Keith Hall ; Brian Roark

Abstract: Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log linear model, we learn the optimal beam-search pruning parameters for each CYK chart cell, effectively predicting the most promising areas of the model space to explore. We demonstrate that our method is faster than coarse-to-fine pruning, exemplified in both the Charniak and Berkeley parsers, by empirically comparing our parser to the Berkeley parser using the same grammar and under identical operating conditions.

same-paper 5 0.87654507 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

Author: Yanjun Ma ; Yifan He ; Andy Way ; Josef van Genabith

Abstract: We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Differently from previous research which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that using this method can benefit the SMT system by not only producing consistent translations, but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.

6 0.82318866 318 acl-2011-Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models

7 0.82261992 177 acl-2011-Interactive Group Suggesting for Twitter

8 0.82201481 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation

9 0.82098711 145 acl-2011-Good Seed Makes a Good Crop: Accelerating Active Learning Using Language Modeling

10 0.82070673 117 acl-2011-Entity Set Expansion using Topic information

11 0.82026172 190 acl-2011-Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

12 0.82012522 207 acl-2011-Learning to Win by Reading Manuals in a Monte-Carlo Framework

13 0.81921518 187 acl-2011-Jointly Learning to Extract and Compress

14 0.81921029 137 acl-2011-Fine-Grained Class Label Markup of Search Queries

15 0.81822503 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

16 0.81665778 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters

17 0.81647831 327 acl-2011-Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment

18 0.81510162 175 acl-2011-Integrating history-length interpolation and classes in language modeling

19 0.81467819 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

20 0.81373119 116 acl-2011-Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers