emnlp emnlp2013 emnlp2013-84 knowledge-graph by maker-knowledge-mining

84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation


Source: pdf

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. [sent-4, score-1.129]

2 The features produced by both models are used as soft constraints to guide the translation process. [sent-6, score-0.546]

3 Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets. [sent-7, score-0.718]

4 1 Introduction Hierarchical phrase-based translation models (Chiang, 2007) are widely used in machine translation systems due to their ability to achieve local fluency through phrasal translation and handle nonlocal phrase reordering using synchronous context-free grammars. [sent-8, score-1.739]

5 A large number of previous works have tried to introduce grammaticality to the translation process by incorporating syntactic constraints into hierarchical translation models. [sent-9, score-0.732]

6 Despite some differences in the granularity of syntax units (e.g. [sent-10, score-0.169]

7 , 2008; Chiang, 2010), and extended tags (Zollmann and Venugopal, 2006)), most previous work incorporates syntax into hierarchical translation models by explicitly decorating translation rules with syntactic annotations. [sent-15, score-1.546]

8 These approaches inevitably exacerbate the data sparsity problem and cause other problems such as increased grammar size, worsened derivational ambiguity, and unavoidable parsing errors (Hanneman and Lavie, 2013). [sent-16, score-0.217]

9 In this paper, we propose a factored approach that incorporates soft source syntactic constraints into a hierarchical string-to-dependency translation model (Shen et al. [sent-17, score-1.124]

10 The general ideas are applicable to other hierarchical models as well. [sent-19, score-0.174]

11 Instead of enriching translation rules with explicit syntactic annotations, we keep the original translation rules intact, and factorize the use of source syntactic constraints through two separate models. [sent-20, score-1.545]

12 The first is a syntax mismatch model that introduces source syntax into the nonterminals of translation rules, and measures the degree of syntactic compatibility between a translation rule and the source spans it is applied to during decoding. [sent-21, score-2.238]

13 When a hierarchical translation rule is extracted from a parallel training sentence pair, we determine a tag for each nonterminal based on the dependency parse of the source sentence. [sent-22, score-1.313]

14 Instead of fragmenting rule statistics by directly labeling nonterminals with tags, [sent-23, score-0.32]

15 we keep the original string-to-dependency translation rules intact and associate each nonterminal with a distribution of tags. [sent-25, score-0.918]

16 That distribution is then used to measure the syntactic compatibility between the syntactic context from which the translation rule is extracted and the syntactic analysis of a test sentence. [sent-26, score-1.016]
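
To make the bookkeeping concrete, here is a minimal sketch of how such per-nonterminal tag distributions could be accumulated during rule extraction; the names (observe_rule, FLOAT_TAG) are illustrative assumptions, not the authors' implementation:

```python
from collections import Counter, defaultdict

FLOAT_TAG = "X"  # default tag for spans that are not fixed dependency structures

# (rule_id, nonterminal_index) -> Counter over observed source tags
tag_counts = defaultdict(Counter)

def observe_rule(rule_id, nt_tags):
    """Record the source-side tag of each nonterminal for one rule occurrence."""
    for i, tag in enumerate(nt_tags):
        tag_counts[(rule_id, i)][tag or FLOAT_TAG] += 1

def tag_distribution(rule_id, nt_index):
    """Normalize the counts into the distribution attached to a nonterminal."""
    counts = tag_counts[(rule_id, nt_index)]
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}
```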

17 The second is a syntax-based reordering model that takes advantage of phrasal cohesion in translation (Fox, 2002). [sent-27, score-0.948]

18 The reordering model takes a pair of sibling constituents in the source parse tree as input, and uses source syntactic clues to predict the ordering distribution (straight vs. [sent-28, score-1.274]

19 The resulting ordering distribution is used in the decoder at the word pair level to guide the translation process. [sent-30, score-0.451]
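
As a rough illustration of the straight-vs-inverted prediction (the paper's feature set and learner are not reproduced here; the feature templates below are invented for the sketch):

```python
import math
from dataclasses import dataclass

@dataclass
class Constituent:        # hypothetical container for a source parse node
    label: str            # constituent label, e.g. "NP"
    parent_label: str     # label of the common parent of the sibling pair
    head_pos: str         # POS tag of the constituent's head word

def sibling_features(left: Constituent, right: Constituent) -> dict:
    """Source-syntax clues for a pair of sibling constituents (made-up templates)."""
    return {
        f"labels={left.label}+{right.label}": 1.0,
        f"parent={left.parent_label}": 1.0,
        f"heads={left.head_pos}+{right.head_pos}": 1.0,
    }

def p_straight(weights: dict, left: Constituent, right: Constituent) -> float:
    """P(straight | sibling pair) under a simple logistic model."""
    score = sum(weights.get(f, 0.0) * v
                for f, v in sibling_features(left, right).items())
    return 1.0 / (1.0 + math.exp(-score))
```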

20 This separate reordering model allows us to utilize source syntax to improve reordering in hierarchical translation models without having to explicitly annotate translation rules with source syntax. [sent-31, score-2.629]

21 Our results show that both the syntax mismatch model and the syntax-based reordering model are able to achieve significant gains over a strong Chinese-English MT baseline. [sent-32, score-0.824]

22 Section 3 provides an overview of our baseline string-to-dependency translation system. [sent-35, score-0.359]

23 Section 4 describes the details of the syntax mismatch and syntax-based reordering models. [sent-36, score-0.824]

24 2 Related Work Attempts to use rich syntactic annotations do not always result in improved performance when compared to purely hierarchical models that do not use linguistic guidance. [sent-39, score-0.325]

25 For example, as shown in (Mi and Huang, 2008), tree-to-string translation models (Huang et al. [sent-40, score-0.359]

26 , 2006) only start to outperform purely hierarchical models when significant efforts are made to alleviate parsing errors by using forest-based approaches in both rule extraction and decoding. [sent-41, score-0.335]

27 Using only syntactic phrases is too restrictive in phrasal translation as many useful phrase pairs are not syntactic constituents (Koehn et al. [sent-42, score-0.842]

28 The syntax-augmented translation model of Zollmann and Venugopal (2006) annotates nonterminals in hierarchical rules with thousands of extended syntactic categories in order to capture the syntactic variations of phrase pairs. [sent-44, score-0.988]

29 This results in exacerbated data sparsity problems, partially due to the requirement of exact matches in nonterminal substitutions between translation rules in the derivation. [sent-45, score-0.758]

30 (2009) and Chiang (2010) used soft match features to explicitly model the substitution of nonterminals with different labels; Venugopal et al. [sent-48, score-0.322]

31 (2009) used a preference grammar to soften the syntactic constraints through the use of a preference distribution of syntactic categories; and recently Hanneman and Lavie (2013) proposed a clustering approach to reduce the number of syntactic categories. [sent-49, score-0.55]

32 Our proposed syntax mismatch model associates nonterminals with a distribution of tags. [sent-50, score-0.584]

33 , 2009); however, we use treebank tags and focus on the syntactic compatibility between translation rules and the source sentence. [sent-52, score-1.032]

34 (2010) is most similar to ours, with the main difference being that their syntactic categories are latent and learned automatically in a data-driven fashion, while we simply use treebank tags based on dependency parsing. [sent-54, score-0.333]

35 Marton and Resnik (2008) also exploited soft source syntax constraints without modifying translation rules. [sent-55, score-0.879]

36 However, they focused on the quality of translation spans based on the syntactic analysis of the source sentence, while our method explicitly models the syntactic compatibility between translation rules and source spans. [sent-56, score-1.7]

37 Most research on reordering in machine translation focuses on phrase-based translation models as they are inherently weak at non-local reordering. [sent-57, score-1.253]

38 Previous efforts to improve reordering for phrasebased systems can be largely classified into two categories. [sent-58, score-0.499]

39 Approaches in the first category try to reorder words in the source sentence in a preprocessing step to reduce reordering in both word alignment and MT decoding. [sent-59, score-0.689]

40 The reordering decisions are either made using manual or automatically learned rules (Collins et al. [sent-60, score-0.639]

41 , 2005; Xia and McCord, 2004; Genzel, 2010) based on the syntactic analysis of the source sentence, or constructed through an optimization procedure that uses feature-based reordering models trained on a word-aligned parallel corpus (Tromble and Eisner, 2009; Khapra et al. [sent-61, score-0.814]

42 Approaches in the second category try to explicitly model phrase reordering in the translation process. [sent-63, score-0.973]

43 , 2003) that globally penalizes reordering based on the distortion distance, to lexicalized reordering models (Koehn et al. [sent-65, score-0.998]

44 , 2005; Al-Onaizan and Papineni, 2006) that assign reordering preferences of adjacent phrases for individual phrases, and to hierarchical reordering models (Galley and Manning, 2008; Cherry, 2013) that handle reordering preferences beyond adjacent phrases. [sent-66, score-1.723]

45 Although hierarchical translation models are capable of handling nonlocal reordering, their accuracy is far from perfect. [sent-67, score-0.577]

46 (2009) showed that the syntax-augmented hierarchical model (Zollmann and Venugopal, 2006) also benefits from reordering source words in a preprocessing step. [sent-69, score-0.863]

47 Explicitly adding syntax to translation rules helps with reordering in general, but it introduces additional complexities, and is still limited by the context-free nature of hierarchical rules. [sent-70, score-1.38]

48 Our work exploits an alternative direction that uses an external reordering model to improve word reordering of hierarchical models. [sent-71, score-1.202]

49 (2013) also studied external reordering models for hierarchical models. [sent-75, score-0.703]

50 Our syntax-based reordering model exploits phrasal cohesion in translation (Fox, 2002) by modeling the reordering of sibling constituents in the source parse tree, which is similar to the recent work of Yang et al. [sent-77, score-1.841]

51 However, the latter focuses on finding the optimal reordering of sibling constituents before MT decoding, while our proposed model generates reordering features that are used together with other MT features to determine the optimal reordering during MT decoding. [sent-79, score-1.668]

52 3 String-to-Dependency Translation Our baseline translation system is based on a string-to-dependency translation model similar to the implementation in (Shen et al. [sent-80, score-0.718]

53 It is an extension of the hierarchical translation model of Chiang et al. [sent-82, score-0.533]

54 (2006) that requires the target side of a phrase pair to have a well-formed dependency structure, defined as either of two types: fixed structure: a single rooted dependency sub-tree with each child being a complete constituent. [sent-83, score-0.406]

55 In this case, the phrase has a unique head word inside the phrase, i.e. [sent-84, score-0.127]

56 Each dependent of the head word, together with all of its descendants, is either completely inside the phrase or completely outside the phrase. [sent-87, score-0.127]

57 For example, the phrase give him in Figure 1 (a) has a fixed dependency structure with head word give. [sent-88, score-0.331]

58 floating structure: a sequence of siblings with each being a complete constituent. [sent-89, score-0.077]

59 In this case, the phrase is composed of a sequence of sibling constituents whose common parent is outside the phrase. [sent-90, score-0.236]

60 For example, the phrase him that brown coat in Figure 1 is a floating structure whose common parent give is not in the phrase. [sent-91, score-0.253]
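
A sketch, under simplified assumptions, of the well-formedness test implied by these two definitions; heads[i] is the parent index of word i (-1 for the sentence root) and a span is the half-open interval [lo, hi). The paper does not publish this routine, so treat it as one plausible rendering:

```python
from collections import defaultdict

def _subtree(children, i):
    """i together with all of its descendants."""
    acc, stack = set(), [i]
    while stack:
        node = stack.pop()
        acc.add(node)
        stack.extend(children[node])
    return acc

def classify_span(heads, lo, hi):
    """Classify a target span as 'fixed', 'floating', or 'ill-formed'."""
    span = set(range(lo, hi))
    children = defaultdict(list)
    for child, head in enumerate(heads):
        if head >= 0:
            children[head].append(child)
    # span roots: words in the span whose head lies outside it
    roots = [i for i in span if heads[i] not in span]
    if len(roots) == 1:                       # candidate fixed structure
        head = roots[0]
        covered = {head}
        for dep in children[head]:
            sub = _subtree(children, dep)
            if sub <= span:
                covered |= sub                # dependent completely inside
            elif sub & span:
                return "ill-formed"           # dependent straddles the span
        return "fixed" if covered == span else "ill-formed"
    if len({heads[r] for r in roots}) == 1:   # siblings with a common parent
        for r in roots:
            if not _subtree(children, r) <= span:
                return "ill-formed"           # sibling constituent incomplete
        return "floating"
    return "ill-formed"
```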

61 Requiring the target side to have a well-formed dependency structure is less restrictive than requiring it to be a syntactic constituent, allowing more translation rules to be extracted. [sent-92, score-0.887]

62 However, it still results in fewer rules than pure hierarchical translation models and might hurt MT performance. [sent-93, score-0.703]

63 The well-formed dependency structure on the target side makes it possible to introduce syntax features during decoding. [sent-94, score-0.395]

64 (2008) obtained significant improvements from including a dependency language model score in decoding, outweighing the negative effect of the dependency constraint. [sent-96, score-0.2]

65 Figure 1 (c) shows an example string-to-dependency translation rule in our baseline system. [sent-100, score-0.494]

66 (Footnote 1) Nonterminals corresponding to floating structures keep their default label "X", as experiments show that it is not beneficial to label them differently. [sent-101, score-0.137]

67 (Figure 1 residue: word-alignment panel with the Chinese source words 他 / 那件 / 褐色 and rule fragments such as "X : give X2 X1" and "VV : give PRP2 NN1".) [sent-103, score-0.062]

68 Figure 1: An example of extracting a string-to-dependency translation rule from word alignments: (a) word alignments, (b) pure hierarchical rule, (c) string-to-dependency rule. [sent-105, score-0.968]

69 The nonterminals on the target side of the hierarchical rule (b) all correspond to fixed dependency structures, and so they are labeled by the respective head tag in the string-to-dependency rule (c). [sent-106, score-1.063]

70 4 Factored Syntactic Constraints Although the string-to-dependency formulation helps to improve the grammaticality of translations, it lacks the ability to incorporate source syntax into the translation process. [sent-107, score-0.757]

71 We next describe a factored approach to address this problem by utilizing source syntax through two models: one that introduces syntactic awareness to translation rules themselves, and another that focuses on reordering based on the syntactic analysis of the source. [sent-108, score-1.774]

72 1 Syntax Mismatch Model A straightforward method to introduce awareness of source syntax to translation rules is to apply the same well-formed dependency constraint and head POS annotation used on the target side of string-to-dependency translation rules to the source side. [sent-110, score-1.853]

73 However, as discussed earlier, this would significantly reduce the number of rules that can be extracted, exacerbate data sparsity, and cause other problems, especially given that the target side is already constrained by the dependency requirement. [sent-111, score-0.443]

74 A relaxed method is to bypass the dependency constraint and only annotate source nonterminals whose underlying phrase is a fixed dependency structure with the head POS tag of the phrase. [sent-112, score-0.879]

75 This method would still extract all of the rules that can be extracted from the baseline string-to-dependency model. [sent-113, score-0.171]

76 (Figure 2 residue: panel (a) shows nonterminal tag distributions; the remaining labels are unrecoverable.) [sent-117, score-0.104]

77 Unfortunately, our experiments have shown that even this moderate annotation results in significantly lower translation quality due to the fragmentation of translation rules, and the increased derivational ambiguity. [sent-120, score-0.762]

78 We have also tried to include some source tag mismatch features (with details described later) to measure the syntactic compatibility between the nonterminal labels of a translation rule and the corresponding tags of source spans. [sent-121, score-1.664]

79 This improves translation accuracy, but not enough to compensate for the performance drop caused by annotating source nonterminals. [sent-122, score-0.549]

80 Our proposed method introduces syntax to translation rules without sacrificing performance. [sent-123, score-0.707]

81 Instead of imposing dependency constraints or explicitly annotating source nonterminals, we keep the original string-to-dependency translation rules intact and associate each nonterminal on the source side with a distribution of tags. [sent-124, score-1.585]

82 The tags are determined based on the dependency structure of training samples. [sent-125, score-0.206]

83 If the underlying source phrase of a nonterminal is a fixed dependency structure in a training sample, we use the head POS tag of the phrase as the tag. [sent-126, score-0.877]
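
Putting the earlier sketches together, the tag assignment just described might look like this (reusing classify_span and FLOAT_TAG from the sketches above; again illustrative, not the authors' code):

```python
def span_tag(heads, pos_tags, lo, hi):
    """Tag for a source span: head POS if it is a fixed structure, else the default."""
    if classify_span(heads, lo, hi) == "fixed":
        head = next(i for i in range(lo, hi) if not (lo <= heads[i] < hi))
        return pos_tags[head]           # head POS tag of the phrase
    return FLOAT_TAG                    # floating or ill-formed spans share one tag
```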

84 The default value of each feature (X, t ≠ ts) is zero if the source span tag ts does not match its condition; a special tag is used for floating structures and dependency structures that are not well formed. [sent-128, score-0.57]

85 As a result, we still extract the same set of rules as in the baseline string-to-dependency translation model, and also obtain a distribution of tags for each nonterminal. [sent-129, score-0.613]

86 Figure 2 (a) illustrates the example tag distributions of a string-to-dependency translation rule. [sent-130, score-0.463]

87 The tag distributions provide an approximation of the source syntax of the training data from which the translation rules are extracted. [sent-131, score-0.962]

88 They are used to measure the syntactic compatibility between a translation rule and the source spans it is applied to. [sent-132, score-0.971]

89 At decoding time, we parse the source sentence and assign each span a tag in the same way as it is done during rule extraction, as shown in the example in Figure 2 (b). [sent-133, score-0.525]

90 We use soft features instead of hard syntactic constraints, and allow the tuning process to choose the appropriate weight for each feature. [sent-136, score-0.212]
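
A hypothetical shape for these decode-time soft features, given the rule's stored tag distribution and the tag the parser assigns to the source span (the paper's exact feature templates differ; this only illustrates the mechanics):

```python
def mismatch_features(nt_label, tag_dist, span_tag):
    """Soft syntax features for one nonterminal substitution; values are
    combined with tuned weights in the decoder rather than used as hard filters."""
    feats = {
        # probability mass the rule assigns to the observed span tag
        f"match_prob[{nt_label}]": tag_dist.get(span_tag, 0.0),
    }
    # mismatch features fire with the mass the rule puts on other tags;
    # they default to zero when the span tag matches the stored condition
    for tag, prob in tag_dist.items():
        if tag != span_tag:
            feats[f"mismatch[{nt_label}:{tag}->{span_tag}]"] = prob
    return feats
```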

91 As shown in Section 5, these source syntax mismatch features help to improve the baseline system. [sent-137, score-0.515]

92 2 Syntax-based Reordering Model Most previous research on reordering models has focused on improving word reordering for statistical phrase-based translation systems (e. [sent-139, score-1.357]

93 There has been less work on improving the reordering of hierarchical phrase-based translation systems (see (Xu et al. [sent-143, score-1.032]

94 , 2012) for a few exceptions), except through explicit syntactic annotation of translation rules. [sent-146, score-0.484]

95 It is generally assumed that hierarchical models are inherently capable of handling both local and non-local reorderings. [sent-147, score-0.21]

96 However, many hierarchical translation rules are noisy and have limited context, and so may not be able to produce translations in the right order. [sent-148, score-0.717]

97 We propose a general framework that incorporates external reordering information into the decoding process of hierarchical translation models. [sent-149, score-1.129]

98 To simplify the presentation, we make the assumption that every source word translates to one or more target words, and that the translations for a pair of source words is either straight or inverted. [sent-150, score-0.497]

99 tshise h tr awnisthla tiroesnpsec otf t aon tyhe s ourrdceerin wgo rpdre pdaiicrte (dw by th)e. [sent-153, score-0.055]

100 reordering model as the sum of log probabilities2 for ordering each pair of source words, as defined , n. [sent-154, score-0.718]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('reordering', 0.499), ('translation', 0.359), ('nonterminal', 0.218), ('source', 0.19), ('nonterminals', 0.185), ('hierarchical', 0.174), ('syntax', 0.169), ('mismatch', 0.156), ('rules', 0.14), ('rule', 0.135), ('bbn', 0.131), ('syntactic', 0.125), ('venugopal', 0.117), ('compatibility', 0.11), ('tag', 0.104), ('intact', 0.103), ('dependency', 0.1), ('sibling', 0.094), ('ts', 0.092), ('moulton', 0.089), ('raytheon', 0.089), ('soft', 0.087), ('xxx', 0.082), ('factored', 0.081), ('constituents', 0.077), ('tags', 0.077), ('floating', 0.077), ('exacerbate', 0.077), ('rhs', 0.077), ('constraints', 0.074), ('zollmann', 0.07), ('shen', 0.066), ('lhs', 0.066), ('phrase', 0.065), ('side', 0.063), ('head', 0.062), ('mt', 0.061), ('decorating', 0.059), ('hanneman', 0.059), ('ppp', 0.059), ('pppp', 0.059), ('tttr', 0.059), ('zbib', 0.059), ('tr', 0.055), ('phrasal', 0.054), ('spans', 0.052), ('vv', 0.052), ('coat', 0.051), ('mccord', 0.051), ('explicitly', 0.05), ('tromble', 0.047), ('awareness', 0.047), ('fixed', 0.044), ('chiang', 0.044), ('translations', 0.044), ('derivational', 0.044), ('nonlocal', 0.044), ('dei', 0.044), ('sparsity', 0.041), ('straight', 0.039), ('grammaticality', 0.039), ('introduces', 0.039), ('associates', 0.037), ('restrictive', 0.037), ('distribution', 0.037), ('cohesion', 0.036), ('lavie', 0.036), ('inherently', 0.036), ('tt', 0.036), ('xia', 0.035), ('xiong', 0.035), ('target', 0.034), ('incorporates', 0.034), ('fox', 0.033), ('keep', 0.033), ('ma', 0.033), ('parse', 0.033), ('decoding', 0.033), ('preference', 0.032), ('give', 0.031), ('treebank', 0.031), ('external', 0.03), ('span', 0.03), ('xx', 0.03), ('pure', 0.03), ('ordering', 0.029), ('cause', 0.029), ('structure', 0.029), ('koehn', 0.028), ('associate', 0.028), ('huang', 0.028), ('structures', 0.027), ('galley', 0.026), ('papineni', 0.026), ('guide', 0.026), ('gao', 0.026), ('purely', 0.026), ('preferences', 0.026), ('bea', 0.026), ('unavoidable', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999899 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

2 0.30545819 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

Author: Peng Li ; Yang Liu ; Maosong Sun

Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make applying the two reordering rules (i.e., straight and inverted) dependent on actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merging blocks alternatively. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders to exploit syntactic and semantic information from a neural language modeling’s perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.

3 0.30315334 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.

4 0.26688519 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

Author: Hendra Setiawan ; Bowen Zhou ; Bing Xiang

Abstract: Reordering poses one of the greatest challenges in Statistical Machine Translation research as the key contextual information may well be beyond the confine oftranslation units. We present the “Anchor Graph” (AG) model where we use a graph structure to model global contextual information that is crucial for reordering. The key ingredient of our AG model is the edges that capture the relationship between the reordering around a set of selected translation units, which we refer to as anchors. As the edges link anchors that may span multiple translation units at decoding time, our AG model effectively encodes global contextual information that is previously absent. We integrate our proposed model into a state-of-the-art translation system and demonstrate the efficacy of our proposal in a largescale Chinese-to-English translation task.

5 0.23646581 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

Author: Joern Wuebker ; Stephan Peitz ; Felix Rietig ; Hermann Ney

Abstract: Automatically clustering words from a monolingual or bilingual training corpus into classes is a widely used technique in statistical natural language processing. We present a very simple and easy to implement method for using these word classes to improve translation quality. It can be applied across different machine translation paradigms and with arbitrary types of models. We show its efficacy on a small German→English and a larger French→German translation task with standard phrase-based and hierarchical phrase-based translation systems for a common set of models. Our results show that with word class models, the baseline can be improved by up to 1.4% BLEU and 1.0% TER on the French→German task and 0.3% BLEU and 1.1% TER on the German→English task.

6 0.23081911 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering

7 0.22781175 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

8 0.20668836 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

9 0.20626435 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

10 0.19321239 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding

11 0.1809127 201 emnlp-2013-What is Hidden among Translation Rules

12 0.13421221 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

13 0.13187458 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

14 0.13035671 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

15 0.11697252 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

16 0.11531309 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

17 0.10001766 145 emnlp-2013-Optimal Beam Search for Machine Translation

18 0.098237611 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases

19 0.094304614 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning

20 0.086947739 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.289), (1, -0.412), (2, 0.166), (3, 0.079), (4, 0.14), (5, -0.08), (6, -0.087), (7, -0.099), (8, 0.058), (9, 0.047), (10, -0.004), (11, 0.075), (12, -0.101), (13, -0.028), (14, -0.21), (15, 0.267), (16, -0.011), (17, -0.131), (18, -0.03), (19, -0.11), (20, -0.056), (21, -0.021), (22, 0.016), (23, -0.018), (24, 0.029), (25, -0.045), (26, -0.049), (27, -0.047), (28, -0.067), (29, 0.084), (30, -0.001), (31, 0.003), (32, 0.038), (33, 0.006), (34, 0.05), (35, -0.004), (36, 0.065), (37, 0.065), (38, 0.015), (39, 0.034), (40, -0.021), (41, -0.032), (42, -0.032), (43, 0.024), (44, 0.026), (45, -0.038), (46, -0.036), (47, 0.05), (48, -0.038), (49, -0.005)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96591502 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: Translation Jacob Devlin Raytheon BBN Technologies 50 Moulton St Cambridge, MA, USA j devl in@bbn . com Rabih Zbib Raytheon BBN Technologies 50 Moulton St Cambridge, MA, USA r zbib@bbn . com have tried to introduce grammaticality to the transThis paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

2 0.83338737 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

Author: Hendra Setiawan ; Bowen Zhou ; Bing Xiang

Abstract: Reordering poses one of the greatest challenges in Statistical Machine Translation research as the key contextual information may well be beyond the confine oftranslation units. We present the “Anchor Graph” (AG) model where we use a graph structure to model global contextual information that is crucial for reordering. The key ingredient of our AG model is the edges that capture the relationship between the reordering around a set of selected translation units, which we refer to as anchors. As the edges link anchors that may span multiple translation units at decoding time, our AG model effectively encodes global contextual information that is previously absent. We integrate our proposed model into a state-of-the-art translation system and demonstrate the efficacy of our proposal in a largescale Chinese-to-English translation task.

3 0.8064509 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.

4 0.74896276 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering

Author: Maryam Siahbani ; Baskaran Sankaran ; Anoop Sarkar

Abstract: Left-to-right (LR) decoding (Watanabe et al., 2006b) is a promising decoding algorithm for hierarchical phrase-based translation (Hiero). It generates the target sentence by extending the hypotheses only on the right edge. LR decoding has complexity O(n^2 b) for input of n words and beam size b, compared to O(n^3) for the CKY algorithm. It requires a single language model (LM) history for each target hypothesis rather than two LM histories per hypothesis as in CKY. In this paper we present an augmented LR decoding algorithm that builds on the original algorithm in (Watanabe et al., 2006b). Unlike that algorithm, using experiments over multiple language pairs we show two new results: our LR decoding algorithm provides demonstrably more efficient decoding than CKY Hiero, four times faster; and by introducing new distortion and reordering features for LR decoding, it maintains the same translation quality (as in BLEU scores) obtained by phrase-based and CKY Hiero with the same translation model.

5 0.73242348 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

Author: Peng Li ; Yang Liu ; Maosong Sun

Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make applying the two reordering rules (i.e., straight and inverted) dependent on actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merging blocks alternatively. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders to exploit syntactic and semantic information from a neural language modeling’s perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.

6 0.71567309 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

7 0.67930859 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding

8 0.67400014 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

9 0.66604197 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

10 0.66254073 201 emnlp-2013-What is Hidden among Translation Rules

11 0.65692431 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

12 0.62261987 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

13 0.61523753 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

14 0.47796139 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

15 0.43625498 156 emnlp-2013-Recurrent Continuous Translation Models

16 0.42283604 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases

17 0.41484711 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

18 0.40200678 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

19 0.34898701 145 emnlp-2013-Optimal Beam Search for Machine Translation

20 0.34354603 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.017), (18, 0.026), (22, 0.022), (30, 0.083), (46, 0.015), (51, 0.127), (66, 0.024), (71, 0.015), (75, 0.021), (77, 0.546), (96, 0.012)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.95870322 55 emnlp-2013-Decoding with Large-Scale Neural Language Models Improves Translation

Author: Ashish Vaswani ; Yinggong Zhao ; Victoria Fossum ; David Chiang

Abstract: We explore the application of neural language models to machine translation. We develop a new model that combines the neural probabilistic language model of Bengio et al., rectified linear units, and noise-contrastive estimation, and we incorporate it into a machine translation system both by reranking k-best lists and by direct integration into the decoder. Our large-scale, large-vocabulary experiments across four language pairs show that our neural language model improves translation quality by up to 1.1 BLEU.

same-paper 2 0.92567343 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

3 0.71376204 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

Author: Ann Irvine ; Chris Quirk ; Hal Daume III

Abstract: When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major challenge is the large number of out-of-vocabulary (OOV) and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable document pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method recovers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimizing the divergence from the OLD-domain distribution. Adding learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.

4 0.68813753 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

Author: Qing Dou ; Kevin Knight

Abstract: We introduce dependency relations into deciphering foreign languages and show that dependency relations help improve the state-ofthe-art deciphering accuracy by over 500%. We learn a translation lexicon from large amounts of genuinely non parallel data with decipherment to improve a phrase-based machine translation system trained with limited parallel data. In experiments, we observe BLEU gains of 1.2 to 1.8 across three different test sets.

5 0.49644604 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

Author: Fandong Meng ; Jun Xie ; Linfeng Song ; Yajuan Lu ; Qun Liu

Abstract: We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property of long distance reorderings and the compatibility with phrases. Large-scale experimental results show that our model achieves significant improvements over the constituency-to-string (+2.45 BLEU on average) and dependency-to-string (+0.91 BLEU on average) models, which only employ a single type of trees, and significantly outperforms the state-of-the-art hierarchical phrase-based model (+1.12 BLEU on average) on three Chinese-English NIST test sets.

6 0.46498173 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

7 0.46473315 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding

8 0.45568061 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

9 0.45483938 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

10 0.45102546 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

11 0.44708809 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

12 0.4392803 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology

13 0.43751773 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation

14 0.43340558 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

15 0.42341146 113 emnlp-2013-Joint Language and Translation Modeling with Recurrent Neural Networks

16 0.41505963 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

17 0.41227737 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

18 0.40137833 156 emnlp-2013-Recurrent Continuous Translation Models

19 0.39762849 128 emnlp-2013-Max-Violation Perceptron and Forced Decoding for Scalable MT Training

20 0.39002779 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization