emnlp emnlp2013 emnlp2013-171 knowledge-graph by maker-knowledge-mining

171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation


Source: pdf

Author: Katsuhiko Hayashi ; Katsuhito Sudoh ; Hajime Tsukada ; Jun Suzuki ; Masaaki Nagata

Abstract: This paper presents a novel word reordering model that employs a shift-reduce parser for inversion transduction grammars. Our model uses rich syntax parsing features for word reordering and runs in linear time. We apply it to postordering of phrase-based machine translation (PBMT) for Japanese-to-English patent tasks. Our experimental results show that our method achieves a significant improvement of +3.1 BLEU scores against 30.15 BLEU scores of the baseline PBMT system.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract This paper presents a novel word reordering model that employs a shift-reduce parser for inversion transduction grammars. [sent-5, score-0.499]

2 Our model uses rich syntax parsing features for word reordering and runs in linear time. [sent-6, score-0.296]

3 We apply it to postordering of phrase-based machine translation (PBMT) for Japanese-to-English patent tasks. [sent-7, score-0.484]

4 1 Introduction Even though phrase-based machine translation (PBMT) (Koehn et al. [sent-11, score-0.035]

5 To improve such word reordering, one promising way is to separate it from the translation process as preordering (Collins et al. [sent-14, score-0.095]

6 , 2005; DeNero and Uszkoreit, 2011) or postordering (Sudoh et al. [sent-15, score-0.377]

7 Many studies utilize a rulebased or a probabilistic model to perform a reordering decision at each node of a syntactic parse tree. [sent-18, score-0.264]

8 This paper presents a parser-based word reordering model that employs a shift-reduce parser for inversion transduction grammars (ITG) (Wu, 1997). [sent-19, score-0.499]

9 To the best of our knowledge, this is the first study on a shift-reduce parser for word reordering. [sent-20, score-0.112]

10 The parser-based reordering approach uses rich syntax parsing features for reordering decisions. [sent-21, score-0.53]

11 Figure 1: A description of the postordering MT system. [sent-25, score-0.377]

12 Even when using these non-local features, the complexity of the shift-reduce parser does not increase at all, because we give up achieving an optimal solution. [sent-27, score-0.112]

13 In our experiments, we apply our proposed method to postordering for J-to-E patent tasks because the training data for reordering have little noise and are ideal for evaluating reordering methods. [sent-29, score-0.917]

14 Although our J-to-E setup needs a language-dependent scheme and we describe our proposed method as a J-to-E postordering method, the key algorithm is language-independent and can be applied to preordering as well as postordering if training data for reordering are available. [sent-30, score-1.093]

15 , 2011) has two steps; the first is a translation step that translates an input sentence into source-ordered translations. [sent-33, score-0.035]

16 The second is a reordering step in which the translations are reordered in the target language order. [sent-34, score-0.37]
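
The two-step pipeline can be pictured as a simple composition. The sketch below is only an illustration with placeholder functions; none of these names come from the paper, from Moses, or from any other toolkit.

    # Hypothetical sketch of the two-step postordering pipeline described above.
    # Both steps are placeholders, not real APIs.

    def translate_to_source_order(japanese_sentence):
        """Step 1: PBMT from Japanese into source-ordered (head-final English) text."""
        raise NotImplementedError("placeholder for the phrase-based MT system")

    def reorder_to_target_order(hfe_sentence):
        """Step 2: parse the HFE string and emit its words in English order."""
        raise NotImplementedError("placeholder for the shift-reduce postordering parser")

    def postordering_mt(japanese_sentence):
        hfe_sentence = translate_to_source_order(japanese_sentence)
        return reorder_to_target_order(hfe_sentence)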

17 (2012) modeled the second step by parsing and created training data for a postordering parser using a language-dependent rule called head-finalization. [sent-37, score-0.588]

18 The rule moves syntactic heads of a lexicalized parse tree of an English sentence to the end of their constituents. [sent-38, score-0.179]

19 Figure 2: An example of the head-finalization process for an English-Japanese sentence pair: the left-hand side tree is the original English tree, and the right-hand side tree is its head-final English tree. [sent-281, score-0.212]

20 As a result, the terminal symbols of the English tree are sorted in a Japanese-like order. [sent-283, score-0.112]

21 In Fig. 2, we show an example of head-finalization; the tree on the right-hand side is the head-finalized English (HFE) tree of the English tree on the left-hand side. [sent-285, score-0.284]

22 For example, a nonterminal symbol PP#(with) shows that a noun phrase “a/an telescope” and a word “with” are inverted. [sent-287, score-0.108]

23 (2012) also deleted articles “the” “a” “an” from English because Japanese has no articles, and inserted Japanese particles “ga” “wo” “wa” into English sentences. [sent-289, score-0.286]

24 We annotate the nonterminals of a phrase modified by a deleted article to determine which of "the", "a/an", or "no articles" should be inserted at the front of the phrase. [sent-290, score-0.276]

25 Note that an original English sentence can be recovered from its HFE tree by using # symbols and annotated articles and deleting Japanese particles. [sent-291, score-0.215]
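
As an illustration of this recovery, here is a minimal Python sketch (not the authors' code): nodes are assumed to be (label, children, article) triples, a "#" suffix on a label marks an inverted rule, and the particle set and article slot are simplified stand-ins for the paper's annotations.

    PARTICLES = {"ga", "wo", "wa"}   # Japanese pseudo-particles inserted into HFE strings

    def recover_english(node):
        """Return the English-ordered words under `node`, undoing HFE inversions."""
        label, children, article = node
        if isinstance(children, str):                    # terminal node: children is a word
            words = [] if children in PARTICLES else [children]
        else:
            kids = reversed(children) if label.endswith("#") else children
            words = [w for child in kids for w in recover_english(child)]
        if article is not None:                          # re-insert a deleted article
            words = [article] + words
        return words

    # toy fragment: PP#(with) over an NP annotated with "a/an" and the head word "with"
    hfe = ("PP#", [("NP", [("NN", "telescope", None)], "a/an"), ("IN", "with", None)], None)
    print(recover_english(hfe))                          # -> ['with', 'a/an', 'telescope']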

26 (2012), we solve postordering by a parser whose model is trained with a set of HFE trees. [sent-293, score-0.489]

27 (2012)'s model and ours is that while the former simply used the Berkeley parser (Petrov and Klein, 2007), our shift-reduce parsing model can use such non-local task-specific features as the N-gram words of reordered strings without sacrificing efficiency. [sent-295, score-0.269]
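
One way such a feature can be realized is by keeping the boundary words of each partially reordered phrase on the stack (the wleft/wright words described below) and reading word N-grams off those boundaries. The templates in this sketch are our own illustration, not the paper's feature set.

    def seam_features(stack):
        """Bigram-style features at the seam between the top two phrases s1 and s0,
        read in their current straight order; an inverted reduce would read Z before Y."""
        feats = []
        if len(stack) >= 2:
            s1, s0 = stack[-2], stack[-1]
            feats.append(("seam_bigram", s1["wright"], s0["wleft"]))   # last word of Y, first of Z
            feats.append(("seam_labels", s1["label"], s0["label"]))
        return feats

    stack = ({"label": "NP", "wleft": "a/an", "wright": "telescope"},
             {"label": "IN", "wleft": "with", "wright": "with"})
    print(seam_features(stack))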

28 Our method integrates postediting (Knight and Chander, 1994) with reordering and inserts articles into English translations by learning an additional “insert” action of the parser. [sent-296, score-0.704]

29 (2012) solved the article generation problem by using an N-gram language model, but this somewhat complicates their approach. [sent-298, score-0.053]

30 Compared with other parsers, one advantage of the shift-reduce parser is that it can easily define such additional operations as "insert". [sent-299, score-0.112]

31 HFE trees can be defined as monolingual ITG trees (DeNero and Uszkoreit, 2011). [sent-300, score-0.116]

32 Our monolingual ITG G is a tuple G = (V, T, P, I, S) where V is a set of nonterminals, T is a set of terminals, P is a set of production rules, I is a set of nonterminals on which "the", "a/an", or "no articles" must be determined, and S is the start symbol. [sent-301, score-0.066]

33 Set P consists of terminal production rules that are responsible for generating a word w (∈ T): X → w, and binary production rules in two forms: X → YZ and X# → YZ, where X, X#, Y, and Z are nonterminals. [sent-302, score-0.17]
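
As a rough illustration, such a grammar could be held in a structure like the following; the concrete rule instances are invented toy examples, not rules extracted in the paper.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BinaryRule:
        parent: str      # X (straight) or X# (inverted)
        left: str        # Y
        right: str       # Z
        inverted: bool   # True for X# -> Y Z: Y and Z are generated in reverse order

    @dataclass
    class MonolingualITG:
        nonterminals: set     # V
        terminals: set        # T
        terminal_rules: dict  # terminal part of P: word -> set of tags X with X -> w
        binary_rules: set     # binary part of P
        article_nts: set      # I: nonterminals on which an article must be determined
        start: str = "S"      # S

    grammar = MonolingualITG(
        nonterminals={"S", "NP", "PP", "PP#", "IN", "NN"},
        terminals={"telescope", "with"},
        terminal_rules={"telescope": {"NN"}, "with": {"IN"}},
        binary_rules={BinaryRule("PP#", "NP", "IN", inverted=True)},
        article_nts={"NP"},
    )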

34 The second rule generates the two phrases Y and Z on its right-hand side in the reverse order. [sent-303, score-0.108]

35 In our experiments, we removed all unary production rules. [sent-304, score-0.066]

36 Given an input sentence w1 . . . wn, the shift-reduce parser uses a stack of partial derivations, a buffer of input words, and a set of actions to build a parse tree. [sent-308, score-0.388]

37 The following is the parser's configuration: ℓ : ⟨i, j, S⟩ : π, where ℓ is the step size, S is a stack of elements s0, s1, . . . [sent-309, score-0.109]

38 i is the leftmost span index of the stack top element s0, j is the index of the next input word in the buffer, and π is a set of predictor states. [sent-312, score-0.351]
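
A configuration and a stack element might be represented as below; the field names follow the description above, while the container types are our own assumptions.

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass(frozen=True)
    class StackElement:
        label: str              # nonterminal X or X#
        head: str               # head word wh
        wleft: str              # leftmost word of the covered (reordered) phrase
        wright: str             # rightmost word of the covered (reordered) phrase
        article: Optional[str]  # "the", "a/an", "no articles", or None if undecided

    @dataclass(frozen=True)
    class Config:
        step: int                        # l: number of actions applied so far
        i: int                           # leftmost span index of the stack top s0
        j: int                           # index of the next input word in the buffer
        stack: Tuple[StackElement, ...]  # S (only the top elements are needed for features)
        predictors: tuple                # pi: predictor states, as in Huang and Sagae (2010)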

39 Our proposed system has four actions: shift-X, insert-x, reduce-MR-X, and reduce-SR-X. [sent-316, score-0.048]

40 The shift-X action pushes the next input word onto the stack and assigns a part-of-speech tag X to the word. [sent-317, score-0.234]

41 The deduction step is as follows: given X → wj ∈ P, the antecedent state p = ℓ : ⟨i, j, S|s′0⟩ : π yields ℓ + 1 : ⟨j, j + 1, S|s′0|s0⟩ : {p}, where s0 is {X, j, wj, wj, null}. [sent-318, score-0.213]
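
A minimal sketch of this shift step, using plain dicts rather than the classes above so that the snippet stands alone; the layout is a simplification, not the paper's implementation.

    def shift(state, words, tag):
        """Apply shift-X: push word w_j onto the stack with POS tag `tag` (the X)."""
        j = state["j"]
        s0 = {"label": tag, "head": words[j],
              "wleft": words[j], "wright": words[j], "article": None}
        return {"step": state["step"] + 1,
                "i": j,                           # new stack top covers span [j, j+1)
                "j": j + 1,
                "stack": state["stack"] + (s0,),
                "predictors": (state,)}           # the predictor set {p} of the deduction

    words = ["he", "saw", "her"]
    init = {"step": 0, "i": 0, "j": 0, "stack": (), "predictors": ()}
    print(shift(init, words, "PRP")["stack"][-1]["wleft"])   # -> 'he'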

42 The insert-x action determines whether to generate "the", "a/an", or "no articles" (= x). [sent-319, score-0.125]

43 The side condition prevents the parser from inserting articles into phrase X more than twice. [sent-323, score-0.283]

44 During parsing, articles are not explicitly inserted into the input string: they are inserted into it when backtracking to generate a reordered string after parsing. [sent-324, score-0.466]
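
A sketch of the insert-x bookkeeping under the same simplified dict layout; the side condition is reduced to a single flag, and the article itself is only spliced into the string during backtracking after parsing.

    ARTICLE_CHOICES = ("the", "a/an", "no articles")   # possible values of x

    def insert_article(state, x):
        """Record an article decision on the stack top without consuming any input."""
        assert x in ARTICLE_CHOICES
        s0 = state["stack"][-1]
        if s0["article"] is not None:                  # simplified side condition
            raise ValueError("article already determined for this phrase")
        new_s0 = dict(s0, article=x)
        return dict(state,
                    step=state["step"] + 1,
                    stack=state["stack"][:-1] + (new_s0,))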

45 The reduce-MR-X action has a deduction rule: given X → YZ ∈ P and each predictor state q ∈ π of the form ⟨k, i, S|s′2|s′1⟩ : π′, the antecedent ℓ : ⟨i, j, S|s′1|s′0⟩ : π yields ℓ + 1 : ⟨k, j, S|s′2|s0⟩ : π′. (Footnote 1: since our notion of predictor states is identical to that in (Huang and Sagae, 2010), we omit the details here.) [sent-325, score-0.219]

46 The action generates s0 by combining s′0 and s′1 with binary rule X → YZ. [sent-491, score-0.16]

47 New nonterminal X is lexicalized with head word wh0 of right nonterminal Z. [sent-493, score-0.194]

48 The leftmost word of phrase X is set to leftmost word wleft1 of Y, and the rightmost word of phrase X is set to rightmost word wright0 of Z. [sent-495, score-0.434]

49 The difference between the reduce-MR-X and reduce-SR-X actions is the new stack element s0. [sent-497, score-0.191]

50 The reduce-SR-X action generates s0 by combining s′0 and s′1 with binary rule X# → YZ: s0 = {X#, h0, wleft0, wright1, a0}. [sent-498, score-0.197]

51 This action expands Y and Z in reverse order: the leftmost word of X# is set to wleft0 of Z, and the rightmost word of X# is set to wright1 of Y. [sent-499, score-0.387]
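
The two reduce actions can be sketched together under the same simplified dict layout; the boundary-word updates mirror the description above, while the span and predictor-state bookkeeping of the full deduction rule is left out, and carrying over the a0 article slot from s′0 is our own assumption.

    def reduce_top_two(state, parent, swap):
        """reduce-MR-X when swap is False, reduce-SR-X when swap is True."""
        s1, s0 = state["stack"][-2], state["stack"][-1]   # s'1 covers Y, s'0 covers Z
        new_top = {
            "label": parent + ("#" if swap else ""),
            "head": s0["head"],                           # lexicalized with the head of Z
            # straight (MR): phrase reads Y..Z   inverted (SR): phrase reads Z..Y
            "wleft":  s0["wleft"] if swap else s1["wleft"],
            "wright": s1["wright"] if swap else s0["wright"],
            "article": s0["article"],                     # a0 slot (assumption)
        }
        # NOTE: updating i from the predictor state (the k of the deduction rule) is omitted
        return dict(state, step=state["step"] + 1,
                    stack=state["stack"][:-2] + (new_top,))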

52 1 Experimental Setups We conducted experiments on the NTCIR-9 and NTCIR-10 patent data using the Japanese-English language pair. [sent-513, score-0.072]

53 We used Enju (Miyao and Tsujii, 2008) for parsing the English training data and converted parse trees into HFE trees by a head-finalization scheme. [sent-516, score-0.208]

54 We extracted grammar rules from all the HFE trees and randomly selected 500,000 HFE trees to train the shift-reduce parser. [sent-517, score-0.116]

55 , 2007) with lexicalized reordering and a 6-gram language model (LM) trained using SRILM (Stolcke et al. [sent-519, score-0.272]

56 To recover the English sentences, our shift-reduce parser reordered only the 1-best HFE sentence. [sent-521, score-0.207]

57 (2012)'s because they used a linear interpolation of MT cost, parser cost, and N-gram LM cost to generate the best English sentence from the n-best HFE sentences. [sent-523, score-0.112]

58 3 Analysis We show N-gram precisions of PBMT (dist=6, dist=20) and proposed systems in Table 5. [sent-533, score-0.048]

59 All the data and the MT toolkits used in our experiments are the same as theirs. [sent-536, score-0.03]

60 Table 4: The effects of article generation: "w/o art. [sent-547, score-0.053]

61 ” denotes evaluation scores for translations of the best system (“proposed”) in Table 3 from which articles are removed. [sent-548, score-0.182]

62 ” system used HFE data with articles and generated them by MT system and the shift-reduce parser performed only reordering. [sent-550, score-0.253]

63 “N-gram” system inserted articles into the translations of “w/o art. [sent-551, score-0.282]

64 Table 5: N-gram precisions of moses (dist=6, dist=20) and proposed systems for test9 data. [sent-558, score-0.092]

65 It seems that the gains in 1-gram precision come from postediting (article generation). [sent-560, score-0.163]

66 In Table 4, we show the effectiveness of our joint reordering and postediting approach ("proposed"). [sent-561, score-0.397]

67 ” results clearly show that generating articles has great effects on MT evaluations especially for BLEU metric. [sent-563, score-0.141]

68 ” systems, these results show that postediting is much more effective than generating articles by MT. [sent-565, score-0.304]

69 We plan to study more general methods that use word alignments to embed swap information in trees (Galley et al. [sent-570, score-0.088]

70 Scalable inference and training of context-rich syntactic translation models. [sent-622, score-0.035]

71 Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. [sent-668, score-0.183]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('hfe', 0.479), ('postordering', 0.377), ('goto', 0.244), ('reordering', 0.234), ('pbmt', 0.208), ('postediting', 0.163), ('articles', 0.141), ('wleft', 0.137), ('action', 0.125), ('parser', 0.112), ('stack', 0.109), ('leftmost', 0.104), ('wright', 0.101), ('dist', 0.101), ('inserted', 0.1), ('reordered', 0.095), ('sudoh', 0.091), ('buffer', 0.089), ('yz', 0.089), ('rightmost', 0.083), ('graehl', 0.081), ('japanese', 0.078), ('nonterminal', 0.078), ('itg', 0.076), ('tree', 0.074), ('patent', 0.072), ('np', 0.072), ('mt', 0.07), ('hayashi', 0.068), ('shou', 0.068), ('production', 0.066), ('hajime', 0.065), ('inversion', 0.063), ('parsing', 0.062), ('vp', 0.061), ('tsukada', 0.06), ('masaaki', 0.06), ('knight', 0.06), ('wat', 0.06), ('ope', 0.06), ('preordering', 0.06), ('trees', 0.058), ('transduction', 0.058), ('deduction', 0.054), ('isozaki', 0.054), ('nagata', 0.054), ('article', 0.053), ('denero', 0.053), ('wj', 0.053), ('wo', 0.052), ('english', 0.049), ('actions', 0.048), ('precisions', 0.048), ('nonterminals', 0.048), ('katsuhito', 0.048), ('uszkoreit', 0.048), ('galley', 0.046), ('deleted', 0.045), ('setups', 0.045), ('kevin', 0.045), ('koehn', 0.044), ('moses', 0.044), ('insert', 0.043), ('wa', 0.042), ('suzuki', 0.042), ('miyao', 0.042), ('collins', 0.041), ('translations', 0.041), ('pp', 0.041), ('shi', 0.04), ('predictor', 0.04), ('jun', 0.039), ('expands', 0.039), ('lexicalized', 0.038), ('terminal', 0.038), ('yusuke', 0.038), ('rule', 0.037), ('reverse', 0.036), ('srilm', 0.035), ('translation', 0.035), ('bleu', 0.035), ('generates', 0.035), ('element', 0.034), ('null', 0.033), ('side', 0.032), ('employs', 0.032), ('lm', 0.032), ('index', 0.032), ('petrov', 0.031), ('stolcke', 0.031), ('phrase', 0.03), ('parse', 0.03), ('xen', 0.03), ('hikaridai', 0.03), ('swap', 0.03), ('righthand', 0.03), ('telescope', 0.03), ('sho', 0.03), ('cab', 0.03), ('backtracking', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

Author: Katsuhiko Hayashi ; Katsuhito Sudoh ; Hajime Tsukada ; Jun Suzuki ; Masaaki Nagata

Abstract: This paper presents a novel word reordering model that employs a shift-reduce parser for inversion transduction grammars. Our model uses rich syntax parsing features for word reordering and runs in linear time. We apply it to postordering of phrase-based machine translation (PBMT) for Japanese-to-English patent tasks. Our experimental results show that our method achieves a significant improvement of +3.1 BLEU scores against 30.15 BLEU scores of the baseline PBMT system.

2 0.20626435 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

3 0.19940871 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.

4 0.19173989 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

Author: Peng Li ; Yang Liu ; Maosong Sun

Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make applying the two reordering rules (i.e., straight and inverted) dependent on actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merging blocks alternatively. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders to exploit syntactic and semantic information from a neural language modeling’s perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.

5 0.11771847 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering

Author: Maryam Siahbani ; Baskaran Sankaran ; Anoop Sarkar

Abstract: Left-to-right (LR) decoding (Watanabe et al., 2006b) is a promising decoding algorithm for hierarchical phrase-based translation (Hiero). It generates the target sentence by extending the hypotheses only on the right edge. LR decoding has complexity O(n2b) for input of n words and beam size b, compared to O(n3) for the CKY algorithm. It requires a single language model (LM) history for each target hypothesis rather than two LM histories per hypothesis as in CKY. In this paper we present an augmented LR decoding algorithm that builds on the original algorithm in (Watanabe et al., 2006b). Unlike that algorithm, using experiments over multiple language pairs we show two new results: our LR decoding algorithm provides demonstrably more efficient decoding than CKY Hiero, four times faster; and by introducing new distortion and reordering features for LR decoding, it maintains the same translation quality (as in BLEU scores) ob- tained phrase-based and CKY Hiero with the same translation model.

6 0.11317422 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

7 0.097356141 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

8 0.087228581 116 emnlp-2013-Joint Parsing and Disfluency Detection in Linear Time

9 0.0827251 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

10 0.079453193 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

11 0.074293397 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

12 0.067843057 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings

13 0.066154718 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

14 0.065354422 146 emnlp-2013-Optimal Incremental Parsing via Best-First Dynamic Programming

15 0.063961677 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

16 0.06317462 50 emnlp-2013-Combining PCFG-LA Models with Dual Decomposition: A Case Study with Function Labels and Binarization

17 0.060860321 128 emnlp-2013-Max-Violation Perceptron and Forced Decoding for Scalable MT Training

18 0.060770523 201 emnlp-2013-What is Hidden among Translation Rules

19 0.060631171 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding

20 0.060264945 139 emnlp-2013-Noise-Aware Character Alignment for Bootstrapping Statistical Machine Transliteration from Bilingual Corpora


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.187), (1, -0.204), (2, 0.079), (3, 0.058), (4, 0.042), (5, -0.028), (6, -0.037), (7, -0.082), (8, 0.029), (9, 0.093), (10, -0.057), (11, 0.079), (12, -0.126), (13, -0.03), (14, -0.132), (15, 0.206), (16, -0.076), (17, -0.11), (18, -0.061), (19, -0.014), (20, -0.024), (21, -0.06), (22, 0.13), (23, 0.152), (24, 0.12), (25, 0.024), (26, -0.029), (27, 0.084), (28, 0.053), (29, 0.04), (30, -0.004), (31, 0.059), (32, -0.041), (33, 0.019), (34, -0.049), (35, 0.047), (36, -0.108), (37, -0.122), (38, -0.019), (39, 0.015), (40, -0.022), (41, -0.0), (42, -0.101), (43, 0.047), (44, -0.022), (45, -0.068), (46, 0.031), (47, -0.006), (48, 0.048), (49, 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92933285 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

Author: Katsuhiko Hayashi ; Katsuhito Sudoh ; Hajime Tsukada ; Jun Suzuki ; Masaaki Nagata

Abstract: This paper presents a novel word reordering model that employs a shift-reduce parser for inversion transduction grammars. Our model uses rich syntax parsing features for word reordering and runs in linear time. We apply it to postordering of phrase-based machine translation (PBMT) for Japanese-to-English patent tasks. Our experimental results show that our method achieves a significant improvement of +3.1 BLEU scores against 30.15 BLEU scores of the baseline PBMT system.

2 0.76196218 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.

3 0.69631881 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

Author: Peng Li ; Yang Liu ; Maosong Sun

Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make applying the two reordering rules (i.e., straight and inverted) dependent on actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merging blocks alternatively. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders to exploit syntactic and semantic information from a neural language modeling’s perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.

4 0.58012491 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

Author: Hendra Setiawan ; Bowen Zhou ; Bing Xiang

Abstract: Reordering poses one of the greatest challenges in Statistical Machine Translation research as the key contextual information may well be beyond the confine oftranslation units. We present the “Anchor Graph” (AG) model where we use a graph structure to model global contextual information that is crucial for reordering. The key ingredient of our AG model is the edges that capture the relationship between the reordering around a set of selected translation units, which we refer to as anchors. As the edges link anchors that may span multiple translation units at decoding time, our AG model effectively encodes global contextual information that is previously absent. We integrate our proposed model into a state-of-the-art translation system and demonstrate the efficacy of our proposal in a largescale Chinese-to-English translation task.

5 0.56837046 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

6 0.56804967 116 emnlp-2013-Joint Parsing and Disfluency Detection in Linear Time

7 0.47844079 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering

8 0.44971025 10 emnlp-2013-A Multi-Teraflop Constituency Parser using GPUs

9 0.41227052 168 emnlp-2013-Semi-Supervised Feature Transformation for Dependency Parsing

10 0.40689862 50 emnlp-2013-Combining PCFG-LA Models with Dual Decomposition: A Case Study with Function Labels and Binarization

11 0.3928768 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

12 0.36548734 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

13 0.35462332 58 emnlp-2013-Dependency Language Models for Sentence Completion

14 0.35084921 66 emnlp-2013-Dynamic Feature Selection for Dependency Parsing

15 0.34586686 146 emnlp-2013-Optimal Incremental Parsing via Best-First Dynamic Programming

16 0.34586617 122 emnlp-2013-Learning to Freestyle: Hip Hop Challenge-Response Induction via Transduction Rule Segmentation

17 0.33686861 189 emnlp-2013-Two-Stage Method for Large-Scale Acquisition of Contradiction Pattern Pairs using Entailment

18 0.32026467 190 emnlp-2013-Ubertagging: Joint Segmentation and Supertagging for English

19 0.31747124 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

20 0.3150726 139 emnlp-2013-Noise-Aware Character Alignment for Bootstrapping Statistical Machine Transliteration from Bilingual Corpora


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.012), (18, 0.042), (22, 0.031), (26, 0.021), (30, 0.084), (45, 0.414), (50, 0.02), (51, 0.118), (64, 0.011), (66, 0.028), (75, 0.016), (77, 0.079), (90, 0.012), (96, 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.78142917 34 emnlp-2013-Automatically Classifying Edit Categories in Wikipedia Revisions

Author: Johannes Daxenberger ; Iryna Gurevych

Abstract: In this paper, we analyze a novel set of features for the task of automatic edit category classification. Edit category classification assigns categories such as spelling error correction, paraphrase or vandalism to edits in a document. Our features are based on differences between two versions of a document including meta data, textual and language properties and markup. In a supervised machine learning experiment, we achieve a micro-averaged F1 score of .62 on a corpus of edits from the English Wikipedia. In this corpus, each edit has been multi-labeled according to a 21-category taxonomy. A model trained on the same data achieves state-of-the-art performance on the related task of fluency edit classification. We apply pattern mining to automatically labeled edits in the revision histories of different Wikipedia articles. Our results suggest that high-quality articles show a higher degree of homogeneity with respect to their collaboration patterns as compared to random articles.

same-paper 2 0.75266576 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation

Author: Katsuhiko Hayashi ; Katsuhito Sudoh ; Hajime Tsukada ; Jun Suzuki ; Masaaki Nagata

Abstract: This paper presents a novel word reordering model that employs a shift-reduce parser for inversion transduction grammars. Our model uses rich syntax parsing features for word reordering and runs in linear time. We apply it to postordering of phrase-based machine translation (PBMT) for Japanese-to-English patent tasks. Our experimental results show that our method achieves a significant improvement of +3.1 BLEU scores against 30.15 BLEU scores of the baseline PBMT system.

3 0.73119694 58 emnlp-2013-Dependency Language Models for Sentence Completion

Author: Joseph Gubbins ; Andreas Vlachos

Abstract: Sentence completion is a challenging semantic modeling task in which models must choose the most appropriate word from a given set to complete a sentence. Although a variety of language models have been applied to this task in previous work, none of the existing approaches incorporate syntactic information. In this paper we propose to tackle this task using a pair of simple language models in which the probability of a sentence is estimated as the probability of the lexicalisation of a given syntactic dependency tree. We apply our approach to the Microsoft Research Sentence Completion Challenge and show that it improves on n-gram language models by 8.7 percentage points, achieving the highest accuracy reported to date apart from neural language models that are more complex and ex- pensive to train.

4 0.43343389 146 emnlp-2013-Optimal Incremental Parsing via Best-First Dynamic Programming

Author: Kai Zhao ; James Cross ; Liang Huang

Abstract: We present the first provably optimal polynomial time dynamic programming (DP) algorithm for best-first shift-reduce parsing, which applies the DP idea of Huang and Sagae (2010) to the best-first parser of Sagae and Lavie (2006) in a non-trivial way, reducing the complexity of the latter from exponential to polynomial. We prove the correctness of our algorithm rigorously. Experiments confirm that DP leads to a significant speedup on a probablistic best-first shift-reduce parser, and makes exact search under such a model tractable for the first time.

5 0.42295456 168 emnlp-2013-Semi-Supervised Feature Transformation for Dependency Parsing

Author: Wenliang Chen ; Min Zhang ; Yue Zhang

Abstract: In current dependency parsing models, conventional features (i.e. base features) defined over surface words and part-of-speech tags in a relatively high-dimensional feature space may suffer from the data sparseness problem and thus exhibit less discriminative power on unseen data. In this paper, we propose a novel semi-supervised approach to addressing the problem by transforming the base features into high-level features (i.e. meta features) with the help of a large amount of automatically parsed data. The meta features are used together with base features in our final parser. Our studies indicate that our proposed approach is very effective in processing unseen data and features. Experiments on Chinese and English data sets show that the final parser achieves the best-reported accuracy on the Chinese data and comparable accuracy with the best known parsers on the English data.

6 0.4221397 116 emnlp-2013-Joint Parsing and Disfluency Detection in Linear Time

7 0.40924278 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

8 0.40532938 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

9 0.40448681 50 emnlp-2013-Combining PCFG-LA Models with Dual Decomposition: A Case Study with Function Labels and Binarization

10 0.40426904 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

11 0.40354809 150 emnlp-2013-Pair Language Models for Deriving Alternative Pronunciations and Spellings from Pronunciation Dictionaries

12 0.40160269 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

13 0.39970717 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

14 0.39625633 14 emnlp-2013-A Synchronous Context Free Grammar for Time Normalization

15 0.39331847 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

16 0.39001793 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

17 0.3887946 113 emnlp-2013-Joint Language and Translation Modeling with Recurrent Neural Networks

18 0.38856885 66 emnlp-2013-Dynamic Feature Selection for Dependency Parsing

19 0.38696158 52 emnlp-2013-Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation

20 0.38692525 172 emnlp-2013-Simple Customization of Recursive Neural Networks for Semantic Relation Classification