emnlp emnlp2013 emnlp2013-187 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fandong Meng ; Jun Xie ; Linfeng Song ; Yajuan Lu ; Qun Liu
Abstract: We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property of long distance reorderings and the compatibility with phrases. Large-scale experimental results show that our model achieves significant improvements over the constituency-to-string (+2.45 BLEU on average) and dependency-to-string (+0.91 BLEU on average) models, which employ only a single type of tree, and significantly outperforms the state-of-the-art hierarchical phrase-based model (+1.12 BLEU on average), on three Chinese-English NIST test sets.
Reference: text
sentIndex sentText sentNum sentScore
1 We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. [sent-5, score-0.807]
2 We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. [sent-6, score-1.585]
3 Our rules hold the property of long distance reorderings and the compatibility with phrases. [sent-7, score-0.278]
4 However, dependency trees describe the grammatical relation between words of the sentence, and represent long distance dependencies in a concise manner. [sent-22, score-0.304]
5 In this paper, we propose to combine the advantages of source side constituency and dependency trees. [sent-25, score-0.6]
6 Since the dependency tree is structurally simpler and directly represents long distance dependencies, we take dependency trees as the backbone and incorporate constituents to them. [sent-26, score-0.588]
7 Our model employs rules that represent the source side as head-dependents relations which are incorporated with constituency phrasal nodes, and the target side as strings. [sent-27, score-1.027]
8 , 2011) is composed of a head and all its dependents in dependency trees, and it encodes phrase pattern and sentence pattern (typically long distance reordering relations). [sent-29, score-0.303]
9 With the advantages of head-dependents relations, the translation rules of our model hold the property of long distance reorderings and the compatibility with phrases. [sent-30, score-0.428]
10 Our new model (Section 2) extracts rules from word-aligned pairs of source trees (constituency and dependency) and target strings (Section 3), and translates source trees into target strings by employing a bottom-up chart-based algorithm (Section 4). [sent-31, score-0.671]
11 ) while captured by a constituency (a), where the bold phrasal nodes NP1 , VP2, VP3 indicate the phras? [sent-63, score-0.832]
12 can not be captured by dependency tree syntactic phrases. [sent-70, score-0.292]
13 2 Grammar We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. [sent-78, score-1.585]
14 A head-dependents relation consists of a head and all its dependents in dependency trees, and it can represent long distance dependencies. [sent-79, score-0.303]
15 Incorporating phrasal nodes of constituency trees into head-dependents relations further enhances the compatibility with phrases of our rules. [sent-80, score-1.005]
16 Figure 1 shows an example of phrases which cannot be captured by a dependency tree but are captured by a constituency tree, such as the bold phrasal nodes NP1, VP2 and VP3. [sent-81, score-1.165]
17 /VV”, we represent r2 by constructing a new head node [sent-278, score-0.199]
18 Figure 2: Two examples of our translation rules corresponding to the top level of Figure 1-(b). [sent-288, score-0.277]
19 r1 captures a head-dependents relation, and r2 extends r1 by incorporating a phrasal node VP2. [sent-289, score-0.471]
20 “x1:VP2|||VV NN” indicates a substitution site which can be replaced by a source phrase covered by a phrasal node VP (the phrasal node consists of two dependency nodes with POS tags VV and NN, respectively). [sent-291, score-1.423]
21 For simplicity, we use a shortened form, CHDR, to represent the head-dependents relations with/without constituency phrasal nodes. [sent-294, score-0.625]
22 Figure 3 gives an example of the translation derivation in our model, with the translation rules listed in (g). [sent-493, score-0.396]
23 Given a sentence to translate in (a), we first parse it into a constituency tree and a dependency tree [sent-495, score-0.432]
24 Finally, we apply r8 to translate the last fragment to “the first”, and get the final result (f). [sent-585, score-0.186]
25 3 Rule Extraction In this section, we describe how to extract rules from a set of 4-tuples ⟨C, T, S, A⟩, where C is a source constituency tree, T is a source dependency tree, S is a target side sentence, and A is a word alignment relation between T/C and S. [sent-586, score-0.869]
26 Label the dependency tree with phrasal nodes from the constituency tree, and annotate alignment information to the phrasal nodes labeled dependency tree (Section 3. [sent-589, score-1.83]
27 Identify acceptable CHDR fragments from the annotated dependency tree for rule induction (Section 3. [sent-592, score-0.563]
28 Induce a set of lexicalized and generalized CHDR rules from the acceptable fragments (Section 3. [sent-595, score-0.423]
29 1 Annotation Given a 4-tuple ⟨C, T, S, A⟩, we first label phrasal nodes from the constituency tree C to the dependency tree T, which can be easily accomplished by phrase mapping according to the commonly covered source sequences. [sent-598, score-1.149]
30 As dependency trees can capture some phrasal information by dependency syntactic Figure 4: An annotated dependency tree. [sent-599, score-0.918]
31 Each node is annotated with two spans, the former is node span and the latter subtree span. [sent-600, score-0.391]
32 The fragments covered by phrasal nodes are annotated with phrasal spans. [sent-601, score-1.019]
33 The nodes denoted by the solid line box are not nsp consistent. [sent-602, score-0.308]
34 phrases, in order to complement the information that dependency trees can not capture, we only label the phrasal nodes that cover dependency non-syntactic phrases. [sent-603, score-0.932]
35 Then, we annotate alignment information to the phrasal nodes labeled dependency tree T, as shown in Figure 4. [sent-604, score-0.82]
36 Given a node n in the source phrasal nodes labeled T with word alignment information, the spans of n induced by the word alignment are consecutive sequences of words in the target sentence. [sent-606, score-0.861]
37 As shown in Figure 4, we annotate each node n of phrasal nodes labeled T with two attributes: node span and subtree span; be? [sent-607, score-0.928]
38 Definition 2 Given a subtree rooted at n, the subtree span tsp(n) of n is the consecutive target word sequence from the lower bound of the nsp of all nodes in T′ to ? [sent-641, score-0.605]
39 Definition 3 Given a fragment f covered by a phrasal node, the phrasal sp? [sent-650, score-0.886]
40 The annotation can be achieved by a single postorder traversal of the phrasal nodes labeled dependency tree. [sent-733, score-0.689]
41 For simplicity, we call the annotated phrasal nodes labeled dependency tree annotated dependency tree. [sent-734, score-0.908]
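The annotation pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class DepNode, the helper merge, and the function annotate are hypothetical names, the alignment is assumed to map each source word position to a set of target word positions, and the node span is taken to run from the smallest to the largest aligned target position. The consistency check for node spans and the phrasal spans are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Inclusive range of target-side word positions, or None when nothing is aligned.
Span = Optional[Tuple[int, int]]


@dataclass
class DepNode:
    """Hypothetical dependency-tree node; `index` is the source word position."""
    index: int
    children: List["DepNode"] = field(default_factory=list)
    nsp: Span = None  # node span
    tsp: Span = None  # subtree span


def merge(a: Span, b: Span) -> Span:
    """Smallest span covering both inputs; None means 'no span'."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def annotate(node: DepNode, alignment: Dict[int, Set[int]]) -> None:
    """Single post-order pass: children are annotated before their parent."""
    for child in node.children:
        annotate(child, alignment)
    targets = alignment.get(node.index, set())
    node.nsp = (min(targets), max(targets)) if targets else None
    # The subtree span covers the node span of every node in the subtree.
    node.tsp = node.nsp
    for child in node.children:
        node.tsp = merge(node.tsp, child.tsp)
```

Because each child is finished before its parent, the subtree span of a node is available as soon as the node itself is visited, which is why one traversal suffices.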
42 The extraction of bilingual phrases (including the translation of head node, dependency syntactic phrases and the fragment covered by a phrasal node) can be readily achieved by the algorithm described in Koehn et al. [sent-735, score-1.037]
43 2 Acceptable Fragments Identification Before presenting the method of acceptable fragments identification, we give a brief description of CHDR fragments. [sent-739, score-0.219]
44 A CHDR fragment is an annotated fragment that consists of a source head-dependents relation with/without constituency phrasal nodes, a target string and the word alignment information between the source and target side. [sent-740, score-1.185]
45 We identify the acceptable CHDR fragments that are suitable for rule induction from the annotated dependency tree. [sent-741, score-0.464]
46 We divide the acceptable CHDR fragments into two categories depending on whether the fragments contain phrasal nodes. [sent-742, score-0.664]
47 If an acceptable CHDR fragment does not contain phrasal nodes, we call it CHDR-normal fragment, otherwise CHDR-phrasal fragment. [sent-743, score-0.591]
48 Given a CHDR fragment F rooted at n, we say F is acceptable if it satisfies any one of the following properties: [sent-744, score-0.292]
49 The identification of acceptable fragments can be achieved by a single postorder traversal of the annotated dependency tree. [sent-955, score-0.403]
50 Typically, each acceptable fragment contains at most three types of nodes: head node, head of the related CHDR; internal nodes, internal nodes of the related CHDR except head node; leaf nodes, leaf nodes of the related CHDR. [sent-956, score-0.999]
51 We induce CHDR-normal rules and CHDR-phrasal rules from CHDR-normal fragments and CHDR-phrasal fragments, respectively. [sent-959, score-0.422]
52 We first induce a lexicalized form of CHDR rule from an acceptable CHDR fragment: 1. [sent-960, score-0.252]
53 For a CHDR-normal fragment, we first mark the internal nodes as substitution sites. [sent-961, score-0.24]
54 Then we generate the target string according to the node span of the head and the subtree spans of the dependents, and turn the word sequences covered by the internal nodes into variables. [sent-963, score-0.731]
55 For a CHDR-phrasal fragment, we first mark the internal nodes and the phrasal nodes as substitution sites. [sent-966, score-0.745]
56 Then we construct the output of the CHDR-phrasal rule in almost the same way as for CHDR-normal rules, except that we replace the target sequences covered by the internal nodes and the phrasal nodes with variables. [sent-968, score-0.923]
57 For example, rule r1 in Figure 5-(d) is a lexicalized CHDR-normal rule induced from the CHDR-normal fragment in Figure 5-(a). [sent-969, score-0.371]
58 r9 and r11 are CHDR-phrasal rules induced from the CHDR-phrasal fragments in Figure 5-(b) and Figure 5-(c), respectively. [sent-970, score-0.297]
59 To alleviate the sparseness problem, we generalize the lexicalized CHDR-normal rules and partially unlexicalized CHDR-phrasal rules with unlexicalized nodes by the method proposed in Xie et al. [sent-972, score-0.528]
60 As the modification relations between head and dependents are determined by the edges, we can replace the lexical word of each node with its category (POS tag) and obtain new head-dependents relations with unlexicalized nodes keeping the same modification relations. [sent-974, score-0.42]
61 We generalize the rule by simultaneously turning the nodes of the same type (head, internal, leaf) into their categories. [sent-975, score-0.291]
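A minimal sketch of this generalization step, assuming each node of a lexicalized rule is given as a (word, POS tag, node type) triple. The function generalize and the tuple layout are hypothetical, and returning the fully lexicalized rule as one of the variants is a choice made here for illustration, not something the text states.

```python
from itertools import product
from typing import List, Tuple

# (lexical word, category / POS tag, node type); node type is one of NODE_TYPES.
Node = Tuple[str, str, str]

NODE_TYPES = ("head", "internal", "leaf")


def generalize(nodes: List[Node]) -> List[List[str]]:
    """Enumerate variants in which all nodes of a chosen type are unlexicalized together."""
    # Only node types that actually occur in the rule can be switched.
    present = [t for t in NODE_TYPES if any(kind == t for _, _, kind in nodes)]
    variants = []
    for mask in product((False, True), repeat=len(present)):
        unlexicalized = {t for t, flag in zip(present, mask) if flag}
        # Nodes of an unlexicalized type show their category, the rest keep the word.
        variants.append([pos if kind in unlexicalized else word
                         for word, pos, kind in nodes])
    return variants
```

With two node types present, the sketch yields four variants; with all three present it yields eight, since each type is either kept lexical or replaced by its category as a whole.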
62 Actually, our CHDR rules are the superset of head-dependents relation rules in Xie et al. [sent-978, score-0.316]
63 CHDR-normal rules are equivalent to the head-dependents relation rules, and the CHDR-phrasal rules are an extension of these rules. [sent-980, score-0.474]
64 For convenience of description, we use subscripts to distinguish phrasal nodes with the same category, such as VP2 and VP3. [sent-981, score-0.505]
65 We handle the unaligned words of the target side by extending the node spans of the lexicalized head and leaf nodes, and the subtree spans of the lexicalized dependents, in both the left and right directions. [sent-983, score-0.651]
66 During this process, we might obtain m (m ≥ 1) CHDR rules from an acceptable fragment. [sent-985, score-0.271]
67 We take the extracted rule set as observed data and make use of the relative frequency estimator to obtain the translation probabilities P(t|s) and P(s|t). [sent-987, score-0.212]
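Relative frequency estimation over the extracted rules can be sketched as follows. The function name relative_frequency and the representation of a rule as a (source side, target side) pair of strings are assumptions made for the example; each occurrence of a rule in the extracted set counts once.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Rule = Tuple[str, str]  # (source side s, target side t)


def relative_frequency(
    rules: Iterable[Rule],
) -> Tuple[Dict[Rule, float], Dict[Rule, float]]:
    """Return (P(t|s), P(s|t)) estimated by relative frequency."""
    rules = list(rules)
    pair_count = Counter(rules)
    source_count = Counter(s for s, _ in rules)
    target_count = Counter(t for _, t in rules)
    # P(t|s) = count(s, t) / count(s);  P(s|t) = count(s, t) / count(t)
    p_t_given_s = {(s, t): c / source_count[s] for (s, t), c in pair_count.items()}
    p_s_given_t = {(s, t): c / target_count[t] for (s, t), c in pair_count.items()}
    return p_t_given_s, p_s_given_t
```

For a fixed source side the values of P(t|s) sum to one, and likewise for P(s|t) with a fixed target side.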
68 Let d be a derivation that converts a source phrasal nodes labeled dependency tree into a target string e. [sent-989, score-0.896]
69 The values of the first four features are accumulated on the CHDR rules and the next four features are accumulated on the bilingual phrases. [sent-993, score-0.228]
70 We also use a pseudo translation rule (constructed according to the word order of head-dependents relation) as a feature to guarantee the complete translation when no matched rules can be found during decoding. [sent-994, score-0.489]
71 It finds the best derivation that converts the input phrasal nodes labeled dependency tree into a target string among all possible derivations. [sent-996, score-0.832]
72 Given the source constituency tree and dependency tree, we first generate phrasal nodes labeled dependency tree T as described in Section 3. [sent-997, score-1.357]
73 For each node n, it enumerates all instances of CHDR rooted at n, and checks the rule set for matched translation rules. [sent-999, score-0.384]
74 A larger translation is generated by substituting the variables in the target side of a translation rule with the translations of the corresponding dependents. [sent-1000, score-0.444]
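The substitution step can be illustrated with a small sketch. It assumes variables on the target side are written x1, x2, ... as in the rule examples, and that the translation of each dependent is already available as a single string; the function substitute is a hypothetical helper, and a real decoder would combine several candidate translations per dependent rather than one.

```python
import re
from typing import Dict


def substitute(target_side: str, translations: Dict[int, str]) -> str:
    """Replace each variable xN in the rule's target side with translation N."""
    return re.sub(r"\bx(\d+)\b",
                  lambda m: translations[int(m.group(1))],
                  target_side)
```

Since variables may appear in any order on the target side, the same mechanism also realizes the reordering encoded in the rule.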
75 To balance the performance and speed of the decoder, we limit the search space by reducing the number of translation rules used for each node. [sent-1002, score-0.308]
76 There are two ways to limit the rule table size: by a fixed limit (rule-limit) on how many rules are retrieved for each input node, and by a threshold (rule-threshold) specifying that a rule with a score lower than β times the best score should be discarded. [sent-1003, score-0.406]
77 The quality of translations is evaluated by the case-insensitive NIST BLEU-4 metric. We parse the source sentences into constituency trees (without binarization) and projective dependency trees with the Stanford Parser (Klein and Manning, 2002). [sent-1020, score-0.748]
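The two pruning criteria can be combined in one small function. This is a sketch under assumptions: prune_rules is a hypothetical name, scores are taken to be positive (probability-like, not log-domain) so that multiplying the best score by β gives a lower cutoff, and both criteria are applied together even though the text presents them as two separate options.

```python
from typing import List, Tuple

ScoredRule = Tuple[str, float]  # (rule identifier, positive score)


def prune_rules(scored_rules: List[ScoredRule],
                rule_limit: int,
                beta: float) -> List[ScoredRule]:
    """Keep at most `rule_limit` rules whose score is at least beta * best score."""
    if not scored_rules:
        return []
    ranked = sorted(scored_rules, key=lambda r: r[1], reverse=True)
    best_score = ranked[0][1]
    # rule-threshold: discard rules scoring below beta times the best score.
    kept = [r for r in ranked if r[1] >= beta * best_score]
    # rule-limit: cap the number of rules retrieved for the node.
    return kept[:rule_limit]
```

A smaller rule_limit or a larger beta shrinks the search space at the risk of discarding the rule that the best derivation would have used.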
78 The “+” denotes that the rules are composed of syntactic translation rules and bilingual phrases (32. [sent-1057, score-0.546]
79 , 2006) to restrict the height of a rule tree to no greater than 3 and the phrase length to no greater than 7. [sent-1069, score-0.192]
80 , 2004) rule extraction algorithm and utilize bilingual phrases to translate source head node and dependency syntactic phrases. [sent-1072, score-0.666]
81 By exploiting two types of trees on the source side, our model gains significant improvements over the constituency-to-string and dependency-to-string models, which employ a single type of tree. [sent-1085, score-0.187]
82 Table 1 also lists the statistical results of rules extracted from training data by different systems. [sent-1086, score-0.188]
83 The extra rules are CHDR-phrasal rules, which can bring in BLEU improvements by enhancing the compatibility with phrases. [sent-1090, score-0.208]
84 Besides, the proportion of CHDR-phrasal rules in all CHDR rules is calculated in these translations, and we call this proportion “CHDR-phrasal Rule”. [sent-1096, score-0.316]
85 ” is labeled by a phrasal node “VP” (means verb phrase), which can be captured by our CHDR-phrasal rules and translated into the correct result “reemergence” with bilingual phrases. [sent-1295, score-0.74]
86 By combining the merits of constituency and dependency trees, our consdep2str model learns CHDR-normal rules to acquire the property of long distance reorderings and CHDR-phrasal rules to obtain good compatibility with phrases. [sent-1296, score-0.912]
87 Marton and Resnik (2008) took the source constituency tree into account and added soft constraints to the hierarchical phrase-based model (Chiang, 2005). [sent-1313, score-0.499]
88 Cherry (2008) utilized dependency trees to add syntactic cohesion to the phrase-based model. [sent-1314, score-0.251]
89 Mi and Liu (2010) proposed a constituency-to-dependency translation model, which utilizes constituency forests on the source side to direct the translation, and dependency trees on the target side to ensure grammaticality. [sent-1315, score-0.924]
90 (2012) presented a hierarchical chunk-to-string translation model, which is a compromise between the hierarchical phrase-based model and the constituency-to-string model. [sent-1317, score-0.219]
91 Most works make an effort to introduce linguistic knowledge into the phrase-based model and the hierarchical phrase-based model with constituency trees. [sent-1318, score-0.336]
92 Only the work proposed by Mi and Liu (2010) utilized both constituency and dependency trees, but it applied the two types of trees on two different sides. [sent-1319, score-0.561]
93 Instead, our model simultaneously utilizes constituency and dependency trees on the source side to direct the translation, which is concerned with combining the advantages of two types of trees in translation rules to advance the state-of-the-art machine translation. [sent-1320, score-1.155]
94 7 Conclusion In this paper, we present a novel model that simultaneously utilizes constituency and dependency trees on the source side to direct the translation. [sent-1321, score-0.724]
95 To combine the merits of constituency and dependency trees, our model employs head-dependents relations incorporated with constituency phrasal nodes. [sent-1322, score-1.101]
96 For the first time, source side constituency and dependency trees are simultaneously utilized to direct the translation, and the model surpasses the state-of-the-art translation models. [sent-1324, score-0.843]
97 the reemergence of a severe acute respiratory syndrome (SARS) case dep2str: [sent-1394, score-0.204]
98 reemergence of a severe acute respiratory syndrome ( SARS ) case ? [sent-1401, score-0.204]
99 A new string-to-dependency machine translation algorithm with a target dependency language model. [sent-1551, score-0.317]
100 A dependency treelet string correspondence model for statistical machine translation. [sent-1563, score-0.212]
wordName wordTfidf (topN-words)
[('chdr', 0.458), ('phrasal', 0.339), ('constituency', 0.286), ('launch', 0.175), ('nodes', 0.166), ('rules', 0.158), ('dependency', 0.152), ('xie', 0.143), ('nsp', 0.142), ('fragment', 0.139), ('intel', 0.135), ('node', 0.132), ('ultrabook', 0.126), ('tsp', 0.125), ('trees', 0.123), ('translation', 0.119), ('acceptable', 0.113), ('fragments', 0.106), ('vv', 0.102), ('tree', 0.099), ('rule', 0.093), ('nn', 0.084), ('subtree', 0.084), ('qun', 0.082), ('chdrphrasal', 0.079), ('psp', 0.079), ('asia', 0.078), ('bilingual', 0.07), ('covered', 0.069), ('liu', 0.068), ('side', 0.067), ('head', 0.067), ('bleu', 0.065), ('source', 0.064), ('reemergence', 0.063), ('leaf', 0.063), ('nr', 0.061), ('dependents', 0.055), ('co', 0.052), ('compatibility', 0.05), ('spans', 0.05), ('hierarchical', 0.05), ('acute', 0.047), ('respiratory', 0.047), ('syndrome', 0.047), ('translate', 0.047), ('re', 0.047), ('lexicalized', 0.046), ('target', 0.046), ('ru', 0.045), ('internal', 0.044), ('och', 0.043), ('nd', 0.043), ('span', 0.043), ('ad', 0.041), ('captured', 0.041), ('sars', 0.041), ('reorderings', 0.041), ('phrases', 0.041), ('rooted', 0.04), ('haitao', 0.038), ('nist', 0.038), ('merits', 0.038), ('graehl', 0.038), ('indonesian', 0.035), ('nations', 0.035), ('ding', 0.035), ('terminals', 0.035), ('knight', 0.034), ('mi', 0.033), ('backbone', 0.033), ('es', 0.032), ('pages', 0.032), ('annotate', 0.032), ('simultaneously', 0.032), ('asal', 0.032), ('computationallinguistics', 0.032), ('deadline', 0.032), ('figu', 0.032), ('lon', 0.032), ('nsistent', 0.032), ('pbp', 0.032), ('pbplex', 0.032), ('phrasa', 0.032), ('stituency', 0.032), ('transversal', 0.032), ('troops', 0.032), ('alignment', 0.032), ('limit', 0.031), ('advantages', 0.031), ('josef', 0.031), ('th', 0.03), ('substitution', 0.03), ('statistical', 0.03), ('od', 0.03), ('ree', 0.03), ('string', 0.03), ('chiang', 0.029), ('tr', 0.029), ('distance', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000008 187 emnlp-2013-Translation with Source Constituency and Dependency Trees
Author: Fandong Meng ; Jun Xie ; Linfeng Song ; Yajuan Lu ; Qun Liu
Abstract: We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property of long distance reorderings and the compatibility with phrases. Large-scale experimental results show that our model achieves significant improvements over the constituency-to-string (+2.45 BLEU on average) and dependency-to-string (+0.91 BLEU on average) models, which employ only a single type of tree, and significantly outperforms the state-of-the-art hierarchical phrase-based model (+1.12 BLEU on average), on three Chinese-English NIST test sets.
2 0.23769718 96 emnlp-2013-Identifying Phrasal Verbs Using Many Bilingual Corpora
Author: Karl Pichotta ; John DeNero
Abstract: We address the problem of identifying multiword expressions in a language, focusing on English phrasal verbs. Our polyglot ranking approach integrates frequency statistics from translated corpora in 50 different languages. Our experimental evaluation demonstrates that combining statistical evidence from many parallel corpora using a novel ranking-oriented boosting algorithm produces a comprehensive set of English phrasal verbs, achieving performance comparable to a human-curated set.
3 0.20668836 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation
Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib
Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.
4 0.17625031 167 emnlp-2013-Semi-Markov Phrase-Based Monolingual Alignment
Author: Xuchen Yao ; Benjamin Van Durme ; Chris Callison-Burch ; Peter Clark
Abstract: We introduce a novel discriminative model for phrase-based monolingual alignment using a semi-Markov CRF. Our model achieves state-of-the-art alignment accuracy on two phrase-based alignment datasets (RTE and paraphrase), while doing significantly better than other strong baselines in both non-identical alignment and phrase-only alignment. Additional experiments highlight the potential benefit of our alignment model to RTE, paraphrase identification and question answering, where even a naive application of our model’s alignment score approaches the state of the art.
5 0.16588899 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation
Author: Xinyan Xiao ; Deyi Xiong
Abstract: Traditional synchronous grammar induction estimates parameters by maximizing likelihood, which only has a loose relation to translation quality. Alternatively, we propose a max-margin estimation approach to discriminatively inducing synchronous grammars for machine translation, which directly optimizes translation quality measured by BLEU. In the max-margin estimation of parameters, we only need to calculate Viterbi translations. This further facilitates the incorporation of various non-local features that are defined on the target side. We test the effectiveness of our max-margin estimation framework on a competitive hierarchical phrase-based system. Experiments show that our max-margin method significantly outperforms the traditional two-step pipeline for synchronous rule extraction by 1.3 BLEU points and is also better than the previous max-likelihood estimation method.
6 0.14931226 201 emnlp-2013-What is Hidden among Translation Rules
8 0.11433355 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models
9 0.11303703 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering
10 0.10743923 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
11 0.098086223 58 emnlp-2013-Dependency Language Models for Sentence Completion
12 0.091787592 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
13 0.08952029 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation
14 0.087337457 151 emnlp-2013-Paraphrasing 4 Microblog Normalization
15 0.085732125 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge
16 0.082253762 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk
17 0.080686785 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation
18 0.080612637 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation
19 0.079453193 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation
20 0.07855624 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation
topicId topicWeight
[(0, -0.254), (1, -0.229), (2, 0.089), (3, 0.073), (4, 0.073), (5, -0.013), (6, -0.04), (7, -0.048), (8, 0.036), (9, 0.034), (10, 0.05), (11, 0.078), (12, 0.03), (13, 0.219), (14, -0.068), (15, 0.071), (16, -0.047), (17, 0.1), (18, 0.155), (19, -0.259), (20, 0.017), (21, -0.093), (22, -0.034), (23, -0.038), (24, -0.087), (25, 0.065), (26, 0.087), (27, -0.017), (28, -0.083), (29, 0.01), (30, 0.001), (31, -0.097), (32, 0.042), (33, -0.055), (34, -0.012), (35, -0.052), (36, -0.029), (37, 0.09), (38, 0.02), (39, 0.003), (40, -0.017), (41, -0.04), (42, 0.018), (43, -0.011), (44, -0.02), (45, -0.067), (46, 0.008), (47, -0.064), (48, 0.013), (49, 0.139)]
simIndex simValue paperId paperTitle
same-paper 1 0.95041138 187 emnlp-2013-Translation with Source Constituency and Dependency Trees
Author: Fandong Meng ; Jun Xie ; Linfeng Song ; Yajuan Lu ; Qun Liu
Abstract: We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property of long distance reorderings and the compatibility with phrases. Large-scale experimental results show that our model achieves significant improvements over the constituency-to-string (+2.45 BLEU on average) and dependency-to-string (+0.91 BLEU on average) models, which employ only a single type of tree, and significantly outperforms the state-of-the-art hierarchical phrase-based model (+1.12 BLEU on average), on three Chinese-English NIST test sets.
2 0.66949672 96 emnlp-2013-Identifying Phrasal Verbs Using Many Bilingual Corpora
Author: Karl Pichotta ; John DeNero
Abstract: We address the problem of identifying multiword expressions in a language, focusing on English phrasal verbs. Our polyglot ranking approach integrates frequency statistics from translated corpora in 50 different languages. Our experimental evaluation demonstrates that combining statistical evidence from many parallel corpora using a novel ranking-oriented boosting algorithm produces a comprehensive set of English phrasal verbs, achieving performance comparable to a human-curated set.
3 0.62413234 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding
Author: Martin Cmejrek ; Haitao Mi ; Bowen Zhou
Abstract: Machine translation benefits from system combination. We propose flexible interaction of hypergraphs as a novel technique combining different translation models within one decoder. We introduce features controlling the interactions between the two systems and explore three interaction schemes of hiero and forest-to-string models: specification, generalization, and interchange. The experiments are carried out on large training data with strong baselines utilizing rich sets of dense and sparse features. All three schemes significantly improve results of any single system on four test sets. We find that specification, a more constrained scheme that almost entirely uses forest-to-string rules but optionally uses hiero rules for shorter spans, comes out as the strongest, yielding improvement up to 0.9 (T -B )/2 points. We also provide a detailed experimental and qualitative analysis of the results.
4 0.61744606 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation
Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib
Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.
5 0.60992646 167 emnlp-2013-Semi-Markov Phrase-Based Monolingual Alignment
Author: Xuchen Yao ; Benjamin Van Durme ; Chris Callison-Burch ; Peter Clark
Abstract: We introduce a novel discriminative model for phrase-based monolingual alignment using a semi-Markov CRF. Our model achieves state-of-the-art alignment accuracy on two phrase-based alignment datasets (RTE and paraphrase), while doing significantly better than other strong baselines in both non-identical alignment and phrase-only alignment. Additional experiments highlight the potential benefit of our alignment model to RTE, paraphrase identification and question answering, where even a naive application of our model’s alignment score approaches the state of the art.
6 0.60743707 201 emnlp-2013-What is Hidden among Translation Rules
7 0.5717116 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation
8 0.51475465 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk
9 0.48324427 71 emnlp-2013-Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering
10 0.45668575 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models
11 0.44199106 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
12 0.43530697 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
13 0.41703245 58 emnlp-2013-Dependency Language Models for Sentence Completion
14 0.41427609 32 emnlp-2013-Automatic Idiom Identification in Wiktionary
15 0.40107879 151 emnlp-2013-Paraphrasing 4 Microblog Normalization
16 0.39233258 171 emnlp-2013-Shift-Reduce Word Reordering for Machine Translation
17 0.38557595 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases
18 0.38508961 178 emnlp-2013-Success with Style: Using Writing Style to Predict the Success of Novels
19 0.38222831 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models
20 0.38106006 10 emnlp-2013-A Multi-Teraflop Constituency Parser using GPUs
topicId topicWeight
[(3, 0.031), (18, 0.052), (22, 0.066), (27, 0.235), (30, 0.104), (45, 0.023), (50, 0.016), (51, 0.149), (66, 0.066), (71, 0.018), (75, 0.026), (77, 0.081), (96, 0.014)]
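The topic vector above is sparse: only topics with non-negligible weight are listed. The listing does not say how simValue is derived from such vectors, so the following is only a minimal sketch that assumes cosine similarity, a common choice. The function names and the second vector (`other_paper`) are hypothetical and used purely for illustration.

```python
import ast
import math


def parse_topic_vector(text):
    """Parse a line such as "[(3, 0.031), (18, 0.052)]" into {topic_id: weight}."""
    pairs = ast.literal_eval(text)
    return {int(topic_id): float(weight) for topic_id, weight in pairs}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts (assumed measure)."""
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Topic vector copied from the listing above.
this_paper = parse_topic_vector(
    "[(3, 0.031), (18, 0.052), (22, 0.066), (27, 0.235), (30, 0.104), "
    "(45, 0.023), (50, 0.016), (51, 0.149), (66, 0.066), (71, 0.018), "
    "(75, 0.026), (77, 0.081), (96, 0.014)]"
)

# Hypothetical vector for another paper; these weights are invented.
other_paper = parse_topic_vector("[(18, 0.040), (27, 0.300), (51, 0.120), (90, 0.050)]")

similarity = cosine_similarity(this_paper, other_paper)
```

Plain cosine similarity of a vector with itself is 1, whereas the same-paper row below reports 0.8238405, so the listing's own measure evidently differs in some way from this sketch.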
simIndex simValue paperId paperTitle
same-paper 1 0.8238405 187 emnlp-2013-Translation with Source Constituency and Dependency Trees
Author: Fandong Meng ; Jun Xie ; Linfeng Song ; Yajuan Lu ; Qun Liu
Abstract: We present a novel translation model, which simultaneously exploits the constituency and dependency trees on the source side, to combine the advantages of two types of trees. We take head-dependents relations of dependency trees as backbone and incorporate phrasal nodes of constituency trees as the source side of our translation rules, and the target side as strings. Our rules hold the property of long distance reorderings and the compatibility with phrases. Large-scale experimental results show that our model achieves significant improvements over the constituency-to-string (+2.45 BLEU on average) and dependency-to-string (+0.91 BLEU on average) models, which only employ a single type of tree, and significantly outperforms the state-of-the-art hierarchical phrase-based model (+1.12 BLEU on average), on three Chinese-English NIST test sets.
2 0.79431993 86 emnlp-2013-Feature Noising for Log-Linear Structured Prediction
Author: Sida Wang ; Mengqiu Wang ; Stefan Wager ; Percy Liang ; Christopher D. Manning
Abstract: NLP models have many and sparse features, and regularization is key for balancing model overfitting versus underfitting. A recently repopularized form of regularization is to generate fake training data by repeatedly adding noise to real data. We reinterpret this noising as an explicit regularizer, and approximate it with a second-order formula that can be used during training without actually generating fake data. We show how to apply this method to structured prediction using multinomial logistic regression and linear-chain CRFs. We tackle the key challenge of developing a dynamic program to compute the gradient of the regularizer efficiently. The regularizer is a sum over inputs, so we can estimate it more accurately via a semi-supervised or transductive extension. Applied to text classification and NER, our method provides a > 1% absolute performance gain over use of standard L2 regularization.
3 0.6617341 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation
Author: Peng Li ; Yang Liu ; Maosong Sun
Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make applying the two reordering rules (i.e., straight and inverted) dependent on actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merging blocks alternatively. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders to exploit syntactic and semantic information from a neural language modeling’s perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.
4 0.66006023 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
Author: Uri Lerner ; Slav Petrov
Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.
5 0.65757215 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models
Author: Jesus Gonzalez-Rubio ; Daniel Ortiz-Martinez ; Jose-Miguel Benedi ; Francisco Casacuberta
Abstract: Current automatic machine translation systems are not able to generate error-free translations and human intervention is often required to correct their output. Alternatively, an interactive framework that integrates the human knowledge into the translation process has been presented in previous works. Here, we describe a new interactive machine translation approach that is able to work with phrase-based and hierarchical translation models, and integrates error-correction all in a unified statistical framework. In our experiments, our approach outperforms previous interactive translation systems, and achieves estimated effort reductions of as much as 48% relative over a traditional post-edition system.
6 0.65420312 56 emnlp-2013-Deep Learning for Chinese Word Segmentation and POS Tagging
7 0.65365779 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation
8 0.65357107 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation
9 0.65349382 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation
10 0.65151268 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation
11 0.64909792 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
12 0.64718902 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction
13 0.64197594 143 emnlp-2013-Open Domain Targeted Sentiment
14 0.6415574 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models
15 0.63963056 52 emnlp-2013-Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation
16 0.63945258 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization
17 0.63819778 83 emnlp-2013-Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
18 0.63671917 40 emnlp-2013-Breaking Out of Local Optima with Count Transforms and Model Recombination: A Study in Grammar Induction
19 0.63595468 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding
20 0.63593411 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk