emnlp emnlp2011 emnlp2011-123 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
Reference: text
sentIndex sentText sentNum sentScore
1 Long-distance reordering remains one of the biggest challenges facing machine translation. [sent-8, score-0.35]
2 We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. [sent-9, score-0.851]
3 On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively. [sent-16, score-0.419]
4 The hierarchical phrase-based model captures the recursiveness of language without relying on syntactic annotation, and promises better reordering than the phrase-based model. [sent-22, score-0.467]
5 (2009) find that although the hierarchical phrase-based model outperforms the phrase-based model in terms of medium-range reordering, it does equally poorly in long-distance reordering due to constraints to guarantee efficiency. [sent-24, score-0.548]
6 It measures derivation well-formedness and is used to indirectly help reordering; an auxiliary unaligned penalty feature that mitigates search error given the other two features. [sent-31, score-0.376]
7 - We achieve significant improvements in terms of the overall translation quality and reordering behavior. [sent-32, score-0.464]
8 To our knowledge we are the first to use the source dependency parsing to target the reordering problem for hierarchical phrase-based MT. [sent-33, score-0.741]
9 The basic unit of dependency parsing is a triple consisting of the dependent word, the head word and the dependency relation that connects them. [sent-43, score-0.419]
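To make the representation concrete, a minimal Python sketch; the type name and example values are ours, with the "prep" relation inferred from the prepDEP/prepHEAD feature names quoted later in the text:

```python
from collections import namedtuple

# A dependency triple as described above: dependent word, head word, and
# the dependency relation connecting them.
DepTriple = namedtuple("DepTriple", ["dependent", "head", "relation"])

# From the paper's running example: the preposition yu ("with") depends on
# the verb you ("have"); "prep" is an assumed relation label.
example = DepTriple(dependent="yu", head="you", relation="prep")
```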
10 We use the Stanford Parser to generate dependency parses, which automatically extracts dependency relations from phrase structure parses (de Marneffe et al. [sent-45, score-0.312]
11 , 2005), we decompose the sentence reordering probability into the reordering probability for each aligned source word with respect to its head, excluding the root word at the top of the dependency hierarchy which does not have a head word. [sent-49, score-1.013]
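In symbols (our notation, a sketch of the decomposition just described), for a source sentence f with root word r:

```latex
P(\text{reordering of } f) \;\approx\; \prod_{\substack{w \in f,\; w \neq r \\ w \text{ aligned}}} p\bigl(c_w \mid w, \mathrm{head}(w)\bigr)
```

where c_w is the orientation class of w with respect to its head, defined in Equation (1) below.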
12 (2010) also take a word-based reordering approach for HPBMT, but they model all possible pairwise orientation from the source side as a general linear ordering problem (Tromble and Eisner, 2009). [sent-51, score-0.844]
13 To be more specific, we have a maximum entropy orientation classifier that predicts the probability of a source word being translated in a monotone or reversed manner with respect to its head. [sent-52, score-0.58]
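A hedged sketch of such a binary maximum entropy classifier; the paper trains with MegaM, so scikit-learn's logistic regression (the same model family) is a stand-in here, and the training instances are invented:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Each instance describes one (dependent, head) pair; labels are
# "M" (monotone) or "R" (reversed). Feature names follow the
# prepDEP/prepHEAD pattern quoted in the text; values are illustrative.
train_feats = [
    {"prepDEPyu": 1, "prepHEADyou": 1},   # pair observed reversed
    {"nnDEPguojia": 1, "nnHEADshu": 1},   # hypothetical monotone pair
]
train_labels = ["R", "M"]

vec = DictVectorizer()
clf = LogisticRegression()  # maximum entropy model over the two classes
clf.fit(vec.fit_transform(train_feats), train_labels)

# p(c | word, head) for a new dependency pair:
probs = clf.predict_proba(vec.transform([{"prepDEPyu": 1, "prepHEADyou": 1}]))
```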
14 Figure 2: Word alignments to illustrate orientation classification. [sent-56, score-0.595]
15 Given the alignment in Figure 2(a), with the alignment points (i_dep, j_dep) for the source dependent word and (i_head, j_head) for the source head word, we define two orientation classes as: c = R if (j_dep − j_head)(i_dep − i_head) < 0, and c = M otherwise. (1) [sent-58, score-0.7]
16 When a source head or dependent word is aligned to multiple target words, as shown in Figure 2(b), we always take the first target word for orientation classification. [sent-59, score-0.578]
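A direct rendering of Equation (1) as a sketch; the index convention (j for source positions, i for target positions) follows the definitions above:

```python
def orientation_class(i_dep, j_dep, i_head, j_head):
    """Orientation of a dependent word w.r.t. its head, per Equation (1).

    (i, j) are the target and source positions of an alignment point; for
    multiply aligned words, pass the first target word's alignment point.
    """
    if (j_dep - j_head) * (i_dep - i_head) < 0:
        return "R"  # reversed: source and target orders disagree
    return "M"      # monotone
```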
17 The orientation classifier is trained on the large word-aligned parallel corpus. [sent-60, score-0.349]
18 So for the word yu (English with) in Figure 1, we extract these features for orientation classification: prepDEPyu and prepHEADyou. [sent-64, score-0.44]
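A sketch reproducing exactly the two features quoted here; whether further templates exist is not recoverable from this excerpt:

```python
def orientation_features(dependent, head, relation):
    """Feature strings for orientation classification of one triple,
    following the relation+DEP+word / relation+HEAD+word pattern."""
    return [f"{relation}DEP{dependent}", f"{relation}HEAD{head}"]

assert orientation_features("yu", "you", "prep") == ["prepDEPyu", "prepHEADyou"]
```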
19 We define the dependency orientation feature score for a translation hypothesis as the sum of the log orientation probabilities for each source word. [sent-65, score-1.05]
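In the same notation, for a hypothesis h whose covered, aligned, non-root source words are w:

```latex
\mathrm{score}_{\mathrm{ori}}(h) \;=\; \sum_{w \in h} \log p\bigl(c_w \mid w, \mathrm{head}(w)\bigr)
```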
20 Table 1 shows the dependency orientation probabilities for all words in the Figure 1 sentence. [sent-67, score-0.505]
21 Most interestingly, the orientation probabilities for you (English have) strongly support global reordering of one of the few countries with the relative clause that have diplomatic relations with North Korea. [sent-68, score-0.772]
22 We find that it is a general trend for long-distance reordering to gain stronger support, since it is often correlated with prominent reordering patterns (such as relative clauses and prepositions) as well as lexical evidence (such as “. [sent-69, score-0.7]
23 ”) for which the reversed orientation takes up the majority of the training cases. [sent-75, score-0.402]
24 Thus our dependency orientation feature is able to trace the difference in ordering the PP with North Korea (as underlined) and the VP have dipl. [sent-88, score-0.535]
25 down to the orientation of the preposition yu (English with) with respect to its head you (English have), and promote Rule 3 which has the right word order. [sent-90, score-0.546]
26 We carry an unresolved word along in the derivation process until we reach a terminator hypothesis which translates the head word. [sent-93, score-0.417]
27 Then the resulting dependency orientation score is added to the terminator hypothesis. [sent-94, score-0.54]
28 This means that the dependency orientation feature is “stateless”, i.e. [sent-95, score-0.505]
29 , hypotheses that cover the same source span with the same orientation information will receive the same feature score, regardless of the derivation history. [sent-97, score-0.553]
30 Therefore, Derivation 5 in the following will have the same dependency orientation score as Derivation (Rule) 3, and Derivation 6 will score the same as Derivation (Rule) 4. [sent-98, score-0.505]
31 2 Cohesion Penalty When the dependency orientation for a word is temporarily unavailable (“unresolved”), a cohesion penalty fires. [sent-111, score-1.151]
32 Cohesion penalty counts the total occurrences of unresolved words for a translation hypothesis, which involve newly encountered unresolved words as well as old unresolved words carried on from the derivation history. [sent-112, score-0.839]
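A minimal sketch of this counting scheme, assuming each hypothesis records the unresolved words it carries forward; the decoder's exact bookkeeping is not shown in this excerpt:

```python
def cohesion_step(carried, newly_unresolved, resolved_now):
    """One derivation step of the cohesion count.

    carried          -- unresolved words inherited from the derivation history
    newly_unresolved -- words whose head lies outside the current rule
    resolved_now     -- words whose head this hypothesis translates

    Every unresolved-word occurrence at this hypothesis contributes one
    count, so a word carried across two hypotheses is counted twice, as
    with yu in the Derivation 5 example.
    """
    unresolved = (carried | newly_unresolved) - resolved_now
    return len(unresolved), unresolved  # penalty increment, carry-over set
```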
33 Under this definition, the most cohesive derivation translates the entire sentence with one rule, where every word is locally resolved. [sent-116, score-0.329]
34 The least cohesive derivation translates each word individually and glues word translations together. [sent-117, score-0.329]
35 Consulting Figure 1, the cohesion penalty in Derivation 5 is 4, since the word yu (English with) is unresolved twice (in 5. [sent-118, score-0.873]
36 4, respectively); the cohesion penalty in Derivation 6 is 5: 2 from Beihan (English North Korea) (in 6. [sent-122, score-0.646]
37 To sum up, our cohesion penalty provides an integer-valued measure of derivation well-formedness in hierarchical phrase-based MT. [sent-126, score-0.885]
38 As with the dependency orientation, the cohesion penalty is not applicable to the root word of the sentence. [sent-127, score-0.802]
39 We propose the cohesion penalty in order to further improve reordering, especially in long-distance cases, since a well-formed derivation at an earlier stage makes it more likely to explore hierarchical rules that perform more reliable reordering. [sent-128, score-0.885]
40 In this respect, the cohesion penalty can be seen as an aid to the glue rule penalty and as an alternative to constituency-based constraints. [sent-129, score-0.954]
41 Specifically, the glue rule penalty (Chiang, 2007) promotes hierarchical rules. [sent-130, score-0.425]
42 Hierarchical rules whose lexical evidence helps resolve words locally will also be favored by our cohesion penalty feature. [sent-131, score-0.646]
43 However, ignorant of the syntactic structure, the glue rule penalty may penalize a reasonably cohesive derivation such as Derivation 5 and at the same time promote a less cohesive hierarchical translation, such as Derivation 6. [sent-132, score-0.894]
44 Compared with constituency constraints based on the phrase structure, our cohesion penalty derived from the binary dependency parsing has two different characteristics. [sent-133, score-0.921]
45 First, our cohesion penalty is by nature more tolerant of some meaningful nonconstituent translations. [sent-134, score-0.681]
46 Yet our cohesion penalty by nature admits these translations as cohesive (with no extra cost from es and Aozhou since both are locally resolved). [sent-139, score-0.804]
47 Admittedly, our current implementation of the cohesion penalty is blind to some other meaningful nonconstituent collocations, such as neighbouring siblings of a common uncovered head (regulated as the “floating structure” in (Shen et al. [sent-140, score-0.809]
48 X → (es1 gibt2, there1 is2) (7); X → (Aozhou1 shi2, Australia1 is2) (8); X → (shaoshu1 guojia2, few1 countries2) (9). Second, our cohesion penalty can by nature be more discriminative. [sent-144, score-0.646]
49 Compared with the constituency constraints, the cohesion penalty is integer-valued, and can be made sensitive to the depth of each word in the dependency hierarchy (see Section 2. [sent-145, score-0.953]
50 , 2009), the cohesion penalty could also be made sensitive to the dependency relation of each word. [sent-148, score-0.835]
51 Figure 3: Using 2 bins for the dependency parse tree of the Figure 1 sentence. [sent-153, score-0.262]
52 3 Unaligned Penalty The dependency orientation and cohesion penalty cannot be applied to unaligned source words. [sent-155, score-1.292]
53 The problem is mitigated by an unaligned penalty applicable to all words in the dependency hierarchy. [sent-159, score-0.41]
54 4 Grouping Words into Bins Having defined dependency orientation, cohesion penalty and unaligned penalty, we section the source dependency tree uniformly by depth, group words at different depths into bins and only add the feature scores of a word into its respective bin. [sent-161, score-1.205]
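A sketch of the uniform-depth sectioning, assuming a {word: depth} map; the exact boundary convention (and whether Figure 3's split follows it) is not recoverable from this excerpt:

```python
def assign_bins(word_depth, num_bins):
    """Section tree depths uniformly into num_bins bins; a word's feature
    scores are then added only to its own bin's copy of each feature."""
    max_depth = max(word_depth.values())
    width = max_depth / num_bins
    return {w: min(int((d - 1) / width), num_bins - 1)
            for w, d in word_depth.items()}

bins = assign_bins({"a": 1, "b": 3, "c": 4, "d": 5}, num_bins=2)
# -> {'a': 0, 'b': 0, 'c': 1, 'd': 1}: depths 1-3 in bin 0, depths 4-5 in bin 1
```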
55 The primary motivation is to distinguish long-distance reordering which is still problematic for the hiero-style model, since local reorderings generally operate at low levels of the tree while high tree levels tend to take more care of long-distance reordering. [sent-164, score-0.417]
56 Parsing accuracy is another concern, yet its impact on feature performance is intricate and our MaxEnt-trained dependency orientation feature also buffers against odd parsing. [sent-165, score-0.505]
57 1 million features (using the MegaM software) for dependency orientation classification. [sent-177, score-0.505]
58 When only 1 bin was used, 3 additional features were added to the baseline, one for each of the soft dependency constraints. [sent-182, score-0.3]
59 3 Using LRscore as the Tuning Metric Since our features are proposed to address the reordering problem and BLEU is not sensitive enough to reordering (especially in long-distance cases), we have also tried tuning with a metric that highlights reordering, i. [sent-231, score-0.804]
60 LRscore is a linear interpolation of a lexical metric and a reordering metric. [sent-234, score-0.35]
61 We interpolated BLEU (as the lexical metric) with the Kendall’s tau permutation distance (as the reordering metric). [sent-235, score-0.35]
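A simplified sketch of this interpolation; LRscore's exact normalization follows Birch et al., so the weight alpha and the plain Kendall distance here are stand-ins:

```python
from itertools import combinations

def kendall_distance(perm):
    """Normalized Kendall's tau permutation distance from the identity:
    the fraction of word pairs whose relative order is inverted."""
    n = len(perm)
    discordant = sum(1 for a, b in combinations(range(n), 2) if perm[a] > perm[b])
    return discordant / (n * (n - 1) / 2)

def lr_score(bleu, permutation, alpha=0.5):
    """Linear interpolation of a lexical metric (BLEU) with a reordering
    metric (1 - Kendall distance); alpha is a hypothetical weight."""
    return alpha * bleu + (1 - alpha) * (1 - kendall_distance(permutation))

# e.g. a fully monotone hypothesis: lr_score(0.3, [0, 1, 2, 3]) == 0.65
```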
62 In the next section, we conduct quantitative analysis on reordering precision and recall, as well as qualitative analysis on translation examples. [sent-249, score-0.496]
63 1 Precision and Recall of Reordering The key to obtaining precision and recall for reordering is to investigate whether reorderings in the references are reproduced in the translations. [sent-251, score-0.522]
64 Details of measuring reproduced reordering can be found in Birch et al. [sent-269, score-0.386]
65 This is consistent with our treatment in dependency orientation classification, and results in more reorderings being extracted. [sent-272, score-0.572]
66 This is a novel and important finding as we directly show that the quality of reordering has been improved. [sent-278, score-0.35]
67 In Figure 4 we break down the precision and recall statistics in MT08 by the reordering width on the source side. [sent-284, score-0.529]
68 Once again, it seems that the feature-augmented model is able to benefit from tuning with a metric that is more sensitive to reordering, as the performance of “bin-2-lr” is the best in all reordering statistics. [sent-288, score-0.454]
69 The key dependency orientation that controls the global reordering is between the prepositional modifier dui (English to) and its head word, the verb gandao (English feel). [sent-293, score-1.018]
70 Some directly encode dependency in the translation model (Ding and Palmer, 2005; Quirk et al. [sent-308, score-0.27]
71 (2008) report that just filtering the phrase table by the so-called well-formed target dependency structure does not help, yet adding a target dependency language model improves performance significantly. [sent-315, score-0.384]
72 Our intuitive interpretation is that the target dependency language model capitalizes on two characteristics of the dependency structure: it is based on words and it directly connects head and child. [sent-316, score-0.455]
73 Therefore, the target dependency language model makes good use of the dependency representation as well as the target side training data. [sent-317, score-0.417]
74 We follow the second line of research, and derive three word-based soft constraints from the source dependency parsing. [sent-318, score-0.356]
75 , 2009a,b) which have successfully defined another cohesion constraint from the source dependency structure, with the aim of improving reordering in phrase-based MT. [sent-320, score-0.883]
76 (2009b) define cohesion as translating a source dependency subtree contiguously into the target side without interruption (span or subtree overlapping), following Fox (2002). [sent-322, score-0.853]
77 This span-based cohesion constraint has a different criterion from our word-based cohesion penalty and often leads to opposite conclusions. [sent-323, score-1.097]
78 (2009a) also use cohesion to correlate with the lexicalized reordering model (Tillman, 2004; Koehn et al. [sent-325, score-0.801]
79 , 2005), whereas we define an orthogonal dependency orientation feature to explicitly model head-dependent reordering. [sent-326, score-0.505]
80 Their span-based cohesion constraint is implemented as an “interruption check” to encourage finishing a subtree before translating something else. [sent-328, score-0.481]
81 In fact, it constrains reordering for the phrase-based model, as Cherry finds that the cohesion constraint is used “primarily to prevent distortion” and to provide “an intelligent estimate as to when source order must be respected” (Cherry, 2008). [sent-330, score-0.883]
82 Therefore, our cohesion penalty is better suited for the hierarchical phrase-based model. [sent-333, score-0.763]
83 To discourage nonconstituent translation, Chiang (2005) has proposed a constituency feature to examine whether a source rule span matches the source constituent as defined by phrase structure parsing. [sent-334, score-0.384]
84 Finer-grained constituency constraints significantly improve hierarchical phrase-based MT when applied on the source side (Marton and Resnik, 2008; Chiang et al. [sent-335, score-0.351]
85 Compared to constituency-based approaches, our cohesion penalty based on the dependency structure naturally supports constituent translations as well as some nonconstituent translations, if not all of them (as discussed in Section 2. [sent-340, score-0.89]
86 Our dependency orientation feature is similar to the order model within dependency treelet translation (Quirk et al. [sent-342, score-0.804]
87 Yet instead of a head-relative position number for each modifier word, we simply predict the head-dependent orientation which is either monotone or reversed. [sent-344, score-0.413]
88 Our coarser-grained approach is more robust from a machine learning perspective, yet still captures prominent and long-distance reordering patterns observed in Chinese–English (Wang et al. [sent-345, score-0.35]
89 Not committed to specific language pairs, we learn orientation classification from the word-aligned parallel data through maximum entropy training as Zens and Ney (2006) and Chang et al. [sent-349, score-0.349]
90 (2009) also make use of source dependency; their orientation classification concerns two subsequent phrase pairs in the left-to-right phrase-based decoding (as opposed to each dependent word and its head) and is therefore less linguistically motivated. [sent-353, score-0.431]
91 6 Conclusion We have derived three novel features from the source dependency structure for hierarchical phrase-based MT. [sent-354, score-0.355]
92 They work as a whole to capitalize on two characteristics of the dependency representation: it is directly based on words and it directly connects head and child. [sent-355, score-0.263]
93 On average we improve reordering precision and recall by 6.9 and 0.3 absolute points, respectively. [sent-360, score-0.419]
94 2, the cohesion penalty can be extended to also account for how a head word is translated with its children so that we are not biased towards one form of cohesive nonconstituent translation. [sent-365, score-0.999]
95 Source-side dependency tree reordering models with subtree movements and constraints. [sent-381, score-0.536]
96 LRscore for evaluating lexical and reordering quality in MT. [sent-399, score-0.35]
97 Soft syntactic constraints for hierarchical phrase-based translation using latent syntactic distributions. [sent-502, score-0.284]
98 Syntactic reordering in preprocessing for Japanese-to-English translation: MIT system description for NTCIR-7 patent translation task. [sent-507, score-0.464]
99 A new string-to-dependency machine translation algorithm with a target dependency language model. [sent-589, score-0.306]
100 Maximum entropy based phrase reordering model for statistical machine translation. [sent-636, score-0.35]
wordName wordTfidf (topN-words)
[('cohesion', 0.451), ('reordering', 0.35), ('orientation', 0.349), ('lrscore', 0.226), ('penalty', 0.195), ('cohesive', 0.158), ('dependency', 0.156), ('unresolved', 0.136), ('chiang', 0.122), ('derivation', 0.122), ('bleu', 0.118), ('hierarchical', 0.117), ('translation', 0.114), ('bach', 0.109), ('bins', 0.106), ('yu', 0.091), ('nonconstituent', 0.088), ('birch', 0.085), ('cherry', 0.085), ('source', 0.082), ('bin', 0.079), ('koehn', 0.077), ('head', 0.075), ('tuning', 0.071), ('beihan', 0.07), ('jhead', 0.07), ('resnik', 0.068), ('reorderings', 0.067), ('constituency', 0.066), ('osborne', 0.066), ('rule', 0.066), ('soft', 0.065), ('monotone', 0.064), ('marton', 0.063), ('unaligned', 0.059), ('palestinian', 0.055), ('reversed', 0.053), ('constraints', 0.053), ('abbas', 0.053), ('dui', 0.053), ('jdep', 0.053), ('depth', 0.052), ('setiawan', 0.051), ('translates', 0.049), ('xiong', 0.047), ('glue', 0.047), ('korea', 0.045), ('pna', 0.045), ('authority', 0.045), ('ter', 0.043), ('english', 0.04), ('north', 0.04), ('eu', 0.039), ('boulder', 0.039), ('points', 0.038), ('countries', 0.038), ('recall', 0.037), ('alignment', 0.037), ('proceedings', 0.036), ('target', 0.036), ('reproduced', 0.036), ('quirk', 0.035), ('apdt', 0.035), ('binning', 0.035), ('diplomatic', 0.035), ('gandao', 0.035), ('hpbmt', 0.035), ('idep', 0.035), ('ihead', 0.035), ('interruption', 0.035), ('nhn', 0.035), ('pnn', 0.035), ('terminator', 0.035), ('tolerant', 0.035), ('widths', 0.035), ('chinese', 0.034), ('prep', 0.034), ('side', 0.033), ('annual', 0.033), ('sensitive', 0.033), ('wu', 0.033), ('pages', 0.032), ('connects', 0.032), ('translated', 0.032), ('precision', 0.032), ('promote', 0.031), ('ordering', 0.03), ('subtree', 0.03), ('tillman', 0.03), ('sourceside', 0.03), ('tseng', 0.03), ('hayashi', 0.03), ('kneser', 0.03), ('absolute', 0.03), ('treelet', 0.029), ('chang', 0.029), ('shen', 0.029), ('meeting', 0.029), ('width', 0.028), ('phrasebased', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
2 0.30552447 136 emnlp-2011-Training a Parser for Machine Translation Reordering
Author: Jason Katz-Brown ; Slav Petrov ; Ryan McDonald ; Franz Och ; David Talbot ; Hiroshi Ichikawa ; Masakazu Seno ; Hideto Kazawa
Abstract: We propose a simple training regime that can improve the extrinsic performance of a parser, given only a corpus of sentences and a way to automatically evaluate the extrinsic quality of a candidate parse. We apply our method to train parsers that excel when used as part of a reordering component in a statistical machine translation system. We use a corpus of weakly-labeled reference reorderings to guide parser training. Our best parsers contribute significant improvements in subjective translation quality while their intrinsic attachment scores typically regress.
3 0.27683645 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
Author: Karthik Visweswariah ; Rajakrishnan Rajkumar ; Ankur Gandhe ; Ananthakrishnan Ramanathan ; Jiri Navratil
Abstract: Preordering of source side sentences has proved to be useful in improving statistical machine translation. Most work has used a parser in the source language along with rules to map the source language word order into the target language word order. The requirement to have a source language parser is a major drawback, which we seek to overcome in this paper. Instead of using a parser and then using rules to order the source side sentence we learn a model that can directly reorder source side sentences to match target word order using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of a word immediately preceding another word. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing and show that it suffices to provide good quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi we show that our technique achieves better performance than a method that uses rules applied to the source side English parse.
4 0.2403426 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
Author: Jun Xie ; Haitao Mi ; Qun Liu
Abstract: Dependency structure, as a first step towards semantics, is believed to be helpful to improve translation quality. However, previous works on dependency structure based models typically resort to insertion operations to complete translations, which make it difficult to specify ordering information in translation rules. In our model of this paper, we handle this problem by directly specifying the ordering information in head-dependents rules which represent the source side as head-dependents relations and the target side as strings. The head-dependents rules require only substitution operation, thus our model requires no heuristics or separate ordering models of the previous works to control the word order of translations. Large-scale experiments show that our model performs well on long distance reordering, and outperforms the state-of-the-art constituency-to-string model (+1.47 BLEU on average) and hierarchical phrase-based model (+0.46 BLEU on average) on two Chinese-English NIST test sets without resort to phrases or parse forest. For the first time, a source dependency structure based model catches up with and surpasses the state-of-the-art translation models.
5 0.20995505 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and Urdu-English translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
6 0.17363031 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering
7 0.17179203 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
8 0.17105059 125 emnlp-2011-Statistical Machine Translation with Local Language Models
9 0.1560545 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
10 0.13857132 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
11 0.12328822 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
12 0.10991104 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
13 0.10041107 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
14 0.10034323 3 emnlp-2011-A Correction Model for Word Alignments
15 0.1003205 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation
16 0.094626226 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
17 0.087258369 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser
18 0.080228068 100 emnlp-2011-Optimal Search for Minimum Error Rate Training
19 0.076845594 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing
20 0.076303944 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
topicId topicWeight
[(0, 0.301), (1, 0.3), (2, 0.163), (3, -0.045), (4, -0.041), (5, -0.004), (6, 0.337), (7, 0.107), (8, 0.021), (9, 0.033), (10, 0.054), (11, 0.031), (12, -0.035), (13, -0.069), (14, -0.045), (15, -0.075), (16, -0.088), (17, 0.104), (18, -0.032), (19, 0.096), (20, -0.025), (21, 0.059), (22, -0.078), (23, 0.077), (24, -0.1), (25, 0.065), (26, 0.01), (27, 0.014), (28, -0.068), (29, 0.001), (30, 0.001), (31, 0.035), (32, -0.07), (33, 0.019), (34, 0.045), (35, 0.01), (36, 0.001), (37, -0.029), (38, 0.009), (39, -0.012), (40, -0.025), (41, -0.02), (42, -0.025), (43, -0.091), (44, -0.043), (45, -0.043), (46, 0.004), (47, 0.016), (48, -0.067), (49, 0.104)]
simIndex simValue paperId paperTitle
same-paper 1 0.94395703 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
2 0.83343822 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
Author: Karthik Visweswariah ; Rajakrishnan Rajkumar ; Ankur Gandhe ; Ananthakrishnan Ramanathan ; Jiri Navratil
Abstract: Preordering of source side sentences has proved to be useful in improving statistical machine translation. Most work has used a parser in the source language along with rules to map the source language word order into the target language word order. The requirement to have a source language parser is a major drawback, which we seek to overcome in this paper. Instead of using a parser and then using rules to order the source side sentence we learn a model that can directly reorder source side sentences to match target word order using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of a word immediately preceding another word. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing and show that it suffices to provide good quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi we show that our technique achieves better performance than a method that uses rules applied to the source side English parse.
3 0.72206813 136 emnlp-2011-Training a Parser for Machine Translation Reordering
Author: Jason Katz-Brown ; Slav Petrov ; Ryan McDonald ; Franz Och ; David Talbot ; Hiroshi Ichikawa ; Masakazu Seno ; Hideto Kazawa
Abstract: We propose a simple training regime that can improve the extrinsic performance of a parser, given only a corpus of sentences and a way to automatically evaluate the extrinsic quality of a candidate parse. We apply our method to train parsers that excel when used as part of a reordering component in a statistical machine translation system. We use a corpus of weakly-labeled reference reorderings to guide parser training. Our best parsers contribute significant improvements in subjective translation quality while their intrinsic attachment scores typically regress.
4 0.71857512 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
Author: Jun Xie ; Haitao Mi ; Qun Liu
Abstract: Dependency structure, as a first step towards semantics, is believed to be helpful to improve translation quality. However, previous works on dependency structure based models typically resort to insertion operations to complete translations, which make it difficult to specify ordering information in translation rules. In our model of this paper, we handle this problem by directly specifying the ordering information in head-dependents rules which represent the source side as head-dependents relations and the target side as strings. The head-dependents rules require only substitution operation, thus our model requires no heuristics or separate ordering models of the previous works to control the word order of translations. Large-scale experiments show that our model performs well on long distance reordering, and outperforms the state-of-the-art constituency-to-string model (+1.47 BLEU on average) and hierarchical phrase-based model (+0.46 BLEU on average) on two Chinese-English NIST test sets without resort to phrases or parse forest. For the first time, a source dependency structure based model catches up with and surpasses the state-of-the-art translation models.
5 0.67715776 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering
Author: John DeNero ; Jakob Uszkoreit
Abstract: When translating among languages that differ substantially in word order, machine translation (MT) systems benefit from syntactic preordering—an approach that uses features from a syntactic parse to permute source words into a target-language-like order. This paper presents a method for inducing parse trees automatically from a parallel corpus, instead of using a supervised parser trained on a treebank. These induced parses are used to preorder source sentences. We demonstrate that our induced parser is effective: it not only improves a state-of-the-art phrase-based system with integrated reordering, but also approaches the performance of a recent preordering method based on a supervised parser. These results show that the syntactic structure which is relevant to MT pre-ordering can be learned automatically from parallel text, thus establishing a new application for unsupervised grammar induction.
6 0.62286669 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
7 0.57254803 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
8 0.51297694 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
9 0.50737011 125 emnlp-2011-Statistical Machine Translation with Local Language Models
10 0.48645318 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
11 0.44305995 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
12 0.40998122 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
13 0.39321983 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
14 0.39075705 3 emnlp-2011-A Correction Model for Word Alignments
15 0.37916464 47 emnlp-2011-Efficient retrieval of tree translation examples for Syntax-Based Machine Translation
16 0.36829567 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation
17 0.353836 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
18 0.34227183 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
19 0.34063327 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
20 0.32912487 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
topicId topicWeight
[(23, 0.106), (35, 0.02), (36, 0.021), (37, 0.056), (40, 0.219), (45, 0.059), (53, 0.07), (54, 0.051), (62, 0.029), (64, 0.029), (65, 0.021), (66, 0.032), (69, 0.032), (79, 0.045), (82, 0.017), (85, 0.024), (90, 0.015), (96, 0.045), (98, 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.76757812 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
2 0.58754081 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and Urdu-English translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
3 0.58304453 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
Author: Jun Xie ; Haitao Mi ; Qun Liu
Abstract: Dependency structure, as a first step towards semantics, is believed to be helpful to improve translation quality. However, previous works on dependency structure based models typically resort to insertion operations to complete translations, which make it difficult to specify ordering information in translation rules. In our model of this paper, we handle this problem by directly specifying the ordering information in head-dependents rules which represent the source side as head-dependents relations and the target side as strings. The head-dependents rules require only substitution operation, thus our model requires no heuristics or separate ordering models of the previous works to control the word order of translations. Large-scale experiments show that our model performs well on long distance reordering, and outperforms the state-of-the-art constituency-to-string model (+1.47 BLEU on average) and hierarchical phrase-based model (+0.46 BLEU on average) on two Chinese-English NIST test sets without resort to phrases or parse forest. For the first time, a source dependency structure based model catches up with and surpasses the state-of-the-art translation models.
4 0.57546449 66 emnlp-2011-Hierarchical Phrase-based Translation Representations
Author: Gonzalo Iglesias ; Cyril Allauzen ; William Byrne ; Adria de Gispert ; Michael Riley
Abstract: This paper compares several translation representations for a synchronous context-free grammar parse including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of target LM intersection and shortest-path algorithms that follow. Intersection, shortest path, FSA expansion and RTN replacement algorithms are presented for PDAs. Chinese-to-English translation experiments using HiFST and HiPDT, FSA and PDA-based decoders, are presented using admissible (or exact) search, possible for HiFST with compact SCFG rulesets and HiPDT with compact LMs. For large rulesets with large LMs, we introduce a two-pass search strategy which we then analyze in terms of search errors and translation performance.
5 0.57505929 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
Author: Jiajun Zhang ; Feifei Zhai ; Chengqing Zong
Abstract: Due to its explicit modeling of the grammaticality of the output via target-side syntax, the string-to-tree model has been shown to be one of the most successful syntax-based translation models. However, a major limitation of this model is that it does not utilize any useful syntactic information on the source side. In this paper, we analyze the difficulties of incorporating source syntax in a string-to-tree model. We then propose a new way to use the source syntax in a fuzzy manner, both in source syntactic annotation and in rule matching. We further explore three algorithms in rule matching: 0-1 matching, likelihood matching, and deep similarity matching. Our method not only guarantees grammatical output with an explicit target tree, but also enables the system to choose the proper translation rules via fuzzy use of the source syntax. Our extensive experiments have shown significant improvements over the state-of-the-art string-to-tree system.
6 0.57461274 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
7 0.5736137 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
8 0.57012022 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
9 0.56922251 136 emnlp-2011-Training a Parser for Machine Translation Reordering
10 0.56541193 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
11 0.56261486 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
12 0.56253916 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
13 0.55986768 46 emnlp-2011-Efficient Subsampling for Training Complex Language Models
14 0.55845308 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
15 0.55634189 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
16 0.55533153 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
17 0.55513602 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
18 0.55349451 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference
19 0.55331552 128 emnlp-2011-Structured Relation Discovery using Generative Models
20 0.55297995 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming