acl acl2011 acl2011-61 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-constituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal, so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in the NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English-to-German, French, Spanish and Czech tracks.
Reference: text
sentIndex sentText sentNum sentScore
1 Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. [sent-4, score-0.738]
2 Provably, our binarization forest can cover any non-constituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal, so that the grammar constant for decoding is relatively small. [sent-6, score-0.95]
3 For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. [sent-7, score-0.814]
4 The unifying framework for these models is synchronous grammars (Chiang, 2005) or tree transducers (Graehl and Knight, 2004). [sent-13, score-0.358]
5 , 2008) In terms of search, the string-to-x models explore all possible source parses and map them to the target side, while the tree-to-x models search over the subspace of structures of the source side constrained by an input tree or trees. [sent-25, score-0.376]
6 (2006) can match multilevel tree fragments on the source side which means larger contexts are taken into account for translation (Poutsma, 2000), which is a modeling advantage. [sent-28, score-0.491]
7 Traditionally, such forests are obtained through hyper-edge pruning in the k-best search space of a monolingual parser (Huang, 2008). [sent-31, score-0.324]
8 The pruning parameters that control the size of forests are normally handtuned. [sent-32, score-0.25]
9 Association for Computational Linguistics, pages 835–845. We believe that structural variants which allow more source spans to be explored during translation are more important (DeNeefe et al. [sent-37, score-0.277]
10 To focus on structural variants, we propose a family of binarization algorithms to expand one single constituent tree into a packed forest of binary trees containing combinations of adjacent tree nodes. [sent-39, score-1.4]
11 We control the freedom of tree node binary combination by restricting the distance to the lowest common ancestor of two tree nodes. [sent-40, score-0.6]
12 Instead of using arbitrary pruning parameters, we control forest size by an integer parameter that defines the degree of tree structure violation. [sent-45, score-0.279]
, 2004) can cover multi-level tree fragments, a synchronous grammar extracted using the GHKM algorithm can have synchronous translation rules with more than two non-terminals regardless of the branching factor of the source trees. [sent-48, score-1.029]
14 For the first time, we show that similar to string-to-tree decoding, synchronous binarization significantly reduces search errors and improves translation quality for forest-to-string decoding. [sent-49, score-0.983]
15 First, a parser produces the highest-scored tree for an input sentence. [sent-51, score-0.235]
16 Second, the parse tree is restructured using our binarization algorithm, resulting in a binary packed forest. [sent-52, score-0.903]
17 Third, we apply the forest-based variant of the GHKM algorithm (Mi and Huang, 2008) on the new forest for rule extraction. [sent-53, score-0.388]
18 Fourth, on the translation forest generated by all applicable translation rules, which is not necessarily binary, we apply the synchronous binarization algorithm (Zhang et al. [sent-54, score-1.439]
19 1, we introduce a more efficient and flexible algorithm for extracting composed GHKM rules based on the same principle as cube pruning (Chiang, 2007). [sent-60, score-0.287]
20 In Section 3, we introduce our source tree binarization algorithm for producing binarized forests. [sent-61, score-0.969]
21 In Section 4, we explain how to do synchronous rule factorization in a forest-to-string decoder. [sent-62, score-0.298]
22 When the first-best parse is wrong or no good translation rules are applicable to the first-best parse, the model can recover good translations from alternative parses. [sent-71, score-0.245]
23 In STSG, local features are defined on tree-tostring rules, which are synchronous grammar rules defining how a sequence of terminals and nonterminals on the source side translates to a sequence of target terminals and nonterminals. [sent-72, score-0.631]
24 Figure 1 shows a typical English-Chinese tree-to-string rule with a reordering pattern consisting of two nonterminals and different numbers of terminals on the two sides. [sent-75, score-0.252]
25 The second stage is rule enumeration and DP decoding on forests of input strings. [sent-81, score-0.345]
26 In both stages, at each tree node, the task on the source side is to generate a list of tree fragments by composing the tree fragments of its children. [sent-82, score-0.729]
27 We propose a cube-pruning style algorithm that is suitable for both rule extraction during training and rule enumeration during decoding. [sent-83, score-0.272]
28 In the first step, we label each node in the input forest by a boolean variable indicating whether it is a site of interest for tree fragment generation. [sent-85, score-0.649]
29 In the case of rule extraction, a node is admissible if and only if it corresponds to a phrase pair according to the underlying word alignment. [sent-87, score-0.322]
30 In the case of decoding, every node is admissible for the sake of completeness of search. [sent-88, score-0.221]
31 An initial one-node tree fragment is placed at each admissible node for seeding the tree fragment generation process. [sent-89, score-0.655]
32 In the second step, we do cube-pruning style bottom-up combinations to enumerate a pruned list of tree fragments at each tree node. [sent-90, score-0.373]
33 In the third step, we extract or enumerateand-match tree-to-string rules for the tree fragments at the admissible nodes. [sent-91, score-0.36]
34 We can imagine substituting k-best LM states with k composed rules at each node and composing them bottom-up. [sent-100, score-0.259]
35 We can also borrow the cube pruning trick to compose multiple lists of rules using a priority queue to lazily explore the space of combinations starting from the top-most element in the cube formed by the lists. [sent-101, score-0.384]
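The lazy cube exploration described in the sentence above can be sketched as follows. This is a hedged illustration of the cube-pruning trick (Chiang, 2007) in its simplest two-list setting; the function name and interfaces are assumptions made for exposition, not the paper's implementation: a priority queue starts from the top-most corner of the cube and pops the cheapest combination, pushing its neighbors.

```python
# Hedged sketch of cube pruning: lazily enumerate the k cheapest
# combinations of two ascending cost lists with a priority queue,
# starting from the top-most corner (index (0, 0)) of the cube.
import heapq

def k_best_combinations(a, b, k):
    """a, b: cost lists sorted ascending; returns the k cheapest sums."""
    heap = [(a[0] + b[0], 0, 0)]
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        cost, i, j = heapq.heappop(heap)
        out.append(cost)
        # Push the two cube neighbors of (i, j), deduplicated.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + b[nj], ni, nj))
    return out
```

The same lazy frontier expansion generalizes from two lists to the multiple rule lists mentioned above.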
36 (2006), we define the figure of merit of a tree-to-string rule as a tuple m = (h, s, t), where h is the height of a tree fragment, s is the number of frontier nodes, i.e. [sent-104, score-0.333]
37 bottom-level nodes including both terminals and non-terminals, and t is the number of terminals in the set of frontier nodes. [sent-106, score-0.299]
38 We define an additive operator +: m1 + m2 = (max{h1, h2} + 1, s1 + s2, t1 + t2) and a min operator based on the lexicographic order <: m1 < m2 ⟺ h1 < h2 ∨ (h1 = h2 ∧ (s1 < s2 ∨ (s1 = s2 ∧ t1 < t2))). The + operator corresponds to rule compositions. [sent-107, score-0.276]
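The tuple algebra just defined can be sketched directly; the function names below are illustrative assumptions, but the two operators follow the definitions in the sentence above, and the order on (h, s, t) is exactly lexicographic comparison.

```python
# Minimal sketch of the figure-of-merit algebra: m = (h, s, t) with
# h = fragment height, s = number of frontier nodes, t = number of
# terminals among the frontier nodes.

def compose(m1, m2):
    """The additive '+' operator, corresponding to rule composition:
    heights take a max plus one; frontier and terminal counts add."""
    h1, s1, t1 = m1
    h2, s2, t2 = m2
    return (max(h1, h2) + 1, s1 + s2, t1 + t2)

def less_than(m1, m2):
    """The '<' order compares (h, s, t) lexicographically, which is
    precisely Python's built-in tuple comparison."""
    return m1 < m2
```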
39 3 Source Tree Binarization The motivation of tree binarization is to factorize large and rare structures into smaller but frequent ones to improve generalization. [sent-111, score-0.803]
40 For example, the simplest binarization methods left-to-right, right-to-left, and head-out explore sharing of prefixes or suffixes. [sent-119, score-0.646]
41 Among exponentially many binarization choices, these algorithms pick a single bracketing structure for a sequence of sibling nodes. [sent-120, score-0.617]
42 To explore all possible binarizations, we use a CYK algorithm to produce a packed forest of binary trees for a given sibling sequence. [sent-121, score-0.496]
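The CYK construction described above can be sketched as follows, under stated assumptions (span-pair representation and function name are illustrative, not the paper's code): every span of the sibling sequence becomes one forest node, and every split point contributes one binary hyperedge, so all exponentially many binarizations are packed into O(n³) hyperedges.

```python
# Sketch of CYK binarization over the children of an n-ary node:
# each hyperedge is (parent span, left child span, right child span),
# with spans written as half-open index intervals [i, j).

def cyk_binarize(children):
    n = len(children)
    edges = []
    for width in range(2, n + 1):       # span widths, bottom-up
        for i in range(n - width + 1):  # span start
            j = i + width               # span end
            for k in range(i + 1, j):   # every split point
                edges.append(((i, j), (i, k), (k, j)))
    return edges
```

For three siblings this yields four hyperedges: one each for spans (0,2) and (1,3), and two for the full span (0,3), one per split.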
43 With CYK binarization, we can explore any span that is nested within the original tree structure, but still miss all cross-bracket spans. [sent-122, score-0.222]
44 As a result, tree-to-string translation based on constituent phrases misses the good translation rule. [sent-125, score-0.371]
45 The CYK-n binarization algorithm shown in Algorithm 1 is a parameterization of the basic CYK binarization algorithm we just outlined. [sent-126, score-1.312]
46 The idea is that binarization can go beyond the scope of parent nodes to more distant ancestors. [sent-127, score-0.676]
47 The CYK-n algorithm first annotates each node with its n nearest ancestors in the source tree, then generates a binarization forest that allows combining any two nodes with common ancestors. [sent-128, score-1.292]
48 The ancestor chain labeled at each node licenses the node to only combine with nodes having common ancestors in the past n generations. [sent-129, score-0.579]
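The licensing condition stated above admits a very small sketch; the representation here (each node carries a tuple of its n nearest ancestors) is an assumption made for illustration, not the paper's data structure.

```python
# Sketch of the CYK-n licensing test: two nodes may combine only if
# they share a common ancestor within the past n generations, i.e.,
# their ancestor chains intersect.

def may_combine(chain1, chain2):
    """chain: tuple of a node's n nearest ancestors, nearest first."""
    return bool(set(chain1) & set(chain2))
```

For example, with n = 2, a node under NP and one under PP may combine if both NP and PP hang off the same VP, since 'VP' appears in both chains.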
49 838 New tree nodes need to have their own states indicated by a node label representing what is covered internally by the node and an ancestor chain representing which nodes the node attaches to externally. [sent-131, score-0.852]
50 Line 22 and Line 23 of Algorithm 1 update the label and ancestor annotations of new tree nodes. [sent-132, score-0.287]
51 It can be proved that the final label of each new node in the forest corresponds to the tree sequence which has the minimum length among all sequences covered by the node span. [sent-143, score-0.774]
52 The ancestor chain of a new node is the common ancestors of the nodes in its minimum tree sequence. [sent-144, score-0.623]
53 In this example, standard CYK binarization will not create any new trees since the input is already binary. [sent-152, score-0.672]
54 4 Synchronous Binarization for Forest-to-string Decoding In this section, we deal with binarization of translation forests, also known as translation hypergraphs (Mi et al. [sent-154, score-0.98]
55 A translation forest is a packed forest representation of all synchronous derivations composed of tree-to-string rules that match the source forest. [sent-156, score-1.074]
56 Tree-to-string decoding algorithms work on a translation forest, rather than a source forest. [sent-157, score-0.271]
57 A binary source forest does not necessarily always result in a binary translation forest. [sent-158, score-0.542]
The source forest can be made binary with the help of source tree binarization, but the translation rule involves three variables in the set of frontier nodes. [sent-161, score-0.564]
59 , 2006), we can factorize it into two smaller translation rules each having two variables. [sent-163, score-0.27]
60 Obviously, the second rule, which is a common pattern, is likely to be shared by many translation rules in the derivation forest. [sent-164, score-0.245]
61 The challenge of synchronous binarization for a forest-to-string system is that we need to first match large tree fragments in the input forest as the first step of decoding. [sent-166, score-1.274]
62 Our solution is to do the matching using the original rules and then run synchronous binarization to break matching rules down to factor rules which can be shared in the derivation forest. [sent-167, score-1.042]
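The core test behind breaking a matched rule into binary factor rules can be sketched with the standard linear-time shift-reduce check from Zhang et al. (2006): two adjacent source nonterminals may be merged into a virtual nonterminal iff they cover a contiguous target span. The code below is a simplified sketch of that check (the permutation encoding is an assumption for illustration), not the system's implementation.

```python
# Sketch of the synchronous-binarizability check: perm[i] gives the
# target-side position of the i-th source nonterminal. We scan left to
# right, merging the top two stack intervals whenever they are adjacent
# on the target side; the rule is binarizable iff one interval remains.

def binarizable(perm):
    stack = []  # stack of (lo, hi) target intervals
    for p in perm:
        stack.append((p, p))
        while len(stack) >= 2:
            lo2, hi2 = stack[-1]
            lo1, hi1 = stack[-2]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1
```

The classic non-binarizable reordering 2-4-1-3 (zero-indexed: (1, 3, 0, 2)) fails this test, while any monotone or nested reordering passes.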
63 This is different from the offline binarization scheme described in (Zhang et al. [sent-168, score-0.617]
We compare three systems: a phrase-based system (Och and Ney, 2004), a hierarchical phrase-based system (Chiang, 2005), and our forest-to-string system with different binarization schemes. [sent-204, score-0.668]
2 Translation Results Table 2 shows the scores of our system with the best binarization scheme compared to the phrase-based system and the hierarchical phrase-based system. [sent-218, score-0.668]
66 3 Different Binarization Methods The translation results for the bf2s system in Table 2 are based on the cyk binarization algorithm with bracket violation degree 2. [sent-232, score-1.008]
67 Table 3 shows the scores of different tree binarization methods for the English-Chinese task. [sent-238, score-0.778]
68 It is clear from reading the table that cyk-2 is the optimal binarization parameter. [sent-239, score-0.617]
If the parser is reasonably good, crossing just one bracket is likely to cover most interesting phrases that can be translation units. [sent-243, score-0.289]
70 From another point of view, enlarging the forests entails more parameters in the resulting translation model, making over-fitting likely to happen. [sent-244, score-0.329]
71 A natural question is how the binarizer-generated forests compare with parser-generated forests in translation. [sent-247, score-0.32]
Table 3: Comparing different source tree binarization schemes for English-Chinese translation, showing both BLEU scores and model sizes. [sent-250, score-0.827]
73 So the packed forest it produces is also a binarized forest. [sent-261, score-0.438]
We compare two systems: one uses the cyk-2 binarizer to generate forests; the other uses the CRF parser with pruning threshold e−p, where p = 2, to generate forests. Although the parser outputs binary trees, we found cross-bracket cyk-2 binarization is still helpful. [sent-262, score-0.103]
75 07 Table 4: Binarized forests versus parser-generated forests for forest-to-string English-German translation. [sent-266, score-0.32]
76 Table 4 shows the comparison of binarization forest and parser forest on English-German translation. [sent-267, score-1.187]
The results show that the cyk-2 forest performs slightly better. (Footnote 1: all hyper-edges with negative log posterior probability larger than p are pruned.) [sent-268, score-0.248]
78 The difference is that they do the forest pruning on a forest generated by a k-best algorithm, while we do the forest-pruning on the full CYK chart. [sent-270, score-0.586]
79 As a result, we need more aggressive pruning to control forest size. [sent-271, score-0.338]
80 We have not done full exploration of forest pruning parameters to fine-tune the parser-forest. [sent-273, score-0.338]
81 It does not require hand-tuning of forest pruning parameters for training. [sent-277, score-0.338]
82 5 Synchronous Binarization In this section, we demonstrate the effect of synchronous binarization for both tree-to-string and forest-to-string translation. [sent-279, score-0.814]
, the maximum number of nonterminals on the right-hand side of any synchronous translation rule in an input grammar. [sent-283, score-0.528]
84 The competing system does online synchronous binarization as described in Section 4 to transform the grammar intersected with the input sentence to the minimum branching factor k′ (k′ < k), and then applies k′-way cube pruning. [sent-284, score-0.972]
85 2075 Table 5: The effect of synchronous binarization for treeto-string and forest-to-string systems, on the EnglishChinese task. [sent-289, score-0.814]
86 Table 5 shows that synchronous binarization does help reduce search errors and find better translations consistently in all settings. [sent-290, score-0.814]
(2007) proposed tree-sequence-to-string translation rules but did not provide a good solution to place joint subtrees into connection with the rest of the tree structure. [sent-294, score-0.432]
88 (2007) used target tree binarization to improve rule extraction for their string-to-tree system. [sent-301, score-0.906]
89 Their binarization forest is equivalent to our cyk-1 forest. [sent-302, score-0.865]
90 In contrast to theirs, our binarization scheme affects decoding directly because we match tree-to-string rules on a binarized forest. [sent-303, score-0.849]
91 Different methods of translation rule binarization have been discussed in Huang (2007). [sent-304, score-0.887]
Their argument is that for tree-to-string decoding, target-side binarization is simpler than synchronous binarization and works well because creating discontinuous source spans does not explode the state space. [sent-305, score-1.649]
93 Our experiments show that synchronous binarization helps significantly in the forest-to-string case. [sent-307, score-0.814]
94 It involves a source tree binarization step and a standard forest-to-string translation step. [sent-309, score-0.996]
95 We have demonstrated state-of-the-art results using a fast parser and a simple tree binarizer that allows crossing at most one bracket in each binarized node. [sent-311, score-0.459]
We adapted the synchronous binarization technique to improve search and have shown significant gains. [sent-313, score-0.814]
97 In the future, we plan to improve the learning of translation rules with binarized forests. [sent-316, score-0.348]
Efficient minimum error rate training and minimum Bayes-risk decoding for translation hypergraphs and lattices. [sent-387, score-0.311]
99 A new string-to-dependency machine translation algorithm with a target dependency language model. [sent-438, score-0.235]
100 Binarizing syntax trees to improve syntax-based machine translation accuracy. [sent-443, score-0.224]
wordName wordTfidf (topN-words)
[('binarization', 0.617), ('forest', 0.248), ('synchronous', 0.197), ('pnode', 0.17), ('translation', 0.169), ('tree', 0.161), ('forests', 0.16), ('vbn', 0.153), ('node', 0.149), ('ancestors', 0.131), ('vbd', 0.118), ('lnode', 0.113), ('rnode', 0.113), ('cyk', 0.109), ('binarized', 0.103), ('rule', 0.101), ('terminals', 0.097), ('ancestor', 0.091), ('pruning', 0.09), ('packed', 0.087), ('cube', 0.082), ('rules', 0.076), ('binarizer', 0.075), ('parser', 0.074), ('admissible', 0.072), ('galley', 0.063), ('huang', 0.062), ('side', 0.061), ('nodes', 0.059), ('fragment', 0.056), ('trees', 0.055), ('ghkm', 0.055), ('decoding', 0.053), ('chiang', 0.052), ('hierarchical', 0.051), ('fragments', 0.051), ('wmt', 0.051), ('vp', 0.05), ('stands', 0.049), ('source', 0.049), ('bleu', 0.047), ('bracket', 0.046), ('frontier', 0.046), ('operator', 0.044), ('branching', 0.044), ('associationfor', 0.044), ('mi', 0.044), ('min', 0.043), ('zhang', 0.042), ('mb', 0.039), ('concatenate', 0.039), ('algorithm', 0.039), ('binary', 0.038), ('addedge', 0.038), ('getnode', 0.038), ('knight', 0.037), ('kevin', 0.037), ('venugopal', 0.036), ('label', 0.035), ('zollmann', 0.035), ('liang', 0.035), ('google', 0.035), ('decoder', 0.034), ('composing', 0.034), ('hier', 0.033), ('constituent', 0.033), ('rb', 0.032), ('pp', 0.032), ('span', 0.032), ('deneefe', 0.032), ('hao', 0.032), ('och', 0.032), ('minimum', 0.032), ('pb', 0.031), ('variants', 0.031), ('stsg', 0.031), ('adjp', 0.031), ('enumeration', 0.031), ('npb', 0.031), ('graehl', 0.029), ('propagating', 0.029), ('explore', 0.029), ('degree', 0.028), ('spans', 0.028), ('reordering', 0.027), ('northamerican', 0.027), ('target', 0.027), ('nonterminals', 0.027), ('franz', 0.026), ('association', 0.026), ('semiring', 0.026), ('aug', 0.026), ('joint', 0.026), ('qun', 0.026), ('liu', 0.026), ('begin', 0.026), ('factorize', 0.025), ('height', 0.025), ('priority', 0.025), ('hypergraphs', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 61 acl-2011-Binarized Forest to String Translation
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-constituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal, so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in the NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English-to-German, French, Spanish and Czech tracks.
2 0.59763145 296 acl-2011-Terminal-Aware Synchronous Binarization
Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea
Abstract: We present an SCFG binarization algorithm that combines the strengths of early terminal matching on the source language side and early language model integration on the target language side. We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality.
3 0.33941695 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation
Author: Xianchao Wu ; Takuya Matsuzaki ; Jun'ichi Tsujii
Abstract: In the present paper, we propose the effective usage of function words to generate generalized translation rules for forest-based translation. Given aligned forest-string pairs, we extract composed tree-to-string translation rules that account for multiple interpretations of both aligned and unaligned target function words. In order to constrain the exhaustive attachments of function words, we limit to bind them to the nearby syntactic chunks yielded by a target dependency parser. Therefore, the proposed approach can not only capture source-tree-to-target-chunk correspondences but can also use forest structures that compactly encode an exponential number of parse trees to properly generate target function words during decoding. Extensive experiments involving large-scale English-toJapanese translation revealed a significant im- provement of 1.8 points in BLEU score, as compared with a strong forest-to-string baseline system.
4 0.28481936 30 acl-2011-Adjoining Tree-to-String Translation
Author: Yang Liu ; Qun Liu ; Yajuan Lu
Abstract: We introduce synchronous tree adjoining grammars (TAG) into tree-to-string translation, which converts a source tree to a target string. Without reconstructing TAG derivations explicitly, our rule extraction algorithm directly learns tree-to-string rules from aligned Treebank-style trees. As tree-to-string translation casts decoding as a tree parsing problem rather than parsing, the decoder still runs fast when adjoining is included. Less than 2 times slower, the adjoining tree-tostring system improves translation quality by +0.7 BLEU over the baseline system only allowing for tree substitution on NIST ChineseEnglish test sets.
5 0.27639836 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations
Author: Bing Zhao ; Young-Suk Lee ; Xiaoqiang Luo ; Liu Li
Abstract: We propose a novel technique of learning how to transform the source parse trees to improve the translation qualities of syntax-based translation models using synchronous context-free grammars. We transform the source tree phrasal structure into a set of simpler structures, expose such decisions to the decoding process, and find the least expensive transformation operation to better model word reordering. In particular, we integrate synchronous binarizations, verb regrouping, removal of redundant parse nodes, and incorporate a few important features such as translation boundaries. We learn the structural preferences from the data in a generative framework. The syntax-based translation system integrating the proposed techniques outperforms the best Arabic-English unconstrained system in NIST08 evaluations by 1.3 absolute BLEU, which is statistically significant.
6 0.25897133 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar
7 0.25597712 217 acl-2011-Machine Translation System Combination by Confusion Forest
8 0.22659737 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
9 0.21649493 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation
10 0.18965062 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation
11 0.17209639 166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation
12 0.1619494 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
13 0.15528002 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers
14 0.13662902 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars
15 0.11280161 44 acl-2011-An exponential translation model for target language morphology
16 0.11140017 188 acl-2011-Judging Grammaticality with Tree Substitution Grammar Derivations
17 0.11137236 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation
18 0.1073759 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
19 0.1068658 28 acl-2011-A Statistical Tree Annotator and Its Applications
20 0.10480625 43 acl-2011-An Unsupervised Model for Joint Phrase Alignment and Extraction
topicId topicWeight
[(0, 0.274), (1, -0.3), (2, 0.146), (3, -0.104), (4, 0.059), (5, 0.012), (6, -0.351), (7, -0.088), (8, -0.089), (9, -0.118), (10, -0.11), (11, -0.075), (12, -0.005), (13, 0.05), (14, 0.117), (15, -0.118), (16, 0.005), (17, 0.05), (18, 0.019), (19, 0.037), (20, -0.068), (21, -0.056), (22, -0.055), (23, 0.103), (24, 0.065), (25, -0.081), (26, -0.045), (27, 0.009), (28, 0.064), (29, 0.026), (30, -0.023), (31, -0.009), (32, -0.0), (33, 0.095), (34, 0.018), (35, 0.123), (36, 0.045), (37, 0.23), (38, 0.044), (39, 0.075), (40, -0.024), (41, 0.009), (42, 0.059), (43, 0.049), (44, -0.013), (45, 0.037), (46, -0.02), (47, -0.008), (48, -0.187), (49, -0.074)]
simIndex simValue paperId paperTitle
1 0.94918716 296 acl-2011-Terminal-Aware Synchronous Binarization
Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea
Abstract: We present an SCFG binarization algorithm that combines the strengths of early terminal matching on the source language side and early language model integration on the target language side. We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality.
same-paper 2 0.91820556 61 acl-2011-Binarized Forest to String Translation
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forestto-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-consitituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-tostring decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.
3 0.80713367 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar
Author: Tagyoung Chung ; Licheng Fang ; Daniel Gildea
Abstract: We discuss some of the practical issues that arise from decoding with general synchronous context-free grammars. We examine problems caused by unary rules and we also examine how virtual nonterminals resulting from binarization can best be handled. We also investigate adding more flexibility to synchronous context-free grammars by adding glue rules and phrases.
4 0.70899367 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars
Author: Mark-Jan Nederhof ; Giorgio Satta
Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.
5 0.70522177 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation
Author: Ashish Vaswani ; Haitao Mi ; Liang Huang ; David Chiang
Abstract: Most statistical machine translation systems rely on composed rules (rules that can be formed out of smaller rules in the grammar). Though this practice improves translation by weakening independence assumptions in the translation model, it nevertheless results in huge, redundant grammars, making both training and decoding inefficient. Here, we take the opposite approach, where we only use minimal rules (those that cannot be formed out of other rules), and instead rely on a rule Markov model of the derivation history to capture dependencies between minimal rules. Large-scale experiments on a state-of-the-art tree-to-string translation system show that our approach leads to a slimmer model, a faster decoder, yet the same translation quality (measured using B ) as composed rules.
6 0.69676322 154 acl-2011-How to train your multi bottom-up tree transducer
7 0.69163454 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations
8 0.68509251 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation
9 0.66977489 30 acl-2011-Adjoining Tree-to-String Translation
10 0.60934311 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
11 0.60666829 217 acl-2011-Machine Translation System Combination by Confusion Forest
12 0.59963846 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers
13 0.5644545 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
14 0.52100641 166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation
15 0.49812883 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation
16 0.47584605 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
17 0.43790239 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars
18 0.4173409 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation
19 0.38905656 44 acl-2011-An exponential translation model for target language morphology
20 0.38575995 28 acl-2011-A Statistical Tree Annotator and Its Applications
topicId topicWeight
[(5, 0.035), (17, 0.117), (26, 0.025), (28, 0.016), (31, 0.011), (37, 0.089), (39, 0.079), (41, 0.069), (55, 0.025), (58, 0.173), (59, 0.033), (62, 0.011), (72, 0.018), (91, 0.032), (96, 0.19)]
simIndex simValue paperId paperTitle
1 0.91452628 319 acl-2011-Unsupervised Decomposition of a Document into Authorial Components
Author: Moshe Koppel ; Navot Akiva ; Idan Dershowitz ; Nachum Dershowitz
Abstract: We propose a novel unsupervised method for separating out distinct authorial components of a document. In particular, we show that, given a book artificially “munged” from two thematically similar biblical books, we can separate out the two constituent books almost perfectly. This allows us to automatically recapitulate many conclusions reached by Bible scholars over centuries of research. One of the key elements of our method is exploitation of differences in synonym choice by different authors. 1
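The synonym-choice idea in this abstract can be sketched as follows: represent each chunk of text by the synonym it prefers from each synonym set, so that chunks by the same author tend to share a signature. The function name and signature representation below are illustrative assumptions; the paper's actual method is more involved.

```python
from collections import Counter

def synonym_signature(chunks, synsets):
    """Sketch: per-chunk signature of preferred synonyms.

    chunks: list of token lists; synsets: list of synonym sets,
    e.g. [{"big", "large"}]. The i-th signature entry is the synonym
    from set i that the chunk uses most, or None if none appear.
    """
    sigs = []
    for words in chunks:
        counts = Counter(words)
        sig = []
        for syns in synsets:
            best = max(syns, key=lambda s: counts[s])
            sig.append(best if counts[best] > 0 else None)
        sigs.append(tuple(sig))
    return sigs
```

Chunks with matching signatures are candidate same-author components; clustering such signatures is one plausible way to realize the separation step.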
same-paper 2 0.87974346 61 acl-2011-Binarized Forest to String Translation
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-constituent phrase in a sentence but maintains the desirable property that for each span there is at most one nonterminal, so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in the NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.
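The binarization idea can be sketched in miniature: left-branching binarization of an n-ary node introduces a virtual nonterminal for every prefix of its children, and those virtual spans are the extra phrases the forest exposes at that node. The function below is a hedged illustration of this span-coverage property, not the paper's actual algorithm.

```python
def binarize_spans(children_spans):
    """Spans produced by left-branching binarization of one n-ary node.

    children_spans: ordered list of (start, end) spans of the node's
    children. Returns the set of spans covered by the binarized nodes:
    every prefix of the children sequence of length >= 2, including
    the full parent span.
    """
    spans = set()
    for k in range(2, len(children_spans) + 1):
        spans.add((children_spans[0][0], children_spans[k - 1][1]))
    return spans
```

For a flat node with children spanning (0,1), (1,2), (2,3), binarization exposes the prefix span (0,2), which need not be a constituent in the original parse; combining left- and right-branching binarizations over the whole tree is what lets the forest cover arbitrary non-constituent phrases.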
3 0.85981792 248 acl-2011-Predicting Clicks in a Vocabulary Learning System
Author: Aaron Michelony
Abstract: We consider the problem of predicting which words a student will click in a vocabulary learning system. Often a language learner will find value in the ability to look up the meaning of an unknown word while reading an electronic document by clicking the word. Highlighting words likely to be unknown to a reader is attractive because it draws his or her attention to them and indicates that information is available. However, this highlighting is usually done manually in vocabulary systems and online encyclopedias such as Wikipedia, and it is never on a per-user basis. This paper presents an automated way of highlighting words likely to be unknown to the specific user. We present related work in search engine ranking, a description of the study used to collect click data, the experiment we performed using the random forest machine learning algorithm, and finish with a discussion of future work.
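As a contrast to the learned, per-user approach in the abstract, a trivial frequency baseline is easy to state: highlight words that are rare in a background corpus. The function and threshold below are assumptions for illustration only; the paper instead trains a random forest on collected click data.

```python
def predict_clicks(document_words, corpus_counts, threshold=5):
    """Baseline sketch: flag words a reader might click.

    Assumption (not from the paper): rare words are more likely to be
    unknown, so any word whose background-corpus frequency falls below
    `threshold` is highlighted as a likely lookup target.
    """
    return [w for w in document_words if corpus_counts.get(w, 0) < threshold]
```

A learned model improves on this baseline precisely because unknown-word status varies by user, which raw corpus frequency cannot capture.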
4 0.81718421 30 acl-2011-Adjoining Tree-to-String Translation
Author: Yang Liu ; Qun Liu ; Yajuan Lu
Abstract: We introduce synchronous tree adjoining grammars (TAG) into tree-to-string translation, which converts a source tree to a target string. Without reconstructing TAG derivations explicitly, our rule extraction algorithm directly learns tree-to-string rules from aligned Treebank-style trees. As tree-to-string translation casts decoding as a tree parsing problem rather than parsing, the decoder still runs fast when adjoining is included. Less than 2 times slower, the adjoining tree-to-string system improves translation quality by +0.7 BLEU over the baseline system only allowing for tree substitution on NIST Chinese-English test sets.
5 0.80511409 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
Author: Markos Mylonakis ; Khalil Sima'an
Abstract: While it is generally accepted that many translation phenomena are correlated with linguistic structures, employing linguistic syntax for translation has proven a highly non-trivial task. The key assumption behind many approaches is that translation is guided by the source and/or target language parse, employing rules extracted from the parse tree or performing tree transformations. These approaches enforce strict constraints and might overlook important translation phenomena that cross linguistic constituents. We propose a novel flexible modelling approach to introduce linguistic information of varying granularity from the source side. Our method induces joint probability synchronous grammars and estimates their parameters, by selecting and weighing together linguistically motivated rules according to an objective function directly targeting generalisation over future data. We obtain statistically significant improvements across 4 different language pairs with English as source, mounting up to +1.92 BLEU for Chinese as target.
6 0.80453837 327 acl-2011-Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment
7 0.80417627 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation
8 0.80295694 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar
9 0.80246562 28 acl-2011-A Statistical Tree Annotator and Its Applications
10 0.80176276 137 acl-2011-Fine-Grained Class Label Markup of Search Queries
11 0.80091429 11 acl-2011-A Fast and Accurate Method for Approximate String Search
12 0.80082178 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction
13 0.79588199 16 acl-2011-A Joint Sequence Translation Model with Integrated Reordering
14 0.79578221 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules
15 0.79574919 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation
16 0.79537106 117 acl-2011-Entity Set Expansion using Topic information
17 0.79535508 43 acl-2011-An Unsupervised Model for Joint Phrase Alignment and Extraction
18 0.7944966 193 acl-2011-Language-independent compound splitting with morphological operations
19 0.79263842 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations
20 0.79243946 154 acl-2011-How to train your multi bottom-up tree transducer