acl acl2013 acl2013-320 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti
Abstract: We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German translation task. As an additional contribution we make the developed software and complete tool-chain publicly available for further experimentation.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We present a new translation model integrating the shallow local multi bottom-up tree transducer. [sent-3, score-0.381]
2 1 Introduction Besides phrase-based machine translation systems (Koehn et al. [sent-6, score-0.11]
3 Those systems use synchronous context-free grammars (Chiang, 2007), synchronous tree substitution grammars (Eisner, 2003) or even more powerful formalisms like synchronous tree-sequence substitution grammars (Sun et al. [sent-8, score-0.391]
4 (2006) use syntactic annotations on the source language side and show significant improvements in translation quality. [sent-14, score-0.206]
5 Using syntax exclusively on the target language side has also been successfully tried by Galley et al. [sent-15, score-0.15]
6 The improvements observed for systems using syntactic annotation on either the source or the target language side naturally led to experiments with models that use syntactic annotations on both sides. [sent-21, score-0.15]
7 (2009), and Chiang (2010), the integration of syntactic information on both sides tends to decrease translation quality because the systems become too restrictive. [sent-24, score-0.11]
8 (2009), which allows sequences of trees on both sides of the rules [see also (Raoult, 1997)]. [sent-31, score-0.268]
9 The multi bottom-up tree transducer (MBOT) of Arnold and Dauchet (1982) and Lilin (1978) offers a middle ground between traditional syntax-based models and STSSG. [sent-32, score-0.249]
10 Roughly speaking, an MBOT is an STSSG, in which all the discontinuities must occur on the target language side (Maletti, 2011). [sent-33, score-0.15]
11 Figure 2 displays sample rules of the MBOT variant, called ℓMBOT, that we use (in a graphical representation of the trees and the alignment). [sent-36, score-0.227]
12 In this contribution, we report on our novel statistical machine translation system that uses an ℓMBOT-based translation model. [sent-37, score-0.22]
13 The theoretical foundations of ℓMBOT and their integration into our translation model are presented in Sections 2 and 3. [sent-38, score-0.11]
14 In order to empirically evaluate the ℓMBOT model, we implemented a machine translation system. (Footnote 1: A translation is sensible if it is of linear size increase and can be computed by some (potentially copying) top-down tree transducer.) [sent-39, score-0.244]
15 We evaluate our new system on the WMT 2009 shared translation task English → German. [sent-47, score-0.11]
16 Essentially, it is the local multi bottom-up tree transducer of Maletti (2011) with the restriction that all rules must be shallow, which means that the left-hand side of each rule has height at most 2 (see Figure 2 for shallow rules and Figure 4 for rules including non-shallow rules). [sent-54, score-1.158]
17 The rules extracted from the training example of Figure 3 are displayed in Figure 4. [sent-55, score-0.185]
18 Those extracted rules are forcibly made shallow by removing internal nodes. [sent-56, score-0.265]
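As an illustration of this shallowing step, here is a minimal sketch (the authors' actual rule extraction is a separate C++ module; the Tree class and function names here are hypothetical):

```python
# Minimal sketch of making an extracted rule shallow: all internal nodes
# between the root and the frontier are removed, so the resulting tree
# has height at most 2. Tree is a toy representation, not the paper's.

class Tree:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def frontier(self):
        """Leaves of the tree, left to right."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.frontier()]

def make_shallow(tree):
    """Keep the root label and attach the original frontier directly."""
    if not tree.children:
        return Tree(tree.label)
    return Tree(tree.label, [Tree(leaf.label) for leaf in tree.frontier()])
```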
19 The application of those rules is illustrated in Figures 5 and 6. [sent-57, score-0.185]
20 A position w ∈ pos(t) is a leaf (in t) if w1 ∉ pos(t). [sent-105, score-0.111]
21 Given a subset N ⊆ Σ, we let leafN(t) = {w ∈ pos(t) | t(w) ∈ N, w leaf in t} be the set of all leaves labeled by elements of N. [sent-108, score-0.111]
22 When N is the set of nonterminals, we call them leaf nonterminals. [sent-109, score-0.111]
23 duce our model, which is a minor variation of the local multi bottom-up tree transducer of Maletti (2011). [sent-126, score-0.249]
24 Definition 1 A shallow local multi bottom-up tree transducer (ℓMBOT) is a finite set R of rules together with a mapping c : R → ℝ such that every rule, written t →ψ (u1, . [sent-153, score-0.514]
25 , uk). Here t, (u1, . . . , uk), ψ, and c(ρ) are called the left-hand side, the right-hand side, the alignment, and the weight of the rule ρ = t →ψ (u1, . [sent-163, score-0.139]
26 Figure 2 shows two example ℓMBOT rules (without weights). [sent-167, score-0.185]
27 Overall, the rules of an ℓMBOT are similar to the rules of an SCFG (synchronous context-free grammar), but our right-hand sides contain a sequence of trees instead of just a single tree. [sent-168, score-0.412]
28 In addition, the alignments in an SCFG rule are bijective between leaf nonterminals, whereas our model permits multiple alignments to a single leaf nonterminal in the left-hand side (see Figure 2). [sent-169, score-0.655]
29 Our ℓMBOT rules are obtained automatically from data like that in Figure 3. [sent-170, score-0.185]
30 To these sentence pairs we apply the rule extraction method of Maletti (2011). [sent-173, score-0.139]
31 The rules extracted from the sentence pair of Figure 3 are shown in Figure 4. [sent-174, score-0.185]
32 Note that these rules are not necessarily shallow (the last two rules are not). [sent-175, score-0.45]
33 Thus, we post-process the extracted rules and make them shallow. [sent-176, score-0.185]
34 The shallow rules corresponding to the non-shallow rules of Figure 4 are shown in Figure 2. [sent-177, score-0.45]
35 Next, we define how to combine rules to form derivations. [sent-178, score-0.185]
36 Let ρ = t →ψ (u1, . . . , uk) be a rule and w ∈ leafN(t) be a leaf nonterminal (occurrence) in the left-hand side. [sent-186, score-0.372]
37 The w-rank rk(ρ, w) of the rule ρ is rk(ρ, w) = max {i ∈ ℕ | (w, i) ∈ ran(ψ)}. For example, for the lower rule ρ … [sent-187, score-0.278]
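Read concretely, the w-rank can be computed as below; encoding ran(ψ) as a set of (position, component-index) pairs is an assumption made only for this sketch:

```python
def w_rank(ran_psi, w):
    """rk(rho, w) = max{ i | (w, i) in ran(psi) }: the largest target
    component index linked to the lhs leaf nonterminal at position w."""
    return max(i for (pos, i) in ran_psi if pos == w)

# A leaf aligned to components 1 and 2 of its pre-translation has rank 2:
assert w_rank({("1", 1), ("1", 2), ("2", 1)}, "1") == 2
```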
38 Definition 2 The set τ(R, c) of weighted pre-translations of an ℓMBOT (R, c) is the smallest set T subject to the following restriction: if there exist • a rule ρ = t →ψ (u1, . [sent-189, score-0.179]
39 We obtain our rules by making those rules shallow. [sent-226, score-0.37]
40 Rules that do not contain any nonterminal leaves are automatically weighted pre-translations with their associated rule weight. [sent-228, score-0.261]
41 Otherwise, each nonterminal leaf w in the left-hand side of a rule ρ must be replaced by the input tree tw of a pre-translation ⟨tw, cw, (u1w, . [sent-229, score-0.761]
42 In addition, the rank rk(ρ, w) of the replaced nonterminal should match the number kw of components in the selected weighted pre-translation. [sent-233, score-0.122]
43 Finally, the nonterminals in the right-hand side that are aligned to w should be replaced by the translation that the alignment requests, provided that the nonterminal matches with the root symbol of the requested translation. [sent-234, score-0.315]
44 The weight of the new pre-translation is obtained simply by multiplying the rule weight and the weights of the selected weighted pretranslations. [sent-235, score-0.191]
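The matching conditions and the weight computation of Definition 2 can be summarized in a sketch like the following; the data shapes (a rule reduced to its weight and per-leaf alignment links, a pre-translation reduced to its weight and component root labels) are simplifications, and the actual tree construction is omitted:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class PreTranslation:
    weight: float
    roots: Tuple[str, ...]  # root labels of the k target components

@dataclass
class Rule:
    weight: float
    # for each lhs leaf nonterminal w: list of (component index i,
    # label of the rhs nonterminal aligned to (w, i))
    links: Dict[str, List[Tuple[int, str]]]

def combine(rule: Rule, chosen: Dict[str, PreTranslation]):
    """One derivation step: substitute chosen[w] into each leaf
    nonterminal w. Returns the new weight, or None if matching fails."""
    weight = rule.weight
    for w, links in rule.links.items():
        sub = chosen[w]
        if max(i for i, _ in links) != len(sub.roots):
            return None          # rank rk(rho, w) must equal k_w
        if any(sub.roots[i - 1] != label for i, label in links):
            return None          # requested root symbols must match
        weight *= sub.weight     # multiply in the sub-derivation weight
    return weight
```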
45 3 Translation Model Given a source language sentence e, our translation model aims to find the best corresponding target language translation ĝ (footnote 7); i. [sent-237, score-0.274]
46 e., the best pre-translation τ = ⟨t, (u1, …, uk)⟩. (Footnote 7: Our main translation direction is English to German.) [sent-249, score-0.11]
47 In the same fashion, the rule weights required for (2) are relative frequencies normalized over all rules with the same right-hand side. [sent-252, score-0.324]
48 Additionally, rules that were extracted at most 10 times are discounted by multiplying the rule weight by 10⁻². [sent-253, score-0.376]
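A sketch of this estimation step, assuming extracted rule occurrences are available as hashable (lhs, rhs) pairs (the function name is hypothetical):

```python
from collections import Counter

def estimate_rule_weights(occurrences):
    """Relative frequencies normalized over all rules with the same
    right-hand side; rules seen at most 10 times are discounted by 10**-2."""
    counts = Counter(occurrences)              # (lhs, rhs) -> count
    rhs_totals = Counter()
    for (lhs, rhs), n in counts.items():
        rhs_totals[rhs] += n
    weights = {}
    for (lhs, rhs), n in counts.items():
        w = n / rhs_totals[rhs]
        if n <= 10:
            w *= 1e-2                          # rare-rule discount
        weights[(lhs, rhs)] = w
    return weights
```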
49 The lexical weights for (2) and (3) are obtained by multiplying the word translations w(gi|ej) [respectively, w(ej|gi)] of lexically aligned words (gi, ej) across (possibly discontiguous) target side sequences. [sent-254, score-0.24]
50 (Footnote 9: Whenever a source word ej is aligned to multiple target words, we average over the word translations.) [sent-255, score-0.11]
51 The lexical weight of a pre-translation ⟨t, (u1, …, uk)⟩ is ∏_{lexical item e occurs in t} average {w(g|e) | g aligned to e}. The computation of the language model estimates for (6) is adapted to score partial translations consisting of discontiguous units. [sent-259, score-0.236]
52 (Footnote 10: If the word ej has no alignment to a target word, then it is assumed to be aligned to a special NULL word and this alignment is scored.) [sent-264, score-0.196]
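Putting the formula and footnotes 9 and 10 together, the lexical weight might be computed as in this sketch; the word-translation table `w` and the alignment encoding are assumptions for illustration:

```python
def lexical_weight(lexical_items, aligned_to, w):
    """Multiply, over every lexical item e in the left-hand side, the
    average of the word translations w(g|e) of the target words g aligned
    to e, possibly across discontiguous target components."""
    score = 1.0
    for e in lexical_items:
        gs = aligned_to.get(e) or ["NULL"]  # unaligned words score against NULL
        score *= sum(w.get((g, e), 0.0) for g in gs) / len(gs)
    return score
```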
53 Figure 6: Combining a rule with pre-translations. [sent-265, score-0.139]
54 Figure 5: Simple rule application. [sent-273, score-0.139]
55 i.e., rules with contiguous components on the source and the target language side. [sent-289, score-0.278]
56 Roughly speaking, SCFG rules are ℓMBOT rules with exactly one output tree. [sent-290, score-0.37]
57 A rule is applicable (see footnote 11) if its left-hand side matches the nonterminal assigned to the full span by the parser and the (non-)terminal assigned to each subspan (see footnote 12). [sent-293, score-0.398]
58 In order to speed up the decoding, cube pruning (Chiang, 2007) is applied to each chart cell in order to select the most likely hypotheses for subspans. [sent-294, score-0.146]
59 The language model (LM) scoring is directly integrated into the cube pruning algorithm. [sent-295, score-0.164]
60 First, the rule representation itself is adjusted to allow sequences of shallow output trees on the target side. [sent-298, score-0.356]
61 Naturally, we also had to adjust hypothesis expansion and, most importantly, language model scoring inside the cube pruning algorithm. [sent-299, score-0.164]
62 The expansion in Line 5 involves matching all nonterminal leaves in the rule as defined in Definition 2, which includes matching all leaf nonterminals in all (discontiguous) output trees. [sent-302, score-0.438]
63 Because the output trees can remain discontiguous after hypothesis creation, LM scoring has to be done individually over all output trees. [sent-303, score-0.297]
64 , wk to collect the lexical information from the k output components. (Footnote 11: Note that our notion of applicable rules differs from the default in Moses.) [sent-308, score-0.185]
65 (Footnote 12: Theoretically, this allows the decoder to ignore unary parser nonterminals, which could also disappear when we make our rules shallow; e. [sent-309, score-0.185]
66 g., the parse tree left in the pre-translation of Figure 5 can be matched by a rule with left-hand side NP(Official, forecasts).) [sent-311, score-0.488]
67 Algorithm 1 Cube pruning with ℓMBOT rules. Data structures: r[i, j] — list of rules matching span e[i..j] [sent-312, score-0.455]
68 c[i, j] — cube of hypotheses covering span e[i..j] [sent-318, score-0.143]
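Since only the data structures of Algorithm 1 survive the extraction, here is a generic cube-pruning skeleton for one chart cell, not the paper's exact pseudocode; hypotheses are handled through the caller-supplied `seed_items` and `neighbours`:

```python
import heapq, itertools

def cube_prune_cell(seed_items, neighbours, beam=100):
    """Pop the currently best hypothesis, keep it, and push its
    neighbour hypotheses until `beam` hypotheses have been kept.
    seed_items: iterable of (score, hypothesis); neighbours(h) yields
    (score, hypothesis) pairs for the successors of h."""
    tie = itertools.count()  # tie-breaker so hypotheses are never compared
    heap = [(-score, next(tie), h) for score, h in seed_items]
    heapq.heapify(heap)
    kept = []
    while heap and len(kept) < beam:
        neg, _, hyp = heapq.heappop(heap)
        kept.append((-neg, hyp))
        for score, nb in neighbours(hyp):
            heapq.heappush(heap, (-score, next(tie), nb))
    return kept
```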
69 Suppose that u′j will be substituted into the nonterminal in question. [sent-335, score-0.122]
70 Then we first LM-score the pre-translation τ′ to obtain the string w′j corresponding to u′j. [sent-336, score-0.159]
71 The overall LM score for the pre-translation is obtained by multiplying the scores for w1, …, wk. [sent-339, score-0.211]
72 Finally, in the final rule only one component is allowed, which ensures that the LM indeed scores the complete output sentence. [sent-348, score-0.139]
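The per-component LM treatment described here can be sketched as follows; `prob` stands in for the actual 4-gram model and the function names are hypothetical:

```python
import math

def component_logprob(tokens, prob, n=4):
    """Score one (possibly discontiguous) component string with an
    n-gram model prob(history, word), sliding over its n-grams the way
    the worked example below walks through 4-grams."""
    lp = 0.0
    for i, word in enumerate(tokens):
        history = tuple(tokens[max(0, i - n + 1):i])
        lp += math.log(prob(history, word))
    return lp

def lm_estimate(components, prob):
    """Each output tree is scored individually; the overall estimate is
    the product of the component scores (a sum in log space). Only the
    final one-component rule scores the complete output sentence."""
    return sum(component_logprob(c, prob) for c in components)
```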
73 Figure 7 illustrates our LM scoring for a pretranslation involving a rule with two (discontiguous) target sequences (the construction of the pretranslation is illustrated in Figure 6). [sent-349, score-0.609]
74 When processing the rule rooted at S, an LM estimate is computed by expanding all nonterminal leaves. [sent-350, score-0.261]
75 Assembling the string, we obtain “Offizielle Prognosen sind von nur 3 % ausgegangen”, which is scored by the LM. [sent-363, score-0.155]
76 Thus, we first score the 4-grams “Offizielle Prognosen sind von”, then “Prognosen sind von nur”, etc. [sent-364, score-0.174]
77 We use linguistic syntactic annotation on both the source and the target language side (tree-to-tree). [sent-369, score-0.15]
78 Our contrastive system is the ℓMBOT-based translation system presented here. [sent-370, score-0.11]
79 The English side of the bilingual data was parsed using the Charniak parser of Charniak and Johnson (2005), and the German side was parsed using BitPar (Schmid, 2004) without the function and morphological annotations. [sent-387, score-0.192]
80 In Table 2, we report the number of ‘MBOT rules used by our system when decoding the test set. [sent-407, score-0.185]
81 By lex we denote rules containing only lexical items. Table 2: Number of rules used in decoding test (lex: only lexical items; non-term: at least one nonterminal). [sent-408, score-0.37]
82 The label non-term stands for rules containing at least one leaf nonterminal. [sent-411, score-0.296]
83 6% of all rules used by our ℓMBOT system have discontiguous target sides. [sent-413, score-0.437]
84 Furthermore, the reported numbers show that the system also uses rules in which lexical items are combined with nonterminals. [sent-414, score-0.185]
85 Finally, Table 3 presents the number of rules with k target side components used during decoding. [sent-415, score-0.335]
86 An analysis of the generated derivations shows that our system produces the correct translation by taking advantage of rules with discontiguous units on the target language side. [sent-421, score-0.547]
87 The rules used in the presented derivations are displayed in Figures 10 and 11. [sent-422, score-0.185]
88 In the first example (Figure 8), we begin by translating “((smuggle)VB (eight projectiles)NP (into the kingdom)PP)VP” into the discontiguous sequence composed of (i) “(acht geschosse)NP”; (ii) “(in das königreich)PP” and (iii) “(schmuggeln)VP”. [sent-423, score-0.251]
89 In a second step we assemble all sequences in a rule with contiguous target language side and, at the same time, insert the word “(zu)PTKZU” between “(in das königreich)PP” and “(schmuggeln)VP”. [sent-424, score-0.368]
90 Figure 10: Used ℓMBOT rules for verbal reordering. [sent-430, score-0.185]
91 Figure 11: Used ℓMBOT rules for verbal reordering. In the second example, we translate “((again)ADV commented on (the problem of global warming)NP)VP” into the discontiguous sequence composed of (i) “(das problem der globalen erwärmung)NP”; (ii) “(wieder)ADV” and (iii) “(kommentiert)VVPP”. [sent-436, score-0.497]
92 We thus obtain, for the input segment “((has)VBZ (again)ADV commented on (the problem of global warming)NP)VP”, the sequence (i) “(das problem der globalen erwärmung)NP”; (ii) “(hat)VAFIN”; (iii) “(wieder)ADV”; (iv) “(kommentiert)VVPP”. [sent-438, score-0.114]
93 In a last step, the constituent “(president václav klaus)NP” is inserted between the discontiguous units “(hat)VAFIN” and “(wieder)ADV” to form the contiguous sequence “((das problem der globalen erwärmung)NP (hat)VAFIN (präsident václav klaus)NP (wieder)ADV (kommentiert)VVPP)TOP”. [sent-439, score-0.386]
94 Again, an analysis of the generated derivation shows that ‘MBOT takes advantage of rules having several target side components. [sent-441, score-0.335]
95 Through its ability to use these discontiguous rules, our system correctly translates into reflexive or particle verbs such as “konzentriert sich” (for the English “focuses”) or “besteht darauf” (for the English “insist”). [sent-443, score-0.198]
96 6 Conclusion and Future Work We demonstrated that our ℓMBOT-based machine translation system beats a standard tree-to-tree system (Moses tree-to-tree) on the WMT 2009 translation task English → German. [sent-450, score-0.22]
97 Figure 14: ℓMBOT rules generating a relative clause/reflexive pronoun. … presented in Section 4 and a separate C++ module that we use for rule extraction (see Section 3). [sent-459, score-0.324]
98 We argue that our ℓMBOT approach can adequately handle discontiguous phrases, which occur frequently in German. [sent-462, score-0.198]
99 Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora. [sent-556, score-0.405]
100 Every sensible extended top-down tree transducer is a multi bottom-up tree transducer. [sent-585, score-0.383]
wordName wordTfidf (topN-words)
[('mbot', 0.575), ('discontiguous', 0.198), ('rules', 0.185), ('lm', 0.162), ('pretranslation', 0.159), ('maletti', 0.14), ('rule', 0.139), ('leafn', 0.139), ('vafin', 0.139), ('nonterminal', 0.122), ('ht', 0.113), ('leaf', 0.111), ('translation', 0.11), ('wieder', 0.099), ('uk', 0.098), ('multi', 0.097), ('side', 0.096), ('tree', 0.094), ('moses', 0.093), ('np', 0.089), ('rk', 0.085), ('shallow', 0.08), ('adv', 0.08), ('globalen', 0.079), ('kommentiert', 0.079), ('prognosen', 0.079), ('vvpp', 0.079), ('erw', 0.07), ('koehn', 0.067), ('nonterminals', 0.066), ('cube', 0.063), ('tk', 0.061), ('hat', 0.061), ('konzentriert', 0.059), ('offizielle', 0.059), ('schmuggeln', 0.059), ('sind', 0.059), ('stssg', 0.059), ('die', 0.059), ('scfg', 0.059), ('transducer', 0.058), ('scoring', 0.057), ('von', 0.056), ('ej', 0.056), ('ui', 0.055), ('synchronous', 0.055), ('target', 0.054), ('das', 0.053), ('delegation', 0.053), ('roadmap', 0.053), ('aclav', 0.053), ('multiplying', 0.052), ('strings', 0.048), ('vp', 0.046), ('bali', 0.046), ('grammars', 0.044), ('pruning', 0.044), ('forecasts', 0.043), ('alignment', 0.043), ('trees', 0.042), ('iw', 0.042), ('sequences', 0.041), ('span', 0.041), ('wmt', 0.041), ('cw', 0.04), ('sensible', 0.04), ('adjanpnn', 0.04), ('armung', 0.04), ('bem', 0.04), ('bestand', 0.04), ('darauf', 0.04), ('dass', 0.04), ('entscheidung', 0.04), ('geschosse', 0.04), ('htw', 0.04), ('jede', 0.04), ('nigreich', 0.04), ('nur', 0.04), ('oaffdizjiealle', 0.04), ('ofjficjial', 0.04), ('pretranslations', 0.04), ('prongnnosen', 0.04), ('qpnpnn', 0.04), ('rmung', 0.04), ('serbische', 0.04), ('sich', 0.04), ('ujv', 0.04), ('ukww', 0.04), ('warming', 0.04), ('hypotheses', 0.039), ('restriction', 0.039), ('contiguous', 0.039), ('translations', 0.038), ('hoang', 0.038), ('pos', 0.038), ('alignments', 0.038), ('speaking', 0.037), ('stuttgart', 0.037), ('nk', 0.037), ('commented', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti
Abstract: We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German translation task. As an additional contribution we make the developed software and complete tool-chain publicly available for further experimentation.
2 0.17981742 314 acl-2013-Semantic Roles for String to Tree Machine Translation
Author: Marzieh Bazrafshan ; Daniel Gildea
Abstract: We experiment with adding semantic role information to a string-to-tree machine translation system based on the rule extraction procedure of Galley et al. (2004). We compare methods based on augmenting the set of nonterminals by adding semantic role labels, and altering the rule extraction process to produce a separate set of rules for each predicate that encompass its entire predicate-argument structure. Our results demonstrate that the second approach is effective in increasing the quality of translations.
3 0.17268559 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
Author: Graham Neubig
Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on EnglishJapanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http : / /phont ron . com/t ravat ar
4 0.15322646 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
Author: Wenduan Xu ; Yue Zhang ; Philip Williams ; Philipp Koehn
Abstract: We present a context-sensitive chart pruning method for CKY-style MT decoding. Source phrases that are unlikely to have aligned target constituents are identified using sequence labellers learned from the parallel corpus, and speed-up is obtained by pruning corresponding chart cells. The proposed method is easy to implement, orthogonal to cube pruning and additive to its pruning power. On a full-scale Englishto-German experiment with a string-totree model, we obtain a speed-up of more than 60% over a strong baseline, with no loss in BLEU.
5 0.13162313 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
Author: Yang Liu
Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-todependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-todependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.
6 0.12972029 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
7 0.12260062 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
8 0.11838648 166 acl-2013-Generalized Reordering Rules for Improved SMT
9 0.11526992 312 acl-2013-Semantic Parsing as Machine Translation
10 0.10873637 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation
11 0.10830685 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
12 0.10704232 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
13 0.10611542 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
14 0.098939851 221 acl-2013-Learning Non-linear Features for Machine Translation Using Gradient Boosting Machines
15 0.097759269 255 acl-2013-Name-aware Machine Translation
16 0.091910578 165 acl-2013-General binarization for parsing and translation
17 0.091741793 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages
18 0.089209996 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
19 0.088125929 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
20 0.087825298 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
topicId topicWeight
[(0, 0.222), (1, -0.171), (2, 0.1), (3, 0.068), (4, -0.098), (5, 0.056), (6, 0.027), (7, -0.008), (8, 0.018), (9, 0.048), (10, -0.004), (11, 0.08), (12, 0.05), (13, 0.003), (14, 0.019), (15, -0.041), (16, 0.12), (17, 0.047), (18, 0.023), (19, 0.057), (20, -0.076), (21, 0.044), (22, 0.035), (23, -0.001), (24, -0.04), (25, -0.01), (26, 0.026), (27, -0.049), (28, -0.019), (29, 0.014), (30, -0.013), (31, -0.017), (32, -0.017), (33, -0.044), (34, -0.012), (35, -0.002), (36, -0.005), (37, 0.043), (38, -0.002), (39, 0.002), (40, -0.08), (41, 0.099), (42, 0.017), (43, 0.039), (44, 0.081), (45, 0.096), (46, -0.063), (47, 0.043), (48, 0.023), (49, 0.001)]
simIndex simValue paperId paperTitle
same-paper 1 0.95415717 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti
Abstract: We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German translation task. As an additional contribution we make the developed software and complete tool-chain publicly available for further experimentation.
2 0.84538591 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
Author: Graham Neubig
Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on EnglishJapanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http : / /phont ron . com/t ravat ar
3 0.81097376 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
Author: Wenduan Xu ; Yue Zhang ; Philip Williams ; Philipp Koehn
Abstract: We present a context-sensitive chart pruning method for CKY-style MT decoding. Source phrases that are unlikely to have aligned target constituents are identified using sequence labellers learned from the parallel corpus, and speed-up is obtained by pruning corresponding chart cells. The proposed method is easy to implement, orthogonal to cube pruning and additive to its pruning power. On a full-scale Englishto-German experiment with a string-totree model, we obtain a speed-up of more than 60% over a strong baseline, with no loss in BLEU.
4 0.79898047 312 acl-2013-Semantic Parsing as Machine Translation
Author: Jacob Andreas ; Andreas Vlachos ; Stephen Clark
Abstract: Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation.
5 0.79083616 314 acl-2013-Semantic Roles for String to Tree Machine Translation
Author: Marzieh Bazrafshan ; Daniel Gildea
Abstract: We experiment with adding semantic role information to a string-to-tree machine translation system based on the rule extraction procedure of Galley et al. (2004). We compare methods based on augmenting the set of nonterminals by adding semantic role labels, and altering the rule extraction process to produce a separate set of rules for each predicate that encompass its entire predicate-argument structure. Our results demonstrate that the second approach is effective in increasing the quality of translations.
6 0.78532523 165 acl-2013-General binarization for parsing and translation
7 0.71692002 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
8 0.70823556 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
9 0.70064503 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages
10 0.66670793 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
11 0.66127008 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
12 0.65851855 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
13 0.65791023 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
14 0.63662797 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
15 0.62128896 285 acl-2013-Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees
16 0.61258054 16 acl-2013-A Novel Translation Framework Based on Rhetorical Structure Theory
17 0.6054967 180 acl-2013-Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation
18 0.6036309 166 acl-2013-Generalized Reordering Rules for Improved SMT
19 0.60314053 137 acl-2013-Enlisting the Ghost: Modeling Empty Categories for Machine Translation
20 0.59456265 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
topicId topicWeight
[(0, 0.046), (6, 0.032), (11, 0.036), (14, 0.013), (24, 0.023), (26, 0.048), (35, 0.05), (42, 0.075), (48, 0.029), (70, 0.044), (71, 0.012), (88, 0.016), (90, 0.424), (95, 0.071)]
simIndex simValue paperId paperTitle
1 0.92970783 263 acl-2013-On the Predictability of Human Assessment: when Matrix Completion Meets NLP Evaluation
Author: Guillaume Wisniewski
Abstract: This paper tackles the problem of collecting reliable human assessments. We show that knowing multiple scores for each example instead of a single score results in a more reliable estimation of a system quality. To reduce the cost of collecting these multiple ratings, we propose to use matrix completion techniques to predict some scores knowing only scores of other judges and some common ratings. Even if prediction performance is pretty low, decisions made using the predicted score proved to be more reliable than decision based on a single rating of each example.
2 0.89736146 182 acl-2013-High-quality Training Data Selection using Latent Topics for Graph-based Semi-supervised Learning
Author: Akiko Eriguchi ; Ichiro Kobayashi
Abstract: In a multi-class document categorization using graph-based semi-supervised learning (GBSSL), it is essential to construct a proper graph expressing the relation among nodes and to use a reasonable categorization algorithm. Furthermore, it is also important to provide high-quality correct data as training data. In this context, we propose a method to construct a similarity graph by employing both surface information and latent information to express similarity between nodes and a method to select high-quality training data for GBSSL by means of the PageRank algorithm. Experimenting on Reuters21578 corpus, we have confirmed that our proposed methods work well for raising the accuracy of a multi-class document categorization.
same-paper 3 0.89218986 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti
Abstract: We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German translation task. As an additional contribution we make the developed software and complete tool-chain publicly available for further experimentation.
4 0.88568527 390 acl-2013-Word surprisal predicts N400 amplitude during reading
Author: Stefan L. Frank ; Leun J. Otten ; Giulia Galli ; Gabriella Vigliocco
Abstract: We investigated the effect of word surprisal on the EEG signal during sentence reading. On each word of 205 experimental sentences, surprisal was estimated by three types of language model: Markov models, probabilistic phrase-structure grammars, and recurrent neural networks. Four event-related potential components were extracted from the EEG of 24 readers of the same sentences. Surprisal estimates under each model type formed a significant predictor of the amplitude of the N400 component only, with more surprising words resulting in more negative N400s. This effect was mostly due to content words. These findings provide support for surprisal as a generally applicable measure of processing difficulty during language comprehension.
5 0.87854421 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
Author: ThuyLinh Nguyen ; Stephan Vogel
Abstract: Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. We propose an extension of Hiero called PhrasalHiero to address Hiero’s second problem. Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reodering features into the chart decoder. The work consists of two parts: 1) for each Hiero translation derivation, find its corresponding dis- continuous phrase-based path. 2) Extend the chart decoder to incorporate features from the phrase-based path. We achieve significant improvement over both Hiero and phrase-based baselines for ArabicEnglish, Chinese-English and GermanEnglish translation.
6 0.82752049 139 acl-2013-Entity Linking for Tweets
7 0.81014013 197 acl-2013-Incremental Topic-Based Translation Model Adaptation for Conversational Spoken Language Translation
8 0.58755547 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
9 0.56670117 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
10 0.56126744 166 acl-2013-Generalized Reordering Rules for Improved SMT
11 0.54828674 250 acl-2013-Models of Translation Competitions
12 0.54510707 314 acl-2013-Semantic Roles for String to Tree Machine Translation
13 0.52760673 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
14 0.51629221 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics
15 0.49839032 201 acl-2013-Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
16 0.49262825 165 acl-2013-General binarization for parsing and translation
17 0.48974782 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation
18 0.48920053 312 acl-2013-Semantic Parsing as Machine Translation
19 0.48300055 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation
20 0.48284867 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment