acl acl2011 acl2011-154 knowledge-graph by maker-knowledge-mining

154 acl-2011-How to train your multi bottom-up tree transducer


Source: pdf

Author: Andreas Maletti

Abstract: The local multi bottom-up tree transducer is introduced and related to the (non-contiguous) synchronous tree sequence substitution grammar. It is then shown how to obtain a weighted local multi bottom-up tree transducer from a bilingual and biparsed corpus. Finally, the problem of non-preservation of regularity is addressed. Three properties that ensure preservation are introduced, and it is discussed how to adjust the rule extraction process such that they are automatically fulfilled.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 How to train your multi bottom-up tree transducer Andreas Maletti Universität Stuttgart, Institute for Natural Language Processing Azenbergstraße 12, 70174 Stuttgart, Germany andreas . [sent-1, score-0.248]

2 Abstract The local multi bottom-up tree transducer is introduced and related to the (non-contiguous) synchronous tree sequence substitution grammar. [sent-4, score-0.427]

3 It is then shown how to obtain a weighted local multi bottom-up tree transducer from a bilingual and biparsed corpus. [sent-5, score-0.283]

4 Three properties that ensure preservation are introduced, and it is discussed how to adjust the rule extraction process such that they are automatically fulfilled. [sent-7, score-0.238]

5 1 Introduction A (formal) translation model is at the core of every machine translation system. [sent-8, score-0.117]

6 In this paper, we deal exclusively with syntax-based translation models such as synchronous tree substitution grammars (STSG), multi bottom-up tree transducers (MBOT), and synchronous tree-sequence substitution grammars (STSSG). [sent-16, score-0.592]

7 Roughly speaking, an STSG has rules in which two linked nonterminals are replaced (at the same time) by two corresponding trees containing terminal and nonterminal symbols. [sent-18, score-0.23]

8 In addition, the nonterminals in the two replacement trees are linked, which creates new linked nonterminals to which further rules can be applied. [sent-19, score-0.255]

9 Roughly speaking, they allow one replacement input tree and several output trees in a single rule. [sent-22, score-0.183]

10 Finally, STSSG, which have been derived from rational tree relations (Raoult, 1997), have been discussed by Zhang et al. [sent-24, score-0.134]

11 They are even more expressive than the local variant of the multi bottom-up tree transducer (LMBOT) that we introduce here and can have several input and output trees in a single rule. [sent-28, score-0.399]

12 (2009) demonstrate that the additional expressivity gained from non-contiguous rules greatly improves the translation quality. [sent-40, score-0.119]

13 To this end, we introduce a rule extraction and weight training method for LMBOT that is based on the corresponding procedures for STSG and STSSG. [sent-42, score-0.144]

14 In addition, we briefly discuss how these properties could be enforced in the rule extraction procedure. [sent-47, score-0.191]

15 Consequently, a tree t = σ(t1, . . . , tk) consists of a labeled root node σ followed by the sequence t1, . . . , tk of its children. [sent-61, score-0.124]

16 The positions pos(t) ⊆ N∗ of a tree t = σ(t1, . . . , tk) are inductively defined by pos(t) = {ε} ∪ ⋃i∈[k] {ip | p ∈ pos(ti)}. [sent-66, score-0.139]

17 Later NT will be the set of nonterminals, so that the elements of ↓NT(t) will be the leaf nonterminals of t. [sent-76, score-0.185]

18 Then t[pi ← ti | 1 ≤ i ≤ k] denotes the tree that is obtained from t by replacing (in parallel) the subtrees at pi by ti for every i ∈ [k]. [sent-86, score-0.121]
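
To make the notation above concrete, here is a minimal Python sketch (not from the paper; every name in it is hypothetical) that encodes a tree as a (label, children) pair and a position as a tuple of child indices, and implements pos(t) together with the parallel subtree replacement t[pi ← ti].

    def pos(t):
        """All positions of t: the root is (), and child i extends a position by i."""
        label, children = t
        positions = [()]
        for i, child in enumerate(children, start=1):
            positions.extend((i,) + p for p in pos(child))
        return positions

    def replace(t, subs):
        """t[p <- s | (p, s) in subs]: replace the subtrees at the given positions in parallel."""
        if () in subs:
            return subs[()]
        label, children = t
        new_children = []
        for i, child in enumerate(children, start=1):
            child_subs = {p[1:]: s for p, s in subs.items() if p and p[0] == i}
            new_children.append(replace(child, child_subs))
        return (label, tuple(new_children))

    t = ("S", (("NP", ()), ("VP", ())))
    print(pos(t))                                    # [(), (1,), (2,)]
    print(replace(t, {(2,): ("VP", (("V", ()),))}))  # the VP subtree is swapped out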

19 Finally, let us recall regular tree languages. [sent-87, score-0.146]

20 A finite tree automaton M is a tuple (Q, Σ, δ, F) such that Q is a finite set, δ ⊆ Q∗ × Σ × Q is a finite relation, and F ⊆ Q. [sent-88, score-0.254]

21 The finite tree automaton M recognizes the tree language L(M) = {t ∈ TΣ | δ(t) ∩ F ≠ ∅}. [sent-90, score-0.179]

22 A tree language L ⊆ TΣ is regular if there exists a finite tree automaton M such that L = L(M). [sent-91, score-0.146]
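
As a hedged illustration of the recognition condition δ(t) ∩ F ≠ ∅, the following sketch computes a bottom-up run of a finite tree automaton under the tree encoding of the previous example; the transition format is an assumption for illustration, not the paper's construction.

    def delta_states(t, delta):
        """States reachable at the root of t, for delta given as triples
        (child_states, symbol, state)."""
        label, children = t
        child_sets = [delta_states(c, delta) for c in children]
        reachable = set()
        for child_states, symbol, state in delta:
            if (symbol == label and len(child_states) == len(children)
                    and all(q in s for q, s in zip(child_states, child_sets))):
                reachable.add(state)
        return reachable

    def recognizes(t, delta, final_states):
        """t is in L(M) iff delta(t) intersects the set of final states."""
        return bool(delta_states(t, delta) & set(final_states))

    # Toy automaton accepting trees built from a/2 over leaves e/0.
    delta = [((), "e", "q"), (("q", "q"), "a", "q")]
    print(recognizes(("a", (("e", ()), ("e", ()))), delta, {"q"}))  # True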

23 Figure 1. 3 The model In this section, we recall particular multi bottom-up tree transducers, which have been introduced by Arnold and Dauchet (1982) and Lilin (1981). [sent-92, score-0.228]

24 (2009), we recall a variant of linear and nondeleting extended multi bottom-up tree transducers (MBOT) here. [sent-96, score-0.237]

25 Essentially, the model works on pairs ⟨t, u⟩ consisting of an input tree t ∈ TΣ and a sequence u ∈ T∆∗ of output trees. [sent-102, score-0.127]

26 Given a pre-translation ⟨t, u⟩ ∈ TΣ × T∆k and i ∈ [k], we call ui the ith translation of t. [sent-106, score-0.118]

27 An alignment for the pre-translation ⟨t, u⟩ is an injective mapping ψ : ↓NT(u) → ↓NT(t) × N such that (p, j) ∈ ψ(↓NT(u)) for every (p, i) ∈ ψ(↓NT(u)) and j ∈ [i]. [sent-107, score-0.118]

28 Definition 1 A local multi bottom-up tree transducer (LMBOT) is a finite set R of rules such that every rule, written l →ψ r, contains a pre-translation ⟨l, r⟩ and an alignment ψ for it. [sent-110, score-0.462]

29 The component l is the left-hand side, r is the right-hand side, and ψ is the alignment of a rule l →ψ r ∈ R. [sent-112, score-0.136]
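
As an illustration of this rule format, the following hedged sketch stores a rule as a left-hand-side tree, a tuple of right-hand-side trees, and an alignment from right-hand-side leaf-nonterminal positions to pairs of a left-hand-side position and an index. The encoding (and every name in it) is an assumption chosen for illustration, not an interface from the paper.

    from dataclasses import dataclass

    Tree = tuple  # (label, children), as in the earlier sketches

    @dataclass(frozen=True)
    class LMBOTRule:
        lhs: Tree        # the single input tree l
        rhs: tuple       # the sequence r = (r1, ..., rk) of output trees
        alignment: dict  # (output_tree_index, pos_in_that_tree) -> (pos_in_lhs, i)

        def alignment_is_injective(self) -> bool:
            targets = list(self.alignment.values())
            return len(targets) == len(set(targets))

    # Toy rule: one lhs leaf nonterminal X, two output trees whose X-leaves are
    # aligned to that lhs leaf with indices 1 and 2.
    rule = LMBOTRule(
        lhs=("NP", (("X", ()),)),
        rhs=(("NP", (("X", ()),)), ("PP", (("X", ()),))),
        alignment={(1, (1,)): ((1,), 1), (2, (1,)): ((1,), 2)},
    )
    print(rule.alignment_is_injective())  # True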

30 The rules of an LMBOT are similar to the rules of an STSG (synchronous tree substitution grammar) of Eisner (2003) and Shieber (2004), but right-hand sides of LMBOT contain a sequence of trees instead of just a single tree as in an STSG. [sent-113, score-0.371]

31 In addition, the alignments in an STSG rule are bijective between leaf nonterminals, whereas our model permits multiple alignments to a single leaf nonterminal in the left-hand side. [sent-114, score-0.447]

32 (2009), which allows sequences of trees on both sides of rules [see also (Raoult, 1997)]. [sent-118, score-0.133]

33 Figure 2: Top left: (a) Initial pre-translation; Top right: (b) Pre-translation obtained from the left rule of Fig. [sent-132, score-0.116]

34 In plain words, each nonterminal leaf p in the left-hand side of a rule ρ can be replaced by the input tree t of a pre-translation ⟨t, u⟩ whose root is labeled by the same nonterminal. [sent-136, score-0.523]
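
The substitution just described can be sketched as below; this reuses the hypothetical LMBOTRule encoding and the replace helper from the earlier sketches and is an illustrative assumption, not the paper's implementation. Each nonterminal leaf p of the left-hand side is supplied with a pre-translation (t, u): t is plugged into the input side at p, and an output leaf aligned to (p, i) receives the i-th tree of u.

    def apply_rule(rule, pretranslations):
        """Combine `rule` with one pre-translation (t, u) per nonterminal leaf of its
        left-hand side; return the resulting pre-translation."""
        # Input side: substitute each input tree t at its lhs leaf position p.
        new_input = replace(rule.lhs, {p: t for p, (t, u) in pretranslations.items()})
        # Output side: a right-hand-side leaf aligned to (p, i) receives the i-th
        # output tree of the pre-translation substituted at p.
        new_outputs = []
        for j, r_j in enumerate(rule.rhs, start=1):
            subs = {q: pretranslations[p][1][i - 1]
                    for (jj, q), (p, i) in rule.alignment.items()
                    if jj == j and p in pretranslations}
            new_outputs.append(replace(r_j, subs))
        return new_input, tuple(new_outputs)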

35 The top-down semantics is introduced using rule compositions, which will play an important role later on. [sent-146, score-0.255]

36 (2008a) show that the additional expressive power of tree sequences helps the translation process. [sent-164, score-0.13]

37 Essentially, the process has two steps: rule extraction and training. [sent-169, score-0.144]

38 In the rule extraction step, an (unweighted) LMBOT is extracted from the corpus. [sent-170, score-0.144]

39 The two main inspirations for our rule extraction are the corresponding procedures for STSG (Galley et al. [sent-172, score-0.144]

40 On the contrary, STSSG rules can be noncontiguous on both sides, but the extraction procedure of Sun et al. [sent-177, score-0.131]

41 The adjustment is necessary because LMBOT rules cannot have (contiguous) tree sequences in their left-hand sides. [sent-180, score-0.184]

42 Overall, the rule extraction process is sketched in Algorithm 1. [sent-181, score-0.144]

43 Algorithm 1 Rule extraction for LMBOT Require: word-aligned tree pair (t, u) Return: LMBOT rules R such that (t, u) ∈ τ(R) while there exists a maximal non-leaf node p ∈ pos(t) and minimal p1, . [sent-182, score-0.24]

44 , no alignments from within t|p to a leaf outside (u|p1 , . [sent-190, score-0.142]

45 , u|pk) add the rule ρ = t|p → (u|p1, . . . , u|pk) to R with the nonterminal alignments / / excise rule ρ from (t, u) 4: t ← t[p ← t(p)] u ← u[pi ← u(pi) | i ∈ {1, . [sent-196, score-0.145]

46 , k}] 6: establish alignments ψ according to position end while The requirement that we can only have one input tree in LMBOT rules indeed might cause the extraction of bigger and less useful rules (when compared to the corresponding STSSG rules) as demonstrated in (Sun et al. [sent-199, score-0.344]
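
A heavily simplified sketch of the consistency test at the heart of this extraction loop is given below. It is an illustration rather than Algorithm 1 itself: it assumes the tree and position encoding of the earlier sketches, takes the word alignment as pairs of leaf positions, and only checks that no alignment link crosses the frontier between the chosen input subtree t|p and the chosen output subtrees u|p1, . . . , u|pk.

    def within(root, p):
        """True if position p lies at or below position root."""
        return p[:len(root)] == root

    def is_extractable(p, output_roots, word_alignments):
        """No link may connect a leaf inside t|p with a leaf outside the chosen
        output subtrees (or vice versa)."""
        for pt, pu in word_alignments:   # pt in pos(t), pu in pos(u)
            inside_input = within(p, pt)
            inside_output = any(within(r, pu) for r in output_roots)
            if inside_input != inside_output:
                return False
        return True

    # With links {(1,)-(1,), (2,)-(2,)}, extracting t|(1,) with output root (1,)
    # is consistent, but pairing it with output root (2,) is not.
    links = [((1,), (1,)), ((2,), (2,))]
    print(is_extractable((1,), [(1,)], links))  # True
    print(is_extractable((1,), [(2,)], links))  # False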

47 However, the stricter rule shape preserves the good algorithmic properties of LMBOT. [sent-201, score-0.213]

48 Using the more liberal format of LMBOT rules, we can decompose the STSG rule of Figure 5 further into the rules displayed in Figure 1. [sent-206, score-0.196]

49 Let ρ1 be the top left rule of Figure 2 and ρ2 and ρ3 be the … Figure 4: Biparsed aligned parallel text. [sent-210, score-0.139]

50 If we name all rules of R, then we can represent each pre-translation of τ(R) symbolically by a tree containing rule names. [sent-214, score-0.3]

51 Such trees containing rule names are often called derivation trees. [sent-215, score-0.147]

52 Theorem 7 The set D(R) is a regular tree language for every LMBOT R, and the set of derivations is also regular for every MBOT. [sent-217, score-0.266]

53 Moreover, using the input and output product constructions of Maletti (2010) we obtain that even the set Dt,u(R) of derivations for a specific input tree t and output tree u is regular. [sent-219, score-0.304]

54 Given a set T of pre-translations and a tree language L ⊆ TΣ, we let Tc(L) = {ui | (u1, . [sent-229, score-0.123]

55 We say that T preserves regularity if Tc(L) is regular for every regular tree language L ⊆ TΣ. [sent-233, score-0.244]

56 Correspondingly, an LMBOT R preserves regularity if its set τ(R) of pre-translations preserves regularity. [sent-234, score-0.241]

57 The rules of an LMBOT have only alignments between the left-hand side (input tree) and the right-hand side (output tree), which are also called inter-tree alignments. [sent-236, score-0.191]

58 However, several alignments to a single nonterminal in the left-hand side can transitively relate two different nonterminals in the output side and thus simulate an intra-tree alignment. [sent-237, score-0.255]

59 Since the leaf word language of every regular tree language is context-free and regular tree languages are closed under intersection (needed to single out the translations that have the symbol Y at the root), this also proves that τ(R)c(TΣ) is not regular. [sent-243, score-0.349]

60 Preservation of regularity is an important property for a number of translation model manipulations. [sent-245, score-0.144]

61 , a representation of a regular tree language) is an efficient representation. [sent-250, score-0.146]

62 More complex representations such as context-free tree grammars [see, e. [sent-251, score-0.126]

63 In this section, we investigate three syntactic restrictions on the set R of rules that guarantee that the obtained LMBOT preserves regularity. [sent-256, score-0.158]

64 Then we briefly discuss how to adjust the rule extraction algorithm, so that the extracted rules automatically have these properties. [sent-257, score-0.288]

65 Mind that R2 might not contain all rules of R, but it contains all those without leaf nonterminals. [sent-260, score-0.193]

66 Definition 8 An LMBOT R is finitely collapsing if there is n ∈ N such that ψ : ↓NT(r) → ↓NT(l) × {1} for every rule l →ψ r ∈ Rn. [sent-261, score-0.2]

67 Often the simple condition ‘finitely collapsing’ is fulfilled after rule extraction. [sent-264, score-0.139]

68 It can also be ensured in the rule extraction process by introducing collapsing points for output symbols that can appear recursively in the corpus. [sent-266, score-0.261]

69 For example, we could enforce that all extracted rules for clause-level output symbols (assuming that there is no recursion not involving a clause-level output symbol) should have only one output tree in the right-hand side. [sent-267, score-0.258]
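
As an illustration of how such a constraint might be checked on extracted rules, here is a hedged sketch using the hypothetical LMBOTRule encoding from above. Note that Definition 8 is stated over n-fold composed rules; the function below only tests the per-rule condition, which is a stricter, sufficient check.

    def rule_is_collapsing(rule):
        """psi maps into NT(l) x {1}: every alignment target uses index 1."""
        return all(i == 1 for (_pos, i) in rule.alignment.values())

    def all_rules_collapsing(rules):
        return all(rule_is_collapsing(r) for r in rules)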

70 Finitely collapsing LMBOT have only slightly more expressive power than STSG. [sent-269, score-0.159]

71 This is due to the fact that the alignment in composed rules establishes an injective relation between leaf nonterminals (as in an STSG), but it need not be bijective. [sent-271, score-0.304]

72 Consequently, there can be leaf nonterminals in the left-hand side that have no aligned leaf nonterminal in the right-hand side. [sent-272, score-0.409]

73 Figure 7: Output subtree synchronization (intra-tree). [sent-276, score-0.124]

74 Theorem 10 For every STSG, we can construct an equivalent finitely collapsing LMBOT in linear time. [sent-278, score-0.219]

75 Moreover, finitely collapsing LMBOT are strictly more expressive than STSG. [sent-279, score-0.233]

76 Definition 11 An LMBOT R has finite synchronization if there is n ∈ N such that for every rule l →ψ r ∈ Rn and p ∈ ↓NT(l) there exists i ∈ N with ψ−1({p} × N) ⊆ {iw | w ∈ N∗}. [sent-281, score-0.354]

77 In plain terms, multiple alignments to a single leaf nonterminal at p in the left-hand side are allowed, but all leaf nonterminals of the right-hand side that are aligned to p must be in the same tree. [sent-282, score-0.479]
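
A hedged per-rule test of this condition (again on the assumed encoding from above, and again checking rules directly rather than their n-fold compositions) simply verifies that all right-hand-side leaves aligned to one left-hand-side leaf sit in a single output tree.

    from collections import defaultdict

    def rule_has_finite_synchronization(rule):
        trees_per_lhs_leaf = defaultdict(set)
        for (tree_index, _q), (p, _i) in rule.alignment.items():
            trees_per_lhs_leaf[p].add(tree_index)
        return all(len(indices) <= 1 for indices in trees_per_lhs_leaf.values())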

78 Clearly, an LMBOT with finite synchronization is finitely collapsing. [sent-283, score-0.292]

79 Raoult (1997) investigated this restriction in the context of rational tree relations, which are a generalization of our LMBOT. [sent-284, score-0.134]

80 Raoult (1997) shows that finite synchronization can be decided. [sent-285, score-0.199]

81 Theorem 12 Every LMBOT with finite synchronization preserves regularity. [sent-287, score-0.277]

82 In Figure 9 we illustrate a translation that can be computed by a composition of two STSG, but that cannot be computed by an MBOT (or LMBOT) with finite synchronization. [sent-290, score-0.146]

83 Theorem 13 For every STSG, we can construct an equivalent LMBOT with finite synchronization in linear time. [sent-297, score-0.257]

84 LMBOT and MBOT with finite synchronization are strictly more expressive than STSG and compute classes that are not closed under composition. [sent-298, score-0.289]

85 Again, it is straightforward to adjust the rule extraction algorithm by the introduction of synchronization points (for example, for clause level output symbols). [sent-299, score-0.328]

86 We can simply require that rules extracted for those selected output symbols fulfill the condition mentioned in Definition 11. [sent-300, score-0.158]

87 Definition 14 An LMBOT R is copy-free if there is n ∈ N such that for every rule l →ψ r ∈ Rn and p ∈ ↓NT(l) we have (i) ψ−1({p} × N) ⊆ N, or (ii) ψ−1({p} × N) ⊆ {iw | w ∈ N∗} for an i ∈ N. [sent-302, score-0.155]

88 Intuitively, a copy-free LMBOT has rules whose right-hand sides may use all leaf nonterminals that are aligned to a given leaf nonterminal in the left-hand side directly at the root (of one of the trees [sent-303, score-0.562]

89 in the right-hand side forest) or group all those leaf nonterminals in a single tree in the forest. [sent-312, score-0.33]
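
This intuition can be written down as a per-rule test (a hedged sketch on the assumed encoding, where a right-hand-side leaf is addressed as a pair of a tree index and a position inside that tree; as before, the definition speaks about composed rules, so this is only the stricter sufficient condition).

    from collections import defaultdict

    def rule_is_copy_free(rule):
        aligned = defaultdict(list)   # lhs leaf position -> rhs leaves aligned to it
        for (tree_index, q), (p, _i) in rule.alignment.items():
            aligned[p].append((tree_index, q))
        for leaves in aligned.values():
            at_roots = all(q == () for (_j, q) in leaves)       # case (i): directly at the roots
            in_one_tree = len({j for (j, _q) in leaves}) <= 1   # case (ii): grouped in one tree
            if not (at_roots or in_one_tree):
                return False
        return True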

90 Clearly, the LMBOT of Figure 7 is not copy-free because the second rule composes with itself (see Figure 10) to a rule that does not fulfill the copy-free condition. [sent-313, score-0.261]

91 We replace the LMBOT with rules R by the equivalent LMBOT M with rules Rn. [sent-316, score-0.179]

92 In this way, we obtain an MBOT M′, whose rules still fulfill the requirements (adapted for MBOT) of Definition 14 because the input product does not change the structure of the rules (it only modifies the state behavior). [sent-320, score-0.212]

93 This can be achieved using a decomposition into a relabeling, which clearly preserves regularity, and a deterministic finite-copying top-down tree transducer (Engelfriet et al. [sent-322, score-0.223]

94 In addition, we can adjust the rule extraction process using synchronization points as for LMBOT with finite synchronization using the restrictions of Definition 14. [sent-326, score-0.502]

95 (Figure caption: LMBOT for the transformation.) Copy-free LMBOT are strictly more expressive than LMBOT with finite synchronization. [sent-329, score-0.171]

96 6 Conclusion We have introduced a simple restriction of multi bottom-up tree transducers. [sent-330, score-0.207]

97 Next, we introduced a rule extraction procedure and a corresponding rule weight training procedure for our LMBOT. [sent-332, score-0.26]

98 In addition, we briefly discussed how these properties could be enforced in the presented rule extraction procedure. [sent-335, score-0.191]

99 Composition and decomposition of extended multi bottom-up tree transducers. [sent-398, score-0.207]

100 A noncontiguous tree sequence alignment-based model for statistical machine translation. [sent-484, score-0.127]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('lmbot', 0.763), ('stsg', 0.286), ('mbot', 0.223), ('stssg', 0.141), ('nt', 0.137), ('synchronization', 0.124), ('rule', 0.116), ('leaf', 0.113), ('tree', 0.104), ('multi', 0.103), ('raoult', 0.094), ('finitely', 0.093), ('theorem', 0.09), ('regularity', 0.085), ('rk', 0.083), ('rules', 0.08), ('preserves', 0.078), ('finite', 0.075), ('maletti', 0.072), ('expressive', 0.072), ('nonterminals', 0.072), ('engelfriet', 0.07), ('collapsing', 0.068), ('ui', 0.059), ('ht', 0.051), ('nonterminal', 0.047), ('synchronous', 0.045), ('regular', 0.042), ('transducer', 0.041), ('preservation', 0.041), ('side', 0.041), ('every', 0.039), ('sun', 0.039), ('translation', 0.039), ('graehl', 0.036), ('definition', 0.036), ('biparsed', 0.035), ('inductively', 0.035), ('joost', 0.035), ('lilin', 0.035), ('adjust', 0.035), ('ip', 0.034), ('arnold', 0.034), ('composition', 0.032), ('preserve', 0.031), ('pos', 0.031), ('trees', 0.031), ('dauchet', 0.031), ('substitution', 0.03), ('rational', 0.03), ('transducers', 0.03), ('ti', 0.03), ('alignments', 0.029), ('fulfill', 0.029), ('shortly', 0.029), ('extraction', 0.028), ('output', 0.025), ('shieber', 0.025), ('symbols', 0.024), ('transformation', 0.024), ('zhang', 0.024), ('fulfilled', 0.023), ('htp', 0.023), ('noncontiguous', 0.023), ('upi', 0.023), ('semantics', 0.023), ('knight', 0.023), ('input', 0.023), ('aligned', 0.023), ('pk', 0.023), ('pv', 0.023), ('vbd', 0.022), ('pi', 0.022), ('sides', 0.022), ('grammars', 0.022), ('bottomup', 0.021), ('hpe', 0.021), ('root', 0.02), ('property', 0.02), ('ot', 0.02), ('kevin', 0.02), ('alignment', 0.02), ('let', 0.019), ('algorithmic', 0.019), ('compositions', 0.019), ('bx', 0.019), ('aho', 0.019), ('oift', 0.019), ('figure', 0.019), ('composed', 0.019), ('power', 0.019), ('equivalent', 0.019), ('closed', 0.018), ('lp', 0.018), ('pietra', 0.018), ('della', 0.018), ('syntaxbased', 0.018), ('nte', 0.018), ('properties', 0.018), ('vp', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 154 acl-2011-How to train your multi bottom-up tree transducer

Author: Andreas Maletti

Abstract: The local multi bottom-up tree transducer is introduced and related to the (non-contiguous) synchronous tree sequence substitution grammar. It is then shown how to obtain a weighted local multi bottom-up tree transducer from a bilingual and biparsed corpus. Finally, the problem of non-preservation of regularity is addressed. Three properties that ensure preservation are introduced, and it is discussed how to adjust the rule extraction process such that they are automatically fulfilled.

2 0.14668843 30 acl-2011-Adjoining Tree-to-String Translation

Author: Yang Liu ; Qun Liu ; Yajuan Lu

Abstract: We introduce synchronous tree adjoining grammars (TAG) into tree-to-string translation, which converts a source tree to a target string. Without reconstructing TAG derivations explicitly, our rule extraction algorithm directly learns tree-to-string rules from aligned Treebank-style trees. As tree-to-string translation casts decoding as a tree parsing problem rather than parsing, the decoder still runs fast when adjoining is included. Less than 2 times slower, the adjoining tree-to-string system improves translation quality by +0.7 BLEU over the baseline system only allowing for tree substitution on NIST Chinese-English test sets.

3 0.12327066 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

Author: Ashish Vaswani ; Haitao Mi ; Liang Huang ; David Chiang

Abstract: Most statistical machine translation systems rely on composed rules (rules that can be formed out of smaller rules in the grammar). Though this practice improves translation by weakening independence assumptions in the translation model, it nevertheless results in huge, redundant grammars, making both training and decoding inefficient. Here, we take the opposite approach, where we only use minimal rules (those that cannot be formed out of other rules), and instead rely on a rule Markov model of the derivation history to capture dependencies between minimal rules. Large-scale experiments on a state-of-the-art tree-to-string translation system show that our approach leads to a slimmer model, a faster decoder, yet the same translation quality (measured using BLEU) as composed rules.

4 0.10025903 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

Author: Daniel Emilio Beck

Abstract: In this paper I present a Master’s thesis proposal in syntax-based Statistical Machine Translation. I propose to build discriminative SMT models using both tree-to-string and tree-to-tree approaches. Translation and language models will be represented mainly through the use of Tree Automata and Tree Transducers. These formalisms have important representational properties that make them well-suited for syntax modeling. I also present an experiment plan to evaluate these models through the use of a parallel corpus written in English and Brazilian Portuguese.

5 0.097918719 61 acl-2011-Binarized Forest to String Translation

Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu

Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-constituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.

6 0.087323882 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

7 0.08309406 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

8 0.08302597 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

9 0.077347681 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

10 0.072705328 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

11 0.063290998 296 acl-2011-Terminal-Aware Synchronous Binarization

12 0.06231647 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

13 0.058357496 325 acl-2011-Unsupervised Word Alignment with Arbitrary Features

14 0.052691586 44 acl-2011-An exponential translation model for target language morphology

15 0.052113678 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars

16 0.051830962 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

17 0.050895352 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

18 0.048515208 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation

19 0.047483169 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

20 0.047254845 188 acl-2011-Judging Grammaticality with Tree Substitution Grammar Derivations


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.114), (1, -0.098), (2, 0.042), (3, -0.026), (4, 0.016), (5, 0.009), (6, -0.112), (7, -0.043), (8, -0.059), (9, -0.028), (10, -0.02), (11, 0.002), (12, -0.0), (13, 0.082), (14, 0.019), (15, -0.037), (16, -0.009), (17, 0.026), (18, 0.012), (19, 0.007), (20, 0.003), (21, -0.001), (22, -0.025), (23, 0.005), (24, 0.022), (25, -0.039), (26, -0.028), (27, -0.015), (28, 0.015), (29, -0.047), (30, 0.014), (31, -0.035), (32, -0.042), (33, 0.006), (34, -0.058), (35, 0.036), (36, 0.048), (37, 0.01), (38, -0.033), (39, 0.019), (40, -0.089), (41, -0.035), (42, -0.017), (43, -0.019), (44, -0.019), (45, -0.04), (46, -0.037), (47, -0.035), (48, 0.073), (49, 0.062)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93071359 154 acl-2011-How to train your multi bottom-up tree transducer

Author: Andreas Maletti

Abstract: The local multi bottom-up tree transducer is introduced and related to the (non-contiguous) synchronous tree sequence substitution grammar. It is then shown how to obtain a weighted local multi bottom-up tree transducer from a bilingual and biparsed corpus. Finally, the problem of non-preservation of regularity is addressed. Three properties that ensure preservation are introduced, and it is discussed how to adjust the rule extraction process such that they are automatically fulfilled.

2 0.80764008 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

Author: Mark-Jan Nederhof ; Giorgio Satta

Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.

3 0.78991461 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

Author: Ashish Vaswani ; Haitao Mi ; Liang Huang ; David Chiang

Abstract: Most statistical machine translation systems rely on composed rules (rules that can be formed out of smaller rules in the grammar). Though this practice improves translation by weakening independence assumptions in the translation model, it nevertheless results in huge, redundant grammars, making both training and decoding inefficient. Here, we take the opposite approach, where we only use minimal rules (those that cannot be formed out of other rules), and instead rely on a rule Markov model of the derivation history to capture dependencies between minimal rules. Large-scale experiments on a state-of-the-art tree-to-string translation system show that our approach leads to a slimmer model, a faster decoder, yet the same translation quality (measured using BLEU) as composed rules.

4 0.75244701 30 acl-2011-Adjoining Tree-to-String Translation

Author: Yang Liu ; Qun Liu ; Yajuan Lu

Abstract: We introduce synchronous tree adjoining grammars (TAG) into tree-to-string translation, which converts a source tree to a target string. Without reconstructing TAG derivations explicitly, our rule extraction algorithm directly learns tree-to-string rules from aligned Treebank-style trees. As tree-to-string translation casts decoding as a tree parsing problem rather than parsing, the decoder still runs fast when adjoining is included. Less than 2 times slower, the adjoining tree-to-string system improves translation quality by +0.7 BLEU over the baseline system only allowing for tree substitution on NIST Chinese-English test sets.

5 0.67428184 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

Author: Hiroyuki Shindo ; Akinori Fujino ; Masaaki Nagata

Abstract: We propose a model that incorporates an insertion operator in Bayesian tree substitution grammars (BTSG). Tree insertion is helpful for modeling syntax patterns accurately with fewer grammar rules than BTSG. The experimental parsing results show that our model outperforms a standard PCFG and BTSG for a small dataset. For a large dataset, our model obtains comparable results to BTSG, making the number of grammar rules much smaller than with BTSG.

6 0.64695722 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

7 0.62939072 61 acl-2011-Binarized Forest to String Translation

8 0.62732476 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

9 0.61466223 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

10 0.54745126 176 acl-2011-Integrating surprisal and uncertain-input models in online sentence comprehension: formal techniques and empirical results

11 0.53338498 239 acl-2011-P11-5002 k2opt.pdf

12 0.53292435 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

13 0.53180504 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

14 0.52457118 296 acl-2011-Terminal-Aware Synchronous Binarization

15 0.52220994 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

16 0.50212759 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

17 0.4980284 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation

18 0.48445082 219 acl-2011-Metagrammar engineering: Towards systematic exploration of implemented grammars

19 0.48210993 217 acl-2011-Machine Translation System Combination by Confusion Forest

20 0.47537652 303 acl-2011-Tier-based Strictly Local Constraints for Phonology


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.026), (17, 0.073), (26, 0.036), (37, 0.073), (39, 0.057), (41, 0.052), (55, 0.015), (59, 0.027), (62, 0.022), (72, 0.029), (85, 0.263), (91, 0.031), (96, 0.171)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.85676706 99 acl-2011-Discrete vs. Continuous Rating Scales for Language Evaluation in NLP

Author: Anja Belz ; Eric Kow

Abstract: Studies assessing rating scales are very common in psychology and related fields, but are rare in NLP. In this paper we assess discrete and continuous scales used for measuring quality assessments of computergenerated language. We conducted six separate experiments designed to investigate the validity, reliability, stability, interchangeability and sensitivity of discrete vs. continuous scales. We show that continuous scales are viable for use in language evaluation, and offer distinct advantages over discrete scales. 1 Background and Introduction Rating scales have been used for measuring human perception of various stimuli for a long time, at least since the early 20th century (Freyd, 1923). First used in psychology and psychophysics, they are now also common in a variety of other disciplines, including NLP. Discrete scales are the only type of scale commonly used for qualitative assessments of computer-generated language in NLP (e.g. in the DUC/TAC evaluation competitions). Continuous scales are commonly used in psychology and related fields, but are virtually unknown in NLP. While studies assessing the quality of individual scales and comparing different types of rating scales are common in psychology and related fields, such studies hardly exist in NLP, and so at present little is known about whether discrete scales are a suitable rating tool for NLP evaluation tasks, or whether continuous scales might provide a better alternative. A range of studies from sociology, psychophysiology, biometrics and other fields have compared 230 Kow} @bright on .ac .uk discrete and continuous scales. Results tend to differ for different types of data. E.g., results from pain measurement show a continuous scale to outperform a discrete scale (ten Klooster et al., 2006). Other results (Svensson, 2000) from measuring students’ ease of following lectures show a discrete scale to outperform a continuous scale. When measuring dyspnea, Lansing et al. (2003) found a hybrid scale to perform on a par with a discrete scale. Another consideration is the types of data produced by discrete and continuous scales. Parametric methods of statistical analysis, which are far more sensitive than non-parametric ones, are commonly applied to both discrete and continuous data. However, parametric methods make very strong assumptions about data, including that it is numerical and normally distributed (Siegel, 1957). If these assumptions are violated, then the significance of results is overestimated. Clearly, the numerical assumption does not hold for the categorial data produced by discrete scales, and it is unlikely to be normally distributed. Many researchers are happier to apply parametric methods to data from continuous scales, and some simply take it as read that such data is normally distributed (Lansing et al., 2003). Our aim in the present study was to systematically assess and compare discrete and continuous scales when used for the qualitative assessment of computer-generated language. We start with an overview of assessment scale types (Section 2). 
We describe the experiments we conducted (Sec- tion 4), the data we used in them (Section 3), and the properties we examined in our inter-scale comparisons (Section 5), before presenting our results Proceedings ofP thoer t4l9atnhd A, Onrnuegaoln M,e Jeuntineg 19 o-f2 t4h,e 2 A0s1s1o.c?i ac t2io0n11 fo Ar Cssoocmiaptuiotanti foonra Clo Lminpguutiastti ocns:aslh Loirntpgaupisetricss, pages 230–235, Q1: Grammaticality The summary should have no datelines, system-internal formatting, capitalization errors or obviously ungrammatical sentences (e.g., fragments, missing components) that make the text difficult to read. 1. Very Poor 2. Poor 3. Barely Acceptable 4. Good 5. Very Good Figure 1: Evaluation of Readability in DUC’06, comprising 5 evaluation criteria, including Grammaticality. Evaluation task for each summary text: evaluator selects one of the options (1–5) to represent quality of the summary in terms of the criterion. (Section 6), and some conclusions (Section 7). 2 Rating Scales With Verbal Descriptor Scales (VDSs), participants give responses on ordered lists of verbally described and/or numerically labelled response cate- gories, typically varying in number from 2 to 11 (Svensson, 2000). An example of a VDS used in NLP is shown in Figure 1. VDSs are used very widely in contexts where computationally generated language is evaluated, including in dialogue, summarisation, MT and data-to-text generation. Visual analogue scales (VASs) are far less common outside psychology and related areas than VDSs. Responses are given by selecting a point on a typically horizontal line (although vertical lines have also been used (Scott and Huskisson, 2003)), on which the two end points represent the extreme values of the variable to be measured. Such lines can be mono-polar or bi-polar, and the end points are labelled with an image (smiling/frowning face), or a brief verbal descriptor, to indicate which end of the line corresponds to which extreme of the variable. The labels are commonly chosen to represent a point beyond any response actually likely to be chosen by raters. There is only one examples of a VAS in NLP system evaluation that we are aware of (Gatt et al., 2009). Hybrid scales, known as a graphic rating scales, combine the features of VDSs and VASs, and are also used in psychology. Here, the verbal descriptors are aligned along the line of a VAS and the endpoints are typically unmarked (Svensson, 2000). We are aware of one example in NLP (Williams and Reiter, 2008); 231 Q1: Grammaticality The summary should have no datelines, system-internal formatting, capitalization errors or obviously ungrammatical sentences (e.g., fragments, missing components) that make the text difficult to read. extbreamdely excellent Figure 2: Evaluation of Grammaticality with alternative VAS scale (cf. Figure 1). Evaluation task for each summary text: evaluator selects a place on the line to represent quality of the summary in terms of the criterion. we did not investigate this scale in our study. We used the following two specific scale designs in our experiments: VDS-7: 7 response categories, numbered (7 = best) and verbally described (e.g. 7 = “perfectly fluent” for Fluency, and 7 = “perfectly clear” for Clarity). Response categories were presented in a vertical list, with the best category at the bottom. Each category had a tick-box placed next to it; the rater’s task was to tick the box by their chosen rating. VAS: a horizontal, bi-polar line, with no ticks on it, mapping to 0–100. 
In the image description tests, statements identified the left end as negative, the right end as positive; in the weather forecast tests, the positive end had a smiling face and the label “statement couldn’t be clearer/read better”; the negative end had a frowning face and the label “statement couldn’t be more unclear/read worse”. The raters’ task was to move a pointer (initially in the middle of the line) to the place corresponding to their rating. 3 Data Weather forecast texts: In one half of our evaluation experiments we used human-written and automatically generated weather forecasts for the same weather data. The data in our evaluations was for 22 different forecast dates and included outputs from 10 generator systems and one set of human forecasts. This data has also been used for comparative system evaluation in previous research (Langner, 2010; Angeli et al., 2010; Belz and Kow, 2009). The following are examples of weather forecast texts from the data: 1: S SE 2 8 -3 2 INCREAS ING 3 6-4 0 BY MID AF TERNOON 2 : S ’ LY 2 6-3 2 BACKING S SE 3 0 -3 5 BY AFTERNOON INCREAS ING 3 5 -4 0 GUSTS 5 0 BY MID EVENING Image descriptions: In the other half of our evaluations, we used human-written and automatically generated image descriptions for the same images. The data in our evaluations was for 112 different image sets and included outputs from 6 generator systems and 2 sets of human-authored descriptions. This data was originally created in the TUNA Project (van Deemter et al., 2006). The following is an example of an item from the corpus, consisting of a set of images and a description for the entity in the red frame: the smal l blue fan 4 Experimental Set-up 4.1 Evaluation criteria Fluency/Readability: Both the weather forecast and image description evaluation experiments used a quality criterion intended to capture ‘how well a piece of text reads’ , called Fluency in the latter, Readability in the former. Adequacy/Clarity: In the image description experiments, the second quality criterion was Adequacy, explained as “how clear the description is”, and “how easy it would be to identify the image from the description”. This criterion was called Clarity in the weather forecast experiments, explained as “how easy is it to understand what is being described”. 4.2 Raters In the image experiments we used 8 raters (native speakers) in each experiment, from cohorts of 3rdyear undergraduate and postgraduate students doing a degree in a linguistics-related subject. They were paid and spent about 1hour doing the experiment. In the weather forecast experiments, we used 22 raters in each experiment, from among academic staff at our own university. They were not paid and spent about 15 minutes doing the experiment. 232 4.3 Summary overview of experiments Weather VDS-7 (A): VDS-7 scale; weather forecast data; criteria: Readability and Clarity; 22 raters (university staff) each assessing 22 forecasts. Weather VDS-7 (B): exact repeat of Weather VDS-7 (A), including same raters. Weather VAS: VAS scale; 22 raters (university staff), no overlap with raters in Weather VDS-7 experiments; other details same as in Weather VDS-7. Image VDS-7: VDS-7 scale; image description data; 8 raters (linguistics students) each rating 112 descriptions; criteria: Fluency and Adequacy. Image VAS (A): VAS scale; 8 raters (linguistics students), no overlap with raters in Image VAS-7; other details same as in Image VDS-7 experiment. Image VAS (B): exact repeat of Image VAS (A), including same raters. 
4.4 Design features common to all experiments In all our experiments we used a Repeated Latin Squares design to ensure that each rater sees the same number of outputs from each system and for each text type (forecast date/image set). Following detailed instructions, raters first did a small number of practice examples, followed by the texts to be rated, in an order randomised for each rater. Evaluations were carried out via a web interface. They were allowed to interrupt the experiment, and in the case of the 1hour long image description evaluation they were encouraged to take breaks. 5 Comparison and Assessment of Scales Validity is to the extent to which an assessment method measures what it is intended to measure (Svensson, 2000). Validity is often impossible to assess objectively, as is the case of all our criteria except Adequacy, the validity of which we can directly test by looking at correlations with the accuracy with which participants in a separate experiment identify the intended images given their descriptions. A standard method for assessing Reliability is Kendall’s W, a coefficient of concordance, measuring the degree to which different raters agree in their ratings. We report W for all 6 experiments. Stability refers to the extent to which the results of an experiment run on one occasion agree with the results of the same experiment (with the same raters) run on a different occasion. In the present study, we assess stability in an intra-rater, test-retest design, assessing the agreement between the same participant’s responses in the first and second runs of the test with Pearson’s product-moment correlation coefficient. We report these measures between ratings given in Image VAS (A) vs. those given in Image VAS (B), and between ratings given in Weather VDS-7 (A) vs. those given in Weather VDS-7 (B). We assess Interchangeability, that is, the extent to which our VDS and VAS scales agree, by computing Pearson’s and Spearman’s coefficients between results. We report these measures for all pairs of weather forecast/image description evaluations. We assess the Sensitivity of our scales by determining the number of significant differences between different systems and human authors detected by each scale. We also look at the relative effect of the different experimental factors by computing the F-Ratio for System (the main factor under investigation, so its relative effect should be high), Rater and Text Type (their effect should be low). F-ratios were de- termined by a one-way ANOVA with the evaluation criterion in question as the dependent variable and System, Rater or Text Type as grouping factors. 6 Results 6.1 Interchangeability and Reliability for system/human authored image descriptions Interchangeability: Pearson’s r between the means per system/human in the three image description evaluation experiments were as follows (Spearman’s ρ shown in brackets): Forb.eqAdFlouthV AD S d-(e7Aq)uac.y945a78n*d(V.F9A2l5uS8e*(—An *c)y,.98o36r.748*e1l9a*(tV.i98(Ao.2578nS019s(*5B b) e- tween Image VDS-7 and Image VAS (A) (the main VAS experiment) are extremely high, meaning that they could substitute for each other here. Reliability: Inter-rater agreement in terms of Kendall’s W in each of the experiments: 233 K ’ s W FAldue qnucayc .6V549D80S* -7* VA.46S7 16(*A * )VA.7S529 (5*B *) W was higher in the VAS data in the case of Fluency, whereas for Adequacy, W was the same for the VDS data and VAS (B), and higher in the VDS data than in the VAS (A) data. 
6.2 Interchangeability and Reliability for system/human authored weather forecasts Interchangeability: The correlation coefficients (Pearson’s r with Spearman’s ρ in brackets) between the means per system/human in the image description experiments were as follows: ForRCea.ld bVoDt hS -A7 (d BAeq)ua.c9y851a*nVdD(.8F9S7-lu09*(eBn—*)cy,.9 o43r2957*1e la(*t.8i(o736n025Vs9*6A bS)e- tween Weather VDS-7 (A) (the main VDS-7 experiment) and Weather VAS (A) are again very high, although rank-correlation is somewhat lower. Reliability: Inter-rater agreement Kendall’s W was as follows: in terms of W RClea rdi.tyVDS.5-4739 7(*A * )VDS.4- 7583 (*B * ).4 8 V50*A *S This time the highest agreement for both Clarity and Readability was in the VDS-7 data. 6.3 Stability tests for image and weather data Pearson’s r between ratings given by the same raters first in Image VAS (A) and then in Image VAS (B) was .666 for Adequacy, .593 for Fluency. Between ratings given by the same raters first in Weather VDS-7 (A) and then in Weather VDS-7 (B), Pearson’s r was .656 for Clarity, .704 for Readability. (All significant at p < .01.) Note that these are computed on individual scores (rather than means as in the correlation figures given in previous sections). 6.4 F-ratios and post-hoc analysis for image data The table below shows F-ratios determined by a oneway ANOVA with the evaluation criterion in question (Adequacy/Fluency) as the dependent variable and System/Rater/Text Type as the grouping factor. Note that for System a high F-ratio is desirable, but a low F-ratio is desirable for other factors. tem, the main factor under investigation, VDS-7 found 8 for Adequacy and 14 for Fluency; VAS (A) found 7 for Adequacy and 15 for Fluency. 6.5 F-ratios and post-hoc analysis for weather data The table below shows F-ratios analogous to the previous section (for Clarity/Readability). tem, VDS-7 (A) found 24 for Clarity, 23 for Readability; VAS found 25 for Adequacy, 26 for Fluency. 6.6 Scale validity test for image data Our final table of results shows Pearson’s correlation coefficients (calculated on means per system) between the Adequacy data from the three image description evaluation experiments on the one hand, and the data from an extrinsic experiment in which we measured the accuracy with which participants identified the intended image described by a description: ThecorIlm at iog ne V bAeDSt w-(A7eB)An dA eqd uqe ac uy a cy.I89nD720d 6AI*Dc .Acuray was strong and highly significant in all three image description evaluation experiments, but strongest in VAS (B), and weakest in VAS (A). For comparison, 234 Pearson’s between Fluency and ID Accuracy ranged between .3 and .5, whereas Pearson’s between Adequacy and ID Speed (also measured in the same image identfication experiment) ranged between -.35 and -.29. 7 Discussion and Conclusions Our interchangeability results (Sections 6. 1and 6.2) indicate that the VAS and VDS-7 scales we have tested can substitute for each other in our present evaluation tasks in terms of the mean system scores they produce. Where we were able to measure validity (Section 6.6), both scales were shown to be similarly valid, predicting image identification accuracy figures from a separate experiment equally well. Stability (Section 6.3) was marginally better for VDS-7 data, and Reliability (Sections 6.1 and 6.2) was better for VAS data in the image descrip- tion evaluations, but (mostly) better for VDS-7 data in the weather forecast evaluations. 
Finally, the VAS experiments found greater numbers of statistically significant differences between systems in 3 out of 4 cases (Section 6.5). Our own raters strongly prefer working with VAS scales over VDSs. This has also long been clear from the psychology literature (Svensson, 2000)), where raters are typically found to prefer VAS scales over VDSs which can be a “constant source of vexation to the conscientious rater when he finds his judgments falling between the defined points” (Champney, 1941). Moreover, if a rater’s judgment falls between two points on a VDS then they must make the false choice between the two points just above and just below their actual judgment. In this case we know that the point they end up selecting is not an accurate measure of their judgment but rather just one of two equally accurate ones (one of which goes unrecorded). Our results establish (for our evaluation tasks) that VAS scales, so far unproven for use in NLP, are at least as good as VDSs, currently virtually the only scale in use in NLP. Combined with the fact that raters strongly prefer VASs and that they are regarded as more amenable to parametric means of statistical analysis, this indicates that VAS scales should be used more widely for NLP evaluation tasks. References Gabor Angeli, Percy Liang, and Dan Klein. 2010. A simple domain-independent probabilistic approach to generation. In Proceedings of the 15th Conference on Empirical Methods in Natural Language Processing (EMNLP’10). Anja Belz and Eric Kow. 2009. System building cost vs. output quality in data-to-text generation. In Proceedings of the 12th European Workshop on Natural Language Generation, pages 16–24. H. Champney. 1941. The measurement of parent behavior. Child Development, 12(2): 13 1. M. Freyd. 1923. The graphic rating scale. Biometrical Journal, 42:83–102. A. Gatt, A. Belz, and E. Kow. 2009. The TUNA Challenge 2009: Overview and evaluation results. In Proceedings of the 12th European Workshop on Natural Language Generation (ENLG’09), pages 198–206. Brian Langner. 2010. Data-driven Natural Language Generation: Making Machines Talk Like Humans Using Natural Corpora. Ph.D. thesis, Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Robert W. Lansing, Shakeeb H. Moosavi, and Robert B. Banzett. 2003. Measurement of dyspnea: word labeled visual analog scale vs. verbal ordinal scale. Respiratory Physiology & Neurobiology, 134(2):77 –83. J. Scott and E. C. Huskisson. 2003. Vertical or horizontal visual analogue scales. Annals of the rheumatic diseases, (38):560. Sidney Siegel. 1957. Non-parametric statistics. The American Statistician, 11(3): 13–19. Elisabeth Svensson. 2000. Comparison of the quality of assessments using continuous and discrete ordinal rating scales. Biometrical Journal, 42(4):417–434. P. M. ten Klooster, A. P. Klaar, E. Taal, R. E. Gheith, J. J. Rasker, A. K. El-Garf, and M. A. van de Laar. 2006. The validity and reliability of the graphic rating scale and verbal rating scale for measuing pain across cultures: A study in egyptian and dutch women with rheumatoid arthritis. The Clinical Journal of Pain, 22(9):827–30. Kees van Deemter, Ielka van der Sluis, and Albert Gatt. 2006. Building a semantically transparent corpus for the generation of referring expressions. In Proceedings of the 4th International Conference on Natural Language Generation, pages 130–132, Sydney, Australia, July. S. Williams and E. Reiter. 2008. Generating basic skills reports for low-skilled readers. 
Natural Language Engineering, 14(4):495–525. 235

same-paper 2 0.80576336 154 acl-2011-How to train your multi bottom-up tree transducer

Author: Andreas Maletti

Abstract: The local multi bottom-up tree transducer is introduced and related to the (non-contiguous) synchronous tree sequence substitution grammar. It is then shown how to obtain a weighted local multi bottom-up tree transducer from a bilingual and biparsed corpus. Finally, the problem of non-preservation of regularity is addressed. Three properties that ensure preservation are introduced, and it is discussed how to adjust the rule extraction process such that they are automatically fulfilled.

3 0.72257006 244 acl-2011-Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts

Author: Ruihong Huang ; Ellen Riloff

Abstract: The goal of our research is to improve event extraction by learning to identify secondary role filler contexts in the absence of event keywords. We propose a multilayered event extraction architecture that progressively “zooms in” on relevant information. Our extraction model includes a document genre classifier to recognize event narratives, two types of sentence classifiers, and noun phrase classifiers to extract role fillers. These modules are organized as a pipeline to gradually zero in on event-related information. We present results on the MUC-4 event extraction data set and show that this model performs better than previous systems.

4 0.67692274 46 acl-2011-Automated Whole Sentence Grammar Correction Using a Noisy Channel Model

Author: Y. Albert Park ; Roger Levy

Abstract: Automated grammar correction techniques have seen improvement over the years, but there is still much room for increased performance. Current correction techniques mainly focus on identifying and correcting a specific type of error, such as verb form misuse or preposition misuse, which restricts the corrections to a limited scope. We introduce a novel technique, based on a noisy channel model, which can utilize the whole sentence context to determine proper corrections. We show how to use the EM algorithm to learn the parameters of the noise model, using only a data set of erroneous sentences, given the proper language model. This frees us from the burden of acquiring a large corpus of corrected sentences. We also present a cheap and efficient way to provide automated evaluation results for grammar corrections by using BLEU and METEOR, in contrast to the commonly used manual evaluations.

5 0.63761902 137 acl-2011-Fine-Grained Class Label Markup of Search Queries

Author: Joseph Reisinger ; Marius Pasca

Abstract: We develop a novel approach to the semantic analysis of short text segments and demonstrate its utility on a large corpus of Web search queries. Extracting meaning from short text segments is difficult as there is little semantic redundancy between terms; hence methods based on shallow semantic analysis may fail to accurately estimate meaning. Furthermore search queries lack explicit syntax often used to determine intent in question answering. In this paper we propose a hybrid model of semantic analysis combining explicit class-label extraction with a latent class PCFG. This class-label correlation (CLC) model admits a robust parallel approximation, allowing it to scale to large amounts of query data. We demonstrate its performance in terms of (1) its predicted label accuracy on polysemous queries and (2) its ability to accurately chunk queries into base constituents.

6 0.63331521 327 acl-2011-Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment

7 0.63228375 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation

8 0.63160074 61 acl-2011-Binarized Forest to String Translation

9 0.62986481 117 acl-2011-Entity Set Expansion using Topic information

10 0.629691 11 acl-2011-A Fast and Accurate Method for Approximate String Search

11 0.62964737 28 acl-2011-A Statistical Tree Annotator and Its Applications

12 0.62941039 30 acl-2011-Adjoining Tree-to-String Translation

13 0.62858862 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

14 0.62823796 77 acl-2011-Computing and Evaluating Syntactic Complexity Features for Automated Scoring of Spontaneous Non-Native Speech

15 0.62781942 193 acl-2011-Language-independent compound splitting with morphological operations

16 0.62759268 217 acl-2011-Machine Translation System Combination by Confusion Forest

17 0.62711126 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

18 0.62688839 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

19 0.62646997 254 acl-2011-Putting it Simply: a Context-Aware Approach to Lexical Simplification

20 0.62604678 318 acl-2011-Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models