acl acl2011 acl2011-296 knowledge-graph by maker-knowledge-mining

296 acl-2011-Terminal-Aware Synchronous Binarization

Source: pdf

Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea

Abstract: We present an SCFG binarization algorithm that combines the strengths of early terminal matching on the source language side and early language model integration on the target language side. We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality.

Reference: text

Summary: the most important sentenses genereted by tfidf model

sentIndex sentText sentNum sentScore

1 We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality. [sent-2, score-0.882]

2 1 Introduction Synchronous context-free grammars (SCFG) are behind most syntax-based machine translation models. [sent-3, score-0.031]

3 Efficient machine translation decoding with an SCFG requires converting the grammar into a binarized form, either explicitly, as in synchronous binarization (Zhang et al. [sent-4, score-1.139]

4 , 2006), where virtual nonterminals are generated for binarization, or implicitly, as in Earley parsing (Earley, 1970), where dotted items are used. [sent-5, score-0.267]

5 Given a source-side binarized SCFG with terminal set T and nonterminal set N, the time complexity ofdecoding a sentence oflength n w tiimthe a m-gram language model is (Venugopal et al. [sent-6, score-0.448]

6 SCFG binarization serves two important goals: • • Parsing complexity for unbinarized SCFG grows exponentially w foitrh tuhneb innuamrizbeedr o Sf nonterminals on the right-hand side of grammar rules. [sent-8, score-0.947]

7 Binarization ensures cubic time decoding in terms of input sentence length. [sent-9, score-0.095]

8 401 In machine translation, integrating language Imno dmela cshtiatnees as early as possible i sn gess leanntgiuala gtoe reducing search errors. [sent-10, score-0.091]

9 , 2006) enables the decoder to incorporate language model scores as soon as a binarized rule is applied. [sent-12, score-0.362]

10 In this paper, we examine a CYK-like synchronous binarization algorithm that integrates a novel criterion in a unified semiring parsing framework. [sent-13, score-1.107]

11 The criterion we present has explicit consideration of source-side terminals. [sent-14, score-0.027]

12 In general, terminals in a rule have a lower probability of being matched given a sentence, and therefore have the effect of “anchoring” a rule and limiting its possible application points. [sent-15, score-0.402]

13 Hopkins and Langmead (2010) formalized this concept as the scope of a rule. [sent-16, score-0.087]

14 The scope of a rule can be calculated by counting the number of adjacent nonterminal pairs and boundary nonterminals. [sent-18, score-0.352]

15 Building on the concept of scope, we define a cost function that estimates the expected number of hyperedges to be built when a particular binarization tree is applied to unseen data. [sent-20, score-1.091]

16 This effectively puts hard-to-match derivations at the bottom of the binarization tree, which enables the decoder to decide early on whether an unbinarized rule can be built or not. [sent-21, score-1.067]

17 We also investigate a better way to handle targetside terminals during binarization. [sent-22, score-0.167]

18 In theory, different strategies should produce equivalent translation results. [sent-23, score-0.066]

19 However, because decoding always involves Proceedings ofP thoer t4l9atnhd A, Onrnuegaoln M,e Jeuntineg 19 o-f2 t4h,e 2 A0s1s1o. [sent-24, score-0.049]

20 i ac t2io0n11 fo Ar Cssoocmiaptuiotanti foonra Clo Lminpguutiast i ocns:aslh Loirntpgaupisetrics , pages 401–406, Number of rgiht-hand-sdie nontermniasl Figure 1: Rule Statistics pruning, we show that different strategies do have a significant effect in translation quality. [sent-26, score-0.066]

21 Other works investigating alternative binarization methods mostly focus on the effect of nonterminal sharing. [sent-27, score-0.902]

22 (2009) also proposed a CYKlike algorithm for synchronous binarization. [sent-29, score-0.183]

23 Apparently the lack of virtual nonterminal sharing in their decoder caused heavy competition between virtual nonterminals, and they created a cost function to “diversify” binarization trees, which is equivalent to minimizing nonterminal sharing. [sent-30, score-1.666]

24 (2009b) used a greedy method to maximize virtual nonterminal sharing on the source side during the -LM parsing phase. [sent-32, score-0.474]

25 They show that effective source-side binarization can improve the efficiency of parsing SCFG. [sent-33, score-0.767]

26 However, their method works only on the source side, and synchronous binarization is put off to the +LM decoding phase (DeNero et al. [sent-34, score-0.974]

27 Although these ideas all lead to faster decoding and reduced search errors, there can be conflicts in the constraints each of them has on the form of rules and accommodating all of them can be a challenge. [sent-36, score-0.132]

28 In this paper, we present a cubic time algorithm to find the best binarization tree, given the conflicting constraints. [sent-37, score-0.801]

29 2 The Binarization Algorithm An SCFG rule is synchronously binarizable if when simultaneously binarizing source and target sides, virtual nonterminals created by binarizations always have contiguous spans on both sides (Huang, 2007). [sent-38, score-0.859]

30 Tj [−k, 1j ]d +o c(hi, k, ji) Tt [←i, j T] ←i,k m] +in( TT[k[i,, j ]] ,+ +t) c T[i,j] ← min(T[i,j],t) Even with the synchronous binarization constraint, many possible binarizations exist. [sent-60, score-0.948]

31 Analysis of our Chinese-English parallel corpus has shown that the majority of synchronously binarizable rules with arity smaller than 4 are monotonic, i. [sent-61, score-0.384]

32 , the target-side nonterminal permutation is either strictly increasing or decreasing (See Figure 1). [sent-63, score-0.17]

33 For monotonic rules, any source-side binarization is also a permissible synchronous binarization. [sent-64, score-1.014]

34 The binarization problem can be formulated as a semiring parsing (Goodman, 1999) problem. [sent-65, score-0.863]

35 We define a cost function that considers different binarization criteria. [sent-66, score-0.928]

36 A CYK-like algorithm can be used to find the best binarization tree according to the cost function. [sent-67, score-0.99]

37 Consider an SCFG rule X → hγ, αi, wcohsetre fu γ atniodn α Cstoannsdid feorr atnhe S source slied eX a n→d t hhγe, tαari-, get side. [sent-68, score-0.211]

38 Let B(γ) be the set of all possible binarization trees for γ. [sent-69, score-0.732]

39 With the cost function c defined over hyperedges in a binarization tree t, the optimal binarization tree is tˆ ˆt = at∈rgBm(γin)Xc(h) where c(h) is the cost of a hyperedge h in t. [sent-70, score-2.156]

40 hi, k, ji denotes a hyperedge h that conngeorctisth tmhe 1 spans (i, k) naontde (k, j) etroe tdhgee span (i, j). [sent-72, score-0.31]

41 cinit is the initialization for the cost function c. [sent-73, score-0.196]

42 We can recover the optimal source-side binarization tree by augmenting the algorithm with back pointers. [sent-74, score-0.846]

43 Binarized rules are generated by iterating over the nodes in the optimal binarization tree, while attaching unaligned target-side terminals. [sent-75, score-0.785]

44 At each tree node, we generate a virtual nonterminal symbol by concatenating the source span it dominates. [sent-76, score-0.514]

45 We define the cost function c(h) to be a tuple of component cost functions: c(h) = (c1(h) , c2 (h) , . [sent-77, score-0.41]

46 If the (min, +) operators on each component cost satisfy the semiring properties, the cost tuple is also a semiring. [sent-86, score-0.517]

47 Next, we describe our cost functions and how we handle target-side terminals. [sent-87, score-0.168]

48 1 Synchronous Binarization as a Cost We use a binary cost b to indicate whether a binarization tree is a permissible synchronous binarization. [sent-89, score-1.206]

49 Given a hyperedge hi, k, ji,we say k is apermissible split nofa thhyep span (i, j) i,fj ain, dw only ikf tishea spans (i, k) and (k, j) are both synchronously binarizable and the span (i, j) covers a consecutive sequence of non- terminals on the target side. [sent-90, score-0.918]

50 A span is synchronously binarizable if and only if the span is of length one, or a permissible split of the span exists. [sent-91, score-0.707]

51 The cost b is defined as: VP → [ pPrPop [o提se出 a [ J J N NNN]]1 ]2]2P,P The source side of the first binarized rule “[]1 → JJ NN, propose a JJ NN” contains a very frequent nonterminal sequence “JJ NN”. [sent-92, score-0.707]

52 If one were to parse with the binarized rule, and if the virtual nonterminal [] 1 has been built, the parser needs to continue following the binarization tree in order to determine whether the original rule would be matched. [sent-93, score-1.405]

53 Furthermore, having two consecutive nonterminals adds to complexity since the parser needs to test each split point. [sent-94, score-0.158]

54 The following binarization is equally valid but integrates terminals early: VP → [ pPrPop [ o提se出 a J J ] 1 NNNN]]2 P,P Here, the first binarized rule “[]1 → 提出 JJ, propose a JJ” anchors on a terminal and enables earlier pruning of the original rule. [sent-95, score-1.362]

55 ewsThtanhtuibsmi nsbaer eiazolafitz heoydn- binit(i) = T Under this configuration, the semiring operators (min, +) defined for the cost b are (∨, ∧). [sent-98, score-0.303]

56 Using b as t(mhei nfir,s+t )co dsetf finuendct fioorn t hine tchoes ct bo astr feu (n∨c,ti∧o)n. [sent-99, score-0.051]

57 tuple guarantees that we will find a tree that is a synchronously binarized if one exists. [sent-100, score-0.432]

58 2 Early Source-Side Terminal Matching When a rule is being applied while parsing a sentence, terminals in the rule have less chance of being matched. [sent-102, score-0.41]

59 We can exploit this fact by taking terminals into account during binarization and placing terminals lower in the binarization tree. [sent-103, score-1.742]

60 Consider the following SCFG rule: VP → prPoPpo 提se出 a J J J N NNN P,P The synchronous binarization algorithm of Zhang et al. [sent-104, score-0.915]

61 (2006) binarizes the rule1 by finding the rightmost binarizable points on the source side: 1We follow Wu (1997) and use square brackets for straight rules and pointed brackets for inverted rules. [sent-105, score-0.373]

62 We also mark brackets with indices to represent virtual nonterminals. [sent-106, score-0.205]

63 403 by defining a cost function e which estimates the probability of a hyperedge hi, k, ji being built. [sent-107, score-0.398]

64 We use a simple fm ao hdyepl:e assume eka,cjhi t bereminign balu or nonterminal in γ is matched independently with a fixed probability, then a hyperedge hi, k, ji is derived if apnrodb only tyif, a tlhl symbols eirne dthgee source span (i, j) are matched. [sent-108, score-0.562]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('binarization', 0.732), ('scfg', 0.188), ('binarizable', 0.173), ('nonterminal', 0.17), ('cost', 0.168), ('binarized', 0.167), ('synchronous', 0.16), ('synchronously', 0.152), ('virtual', 0.151), ('terminals', 0.139), ('hyperedge', 0.12), ('rule', 0.118), ('semiring', 0.096), ('span', 0.093), ('terminal', 0.084), ('nonterminals', 0.081), ('hi', 0.081), ('permissible', 0.079), ('jj', 0.074), ('pprpop', 0.069), ('tree', 0.067), ('scope', 0.064), ('nnn', 0.061), ('early', 0.061), ('ji', 0.059), ('binarizations', 0.056), ('unbinarized', 0.056), ('brackets', 0.054), ('side', 0.051), ('hyperedges', 0.05), ('decoding', 0.049), ('earley', 0.048), ('cubic', 0.046), ('tuple', 0.046), ('monotonic', 0.043), ('vp', 0.039), ('operators', 0.039), ('enables', 0.039), ('spans', 0.038), ('decoder', 0.038), ('parsing', 0.035), ('strategies', 0.035), ('nn', 0.035), ('min', 0.034), ('integrates', 0.034), ('denero', 0.034), ('sharing', 0.034), ('source', 0.033), ('rt', 0.033), ('sides', 0.032), ('rochester', 0.031), ('tt', 0.031), ('se', 0.031), ('translation', 0.031), ('feorr', 0.03), ('thhyep', 0.03), ('adl', 0.03), ('mhei', 0.03), ('arity', 0.03), ('orne', 0.03), ('gtoe', 0.03), ('tlhl', 0.03), ('apnrodb', 0.03), ('binarizes', 0.03), ('nofa', 0.03), ('sor', 0.03), ('atnhe', 0.03), ('rules', 0.029), ('targetside', 0.028), ('accommodating', 0.028), ('licheng', 0.028), ('nnnn', 0.028), ('anchoring', 0.028), ('langmead', 0.028), ('fioorn', 0.028), ('function', 0.028), ('complexity', 0.027), ('criterion', 0.027), ('matched', 0.027), ('cin', 0.026), ('conflicts', 0.026), ('diversify', 0.026), ('piecewise', 0.026), ('consecutive', 0.026), ('pruning', 0.025), ('cyk', 0.025), ('binarizing', 0.025), ('tagyoung', 0.024), ('ain', 0.024), ('anchors', 0.024), ('competition', 0.024), ('optimal', 0.024), ('split', 0.024), ('zhang', 0.023), ('formalized', 0.023), ('hine', 0.023), ('rf', 0.023), ('estimates', 0.023), ('algorithm', 0.023), ('built', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 296 acl-2011-Terminal-Aware Synchronous Binarization

Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea

2 0.59763145 61 acl-2011-Binarized Forest to String Translation

Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu

Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forestto-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-consitituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-tostring decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.

3 0.27823117 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

Author: Tagyoung Chung ; Licheng Fang ; Daniel Gildea

Abstract: We discuss some of the practical issues that arise from decoding with general synchronous context-free grammars. We examine problems caused by unary rules and we also examine how virtual nonterminals resulting from binarization can best be handled. We also investigate adding more flexibility to synchronous context-free grammars by adding glue rules and phrases.

4 0.18778372 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

Author: Bing Zhao ; Young-Suk Lee ; Xiaoqiang Luo ; Liu Li

Abstract: We propose a novel technique of learning how to transform the source parse trees to improve the translation qualities of syntax-based translation models using synchronous context-free grammars. We transform the source tree phrasal structure into a set of simpler structures, expose such decisions to the decoding process, and find the least expensive transformation operation to better model word reordering. In particular, we integrate synchronous binarizations, verb regrouping, removal of redundant parse nodes, and incorporate a few important features such as translation boundaries. We learn the structural preferences from the data in a generative framework. The syntax-based translation system integrating the proposed techniques outperforms the best Arabic-English unconstrained system in NIST08 evaluations by 1.3 absolute BLEU, which is statistically significant.

5 0.16136278 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

Author: Mark-Jan Nederhof ; Giorgio Satta

Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.

6 0.13686644 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

7 0.13660011 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

8 0.11933178 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

9 0.10377447 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

10 0.099541582 252 acl-2011-Prototyping virtual instructors from human-human corpora

11 0.096249767 30 acl-2011-Adjoining Tree-to-String Translation

12 0.083980165 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

13 0.078343861 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars

14 0.070573181 44 acl-2011-An exponential translation model for target language morphology

15 0.068822443 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

16 0.064922817 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation

17 0.063290998 154 acl-2011-How to train your multi bottom-up tree transducer

18 0.058350511 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

19 0.058023352 210 acl-2011-Lexicographic Semirings for Exact Automata Encoding of Sequence Models

20 0.055947497 50 acl-2011-Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.131), (1, -0.17), (2, 0.084), (3, -0.099), (4, 0.022), (5, 0.02), (6, -0.286), (7, -0.071), (8, -0.093), (9, -0.105), (10, -0.107), (11, -0.053), (12, -0.009), (13, 0.115), (14, 0.103), (15, -0.102), (16, 0.005), (17, 0.08), (18, 0.041), (19, 0.024), (20, -0.037), (21, -0.067), (22, -0.081), (23, 0.035), (24, 0.014), (25, -0.051), (26, -0.068), (27, -0.019), (28, 0.083), (29, 0.006), (30, 0.021), (31, 0.074), (32, 0.013), (33, 0.135), (34, 0.033), (35, 0.174), (36, 0.131), (37, 0.369), (38, 0.115), (39, 0.149), (40, 0.004), (41, -0.035), (42, 0.054), (43, 0.04), (44, -0.005), (45, 0.034), (46, -0.052), (47, 0.02), (48, -0.269), (49, -0.079)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97331524 296 acl-2011-Terminal-Aware Synchronous Binarization

Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea

2 0.73370987 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

Author: Tagyoung Chung ; Licheng Fang ; Daniel Gildea

3 0.68227506 61 acl-2011-Binarized Forest to String Translation

Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu

4 0.61174947 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

Author: Mark-Jan Nederhof ; Giorgio Satta

5 0.59427094 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

Author: Pierluigi Crescenzi ; Daniel Gildea ; Andrea Marino ; Gianluca Rossi ; Giorgio Satta

Abstract: We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.

6 0.4581998 154 acl-2011-How to train your multi bottom-up tree transducer

7 0.40379483 252 acl-2011-Prototyping virtual instructors from human-human corpora

8 0.39462265 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

9 0.37641746 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

10 0.32022813 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

11 0.3129909 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

12 0.3109926 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars

13 0.31081659 30 acl-2011-Adjoining Tree-to-String Translation

14 0.28978923 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

15 0.26531357 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

16 0.26506278 11 acl-2011-A Fast and Accurate Method for Approximate String Search

17 0.24295303 291 acl-2011-SystemT: A Declarative Information Extraction System

18 0.23753038 50 acl-2011-Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes

19 0.23382917 107 acl-2011-Dynamic Programming Algorithms for Transition-Based Dependency Parsers

20 0.2335171 217 acl-2011-Machine Translation System Combination by Confusion Forest

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.029), (8, 0.289), (17, 0.079), (37, 0.118), (39, 0.078), (41, 0.05), (55, 0.026), (59, 0.048), (62, 0.016), (72, 0.011), (91, 0.03), (96, 0.11)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78689945 296 acl-2011-Terminal-Aware Synchronous Binarization

Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea

2 0.65639496 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

Author: Danuta Ploch

Abstract: Named entity disambiguation is the task of linking an entity mention in a text to the correct real-world referent predefined in a knowledge base, and is a crucial subtask in many areas like information retrieval or topic detection and tracking. Named entity disambiguation is challenging because entity mentions can be ambiguous and an entity can be referenced by different surface forms. We present an approach that exploits Wikipedia relations between entities co-occurring with the ambiguous form to derive a range of novel features for classifying candidate referents. We find that our features improve disambiguation results significantly over a strong popularity baseline, and are especially suitable for recognizing entities not contained in the knowledge base. Our system achieves state-of-the-art results on the TAC-KBP 2009 dataset.

3 0.62215436 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

Author: Ivan Titov ; Alexandre Klementiev

Abstract: We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical PitmanYor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the MetropolisHastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by us- ing the induced semantic representation for the question answering task in the biomedical domain.

4 0.55687296 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

Author: Yee Seng Chan ; Dan Roth

Abstract: In this paper, we observe that there exists a second dimension to the relation extraction (RE) problem that is orthogonal to the relation type dimension. We show that most of these second dimensional structures are relatively constrained and not difficult to identify. We propose a novel algorithmic approach to RE that starts by first identifying these structures and then, within these, identifying the semantic type of the relation. In the real RE problem where relation arguments need to be identified, exploiting these structures also allows reducing pipelined propagated errors. We show that this RE framework provides significant improvement in RE performance.

5 0.55250019 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

Author: Ang Sun ; Ralph Grishman ; Satoshi Sekine

Abstract: We present a simple semi-supervised relation extraction system with large-scale word clustering. We focus on systematically exploring the effectiveness of different cluster-based features. We also propose several statistical methods for selecting clusters at an appropriate level of granularity. When training on different sizes of data, our semi-supervised approach consistently outperformed a state-of-the-art supervised baseline system. 1

6 0.55178732 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

7 0.55152035 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation

8 0.55139118 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines

9 0.55032486 309 acl-2011-Transition-based Dependency Parsing with Rich Non-local Features

10 0.54631346 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

11 0.54591179 85 acl-2011-Coreference Resolution with World Knowledge

12 0.54587406 5 acl-2011-A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing

13 0.54479754 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

14 0.54476947 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

15 0.54328966 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

16 0.54261714 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

17 0.54166597 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation

18 0.54061091 284 acl-2011-Simple Unsupervised Grammar Induction from Raw Text with Cascaded Finite State Models

19 0.54048198 164 acl-2011-Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features

20 0.54043591 28 acl-2011-A Statistical Tree Annotator and Its Applications