acl acl2010 acl2010-20 knowledge-graph by maker-knowledge-mining

20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures


Source: pdf

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

Reference: text


Summary: the most important sentenses genereted by tfidf model

sentIndex sentText sentNum sentScore

1 e s Abstract Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. [sent-3, score-0.435]

2 In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar. [sent-5, score-0.236]

3 1 Introduction Dependency-based syntactic parsing has become a widely used technique in natural language processing, and many different parsing models have been proposed in recent years (Yamada and Matsumoto, 2003; Nivre et al. [sent-6, score-0.104]

4 Although these proposals seem to have a very good fit with linguistic data, in the sense that they often cover 99% or more of the structures found in existing treebanks, the development of efficient parsing algorithms for these classes has met with more limited success. [sent-15, score-0.182]

5 (2009) have shown how well-nested dependency trees with bounded gap degree can be parsed in polynomial time, the best time complexity for lexicalized parsing of this class remains a prohibitive O(n7), which makes the practical usefulness questionable. [sent-17, score-0.465]

6 In this paper, we explore another characterization of mildly non-projective dependency trees based on the notion of multiplanarity. [sent-18, score-0.374]

7 This was originally proposed by Yli-Jyr a¨ (2003) but has so far played a marginal role in the dependency parsing literature, because no algorithm was known for determining whether an arbitrary tree was mplanar, and no parsing algorithm existed for any constant value of m. [sent-19, score-0.456]

8 First, we present a procedure for determining the minimal number m such that a dependency tree is m-planar and use it to show that the overwhelming majority of sentences in dependency treebanks have a tree that is at most 2planar. [sent-21, score-0.748]

9 Secondly, we present a transition-based parsing algorithm for 2-planar dependency trees, developed in two steps. [sent-22, score-0.339]

10 We begin by showing how the stack-based algorithm of Nivre (2003) can be generalized from projective to planar structures. [sent-23, score-0.509]

11 We then extend the system by adding a second stack and show that the resulting system captures exactly the set of 2-planar structures. [sent-24, score-0.14]

12 Although the contributions of this paper are mainly theoretical, we also present an empirical evaluation of the 2planar parser, showing that it outperforms the projective parser on four data sets from the CoNLL-X shared task (Buchholz and Marsi, 2006). [sent-25, score-0.242]

13 A dependency graph for w is a directed graph G = (Vw , E), where Vw = [1, n] and E ⊆ Vw Vw. [sent-35, score-0.685]

14 We call an edge (wi, wj) in a dependency graph G a dependency link2 from wi to wj. [sent-36, score-1.093]

15 We say that wi is the parent (or head) of wj and, conversely, that wj is a syntactic child (or dependent) of wi. [sent-37, score-1.088]

16 The projection of a node denoted bwic , is the set of reflexive-transitive dependents o bfw wi: bwic = {wj ∈ Vle | wi t→ra∗n wj}. [sent-39, score-0.525]

17 arbitrary dependency graphs but typically require graphs to be acyclic and have at most one head per node. [sent-41, score-0.576]

18 Single-head: If wj → wi, then not wk → wi (for every k j). [sent-50, score-0.799]

19 wi, = Nodes in a forest that do not have a head are called roots. [sent-51, score-0.142]

20 Some frameworks require that dependency forests have a unique root (i. [sent-52, score-0.409]

21 2In practice, dependency links are usually labeled, but to simplify the presentation we will ignore labels throughout most of the paper. [sent-61, score-0.429]

22 However, all the results and algorithms presented can be applied to labeled dependency graphs and will be so applied in the experimental evaluation. [sent-62, score-0.386]

23 wn is projective iff bwic is an interval for every wordi wi ∈ [1, n]. [sent-67, score-0.699]

24 Projective dependency trees correspond to the set of structures that can be induced from lexicalised context-free derivations (Kuhlmann, 2007; Gaifman, 1965). [sent-68, score-0.445]

25 , 2008), while in the case of general non-projective dependency forests, it is only tractable under strong independence assumptions (McDonald et al. [sent-70, score-0.287]

26 3 Planarity The concept of planarity (Sleator and Temperley, 1993) is closely related to projectivity3 and can be informally defined as the property of a dependency forest whose links can be drawn above the words without crossing. [sent-73, score-0.627]

27 4 To define planarity more formally, we first define crossing links as follows: let (wi, wk) and (wj , wl) be dependency links in a dependency graph G. [sent-74, score-1.267]

28 Then, tihtye, lwienk ass are esa tihda t o m i bne crossing miifn min(i, k) < min(j, l) < max(i, k) < max(j, l). [sent-76, score-0.092]

29 A dependency graph is planar iff it does not contain a pair of crossing links. [sent-78, score-0.973]

30 4 Multiplanarity The concept of planarity on its own does not seem to be very relevant as an extension of projectivity for practical dependency parsing. [sent-80, score-0.496]

31 According to the results by Kuhlmann and Nivre (2006), most non-projective structures in dependency treebanks are also non-planar, so being able to parse planar structures will only give us a modest improvement in coverage with respect to a projective parser. [sent-81, score-1.188]

32 However, our interest in planarity is motivated by the fact that it can be generalised narity (Yli-Jyr a¨, to multipla- 2003): 3For dependency forests that are extended with a unique artificial root located at position 0, as is commonly done, the two notions are equivalent. [sent-82, score-0.584]

33 4Planarity in the context of dependency structures is not to be confused with the homonymous concept in graph theory, which does not restrict links to be drawn above the nodes. [sent-83, score-0.733]

34 1493 Figure 1: A 2-planar dependency structure with two different ways of distributing its links into two planes (represented by solid and dotted lines). [sent-84, score-0.547]

35 A dependency graph G = (V, E) is m-planar iff there exist planar dependency graphs G1 = (V, E1), . [sent-86, score-1.267]

36 Intuitively, we can associate planes with colours and say that a dependency graph G is m-planar ifit is possible to assign one of m colours to each of its links in such a way that links with the same colour do not cross. [sent-90, score-1.201]

37 Note that there may be multiple ways of dividing an m-planar graph into planes, as shown in the example of Figure 1. [sent-91, score-0.199]

38 3 Determining Multiplanarity Several constraints on non-projective dependency structures have been proposed recently that seek a good balance between parsing efficiency and coverage of non-projective phenomena present in natural language treebanks. [sent-92, score-0.533]

39 For example, Kuhlmann and Nivre (2006) and Havelka (2007) have shown that the vast majority of structures present in existing treebanks are well-nested and have a small gap degree (Bodirsky et al. [sent-93, score-0.223]

40 , 2005), leading to an interest in parsers for these kinds of structures (G´ omezRodr ı´guez et al. [sent-94, score-0.127]

41 No similar analysis has been performed for m-planar structures, although Yli-Jyr a¨ (2003) provides evidence that all except two structures in the Danish dependency treebank are at most 3-planar. [sent-96, score-0.392]

42 However, his analysis is based on constraints that restrict the possible ways of assigning planes to dependency links, and he is not guaranteed to find the minimal number m for which a given structure is m-planar. [sent-97, score-0.476]

43 In this section, we provide a procedure for finding the minimal number m such that a dependency graph is m-planar and use it to show that the vast majority of sentences in dependency treebanks are Figure 2: The crossings graph corresponding to the dependency structure of Figure 1. [sent-98, score-1.54]

44 The idea is to reduce the problem of determining whether a dependency graph G = (V, E) is m-planar, for a given value of m, to a standard graph colouring problem. [sent-100, score-0.767]

45 Figure 2 shows the crossings graph of the 2-planar structure in Figure 1. [sent-102, score-0.341]

46 4, a dependency graph G is m-planar if each of its links can be assigned one of m colours in such a way that links with the same colours do not cross. [sent-104, score-0.998]

47 In terms of the cross- ings graph, this means that G is m-planar if each of the nodes of U(G) can be assigned one of m colours such that no two neighbours have the same colour. [sent-105, score-0.229]

48 For k = 1the problem is trivial: a graph is 1colourable only if it has no edges. [sent-107, score-0.199]

49 For k = 2, the problem can be solved in time linear in the size of the graph by simple breadth-first search. [sent-108, score-0.199]

50 Given a graph U = (V, E), we pick an arbitrary node v and give it one of two colours. [sent-109, score-0.27]

51 This forces us to give the other colour to all its neighbours, the first colour to the neighbours’ neighbours, and so on. [sent-110, score-0.124]

52 This process continues until we have processed all the nodes in the connected component of v. [sent-111, score-0.117]

53 If this has resulted in assigning two different colours to the same node, the graph is not 2-colourable. [sent-112, score-0.313]

54 If there are still unprocessed nodes, we repeat the process by arbitrarily selecting one of them, continue with the rest of the connected components, and in this way obtain a 2-colouring of the whole graph if it 1494 nestedness in treebanks for Arabic (Haji ˇc et al. [sent-114, score-0.369]

55 Since this process can be completed by visiting each node and edge of the graph U once, its complexity is O(V + E). [sent-124, score-0.266]

56 5 Given that non-projective sentences in natural language tend to have a small proportion of non-projective links (Nivre and Nilsson, 2005), the connected components of their crossings graphs are very small, and k-colourings for them can quickly be found by brute-force search. [sent-128, score-0.435]

57 By applying these techniques to dependency treebanks of several languages, we obtain the data shown in Table 1. [sent-129, score-0.405]

58 As we will see below, the class of 2-planar dependency structures not only has good coverage oflinguistic phenomena in existing treebanks but is also efficiently parsable with transition-based parsing methods, making it a practically interesting subclass of non-projective dependency structures. [sent-132, score-0.983]

59 5If we have a valid colouring for all the cycles in the graph, the rest of the nodes can be safely coloured by breadthfirst search as in the k = 2 case. [sent-133, score-0.131]

60 4 Parsing 1-Planar Structures In this section, we present a deterministic lineartime parser for planar dependency structures. [sent-134, score-0.704]

61 The parser is a variant of Nivre’s arc-eager projective parser (Nivre, 2003), modified so that it can also handle graphs that are planar but not projective. [sent-135, score-0.758]

62 As seen in Table 1, this only gives a modest improvement in coverage compared to projective parsing, so the main interest of this algorithm lies in the fact that it can be generalised to deal with 2-planar structures, as shown in the next section. [sent-136, score-0.267]

63 A transition system for dependency parsing is a quadruple S = (C, T, cs, Ct) where 1. [sent-140, score-0.533]

64 cs is a function that maps each input sentence w to an initial configuration cs (w) ∈ C, 4. [sent-143, score-0.323]

65 An oracle for a transition system S = (C, T, cs, Ct) is a function o : C → T. [sent-146, score-0.26]

66 An input sentence w can be parsed using a transition system S = (C, T, cs, Ct) and an oracle o by starting in the initial configuration cs (w), calling the oracle function on the current configuration c, and updating the configuration by applying the transition o(c) returned by the oracle. [sent-147, score-0.95]

67 This process is repeated until a terminal configuration is 1495 Initial configuration: Terminal configurations: Transitions: SHIFT REDUCE LEFT-ARC cs(w1 . [sent-148, score-0.168]

68 wn] , ∅i Cf = {hΣ, [] ,Ai ∈ C} hΣ, wi |B, Ai ⇒ hΣ|wi, B, Ai hΣ|wi, B, Ai ⇒ hΣ, B, Ai hΣ|wi, wj |B, Ai ⇒ hΣ|wi, wj |B, A ∪ {(wj, wi)}i only if 6 ∃k| (wk , wi) ∈ ⇒A h (single-head) and not wi ↔∗ wj ∈ A (acyclicity). [sent-154, score-1.792]

69 RIGHT-ARC hΣ|wi, wj |B, Ai ⇒ hΣ|wi, wj |B, A ∪ {(wi, wj)}i only if 6 ∃k| (wk , wj) ∈ ⇒A h (single-head) and not wi ↔∗ wj ∈ A (acyclicity). [sent-155, score-1.472]

70 Figure 3: Transition system for planar dependency parsing. [sent-156, score-0.653]

71 reached, and the dependency analysis of the sentence is defined by the terminal configuration. [sent-157, score-0.338]

72 Each sequence of configurations that the parser can traverse from an initial configuration to a terminal configuration for some input w is called a transition sequence. [sent-158, score-0.601]

73 If we associate each configuration c of a transition system S = (C, T, cs, Ct) with a dependency graph g(c), we can say that S is sound for a class of dependency graphs G if, fsor s every osernt aen clcaes w afn dde ptreanndseitnicony sequence (cs (w) , c1, . [sent-159, score-1.284]

74 , cf) of S, g(cf) is in G, and that S is complete for G if, for every )s iesn itenn Gc,e w da tnhda dte Spendency graph GG i ∈ Gor rfo evr w, sthenertee nisc a wtra annsdit dioensequence (cs(w) , c1, . [sent-162, score-0.222]

75 A transition system that is sound and complete for G is said to be correct for G. [sent-166, score-0.227]

76 rect transition system, a practical parser needs a good oracle to achieve the desired results, since a transition system only specifies how to reach all the possible dependency graphs that could be associated to a sentence, but not how to select the correct one. [sent-168, score-0.916]

77 2 A Transition System for Planar Structures A correct transition system for the class of planar dependency forests can be obtained as a variant of the arc-eager projective system by Nivre (2003). [sent-172, score-1.136]

78 As in that system, the set of configurations of the planar transition system is the set of all triples c = hΣ, B, Ai such that Σ and B are disjoint lists ocf = =w hoΣrd,sB fr,oAmi Vw (for some input w), iasjnodi nAt i ss a set of dependency links over Vw. [sent-173, score-1.006]

79 The list Σ, called the stack, is initially empty and holds words that have dependency links pending to be created. [sent-175, score-0.485]

80 The system is shown in Figure 3, where we use the notation Σ |wi for a stack uwrieth 3 top wi ea wnde t uasile Σ, ea nndot we nin Σve|rwt the notation for the buffer for clarity (i. [sent-176, score-0.543]

81 The system reads the input from left to right and creates links in a left-to-right order by executing its four transitions: 1. [sent-179, score-0.166]

82 LEFT-ARC: adds a link from the first word in the buffer to the top of the stack. [sent-182, score-0.156]

83 RIGHT-ARC: adds a link from the top of the stack to the first word in the buffer. [sent-184, score-0.141]

84 REDUCE: pops the top word from the stack, implying that we have finished building links to or from it. [sent-186, score-0.199]

85 Note that the planar parser’s transitions are more fine-grained than those of the arc-eager projective parser by Nivre (2003), which pops the stack as part of its LEFT-ARC transition and shifts a word as part of its RIGHT-ARC transition. [sent-187, score-0.989]

86 Forcing these actions after creating dependency links rules out structures whose root is covered by a dependency link, which are planar but not projective. [sent-188, score-1.185]

87 For the same reason, we remove the constraint in Nivre’s parser by which words without a head cannot be reduced. [sent-190, score-0.139]

88 Since we are interested in planar dependency forests, which do not contain cycles, we only apply ARC transitions after checking that there is no undirected path between the nodes to be linked. [sent-192, score-0.792]

89 This check can be done without affecting the linear-time complexity of the 1496 parser by storing the weakly connected component of each node in g(c). [sent-193, score-0.215]

90 The fine-grained transitions used by this parser have also been used by Sagae and Tsujii (2008) to parse DAGs. [sent-194, score-0.161]

91 However, the latter parser differs from ours in the constraints, since it does not allow the reduction ofwords without a head (disallowing forests with covered roots) and does not enforce the acyclicity constraint (which is guaranteed by post-processing the graphs to break cycles). [sent-195, score-0.506]

92 We wish to prove that the planar transition system is sound and complete for the set Fp of all planar dependency fpoleretests f. [sent-198, score-1.238]

93 o rT toh prove soundness, we have to show that, for every sentence w and transition sequence (cs (w) , c1, . [sent-199, score-0.233]

94 , cf), the graph g(cf) associated with cf is in Fp. [sent-202, score-0.31]

95 We take the graph associated with a configuration c = (Σ, B, A) to be g(c) = (Vw, A). [sent-203, score-0.316]

96 With this, we prove the stronger claim that g(c) ∈ Fp for every configuration c that belongs tco) some transition sequence starting with cs (w). [sent-204, score-0.453]

97 This amounts to showing that in every configuration c reachable from cs (w), g(c) meets the following three conditions that characterise a planar dependency forest: (1) g(c) does not contain nodes with more than one head; (2) g(c) is acyclic; and (3) g(c) contains no crossing links. [sent-205, score-1.008]

98 (1) is trivially guaranteed by the single-head constraint; (2) follows from (1) and the acyclicity constraint; and (3) can be established by proving that there is no transition sequence that will invoke two ARC transitions on node pairs that would create crossing links. [sent-206, score-0.587]

99 At the point when a link from wi to wj is created, we know that all the words strictly located between wi and wj are not in the stack or in the buffer, so no links can be created to or from them. [sent-207, score-1.712]

100 To prove completeness, we show that every planar dependency forest G = (V, E) ∈ Fp fpolra a s deenpteenncdee w can sbte produced by applying the oracle function that maps a configuration hΣ |wi , wj |B, Ai to: 1. [sent-208, score-1.339]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('wj', 0.384), ('planar', 0.342), ('wi', 0.32), ('dependency', 0.287), ('graph', 0.199), ('transition', 0.17), ('projective', 0.167), ('nivre', 0.164), ('crossings', 0.142), ('links', 0.142), ('acyclicity', 0.118), ('planarity', 0.118), ('planes', 0.118), ('treebanks', 0.118), ('configuration', 0.117), ('colours', 0.114), ('cf', 0.111), ('buffer', 0.107), ('kuhlmann', 0.107), ('structures', 0.105), ('cs', 0.103), ('forests', 0.1), ('graphs', 0.099), ('stack', 0.092), ('crossing', 0.092), ('transitions', 0.086), ('vw', 0.08), ('forest', 0.08), ('guez', 0.076), ('parser', 0.075), ('wk', 0.072), ('neighbours', 0.071), ('bwic', 0.071), ('multiplanarity', 0.071), ('oracle', 0.066), ('colour', 0.062), ('ai', 0.062), ('pops', 0.057), ('iff', 0.053), ('trees', 0.053), ('parsing', 0.052), ('connected', 0.052), ('terminal', 0.051), ('guaranteed', 0.05), ('link', 0.049), ('colouring', 0.047), ('parsable', 0.047), ('nodes', 0.044), ('ct', 0.043), ('satta', 0.042), ('node', 0.041), ('configurations', 0.041), ('projectivity', 0.041), ('fp', 0.04), ('cycles', 0.04), ('prove', 0.04), ('wn', 0.04), ('mcdonald', 0.038), ('havelka', 0.038), ('arc', 0.037), ('coverage', 0.036), ('generalised', 0.036), ('determining', 0.035), ('min', 0.035), ('mildly', 0.034), ('sound', 0.033), ('undirected', 0.033), ('head', 0.032), ('definition', 0.032), ('constraint', 0.032), ('string', 0.031), ('arbitrary', 0.03), ('ej', 0.03), ('danish', 0.03), ('trivially', 0.03), ('called', 0.03), ('convenience', 0.029), ('acyclic', 0.029), ('phenomena', 0.029), ('modest', 0.028), ('nilsson', 0.027), ('complexity', 0.026), ('empty', 0.026), ('haji', 0.026), ('seem', 0.025), ('interval', 0.025), ('practical', 0.025), ('system', 0.024), ('efficiency', 0.024), ('every', 0.023), ('associate', 0.023), ('parsers', 0.022), ('class', 0.022), ('root', 0.022), ('correctness', 0.022), ('projection', 0.022), ('minimal', 0.021), ('component', 0.021), ('adequate', 0.021), ('located', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

2 0.41370627 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average pars- ing time not much slower than O(n).

3 0.17706279 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification

Author: Martin Haulrich

Abstract: We show that using confidence-weighted classification in transition-based parsing gives results comparable to using SVMs with faster training and parsing time. We also compare with other online learning algorithms and investigate the effect of pruning features when using confidenceweighted classification.

4 0.16067314 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

Author: Wenbin Jiang ; Qun Liu

Abstract: In this paper we describe an intuitionistic method for dependency parsing, where a classifier is used to determine whether a pair of words forms a dependency edge. And we also propose an effective strategy for dependency projection, where the dependency relationships of the word pairs in the source language are projected to the word pairs of the target language, leading to a set of classification instances rather than a complete tree. Experiments show that, the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers. More importantly, when this clas- , sifier is integrated into a maximum spanning tree (MST) dependency parser, obvious improvement is obtained over the MST baseline.

5 0.13944374 69 acl-2010-Constituency to Dependency Translation with Forests

Author: Haitao Mi ; Qun Liu

Abstract: Tree-to-string systems (and their forestbased extensions) have gained steady popularity thanks to their simplicity and efficiency, but there is a major limitation: they are unable to guarantee the grammaticality of the output, which is explicitly modeled in string-to-tree systems via targetside syntax. We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model, which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. Medium-scale experiments show an absolute and statistically significant improvement of +0.7 BLEU points over a state-of-the-art forest-based tree-to-string system even with fewer rules. This is also the first time that a treeto-tree model can surpass tree-to-string counterparts.

6 0.12366986 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

7 0.1234087 99 acl-2010-Efficient Third-Order Dependency Parsers

8 0.1166152 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules

9 0.11434306 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

10 0.1138797 143 acl-2010-Importance of Linguistic Constraints in Statistical Dependency Parsing

11 0.10015154 27 acl-2010-An Active Learning Approach to Finding Related Terms

12 0.098164566 3 acl-2010-A Bayesian Method for Robust Estimation of Distributional Similarities

13 0.093687855 84 acl-2010-Detecting Errors in Automatically-Parsed Dependency Relations

14 0.085871108 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

15 0.08273191 214 acl-2010-Sparsity in Dependency Grammar Induction

16 0.081325591 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

17 0.077576116 127 acl-2010-Global Learning of Focused Entailment Graphs

18 0.074724481 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

19 0.073576108 71 acl-2010-Convolution Kernel over Packed Parse Forest

20 0.07324931 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.194), (1, -0.057), (2, 0.079), (3, 0.024), (4, -0.049), (5, -0.093), (6, 0.107), (7, 0.017), (8, -0.096), (9, 0.253), (10, -0.303), (11, -0.019), (12, -0.098), (13, 0.223), (14, 0.197), (15, -0.087), (16, -0.027), (17, -0.0), (18, -0.16), (19, 0.015), (20, -0.038), (21, -0.089), (22, -0.038), (23, -0.036), (24, 0.096), (25, -0.114), (26, -0.095), (27, -0.054), (28, 0.05), (29, 0.067), (30, 0.007), (31, 0.044), (32, 0.011), (33, 0.01), (34, -0.083), (35, -0.009), (36, 0.157), (37, -0.004), (38, -0.005), (39, 0.103), (40, -0.022), (41, 0.06), (42, 0.069), (43, 0.008), (44, -0.068), (45, 0.011), (46, -0.075), (47, 0.047), (48, 0.085), (49, 0.063)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9749884 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

2 0.92375582 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average pars- ing time not much slower than O(n).

3 0.60604811 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification

Author: Martin Haulrich

Abstract: We show that using confidence-weighted classification in transition-based parsing gives results comparable to using SVMs with faster training and parsing time. We also compare with other online learning algorithms and investigate the effect of pruning features when using confidenceweighted classification.

4 0.59975123 99 acl-2010-Efficient Third-Order Dependency Parsers

Author: Terry Koo ; Michael Collins

Abstract: We present algorithms for higher-order dependency parsing that are “third-order” in the sense that they can evaluate substructures containing three dependencies, and “efficient” in the sense that they require only O(n4) time. Importantly, our new parsers can utilize both sibling-style and grandchild-style interactions. We evaluate our parsers on the Penn Treebank and Prague Dependency Treebank, achieving unlabeled attachment scores of 93.04% and 87.38%, respectively.

5 0.57697958 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

Author: Wenbin Jiang ; Qun Liu

Abstract: In this paper we describe an intuitionistic method for dependency parsing, where a classifier is used to determine whether a pair of words forms a dependency edge. And we also propose an effective strategy for dependency projection, where the dependency relationships of the word pairs in the source language are projected to the word pairs of the target language, leading to a set of classification instances rather than a complete tree. Experiments show that, the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers. More importantly, when this clas- , sifier is integrated into a maximum spanning tree (MST) dependency parser, obvious improvement is obtained over the MST baseline.

6 0.55786556 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

7 0.54095232 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

8 0.53079987 143 acl-2010-Importance of Linguistic Constraints in Statistical Dependency Parsing

9 0.51401454 3 acl-2010-A Bayesian Method for Robust Estimation of Distributional Similarities

10 0.48064101 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation

11 0.40726706 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

12 0.3990598 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two

13 0.39660272 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

14 0.37420142 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules

15 0.36481467 69 acl-2010-Constituency to Dependency Translation with Forests

16 0.3590216 124 acl-2010-Generating Image Descriptions Using Dependency Relational Patterns

17 0.34689203 84 acl-2010-Detecting Errors in Automatically-Parsed Dependency Relations

18 0.34082639 130 acl-2010-Hard Constraints for Grammatical Function Labelling

19 0.32266808 214 acl-2010-Sparsity in Dependency Grammar Induction

20 0.32044029 250 acl-2010-Untangling the Cross-Lingual Link Structure of Wikipedia


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(4, 0.011), (14, 0.024), (25, 0.073), (33, 0.01), (42, 0.017), (44, 0.015), (59, 0.07), (69, 0.201), (73, 0.036), (76, 0.035), (78, 0.057), (83, 0.053), (84, 0.022), (98, 0.277)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.90957832 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

2 0.88592279 191 acl-2010-PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names

Author: Mark Johnson

Abstract: This paper establishes a connection between two apparently very different kinds of probabilistic models. Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a lowdimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees. The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well. Adaptor Grammars (AGs) are a hierarchical, non-parameteric Bayesian extension of PCFGs. Exploiting the close relationship between LDA and PCFGs just described, we propose two novel probabilistic models that combine insights from LDA and AG models. The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG. The second extension builds on the first one to learn aspects of the internal structure of proper names.

3 0.87784469 116 acl-2010-Finding Cognate Groups Using Phylogenies

Author: David Hall ; Dan Klein

Abstract: A central problem in historical linguistics is the identification of historically related cognate words. We present a generative phylogenetic model for automatically inducing cognate group structure from unaligned word lists. Our model represents the process of transformation and transmission from ancestor word to daughter word, as well as the alignment between the words lists of the observed languages. We also present a novel method for simplifying complex weighted automata created during inference to counteract the otherwise exponential growth of message sizes. On the task of identifying cognates in a dataset of Romance words, our model significantly outperforms a baseline ap- proach, increasing accuracy by as much as 80%. Finally, we demonstrate that our automatically induced groups can be used to successfully reconstruct ancestral words.

4 0.80866241 244 acl-2010-TrustRank: Inducing Trust in Automatic Translations via Ranking

Author: Radu Soricut ; Abdessamad Echihabi

Abstract: The adoption ofMachine Translation technology for commercial applications is hampered by the lack of trust associated with machine-translated output. In this paper, we describe TrustRank, an MT system enhanced with a capability to rank the quality of translation outputs from good to bad. This enables the user to set a quality threshold, granting the user control over the quality of the translations. We quantify the gains we obtain in translation quality, and show that our solution works on a wide variety of domains and language pairs.

5 0.80722868 27 acl-2010-An Active Learning Approach to Finding Related Terms

Author: David Vickrey ; Oscar Kipersztok ; Daphne Koller

Abstract: We present a novel system that helps nonexperts find sets of similar words. The user begins by specifying one or more seed words. The system then iteratively suggests a series of candidate words, which the user can either accept or reject. Current techniques for this task typically bootstrap a classifier based on a fixed seed set. In contrast, our system involves the user throughout the labeling process, using active learning to intelligently explore the space of similar words. In particular, our system can take advantage of negative examples provided by the user. Our system combines multiple preexisting sources of similarity data (a standard thesaurus, WordNet, contextual similarity), enabling it to capture many types of similarity groups (“synonyms of crash,” “types of car,” etc.). We evaluate on a hand-labeled evaluation set; our system improves over a strong baseline by 36%.

6 0.80620801 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

7 0.80601484 8 acl-2010-A Hybrid Hierarchical Model for Multi-Document Summarization

8 0.80463892 221 acl-2010-Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish

9 0.80176651 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

10 0.8017112 232 acl-2010-The S-Space Package: An Open Source Package for Word Space Models

11 0.80129296 201 acl-2010-Pseudo-Word for Phrase-Based Machine Translation

12 0.79818487 24 acl-2010-Active Learning-Based Elicitation for Semi-Supervised Word Alignment

13 0.79712659 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach

14 0.79017586 90 acl-2010-Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages

15 0.7887218 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

16 0.78760123 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

17 0.78562152 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

18 0.78559673 79 acl-2010-Cross-Lingual Latent Topic Extraction

19 0.78268385 133 acl-2010-Hierarchical Search for Word Alignment

20 0.77562326 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing