acl acl2010 acl2010-242 knowledge-graph by maker-knowledge-mining

242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -


Source: pdf

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average parsing time not much slower than O(n).

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. [sent-5, score-0.504]

2 The model considers all words necessary for selection of parsing actions by including words in the form of trees. [sent-6, score-0.226]

3 It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. [sent-7, score-0.533]

4 In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. [sent-8, score-0.188]

5 Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average parsing time not much slower than O(n). [sent-9, score-0.106]

6 1 Introduction Deterministic parsing methods achieve both effective time complexity and accuracy not far from those of the most accurate methods. [sent-10, score-0.24]

7 One such deterministic method is Nivre’s method, an incremental parsing method whose time complexity is linear in the number of words (Nivre, 2003). [sent-11, score-0.474]

8 As a specific example, Nivre’s model greedily decides the parsing action only from two words and their locally relational words, which can lead to errors. [sent-13, score-0.309]

9 In the field of Japanese dependency parsing, Iwatate et al. [sent-14, score-0.157]

10 (2008) proposed a tournament model that takes all head candidates into account in judging dependency relations. [sent-15, score-0.532]

11 This method assumes backward parsing because the Japanese dependency structure has a head-final constraint, so that any word’s head is located to its right. [sent-16, score-0.642]

12 Here, we propose a tree-based model, applicable to any projective language, which can be considered as a kind of generalization of Iwatate’s approach. [sent-17, score-0.077]

13 Instead of selecting a parsing action for two words, as in Nivre’s model, our tree-based model first chooses the most probable head candidate from among the trees through a tournament and then decides the parsing action between two trees. [sent-22, score-0.987]

14 Global-optimization parsing methods are another common approach (Eisner, 1996; McDonald et al. [sent-23, score-0.154]

15 Hybrid systems have improved parsing by integrating outputs obtained from different parsing models (Zhang and Clark, 2008). [sent-27, score-0.308]

16 Our proposal can be situated among global-optimization parsing methods as follows. [sent-28, score-0.154]

17 1 Dependency Parsing A dependency parser receives an input sentence x = w1, w2, . [sent-31, score-0.179]

18 , wn} corresponds to the words of a sentence, and the node w0 is the root of G. [sent-38, score-0.048]

19 A is the set of arcs (wi, wj), each of which represents a dependency relation where wi is the head and wj is the dependent. [sent-39, score-0.9]

20 In this paper, we assume that the resulting dependency graph for a sentence is well-formed and projective (Nivre, 2008). [sent-40, score-0.234]
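For a well-formed tree rooted at w0, projectivity as assumed above is equivalent to requiring that no two dependency arcs cross. A minimal sketch of that check, with arcs represented as (head, dependent) index pairs (the list-of-pairs representation is an assumption for illustration):

```python
def is_projective(arcs):
    """Return True if no two dependency arcs cross.

    Each arc is a (head, dependent) pair of word indices; index 0 is the
    artificial root w0. Two arcs cross when their spans interleave."""
    spans = [(min(h, d), max(h, d)) for h, d in arcs]
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            (a, b), (c, d) = spans[i], spans[j]
            # Interleaved spans a < c < b < d (or symmetrically) mean a crossing.
            if a < c < b < d or c < a < d < b:
                return False
    return True
```

For example, the arc set {(0, 2), (2, 1), (2, 3)} is projective, while {(1, 3), (2, 4)} contains a crossing pair.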

21 2 Nivre’s Method An incremental dependency parsing algorithm was first proposed by Covington (2001). [sent-43, score-0.347]

22 Table 1: Transitions for Nivre’s method and the proposed method. [sent-46, score-0.034]

23 Nivre’s method, applying an arc-eager algorithm, works by using a stack of words denoted as σ and a buffer β initially containing the sentence x. [sent-48, score-0.355]

24 Parsing is formulated as a quadruple (S, Ts, sinit, St), where each component is defined as follows: • S is a set of states, each of which is denoted as (σ, β, A) ∈ S. [sent-49, score-0.102]

25 Ts is a set of transitions, and each element of Ts is a function ts : S → S. [sent-50, score-0.078]

26 Syntactic analysis generates a sequence of optimal transitions ts provided by an oracle o : S → Ts, applied to a target consisting of the stack’s top element wi and the first element wj in the buffer. [sent-56, score-0.812]

27 The oracle is constructed as a classifier trained on treebank data. [sent-57, score-0.094]

28 Each transition is defined in the upper block of Table 1 and explained as follows: Left-Arc Make wj the head of wi and pop wi, where wi is located at the stack top (denoted as σ|wi), when the buffer head is wj (denoted as wj |β). [sent-58, score-2.413]

29 Shift Push the word wj, located at the buffer head, onto the stack top. [sent-62, score-0.382]
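The transitions above can be sketched as pure functions over states (σ, β, A), with words as integer indices and a state as a (stack, buffer, arcs) triple; the classifier that chooses among them is omitted, so this is a simplified sketch rather than the authors' implementation:

```python
def left_arc(state):
    # Make wj the head of wi and pop wi (wi = stack top, wj = buffer head).
    stack, buf, arcs = state
    wi, wj = stack[-1], buf[0]
    return stack[:-1], buf, arcs | {(wj, wi)}

def right_arc(state):
    # Make wi the head of wj and push wj onto the stack.
    stack, buf, arcs = state
    wi, wj = stack[-1], buf[0]
    return stack + [wj], buf[1:], arcs | {(wi, wj)}

def reduce_(state):
    # Pop the stack top wi, whose head has already been attached.
    stack, buf, arcs = state
    return stack[:-1], buf, arcs

def shift(state):
    # Push the buffer head wj onto the stack.
    stack, buf, arcs = state
    return stack + [buf[0]], buf[1:], arcs
```

Starting from ([0], [1, 2], ∅), the sequence Shift, Left-Arc, Right-Arc yields the arcs {(2, 1), (0, 2)}.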

30 The method explained thus far has the following drawbacks. [sent-63, score-0.073]

31 Locality of Parsing Action Selection The dependency relations are greedily determined, so when the transition Right-Arc adds a dependency arc (wi, wj), a more probable head of wj located in the stack is disregarded as a candidate. [sent-64, score-1.359]

32 Features Used for Selecting Reduce The features used in (Nivre and Scholz, 2004) to define a state transition are basically obtained from the two target words wi and wj, and their related words. [sent-65, score-0.372]

33 These words are not sufficient to select Reduce, because this action means that wj has no dependency relation with any word in the stack. [sent-66, score-0.636]

34 Preconditions When the classifier selects a transition, the resulting graph satisfies well-formedness and projectivity only under the preconditions listed in Table 1. [sent-67, score-0.171]

35 Even though the parsing seems to be formulated as a four-class classifier problem, it is in fact formed of two types of three-class classifiers. [sent-68, score-0.246]

36 Solving these problems and selecting a more suitable dependency relation requires a parser that considers more global dependency relations. [sent-69, score-0.396]

37 1 Overall Procedure Tree-based parsing uses trees as the procedural elements instead of words. [sent-71, score-0.225]

38 This allows enhancement of previously proposed deterministic models such as (Covington, 2001; Yamada and Matsumoto, 2003). [sent-72, score-0.193]

39 In this paper, we show the application of tree-based parsing to Nivre’s method. [sent-73, score-0.154]

40 The parser is formulated as a state transition system (S, Ts, sinit, St), similarly to Nivre’s parser, but σ and β for a state s = (σ, β, A) ∈ S denote a stack of trees and a buffer of trees, respectively. [sent-74, score-0.412]

41 A tree ti ∈ T is defined as the tree rooted by the word wi, and the initial state is sinit = ([t0], [t1, . . . , tn], ∅). [sent-75, score-0.349]
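For concreteness, the initial state can be built as follows, representing each initial single-word tree just by its root index (an illustrative simplification; the real trees also carry their attached words):

```python
def initial_state(n):
    # sinit = ([t0], [t1, ..., tn], {}) for an n-word sentence;
    # index 0 stands for the artificial root w0.
    return [0], list(range(1, n + 1)), set()
```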

42 The state transitions Ts are decided through the following two steps. [sent-79, score-0.107]

43 Select the most probable head candidate (MPHC): For the tree ti located at the stack top, search for and select the MPHC for wj, which is the root word of tj located at the buffer head. [sent-81, score-1.296]

44 This procedure is denoted as a (Figure 1: Example of a tournament) [sent-82, score-0.094]

45 function mphc(ti, tj), and its details are explained in §3. [sent-83, score-0.039]

46 3): Left-Arc Make wj the head of wi and pop ti, where ti is at the stack top (denoted as σ|ti, with the tail being σ), when the buffer head is tj (denoted as tj|β). [sent-87, score-1.92]

47 Right-Arc Make the MPHC the head of wj, and pop the MPHC. [sent-88, score-0.23]

48 Shift Push the tree tj located at the buffer head onto the stack top. [sent-89, score-0.864]

49 These transitions correspond to three possibilities for the relation between ti and tj : (1) a word of ti is a dependent of a word of tj ; (2) a word of tj is a dependent of a word of ti; or (3) the two trees are not related. [sent-90, score-1.571]
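The two-step selection can be sketched as a single decision function, with trees as plain dicts carrying a "root" key; `mphc` and `classify` stand in for the two trained classifiers, and both helper names are assumptions for illustration:

```python
def decide(ti, tj, mphc, classify):
    # Step 1: select the most probable head candidate (MPHC) within the
    # stack-top tree ti for wj, the root word of the buffer-head tree tj.
    head_cand = mphc(ti, tj)
    # Step 2: a three-class classifier picks the transition.
    action = classify(ti, tj, head_cand)
    if action == "Left-Arc":
        return ("arc", tj["root"], ti["root"])  # wj becomes the head of wi
    if action == "Right-Arc":
        return ("arc", head_cand, tj["root"])   # the MPHC becomes the head of wj
    return ("shift", None, None)                # no relation between ti and tj
```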

50 The formulations of these transitions in the lower block of Table 1 correspond to Nivre’s transitions of the same name, except that here a transition is applied to a tree. [sent-91, score-0.32]

51 This enhancement from words to trees allows removal of both the Reduce transition and certain preconditions. [sent-92, score-0.236]

52 2 Selection of Most Probable Head Candidate By using mphc(ti, tj), a word located far from wj (the head of tj) can be selected as the head candidate in ti. [sent-94, score-0.873]

53 This selection process decreases the number of errors resulting from greedy decisions that consider only a few candidates. [sent-95, score-0.043]

54 One way is to apply the tournament procedure to the words in ti. [sent-97, score-0.233]

55 The tournament procedure was originally introduced for parsing methods in Japanese by (Iwatate et al. [sent-98, score-0.387]

56 Figure 2: Example of the transition Right-Arc. [sent-99, score-0.154]

57 Since the Japanese language has the headfinal property, the tournament model itself constitutes parsing, whereas for parsing a general projective language, the tournament model can only be used as part of a parsing algorithm. [sent-101, score-0.799]

58 Figure 1 shows a tournament for the example of “with,” where the word “watched” finally wins. [sent-102, score-0.207]

59 Although only the words on the left-hand side of tree tj are searched, this does not mean that the tree-based method considers only one side of a dependency relation. [sent-103, score-0.534]

60 For example, when we apply the tree-based parsing to Yamada’s method, the search problems on both sides are solved. [sent-104, score-0.154]

61 To implement mphc(ti, tj), a binary classifier is built to judge which of two given words is more appropriate as the head for another input word. [sent-105, score-0.226]

62 This classifier concerns three words, namely, the two words l (left) and r (right) in ti, whose appropriateness as the head is compared for the dependent wj. [sent-106, score-0.263]

63 All word pairs of l and r in ti are compared repeatedly in a “tournament,” and the survivor is regarded as the MPHC of wj. [sent-107, score-0.242]
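A linear pass of pairwise matches realizes such a tournament whenever the classifier's preference is consistent; here `better(l, r)` stands in for the trained binary classifier with the dependent wj fixed (a sketch, not the paper's exact procedure):

```python
def tournament_mphc(candidates, better):
    """candidates: word indices in ti, in any order; better(l, r) is True
    when l is the more appropriate head than r for the fixed dependent wj."""
    survivor = candidates[0]
    for challenger in candidates[1:]:
        if not better(survivor, challenger):
            survivor = challenger  # the winner advances to the next match
    return survivor                # the final survivor is taken as the MPHC
```

With a toy preference that always favors the larger index, `tournament_mphc([1, 3, 5], lambda l, r: l > r)` returns 5.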

64 The classifier is generated through learning of training examples for all ti and wj pairs, each of which generates examples comparing the true head and other (inappropriate) heads in ti. [sent-108, score-0.855]

65 Xleft means the dependents of X located on the left-hand side of X, while Xright means those on the right. [sent-111, score-0.129]

66 The feature design concerns three additional words occurring after wj, as well, denoted as wj+1, wj+2, wj+3. [sent-113, score-0.105]

67 3 Transition Selection A transition is selected by a three-class classifier after deciding the MPHC, as explained in §3. [sent-115, score-0.251]

68 The transition Shift indicates that the target trees ti and tj have no dependency relations. [sent-120, score-0.917]

69 The transition Right-Arc indicates generation of the dependent-head relation between wj and the result of mphc(ti, tj), i. [sent-121, score-0.56]

70 The transition Left-Arc indicates generation of the dependency relation in which wj is the head of wi. [sent-125, score-0.885]

71 The key to obtaining an accurate tree-based parsing model is to extend the search space while at the same time providing ways to narrow down the space and find important information, such as the MPHC, for proper judgment of transitions. [sent-127, score-0.176]

72 The dependency relation between the target trees is represented by the three words wi, MPHC, and wj. [sent-129, score-0.261]

73 Since this transition selection procedure presumes selection of the MPHC, the result of mphc(ti, tj) is also incorporated among the features. [sent-132, score-0.266]

74 1 Data and Experimental Setting In our experimental evaluation, we used Yamada’s head rule to extract unlabeled dependencies from the Wall Street Journal section of the Penn Treebank. [sent-134, score-0.168]

75 This test data. 1 The head word of wi can only be wj without searching within tj, because the relations between the other words in tj and wi have already been inferred from the decisions made within previous transitions. [sent-136, score-1.531]

76 If tj has a child wk that could become the head of wi under projectivity, this wk must be located between wi and wj . [sent-137, score-1.452]

77 The fact that wk ’s head is wj means that there were two phases before ti and tj (i. [sent-138, score-1.14]

78 , wi and wj) became the target: • ti and tk became the target, and Shift was selected. [sent-140, score-0.486]

79 • tk and tj became the target, and Left-Arc was selected. [sent-141, score-0.363]

80 The first phase precisely indicates that wi and wk are unrelated. [sent-142, score-0.256]

81 The binary classifier for MPHC selection and the three-class classifier for transition selection were built using a cubic polynomial kernel. [sent-145, score-0.412]

82 The parsing speed was evaluated on a Core2Duo (2. [sent-146, score-0.181]

83 2 Parsing Accuracy We measured the ratio of words assigned correct heads to all words (accuracy), and the ratio of sentences with completely correct dependency graphs to all sentences (complete match). [sent-149, score-0.192]

84 Table 4 compares our results for the proposed method with those reported in some previous works using equivalent training and test data. [sent-151, score-0.057]

85 Note that every method used different features, which depend on the method. [sent-154, score-0.034]

86 The proposed method achieved higher accuracy than did the previous deterministic models. [sent-155, score-0.222]

87 Although the accuracy of our method did not reach that of (McDonald and Pereira, 2006), the scores were competitive even though our method is deterministic. [sent-156, score-0.097]

88 3 Parsing Time Such extension of the search space also concerns the speed of the method. [sent-159, score-0.064]

89 We re-implemented Nivre’s method to use SVMs with a cubic polynomial kernel, similarly to ours. 2 http://svmlight. [sent-161, score-0.09]

90 Figure 3 shows plots of the parsing times for all sentences in the test data. [sent-172, score-0.154]

91 Although the worst-case time complexity for Nivre’s method is O(n) and that for our method is O(n2), worst-case situations (e. [sent-176, score-0.125]

92 , all words having heads on their left) did not appear frequently. [sent-178, score-0.035]

93 5 Conclusion We have proposed a tree-based model that decides head-dependency relations between trees instead of between words. [sent-180, score-0.094]

94 This extends the search space to obtain the best head for a word within a deterministic model. [sent-181, score-0.327]

95 The tree-based idea is potentially applicable to various previous parsing methods; in this paper, we have applied it to enhance Nivre’s method. [sent-182, score-0.154]

96 Our tree-based model outperformed various deterministic parsing methods reported previously. [sent-183, score-0.313]

97 Although the worst-case time complexity of our method is O(n2), the average parsing time is not much slower than O(n). [sent-184, score-0.294]

98 Three new probabilistic models for dependency parsing: An exploration. [sent-199, score-0.157]

99 Japanese dependency analysis using cascaded chunking. Proceedings of CoNLL, pp. [sent-218, score-0.157]

100 A tale of two parsers: investigating and combining graph-based and transitionbased dependency parsing using beamsearch. [sent-254, score-0.334]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mphc', 0.446), ('wj', 0.375), ('tj', 0.314), ('nivre', 0.28), ('ti', 0.219), ('tournament', 0.207), ('wi', 0.169), ('head', 0.168), ('deterministic', 0.159), ('dependency', 0.157), ('parsing', 0.154), ('transition', 0.154), ('stack', 0.131), ('located', 0.129), ('iwatate', 0.105), ('sinit', 0.105), ('buffer', 0.099), ('transitions', 0.083), ('ts', 0.078), ('projective', 0.077), ('yamada', 0.073), ('denoted', 0.068), ('wk', 0.064), ('pop', 0.062), ('japanese', 0.061), ('classifier', 0.058), ('covington', 0.052), ('probable', 0.052), ('action', 0.051), ('became', 0.049), ('trees', 0.048), ('decides', 0.046), ('yuji', 0.046), ('projectivity', 0.046), ('kumiko', 0.046), ('shift', 0.044), ('selection', 0.043), ('mcdonald', 0.042), ('preconditions', 0.042), ('push', 0.041), ('okyo', 0.039), ('explained', 0.039), ('joakim', 0.037), ('concerns', 0.037), ('oracle', 0.036), ('greedily', 0.036), ('incremental', 0.036), ('heads', 0.035), ('complexity', 0.035), ('enhancement', 0.034), ('koo', 0.034), ('formulated', 0.034), ('method', 0.034), ('candidate', 0.033), ('wn', 0.033), ('cubic', 0.032), ('kudo', 0.032), ('relation', 0.031), ('xavier', 0.031), ('considers', 0.029), ('accuracy', 0.029), ('possibilities', 0.029), ('matsumoto', 0.029), ('iwpt', 0.028), ('slower', 0.027), ('speed', 0.027), ('procedure', 0.026), ('satisfies', 0.025), ('tsh', 0.025), ('target', 0.025), ('ryan', 0.024), ('state', 0.024), ('polynomial', 0.024), ('works', 0.023), ('onto', 0.023), ('reduce', 0.023), ('procedural', 0.023), ('asahara', 0.023), ('masayuki', 0.023), ('hther', 0.023), ('scholz', 0.023), ('jl', 0.023), ('widening', 0.023), ('wofo', 0.023), ('esx', 0.023), ('hiroyasu', 0.023), ('ock', 0.023), ('esta', 0.023), ('tst', 0.023), ('kotaro', 0.023), ('survivor', 0.023), ('transitionbased', 0.023), ('aofn', 0.023), ('jw', 0.023), ('chooses', 0.023), ('time', 0.022), ('searching', 0.022), ('relational', 0.022), ('select', 0.022), ('parser', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average parsing time not much slower than O(n).

2 0.41370627 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

3 0.2145578 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification

Author: Martin Haulrich

Abstract: We show that using confidence-weighted classification in transition-based parsing gives results comparable to using SVMs with faster training and parsing time. We also compare with other online learning algorithms and investigate the effect of pruning features when using confidenceweighted classification.

4 0.15430735 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

Author: Liang Huang ; Kenji Sagae

Abstract: Incremental parsing techniques such as shift-reduce have gained popularity thanks to their efficiency, but there remains a major problem: the search is greedy and only explores a tiny fraction of the whole space (even with beam search) as opposed to dynamic programming. We show that, surprisingly, dynamic programming is in fact possible for many shift-reduce parsers, by merging “equivalent” stacks based on feature values. Empirically, our algorithm yields up to a five-fold speedup over a state-of-the-art shift-reduce depen- dency parser with no loss in accuracy. Better search also leads to better learning, and our final parser outperforms all previously reported dependency parsers for English and Chinese, yet is much faster.

5 0.13609797 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

Author: Wenbin Jiang ; Qun Liu

Abstract: In this paper we describe an intuitionistic method for dependency parsing, where a classifier is used to determine whether a pair of words forms a dependency edge. And we also propose an effective strategy for dependency projection, where the dependency relationships of the word pairs in the source language are projected to the word pairs of the target language, leading to a set of classification instances rather than a complete tree. Experiments show that, the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers. More importantly, when this clas- , sifier is integrated into a maximum spanning tree (MST) dependency parser, obvious improvement is obtained over the MST baseline.

6 0.13319482 99 acl-2010-Efficient Third-Order Dependency Parsers

7 0.12504096 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

8 0.12424743 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

9 0.10552665 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules

10 0.088880166 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

11 0.087821908 27 acl-2010-An Active Learning Approach to Finding Related Terms

12 0.085386731 143 acl-2010-Importance of Linguistic Constraints in Statistical Dependency Parsing

13 0.072166331 84 acl-2010-Detecting Errors in Automatically-Parsed Dependency Relations

14 0.070541918 69 acl-2010-Constituency to Dependency Translation with Forests

15 0.069717899 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood

16 0.067563206 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

17 0.066895768 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

18 0.06181879 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

19 0.060736716 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation

20 0.06037166 3 acl-2010-A Bayesian Method for Robust Estimation of Distributional Similarities


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.176), (1, -0.025), (2, 0.095), (3, 0.021), (4, -0.063), (5, -0.104), (6, 0.059), (7, 0.002), (8, -0.066), (9, 0.309), (10, -0.303), (11, 0.011), (12, -0.08), (13, 0.203), (14, 0.182), (15, -0.072), (16, -0.036), (17, -0.019), (18, -0.136), (19, 0.01), (20, 0.005), (21, -0.096), (22, -0.018), (23, -0.001), (24, 0.071), (25, -0.134), (26, -0.061), (27, -0.036), (28, 0.009), (29, 0.077), (30, 0.031), (31, 0.026), (32, 0.037), (33, 0.016), (34, -0.092), (35, 0.042), (36, 0.137), (37, 0.001), (38, 0.012), (39, 0.136), (40, 0.032), (41, 0.04), (42, 0.005), (43, -0.012), (44, -0.039), (45, 0.012), (46, -0.081), (47, 0.084), (48, 0.045), (49, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96090192 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average pars- ing time not much slower than O(n).

2 0.93168622 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

Author: Carlos Gomez-Rodriguez ; Joakim Nivre

Abstract: Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing. In this paper, we present a transition system for 2-planar dependency trees trees that can be decomposed into at most two planar graphs and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a stateof-the-art transition-based parser on four data sets from the CoNLL-X shared task. In addition, we present an efficient method – – for determining whether an arbitrary tree is 2-planar and show that 99% or more of the trees in existing treebanks are 2-planar.

3 0.68928701 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification

Author: Martin Haulrich

Abstract: We show that using confidence-weighted classification in transition-based parsing gives results comparable to using SVMs with faster training and parsing time. We also compare with other online learning algorithms and investigate the effect of pruning features when using confidenceweighted classification.

4 0.63337559 99 acl-2010-Efficient Third-Order Dependency Parsers

Author: Terry Koo ; Michael Collins

Abstract: We present algorithms for higher-order dependency parsing that are “third-order” in the sense that they can evaluate substructures containing three dependencies, and “efficient” in the sense that they require only O(n4) time. Importantly, our new parsers can utilize both sibling-style and grandchild-style interactions. We evaluate our parsers on the Penn Treebank and Prague Dependency Treebank, achieving unlabeled attachment scores of 93.04% and 87.38%, respectively.

5 0.62542444 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

Author: Manabu Sassano ; Sadao Kurohashi

Abstract: We investigate active learning methods for Japanese dependency parsing. We propose active learning methods of using partial dependency relations in a given sentence for parsing and evaluate their effectiveness empirically. Furthermore, we utilize syntactic constraints of Japanese to obtain more labeled examples from precious labeled ones that annotators give. Experimental results show that our proposed methods improve considerably the learning curve of Japanese dependency parsing. In order to achieve an accuracy of over 88.3%, one of our methods requires only 34.4% of labeled examples as compared to passive learning.

6 0.61301535 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

7 0.59015495 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

8 0.50772983 143 acl-2010-Importance of Linguistic Constraints in Statistical Dependency Parsing

9 0.49706119 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation

10 0.48057556 3 acl-2010-A Bayesian Method for Robust Estimation of Distributional Similarities

11 0.39753494 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

12 0.36592868 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

13 0.35722676 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

14 0.35558757 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two

15 0.33816478 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules

16 0.3234742 84 acl-2010-Detecting Errors in Automatically-Parsed Dependency Relations

17 0.30406788 130 acl-2010-Hard Constraints for Grammatical Function Labelling

18 0.30181214 69 acl-2010-Constituency to Dependency Translation with Forests

19 0.29944566 124 acl-2010-Generating Image Descriptions Using Dependency Relational Patterns

20 0.29110882 214 acl-2010-Sparsity in Dependency Grammar Induction


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(14, 0.035), (25, 0.034), (59, 0.044), (73, 0.026), (76, 0.039), (78, 0.018), (83, 0.043), (98, 0.651)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.99127924 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach

Author: Yabin Zheng ; Zhiyuan Liu ; Lixing Xie

Abstract: Motivated by Google Sets, we study the problem of growing related words from a single seed word by leveraging user behaviors hiding in user records of Chinese input method. Our proposed method is motivated by the observation that the more frequently two words cooccur in user records, the more related they are. First, we utilize user behaviors to generate candidate words. Then, we utilize search engine to enrich candidate words with adequate semantic features. Finally, we reorder candidate words according to their semantic relatedness to the seed word. Experimental results on a Chinese input method dataset show that our method gains better performance. 1

same-paper 2 0.99124563 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -

Author: Kotaro Kitagawa ; Kumiko Tanaka-Ishii

Abstract: Nivre’s method was improved by enhancing deterministic dependency parsing through application of a tree-based model. The model considers all words necessary for selection of parsing actions by including words in the form of trees. It chooses the most probable head candidate from among the trees and uses this candidate to select a parsing action. In an evaluation experiment using the Penn Treebank (WSJ section), the proposed model achieved higher accuracy than did previous deterministic models. Although the proposed model’s worst-case time complexity is O(n2), the experimental results demonstrated an average pars- ing time not much slower than O(n).

3 0.98543841 221 acl-2010-Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish

Author: Reyyan Yeniterzi ; Kemal Oflazer

Abstract: We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures. Our approach relies on syntactic analysis on the source side (English) and then encodes a wide variety of local and non-local syntactic structures as complex structural tags which appear as additional factors in the training data. On the target side (Turkish), we only perform morphological analysis and disambiguation but treat the complete complex morphological tag as a factor, instead of separating morphemes. We incrementally explore capturing various syntactic substructures as complex tags on the English side, and evaluate how our translations improve in BLEU scores. Our maximal set of source and target side transformations, coupled with some additional techniques, provide an 39% relative improvement from a baseline 17.08 to 23.78 BLEU, all averaged over 10 training and test sets. Now that the syntactic analysis on the English side is available, we also experiment with more long distance constituent reordering to bring the English constituent order close to Turkish, but find that these transformations do not provide any additional consistent tangible gains when averaged over the 10 sets.

4 0.98064178 27 acl-2010-An Active Learning Approach to Finding Related Terms

Author: David Vickrey ; Oscar Kipersztok ; Daphne Koller

Abstract: We present a novel system that helps nonexperts find sets of similar words. The user begins by specifying one or more seed words. The system then iteratively suggests a series of candidate words, which the user can either accept or reject. Current techniques for this task typically bootstrap a classifier based on a fixed seed set. In contrast, our system involves the user throughout the labeling process, using active learning to intelligently explore the space of similar words. In particular, our system can take advantage of negative examples provided by the user. Our system combines multiple preexisting sources of similarity data (a standard thesaurus, WordNet, contextual similarity), enabling it to capture many types of similarity groups (“synonyms of crash,” “types of car,” etc.). We evaluate on a hand-labeled evaluation set; our system improves over a strong baseline by 36%.
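The accept/reject loop the abstract describes can be sketched minimally as follows. The similarity function, vocabulary, and judge below are toy stand-ins (assumptions for illustration), not the paper's combined thesaurus/WordNet/contextual similarity sources.

```python
def suggest_terms(seed, candidates, similarity, judge, rounds=10):
    """Iteratively propose the best-scoring candidate word; the user's
    judge(word) -> bool feeds accepted words back as new positives and
    rejected words as negatives, so later suggestions use both."""
    positives, negatives = set(seed), set()
    for _ in range(rounds):
        pool = [w for w in candidates if w not in positives | negatives]
        if not pool:
            break

        # Score = average attraction to positives minus average
        # repulsion from negatives (one simple way to use rejections).
        def score(w):
            pos = sum(similarity(w, p) for p in positives) / len(positives)
            neg = (sum(similarity(w, n) for n in negatives) / len(negatives)
                   if negatives else 0.0)
            return pos - neg

        best = max(pool, key=score)
        (positives if judge(best) else negatives).add(best)
    return positives - set(seed)
```

A rejected candidate lowers the score of everything similar to it, which is how negative examples steer the search away from whole clusters of bad suggestions rather than just one word.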

5 0.97441971 201 acl-2010-Pseudo-Word for Phrase-Based Machine Translation

Author: Xiangyu Duan ; Min Zhang ; Haizhou Li

Abstract: The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus. But words appear to be too fine-grained in some cases, such as non-compositional phrasal equivalences where no clear word alignments exist, so using words as inputs to the PB-SMT pipeline has an inherent deficiency. This paper proposes the pseudo-word as a new starting point for the PB-SMT pipeline. A pseudo-word is a kind of basic multi-word expression that characterizes a minimal sequence of consecutive words in the sense of translation. By casting the pseudo-word searching problem into a parsing framework, we search for pseudo-words in a monolingual way and a bilingual synchronous way. Experiments show that pseudo-words significantly outperform words for the PB-SMT model in both the travel translation domain and the news translation domain.

6 0.97185266 8 acl-2010-A Hybrid Hierarchical Model for Multi-Document Summarization

7 0.96336025 24 acl-2010-Active Learning-Based Elicitation for Semi-Supervised Word Alignment

8 0.93738365 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures

9 0.93728024 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

10 0.9138279 232 acl-2010-The S-Space Package: An Open Source Package for Word Space Models

11 0.90578562 90 acl-2010-Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages

12 0.86890262 77 acl-2010-Cross-Language Document Summarization Based on Machine Translation Quality Prediction

13 0.86535728 52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

14 0.86253577 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification

15 0.85545838 79 acl-2010-Cross-Lingual Latent Topic Extraction

16 0.84194052 37 acl-2010-Automatic Evaluation Method for Machine Translation Using Noun-Phrase Chunking

17 0.84079343 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization

18 0.83924925 262 acl-2010-Word Alignment with Synonym Regularization

19 0.83492815 133 acl-2010-Hierarchical Search for Word Alignment

20 0.82979155 110 acl-2010-Exploring Syntactic Structural Features for Sub-Tree Alignment Using Bilingual Tree Kernels