acl acl2011 acl2011-309 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yue Zhang ; Joakim Nivre
Abstract: Transition-based dependency parsers generally use heuristic decoding algorithms but can accommodate arbitrarily rich feature representations. In this paper, we show that we can improve the accuracy of such parsers by considering even richer feature sets than those employed in previous systems. In the standard Penn Treebank setup, our novel features improve attachment score from 91.4% to 92.9%, giving the best results so far for transition-based parsing and rivaling the best results overall. For the Chinese Treebank, they give a significant improvement over the state of the art. An open source release of our parser is freely available.
Reference: text
sentIndex sentText sentNum sentScore
1 Transition-based Dependency Parsing with Rich Non-local Features Yue Zhang University of Cambridge Computer Laboratory [sent-1, score-0.104]
2 Abstract: Transition-based dependency parsers generally use heuristic decoding algorithms but can accommodate arbitrarily rich feature representations. [sent-5, score-0.488]
3 In this paper, we show that we can improve the accuracy of such parsers by considering even richer feature sets than those employed in previous systems. [sent-6, score-0.166]
4 In the standard Penn Treebank setup, our novel features improve attachment score from 91. [sent-7, score-0.216]
5 9%, giving the best results so far for transition-based parsing and rivaling the best results overall. [sent-9, score-0.228]
6 An open source release of our parser is freely available. [sent-11, score-0.202]
7 1 Introduction: Transition-based dependency parsing (Yamada and Matsumoto, 2003; Nivre et al. [sent-12, score-0.39]
8 Compared to graph-based dependency parsing, it typically offers linear time complexity and the comparative freedom to define non-local features, as exemplified by the comparison between MaltParser and MSTParser (Nivre et al. [sent-14, score-0.308]
9 In the aspect of decoding, beam-search (Johansson and Nugues, 2007; Zhang and Clark, 2008; Huang et al. [sent-18, score-0.052]
10 In the aspect of training, global structural learning has been used to replace local learning on each decision (Zhang and Clark, 2008; Huang et al. [sent-23, score-0.106]
11 , 2009), although the effect of global learning has not been separated out and studied alone. [sent-24, score-0.093]
12 In this short paper, we study a third aspect in a statistical system: feature definition. [sent-25, score-0.139]
13 Representing the type of information a statistical system uses to make predictions, feature templates can be one of the most important factors determining parsing accuracy. [sent-26, score-0.431]
14 Various recent attempts have been made to include non-local features into graph-based dependency parsing (Smith and Eisner, 2008; Martins et al. [sent-27, score-0.44]
15 Transition-based parsing, by contrast, can easily accommodate arbitrarily complex representations involving non-local features. [sent-29, score-0.168]
16 Complex non-local features, such as bracket matching and rhythmic patterns, are used in transition-based constituency parsing (Zhang and Clark, 2009; Wang et al. [sent-30, score-0.147]
17 , 2006), and most transition-based dependency parsers incorporate some non-local features, but current practice is nevertheless to use a rather restricted set of features, as exemplified by the default feature models in MaltParser (Nivre et al. [sent-31, score-0.644]
18 We explore considerably richer feature representations and show that they improve parsing accuracy significantly. [sent-33, score-0.234]
19 In standard experiments using the Penn Treebank, our parser gets an unlabeled attachment score of 92. [sent-34, score-0.436]
20 9%, which is the best result achieved with a transition-based parser and comparable to the state of the art. [sent-35, score-0.166]
21 For the Chinese Treebank, our parser gets a score of 86. [sent-36, score-0.166]
22 2 The Transition-based Parsing Algorithm: In a typical transition-based parsing process, the input words are put into a queue and partially built structures are organized by a stack. [sent-40, score-0.322]
23 A set of shift-reduce actions is defined; these consume words from the queue and build the output parse. [sent-41, score-0.228]
24 Recent research has focused on action sets that build projective dependency trees in an arc-eager (Nivre et al. [sent-42, score-0.368]
25 (2009) and use the generalized perceptron (Collins, 2002) for global learning and beam-search for decoding. [sent-46, score-0.054]
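As a rough illustration of how global perceptron learning pairs with beam-search decoding, the sketch below scores whole action sequences and keeps the top hypotheses at every step. It is a minimal Python sketch under stated assumptions: legal_actions, apply_action and feats are hypothetical callables, configurations are assumed to expose is_terminal(), and only the beam size of 64 is taken from the experiments reported below; this is not the authors' released implementation.

```python
# Schematic beam-search decoding with global (summed) perceptron scores.
# legal_actions, apply_action and feats are assumed interfaces.
def beam_search(init, weights, legal_actions, apply_action, feats, beam_size=64):
    beam = [(0.0, init)]  # (global score, configuration) hypotheses
    while any(not cfg.is_terminal() for _, cfg in beam):
        candidates = []
        for score, cfg in beam:
            if cfg.is_terminal():
                candidates.append((score, cfg))
                continue
            for action in legal_actions(cfg):
                # Global learning: a hypothesis scores the sum of weighted
                # features over its entire action sequence so far.
                gain = sum(weights.get(f, 0.0) for f in feats(cfg, action))
                candidates.append((score + gain, apply_action(cfg, action)))
        # Keep only the top-scoring hypotheses (the beam).
        beam = sorted(candidates, key=lambda x: -x[0])[:beam_size]
    return max(beam, key=lambda x: x[0])[1]
```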
26 Unlike both earlier global-learning parsers, which only perform unlabeled parsing, we perform labeled parsing by augmenting the LeftArc and RightArc actions with the set of dependency labels. [sent-47, score-0.585]
27 Hence our work is in line with Titov and Henderson (2007) in using labeled transitions with global learning. [sent-48, score-0.092]
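To make the stack/queue process and the labeled LeftArc/RightArc actions concrete, here is a minimal sketch of a labeled arc-eager transition system. The Config class and action functions are illustrative assumptions, not the authors' released parser; word indices stand in for words.

```python
# A minimal labeled arc-eager transition system (illustrative sketch).
class Config:
    def __init__(self, n_words):
        self.stack = []                    # S: partially processed words
        self.queue = list(range(n_words))  # N: incoming words
        self.arcs = {}                     # A: dependent -> (head, label)

    def is_terminal(self):
        return not self.queue

def shift(c):
    """Push the front of the queue onto the stack."""
    c.stack.append(c.queue.pop(0))

def reduce_(c):
    """Pop the stack; legal only if the top already has a head."""
    assert c.stack[-1] in c.arcs
    c.stack.pop()

def left_arc(c, label):
    """Labeled LeftArc: N0 becomes the head of S0, which is popped."""
    s0 = c.stack.pop()
    c.arcs[s0] = (c.queue[0], label)

def right_arc(c, label):
    """Labeled RightArc: S0 becomes the head of N0, which is pushed."""
    n0 = c.queue.pop(0)
    c.arcs[n0] = (c.stack[-1], label)
    c.stack.append(n0)
```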
28 3 Feature Templates: At each step during a parsing process, the parser configuration can be represented by a tuple ⟨S, N, A⟩, where S is the stack, N is the queue of incoming words, and A is the set of dependency arcs that have been built. [sent-50, score-0.769]
29 Denoting the top of the stack (footnote 1: It is very likely that the type of features explored in this paper would be beneficial also for the arc-standard system, although the exact same feature templates would not be applicable because of differences in the parsing order.) [sent-51, score-0.63]
30 Table 2: New feature templates. [sent-54, score-0.087]
31 w – word; p – POS-tag; vl, vr – left/right valency; l – dependency label; sl, sr – label set. [sent-55, score-0.349]
32 with S0, the front items from the queue with N0, N1, and N2, the head of S0 (if any) with S0h, the leftmost and rightmost modifiers of S0 (if any) with S0l and S0r, respectively, and the leftmost modifier of N0 (if any) with N0l, the baseline features are shown in Table 1. [sent-56, score-0.911]
33 These features are mostly taken from Zhang and Clark (2008) and Huang and Sagae (2010), and our parser reproduces the same accuracies as reported by both papers. [sent-57, score-0.268]
34 For example, S0pN0wp represents the feature template that takes the word and POS-tag of N0, and combines it with the POS-tag of S0. [sent-59, score-0.087]
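As a sketch of this notation, the function below reads atomic attributes off a configuration of the kind sketched above and concatenates them into template strings such as S0pN0wp. The string encoding of features is an assumption for illustration, not the parser's internal representation.

```python
NIL = "<nil>"  # assumed placeholder for an empty stack or queue position

def baseline_features(c, words, tags):
    s0 = c.stack[-1] if c.stack else None
    n0 = c.queue[0] if c.queue else None
    s0w = words[s0] if s0 is not None else NIL
    s0p = tags[s0] if s0 is not None else NIL
    n0w = words[n0] if n0 is not None else NIL
    n0p = tags[n0] if n0 is not None else NIL
    return [
        f"S0wp={s0w}/{s0p}",           # from single words
        f"N0wp={n0w}/{n0p}",
        f"S0pN0wp={s0p}|{n0w}/{n0p}",  # POS of S0 with word+POS of N0
    ]
```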
35 In this short paper, we extend the baseline feature templates with the following. Distance between S0 and N0: Direction and distance between a pair of head and modifier have been used in the standard feature templates for maximum spanning tree parsing (McDonald et al. [sent-60, score-1.013]
36 Distance information has also been used in the easy-first parser of Goldberg and Elhadad (2010). [sent-62, score-0.166]
37 We add the distance between S0 and N0 to the feature set by combining it with the word and POS-tag of S0 and N0, as shown in Table 2. [sent-64, score-0.162]
38 It is worth noting that the use of distance information in our transition-based model is different from that in a typical graph-based parser such as MSTParser. [sent-65, score-0.241]
39 The distance between S0 and N0 will correspond to the distance between a pair of head and modifier when a LeftArc action is taken, for example, but not when a Shift action is taken. [sent-66, score-0.499]
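A minimal sketch of the distance templates (the d attribute in Table 2), combining the S0-to-N0 distance with the word and POS-tag of S0 and N0. Capping long distances into a single bucket is an assumption; the paper does not spell out the exact encoding here.

```python
def distance_features(c, words, tags):
    if not c.stack or not c.queue:
        return []
    s0, n0 = c.stack[-1], c.queue[0]
    d = min(n0 - s0, 5)  # assumed bucketing of long distances
    return [
        f"S0wd={words[s0]}|{d}",
        f"S0pd={tags[s0]}|{d}",
        f"N0wd={words[n0]}|{d}",
        f"N0pd={tags[n0]}|{d}",
        f"S0wN0wd={words[s0]}|{words[n0]}|{d}",
        f"S0pN0pd={tags[s0]}|{tags[n0]}|{d}",
    ]
```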
40 Valency of S0 and N0: The number of modifiers to a given head is used by the graph-based submodel of Zhang and Clark (2008) and the models of Martins et al. [sent-67, score-0.243]
41 In particular, we calculate the number of left and right modifiers separately, calling them left valency and right valency, respectively. [sent-70, score-0.419]
42 Left and right valencies are represented by vl and vr in Table 2, respectively. [sent-71, score-0.117]
43 They are combined with the word and POS-tag of S0 and N0 to form new feature templates. [sent-72, score-0.087]
44 Again, the use of valency information in our transition-based parser is different from the aforementioned graph-based models. [sent-73, score-0.476]
45 In our case, valency information is put into the context of the shift-reduce process, and used together with each action to give a score to the local decision. [sent-74, score-0.373]
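A sketch of the valency templates (vl, vr in Table 2): counts of left and right modifiers attached so far, read off at each step of the shift-reduce process. The mods bookkeeping (word index to list of attached modifier indices) is an assumed extension of the configuration sketched earlier.

```python
def valency_features(c, words, tags, mods):
    feats = []
    if c.stack:
        s0 = c.stack[-1]
        vl = sum(1 for m in mods.get(s0, []) if m < s0)  # left valency
        vr = sum(1 for m in mods.get(s0, []) if m > s0)  # right valency
        feats += [f"S0wvl={words[s0]}|{vl}", f"S0pvl={tags[s0]}|{vl}",
                  f"S0wvr={words[s0]}|{vr}", f"S0pvr={tags[s0]}|{vr}"]
    if c.queue:
        n0 = c.queue[0]
        vl = sum(1 for m in mods.get(n0, []) if m < n0)  # N0: left mods only
        feats += [f"N0wvl={words[n0]}|{vl}", f"N0pvl={tags[n0]}|{vl}"]
    return feats
```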
46 Unigram information for S0h, S0l, S0r and N0l: The head and the leftmost/rightmost modifiers of S0, and the leftmost modifier of N0, have been used by most arc-eager transition-based parsers we are aware of, through the combination of their POS-tag with information from S0 and N0. [sent-75, score-0.466]
47 Such use is exemplified by the feature templates “from three words” in Table 1. [sent-76, score-0.349]
48 We further use their word and POS-tag information as “unigram” features in Table 2. [sent-77, score-0.05]
49 Moreover, we include the dependency label information in the unigram features, represented by l in the table. [sent-78, score-0.318]
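A sketch of the unigram templates: the word, POS-tag and dependency label (l) of S0h, S0l, S0r and N0l. The heads, labels and mods dictionaries record the arcs built so far and are assumed bookkeeping, as above.

```python
def unigram_features(c, words, tags, heads, labels, mods):
    def leftmost(i):
        left = [m for m in mods.get(i, []) if m < i]
        return min(left) if left else None
    def rightmost(i):
        right = [m for m in mods.get(i, []) if m > i]
        return max(right) if right else None
    s0 = c.stack[-1] if c.stack else None
    n0 = c.queue[0] if c.queue else None
    targets = [("S0h", heads.get(s0)),
               ("S0l", leftmost(s0) if s0 is not None else None),
               ("S0r", rightmost(s0) if s0 is not None else None),
               ("N0l", leftmost(n0) if n0 is not None else None)]
    feats = []
    for name, i in targets:
        if i is not None:
            feats += [f"{name}w={words[i]}", f"{name}p={tags[i]}",
                      f"{name}l={labels[i]}"]
    return feats
```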
50 Third-order features of S0 and N0: Higher-order context features have been used by graph-based dependency parsers to improve accuracies (Carreras, 2007; Koo and Collins, 2010). [sent-81, score-0.474]
51 We include information about third-order dependency arcs in our new feature templates, when available. [sent-82, score-0.368]
52 In Table 2, S0h2, S0l2, S0r2 and N0l2 refer to the head of S0h, the second leftmost modifier and the second rightmost modifier of S0, and the second leftmost modifier of N0, respectively. [sent-83, score-0.824]
53 The new templates include unigram word, POS-tag and dependency labels of S0h2, S0l2, S0r2 and N0l2, as well as POS-tag combinations with S0 and N0. [sent-84, score-0.515]
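A sketch of the third-order templates: the grand-head S0h2 and the second leftmost/rightmost modifiers S0l2, S0r2 and N0l2, used here as unigrams (their POS combinations with S0 and N0 would be built analogously). Bookkeeping as in the previous sketches.

```python
def third_order_features(c, words, tags, heads, labels, mods):
    def second_left(i):
        left = sorted(m for m in mods.get(i, []) if m < i)
        return left[1] if len(left) >= 2 else None     # second leftmost
    def second_right(i):
        right = sorted(m for m in mods.get(i, []) if m > i)
        return right[-2] if len(right) >= 2 else None  # second rightmost
    s0 = c.stack[-1] if c.stack else None
    n0 = c.queue[0] if c.queue else None
    s0h = heads.get(s0)
    targets = [("S0h2", heads.get(s0h)),  # head of the head of S0
               ("S0l2", second_left(s0) if s0 is not None else None),
               ("S0r2", second_right(s0) if s0 is not None else None),
               ("N0l2", second_left(n0) if n0 is not None else None)]
    feats = []
    for name, i in targets:
        if i is not None:
            feats += [f"{name}w={words[i]}", f"{name}p={tags[i]}",
                      f"{name}l={labels[i]}"]
    return feats
```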
54 Set of dependency labels with S0 and N0: As a more global feature, we include the set of unique dependency labels from the modifiers of S0 and N0. [sent-85, score-0.649]
55 This information is combined with the word and POS-tag of S0 and N0 to make feature templates. [sent-86, score-0.087]
56 In Table 2, sl and sr stand for the set of labels on the left and right of the head, respectively. [sent-87, score-0.086]
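A sketch of the label-set templates (sl, sr): the set of unique dependency labels on the left or right modifiers of S0 and N0, combined with their word and POS-tag. The string encoding of the set is an assumption.

```python
def label_set_features(c, words, tags, labels, mods):
    def label_set(i, left_side):
        picked = [m for m in mods.get(i, []) if (m < i) == left_side]
        return "|".join(sorted({labels[m] for m in picked}))
    feats = []
    if c.stack:
        s0 = c.stack[-1]
        feats += [f"S0wsl={words[s0]}|{label_set(s0, True)}",
                  f"S0psl={tags[s0]}|{label_set(s0, True)}",
                  f"S0wsr={words[s0]}|{label_set(s0, False)}",
                  f"S0psr={tags[s0]}|{label_set(s0, False)}"]
    if c.queue:
        n0 = c.queue[0]
        feats += [f"N0wsl={words[n0]}|{label_set(n0, True)}",
                  f"N0psl={tags[n0]}|{label_set(n0, True)}"]
    return feats
```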
57 Bracketed sentences from PTB were transformed into dependency formats using the Penn2Malt tool. [sent-90, score-0.243]
58 For all experiments, we set the beam size to 64 for the parser, and report unlabeled and labeled attachment scores (UAS, LAS) and unlabeled exact match (UEM) for evaluation. [sent-94, score-0.457]
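For reference, a small sketch of the three measures, assuming each parse is given as a list of (head, label) pairs, one per word; UEM counts sentences whose unlabeled heads are all correct.

```python
def evaluate(gold_sents, pred_sents):
    n_words = correct_u = correct_l = exact = 0
    for gold, pred in zip(gold_sents, pred_sents):
        u = sum(g[0] == p[0] for g, p in zip(gold, pred))
        n_words += len(gold)
        correct_u += u
        correct_l += sum(g == p for g, p in zip(gold, pred))
        exact += (u == len(gold))
    return {"UAS": correct_u / n_words,        # unlabeled attachment score
            "LAS": correct_l / n_words,        # labeled attachment score
            "UEM": exact / len(gold_sents)}    # unlabeled exact match
```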
59 UAS = unlabeled attachment score; UEM = unlabeled exact match. [sent-102, score-0.419]
60 UAS = unlabeled attachment score; UEM = unlabeled exact match; LAS = labeled attachment score. [sent-103, score-0.519]
61 4.1 Development Experiments: Table 3 shows the effect of the new features on the development test data for English. [sent-105, score-0.089]
62 We start with the baseline features in Table 1, and incrementally add the distance, valency, unigram, third-order and label set feature templates in Table 2. [sent-106, score-0.334]
63 Each group of new feature templates improved the accuracies over the previous system, and the final accuracy with all new features was 93. [sent-107, score-0.386]
64 4.2 Final Test Results: Table 4 shows the final test results of our parser for English. [sent-110, score-0.166]
65 Our extended parser significantly outperformed the baseline parser, achieving [sent-113, score-0.166]
66 UAS = unlabeled attachment score; UEM = unlabeled exact match; LAS = labeled attachment score. [sent-121, score-0.623]
67 the highest attachment score reported for a transition-based parser, comparable to those of the best graph-based parsers. [sent-122, score-0.166]
68 The speed of our baseline parser was 50 sentences per second. [sent-124, score-0.202]
69 With all new features added, the speed dropped to 29 sentences per second. [sent-125, score-0.086]
70 As an alternative to Penn2Malt, bracketed sentences can also be transformed into Stanford dependencies (De Marneffe et al. [sent-126, score-0.094]
71 1% UEM when trained and evaluated on Stanford basic dependencies, which are projective dependency trees. [sent-131, score-0.305]
72 (2010) report results on Stanford collapsed dependencies, which allow a word to have multiple heads and therefore cannot be produced by a regular dependency parser. [sent-133, score-0.243]
73 4.3 Chinese Test Results: Table 5 shows the results of our final parser, the pure transition-based parser of Zhang and Clark (2008), and the parser of Huang and Sagae (2010) on Chinese. [sent-136, score-0.045]
74 5 Conclusion: We have shown that enriching the feature representation significantly improves the accuracy of our transition-based dependency parser. [sent-139, score-0.33]
75 The effect of the new features appears to outweigh the effect of combining transition-based and graph-based models, reported by Zhang and Clark (2008), as well as the effect of using dynamic programming, as in Huang and Sagae (2010). [sent-140, score-0.214]
76 This shows that feature definition is a crucial aspect of transition-based parsing. [sent-141, score-0.139]
77 In fact, some of the new feature templates in this paper, such as distance and valency, are among those used in the graph-based submodel of Zhang and Clark (2008), but not in the transition-based submodel. [sent-142, score-0.427]
78 Therefore our new features to some extent achieved the same effect as their model combination. [sent-143, score-0.089]
79 The new features are also hard to use in dynamic programming because they add considerable complexity to the parse items. [sent-144, score-0.16]
80 Enriched feature representations have been studied as an important factor for improving the accuracies of graph-based dependency parsing as well. [sent-145, score-0.477]
81 Recent research, including the use of loopy belief propagation (Smith and Eisner, 2008), integer linear programming (Martins et al. [sent-146, score-0.101]
82 , 2009) and an improved dynamic programming algorithm (Koo and Collins, 2010), can be seen as methods to incorporate non-local features into a graph-based model. [sent-147, score-0.249]
83 An open source release of our parser, together with trained models for English and Chinese, are freely available. [sent-148, score-0.036]
84 Parsing to Stanford dependencies: Trade-offs between speed and accuracy. [sent-158, score-0.036]
85 Dependency parsing and domain adaptation with LR models and parser ensembles. [sent-228, score-0.313]
86 A tale of two parsers: investigating and combining graph-based and transition-based dependency parsing using beam-search. [sent-249, score-0.39]
87 Transition-based parsing of the Chinese Treebank using a global discriminative model. [sent-253, score-0.201]
wordName wordTfidf (topN-words)
[('valency', 0.31), ('dependency', 0.243), ('sagae', 0.223), ('templates', 0.197), ('clark', 0.18), ('queue', 0.175), ('nivre', 0.166), ('attachment', 0.166), ('parser', 0.166), ('uem', 0.157), ('modifier', 0.157), ('ftarc', 0.155), ('zhang', 0.148), ('parsing', 0.147), ('koo', 0.145), ('huang', 0.139), ('leftmost', 0.121), ('modifiers', 0.109), ('unlabeled', 0.104), ('stack', 0.104), ('yue', 0.104), ('uas', 0.091), ('mcdonald', 0.091), ('nonlocal', 0.089), ('joakim', 0.088), ('las', 0.087), ('maltparser', 0.087), ('feature', 0.087), ('martins', 0.081), ('treebank', 0.081), ('transitionbased', 0.081), ('parsers', 0.079), ('rightarc', 0.078), ('uasuemlas', 0.078), ('unigram', 0.075), ('distance', 0.075), ('collins', 0.074), ('submodel', 0.068), ('front', 0.067), ('head', 0.066), ('iwpt', 0.065), ('exemplified', 0.065), ('chinese', 0.064), ('marneffe', 0.064), ('programming', 0.063), ('action', 0.063), ('vr', 0.063), ('pops', 0.063), ('projective', 0.062), ('yamada', 0.06), ('hoef', 0.056), ('prague', 0.056), ('czech', 0.055), ('global', 0.054), ('pushes', 0.054), ('vl', 0.054), ('actions', 0.053), ('accuracies', 0.052), ('bracketed', 0.052), ('aspect', 0.052), ('features', 0.05), ('cer', 0.05), ('ctb', 0.048), ('mstparser', 0.048), ('dynamic', 0.047), ('smith', 0.046), ('exact', 0.045), ('penn', 0.045), ('pure', 0.045), ('titov', 0.045), ('rightmost', 0.045), ('ptb', 0.045), ('shi', 0.045), ('accommodate', 0.043), ('carreras', 0.043), ('sl', 0.043), ('sr', 0.043), ('le', 0.043), ('xavier', 0.042), ('dependencies', 0.042), ('goldberg', 0.042), ('terry', 0.041), ('johan', 0.041), ('conll', 0.04), ('johansson', 0.04), ('uppsala', 0.04), ('effect', 0.039), ('arcs', 0.038), ('belief', 0.038), ('jens', 0.038), ('hawaii', 0.038), ('ryan', 0.038), ('labeled', 0.038), ('removes', 0.038), ('matsumoto', 0.036), ('speed', 0.036), ('freely', 0.036), ('arbitrarily', 0.036), ('row', 0.036), ('kenji', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 309 acl-2011-Transition-based Dependency Parsing with Rich Non-local Features
Author: Yue Zhang ; Joakim Nivre
Abstract: Transition-based dependency parsers generally use heuristic decoding algorithms but can accommodate arbitrarily rich feature representations. In this paper, we show that we can improve the accuracy of such parsers by considering even richer feature sets than those employed in previous systems. In the standard Penn Treebank setup, our novel features improve attachment score from 91.4% to 92.9%, giving the best results so far for transition-based parsing and rivaling the best results overall. For the Chinese Treebank, they give a significant improvement over the state of the art. An open source release of our parser is freely available.
2 0.30236697 282 acl-2011-Shift-Reduce CCG Parsing
Author: Yue Zhang ; Stephen Clark
Abstract: CCGs are directly compatible with binary-branching bottom-up parsing algorithms, in particular CKY and shift-reduce algorithms. While the chart-based approach has been the dominant approach for CCG, the shift-reduce method has been little explored. In this paper, we develop a shift-reduce CCG parser using a discriminative model and beam search, and compare its strengths and weaknesses with the chart-based C&C parser. We study different errors made by the two parsers, and show that the shift-reduce parser gives competitive accuracies compared to C&C. Considering our use of a small beam, and given the high ambiguity levels in an automatically-extracted grammar and the amount of information in the CCG lexical categories which form the shift actions, this is a surprising result.
3 0.29905745 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing
Author: Jinho D. Choi ; Martha Palmer
Abstract: This paper suggests two ways of improving transition-based, non-projective dependency parsing. First, we add a transition to an existing non-projective parsing algorithm, so it can perform either projective or non-projective parsing as needed. Second, we present a bootstrapping technique that narrows down discrepancies between gold-standard and automatic parses used as features. The new addition to the algorithm shows a clear advantage in parsing speed. The bootstrapping technique gives a significant improvement to parsing accuracy, showing near state-of-the-art performance with respect to other parsing approaches evaluated on the same data set.
4 0.28950498 39 acl-2011-An Ensemble Model that Combines Syntactic and Semantic Clustering for Discriminative Dependency Parsing
Author: Gholamreza Haffari ; Marzieh Razavi ; Anoop Sarkar
Abstract: We combine multiple word representations based on semantic clusters extracted from the (Brown et al., 1992) algorithm and syntactic clusters obtained from the Berkeley parser (Petrov et al., 2006) in order to improve discriminative dependency parsing in the MSTParser framework (McDonald et al., 2005). We also provide an ensemble method for combining diverse cluster-based models. The two contributions together significantly improve unlabeled dependency accuracy from 90.82% to 92.13%.
5 0.26870847 167 acl-2011-Improving Dependency Parsing with Semantic Classes
Author: Eneko Agirre ; Kepa Bengoetxea ; Koldo Gojenola ; Joakim Nivre
Abstract: This paper presents the introduction of WordNet semantic classes in a dependency parser, obtaining improvements on the full Penn Treebank for the first time. We tried different combinations of some basic semantic classes and word sense disambiguation algorithms. Our experiments show that selecting the adequate combination of semantic features on development data is key for success. Given the basic nature of the semantic classes and word sense disambiguation algorithms used, we think there is ample room for future improvements.
6 0.24661343 127 acl-2011-Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing
7 0.20992114 333 acl-2011-Web-Scale Features for Full-Scale Parsing
8 0.19418043 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation
9 0.16517845 107 acl-2011-Dynamic Programming Algorithms for Transition-Based Dependency Parsers
10 0.1645636 59 acl-2011-Better Automatic Treebank Conversion Using A Feature-Based Approach
11 0.1642139 48 acl-2011-Automatic Detection and Correction of Errors in Dependency Treebanks
12 0.13166934 237 acl-2011-Ordering Prenominal Modifiers with a Reranking Approach
13 0.13161957 230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation
14 0.13098803 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation
15 0.12352093 275 acl-2011-Semi-Supervised Modeling for Prenominal Modifier Ordering
16 0.12098935 164 acl-2011-Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features
17 0.11659877 243 acl-2011-Partial Parsing from Bitext Projections
18 0.11465282 5 acl-2011-A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing
19 0.1120014 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines
20 0.1087886 10 acl-2011-A Discriminative Model for Joint Morphological Disambiguation and Dependency Parsing
topicId topicWeight
[(0, 0.235), (1, -0.066), (2, -0.079), (3, -0.402), (4, -0.042), (5, -0.068), (6, 0.063), (7, 0.128), (8, 0.146), (9, -0.012), (10, 0.092), (11, 0.077), (12, 0.037), (13, -0.24), (14, -0.038), (15, 0.071), (16, 0.036), (17, 0.037), (18, -0.009), (19, -0.052), (20, -0.095), (21, -0.015), (22, 0.007), (23, 0.045), (24, 0.073), (25, -0.176), (26, 0.029), (27, 0.027), (28, 0.015), (29, 0.05), (30, -0.027), (31, -0.035), (32, -0.002), (33, 0.018), (34, -0.058), (35, 0.054), (36, 0.003), (37, 0.035), (38, 0.054), (39, 0.151), (40, 0.05), (41, 0.048), (42, 0.012), (43, 0.049), (44, -0.031), (45, 0.014), (46, -0.021), (47, 0.024), (48, 0.001), (49, -0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.97153699 309 acl-2011-Transition-based Dependency Parsing with Rich Non-local Features
Author: Yue Zhang ; Joakim Nivre
Abstract: Transition-based dependency parsers generally use heuristic decoding algorithms but can accommodate arbitrarily rich feature representations. In this paper, we show that we can improve the accuracy of such parsers by considering even richer feature sets than those employed in previous systems. In the standard Penn Treebank setup, our novel features improve attachment score from 91.4% to 92.9%, giving the best results so far for transition-based parsing and rivaling the best results overall. For the Chinese Treebank, they give a significant improvement over the state of the art. An open source release of our parser is freely available.
2 0.89799124 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing
Author: Jinho D. Choi ; Martha Palmer
Abstract: This paper suggests two ways of improving transition-based, non-projective dependency parsing. First, we add a transition to an existing non-projective parsing algorithm, so it can perform either projective or non-projective parsing as needed. Second, we present a bootstrapping technique that narrows down discrepancies between gold-standard and automatic parses used as features. The new addition to the algorithm shows a clear advantage in parsing speed. The bootstrapping technique gives a significant improvement to parsing accuracy, showing near state-of-the-art performance with respect to other parsing approaches evaluated on the same data set.
3 0.86403304 127 acl-2011-Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing
Author: Guangyou Zhou ; Jun Zhao ; Kang Liu ; Li Cai
Abstract: In this paper, we present a novel approach which incorporates the web-derived selectional preferences to improve statistical dependency parsing. Conventional selectional preference learning methods have usually focused on word-to-class relations, e.g., a verb selects as its subject a given nominal class. This paper extends previous work to word-to-word selectional preferences by using web-scale data. Experiments show that web-scale data improves statistical dependency parsing, particularly for long dependency relationships. There is no data like more data; performance improves log-linearly with the number of parameters (unique N-grams). More importantly, when operating on new domains, we show that using web-derived selectional preferences is essential for achieving robust performance.
4 0.84190267 39 acl-2011-An Ensemble Model that Combines Syntactic and Semantic Clustering for Discriminative Dependency Parsing
Author: Gholamreza Haffari ; Marzieh Razavi ; Anoop Sarkar
Abstract: We combine multiple word representations based on semantic clusters extracted from the (Brown et al., 1992) algorithm and syntactic clusters obtained from the Berkeley parser (Petrov et al., 2006) in order to improve discriminative dependency parsing in the MSTParser framework (McDonald et al., 2005). We also provide an ensemble method for combining diverse cluster-based models. The two contributions together significantly improve unlabeled dependency accuracy from 90.82% to 92.13%.
5 0.76972032 107 acl-2011-Dynamic Programming Algorithms for Transition-Based Dependency Parsers
Author: Marco Kuhlmann ; Carlos Gomez-Rodriguez ; Giorgio Satta
Abstract: We develop a general dynamic programming technique for the tabulation of transition-based dependency parsers, and apply it to obtain novel, polynomial-time algorithms for parsing with the arc-standard and arc-eager models. We also show how to reverse our technique to obtain new transition-based dependency parsers from existing tabular methods. Additionally, we provide a detailed discussion of the conditions under which the feature models commonly used in transition-based parsing can be integrated into our algorithms.
6 0.76608628 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation
7 0.73297971 230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation
8 0.72998214 282 acl-2011-Shift-Reduce CCG Parsing
9 0.72491193 243 acl-2011-Partial Parsing from Bitext Projections
10 0.71456242 333 acl-2011-Web-Scale Features for Full-Scale Parsing
11 0.71319878 167 acl-2011-Improving Dependency Parsing with Semantic Classes
12 0.70640105 236 acl-2011-Optimistic Backtracking - A Backtracking Overlay for Deterministic Incremental Parsing
13 0.68375403 48 acl-2011-Automatic Detection and Correction of Errors in Dependency Treebanks
14 0.67648762 59 acl-2011-Better Automatic Treebank Conversion Using A Feature-Based Approach
15 0.64751917 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing
16 0.60518622 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines
17 0.55440956 184 acl-2011-Joint Hebrew Segmentation and Parsing using a PCFGLA Lattice Parser
18 0.53605741 199 acl-2011-Learning Condensed Feature Representations from Large Unsupervised Data Sets for Supervised Learning
19 0.51804698 267 acl-2011-Reversible Stochastic Attribute-Value Grammars
20 0.51655543 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers
topicId topicWeight
[(5, 0.017), (17, 0.061), (26, 0.017), (28, 0.193), (37, 0.174), (39, 0.115), (41, 0.069), (55, 0.042), (59, 0.034), (72, 0.014), (91, 0.027), (96, 0.113), (97, 0.048), (98, 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.83668905 309 acl-2011-Transition-based Dependency Parsing with Rich Non-local Features
Author: Yue Zhang ; Joakim Nivre
Abstract: Transition-based dependency parsers generally use heuristic decoding algorithms but can accommodate arbitrarily rich feature representations. In this paper, we show that we can improve the accuracy of such parsers by considering even richer feature sets than those employed in previous systems. In the standard Penn Treebank setup, our novel features improve attachment score from 91.4% to 92.9%, giving the best results so far for transition-based parsing and rivaling the best results overall. For the Chinese Treebank, they give a significant improvement over the state of the art. An open source release of our parser is freely available.
2 0.81008714 267 acl-2011-Reversible Stochastic Attribute-Value Grammars
Author: Daniel de Kok ; Barbara Plank ; Gertjan van Noord
Abstract: An attractive property of attribute-value grammars is their reversibility. Attribute-value grammars are usually coupled with separate statistical components for parse selection and fluency ranking. We propose reversible stochastic attribute-value grammars, in which a single statistical model is employed both for parse selection and fluency ranking.
3 0.78990149 188 acl-2011-Judging Grammaticality with Tree Substitution Grammar Derivations
Author: Matt Post
Abstract: In this paper, we show that local features computed from the derivations of tree substitution grammars (such as the identity of particular fragments, and a count of large and small fragments) are useful in binary grammatical classification tasks. Such features outperform n-gram features and various model scores by a wide margin. Although they fall short of the performance of the hand-crafted feature set of Charniak and Johnson (2005) developed for parse tree reranking, they do so with an order of magnitude fewer features. Furthermore, since the TSGs employed are learned in a Bayesian setting, the use of their derivations can be viewed as the automatic discovery of tree patterns useful for classification. On the BLLIP dataset, we achieve an accuracy of 89.9% in discriminating between grammatical text and samples from an n-gram language model.
4 0.76208335 235 acl-2011-Optimal and Syntactically-Informed Decoding for Monolingual Phrase-Based Alignment
Author: Kapil Thadani ; Kathleen McKeown
Abstract: The task of aligning corresponding phrases across two related sentences is an important component of approaches for natural language problems such as textual inference, paraphrase detection and text-to-text generation. In this work, we examine a state-of-the-art structured prediction model for the alignment task which uses a phrase-based representation and is forced to decode alignments using an approximate search approach. We propose instead a straightforward exact decoding technique based on integer linear programming that yields order-of-magnitude improvements in decoding speed. This ILP-based decoding strategy permits us to consider syntacticallyinformed constraints on alignments which significantly increase the precision of the model.
5 0.75157976 10 acl-2011-A Discriminative Model for Joint Morphological Disambiguation and Dependency Parsing
Author: John Lee ; Jason Naradowsky ; David A. Smith
Abstract: Most previous studies of morphological disambiguation and dependency parsing have been pursued independently. Morphological taggers operate on n-grams and do not take into account syntactic relations; parsers use the “pipeline” approach, assuming that morphological information has been separately obtained. However, in morphologically-rich languages, there is often considerable interaction between morphology and syntax, such that neither can be disambiguated without the other. In this paper, we propose a discriminative model that jointly infers morphological properties and syntactic structures. In evaluations on various highly-inflected languages, this joint model outperforms both a baseline tagger in morphological disambiguation, and a pipeline parser in head selection.
6 0.74110425 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines
7 0.73623461 282 acl-2011-Shift-Reduce CCG Parsing
8 0.73622948 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers
9 0.73326045 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation
10 0.73152649 300 acl-2011-The Surprising Variance in Shortest-Derivation Parsing
11 0.73123085 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars
12 0.73012 122 acl-2011-Event Extraction as Dependency Parsing
13 0.72965479 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
14 0.72451019 85 acl-2011-Coreference Resolution with World Knowledge
15 0.72247177 54 acl-2011-Automatically Extracting Polarity-Bearing Topics for Cross-Domain Sentiment Classification
16 0.7208485 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation
17 0.71879792 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
18 0.71821892 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction
19 0.71646482 334 acl-2011-Which Noun Phrases Denote Which Concepts?
20 0.71359456 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation