acl acl2013 acl2013-155 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu
Abstract: Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
Reference: text
sentIndex sentText sentNum sentScore
1 cn Abstract Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. [sent-12, score-0.875]
2 One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. [sent-13, score-0.324]
3 Our parser gives comparable accuracies to the state-of-the-art chart parsers. [sent-16, score-0.454]
4 With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser. [sent-17, score-0.289]
5 1 Introduction Transition-based parsers employ a set of shiftreduce actions and perform parsing using a sequence of state transitions. [sent-18, score-0.715]
6 Greedy, classifier-based parsers have been developed for both dependency grammars (Yamada and Matsumoto, 2003; Nivre et al. [sent-20, score-0.32]
7 With linear run-time complexity, they were commonly regarded as a faster but less accurate alternative to graph-based chart parsers (Collins, 1997; Charniak, 2000; McDonald et al. [sent-22, score-0.253]
8 Various methods have been proposed to address the disadvantages of greedy local parsing, among which a framework of beam-search and global discriminative training have been shown effective for dependency parsing (Zhang and Clark, 2008; Huang and Sagae, 2010). [sent-24, score-0.36]
9 While beam-search reduces error propagation compared with greedy search, a discriminative model that is globally optimized for whole sequences of transition actions can avoid local score biases (Lafferty et al. [sent-25, score-0.361]
10 With the use of rich non-local features, transition-based dependency parsers achieve state-of-the-art accuracies that are comparable to the best-graph-based parsers (Zhang and Nivre, 2011; Bohnet and Nivre, 2012). [sent-28, score-0.679]
11 In addition, processing tens of sentences per second (Zhang and Nivre, 2011), these transition-based parsers can be a favorable choice for dependency parsing. [sent-29, score-0.32]
12 The best reported accuracies of transition-based constituent parsers still lag behind the state-of-the-art (Sagae and Lavie, 2006; Zhang and Clark, 2009). [sent-32, score-0.474]
13 One difference between phrasestructure parsing and dependency parsing is that for the former, parse trees with different numbers of unary rules require different numbers of actions to build. [sent-33, score-0.863]
14 For the same sentence, the largest output can take twice as many as actions to build as the 434 Proce dingSsof oifa, th Beu 5l1gsarti Aan,An u aglu Mste 4e-ti9n2g 0 o1f3 t. [sent-35, score-0.185]
15 Our method is conceptually simple, requiring only one additional transition action to eliminate size differences between different candidate outputs. [sent-40, score-0.236]
16 On standard evaluations using both the Penn Tree- bank and the Penn Chinese Treebank, our parser gave higher accuracies than the Berkeley parser (Petrov and Klein, 2007), a state-of-the-art chart parser. [sent-41, score-0.684]
17 In addition, our parser runs with over 89 sentences per second, which is 14 times faster than the Berkeley parser, and is the fastest that we are aware of for phrase-structure parsing. [sent-42, score-0.23]
18 2 Baseline parser We adopt the parser of Zhang and Clark (2009) for our baseline, which is based on the shift-reduce process of Sagae and Lavie (2005), and employs global perceptron training and beam search. [sent-54, score-0.518]
19 At each step, a transition action is applied to consume an input word or construct a new phrase-structure. [sent-57, score-0.236]
20 The set of transition actions are • SHIFT: pop the front word from the buffer, aSHndI push iot pon tthoe t fhreo nstta wcko. [sent-59, score-0.384]
21 • • • REDUCE-L/R-X: pop the top two consRtEitDueUntCs Eo-fLf/ tRh-eX stack, c tohmebin toep pth tewmo oin ctoo a new constituent with label X, and push the new constituent onto the stack. [sent-62, score-0.333]
22 UNARY-X: pop the top constituent off the stack, Y ra-Xis:e pito pto a new c coonnssttiitutueenntt tw oiftfh tlhaebel X, and push the new constituent onto the stack. [sent-63, score-0.333]
23 2 Global Discriminative Training and Beam-Search For a given input sentence, the initial state has an empty stack and a buffer that contains all the input words. [sent-68, score-0.253]
24 An agenda is used to keep the k best state items at each step. [sent-69, score-0.284]
25 At each step, every state item in the agenda is popped and expanded by applying a valid transition action, and the top k from the newly constructed state items are put back onto the agenda. [sent-71, score-0.662]
26 The score of a state item is the total score of the transition actions that have been applied to build the item: C(α) =iXN=1Φ(ai) ·θ~ Here Φ(ai) represents the feature vector for the ith action ai in state item α. [sent-75, score-1.032]
27 The model parameter with the averaged perceptron algorithm, applied to state items (sequence of actions) globally. [sent-78, score-0.246]
28 We apply the early update strategy (Collins and Roark, 2004), stopping parsing for parameter updates when the goldstandard state item falls off the agenda. [sent-79, score-0.436]
29 3 Baseline Features Our baseline features are adopted from Zhang and Clark (2009), and are shown in Table 1 Here si represents the ith item on the top of the stack S and qi denotes the ith item in the front end of the queue Q. [sent-81, score-0.792]
30 The symbol w denotes the lexical head of an item; the symbol c denotes the constituent label of an item; the symbol t is the POS of a lexical head. [sent-82, score-0.318]
31 We remove Chinese specific features and make the baseline parser languageindependent. [sent-84, score-0.331]
32 3 Improved hypotheses comparison Unlike dependency parsing, constituent parse trees for the same sentence can have different numbers of nodes, mainly due to the existence of unary nodes. [sent-85, score-0.429]
33 As a result, completed state NP VBVPNP NN NNS address issues address NNS issues Figure 2: Example parse trees of the same sentence with different numbers of actions. [sent-86, score-0.275]
34 items for the same sentence can have different numbers of unary actions. [sent-87, score-0.226]
35 The first parse corresponds to the action sequence [SHIFT, SHIFT, REDUCE-R-NP, FINISH], while the second parse corresponds to the action se- quence [SHIFT, SHIFT, UNARY-NP, REDUCE-LVP, FINISH], which consists of one more action than the first case. [sent-89, score-0.42]
36 In practice, variances between state items can be much larger than the chosen example. [sent-90, score-0.188]
37 In the extreme case where a state item does not contain any unary action, the number of actions is 2n, where n is the number of words in the sentence. [sent-91, score-0.55]
38 On the other hand, if the maximum number of consequent unary actions is 2 (Sagae and Lavie, 2005; Zhang and Clark, 2009), then the maximum number of actions a state item can have is 4n. [sent-92, score-0.735]
39 The significant variance in the number of actions N can have an impact on the linear separability of state items, for which the feature vectors are Φ (ai). [sent-93, score-0.286]
40 One way of improving the comparability of state items is to reduce the differences in their sizes, and we use a padding method to achieve this. [sent-95, score-0.533]
41 The idea is to extend the set of actions by PiN=1 adding an IDLE action, so that completed state items can be further expanded using the IDLE action. [sent-96, score-0.442]
42 The action does not change the state itself, but simply adds to the number of actions in the sequence. [sent-97, score-0.426]
43 A feature vector is extracted for the IDLE action according to the final state context, in the same way as other actions. [sent-98, score-0.241]
44 Note that there can be more than one action that are padded to a sequence of actions, and the number of IDLE actions depends on the size difference between the current action sequence and the largest action sequence without IDLE actions. [sent-102, score-0.605]
45 The initial item (Axioms) has k = 0, while the goal item has 2n ≤ k ≤ 4n. [sent-105, score-0.362]
46 While they used a candidate output to record the best completed state item, and finish decoding when the agenda contains no more items, we can simply finish decoding when all items in the agenda are completed, and output the best state item in the agenda. [sent-108, score-1.009]
47 With this new transition process, we experimented with several extended features,and found that the templates in Table 2 are useful to improve the accuracies further. [sent-109, score-0.389]
48 4 Semi-supervised Parsing with Large Data This section discusses how to extract information from unlabeled data or auto-parsed data to further improve shift-reduce parsing accuracies. [sent-112, score-0.19]
49 1 Paradigmatic Relations: Word Clustering Word clusters are regarded as lexical intermediaries for dependency parsing (Koo et al. [sent-118, score-0.28]
50 The idea of exploiting lexical dependency information from auto-parsed data has been explored before for dependency parsing (Chen et al. [sent-127, score-0.406]
51 To extract lexical dependencies, we first run the baseline parser on unlabeled data. [sent-130, score-0.327]
52 To simplify the extraction process, we can convert auto-parsed constituency trees into dependency trees by using Penn2Malt. [sent-131, score-0.26]
53 2 From the dependency trees, we extract bigram lexical dependencies hw1 , w2 , L/Ri wtrahcetre b tihgrea symbol a Ll (R) means tiehast w1 (w2) Lis/ Rthei head of w2 (w1). [sent-132, score-0.218]
54 Specifically, top-10% most frequent items are assigned to the category of High Frequency (HF); otherwise if an item is among top 20%, we assign it to the category of Middle Frequency (MF); otherwise the category of Low Frequency (LF). [sent-140, score-0.268]
55 Hereafter, we refer to the bigram and trigram lexical dependency lists as BLD and TLD, respectively. [sent-141, score-0.166]
56 (2008) and is used as additional information for graph-based dependency parsing in Chen et al. [sent-144, score-0.28]
57 For each xh ∈ H(y), we have a corresponding dependency stru∈ctHu re(y Dh = ×× (xLk, . [sent-147, score-0.227]
58 Again, we convert constituency trees into dependency trees for the purpose of simplicity. [sent-162, score-0.26]
59 The following are the templates of the records of the dependency language models. [sent-164, score-0.174]
60 Here the symbol si denotes a stack item, qi denotes a queue item, w represents a word, and t represents a POS tag. [sent-206, score-0.315]
61 140g365lishand Chinese development sets with the padding technique and new supervised features added incrementally. [sent-211, score-0.385]
62 We used EVALB to evaluate parser performances, including labeled precision (LP), labeled recall (LR), and bracketing F1. [sent-217, score-0.23]
63 2 Results on Development Sets Table 5 reports the results of the extended parser (baseline + padding + supervised features) on the English and Chinese development sets. [sent-238, score-0.655]
64 We integrated the padding method into the baseline parser, based on which we further incorporated the supervised features in Table 2. [sent-239, score-0.446]
65 From the results we find that the padding method improves the parser accuracies by 0. [sent-240, score-0.74]
66 3 Table 7: Comparison of our parsers and related work on the English test set. [sent-298, score-0.194]
67 26 Table 8: Comparison of our parsers and related work on the test set of CTB5. [sent-326, score-0.194]
68 ∗ Huang (2009) adapted the parsers to Chinese parsing on CTB5. [sent-328, score-0.348]
69 We grouped the parsers into three categories: single parsers (SI), discriminative reranking parsers (RE), and semisupervised parsers (SE). [sent-337, score-0.908]
70 From the results we can see that our extended parser (baseline + padding + supervised features) outperforms the Berkeley parser by 0. [sent-339, score-0.885]
71 3% on English, and is comparable with the Berkeley parser on Chinese (−0. [sent-340, score-0.23]
72 ∗ The results of SVM-based shiftreduce parsing with greedy search. [sent-356, score-0.275]
73 The results of MaxEnt-based shift-reduce parser with best-first search. [sent-357, score-0.23]
74 The padding technique, supervised features, and semi-supervised features achieve an overall improvement of 1. [sent-364, score-0.385]
75 4 Comparison of Running Time We also compared the running times of our parsers with the related single parsers. [sent-369, score-0.194]
76 From the table, we can see that incorporating semisupervised features decreases parsing speed, but the semi-supervised parser still has the advantage of efficiency over other parsers. [sent-373, score-0.463]
77 Specifically, the semi-supervised parser is 7 times faster than the Berkeley parser. [sent-374, score-0.23]
78 In practice, the running times of the shift-reduce parsers should be much shorter than the reported times in the table. [sent-376, score-0.194]
79 5 Error Analysis We conducted error analysis for the three systems: the baseline parser, the extended parser with 440 Span Length Figure 5: Comparison of parsing accuracies of the baseline, extended parser, and semi-supervised parsers on spans of different lengths. [sent-378, score-0.964]
80 the padding technique, and the semi-supervised parser, focusing on the English test set. [sent-379, score-0.345]
81 The analysis was performed in four dimensions: parsing accuracies on different phrase types, on constituents of different span lengths, on different sentence lengths, and on sentences with different numbers of unknown words. [sent-380, score-0.458]
82 1 Different Phrase Types Table 10 shows the parsing accuracies of the baseline, extended parser, and semi-supervised parser on different phrase types. [sent-383, score-0.629]
83 We also show the improvements of the semi-supervised parser over the baseline parser (the last row in the table). [sent-386, score-0.521]
84 As the results show, the extended parser achieves improvements on most of the phrase types with two exceptions: Preposition Prase (PP) and Quantifier Phrase (QP). [sent-387, score-0.31]
85 Semisupervised features further improve parsing accuracies over the extended parser (QP is an exception). [sent-388, score-0.669]
86 From the last row, we can see that improvements of the semi-supervised parser over the baseline on VP, S, SBAR, ADVP, and ADJP are above the average improvement (1. [sent-389, score-0.291]
87 2 Different Span Lengths Figure 5 shows a comparison of the three parsers on spans of different lengths. [sent-393, score-0.194]
88 As the results show, both the padding extension and semi-supervised features are more helpful on relatively large spans: the performance gaps between the three parsers are enlarged with increasing span lengths. [sent-395, score-0.618]
89 Sentence Length Figure 6: Comparison of parsing accuracies of the baseline, extended parser, and semi-supervised parser on sentences of different lengths. [sent-396, score-0.629]
90 3 Different Sentence Lengths Figure 6 shows a comparison of parsing accuracies of the three parsers on sentences of different lengths. [sent-399, score-0.513]
91 From the results we can see that semi-supervised features improve parsing accuracy on both short and long sentences. [sent-402, score-0.194]
92 4 Different Numbers of Unknown Words Figure 4 shows a comparison of parsing accuracies of the baseline, extended parser, and semisupervised parser on sentences with different numbers of unknown words. [sent-407, score-0.768]
93 As the results show, the padding method is not very helpful on sentences with large numbers of unknown words, while semi-supervised features help significantly on this aspect. [sent-408, score-0.485]
94 The resulting supervised parser outperforms the Berkeley parser, a state-of-the-art chart parser, in both accuracies and speeds. [sent-411, score-0.454]
95 2 Table 10: Comparison ofparsing accuracies of the baseline, extended on different phrase types. [sent-449, score-0.245]
96 parser, and semi-supervised parsers )rec-%Fos(109870 091. [sent-450, score-0.194]
97 4 BaselineExtendedSemi-supervised Figure 4: Comparison of parsing accuracies of the baseline, extended parser, and semi-supervised parser on sentences of different unknown words. [sent-474, score-0.673]
98 Coarse- to-fine n-best parsing and maxent discriminative reranking. [sent-499, score-0.194]
99 A linear observed time statistical parser based on maximum entropy models. [sent-596, score-0.23]
100 Transition-based parsing of the Chinese Treebank using a global discriminative model. [sent-628, score-0.194]
wordName wordTfidf (topN-words)
[('padding', 0.345), ('parser', 0.23), ('parsers', 0.194), ('actions', 0.185), ('item', 0.181), ('accuracies', 0.165), ('idle', 0.162), ('bldl', 0.161), ('parsing', 0.154), ('sagae', 0.151), ('clu', 0.15), ('chinese', 0.149), ('action', 0.14), ('finish', 0.139), ('shift', 0.133), ('dh', 0.128), ('dependency', 0.126), ('constituent', 0.115), ('bldr', 0.115), ('lavie', 0.105), ('stack', 0.102), ('xh', 0.101), ('state', 0.101), ('agenda', 0.096), ('transition', 0.096), ('lengths', 0.094), ('blml', 0.092), ('tldl', 0.092), ('tldr', 0.092), ('xlk', 0.092), ('items', 0.087), ('unary', 0.083), ('zhang', 0.082), ('shiftreduce', 0.081), ('extended', 0.08), ('zhu', 0.076), ('muhua', 0.07), ('blmr', 0.069), ('completed', 0.069), ('yue', 0.063), ('deduction', 0.061), ('charniak', 0.061), ('baseline', 0.061), ('berkeley', 0.06), ('wenliang', 0.06), ('pop', 0.06), ('cs', 0.059), ('chart', 0.059), ('perceptron', 0.058), ('clark', 0.058), ('numbers', 0.056), ('axioms', 0.056), ('reranking', 0.053), ('collins', 0.052), ('si', 0.05), ('buffer', 0.05), ('chen', 0.049), ('carreras', 0.049), ('dependencies', 0.049), ('trees', 0.049), ('templates', 0.048), ('ith', 0.047), ('jingbo', 0.046), ('queue', 0.046), ('bld', 0.046), ('blm', 0.046), ('eugune', 0.046), ('provement', 0.046), ('tlml', 0.046), ('tlmr', 0.046), ('unk', 0.046), ('nivre', 0.046), ('paradigmatic', 0.045), ('unknown', 0.044), ('push', 0.043), ('petrov', 0.043), ('symbol', 0.043), ('huang', 0.041), ('penn', 0.041), ('kal', 0.041), ('singapore', 0.04), ('trigram', 0.04), ('greedy', 0.04), ('ctb', 0.04), ('iwpt', 0.04), ('features', 0.04), ('discriminative', 0.04), ('span', 0.039), ('semisupervised', 0.039), ('pos', 0.039), ('cr', 0.039), ('mcclosky', 0.038), ('cluster', 0.038), ('semi', 0.037), ('denotes', 0.037), ('constituency', 0.036), ('koo', 0.036), ('unlabeled', 0.036), ('kenji', 0.035), ('wsj', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu
Abstract: Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
2 0.29145125 80 acl-2013-Chinese Parsing Exploiting Characters
Author: Meishan Zhang ; Yue Zhang ; Wanxiang Che ; Ting Liu
Abstract: Characters play an important role in the Chinese language, yet computational processing of Chinese has been dominated by word-based approaches, with leaves in syntax trees being words. We investigate Chinese parsing from the character-level, extending the notion of phrase-structure trees by annotating internal structures of words. We demonstrate the importance of character-level information to Chinese processing by building a joint segmentation, part-of-speech (POS) tagging and phrase-structure parsing system that integrates character-structure features. Our joint system significantly outperforms a state-of-the-art word-based baseline on the standard CTB5 test, and gives the best published results for Chinese parsing.
3 0.26174378 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching
Author: Jinho D. Choi ; Andrew McCallum
Abstract: We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach. Selectional branching is guaranteed to perform a fewer number of transitions than beam search yet performs as accurately. We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and an expected linear time speed for non-projective parsing. With the standard setup, our parser shows an unlabeled attachment score of 92.96% and a parsing speed of 9 milliseconds per sentence, which is faster and more accurate than the current state-of-the-art transitionbased parser that uses beam search.
4 0.24028994 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers
Author: Yoav Goldberg ; Kai Zhao ; Liang Huang
Abstract: Beam search incremental parsers are accurate, but not as fast as they could be. We demonstrate that, contrary to popular belief, most current implementations of beam parsers in fact run in O(n2), rather than linear time, because each statetransition is actually implemented as an O(n) operation. We present an improved implementation, based on Tree Structured Stack (TSS), in which a transition is performed in O(1), resulting in a real lineartime algorithm, which is verified empiri- cally. We further improve parsing speed by sharing feature-extraction and dotproduct across beam items. Practically, our methods combined offer a speedup of ∼2x over strong baselines on Penn Treeb∼a2nxk sentences, a bnads are eosrd oenrs P eofn magnitude faster on much longer sentences.
5 0.23711263 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
Author: Yang Liu
Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-todependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-todependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.
6 0.22976971 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
7 0.22133279 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy
8 0.21983767 44 acl-2013-An Empirical Examination of Challenges in Chinese Parsing
9 0.21139951 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing
10 0.19429217 112 acl-2013-Dependency Parser Adaptation with Subtrees from Auto-Parsed Target Domain Data
11 0.17873912 204 acl-2013-Iterative Transformation of Annotation Guidelines for Constituency Parsing
12 0.16136831 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
13 0.15851963 208 acl-2013-Joint Inference for Heterogeneous Dependency Parsing
14 0.15715899 288 acl-2013-Punctuation Prediction with Transition-based Parsing
15 0.15490738 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing
16 0.13546753 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
17 0.1296552 362 acl-2013-Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
18 0.12642804 368 acl-2013-Universal Dependency Annotation for Multilingual Parsing
19 0.11193723 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning
20 0.11111302 275 acl-2013-Parsing with Compositional Vector Grammars
topicId topicWeight
[(0, 0.261), (1, -0.201), (2, -0.333), (3, 0.081), (4, -0.046), (5, -0.026), (6, 0.066), (7, -0.013), (8, 0.007), (9, -0.077), (10, -0.005), (11, 0.081), (12, -0.037), (13, 0.018), (14, 0.214), (15, 0.055), (16, -0.121), (17, -0.056), (18, -0.022), (19, -0.0), (20, -0.044), (21, -0.002), (22, 0.037), (23, -0.003), (24, -0.008), (25, 0.013), (26, -0.028), (27, -0.004), (28, 0.049), (29, 0.038), (30, -0.037), (31, 0.002), (32, 0.058), (33, -0.075), (34, -0.05), (35, 0.007), (36, -0.023), (37, -0.011), (38, -0.065), (39, 0.02), (40, -0.007), (41, 0.052), (42, -0.008), (43, -0.014), (44, -0.009), (45, 0.012), (46, 0.006), (47, 0.002), (48, 0.03), (49, 0.002)]
simIndex simValue paperId paperTitle
same-paper 1 0.96529657 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu
Abstract: Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
2 0.9214763 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
Author: Ji Ma ; Jingbo Zhu ; Tong Xiao ; Nan Yang
Abstract: In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron. We propose a simple variant of “early-update” to ensure valid update in the training process. The proposed solution can also be applied to combine beam search and structured perceptron with other systems that exhibit spurious ambiguity. On CTB, we achieve 94.01% tagging accuracy and 86.33% unlabeled attachment score with a relatively small beam width. On PTB, we also achieve state-of-the-art performance. 1
3 0.89955038 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching
Author: Jinho D. Choi ; Andrew McCallum
Abstract: We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach. Selectional branching is guaranteed to perform a fewer number of transitions than beam search yet performs as accurately. We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and an expected linear time speed for non-projective parsing. With the standard setup, our parser shows an unlabeled attachment score of 92.96% and a parsing speed of 9 milliseconds per sentence, which is faster and more accurate than the current state-of-the-art transitionbased parser that uses beam search.
4 0.89618909 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy
Author: Francesco Sartorio ; Giorgio Satta ; Joakim Nivre
Abstract: We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. The new strategy allows the parser to postpone difficult decisions until the relevant information becomes available. The novel parser has a ∼12% error reduction in unlabeled attach∼ment score over an arc-eager parser, with a slow-down factor of 2.8.
5 0.85904384 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers
Author: Yoav Goldberg ; Kai Zhao ; Liang Huang
Abstract: Beam search incremental parsers are accurate, but not as fast as they could be. We demonstrate that, contrary to popular belief, most current implementations of beam parsers in fact run in O(n2), rather than linear time, because each statetransition is actually implemented as an O(n) operation. We present an improved implementation, based on Tree Structured Stack (TSS), in which a transition is performed in O(1), resulting in a real lineartime algorithm, which is verified empiri- cally. We further improve parsing speed by sharing feature-extraction and dotproduct across beam items. Practically, our methods combined offer a speedup of ∼2x over strong baselines on Penn Treeb∼a2nxk sentences, a bnads are eosrd oenrs P eofn magnitude faster on much longer sentences.
6 0.85842538 288 acl-2013-Punctuation Prediction with Transition-based Parsing
7 0.7977758 335 acl-2013-Survey on parsing three dependency representations for English
8 0.79690862 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing
9 0.7571004 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
10 0.74207538 208 acl-2013-Joint Inference for Heterogeneous Dependency Parsing
11 0.72187203 80 acl-2013-Chinese Parsing Exploiting Characters
12 0.6750223 362 acl-2013-Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
13 0.66359442 44 acl-2013-An Empirical Examination of Challenges in Chinese Parsing
14 0.614645 331 acl-2013-Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing
15 0.60795569 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
16 0.6079284 204 acl-2013-Iterative Transformation of Annotation Guidelines for Constituency Parsing
17 0.60441148 176 acl-2013-Grounded Unsupervised Semantic Parsing
18 0.59931117 276 acl-2013-Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation
19 0.59374064 112 acl-2013-Dependency Parser Adaptation with Subtrees from Auto-Parsed Target Domain Data
20 0.59056759 163 acl-2013-From Natural Language Specifications to Program Input Parsers
topicId topicWeight
[(0, 0.042), (6, 0.039), (11, 0.1), (14, 0.018), (24, 0.035), (26, 0.047), (28, 0.027), (35, 0.067), (42, 0.076), (43, 0.173), (48, 0.051), (70, 0.096), (71, 0.014), (88, 0.046), (90, 0.038), (95, 0.051)]
simIndex simValue paperId paperTitle
same-paper 1 0.83502311 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu
Abstract: Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
2 0.72127074 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
Author: Ji Ma ; Jingbo Zhu ; Tong Xiao ; Nan Yang
Abstract: In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron. We propose a simple variant of “early-update” to ensure valid update in the training process. The proposed solution can also be applied to combine beam search and structured perceptron with other systems that exhibit spurious ambiguity. On CTB, we achieve 94.01% tagging accuracy and 86.33% unlabeled attachment score with a relatively small beam width. On PTB, we also achieve state-of-the-art performance. 1
3 0.7212103 80 acl-2013-Chinese Parsing Exploiting Characters
Author: Meishan Zhang ; Yue Zhang ; Wanxiang Che ; Ting Liu
Abstract: Characters play an important role in the Chinese language, yet computational processing of Chinese has been dominated by word-based approaches, with leaves in syntax trees being words. We investigate Chinese parsing from the character-level, extending the notion of phrase-structure trees by annotating internal structures of words. We demonstrate the importance of character-level information to Chinese processing by building a joint segmentation, part-of-speech (POS) tagging and phrase-structure parsing system that integrates character-structure features. Our joint system significantly outperforms a state-of-the-art word-based baseline on the standard CTB5 test, and gives the best published results for Chinese parsing.
4 0.71345949 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
Author: Peifeng Li ; Qiaoming Zhu ; Guodong Zhou
Abstract: As a paratactic language, sentence-level argument extraction in Chinese suffers much from the frequent occurrence of ellipsis with regard to inter-sentence arguments. To resolve such problem, this paper proposes a novel global argument inference model to explore specific relationships, such as Coreference, Sequence and Parallel, among relevant event mentions to recover those intersentence arguments in the sentence, discourse and document layers which represent the cohesion of an event or a topic. Evaluation on the ACE 2005 Chinese corpus justifies the effectiveness of our global argument inference model over a state-of-the-art baseline. 1
5 0.709966 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm’s complexity, an optimization analogous to binarization of contextfree grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
6 0.7077384 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing
7 0.70683521 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching
8 0.70652676 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
9 0.70351261 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers
10 0.70063972 275 acl-2013-Parsing with Compositional Vector Grammars
11 0.69924009 280 acl-2013-Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation
12 0.69644165 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
13 0.69526309 225 acl-2013-Learning to Order Natural Language Texts
15 0.6934787 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions
16 0.69145608 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
17 0.69031954 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
18 0.68876845 173 acl-2013-Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
19 0.68791509 339 acl-2013-Temporal Signals Help Label Temporal Relations
20 0.68790656 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus