emnlp emnlp2010 emnlp2010-115 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract It is well known that parsing accuracies drop significantly on out-of-domain data. [sent-2, score-0.286]
2 What is less known is that some parsers suffer more from domain shifts than others. [sent-3, score-0.412]
3 We show that dependency parsers have more difficulty parsing questions than constituency parsers. [sent-4, score-0.952]
4 In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. [sent-5, score-0.54]
5 We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). [sent-6, score-1.385]
6 Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. [sent-7, score-0.7]
7 With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance. [sent-8, score-0.762]
8 At this point, we have many different parsing models that reach and even surpass 90% dependency or constituency accuracy on this test set (McDonald et al. [sent-10, score-0.394]
9 Quite impressively, models based on deterministic shift-reduce parsing 705 algorithms are able to rival the other computationally more expensive models (see Nivre (2008) and references therein for more details). [sent-15, score-0.293]
10 Unfortunately, the parsing accuracies of all models have been reported to drop significantly on out-of-domain test sets, due to shifts in vocabulary and grammar usage (Gildea, 2001; McClosky et al. [sent-17, score-0.286]
11 Questions pose interesting challenges for WSJ-trained parsers because they are heavily underrepresented in the training data (there are only 334 questions among the 39,832 training sentences). [sent-20, score-0.558]
12 At the same time, questions are of particular interest for user-facing applications like question answering or web search, which require parsers that can process questions in a fast and accurate manner. [sent-21, score-0.896]
13 We start our investigation in Section 3 by training several state-of-the-art (dependency and constituency) parsers on the standard WSJ training set. [sent-22, score-0.326]
14 When evaluated on a question corpus, we observe dramatic accuracy drops exceeding 20% for the deterministic shift-reduce parsers. [sent-23, score-0.27]
15 , 2007), seem to suffer more from this domain change than constituency parsers (Charniak and Johnson, 2005; Petrov et al. [sent-26, score-0.65]
16 Unfortunately, the parsers that generalize better to this new domain have time complexities that are cubic in the sentence length (or even higher), rendering them impractical for web-scale text processing. [sent-30, score-0.442]
17 Figure 1: Example constituency tree from the QuestionBank (a) converted to labeled Stanford dependencies (b). [sent-36, score-0.373]
18 We therefore propose an uptraining method, in which a deterministic shift-reduce parser is trained on the output of a more accurate, but slower parser (Section 4). [sent-37, score-1.081]
19 Instead, our aim is to train a computationally cheaper model (a linear time dependency parser) to match the performance of the best model (a cubic time constituency parser), resulting in a computationally efficient, yet highly accurate model. [sent-40, score-0.381]
20 In practice, we parse a large amount of unlabeled data from the target domain with the constituency parser of Petrov et al. [sent-41, score-0.649]
21 (2006) and then train a deterministic dependency parser on this noisy, automatically parsed data. [sent-42, score-0.554]
22 The accuracy of the linear time parser on a question test set goes up from 60. [sent-43, score-0.265]
23 94% after uptraining, which is comparable to adding 2,000 labeled questions to the training data. [sent-45, score-0.35]
24 Combining uptraining with 2,000 labeled questions further improves the accuracy to 84. [sent-46, score-0.79]
25 , 2006), which includes a set of manually annotated questions from a TREC question answering task. [sent-55, score-0.295]
26 The questions in the QuestionBank are very different from our training data in terms of grammatical constructions and vocabulary usage, making this a rather extreme case of domainadaptation. [sent-56, score-0.232]
27 We split the 4,000 questions contained in this corpus into three parts: the first 2,000 questions are reserved as a small target-domain training set; the remaining 2,000 questions are split into two equal parts, the first serving as the development set and the second as our final test set. [sent-57, score-0.729]
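For concreteness, the split just described can be written in a few lines of Python; this is only an illustrative sketch, and the `trees` list and its ordering are assumptions rather than part of the released data.

```python
def split_questionbank(trees):
    """Split the 4,000 QuestionBank trees into 2,000 train / 1,000 dev / 1,000 test.

    `trees` is assumed to hold the questions in corpus order.
    """
    assert len(trees) == 4000
    train = trees[:2000]       # small target-domain training set
    dev = trees[2000:3000]     # development set
    test = trees[3000:4000]    # final test set
    return train, dev, test
```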
28 We convert the trees in both treebanks from constituencies to labeled dependencies (see Figure 1) using the Stanford converter, which produces 46 types of labeled dependencies (de Marneffe et al. [sent-59, score-0.286]
29 We evaluate on both unlabeled (UAS) and labeled (LAS) dependency accuracy. Additionally, we use a set of 2 million questions collected from Internet search queries as unlabeled target-domain data. [sent-61, score-0.842]
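Since UAS and LAS are referred to throughout, here is a standard way to compute them; this is a generic sketch (assuming each sentence is given as a list of (head, label) pairs), not the exact scorer used in the paper.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS): the fraction of tokens with the correct head,
    and with the correct head and dependency label, respectively."""
    total = head_correct = labeled_correct = 0
    for gold_sent, pred_sent in zip(gold, pred):
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                head_correct += 1
                if g_label == p_label:
                    labeled_correct += 1
    return head_correct / total, labeled_correct / total

# Example: one three-token sentence with a single head error -> UAS = LAS = 2/3.
uas, las = attachment_scores([[(2, "nsubj"), (0, "root"), (2, "dobj")]],
                             [[(2, "nsubj"), (0, "root"), (1, "dobj")]])
```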
30 Table 1: Parsing accuracies for parsers trained on newswire data and evaluated on newswire and question test sets. [sent-93, score-0.608]
31 similar in style to the questions in the QuestionBank: (i) the queries must start with an English function word that can be used to start a question (what, who, when, how, why, can, does, etc. [sent-94, score-0.324]
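Criterion (i) above can be sketched as a simple prefix check; the word list below just echoes the examples given in the text (the full list and the remaining criteria are not reproduced in this summary), so treat it as an assumption.

```python
# Sketch of criterion (i): keep only web queries that start with a question word.
QUESTION_STARTERS = {"what", "who", "when", "how", "why", "can", "does"}

def starts_like_a_question(query):
    tokens = query.lower().split()
    return bool(tokens) and tokens[0] in QUESTION_STARTERS

queries = ["what films featured popeye doyle", "cheap flights to boston"]
question_like = [q for q in queries if starts_like_a_question(q)]  # keeps only the first
```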
32 2 Parsers We use multiple publicly available parsers, as well as our own implementation of a deterministic shift-reduce parser in our experiments. [sent-97, score-0.466]
33 The dependency parsers that we compare are the deterministic shift-reduce MaltParser (Nivre et al. [sent-98, score-0.636]
34 Our shift-reduce parser is a re-implementation of the MaltParser, using a standard set of features and a linear-kernel SVM for classification. [sent-101, score-0.259]
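To make the deterministic shift-reduce idea concrete, the sketch below shows a stripped-down, unlabeled arc-standard style parsing loop driven by a classifier; the toy feature template and the `classify` callback are placeholders, whereas the actual re-implementation uses a much richer feature set and a linear-kernel SVM, as stated above.

```python
def shift_reduce_parse(n_words, classify):
    """Simplified deterministic (arc-standard style) dependency parsing.

    `classify` maps a feature dict to "SHIFT", "LEFT-ARC" or "RIGHT-ARC";
    token 0 is the artificial root, tokens 1..n_words are the sentence.
    Returns a dict mapping each token to its head.
    """
    stack, buffer = [0], list(range(1, n_words + 1))
    heads = {}
    while buffer or len(stack) > 1:
        feats = {"s0": stack[-1], "b0": buffer[0] if buffer else None}  # toy features
        action = classify(feats)
        if action == "LEFT-ARC" and len(stack) > 2:
            dep = stack.pop(-2)      # second-topmost becomes dependent of the top
            heads[dep] = stack[-1]
        elif action == "RIGHT-ARC" and len(stack) > 1:
            dep = stack.pop()        # topmost becomes dependent of the new top
            heads[dep] = stack[-1]
        elif buffer:                 # SHIFT (also the fallback if the action is illegal)
            stack.append(buffer.pop(0))
        else:                        # buffer empty: force a reduction to terminate
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads
```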
35 We also train and evaluate the generative lexicalized parser of Charniak (2000) on its own, as well as in combination with the discriminative reranker of Charniak and Johnson (2005). [sent-102, score-0.241]
36 To facilitate comparisons between constituency and dependency parsers, we convert the output of the constituency parsers to labeled dependencies using the same procedure that is applied to the treebanks. [sent-108, score-1.007]
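One way to perform this conversion offline is with the dependency converter bundled with the Stanford parser; the invocation below follows the converter's documented command-line usage, but the jar and file names are placeholders and the exact class and flag names should be checked against the specific release being used.

```python
import subprocess

# Convert Penn-Treebank-style constituency trees into CoNLL-format Stanford
# dependencies. Class and flag names follow the converter's documented usage;
# verify them for your Stanford parser release (jar and file names are placeholders).
subprocess.run(
    [
        "java", "-cp", "stanford-parser.jar",
        "edu.stanford.nlp.trees.EnglishGrammaticalStructure",
        "-treeFile", "constituency_trees.mrg",  # one bracketed tree per line
        "-basic",                                # basic (tree-shaped) dependencies
        "-conllx",                               # CoNLL-X output for parser training
    ],
    check=True,
)
```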
37 While the constituency parsers used in our experiments view part-of-speech (POS) tagging as an integral part of parsing, the dependency parsers require the input to be tagged with a separate POS tagger. [sent-110, score-0.96]
38 Tagger and parser are always trained on the same data. [sent-112, score-0.202]
39 1 No Labeled Target Domain Data We first trained all parsers on the WSJ training set and evaluated their performance on the two domain-specific evaluation sets (newswire and questions). [sent-121, score-0.412]
40 As can be seen in the left columns of Table 1, all parsers perform very well on the WSJ development set. [sent-122, score-0.356]
41 , 2005) that constituency parsers are more accurate at producing dependencies than dependency parsers (at least when the dependencies were produced by a deterministic transformation of a constituency treebank, as is the case here). [sent-125, score-1.515]
42 Table 2: Parsing accuracies for parsers trained on newswire and question data and evaluated on a question test set. [sent-156, score-0.637]
43 might have expected, the accuracies are significantly lower; however, the drop for some of the parsers is shocking. [sent-157, score-0.526]
44 Most notably, the deterministic shift-reduce parsers lose almost 25% (absolute) on labeled accuracies, while the latent variable parsers lose around 12%. [sent-158, score-1.239]
45 Note also that even with gold POS tags, LAS is below 70% for our deterministic shift-reduce parser, suggesting that the drop in accuracy is primarily due to a syntactic shift rather than a lexical shift. [sent-159, score-0.256]
46 These low accuracies are especially disturbing when one considers that the average question in the evaluation set is only nine words long and therefore potentially much less ambiguous than WSJ sentences. [sent-160, score-0.214]
47 Overall, the dependency parsers seem to suffer more from the domain change than the constituency parsers. [sent-162, score-0.753]
48 This is not a limitation of dependency parsers in general. [sent-166, score-0.326]
49 Looking at the constituency parsers, we observe (footnote 3: the difference between our shift-reduce parser and the MaltParser is due to small differences in the feature sets) [sent-170, score-0.407]
50 that the lexicalized (reranking) parser of Charniak and Johnson (2005) loses more than the latent variable approach of Petrov et al. [sent-171, score-0.414]
51 Intuitively speaking, some of the latent variables seem to get allocated for modeling the few questions present in the training data, while the lexicalization contexts are not able to distinguish between declarative sentences and questions. [sent-176, score-0.374]
52 When the training and test data are processed this way, the lexicalized parser loses 1. [sent-179, score-0.285]
53 5% F1, while the latent variable parser loses only 0. [sent-180, score-0.375]
54 In the second experiment, we removed all questions from the WSJ training set and retrained both parsers. [sent-183, score-0.232]
55 The lexicalized parser came out ahead in this experiment, confirming our hypothesis that the latent variable model is better able to pick up the small amount of relevant evidence that is present in the WSJ training data (rather than being systematically ...). Footnote 4: The F1 scores were 52. [sent-185, score-0.37]
56 We now consider a situation where a small amount of labeled data (2,000 manually parsed sentences) from the domain of interest is available for training. [sent-192, score-0.246]
57 As Table 2 shows (left columns), even a modest amount of labeled data from the target domain can significantly boost parsing performance, giving double-digit improvements in some cases. [sent-195, score-0.328]
58 While not shown in the table, the parsing accuracies on the WSJ development set were largely unaffected by the additional training data. [sent-196, score-0.237]
59 The parsing accuracies of these domain-specific models are shown in the right columns of Table 2, and are significantly lower than those of models trained on the concatenated training sets. [sent-198, score-0.267]
60 They are often even lower than the results of parsers trained exclusively on the WSJ, indicating that 2,000 sentences are not sufficient to train accurate parsers, even for quite narrow domains. [sent-199, score-0.369]
61 4 Uptraining for Domain-Adaptation The results in the previous section suggest that parsers without global constraints have difficulties dealing with the syntactic differences between declarative sentences and questions. [sent-200, score-0.362]
62 , 2006), we propose to use automatically labeled target domain data to learn the target domain distribution directly. [sent-206, score-0.366]
63 Self-training The idea of training parsers on their own output has been around for as long as there have been statistical parsers, but typically does not work well at all (Charniak, 1997). [sent-209, score-0.326]
64 (2003) present co-training procedures for parsers and taggers respectively, which are effective when only very little labeled data is available. [sent-212, score-0.444]
65 (2006a) were the first to improve a state-of-the-art constituency parsing system by utilizing unlabeled data for self-training. [sent-214, score-0.409]
66 In subsequent work, they show that the same idea can be used for domain adaptation if the unlabeled data is chosen accordingly (McClosky et al. [sent-215, score-0.245]
67 Sagae and Tsujii (2007) co-train two dependency parsers by adding to the training data the automatically parsed sentences on which the parsers agree. [sent-217, score-0.797]
68 performance of the best parser, we want to build a more efficient parser that comes close to the accuracy of the best parser. [sent-241, score-0.202]
69 To do this, we parse the unlabeled data with our most accurate parser and generate noisy, but fairly accurate labels (parse trees) for the unlabeled data. [sent-242, score-0.554]
70 We refer to the parser used for producing the automatic labels as the base parser (unless otherwise noted, we used the latent variable parser of Petrov et al. [sent-243, score-0.803]
71 Because the most accurate base parsers are constituency parsers, we need to convert the parse trees to dependencies using the Stanford converter (see Section 2). [sent-245, score-0.7]
72 The automatically parsed sentences are appended to the labeled training data, and the shift-reduce parser (and the part-of-speech tagger) are trained on this new training set. [sent-246, score-0.362]
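The whole uptraining procedure described in this section fits into a few lines of pseudocode-like Python; every callable below (the base parser, the converter, and the two trainers) is a hypothetical stand-in for the corresponding tool, so this is only a sketch of the pipeline, not the paper's implementation.

```python
def uptrain(labeled_deps, unlabeled_sentences, base_parser,
            to_dependencies, train_tagger, train_shift_reduce):
    """Sketch of uptraining: train a fast deterministic parser on the output
    of a slower, more accurate base parser (here a constituency parser)."""
    # 1. Parse the unlabeled target-domain sentences with the accurate base parser.
    auto_trees = [base_parser.parse(s) for s in unlabeled_sentences]

    # 2. Convert the noisy constituency trees to labeled dependencies.
    auto_deps = [to_dependencies(t) for t in auto_trees]

    # 3. Append the automatically parsed sentences to the labeled training data.
    training_data = labeled_deps + auto_deps

    # 4. Retrain the POS tagger and the deterministic shift-reduce parser on it.
    return train_tagger(training_data), train_shift_reduce(training_data)
```

In this view, self-training is the degenerate case where the base parser is the fast parser itself, which is what Table 3 contrasts with uptraining.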
73 2 Varying amounts of unlabeled data Figure 2 shows the efficacy of uptraining as a function of the size of the unlabeled data. [sent-249, score-0.676]
74 Both labeled (LAS) and unlabeled accuracies (UAS) improve sharply when automatically parsed sentences from the target domain are added to the training data, and level off after 100,000 sentences. [sent-250, score-0.553]
75 3 Varying the base parser Table 3 then compares uptraining on the output of different base parsers to pure self-training. [sent-259, score-1.044]
76 In these experiments, the same set of 500,000 questions was parsed by different base parsers. [sent-260, score-0.312]
77 The automatic parses were then added to the labeled training data and the parser was retrained. [sent-261, score-0.32]
78 As the results show, self-training provides only modest improvements of less than 2%, while uptraining gives double-digit improvements in some cases. [sent-262, score-0.44]
79 Interestingly, there seems to be no substantial difference between uptraining on the output of a single latent variable parser (Petrov et al. [sent-263, score-0.771]
80 It appears that the roughly 1% accuracy difference between the two base parsers is not important for uptraining. [sent-265, score-0.364]
81 4 POS-less parsing Our uptraining procedure improves parse quality on out-of-domain data to the level of in-domain accuracy. [sent-267, score-0.526]
82 , 1992) to produce a deterministic hierarchical clustering of our input vocabulary. [sent-276, score-0.207]
83 Table 4: Parsing accuracies of uptrained parsers with and without part-of-speech tags and word cluster features. [sent-282, score-0.605]
84 This change makes our parser completely deterministic and enables us to process sentences in a single left-to-right pass. [sent-285, score-0.409]
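The replacement of POS tags by word-cluster features can be sketched as below; using bit-string prefixes of a hierarchical (Brown-style) clustering as features is the common way such clusters are fed to parsers, and the prefix lengths and the tiny cluster map here are illustrative assumptions, not the paper's exact feature set.

```python
# Sketch: derive parser features from a hierarchical word clustering instead of POS tags.
# `clusters` maps words to bit-string cluster ids (e.g. from Brown clustering).
def cluster_features(word, clusters, prefix_lengths=(4, 6, 10)):
    bits = clusters.get(word.lower(), "")              # unseen words get an empty id
    return {"cluster_prefix_%d" % k: bits[:k] for k in prefix_lengths}

toy_clusters = {"who": "0111", "manufactures": "10110010", "peugeot": "110010011101"}
print(cluster_features("manufactures", toy_clusters))
# {'cluster_prefix_4': '1011', 'cluster_prefix_6': '101100', 'cluster_prefix_10': '10110010'}
```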
85 5 Error Analysis To provide a better understanding of the challenges involved in parsing questions, we analyzed the errors made by our WSJ-trained shift-reduce parser and also compared them to the errors that are left after uptraining. [sent-286, score-0.356]
86 The parsing accuracies of our shift-reduce parser using gold POS tags are listed in the last rows of Tables 1 and 2. [sent-312, score-0.479]
87 Even with gold POS tags, the deterministic shift-reduce parser falls short of the accuracies of the constituency parsers (with automatic tags), presumably because the shift-reduce model is making only local decisions and is lacking the global constraints provided by the context-free grammar. [sent-313, score-1.091]
88 ” should be enzymes, but the WSJ-trained parser labels “What” as the nsubj, which makes sense in a statement but not in a question. [sent-324, score-0.232]
Figure 3: Example questions from the QuestionBank development set, shown with their labeled dependency parses (e.g., “What films featured …?” and “How many people did Randy Craft …?”). [sent-338, score-0.253] [sent-344, score-0.289]
91 the WSJ model often makes this mistake and therefore the precision is much lower when it doesn’t see more questions in the training data. [sent-347, score-0.232]
92 As a consequence, the WSJ model cannot predict this label in questions very well. [sent-358, score-0.232]
93 6 Conclusions We presented a method for domain adaptation of deterministic shift-reduce parsers. [sent-367, score-0.334]
94 We evaluated multiple state-of-the-art parsers on a question corpus and showed that parsing accuracies degrade substantially on this out-of-domain task. [sent-368, score-0.626]
95 Most notably, deterministic shift-reduce parsers have difficulty dealing with the modified word order and lose more than 20% in accuracy. [sent-369, score-0.571]
96 We then proposed a simple, yet very effective uptraining method for domain adaptation. [sent-370, score-0.44]
97 In a nutshell, we trained a deterministic shift-reduce parser on the output of a more accurate, but slower parser. [sent-371, score-0.439]
98 Uptraining with large amounts of unlabeled data gives improvements similar to having access to 2,000 labeled sentences from the target domain. [sent-372, score-0.274]
99 With 2,000 labeled questions and a large amount of unlabeled questions, uptraining is able to close the gap between in-domain and out-of-domain accuracy. [sent-373, score-0.908]
100 Dependency parsing and domain adaptation with LR models and parser ensembles. [sent-581, score-0.415]
wordName wordTfidf (topN-words)
[('uptraining', 0.44), ('parsers', 0.326), ('questionbank', 0.278), ('wsj', 0.233), ('questions', 0.232), ('deterministic', 0.207), ('constituency', 0.205), ('parser', 0.202), ('accuracies', 0.151), ('nnp', 0.135), ('unlabeled', 0.118), ('labeled', 0.118), ('dependency', 0.103), ('petrov', 0.095), ('nivre', 0.089), ('parsing', 0.086), ('domain', 0.086), ('nsubj', 0.082), ('mcdonald', 0.079), ('osbourne', 0.076), ('profession', 0.076), ('wrb', 0.076), ('latent', 0.073), ('charniak', 0.073), ('las', 0.068), ('pos', 0.067), ('maltparser', 0.065), ('question', 0.063), ('root', 0.062), ('mcclosky', 0.06), ('kill', 0.059), ('uas', 0.059), ('attr', 0.057), ('craft', 0.057), ('mstparser', 0.057), ('nnnsubjp', 0.057), ('ozzy', 0.057), ('shiftreduce', 0.057), ('uptrained', 0.057), ('wdt', 0.057), ('koo', 0.057), ('variable', 0.056), ('stanford', 0.055), ('vbd', 0.051), ('dependencies', 0.05), ('drop', 0.049), ('born', 0.049), ('randy', 0.049), ('qb', 0.049), ('sagae', 0.048), ('vbz', 0.048), ('nns', 0.046), ('loses', 0.044), ('amod', 0.044), ('wp', 0.043), ('accurate', 0.043), ('parsed', 0.042), ('adaptation', 0.041), ('marneffe', 0.041), ('oldest', 0.041), ('tags', 0.04), ('lexicalized', 0.039), ('abbreviated', 0.038), ('compl', 0.038), ('converter', 0.038), ('dobj', 0.038), ('doyle', 0.038), ('ental', 0.038), ('films', 0.038), ('jjs', 0.038), ('manufacture', 0.038), ('nmivcdreo', 0.038), ('peugeot', 0.038), ('popeye', 0.038), ('sbarq', 0.038), ('lose', 0.038), ('target', 0.038), ('base', 0.038), ('carreras', 0.037), ('declarative', 0.036), ('dep', 0.036), ('newswire', 0.034), ('errors', 0.034), ('enzymes', 0.033), ('osn', 0.033), ('tnt', 0.033), ('serving', 0.033), ('featured', 0.033), ('seem', 0.033), ('nn', 0.032), ('cluster', 0.031), ('dt', 0.031), ('labels', 0.03), ('slower', 0.03), ('columns', 0.03), ('tagger', 0.03), ('cubic', 0.03), ('whnp', 0.03), ('queries', 0.029), ('doesn', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
2 0.16607434 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
Author: Ekaterina Buyko ; Udo Hahn
Abstract: In state-of-the-art approaches to information extraction (IE), dependency graphs constitute the fundamental data structure for syntactic structuring and subsequent knowledge elicitation from natural language documents. The top-performing systems in the BioNLP 2009 Shared Task on Event Extraction all shared the idea to use dependency structures generated by a variety of parsers either directly or in some converted manner — and optionally modified their output to fit the special needs of IE. As there are systematic differences between various dependency representations being used in this competition, we scrutinize on different encoding styles for dependency information and their possible impact on solving several IE tasks. After assessing more or less established dependency representations such as the Stanford and CoNLL-X dependen— cies, we will then focus on trimming operations that pave the way to more effective IE. Our evaluation study covers data from a number of constituency- and dependency-based parsers and provides experimental evidence which dependency representations are particularly beneficial for the event extraction task. Based on empirical findings from our study we were able to achieve the performance of 57.2% F-score on the development data set of the BioNLP Shared Task 2009.
3 0.16496497 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
Author: Eugene Charniak
Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.
4 0.15043038 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
Author: Roi Reichart ; Ari Rappoport
Abstract: We introduce a novel training algorithm for unsupervised grammar induction, called Zoomed Learning. Given a training set T and a test set S, the goal of our algorithm is to identify subset pairs Ti, Si of T and S such that when the unsupervised parser is trained on a training subset Ti its results on its paired test subset Si are better than when it is trained on the entire training set T. A successful application of zoomed learning improves overall performance on the full test set S. We study our algorithm’s effect on the leading algorithm for the task of fully unsupervised parsing (Seginer, 2007) in three different English domains, WSJ, BROWN and GENIA, and show that it improves the parser F-score by up to 4.47%.
5 0.14941168 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
Author: Amarnag Subramanya ; Slav Petrov ; Fernando Pereira
Abstract: We describe a new scalable algorithm for semi-supervised training of conditional random fields (CRF) and its application to partof-speech (POS) tagging. The algorithm uses a similarity graph to encourage similar ngrams to have similar POS tags. We demonstrate the efficacy of our approach on a domain adaptation task, where we assume that we have access to large amounts of unlabeled data from the target domain, but no additional labeled data. The similarity graph is used during training to smooth the state posteriors on the target domain. Standard inference can be used at test time. Our approach is able to scale to very large problems and yields significantly improved target domain accuracy.
6 0.14029106 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars
7 0.14013091 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing
8 0.12369016 51 emnlp-2010-Function-Based Question Classification for General QA
9 0.11426165 74 emnlp-2010-Learning the Relative Usefulness of Questions in Community QA
10 0.11369541 114 emnlp-2010-Unsupervised Parse Selection for HPSG
11 0.11251215 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
12 0.11003923 104 emnlp-2010-The Necessity of Combining Adaptation Methods
13 0.084594987 38 emnlp-2010-Dual Decomposition for Parsing with Non-Projective Head Automata
14 0.083301909 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
15 0.079744898 88 emnlp-2010-On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing
16 0.079526164 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
17 0.075583756 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction
18 0.074740134 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa
19 0.067737274 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions
20 0.066595346 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning
topicId topicWeight
[(0, 0.232), (1, 0.158), (2, 0.228), (3, 0.072), (4, 0.0), (5, 0.187), (6, 0.129), (7, 0.191), (8, 0.105), (9, 0.079), (10, 0.054), (11, 0.031), (12, 0.187), (13, 0.205), (14, 0.01), (15, 0.175), (16, -0.054), (17, 0.066), (18, 0.088), (19, 0.023), (20, 0.1), (21, 0.097), (22, -0.069), (23, 0.015), (24, 0.22), (25, -0.061), (26, -0.033), (27, 0.086), (28, 0.105), (29, 0.02), (30, -0.024), (31, 0.015), (32, 0.046), (33, 0.109), (34, 0.012), (35, -0.014), (36, -0.054), (37, 0.033), (38, -0.105), (39, 0.043), (40, -0.013), (41, -0.015), (42, -0.023), (43, 0.0), (44, -0.069), (45, -0.039), (46, -0.024), (47, 0.034), (48, 0.006), (49, 0.09)]
simIndex simValue paperId paperTitle
same-paper 1 0.9722749 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
2 0.73978913 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
Author: Roi Reichart ; Ari Rappoport
Abstract: We introduce a novel training algorithm for unsupervised grammar induction, called Zoomed Learning. Given a training set T and a test set S, the goal of our algorithm is to identify subset pairs Ti, Si of T and S such that when the unsupervised parser is trained on a training subset Ti its results on its paired test subset Si are better than when it is trained on the entire training set T. A successful application of zoomed learning improves overall performance on the full test set S. We study our algorithm’s effect on the leading algorithm for the task of fully unsupervised parsing (Seginer, 2007) in three different English domains, WSJ, BROWN and GENIA, and show that it improves the parser F-score by up to 4.47%.
3 0.55153435 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing
Author: Jackie Chi Kit Cheung ; Gerald Penn
Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.
4 0.51690036 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
Author: Ekaterina Buyko ; Udo Hahn
Abstract: In state-of-the-art approaches to information extraction (IE), dependency graphs constitute the fundamental data structure for syntactic structuring and subsequent knowledge elicitation from natural language documents. The top-performing systems in the BioNLP 2009 Shared Task on Event Extraction all shared the idea to use dependency structures generated by a variety of parsers either directly or in some converted manner — and optionally modified their output to fit the special needs of IE. As there are systematic differences between various dependency representations being used in this competition, we scrutinize on different encoding styles for dependency information and their possible impact on solving several IE tasks. After assessing more or less established dependency representations such as the Stanford and CoNLL-X dependen— cies, we will then focus on trimming operations that pave the way to more effective IE. Our evaluation study covers data from a number of constituency- and dependency-based parsers and provides experimental evidence which dependency representations are particularly beneficial for the event extraction task. Based on empirical findings from our study we were able to achieve the performance of 57.2% F-score on the development data set of the BioNLP Shared Task 2009.
5 0.48566046 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
Author: Eugene Charniak
Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.
6 0.46589741 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars
7 0.46449617 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
8 0.44931772 114 emnlp-2010-Unsupervised Parse Selection for HPSG
9 0.44014227 74 emnlp-2010-Learning the Relative Usefulness of Questions in Community QA
10 0.40806708 51 emnlp-2010-Function-Based Question Classification for General QA
11 0.38619614 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
12 0.37507871 104 emnlp-2010-The Necessity of Combining Adaptation Methods
13 0.30159912 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
14 0.29197824 113 emnlp-2010-Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing
15 0.2827116 88 emnlp-2010-On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing
16 0.27744734 38 emnlp-2010-Dual Decomposition for Parsing with Non-Projective Head Automata
17 0.27446008 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
18 0.27210274 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval
20 0.26020649 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech
topicId topicWeight
[(3, 0.01), (10, 0.039), (12, 0.03), (14, 0.012), (29, 0.138), (30, 0.015), (32, 0.012), (52, 0.02), (56, 0.06), (62, 0.031), (66, 0.139), (72, 0.041), (76, 0.042), (77, 0.011), (79, 0.017), (87, 0.022), (89, 0.017), (91, 0.278)]
simIndex simValue paperId paperTitle
same-paper 1 0.7831232 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
2 0.59696078 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
Author: Samuel Brody
Abstract: We reveal a previously unnoticed connection between dependency parsing and statistical machine translation (SMT), by formulating the dependency parsing task as a problem of word alignment. Furthermore, we show that two well known models for these respective tasks (DMV and the IBM models) share common modeling assumptions. This motivates us to develop an alignment-based framework for unsupervised dependency parsing. The framework (which will be made publicly available) is flexible, modular and easy to extend. Using this framework, we implement several algorithms based on the IBM alignment models, which prove surprisingly effective on the dependency parsing task, and demonstrate the potential of the alignment-based approach.
3 0.59609663 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics
Author: Joseph Reisinger ; Raymond Mooney
Abstract: We introduce tiered clustering, a mixture model capable of accounting for varying degrees of shared (context-independent) feature structure, and demonstrate its applicability to inferring distributed representations of word meaning. Common tasks in lexical semantics such as word relatedness or selectional preference can benefit from modeling such structure: Polysemous word usage is often governed by some common background metaphoric usage (e.g. the senses of line or run), and likewise modeling the selectional preference of verbs relies on identifying commonalities shared by their typical arguments. Tiered clustering can also be viewed as a form of soft feature selection, where features that do not contribute meaningfully to the clustering can be excluded. We demonstrate the applicability of tiered clustering, highlighting particular cases where modeling shared structure is beneficial and where it can be detrimental.
4 0.59596825 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice
Author: Samidh Chatterjee ; Nicola Cancedda
Abstract: Minimum Error Rate Training is the algorithm for log-linear model parameter training most used in state-of-the-art Statistical Machine Translation systems. In its original formulation, the algorithm uses N-best lists output by the decoder to grow the Translation Pool that shapes the surface on which the actual optimization is performed. Recent work has been done to extend the algorithm to use the entire translation lattice built by the decoder, instead of N-best lists. We propose here a third, intermediate way, consisting in growing the translation pool using samples randomly drawn from the translation lattice. We empirically measure a systematic im- provement in the BLEU scores compared to training using N-best lists, without suffering the increase in computational complexity associated with operating with the whole lattice.
5 0.59414393 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
Author: Hui Zhang ; Min Zhang ; Haizhou Li ; Eng Siong Chng
Abstract: This paper studies two issues, non-isomorphic structure translation and target syntactic structure usage, for statistical machine translation in the context of forest-based tree to tree sequence translation. For the first issue, we propose a novel non-isomorphic translation framework to capture more non-isomorphic structure mappings than traditional tree-based and tree-sequence-based translation methods. For the second issue, we propose a parallel space searching method to generate hypothesis using tree-to-string model and evaluate its syntactic goodness using tree-to-tree/tree sequence model. This not only reduces the search complexity by merging spurious-ambiguity translation paths and solves the data sparseness issue in training, but also serves as a syntax-based target language model for better grammatical generation. Experiment results on the benchmark data show our proposed two solutions are very effective, achieving significant performance improvement over baselines when applying to different translation models.
6 0.59218884 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation
7 0.59215599 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks
8 0.58867395 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
9 0.58783644 84 emnlp-2010-NLP on Spoken Documents Without ASR
10 0.58725899 87 emnlp-2010-Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space
11 0.58681369 114 emnlp-2010-Unsupervised Parse Selection for HPSG
12 0.58542895 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities
13 0.58542603 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
14 0.58527386 89 emnlp-2010-PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts
15 0.58510751 103 emnlp-2010-Tense Sense Disambiguation: A New Syntactic Polysemy Task
16 0.58458722 63 emnlp-2010-Improving Translation via Targeted Paraphrasing
17 0.58436275 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding
18 0.58318985 109 emnlp-2010-Translingual Document Representations from Discriminative Projections
19 0.58190471 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar
20 0.58147174 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media