acl acl2010 acl2010-114 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jonathan K. Kummerfeld ; Jessika Roesner ; Tim Dawborn ; James Haggerty ; James R. Curran ; Stephen Clark
Abstract: We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. [sent-6, score-0.59]
2 The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. [sent-7, score-1.707]
3 Since the supertagger supplies fewer supertags overall, the parsing speed is increased. [sent-8, score-1.027]
4 We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. [sent-9, score-0.894]
5 We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text. [sent-10, score-0.62]
6 Bangalore and Joshi (1999) call supertagging "almost parsing" because of the significant reduction in ambiguity which occurs once the supertags have been assigned. [sent-17, score-0.565]
7 In this paper, we focus on the CCG parser and supertagger described in Clark and Curran (2007). [sent-18, score-0.807]
8 The supertagger feeds lexical categories to the parser, and the two interact, sometimes using multiple passes over a sentence. [sent-24, score-0.815]
9 If a spanning analysis cannot be found by the parser, the number of lexical categories supplied by the supertagger is increased. [sent-25, score-0.867]
10 The supertagger-parser interaction influences speed in two ways: first, the larger the lexical ambiguity, the more derivations the parser must consider; second, each further pass is as costly as parsing a whole extra sentence. [sent-26, score-0.658]
11 Our goal is to increase parsing speed without loss of accuracy. [sent-27, score-0.33]
12 The technique we use is a form of self-training, in which the output of the parser is used to train the supertagger component. [sent-28, score-0.807]
13 The adaptive supertagger produces lexical categories that the parser would have used in the final derivation when using the baseline model. [sent-35, score-1.069]
14 However, it does so with much lower ambiguity levels, and potentially during an earlier pass, which means sentences are parsed faster. [sent-36, score-0.384]
15 By increasing the ambiguity level of the adaptive models to match the baseline system, we can also slightly increase supertagging accuracy, which can lead to higher parsing accuracy. [sent-37, score-0.634]
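As a minimal sketch of the self-training loop just described (the names `parse` and `train_supertagger` are hypothetical stand-ins for the real C&C components, not the authors' implementation):

```python
def adapt_supertagger(raw_sentences, gold_corpus, parse, train_supertagger):
    """Sketch of adaptive supertagging: the categories the parser actually
    used in its highest-scoring derivations become extra training data."""
    parser_annotated = []
    for sentence in raw_sentences:
        derivation = parse(sentence)  # (word, category) pairs, or None on failure
        if derivation is not None:
            # Keep the categories the parser used, not gold-standard tags.
            parser_annotated.append(derivation)
    # Train on the union of the gold-standard corpus and the parser output.
    return train_supertagger(gold_corpus + parser_annotated)
```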
16 These were used to create new supertagging models adapted to the different domains. [sent-44, score-0.33]
17 The self-training method of adapting the supertagger to suit the parser increased parsing speed by more than 50% across all three domains, without loss of accuracy. [sent-45, score-1.162]
18 Using an adapted supertagger with ambiguity levels tuned to match the baseline system, we were also able to increase F-score on labelled grammatical relations by 0. [sent-46, score-1.0]
19 Figure 1 gives two sentences and their CCG derivations, showing how some of the syntactic ambiguity is transferred to the supertagging component in a lexicalised grammar. [sent-51, score-0.508]
20 Either we need a tagging model that can resolve this ambiguity, or both lexical categories must be supplied to the parser which can then attempt to resolve the ambiguity by eventually selecting between them. [sent-53, score-0.622]
21 The C&C supertagger is similar to the Ratnaparkhi (1996) tagger, using features based on words and POS tags in a five-word window surrounding the target word, and defining a local probability distribution over supertags for each word in the sentence, given the previous two supertags. [sent-56, score-0.737]
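A rough sketch of that feature set, with invented feature-name conventions and the assumption that the previous-tag history is padded with boundary markers at the sentence start:

```python
def window_features(words, pos, prev_tags, i):
    """Features for word i: words and POS tags in a five-word window,
    plus the previous one and two supertags (a second-order history).
    prev_tags must hold at least two entries (pad at sentence start)."""
    feats = []
    for offset in range(-2, 3):
        j = i + offset
        w = words[j] if 0 <= j < len(words) else "<PAD>"
        p = pos[j] if 0 <= j < len(pos) else "<PAD>"
        feats.append(f"w[{offset:+d}]={w}")
        feats.append(f"p[{offset:+d}]={p}")
    feats.append(f"t[-1]={prev_tags[-1]}")
    feats.append(f"t[-2]t[-1]={prev_tags[-2]}|{prev_tags[-1]}")
    return feats
```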
22 , 1999), in which more than one supertag can be assigned to a word; however, as more supertags are supplied by the supertagger, parsing efficiency decreases (Chen et al. [sent-61, score-0.348]
23 The supertagger assigns to a word all lexical categories whose probabilities are within some factor, β, of the most probable category for that word. [sent-65, score-0.884]
24 When the supertagger is integrated with the C&C parser, several progressively lower β values are considered. [sent-66, score-0.68]
25 If a sentence is not parsed on one pass then the parser attempts to parse the sentence again with a lower β value, using a larger set of categories from the supertagger. [sent-67, score-0.491]
26 Since most sentences are parsed at the first level (in which the average number of supertags assigned to each word is only slightly greater than one), this provides some of the speed benefit of single tagging, but without loss of coverage (Clark and Curran, 2004). [sent-68, score-0.469]
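The β cutoff and the multi-pass back-off can be sketched as follows; the β schedule is illustrative and `chart_parse` stands in for the real parser:

```python
# Illustrative beta schedule, from tight to loose cutoffs.
BETAS = [0.075, 0.03, 0.01, 0.005, 0.001]

def assign_supertags(tag_probs, beta):
    """tag_probs: one dict {category: probability} per word. Keep every
    category whose probability is within a factor beta of the word's best."""
    tag_sets = []
    for probs in tag_probs:
        best = max(probs.values())
        tag_sets.append({c for c, p in probs.items() if p >= beta * best})
    return tag_sets

def parse_with_backoff(tag_probs, chart_parse):
    """Retry with a lower beta (a larger category set per word) until the
    parser finds a spanning analysis; each extra pass costs roughly as
    much as parsing a whole extra sentence."""
    for beta in BETAS:
        derivation = chart_parse(assign_supertags(tag_probs, beta))
        if derivation is not None:
            return derivation
    return None  # no spanning analysis at any level
```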
27 The use of parser output for supertagger training has been explored for LTAG by Sarkar (2007). [sent-72, score-0.84]
28 However, the focus of that work was on improving parser and supertagger accuracy rather than speed. [sent-73, score-0.894]
29 The first category in each column is correct and the categories used by the parser are marked in bold. [sent-75, score-0.35]
30 Sarkar (2001) applied co-training to LTAG parsing, in which the supertagger and parser provide the two views. [sent-80, score-0.807]
31 3 Adaptive Supertagging The purpose of the supertagger is to cut down the search space for the parser by reducing the set of categories that must be considered for each word. [sent-91, score-0.925]
32 A perfect supertagger would assign the correct category to every word. [sent-92, score-0.713]
33 In the final derivation, the parser uses one category from each set, and it is important to note that having the correct category in the set does not guarantee that the parser will use it. [sent-95, score-0.464]
34 The usual target of the supertagging task is to produce the top row of categories in Figure 2, the correct categories. [sent-97, score-0.362]
35 The reason speed will be improved is that we can construct models that will constrain the set of possible derivations more than the baseline model. [sent-100, score-0.323]
36 We can construct these models because we can obtain much more of our target output, parser-annotated sentences, than we could for the gold-standard supertagging task. [sent-101, score-0.328]
37 The new target data will contain tagging errors, and so supertagging accuracy measured against the correct categories may decrease. [sent-102, score-0.536]
38 However, parsing accuracy will not decrease since the parser will still receive the categories it would have used, and will therefore be able to form the same highest-scoring derivation (and hence will choose it). [sent-104, score-0.519]
39 To test this idea we parsed millions of sentences in three domains, producing new data annotated with the categories that the parser used with the baseline model. [sent-105, score-0.421]
40 (Footnote 2: Two of the categories for such have been left out for reasons of space, and the correct category for watch has been included for expository reasons. [sent-106, score-0.876]
41 The fact that the supertagger does not supply this category is the reason that the parser does not analyse the sentence correctly.) [sent-107, score-0.315]
42 We constructed new supertagging models that are adapted to suit the parser by training on the combination of these sets and the standard training corpora. [sent-108, score-0.622]
43 We applied standard evaluation metrics for speed and accuracy, and explored the source of the changes in parsing performance. [sent-109, score-0.325]
44 For supertagger evaluation, one thousand sentences were manually annotated with CCG lexical categories and POS tags. [sent-121, score-0.824]
45 As gold standard data for supertagger evaluation we have used supertagged GENIA data (Kim et al. [sent-127, score-0.673]
46 Table 1: Statistics for sentences in the supertagger training data. [sent-175, score-0.739]
47 These sentences were POS-tagged and parsed twice, once as for the newswire and Wikipedia data, and then again, using the bio-specific models developed by Rimell and Clark (2009). [sent-180, score-0.288]
48 2 Speed evaluation data For speed evaluation we held out three sets of sentences from each domain-specific corpus. [sent-183, score-0.301]
49 Speeds on these length-controlled sets were combined to calculate an overall parsing speed for the text in each domain. [sent-185, score-0.318]
50 The test data was excluded from training data for the supertagger for all of the newswire and Wikipedia models. [sent-189, score-0.765]
51 The accuracy of supertagging is measured by multi-tagging at the first β level and counting a word as correct if the correct tag is among the assigned tags. [sent-191, score-0.391]
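A minimal sketch of these two measurements, multi-tag accuracy and average tagging ambiguity (the function names are ours):

```python
def multitag_accuracy(assigned_sets, gold_tags):
    """A word counts as correct if its gold category appears anywhere in
    the set the supertagger assigned at the first beta level."""
    hits = sum(gold in tags for tags, gold in zip(assigned_sets, gold_tags))
    return hits / len(gold_tags)

def tagging_ambiguity(assigned_sets):
    """Average number of categories assigned per word."""
    return sum(len(tags) for tags in assigned_sets) / len(assigned_sets)
```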
52 For the biomedical parser evaluation we have used the parsing model and grammatical relation conversion script from Rimell and Clark (2009). [sent-192, score-0.426]
53 1 Tagging ambiguity optimisation The number of lexical categories assigned to a word by the CCG supertagger depends on the probabilities calculated for each category and the β level being used. [sent-203, score-1.05]
54 If the categories assigned are too restrictive to enable a spanning analysis, the system makes another pass with a lower β level, resulting in a higher tagging ambiguity. [sent-210, score-0.312]
55 For accuracy optimisation experiments we tune the β levels to produce the same average tagging ambiguity as the baseline model on Section 00 of CCGbank. [sent-218, score-0.457]
56 By matching the ambiguity of the default model, we can increase accuracy at the cost of some of the speed improvements the new models obtain. [sent-220, score-0.522]
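Matching the baseline's average ambiguity amounts to searching for a suitable β. A simple sketch using bisection, reusing `assign_supertags` and `tagging_ambiguity` from the sketches above (the search bounds and iteration count are arbitrary):

```python
def tune_beta(tag_probs, target_ambiguity, lo=1e-6, hi=1.0, iters=40):
    """Bisect for a beta whose first-pass ambiguity matches the baseline's.
    Ambiguity decreases monotonically as beta grows (a tighter cutoff
    keeps fewer categories per word), so bisection applies."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if tagging_ambiguity(assign_supertags(tag_probs, mid)) > target_ambiguity:
            lo = mid  # too many categories: tighten the cutoff
        else:
            hi = mid  # too few categories: loosen the cutoff
    return (lo + hi) / 2
```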
57 6 Results We have performed four primary sets of experiments to explore the ability of an adaptive supertagger to improve parsing speed or accuracy. [sent-221, score-1.015]
58 In the first two experiments, we explore performance on the newswire domain, which is the source of training data for the parsing model and the baseline supertagging model. [sent-222, score-0.478]
59 This should lead to an increase in speed as the extra training data means the models are more confident and so have lower ambiguity than the baseline model for a given β value. [sent-226, score-0.627]
60 The BFGS, GIS and MIRA models produced mixed results, but no statistically significant decrease in accuracy, and as the amount of parser-annotated data was increased, parsing speed increased by up to 85%. [sent-350, score-0.367]
61 In Table 3, we have aggregated these measurements based on the change in the pass at which the sentence is parsed, and how the tagging ambiguity changes on that pass. [sent-352, score-0.395]
62 For sentences parsed on two different passes the ambiguity comparison is at the earlier pass. [sent-353, score-0.374]
63 The “Total Time Change” section of the table is the change in parsing time for sentences of that type when parsing ten thousand sentences from the corpus. [sent-354, score-0.309]
64 72% of sentences are parsed on the same pass, but with lower tag ambiguity (5th row in Table 3). [sent-357, score-0.41]
65 Three to six times as many sentences are parsed on an earlier pass than are parsed on a later pass. [sent-359, score-0.373]
66 At the same time, the average gain for sentences parsed earlier is almost always larger than the average cost for sentences parsed later. [sent-361, score-0.364]
67 In fact, despite making up only 7% of sentences in the set, those parsed earlier with lower ambiguity provide 50% of the speed improvement. [sent-363, score-0.595]
68 We may expect these sentences to be parsed in approximately the same amount of time, and this is the case for the short set, but not for the two larger sets, where we see an increase in parsing time. [sent-365, score-0.284]
69 2 Newswire accuracy optimised Any decrease in tagging ambiguity will generally lead to a decrease in accuracy. [sent-368, score-0.407]
70 Unlike the supertagger, the parser will exclude categories that cannot be used in a derivation. [sent-370, score-0.762]
71 In the previous section, we saw that training the supertagger on parser output allowed us to develop models that produced the same categories, despite lower tagging ambiguity. [sent-371, score-0.998]
72 Since they were trained on the categories the parser was able to use in derivations, these models should also now be providing categories that are more likely to be useful. [sent-372, score-0.468]
73 When the default β values were used, ambiguity dropped consistently as more parser-annotated data was used, and category accuracy dropped in the same way. [sent-478, score-0.354]
74 Interestingly, while the decrease in supertag accuracy in the previous experiment did not translate into a decrease in F-score, the increase in tag accuracy here does translate into an increase in F-score. [sent-480, score-0.496]
75 This indicates that the supertagger is adapting to suit the parser. [sent-481, score-0.709]
76 In the previous experiment, the supertagger was still providing the categories the parser would have used with the baseline supertagging model, but it provided fewer other categories. [sent-482, score-1.203]
77 Since the parser is not a perfect supertagger, these other categories may in fact have been incorrect, and so supertagger accuracy goes down without changing parsing results. [sent-483, score-1.735]
78 Here we have allowed the supertagger to assign extra categories, which will only increase its accuracy. [sent-484, score-0.748]
79 First, our supertagger is more accurate, and so the parser is more likely to receive category sets that can be combined into the correct derivation. [sent-486, score-0.904]
80 Also, the supertagger has been trained on categories that the parser is able to use in derivations, which means they are more productive. [sent-487, score-0.959]
81 Three extra training sets were created by annotating newswire sentences with supertags using the baseline supertagging model. [sent-512, score-0.646]
82 The parser has access to a range of information that the supertagger does not, producing a different view of the data that the supertagger can productively learn from. [sent-585, score-1.451]
83 4 Cross-domain speed improvement When applied out of domain, parsers are typically slower and less accurate (Gildea, 2001). [sent-587, score-0.306]
84 Note that for some of the results presented here it may appear that the C&C parser does not lose speed when out of domain, since the Wikipedia and biomedical corpora contain shorter sentences on average than the news corpus. [sent-589, score-0.595]
85 For speed improvement these were MIRA models trained on 4,000,000 parser-annotated sentences from the target domain. [sent-592, score-0.391]
86 In particular, note that models trained on Wikipedia or the biomedical data produce lower F-scores3 than the baseline on newswire. [sent-594, score-0.298]
87 The changes in tagging ambiguity and accuracy also show that adaptation has occurred. [sent-596, score-0.358]
88 In all cases, the new models have lower tagging ambiguity, and lower supertag accuracy. [sent-597, score-0.292]
89 However, on the corpus of the extra data, the performance of the adapted models is comparable to the baseline model, which means the parser is probably still receiving the same categories that it used from the sets provided by the baseline system. [sent-598, score-0.527]
90 5 Cross-domain accuracy optimised The ambiguity tuning method used to improve accuracy on the newspaper domain can also be applied to the models trained on other domains. [sent-600, score-0.472]
91 In Table 7, we have tested models trained using GIS and 400,000 sentences of parsed target-domain text, with β levels tuned to match ambiguity with the baseline. [sent-601, score-0.44]
92 The accuracy presented so far for the biomedical domain … (Footnote 3: the F-scores for Wikipedia and biomedical text are reported to only three significant figures, as only 300 and 500 sentences respectively were available for parser evaluation.) [sent-607, score-0.471]
93 Models were trained with GIS and 4,000,000 extra sentences, and are tested using a POS-tagger trained on biomedical data. [sent-616, score-0.291]
94 Table 10 shows the results of adding Rimell and Clark's gold standard biomedical supertag data and using their biomedical POS-tagger. [sent-619, score-0.416]
95 The table also shows how accuracy can be further improved by adding our parser-annotated data from the biomedical domain as well as the additional gold standard data. [sent-620, score-0.287]
96 7 Conclusion This work has demonstrated that an adapted supertagger can improve parsing speed and accuracy. [sent-621, score-0.985]
97 The purpose of the supertagger is to reduce the search space for the parser. [sent-622, score-0.644]
98 By training the supertagger on parser output, we allow the parser to reach the derivation it would have found, sooner. [sent-623, score-1.033]
99 This approach also enables domain adaptation, improving speed and accuracy outside the original domain of the parser. [sent-624, score-0.38]
100 Optimising for speed or accuracy can be achieved by modifying the β levels used by the supertagger, which control the lexical category ambiguity at each level used by the parser. [sent-632, score-0.6]
wordName wordTfidf (topN-words)
[('supertagger', 0.644), ('supertagging', 0.244), ('speed', 0.211), ('parser', 0.163), ('biomedical', 0.159), ('ccg', 0.154), ('ambiguity', 0.149), ('np', 0.132), ('rimell', 0.129), ('gis', 0.129), ('nanc', 0.123), ('categories', 0.118), ('clark', 0.11), ('mira', 0.105), ('parsed', 0.103), ('supertag', 0.098), ('supertags', 0.093), ('newswire', 0.088), ('tagging', 0.087), ('accuracy', 0.087), ('parsing', 0.079), ('supplied', 0.078), ('wikipedia', 0.077), ('pass', 0.071), ('ccgbank', 0.07), ('category', 0.069), ('pss', 0.065), ('curran', 0.064), ('extra', 0.064), ('sentences', 0.062), ('bangalore', 0.061), ('bfgs', 0.061), ('tag', 0.06), ('sents', 0.058), ('levels', 0.057), ('lexicalised', 0.053), ('adaptive', 0.053), ('mcclosky', 0.051), ('adapted', 0.051), ('srinivas', 0.049), ('parserannotated', 0.049), ('stephen', 0.049), ('sarkar', 0.048), ('anoop', 0.047), ('dcl', 0.047), ('derivations', 0.043), ('supertaggers', 0.043), ('watch', 0.043), ('optimisation', 0.043), ('decrease', 0.042), ('domain', 0.041), ('increase', 0.04), ('newspaper', 0.039), ('sec', 0.038), ('bio', 0.037), ('lower', 0.036), ('changes', 0.035), ('suit', 0.035), ('models', 0.035), ('baseline', 0.034), ('trained', 0.034), ('james', 0.034), ('earlier', 0.034), ('steedman', 0.034), ('sydney', 0.033), ('intel', 0.033), ('training', 0.033), ('bioinfer', 0.033), ('darroch', 0.033), ('ltag', 0.033), ('pizza', 0.033), ('tokenised', 0.033), ('perceptron', 0.031), ('timing', 0.031), ('aravind', 0.031), ('domains', 0.03), ('adapting', 0.03), ('derivation', 0.03), ('hockenmaier', 0.029), ('pyysalo', 0.029), ('supertagged', 0.029), ('sets', 0.028), ('hpsg', 0.028), ('collins', 0.028), ('accurate', 0.028), ('laura', 0.027), ('lexical', 0.027), ('change', 0.027), ('hollingshead', 0.026), ('expository', 0.026), ('genia', 0.026), ('mcintosh', 0.026), ('parc', 0.026), ('probable', 0.026), ('passes', 0.026), ('measurements', 0.026), ('parsers', 0.026), ('grammatical', 0.025), ('views', 0.025), ('confident', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 114 acl-2010-Faster Parsing by Supertagger Adaptation
Author: Jonathan K. Kummerfeld ; Jessika Roesner ; Tim Dawborn ; James Haggerty ; James R. Curran ; Stephen Clark
Abstract: We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
2 0.24968281 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar
Author: Timothy A. D. Fowler ; Gerald Penn
Abstract: The definition of combinatory categorial grammar (CCG) in the literature varies quite a bit from author to author. However, the differences between the definitions are important in terms of the language classes of each CCG. We prove that a wide range of CCGs are strongly context-free, including the CCG of CCGbank and of the parser of Clark and Curran (2007). In light of these new results, we train the PCFG parser of Petrov and Klein (2007) on CCGbank and achieve state of the art results in supertagging accuracy, PARSEVAL measures and dependency accuracy.
3 0.22640799 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
Author: Sujith Ravi ; Jason Baldridge ; Kevin Knight
Abstract: We combine two complementary ideas for learning supertaggers from highly ambiguous lexicons: grammar-informed tag transitions and models minimized via integer programming. Each strategy on its own greatly improves performance over basic expectation-maximization training with a bitag Hidden Markov Model, which we show on the CCGbank and CCG-TUT corpora. The strategies provide further error reductions when combined. We describe a new two-stage integer programming strategy that efficiently deals with the high degree of ambiguity on these datasets while obtaining the full effect of model minimization.
4 0.19551539 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation
Author: Matthew Honnibal ; James R. Curran ; Johan Bos
Abstract: Once released, treebanks tend to remain unchanged despite any shortcomings in their depth of linguistic analysis or coverage of specific phenomena. Instead, separate resources are created to address such problems. In this paper we show how to improve the quality of a treebank, by integrating resources and implementing improved analyses for specific constructions. We demonstrate this rebanking process by creating an updated version of CCGbank that includes the predicate-argument structure of both verbs and nouns, baseNP brackets, verb-particle constructions, and restrictive and non-restrictive nominal modifiers; and evaluate the impact of these changes on a statistical parser.
5 0.14494537 228 acl-2010-The Importance of Rule Restrictions in CCG
Author: Marco Kuhlmann ; Alexander Koller ; Giorgio Satta
Abstract: Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and crosslinguistic variation is isolated in the lexicon. In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG). Our result also carries over to a multi-modal extension of CCG.
6 0.086615413 169 acl-2010-Learning to Translate with Source and Target Syntax
7 0.083640218 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
8 0.080143854 260 acl-2010-Wide-Coverage NLP with Linguistically Expressive Grammars
9 0.079534195 130 acl-2010-Hard Constraints for Grammatical Function Labelling
10 0.078678668 200 acl-2010-Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
11 0.075514734 99 acl-2010-Efficient Third-Order Dependency Parsers
12 0.071807377 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
13 0.069377616 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
14 0.067757048 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
15 0.067326158 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
16 0.065839216 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification
17 0.065485328 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection
18 0.064494669 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
19 0.063867882 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
20 0.062722534 118 acl-2010-Fine-Grained Tree-to-String Translation Rule Extraction
topicId topicWeight
[(0, -0.194), (1, 0.001), (2, 0.09), (3, -0.018), (4, -0.062), (5, -0.094), (6, 0.168), (7, 0.033), (8, 0.18), (9, 0.078), (10, 0.157), (11, 0.059), (12, -0.209), (13, 0.08), (14, -0.049), (15, 0.019), (16, -0.008), (17, 0.006), (18, -0.049), (19, -0.043), (20, 0.062), (21, 0.014), (22, 0.005), (23, 0.096), (24, -0.079), (25, 0.014), (26, 0.061), (27, -0.078), (28, -0.081), (29, -0.013), (30, -0.005), (31, 0.054), (32, 0.054), (33, 0.034), (34, -0.012), (35, 0.011), (36, 0.016), (37, -0.002), (38, -0.023), (39, 0.022), (40, 0.084), (41, 0.002), (42, -0.005), (43, 0.014), (44, -0.022), (45, 0.025), (46, 0.013), (47, 0.057), (48, -0.063), (49, 0.09)]
simIndex simValue paperId paperTitle
same-paper 1 0.91492844 114 acl-2010-Faster Parsing by Supertagger Adaptation
Author: Jonathan K. Kummerfeld ; Jessika Roesner ; Tim Dawborn ; James Haggerty ; James R. Curran ; Stephen Clark
Abstract: We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
2 0.8646751 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar
Author: Timothy A. D. Fowler ; Gerald Penn
Abstract: The definition of combinatory categorial grammar (CCG) in the literature varies quite a bit from author to author. However, the differences between the definitions are important in terms of the language classes of each CCG. We prove that a wide range of CCGs are strongly context-free, including the CCG of CCGbank and of the parser of Clark and Curran (2007). In light of these new results, we train the PCFG parser of Petrov and Klein (2007) on CCGbank and achieve state of the art results in supertagging accuracy, PARSEVAL measures and dependency accuracy.
3 0.80406058 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
Author: Sujith Ravi ; Jason Baldridge ; Kevin Knight
Abstract: We combine two complementary ideas for learning supertaggers from highly ambiguous lexicons: grammar-informed tag transitions and models minimized via integer programming. Each strategy on its own greatly improves performance over basic expectation-maximization training with a bitag Hidden Markov Model, which we show on the CCGbank and CCG-TUT corpora. The strategies provide further error reductions when combined. We describe a new two-stage integer programming strategy that efficiently deals with the high degree of ambiguity on these datasets while obtaining the full effect of model minimization.
4 0.77257025 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation
Author: Matthew Honnibal ; James R. Curran ; Johan Bos
Abstract: Once released, treebanks tend to remain unchanged despite any shortcomings in their depth of linguistic analysis or coverage of specific phenomena. Instead, separate resources are created to address such problems. In this paper we show how to improve the quality of a treebank, by integrating resources and implementing improved analyses for specific constructions. We demonstrate this rebanking process by creating an updated version of CCGbank that includes the predicate-argument structure of both verbs and nouns, baseNP brackets, verb-particle constructions, and restrictive and non-restrictive nominal modifiers; and evaluate the impact of these changes on a statistical parser.
5 0.69887543 228 acl-2010-The Importance of Rule Restrictions in CCG
Author: Marco Kuhlmann ; Alexander Koller ; Giorgio Satta
Abstract: Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and crosslinguistic variation is isolated in the lexicon. In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG). Our result also carries over to a multi-modal extension of CCG.
6 0.50786775 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation
7 0.48286602 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection
8 0.46511158 200 acl-2010-Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
9 0.44212669 260 acl-2010-Wide-Coverage NLP with Linguistically Expressive Grammars
10 0.42485574 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
11 0.4094997 99 acl-2010-Efficient Third-Order Dependency Parsers
12 0.40326262 19 acl-2010-A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation
13 0.39943415 130 acl-2010-Hard Constraints for Grammatical Function Labelling
14 0.38590726 96 acl-2010-Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-Of-Speech Tagging
15 0.3777597 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
16 0.35617539 212 acl-2010-Simple Semi-Supervised Training of Part-Of-Speech Taggers
17 0.35475343 185 acl-2010-Open Information Extraction Using Wikipedia
18 0.34298727 76 acl-2010-Creating Robust Supervised Classifiers via Web-Scale N-Gram Data
19 0.34141886 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
20 0.32783595 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
topicId topicWeight
[(2, 0.225), (14, 0.022), (25, 0.12), (33, 0.01), (39, 0.016), (42, 0.022), (59, 0.128), (73, 0.057), (76, 0.015), (78, 0.068), (80, 0.015), (83, 0.077), (84, 0.019), (98, 0.118)]
simIndex simValue paperId paperTitle
1 0.8823061 30 acl-2010-An Open-Source Package for Recognizing Textual Entailment
Author: Milen Kouylekov ; Matteo Negri
Abstract: This paper presents a general-purpose open source package for recognizing Textual Entailment. The system implements a collection of algorithms, providing a configurable framework to quickly set up a working environment to experiment with the RTE task. Fast prototyping of new solutions is also allowed by the possibility to extend its modular architecture. We present the tool as a useful resource to approach the Textual Entailment problem, as an instrument for didactic purposes, and as an opportunity to create a collaborative environment to promote research in the field.
same-paper 2 0.81342959 114 acl-2010-Faster Parsing by Supertagger Adaptation
Author: Jonathan K. Kummerfeld ; Jessika Roesner ; Tim Dawborn ; James Haggerty ; James R. Curran ; Stephen Clark
Abstract: We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
3 0.70979667 169 acl-2010-Learning to Translate with Source and Target Syntax
Author: David Chiang
Abstract: Statistical translation models that try to capture the recursive structure of language have been widely adopted over the last few years. These models make use of varying amounts of information from linguistic theory: some use none at all, some use information about the grammar of the target language, some use information about the grammar of the source language. But progress has been slower on translation models that are able to learn the relationship between the grammars of both the source and target language. We discuss the reasons why this has been a challenge, review existing attempts to meet this challenge, and show how some old and new ideas can be combined into a sim- ple approach that uses both source and target syntax for significant improvements in translation accuracy.
4 0.70958972 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar
Author: Timothy A. D. Fowler ; Gerald Penn
Abstract: The definition of combinatory categorial grammar (CCG) in the literature varies quite a bit from author to author. However, the differences between the definitions are important in terms of the language classes of each CCG. We prove that a wide range of CCGs are strongly context-free, including the CCG of CCGbank and of the parser of Clark and Curran (2007). In light of these new results, we train the PCFG parser of Petrov and Klein (2007) on CCGbank and achieve state of the art results in supertagging accuracy, PARSEVAL measures and dependency accuracy.
5 0.70045501 53 acl-2010-Blocked Inference in Bayesian Tree Substitution Grammars
Author: Trevor Cohn ; Phil Blunsom
Abstract: Learning a tree substitution grammar is very challenging due to derivational ambiguity. Our recent approach used a Bayesian non-parametric model to induce good derivations from treebanked input (Cohn et al., 2009), biasing towards small grammars composed of small generalisable productions. In this paper we present a novel training method for the model using a blocked Metropolis-Hastings sampler in place of the previous method's local Gibbs sampler. The blocked sampler makes considerably larger moves than the local sampler and consequently converges in less time. A core component of the algorithm is a grammar transformation which represents an infinite tree substitution grammar in a finite context free grammar. This enables efficient blocked inference for training and also improves the parsing algorithm. Both algorithms are shown to improve parsing accuracy.
6 0.69948637 237 acl-2010-Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection
7 0.69877279 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
8 0.69598556 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
9 0.69398278 31 acl-2010-Annotation
10 0.69294816 71 acl-2010-Convolution Kernel over Packed Parse Forest
11 0.69292951 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
12 0.68966693 130 acl-2010-Hard Constraints for Grammatical Function Labelling
13 0.68626237 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
14 0.68613267 248 acl-2010-Unsupervised Ontology Induction from Text
15 0.68578643 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
16 0.68512243 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
17 0.68398172 121 acl-2010-Generating Entailment Rules from FrameNet
18 0.68282324 158 acl-2010-Latent Variable Models of Selectional Preference
19 0.68192232 162 acl-2010-Learning Common Grammar from Multilingual Corpus
20 0.68069792 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses