emnlp emnlp2010 emnlp2010-114 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rebecca Dridan ; Timothy Baldwin
Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.
Reference: text
sentIndex sentText sentNum sentScore
1 rdridan@csse.unimelb Abstract Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. [sent-3, score-1.125]
2 In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. [sent-4, score-1.057]
3 Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. [sent-5, score-0.841]
4 In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. [sent-6, score-0.609]
5 This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources. [sent-7, score-0.252]
6 Parsing with precision grammars is generally a two-stage process: (1) the full parse yield of the precision grammar is calculated for a given item, often in the form of a packed forest for efficiency (Oepen and Carroll, 2000; Zhang et al. [sent-9, score-0.692]
7 , 2007); and (2) the individual analyses in the parse forest are ranked using a statistical model (“parse selection”). [sent-10, score-0.598]
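As a rough sketch (our notation, not taken verbatim from the paper), such a parse selection model is typically a discriminative log-linear (maximum entropy) model over the analyses licensed by the grammar:

```latex
P(t \mid s) = \frac{\exp\left(\sum_i \lambda_i f_i(t)\right)}
                   {\sum_{t' \in T(s)} \exp\left(\sum_i \lambda_i f_i(t')\right)}
```

where T(s) is the parse forest for sentence s, the f_i are structural features of an analysis, and the weights λ_i are learned from discriminated gold versus non-gold parses.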
8 In the domain of treebank parsing, the Charniak and Johnson (2005) reranking parser adopts an analogous strategy, except that ranking and pruning are incorporated into the first stage, and the second stage is based on only the top-ranked parses from the first. [sent-11, score-0.4]
9 For both styles of parsing, however, parse selection is based on a statistical model learned from a pre-existing treebank associated with the grammar. [sent-14, score-0.56]
10 Our interest in this paper is in completely removing this requirement of parse selection on explicitly treebanked data, i.e. the development of fully unsupervised parse selection models. [sent-15, score-1.147]
11 The particular style of precision grammar we experiment with in this paper is HPSG (Pollard and Sag, 1994), in the form of the DELPH-IN suite of grammars (http://www. [sent-16, score-0.268]
12 A grammar development framework, the Grammar Matrix (Bender et al., 2002), has been developed which, through a set of questionnaires, allows grammar engineers to quickly produce a core grammar for a language of their choice. [sent-21, score-0.4]
13 The statistical model used in the second stage of parsing (i.e. parse selection) requires a treebank to learn the features, but as we explain in Section 2, the treebanks are created by parsing, preferably with a statistical model. [sent-24, score-0.589]
14 The annotation process involves making binary decisions based on so-called parse discriminants (Carter, 1997). [sent-34, score-0.35]
15 This treebanking process not only produces gold standard trees, but also a set of non-gold trees which provides the negative training data necessary for a discriminative maximum entropy model. [sent-36, score-0.379]
16 The standard process for creating a parse selection model is: 1. [sent-37, score-0.507]
17 parse the training set, recording up to 500 highest-ranking parses for each sentence; 2. [sent-38, score-0.499]
18 3. train a model using a maximum entropy learner (Malouf, 2002). The useful training data from this process is the parses from those sentences for which: more than one parse was found; and at least one parse has been annotated as correct. [sent-42, score-0.849]
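A minimal sketch of the filtering step described above (data structures and field names are our own illustration, not the DELPH-IN toolchain):

```python
def usable_training_items(parsed_corpus):
    """Keep only the sentences that yield useful discriminative training
    data: more than one parse found, and at least one parse annotated
    as correct."""
    items = []
    for sentence in parsed_corpus:
        parses = sentence["parses"]  # up to 500 highest-ranking parses
        gold = [p for p in parses if p["annotated_correct"]]
        non_gold = [p for p in parses if not p["annotated_correct"]]
        if len(parses) > 1 and gold:
            # gold parses are positive examples; the rest are negative
            items.append((gold, non_gold))
    return items
```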
19 Firstly, treebanking takes many person-hours, and is hence both time-consuming and expensive. [sent-48, score-0.266]
20 While it is possible to parse exhaustively with no model, parsing is much slower, since the unpacking of results is time-consuming. [sent-50, score-0.511]
21 Selective unpacking (Zhang et al., 2007) speeds this up a great deal, but requires a parse selection model. [sent-52, score-0.507]
22 Treebanking is also much slower when the parser must be run exhaustively, since there are usually many more analyses to manually discard. [sent-53, score-0.269]
23 Even if the top-1 parses this parser produces are not as accurate as those of a model trained on gold standard data, this model can be used to produce the N-best analyses for the treebanker. [sent-56, score-0.621]
24 Hence, in this work, we experiment with languages and grammars where we have gold standard data, in order to be able to evaluate the quality of the parse selection models. [sent-59, score-0.757]
25 It is worth reinforcing that the gold-standard data is used for evaluation only, except in calculating the supervised parse selection accuracy as an upper bound. [sent-61, score-0.565]
26 1 Training Data Both of our grammars come with statistical models, and the parsed data and gold standard annotations used to create these models are freely available. [sent-71, score-0.331]
27 The details of our training sets are shown in Table 1 (see footnote 2), indicating that the sentence lengths are relatively short, and hence the ambiguity (measured as average parses per sentence) is low for both our grammars. [sent-75, score-0.28]
28 The ambiguity figures also suggest that the Japanese grammar is more constrained (less ambiguous) than the English grammar, since there are, on average, more parses per sentence for English, even with a lower average sentence length. [sent-76, score-0.454]
29 The tc-006 data set is from the Tanaka Corpus. Footnote 2: Any sentences that do not have both gold and non-gold analyses (i.e. had no correct parse, only one parse, or none) are not included in these figures. [sent-79, score-0.334]
30 In order to have some idea of domain effects, we also use the catb data set, the text of an essay on open-source development. [sent-92, score-0.397]
31 Also, since we are not artificially limiting the parse ambiguity by ignoring sentences with 500 or more parses, the ambiguity is much higher. [sent-94, score-0.518]
32 This ambiguity figure gives some indication of the difficulty of the parse selection task. [sent-95, score-0.591]
33 Again we see that the English sentences are more ambiguous, much more so in this case, making the parse selection task more difficult. [sent-96, score-0.507]
34 In fact, the English ambiguity figures are an underestimate, since some of the longer sentences timed out before producing a parse count. [sent-97, score-0.521]
35 3 Evaluation The exact match metric is the most common accu- racy metric used in work with the DELPH-IN tool set, and refers to the percentage of sentences for which the top parse matched the gold parse in every way. [sent-101, score-0.936]
36 Exact match is a useful metric for parse selection evaluation, but it is very blunt-edged, and gives no way of evaluating how close the top parse was to the gold standard. [sent-105, score-1.017]
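As a sketch, the metric reduces to a simple all-or-nothing comparison (function names are illustrative, not from the DELPH-IN tool set):

```python
def exact_match_accuracy(top_parses, gold_parses):
    """Exact match: percentage of sentences whose top-ranked parse is
    identical to the gold parse in every way (all-or-nothing)."""
    matches = sum(1 for top, gold in zip(top_parses, gold_parses) if top == gold)
    return 100.0 * matches / len(gold_parses)
```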
37 (1) label a subset of analyses as correct and the remainder as incorrect; (2) train a model using the same features and learner as in the standard process of Section 2; (3) parse the test data using that model; and (4) evaluate the accuracy of the top analyses. [sent-113, score-0.582]
38 Each of the following sections details a different method for nominating which of the (up to 500) analyses from the training data should be considered pseudo-gold for training the parse selection model. [sent-115, score-0.681]
39 The upper-bound model in this case is the model trained with gold standard annotations. [sent-118, score-0.291]
40 Table 3: Accuracy of the gold standard-based parse selection model. [sent-133, score-0.667]
41 For the baseline model, we used random selection to select our gold analyses. [sent-148, score-0.317]
42 For this experiment, we randomly assigned one parse from each sentence in the training data to be correct (and the remainder of analyses as incorrect), and then used that ‘gold standard’ to train the model. [sent-149, score-0.524]
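A minimal sketch of this random pseudo-gold baseline (assuming each sentence is given as a list of analyses):

```python
import random

def random_pseudo_gold(sentences, seed=0):
    """Baseline: randomly assign one parse per sentence as 'correct'
    and treat the remainder of the analyses as incorrect."""
    rng = random.Random(seed)
    labelled = []
    for parses in sentences:  # each entry: the sentence's list of analyses
        gold = rng.choice(parses)
        labelled.append((gold, [p for p in parses if p is not gold]))
    return labelled
```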
43 The catb test set results suffer not only from containing longer, more ambiguous sentences, but also because the set is completely out of the domain of the training data. [sent-152, score-0.451]
44 The EDM figures are perhaps higher than might be expected given random selection from the entire parse forest. [sent-154, score-0.55]
45 This results from using a precision grammar with an inbuilt notion of grammaticality, which constrains the parser to produce only somewhat reasonable parses and creates a reasonably high baseline for our parse selection experiments. [sent-155, score-0.683]
46 We also tried a separate baseline, eliminating the parse selection model altogether, and using random selection directly to select the top analysis. [sent-156, score-0.664]
47 2 First attempts As a first approach to unsupervised parse selection, we looked at two heuristics to designate some number of the analyses as 'gold' for training. [sent-161, score-0.633]
48 Since it was possible for multiple analyses to have the same score, there could be multiple gold analyses for any one sentence. [sent-168, score-0.508]
49 This method has the effect of selecting the parse(s) most like all the others, by some definitions the centroid of the parse forest. [sent-170, score-0.35]
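One way to realise this centroid heuristic is to score each analysis by its summed similarity to every other analysis in the forest; a sketch, with the pairwise similarity function (e.g. over extracted dependencies) left as an assumption of ours:

```python
def centroid_parses(parses, similarity):
    """Select the parse(s) most similar to all other parses in the
    forest, i.e. the centroid; ties yield multiple 'gold' analyses."""
    totals = [sum(similarity(p, q) for q in parses if q is not p)
              for p in parses]
    best = max(totals)
    return [p for p, t in zip(parses, totals) if t == best]
```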
50 In that case, however, the dependencies were extracted only from analyses that matched the gold standard supertag sequence, rather than the whole parse forest. [sent-172, score-0.721]
51 In this instance, we calculated the degree of branching as the number of right branches in a parse divided by the number of left branches (and vice versa for Japanese, a predominantly left-branching language). [sent-187, score-0.406]
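A sketch of this branching heuristic over binary trees (the tree encoding, nested (left, right) tuples with string leaves, is our own assumption):

```python
def branching_ratio(tree, right_branching=True):
    """Degree of branching: number of right branches divided by number of
    left branches (inverted for a predominantly left-branching language
    such as Japanese)."""
    lefts = rights = 0
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, str):  # leaf: contributes no branches
            continue
        left, right = node
        if not isinstance(left, str):
            lefts += 1   # internal node hanging off the left
        if not isinstance(right, str):
            rights += 1  # internal node hanging off the right
        stack.extend((left, right))
    num, den = (rights, lefts) if right_branching else (lefts, rights)
    return num / den if den else float("inf")
```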
52 Subsequent work involving supertags has mostly focussed on this efficiency goal, but they can also be used to inform parse selection. [sent-196, score-0.468]
53 Dalrymple (2006) and Blunsom (2007) both look at how discriminatory a tag sequence is in filtering a parse forest. [sent-197, score-0.561]
54 This work has shown that tag sequences can be successfully used to restrict the set of parses produced, but that they are generally not discriminatory enough to distinguish a single best parse. [sent-200, score-0.406]
55 (2002) present a similar exploration but also go on to include probabilities from an HMM model into the parse selection model as features. [sent-202, score-0.507]
56 In Dridan (2009), tag sequences from a supertagger are used together with other factors to re-rank the top 500 parses from the same parser and English grammar we use in this research, achieving some improvement in the ranking where tagger accuracy is sufficiently high. [sent-205, score-0.802]
57 1 Gold Supertags In order to test the viability of this method, we first experimented using gold standard tags, extracted from the gold standard parses. [sent-208, score-0.32]
58 In the Dridan (2009) work, parse ranking showed some improvement when morphological information was added to the tags. [sent-214, score-0.35]
59 Table 6: Accuracy using gold tag sequence compatibility to select the 'gold' parse(s). [sent-228, score-0.283]
60 We extracted the tag sequences from the leaf types of all the parses in the forest, marking as 'gold' any parse that had the same sequence as the gold standard parse, and then trained the models as before. [sent-229, score-1.009]
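A minimal sketch of this compatibility-based labelling (the leaf_lextypes helper, which reads the lextype sequence off a parse, is assumed, not shown):

```python
def gold_by_tag_sequence(parses, gold_sequence, leaf_lextypes):
    """Mark as 'gold' every parse whose leaf lextype sequence equals the
    gold standard sequence; the rest become negative training data."""
    gold, non_gold = [], []
    for parse in parses:
        (gold if leaf_lextypes(parse) == gold_sequence else non_gold).append(parse)
    return gold, non_gold
```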
61 Table 6 shows the results from parsing with models based on both the basic lextype and the lextype with morphology. [sent-230, score-0.265]
62 They still fall well below training purely on gold standard data (at least for the in-domain sets), since the tag sequences are not fully discriminatory and hence noise can creep in, but accuracy is significantly better than with the heuristic methods tried earlier. [sent-232, score-0.522]
63 With no significant difference between the basic and +morph versions of the tag set, we decided to use the basic lextypes as tags, since a smaller tag set should be easier to tag with. [sent-234, score-0.479]
64 2 Unsupervised Supertagging Research into unsupervised part-of-speech tagging with a tag dictionary (sometimes called weakly supervised POS tagging) has been going on for many years (cf. Merialdo (1994), Brill (1995)), but generally using a fairly small tag set. [sent-237, score-0.339]
65 In that work, the constraining nature of the (CCG) grammar is used to mitigate the problem of having a much more ambiguous tag set. [sent-240, score-0.389]
66 Our method has a similar underlying idea, but the implementation differs both in the way we extract the word-to-tag mappings, and also how we extract and use the information from the grammar to initialise the tagger model. [sent-241, score-0.375]
67 One possibility for an initial model was to extract the word-to-lextype mappings from the grammar lexicon as Baldridge does, and make all starting probabilities uniform. [sent-243, score-0.273]
68 For this reason, we decided it would be simplest to initialise our probability estimates using the output of the parser, feeding in only those tag sequences which are compatible with analyses in the parse forest for that item. [sent-246, score-0.811]
69 This method takes advantage of the fact that, because the grammars are heavily constrained, the parse forest only contains viable tag sequences. [sent-247, score-0.671]
70 Since parsing without a model is slow, we restricted the training set to those sentences shorter than a specific word length (12 for English and 15 for Japanese, since the Japanese grammar is less ambiguous and hence faster to parse). [sent-248, score-0.368]
71 From this parsed data we extracted tag-to-word and tag-to-tag frequency counts from all parses for all sentences, and used these frequencies to produce the emission and transition probabilities, respectively. [sent-250, score-0.34]
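A sketch of these relative-frequency estimates (data shapes are our own simplification, and smoothing is omitted here):

```python
from collections import Counter

def initial_hmm_counts(forests):
    """Estimate emission P(word | tag) and transition P(tag | previous tag)
    by relative frequency, counted over every analysis in every parse
    forest. Each analysis is a list of (word, lextype) pairs."""
    emit, trans = Counter(), Counter()
    tag_total, prev_total = Counter(), Counter()
    for forest in forests:
        for analysis in forest:
            prev = "<s>"  # sentence-initial pseudo-tag
            for word, tag in analysis:
                emit[(tag, word)] += 1
                tag_total[tag] += 1
                trans[(prev, tag)] += 1
                prev_total[prev] += 1
                prev = tag
    emission = {k: c / tag_total[k[0]] for k, c in emit.items()}
    transition = {k: c / prev_total[k[0]] for k, c in trans.items()}
    return emission, transition
```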
72 3 Supertagging-based parse selection models We used both the initial counts and EM-trained models to tag the training data from Table 1 and then compared this with the extracted tag sequences. Footnote 6: Available from http://webdocs. [sent-255, score-0.975]
73 Table 7: Training data for the HMM tagger (both the parsed data from which the initial probabilities were derived, and the raw data used to estimate the EM-trained models). Raw Sentences: 13500 / 9410; Raw Total Words: 146053 / 151906. [sent-264, score-0.387]
74 Table 8: Accuracy using tag sequences from an HMM tagger to select the 'correct' parse(s). [sent-277, score-0.322]
75 The initial counts model was based on using counts from a parse forest to approximate the emission and transition probabilities. [sent-278, score-0.667]
76 Since we could no longer assume that our tag sequence would be present within the extracted tag sequences, we used the percentage of tokens from a parse whose lextype matched our tagged sequence as the parse score. [sent-281, score-1.071]
77 Again, we marked as ‘gold’ any parse that had the best parse score for each sentence, and trained a new parse selection model. [sent-282, score-1.25]
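A sketch of this scoring and marking step (again with an assumed leaf_lextypes helper):

```python
def gold_by_tagger_overlap(parses, tagged_sequence, leaf_lextypes):
    """Score each parse by the fraction of tokens whose lextype matches
    the tagger output, and mark every top-scoring parse as 'gold'."""
    def score(parse):
        seq = leaf_lextypes(parse)
        return sum(a == b for a, b in zip(seq, tagged_sequence)) / len(tagged_sequence)
    scored = [(score(p), p) for p in parses]
    best = max(s for s, _ in scored)
    gold = [p for s, p in scored if s == best]
    non_gold = [p for s, p in scored if s < best]
    return gold, non_gold
```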
78 To explore why this might be so, we looked at the tagger accuracy for both models over the respective training data sets, shown in Table 9. [sent-286, score-0.266]
79 However, this insignificant tagger accuracy decrease for Japanese produced a significant increase in parser accuracy, while a more pronounced tagger accuracy decrease had no significant effect on parser accuracy in English. [sent-289, score-0.67]
80 There is also the issue of whether tag accuracy is the best measure for indicating potential parse accuracy. [sent-297, score-0.531]
81 The Japanese parsing results are already equivalent to those achieved using gold standard tags. [sent-298, score-0.249]
82 It is possible that parsing accuracy is reasonably insensitive to tagger accuracy, but it is also possible that there is a better metric to look at, such as tag accuracy over frequently confused tags. [sent-299, score-0.481]
83 Results at every stage have been much worse for the catb data set, compared to the other (jhpstgt) English data set. [sent-306, score-0.561]
84 In this process, data from the new domain is parsed with the parser trained on the old domain, [sent-310, score-0.287]
85 Table 10: Accuracy results over the out-of-domain catb data set, using the initial counts unsupervised model to produce in-domain training data in a self-training set-up. [sent-318, score-0.516]
86 and then the top analyses of the parsed new domain data are added to the training data, and the parser is re-trained. [sent-320, score-0.418]
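A minimal sketch of one such self-training round (parse_fn, train_fn and the data shapes are illustrative assumptions, not the actual DELPH-IN interfaces):

```python
def self_train_round(model, new_domain_sentences, parse_fn, train_fn, base_data):
    """One round of self-training for domain adaptation: parse the new
    domain with the current (here, unsupervised) model, add the top
    analyses to the training data, and retrain the parse selection model."""
    top_analyses = []
    for sentence in new_domain_sentences:
        ranked = parse_fn(model, sentence)  # analyses, best first
        if ranked:
            top_analyses.append(ranked[0])
    return train_fn(base_data + top_analyses)
```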
87 This is generally considered a semi-supervised method, since the original parser is trained on gold standard data. [sent-321, score-0.298]
88 In our case, we wanted to test whether parsing data from the new domain using our unsupervised parse selection model was accurate enough to still get an improvement using self-training for domain adaptation. [sent-322, score-0.786]
89 It is not immediately clear what one might consider to be the ‘domain’ of the catb test set, since domain is generally very vaguely defined. [sent-323, score-0.397]
90 While the topics of these essays vary, they all relate to the social side of technical communities, and so we used this to represent in-domain data for the catb test set. [sent-325, score-0.329]
91 Previous results for the catb data set are given for comparison. [sent-329, score-0.329]
92 The results show that the completely unsupervised parse selection method produces a top parse that is at least accurate enough to be used in self-training, providing a cheap means of domain adaptation. [sent-330, score-0.979]
93 7 Conclusions and Further Work Comparing Tables 8 and 4, we can see that for both English and Japanese, we are able to achieve parse selection accuracy well above our baseline of random selection. Footnote 8: http://www. [sent-332, score-0.565]
94 This was in part because it is possible to extract a reasonable tagging model from uncorrected parse data, due to the constrained nature of these grammars. [sent-335, score-0.389]
95 These models will hopefully allow grammar engineers to more easily build statistical models for new languages, using nothing more than their new grammar and raw text. [sent-336, score-0.448]
96 Since fully evaluating the potential for building models for new languages is a long-term ongoing experiment, we looked at a more short-term evaluation of our unsupervised parse selection methods: building models for new domains. [sent-337, score-0.616]
97 A preliminary self-training experiment, using the model trained with our initial counts tagger as the starting point, showed promising results for domain adaptation. [sent-338, score-0.397]
98 The issues surrounding what makes a good tagger for this purpose, and how we can best learn one without gold training data, would be one possibly fruitful avenue for further exploration. [sent-340, score-0.313]
99 Since the optimal tagger 'training' we saw here (for English) was merely to read off frequency counts from parsed data, it would be easy to retrain the tagger on different domains. [sent-343, score-0.458]
100 Alternatively, it would be interesting to see how much difference it makes to train the tagger on one set of data, and use that to tag a model training set from a different domain. [sent-344, score-0.276]
wordName wordTfidf (topN-words)
[('parse', 0.35), ('catb', 0.329), ('treebanking', 0.219), ('jhpstgt', 0.197), ('grammar', 0.178), ('oepen', 0.175), ('analyses', 0.174), ('gold', 0.16), ('selection', 0.157), ('tagger', 0.153), ('parses', 0.149), ('hpsg', 0.135), ('tag', 0.123), ('supertags', 0.118), ('flickinger', 0.113), ('miyao', 0.11), ('lextypes', 0.11), ('supertagging', 0.11), ('japanese', 0.099), ('parser', 0.095), ('grammars', 0.09), ('parsing', 0.089), ('discriminatory', 0.088), ('dridan', 0.088), ('lextype', 0.088), ('upperbound', 0.088), ('ambiguity', 0.084), ('parsed', 0.081), ('edm', 0.075), ('forest', 0.074), ('counts', 0.071), ('ccg', 0.069), ('domain', 0.068), ('bender', 0.068), ('stephan', 0.065), ('clark', 0.063), ('treebanks', 0.062), ('initial', 0.062), ('ichi', 0.06), ('accuracy', 0.058), ('branching', 0.056), ('em', 0.055), ('yusuke', 0.055), ('looked', 0.055), ('unsupervised', 0.054), ('ambiguous', 0.054), ('treebank', 0.053), ('ravi', 0.052), ('raw', 0.048), ('jun', 0.048), ('hence', 0.047), ('emily', 0.047), ('taipei', 0.047), ('sequences', 0.046), ('edmna', 0.044), ('engineers', 0.044), ('hara', 0.044), ('incr', 0.044), ('initialem', 0.044), ('initialise', 0.044), ('normalised', 0.044), ('pollard', 0.044), ('rimell', 0.044), ('sag', 0.044), ('timed', 0.044), ('treebanked', 0.044), ('treebanker', 0.044), ('tsdb', 0.044), ('baldridge', 0.044), ('trained', 0.043), ('hmm', 0.043), ('figures', 0.043), ('english', 0.041), ('exact', 0.039), ('emission', 0.039), ('tagging', 0.039), ('engineering', 0.039), ('briscoe', 0.038), ('siegel', 0.038), ('copestake', 0.038), ('unpacking', 0.038), ('tsujii', 0.037), ('matched', 0.037), ('curran', 0.036), ('stage', 0.035), ('ie', 0.035), ('pages', 0.035), ('proceedings', 0.035), ('edges', 0.035), ('viable', 0.034), ('saarland', 0.034), ('constraining', 0.034), ('recursion', 0.034), ('exhaustively', 0.034), ('ccgbank', 0.034), ('tanaka', 0.034), ('edge', 0.034), ('stephen', 0.033), ('lexicon', 0.033), ('parsers', 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 114 emnlp-2010-Unsupervised Parse Selection for HPSG
Author: Rebecca Dridan ; Timothy Baldwin
Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.
2 0.15634461 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa
Author: Stephen Boxwell ; Dennis Mehay ; Chris Brew
Abstract: In many NLP systems, there is a unidirectional flow of information in which a parser supplies input to a semantic role labeler. In this paper, we build a system that allows information to flow in both directions. We make use of semantic role predictions in choosing a single-best parse. This process relies on an averaged perceptron model to distinguish likely semantic roles from erroneous ones. Our system penalizes parses that give rise to low-scoring semantic roles. To explore the consequences of this we perform two experiments. First, we use a baseline generative model to produce n-best parses, which are then re-ordered by our semantic model. Second, we use a modified version of our semantic role labeler to predict semantic roles at parse time. The performance of this modified labeler is weaker than that of our best full SRL, because it is restricted to features that can be computed directly from the parser’s packed chart. For both experiments, the resulting semantic predictions are then used to select parses. Finally, we feed the selected parses produced by each experiment to the full version of our semantic role labeler. We find that SRL performance can be improved over this baseline by selecting parses with likely semantic roles.
3 0.14640318 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars
Author: Zhongqiang Huang ; Mary Harper ; Slav Petrov
Abstract: We study self-training with products of latent variable grammars in this paper. We show that increasing the quality of the automatically parsed data used for self-training gives higher accuracy self-trained grammars. Our generative self-trained grammars reach F scores of 91.6 on the WSJ test set and surpass even discriminative reranking systems without self-training. Additionally, we show that multiple self-trained grammars can be combined in a product model to achieve even higher accuracy. The product model is most effective when the individual underlying grammars are most diverse. Combining multiple grammars that were self-trained on disjoint sets of unlabeled data results in a final test accuracy of 92.5% on the WSJ test set and 89.6% on our Broadcast News test set.
4 0.13488343 97 emnlp-2010-Simple Type-Level Unsupervised POS Tagging
Author: Yoong Keok Lee ; Aria Haghighi ; Regina Barzilay
Abstract: Part-of-speech (POS) tag distributions are known to exhibit sparsity a word is likely to take a single predominant tag in a corpus. Recent research has demonstrated that incorporating this sparsity constraint improves tagging accuracy. However, in existing systems, this expansion come with a steep increase in model complexity. This paper proposes a simple and effective tagging method that directly models tag sparsity and other distributional properties of valid POS tag assignments. In addition, this formulation results in a dramatic reduction in the number of model parameters thereby, enabling unusually rapid training. Our experiments consistently demonstrate that this model architecture yields substantial performance gains over more complex tagging — counterparts. On several languages, we report performance exceeding that of more complex state-of-the art systems.1
5 0.12838988 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing
Author: Jackie Chi Kit Cheung ; Gerald Penn
Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.
6 0.12423925 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
7 0.11664449 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
8 0.11369541 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
9 0.10899707 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
10 0.098864906 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions
11 0.092636466 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities
12 0.090011358 113 emnlp-2010-Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing
13 0.089547127 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
14 0.08619491 117 emnlp-2010-Using Unknown Word Techniques to Learn Known Words
15 0.083292812 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation
16 0.081124552 88 emnlp-2010-On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing
17 0.080656111 9 emnlp-2010-A New Approach to Lexical Disambiguation of Arabic Text
18 0.077248126 17 emnlp-2010-An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL
19 0.076993637 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
20 0.076150849 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
topicId topicWeight
[(0, 0.266), (1, 0.107), (2, 0.252), (3, -0.007), (4, -0.006), (5, 0.095), (6, 0.001), (7, -0.076), (8, 0.126), (9, -0.042), (10, 0.158), (11, 0.024), (12, 0.129), (13, 0.101), (14, 0.009), (15, -0.034), (16, 0.032), (17, 0.06), (18, -0.119), (19, 0.031), (20, -0.021), (21, -0.056), (22, 0.035), (23, 0.044), (24, 0.153), (25, -0.044), (26, -0.159), (27, -0.155), (28, -0.005), (29, -0.004), (30, -0.02), (31, -0.083), (32, 0.033), (33, -0.142), (34, -0.061), (35, -0.012), (36, 0.11), (37, -0.098), (38, 0.016), (39, 0.073), (40, 0.038), (41, 0.008), (42, -0.042), (43, 0.004), (44, 0.125), (45, -0.05), (46, 0.037), (47, 0.098), (48, -0.033), (49, -0.032)]
simIndex simValue paperId paperTitle
same-paper 1 0.96500683 114 emnlp-2010-Unsupervised Parse Selection for HPSG
Author: Rebecca Dridan ; Timothy Baldwin
Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.
2 0.66547471 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
Author: Tom Kwiatkowksi ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representations; here we present a more general method. The approach induces a probabilistic CCG grammar that represents the meaning of individual words and defines how these meanings can be combined to analyze complete sentences. We use higher-order unification to define a hypothesis space containing all grammars consistent with the training data, and develop an online learning algorithm that efficiently searches this space while simultaneously estimating the parameters of a log-linear parsing model. Experiments demonstrate high accuracy on benchmark data sets in four languages with two different meaning representations.
3 0.62712198 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars
Author: Zhongqiang Huang ; Mary Harper ; Slav Petrov
Abstract: We study self-training with products of latent variable grammars in this paper. We show that increasing the quality of the automatically parsed data used for self-training gives higher accuracy self-trained grammars. Our generative self-trained grammars reach F scores of 91.6 on the WSJ test set and surpass even discriminative reranking systems without self-training. Additionally, we show that multiple self-trained grammars can be combined in a product model to achieve even higher accuracy. The product model is most effective when the individual underlying grammars are most diverse. Combining multiple grammars that were self-trained on disjoint sets of unlabeled data results in a final test accuracy of 92.5% on the WSJ test set and 89.6% on our Broadcast News test set.
4 0.62462759 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing
Author: Jackie Chi Kit Cheung ; Gerald Penn
Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.
5 0.62210751 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
Author: Roi Reichart ; Ari Rappoport
Abstract: We introduce a novel training algorithm for unsupervised grammar induction, called Zoomed Learning. Given a training set T and a test set S, the goal of our algorithm is to identify subset pairs Ti, Si of T and S such that when the unsupervised parser is trained on a training subset Ti its results on its paired test subset Si are better than when it is trained on the entire training set T. A successful application of zoomed learning improves overall performance on the full test set S. We study our algorithm’s effect on the leading algorithm for the task of fully unsupervised parsing (Seginer, 2007) in three different English domains, WSJ, BROWN and GENIA, and show that it improves the parser F-score by up to 4.47%.
6 0.52871042 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa
7 0.4798094 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
8 0.47788888 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
9 0.4452287 117 emnlp-2010-Using Unknown Word Techniques to Learn Known Words
10 0.41166896 113 emnlp-2010-Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing
11 0.39782062 97 emnlp-2010-Simple Type-Level Unsupervised POS Tagging
12 0.37907571 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
13 0.37453762 71 emnlp-2010-Latent-Descriptor Clustering for Unsupervised POS Induction
14 0.3719531 17 emnlp-2010-An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL
15 0.37109631 9 emnlp-2010-A New Approach to Lexical Disambiguation of Arabic Text
16 0.34250692 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation
17 0.34171173 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions
18 0.33276734 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech
19 0.33235312 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation
20 0.32234725 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions
topicId topicWeight
[(3, 0.014), (10, 0.016), (12, 0.044), (21, 0.268), (29, 0.137), (30, 0.018), (32, 0.013), (52, 0.031), (56, 0.049), (62, 0.013), (66, 0.15), (72, 0.068), (76, 0.072), (87, 0.021), (89, 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.76304126 114 emnlp-2010-Unsupervised Parse Selection for HPSG
Author: Rebecca Dridan ; Timothy Baldwin
Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.
2 0.62834352 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice
Author: Samidh Chatterjee ; Nicola Cancedda
Abstract: Minimum Error Rate Training is the algorithm for log-linear model parameter training most used in state-of-the-art Statistical Machine Translation systems. In its original formulation, the algorithm uses N-best lists output by the decoder to grow the Translation Pool that shapes the surface on which the actual optimization is performed. Recent work has been done to extend the algorithm to use the entire translation lattice built by the decoder, instead of N-best lists. We propose here a third, intermediate way, consisting in growing the translation pool using samples randomly drawn from the translation lattice. We empirically measure a systematic im- provement in the BLEU scores compared to training using N-best lists, without suffering the increase in computational complexity associated with operating with the whole lattice.
3 0.62771565 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
Author: Hui Zhang ; Min Zhang ; Haizhou Li ; Eng Siong Chng
Abstract: This paper studies two issues, non-isomorphic structure translation and target syntactic structure usage, for statistical machine translation in the context of forest-based tree to tree sequence translation. For the first issue, we propose a novel non-isomorphic translation framework to capture more non-isomorphic structure mappings than traditional tree-based and tree-sequence-based translation methods. For the second issue, we propose a parallel space searching method to generate hypothesis using tree-to-string model and evaluate its syntactic goodness using tree-to-tree/tree sequence model. This not only reduces the search complexity by merging spurious-ambiguity translation paths and solves the data sparseness issue in training, but also serves as a syntax-based target language model for better grammatical generation. Experiment results on the benchmark data show our proposed two solutions are very effective, achieving significant performance improvement over baselines when applying to different translation models.
4 0.62654233 40 emnlp-2010-Effects of Empty Categories on Machine Translation
Author: Tagyoung Chung ; Daniel Gildea
Abstract: We examine effects that empty categories have on machine translation. Empty categories are elements in parse trees that lack corresponding overt surface forms (words) such as dropped pronouns and markers for control constructions. We start by training machine translation systems with manually inserted empty elements. We find that inclusion of some empty categories in training data improves the translation result. We expand the experiment by automatically inserting these elements into a larger data set using various methods and training on the modified corpus. We show that even when automatic prediction of null elements is not highly accurate, it nevertheless improves the end translation result.
5 0.62558323 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics
Author: Joseph Reisinger ; Raymond Mooney
Abstract: We introduce tiered clustering, a mixture model capable of accounting for varying degrees of shared (context-independent) feature structure, and demonstrate its applicability to inferring distributed representations of word meaning. Common tasks in lexical semantics such as word relatedness or selectional preference can benefit from modeling such structure: Polysemous word usage is often governed by some common background metaphoric usage (e.g. the senses of line or run), and likewise modeling the selectional preference of verbs relies on identifying commonalities shared by their typical arguments. Tiered clustering can also be viewed as a form of soft feature selection, where features that do not contribute meaningfully to the clustering can be excluded. We demonstrate the applicability of tiered clustering, highlighting particular cases where modeling shared structure is beneficial and where it can be detrimental.
6 0.62379766 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
7 0.62173653 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
8 0.62132347 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning
10 0.62060893 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats
11 0.61956972 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
12 0.61949509 84 emnlp-2010-NLP on Spoken Documents Without ASR
13 0.61854655 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks
14 0.6155411 63 emnlp-2010-Improving Translation via Targeted Paraphrasing
15 0.61529809 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation
16 0.61487085 2 emnlp-2010-A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model
17 0.61416399 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media
18 0.61414254 103 emnlp-2010-Tense Sense Disambiguation: A New Syntactic Polysemy Task
19 0.61339968 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation
20 0.61242104 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities