emnlp emnlp2010 emnlp2010-118 knowledge-graph by maker-knowledge-mining

118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing


Source: pdf

Author: Jackie Chi Kit Cheung ; Gerald Penn

Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. [sent-3, score-0.481]

2 We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. [sent-4, score-0.526]

3 Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. [sent-5, score-0.435]

4 Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. [sent-6, score-0.471]

5 We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies. [sent-8, score-0.382]

6 1 Introduction Recent corpus linguistics work has produced evidence of syntactic consistency, the preference to reuse a syntactic construction shortly after its appearance in a discourse (Gries, 2005; Dubey et al. [sent-9, score-0.467]

7 In addition, experimental studies have confirmed the existence of syntactic priming, the psycholinguistic phenomenon of syntactic consistency1 . [sent-11, score-0.313]

8 Both types of studies, however, have limited the constructions that are examined to particular syntactic constructions and alternations. [sent-12, score-0.844]

9 In this work, we extend these results and present an analysis of the distribution of all syntactic productions in the Penn Treebank WSJ corpus. [sent-15, score-0.312]

10 We provide evidence that syntactic consistency is a widespread phenomenon across productions of various types of LHS nonterminals, including all of the commonly occurring ones. [sent-16, score-0.618]

11 Despite this growing evidence that the probability of syntactic constructions is not independent of the extra-sentential context, current high-performance statistical parsers (e. [sent-17, score-0.327]

12 We address this by implementing a reranking parser which takes advantage of features based on the context surrounding the sentence. [sent-22, score-0.392]

13 The reranker outperforms the generative baseline parser, and rivals a similar model that does not make use of context. [sent-23, score-0.412]

14 Figure 1: Visual representation of the calculation of the prior and positive adaptation probabilities. [sent-29, score-0.204]

15 p represents the presence of the construction in the prime set. [sent-31, score-0.133]

16 performance, indicating the potential of extra-sentential contextual information to aid parsing, especially for structural dependencies between sentences, such as parallelism effects. [sent-32, score-0.274]

17 2 Syntactic Consistency in the Penn Treebank WSJ: Syntactic consistency has been examined by Dubey et al. [sent-33, score-0.255]

18 They have provided evidence that syntactic consistency exists not only within coordinate structures, but also in a variety of other contexts, such as within sentences, between sentences, within documents, and between speaker turns in the Switchboard corpus. [sent-35, score-0.46]

19 There have also been studies into syntactic consistency that consider all syntactic productions in dialogue corpora (Reitter, 2008; Buch and Pietsch, 2010). [sent-38, score-0.671]

20 These studies, however, do not provide consistency results on subsets of production-types, such as by production LHS as our study does, so the implications that can be drawn from them for improving parsing are less apparent. [sent-41, score-0.446]

21 This measure originates in work on lexical priming (Church, 2000), and quantifies the probability of a target word or construction w appearing in a “primed” context. [sent-44, score-0.232]

22 The columns on the right show the number of production-types for which the positive adaptation probability is significantly greater than, not different from, or less than the prior probability. [sent-46, score-0.244]

23 We did not analyze section 23, so that its characteristics would not influence the design of our reranking parser and the section could serve as our evaluation test set. [sent-54, score-0.313]

24 Our analysis focuses on the consistency of rules between sentences, so we take the previous sentence within the same article as the prime set, and the current sentence as the target set in calculating the probabilities given above. [sent-55, score-0.295]
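
A minimal sketch of this calculation, in Python, might look as follows. The function name and the representation of each sentence as the set of productions it contains are illustrative assumptions, not taken from the paper; the prior is computed here as the unconditioned probability of the production in the target set, whereas some formulations condition on the production being absent from the prime set.

    # Hypothetical sketch: estimate prior and positive adaptation probabilities
    # for a single production over (prime, target) sentence pairs, where each
    # sentence is represented as the set of productions it contains.
    def adaptation_probabilities(pairs, production):
        n_total = len(pairs)
        n_target = 0            # target set contains the production
        n_prime = 0             # prime set contains the production
        n_both = 0              # both prime and target contain it
        for prime, target in pairs:
            if production in target:
                n_target += 1
            if production in prime:
                n_prime += 1
                if production in target:
                    n_both += 1
        prior = n_target / n_total if n_total else 0.0
        pos_adapt = n_both / n_prime if n_prime else 0.0
        return prior, pos_adapt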

25 We first present results for consistency in all the production-types, grouped by the LHS of the production. [sent-61, score-0.214]

26 Table 1 shows the weighted average prior and positive adaptation probabilities for productions by LHS, where the weighting is done by the number of occurrences of that production. [sent-62, score-0.408]
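
The weighted averaging could be computed along these lines; the input format is an assumption for illustration, with per-production counts, priors, and adaptation probabilities coming from the calculation sketched above.

    from collections import defaultdict

    def weighted_averages_by_lhs(stats):
        # stats: iterable of (lhs, count, prior, pos_adapt) tuples, one per
        # production-type; each type is weighted by its number of occurrences
        # (assumed to be positive).
        totals = defaultdict(lambda: [0.0, 0.0, 0.0])  # count, prior*count, adapt*count
        for lhs, count, prior, pos_adapt in stats:
            totals[lhs][0] += count
            totals[lhs][1] += prior * count
            totals[lhs][2] += pos_adapt * count
        return {lhs: (p / c, a / c) for lhs, (c, p, a) in totals.items()}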

27 It also shows the number of production-types in which the positive adaptation probability is statistically significantly greater than, not significantly different from, and significantly lower than the prior probability. [sent-64, score-0.244]

28 We grouped production-types into bins by frequency and calculated the proportion of production-types in each bin for which the positive adaptation probability is significantly greater than the prior. [sent-78, score-0.287]

29 It is clear that the most frequently occurring production-types are also the ones most likely to exhibit evidence of syntactic consistency. [sent-79, score-0.304]

30 Table 2 shows the breakdown of the prior and positive adaptation calculation components for the ten most frequent production-types and the ten most consistent (by the ratio of positive adaptation to prior) productions among the top decile of production-types. [sent-80, score-0.478]

31 Interestingly, many of the most consistent production-types have NP as the LHS, but overall, productions with many different LHS parents exhibit consistency. [sent-82, score-0.204]

32 3 A Context-Aware Reranker: Having established evidence for widespread syntactic consistency in the WSJ corpus, we now investigate incorporating extra-sentential context into a statistical parser. [sent-83, score-0.493]

33 The first decision to make is whether to incorporate the context into a generative or a discriminative parsing model. [sent-84, score-0.364]

34 Employing a generative model would allow us to train the parser in one step, and one such parser which incorporates the previous context has been implemented by Dubey et al. [sent-85, score-0.531]

35 Prime is a binary variable which is True if and only if the current production has occurred in the prime set (the previous sentence). [sent-88, score-0.214]

36 We instead opt to incorporate the extra-sentential context into a discriminative reranking parser, which naturally allows additional features to be incorporated into the statistical model. [sent-95, score-0.261]

37 They can be divided into two broad categories: those that rerank the N-best outputs of a generative parser, and those that make all parsing decisions using the discriminative model. [sent-97, score-0.352]

38 We choose to implement an N-best reranking parser so that we can utilize state-of-the-art generative parsers to ensure a good selection of candidate parses to feed into our reranking module. [sent-98, score-0.945]

39 Our approach is similar to N-best reranking parsers such as Charniak and Johnson (2005) and Collins and Koo (2005), which implement a variety of features to capture within-sentence lexical and structural dependencies. [sent-101, score-0.314]

40 It is also similar to work which focuses on coordinate noun phrase parsing (e. [sent-102, score-0.141]

41 , 2009)) in that we also attempt to exploit syntactic parallelism, but in a between-sentence setting rather than in a within-sentence setting that only considers coordination. [sent-105, score-0.16]

42 compared to the generative baseline for the 50-best parses in the development set. [sent-108, score-0.401]

43 We categorize each candidate parse by whether it has more, less, or the same amount of rule overlap with the previous correct parse than the generative baseline, and whether it has a better, worse, or the same F1 measure than the generative baseline (Table 3). [sent-109, score-0.532]

44 We find that, among candidate parses which share more productions with the previous parse, a larger percentage are better than the generative baseline parse than in the other categories, and this difference is statistically significant (χ2 test). [sent-110, score-0.714]

45 CRFs are a very flexible class of graphical models which have been used for various sequence and relational labelling tasks (Lafferty et al. [sent-113, score-0.164]

46 They have also been used for shallow parsing (Sha and Pereira, 2003), and full constituent parsing (Finkel et al. [sent-117, score-0.19]

47 The label sequence then is the sequence of parses, and the observations are the sentences in the document. [sent-134, score-0.142]

48 Since there is a large number of parses possible for each sentence and correspondingly many possible states for each label variable, we restrict the possible label state-space by extracting the N-best parses from a generative parser, and rerank over the sequences of candidate parses thus provided. [sent-135, score-0.761]

49 We use the generative parser of Petrov and Klein (2007), a state-splitting parser that uses an EM algorithm to find splits in the nonterminal symbols to maximize training data likelihood. [sent-136, score-0.452]

50 To learn the weight vector, we employ a stochastic gradient ascent method on the conditional log likelihood, which has been shown to perform well for parsing tasks (Finkel et al. [sent-139, score-0.392]

51 In standard gradient ascent, we update the weight vector after iterating through the whole training corpus. [sent-144, score-0.161]

52 Because this is computationally expensive, we instead use stochastic gradient ascent, which approximates the true gradient by the gradient calculated from a single sample from the training corpus. [sent-145, score-0.243]
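
A sketch of such a training loop is given below. It makes the simplifying assumption that each sentence's N-best list is reranked independently; the paper's model is a linear-chain CRF over the whole document, whose gradient additionally requires forward-backward over the chain. All names are illustrative.

    import math, random

    def sgd_train(data, lr=0.01, epochs=5, seed=0):
        # data: list of (candidate_feature_vectors, gold_index) pairs, where
        # candidate_feature_vectors is a list of {feature_name: value} dicts,
        # one per N-best candidate.
        rng = random.Random(seed)
        w = {}
        for _ in range(epochs):
            rng.shuffle(data)
            for candidates, gold in data:  # stochastic step: one sample at a time
                scores = [sum(w.get(f, 0.0) * v for f, v in c.items()) for c in candidates]
                m = max(scores)
                exps = [math.exp(s - m) for s in scores]
                z = sum(exps)
                # gradient of the conditional log likelihood:
                # observed feature counts minus expected feature counts
                for i, c in enumerate(candidates):
                    coeff = (1.0 if i == gold else 0.0) - exps[i] / z
                    for f, v in c.items():
                        w[f] = w.get(f, 0.0) + lr * coeff * v
        return w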

53 We conduct 20-fold cross validation to generate the N-best parses for the training set, as is standard for N-best rerankers. [sent-156, score-0.168]

54 To rerank, we do inference with the linear-chain CRF for the most likely sequence of parses using the Viterbi algorithm. [sent-157, score-0.239]
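
The Viterbi search over the restricted state-space can be sketched as follows, assuming a user-supplied local scoring function over adjacent candidate parses; the signature is illustrative, not the paper's code.

    def viterbi_rerank(candidate_lists, edge_score):
        # candidate_lists[t]: N-best parses for sentence t.
        # edge_score(prev, cur): log-space local score of choosing parse `cur`
        # after parse `prev` (prev is None at the first sentence).
        if not candidate_lists:
            return []
        best = [edge_score(None, c) for c in candidate_lists[0]]
        back = []
        for t in range(1, len(candidate_lists)):
            scores, ptrs = [], []
            for cur in candidate_lists[t]:
                cand = [best[i] + edge_score(prev, cur)
                        for i, prev in enumerate(candidate_lists[t - 1])]
                j = max(range(len(cand)), key=cand.__getitem__)
                scores.append(cand[j])
                ptrs.append(j)
            best = scores
            back.append(ptrs)
        # follow back-pointers from the best final candidate
        idx = max(range(len(best)), key=best.__getitem__)
        path = [idx]
        for ptrs in reversed(back):
            path.insert(0, ptrs[path[0]])
        return [cands[i] for cands, i in zip(candidate_lists, path)]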

55 2 Feature Functions: We experiment with various feature functions that depend on the syntactic and lexical parallelism between yt−1 and yt. [sent-159, score-0.342]

56 We use the occurrence of a rule in yt that occurred in yt−1 as a feature. [sent-160, score-0.283]

57 The first feature schema we tried was to simply enumerate the (non-lexical) productions in yt along with whether that production is found in yt−1. [sent-162, score-0.536]

58 Each of the subtrees in yt, marked by whether that same subtree occurs in the previous tree, is a feature. [sent-165, score-0.241]

59 The simple production representation corresponds to a vertical markovization of 1 and an infinite horizontal markovization. [sent-166, score-0.299]

60 We found that a vertical markovization of 1 and a horizontal markovization of 2 produced the best results on our data. [sent-167, score-0.208]
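
A sketch of this feature schema is given below; the nested-tuple tree encoding and the sliding-window treatment of horizontal markovization are illustrative simplifications, not the paper's implementation.

    def productions(tree, horiz=2):
        # Collect non-lexical productions from a parse tree given as nested tuples,
        # e.g. ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ...)).
        # Vertical markovization of 1: only the immediate parent labels each rule.
        # Horizontal markovization of `horiz`: at most `horiz` adjacent children
        # appear in any one feature (approximated here with a sliding window).
        rules = set()
        if isinstance(tree, str) or all(isinstance(c, str) for c in tree[1:]):
            return rules  # word or preterminal: skip lexical rules
        children = [c[0] if isinstance(c, tuple) else c for c in tree[1:]]
        if len(children) <= horiz:
            rules.add((tree[0], tuple(children)))
        else:
            for i in range(len(children) - horiz + 1):
                rules.add((tree[0], tuple(children[i:i + horiz])))
        for child in tree[1:]:
            if isinstance(child, tuple):
                rules |= productions(child, horiz)
        return rules

    def context_features(current_tree, previous_tree, horiz=2):
        # One indicator feature per production in the current parse, marked by
        # whether the same production also occurs in the previous sentence's parse.
        prev = productions(previous_tree, horiz)
        return {(rule, rule in prev): 1.0 for rule in productions(current_tree, horiz)}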

61 This schema so far only considers local substructures of parse trees, without being informed by the lexical information found in the leaves of the tree. [sent-169, score-0.176]

62 In addition, we add a feature corresponding to the scaled log probability of a parse tree derived from the generative parsing baseline. [sent-179, score-0.397]

63 The scaling formula that we found to work best is to scale the maximum log probability among the N-best candidate parses to be 1. [sent-181, score-0.204]
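
One plausible reading of this scaling is sketched below; the helper is hypothetical, and the learned feature weight determines how the scaled value is ultimately used.

    def scaled_log_prob_features(candidate_log_probs):
        # Divide every candidate's log probability by the maximum in the N-best
        # list, so the best candidate's feature value is exactly 1. This assumes
        # strictly negative log probabilities, as a real parser produces; worse
        # candidates then receive values above 1.
        m = max(candidate_log_probs)
        return [lp / m for lp in candidate_log_probs]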

64 3 Results: We train the two models which make use of extra-sentential context described in the previous section, and use them to parse the development and test sets. [sent-185, score-0.276]

65 The generative parser forms the first baseline method to which we compare our results. [sent-188, score-0.321]

66 We also train a reranker which makes use of the same features as we described above, but without marking whether each substructure occurs in the previous sentence. [sent-189, score-0.274]

67 This is thus a reranking method which does not make use of the previous context. [sent-190, score-0.182]

68 Again, we tried model averaging, but this produces less accurate parses. [sent-191, score-0.168]

69 We see that the reranked models outperform the generative baseline model in terms of F1, and that the reranked model that uses extra-sentential context outperforms the version that does not use extra-sentential context in the development set, but not in the test set. [sent-221, score-0.547]

70 Using Bikel’s randomized parsing evaluation comparator, we find that both reranking models outperform the baseline generative model with statistical significance for recall and precision. [sent-222, score-0.467]
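
A simplified version of such a randomized significance test is sketched below; it permutes per-sentence scores, whereas Bikel's comparator works from per-sentence bracket counts to compare corpus-level precision and recall, so this is only an illustration of the underlying idea.

    import random

    def approx_randomization_test(scores_a, scores_b, trials=10000, seed=0):
        # Paired approximate randomization: randomly swap the two systems'
        # per-sentence scores and count how often the absolute difference in
        # totals is at least as large as the observed difference.
        rng = random.Random(seed)
        observed = abs(sum(scores_a) - sum(scores_b))
        hits = 0
        for _ in range(trials):
            diff = 0.0
            for a, b in zip(scores_a, scores_b):
                if rng.random() < 0.5:
                    a, b = b, a
                diff += a - b
            if abs(diff) >= observed:
                hits += 1
        return (hits + 1) / (trials + 1)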

71 The context-ignorant reranker outperforms the context-aware reranker on recall (p < 0. [sent-223, score-0.444]

72 Table 8: A comparison of parsers specialized to exploit intra- or extra-sentential syntactic parallelism on section 23 in terms of the generative baseline they compare themselves against, the best F1 their non-baseline models achieve, and the absolute and relative improvements. [sent-260, score-0.581]

73 We compare the improvement that our context-aware model achieves over the generative baseline to that of other systems specialized to exploit intra- or extra-sentential parallelism. [sent-261, score-0.19]

74 We achieve a greater improvement despite the fact that our generative baseline provides a higher level of performance, and is presumably thus more difficult to improve upon (Table 8). [sent-262, score-0.23]

75 These systems do not compare themselves against a reranked model that does not use parallelism as we do in this work. [sent-263, score-0.312]

76 During inference, the Viterbi algorithm recovers the most probable sequence of parses, and this means that we are relying on the generative parser to provide the context (i. [sent-264, score-0.471]

77 We do another type of oracle analysis in which we provide the parser with the correct, manually annotated parse tree of the previous sentence when extracting features for the current sentence during training and parsing. [sent-267, score-0.242]

78 00% on sections 22 and 23 respectively, which is comparable to the best results of our reranking models. [sent-270, score-0.182]

79 4 Analysis: We now analyze several specific cases in the development set in which the reranker makes correct use of contextual information. [sent-273, score-0.265]

80 They concretely illustrate how context can improve parsing performance, and confirm our initial intuition that extra-sentential context can be useful for parsing. [sent-274, score-0.253]

81 (3) Generative/Context-ignorant: (S (S A BMA spokesman said “runaway medical costs” have made health insurance “a significant challenge) ,” and (S margins also have been pinched . )) [sent-276, score-0.228]

82 (4) Context-aware: (S (NP A BMA spokesman) (VP said “runaway medical costs” have made health insurance “a significant challenge,” and margins also have been pinched . )) [sent-281, score-0.176]

83 The baseline and the context-ignorant models parse the sentence as a conjunction of two S clauses, incorrectly limiting the scope of what is said by the BMA spokesman to the first conjunct. [sent-286, score-0.172]

84 By analyzing the features and feature weight values extracted from the parse sequence, we determined that the context-aware reranker is able to correct the analysis of the scoping due to a parallelism in the syntactic structure. [sent-287, score-0.676]

85 The original generative and context-ignorant parses posit that “either all markets” is a noun phrase, which is incorrect. [sent-304, score-0.358]

86 First, the reranker prefers a determiner to start an NP in a consistent context, as both surrounding sentences also contain this substructure. [sent-306, score-0.222]

87 The previous sentence also contains a conjunction CC followed by an S node under an S node, which the reranker prefers. [sent-307, score-0.222]

88 While these examples show contextual features to be useful for parsing coordinations, we also found context-awareness to be useful for other types of structural ambiguity such as PP attachment ambiguity. [sent-308, score-0.135]

89 Notice that the method we employ to correct coordination errors is different from previous approaches, which usually rely on lexical or syntactic similarity between conjuncts rather than between sentences. [sent-309, score-0.189]

90 Based on these analyses, it appears that context awareness provides a source of information for parsing which is not available to context-ignorant parsers. [sent-312, score-0.174]

91 We should thus consider integrating both types of features into the reranking parser to build on the advantages of each. [sent-313, score-0.313]

92 Extra-sentential features, on the other hand, are appropriate for capturing the syntactic consistency effects as we have demonstrated in this paper. [sent-315, score-0.322]

93 4 Conclusions: In this paper, we have examined evidence for syntactic consistency between neighbouring sentences. [sent-316, score-0.455]

94 First, we conducted a corpus analysis of the Penn Treebank WSJ, and showed that parallelism exists between sentences for productions with a variety of LHS types, generalizing previous results for noun phrase structure. [sent-317, score-0.438]

95 Then, we explored a novel source of features for parsing informed by the extra-sentential context. [sent-318, score-0.173]

96 We improved the parsing accuracy over a generative baseline parser, and rivalled a similar reranking model that does not rely on extra-sentential context. [sent-319, score-0.545]

97 By examining the subsets of the evaluation data on which each model performs best and also individual cases, we argue that context allows a type of structural ambiguity resolution not available to parsers which only rely on intrasentential context. [sent-320, score-0.214]

98 Parallelism in coordination as an instance of syntactic priming: Evidence from corpus-based modeling. [sent-366, score-0.145]

99 Integrating syntactic priming into an incremental probabilistic parser, with an application to psycholinguistic modeling. [sent-373, score-0.348]

100 Coordinate noun phrase disambiguation in a generative parsing model. [sent-394, score-0.285]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('yt', 0.241), ('parallelism', 0.234), ('reranker', 0.222), ('consistency', 0.214), ('dubey', 0.208), ('productions', 0.204), ('lhs', 0.2), ('generative', 0.19), ('reranking', 0.182), ('priming', 0.18), ('parses', 0.168), ('parser', 0.131), ('vbd', 0.121), ('np', 0.117), ('syntactic', 0.108), ('vp', 0.108), ('edge', 0.104), ('bma', 0.104), ('markovization', 0.104), ('productiontypes', 0.104), ('parsing', 0.095), ('adaptation', 0.093), ('labelling', 0.093), ('evidence', 0.092), ('production', 0.091), ('ascent', 0.089), ('wsj', 0.084), ('gradient', 0.081), ('prime', 0.081), ('context', 0.079), ('constructions', 0.078), ('extrasentential', 0.078), ('gries', 0.078), ('phras', 0.078), ('reranked', 0.078), ('nns', 0.077), ('parse', 0.076), ('finkel', 0.075), ('sequence', 0.071), ('appearance', 0.067), ('rerank', 0.067), ('prior', 0.061), ('schema', 0.06), ('psycholinguistic', 0.06), ('penn', 0.06), ('markets', 0.056), ('conditional', 0.055), ('construction', 0.052), ('buch', 0.052), ('cheung', 0.052), ('deciles', 0.052), ('jousse', 0.052), ('lexi', 0.052), ('lhss', 0.052), ('liquidity', 0.052), ('oront', 0.052), ('pinched', 0.052), ('primed', 0.052), ('reitter', 0.052), ('runaway', 0.052), ('substructure', 0.052), ('trading', 0.052), ('withinsentence', 0.052), ('spokesman', 0.052), ('positive', 0.05), ('parsers', 0.049), ('treebank', 0.049), ('crf', 0.049), ('petrov', 0.047), ('coordinate', 0.046), ('subsets', 0.046), ('ubler', 0.044), ('volatility', 0.044), ('iterating', 0.044), ('conjuncts', 0.044), ('rhs', 0.044), ('pickering', 0.044), ('said', 0.044), ('implement', 0.043), ('development', 0.043), ('occurred', 0.042), ('examined', 0.041), ('insurance', 0.04), ('kfk', 0.04), ('margins', 0.04), ('tsuruoka', 0.04), ('toronto', 0.04), ('substructures', 0.04), ('shortly', 0.04), ('greater', 0.04), ('structural', 0.04), ('xml', 0.037), ('sha', 0.037), ('bins', 0.037), ('coordination', 0.037), ('studies', 0.037), ('weight', 0.036), ('log', 0.036), ('ten', 0.035), ('oracle', 0.035)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

Author: Jackie Chi Kit Cheung ; Gerald Penn

Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.

2 0.16765478 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

Author: Eugene Charniak

Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.

3 0.14013091 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing

Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi

Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.

4 0.13594201 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

Author: Stephen Boxwell ; Dennis Mehay ; Chris Brew

Abstract: In many NLP systems, there is a unidirectional flow of information in which a parser supplies input to a semantic role labeler. In this paper, we build a system that allows information to flow in both directions. We make use of semantic role predictions in choosing a single-best parse. This process relies on an averaged perceptron model to distinguish likely semantic roles from erroneous ones. Our system penalizes parses that give rise to low-scoring semantic roles. To explore the consequences of this we perform two experiments. First, we use a baseline generative model to produce n-best parses, which are then re-ordered by our semantic model. Second, we use a modified version of our semantic role labeler to predict semantic roles at parse time. The performance of this modified labeler is weaker than that of our best full SRL, because it is restricted to features that can be computed directly from the parser’s packed chart. For both experiments, the resulting semantic predictions are then used to select parses. Finally, we feed the selected parses produced by each experiment to the full version of our semantic role labeler. We find that SRL performance can be improved over this baseline by selecting parses with likely semantic roles.

5 0.12838988 114 emnlp-2010-Unsupervised Parse Selection for HPSG

Author: Rebecca Dridan ; Timothy Baldwin

Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.

6 0.096495286 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

7 0.093952529 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning

8 0.08869382 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

9 0.08472959 104 emnlp-2010-The Necessity of Combining Adaptation Methods

10 0.081509843 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

11 0.079840295 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction

12 0.076668143 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars

13 0.075950712 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models

14 0.07265991 39 emnlp-2010-EMNLP 044

15 0.072342806 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

16 0.070894085 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

17 0.068916641 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

18 0.068728112 88 emnlp-2010-On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing

19 0.066121049 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech

20 0.066014633 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.243), (1, 0.099), (2, 0.229), (3, 0.046), (4, 0.016), (5, 0.164), (6, 0.108), (7, -0.042), (8, 0.02), (9, 0.009), (10, 0.105), (11, -0.037), (12, 0.037), (13, 0.166), (14, -0.005), (15, -0.018), (16, 0.035), (17, 0.027), (18, -0.013), (19, -0.021), (20, 0.04), (21, 0.114), (22, 0.023), (23, 0.042), (24, 0.0), (25, -0.002), (26, -0.1), (27, -0.035), (28, 0.109), (29, 0.13), (30, -0.066), (31, 0.082), (32, 0.07), (33, 0.028), (34, -0.029), (35, 0.022), (36, -0.042), (37, 0.071), (38, 0.087), (39, -0.002), (40, 0.148), (41, -0.028), (42, 0.1), (43, 0.007), (44, 0.02), (45, 0.004), (46, 0.137), (47, -0.0), (48, 0.002), (49, -0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9521271 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

Author: Jackie Chi Kit Cheung ; Gerald Penn

Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.

2 0.73628676 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

Author: Eugene Charniak

Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.

3 0.63498062 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning

Author: Roi Reichart ; Ari Rappoport

Abstract: We introduce a novel training algorithm for unsupervised grammar induction, called Zoomed Learning. Given a training set T and a test set S, the goal of our algorithm is to identify subset pairs Ti, Si of T and S such that when the unsupervised parser is trained on a training subset Ti its results on its paired test subset Si are better than when it is trained on the entire training set T. A successful application of zoomed learning improves overall performance on the full test set S. We study our algorithm’s effect on the leading algorithm for the task of fully unsupervised parsing (Seginer, 2007) in three different English domains, WSJ, BROWN and GENIA, and show that it improves the parser F-score by up to 4.47%.

4 0.60391802 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing

Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi

Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.

5 0.5888862 114 emnlp-2010-Unsupervised Parse Selection for HPSG

Author: Rebecca Dridan ; Timothy Baldwin

Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.

6 0.57700521 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

7 0.49350083 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

8 0.40775651 40 emnlp-2010-Effects of Empty Categories on Machine Translation

9 0.38290647 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models

10 0.37970799 104 emnlp-2010-The Necessity of Combining Adaptation Methods

11 0.37434286 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

12 0.37040371 39 emnlp-2010-EMNLP 044

13 0.3429375 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

14 0.33021003 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech

15 0.31545678 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

16 0.31137666 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars

17 0.29721403 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

18 0.29117748 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

19 0.289152 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction

20 0.2748591 87 emnlp-2010-Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(12, 0.022), (29, 0.1), (30, 0.015), (52, 0.036), (56, 0.061), (66, 0.079), (72, 0.039), (76, 0.071), (79, 0.446), (87, 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.75910854 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

Author: Jackie Chi Kit Cheung ; Gerald Penn

Abstract: Syntactic consistency is the preference to reuse a syntactic construction shortly after its appearance in a discourse. We present an analysis of the WSJ portion of the Penn Treebank, and show that syntactic consistency is pervasive across productions with various lefthand side nonterminals. Then, we implement a reranking constituent parser that makes use of extra-sentential context in its feature set. Using a linear-chain conditional random field, we improve parsing accuracy over the generative baseline parser on the Penn Treebank WSJ corpus, rivalling a similar model that does not make use of context. We show that the context-aware and the context-ignorant rerankers perform well on different subsets of the evaluation data, suggesting a combined approach would provide further improvement. We also compare parses made by models, and suggest that context can be useful for parsing by capturing structural dependencies between sentences as opposed to lexically governed dependencies.

2 0.72839916 48 emnlp-2010-Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails

Author: Shafiq Joty ; Giuseppe Carenini ; Gabriel Murray ; Raymond T. Ng

Abstract: This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To our knowledge, this is the first such email corpus. We show how the existing topic segmentation models (i.e., Lexical Chain Segmenter (LCSeg) and Latent Dirichlet Allocation (LDA)) which are solely based on lexical information, can be applied to emails. By pointing out where these methods fail and what any desired model should consider, we propose two novel extensions of the models that not only use lexical information but also exploit finer level conversation structure in a principled way. Empirical evaluation shows that LCSeg is a better model than LDA for segmenting an email thread into topical clusters and incorporating conversation structure into these models improves the performance significantly.

3 0.35609195 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

Author: Chen Zhang ; Joyce Chai

Abstract: While a significant amount of research has been devoted to textual entailment, automated entailment from conversational scripts has received less attention. To address this limitation, this paper investigates the problem of conversation entailment: automated inference of hypotheses from conversation scripts. We examine two levels of semantic representations: a basic representation based on syntactic parsing from conversation utterances and an augmented representation taking into consideration of conversation structures. For each of these levels, we further explore two ways of capturing long distance relations between language constituents: implicit modeling based on the length of distance and explicit modeling based on actual patterns of relations. Our empirical findings have shown that the augmented representation with conversation structures is important, which achieves the best performance when combined with explicit modeling of long distance relations.

4 0.35575324 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

Author: Zhongqiang Huang ; Martin Cmejrek ; Bowen Zhou

Abstract: In this paper, we present a novel approach to enhance hierarchical phrase-based machine translation systems with linguistically motivated syntactic features. Rather than directly using treebank categories as in previous studies, we learn a set of linguistically-guided latent syntactic categories automatically from a source-side parsed, word-aligned parallel corpus, based on the hierarchical structure among phrase pairs as well as the syntactic structure of the source side. In our model, each X nonterminal in a SCFG rule is decorated with a real-valued feature vector computed based on its distribution of latent syntactic categories. These feature vectors are utilized at decod- ing time to measure the similarity between the syntactic analysis of the source side and the syntax of the SCFG rules that are applied to derive translations. Our approach maintains the advantages of hierarchical phrase-based translation systems while at the same time naturally incorporates soft syntactic constraints.

5 0.35292372 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields

Author: Wei Lu ; Hwee Tou Ng

Abstract: This paper focuses on the task of inserting punctuation symbols into transcribed conversational speech texts, without relying on prosodic cues. We investigate limitations associated with previous methods, and propose a novel approach based on dynamic conditional random fields. Different from previous work, our proposed approach is designed to jointly perform both sentence boundary and sentence type prediction, and punctuation prediction on speech utterances. We performed evaluations on a transcribed conversational speech domain consisting of both English and Chinese texts. Empirical results show that our method outperforms an approach based on linear-chain conditional random fields and other previous approaches.

6 0.34907854 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech

7 0.3487367 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

8 0.34831852 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

9 0.34029362 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

10 0.33851501 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation

11 0.33666229 13 emnlp-2010-A Simple Domain-Independent Probabilistic Approach to Generation

12 0.33355922 40 emnlp-2010-Effects of Empty Categories on Machine Translation

13 0.33234048 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

14 0.33046705 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing

15 0.32991827 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions

16 0.32691571 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

17 0.32691538 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats

18 0.32683608 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

19 0.32547998 81 emnlp-2010-Modeling Perspective Using Adaptor Grammars

20 0.3224223 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification