emnlp emnlp2012 emnlp2012-3 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Annie Louis ; Ani Nenkova
Abstract: We introduce a model of coherence which captures the intentional discourse structure in text. Our work is based on the hypothesis that syntax provides a proxy for the communicative goal of a sentence and therefore the sequence of sentences in a coherent discourse should exhibit detectable structural patterns. Results show that our method has high discriminating power for separating out coherent and incoherent news articles, reaching accuracies of up to 90%. We also show that our syntactic patterns are correlated with manual annotations of intentional structure for academic conference articles and can successfully predict the coherence of abstract, introduction and related work sections of these articles.
A coherence model based on syntactic patterns
Annie Louis and Ani Nenkova
University of Pennsylvania, Philadelphia, PA 19104, USA
lannie@seas.upenn.edu
1 Introduction

Recent studies have introduced successful automatic methods to predict the structure and coherence of texts. Grosz and Sidner (1986) propose that three factors contribute to coherence: intentional structure (purpose of discourse), attentional structure (what items are discussed) and the organization of discourse segments. The highly successful entity approaches capture attentional structure, and content approaches are related to topic segments, but intentional structure has largely been neglected. An author writes an article to serve a particular discourse purpose; as a result, each sentence in the article has a communicative goal, and the sequence of goals helps the author achieve the discourse purpose.
In this work, we introduce a model to capture coherence from the intentional structure dimension. Our key proposal is that syntactic patterns are a useful proxy for intentional structure. These are examples of syntactic patterns related to the communicative goals of individual sentences.
Table 1: The first two sentences of two descriptive articles.
[...] using syntactic features such as the presence of a topicalized phrase providing the focus of the sentence.
The two sets of sentences have a similar sequence of communicative goals, and so we can expect the syntax of adjacent sentences to also be related. We aim to characterize this relationship on a broad scale using a coherence model based entirely on syntax. The model relies on two assumptions which summarize our intuitions about syntax and intentional structure:

1. Sentences with similar syntax are likely to have the same communicative goal.
2. Regularities in intentional structure will be manifested in syntactic regularities between adjacent sentences.
Cheung and Penn (2010) find that a better syntactic parse of a sentence can be derived when the syntax of adjacent sentences is also taken into account. Lin et al. (2009) report that the syntactic productions in adjacent sentences are powerful features for predicting which discourse relation (cause, contrast, etc.) holds between the sentences.
In our model, syntax is represented either as parse tree productions or as a sequence of phrasal nodes augmented with part-of-speech tags. Results show that syntax models can distinguish coherent and incoherent news articles from two domains with 75-90% accuracies over a 50% baseline. In addition, the syntax coherence scores turn out to be complementary to scores given by lexical and entity models. We also study our models' predictions on academic articles, a genre where intentional structure is widely studied. Sections in these articles have well-defined purposes, and we find recurring sentence types such as motivation, citations, description, and speculations. We also present results on coherence prediction: our model can distinguish the introduction section of conference papers from its perturbed versions with over 70% accuracy. Further, our model is able to distinguish conference from workshop papers with good accuracy, given that we can expect these articles to vary in purpose.
2 Evidence for syntactic coherence

We first present a pilot study that confirms that adjacent sentences in discourse exhibit stable patterns of syntactic co-occurrence. This study validates our second assumption, relating the syntax of adjacent sentences. Later, in Section 6, we examine syntactic patterns in individual sentences (assumption 1) using a corpus of academic articles where sentences were manually annotated with communicative goals. Prior work has reported that certain grammatical productions are repeated in adjacent sentences more often than would be expected by chance (Reitter et al., 2006). We enumerate all productions that appear in the syntactic parse of any sentence and exclude those that appear fewer than 25 times, resulting in a list of 197 unique productions.
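To make the filtering step concrete, here is a small sketch (our own illustration, not the authors' code) that counts productions over a parsed corpus with NLTK and keeps only those above the 25-occurrence threshold mentioned above; the function names and the bracketed-parse input format are assumptions.

```python
from collections import Counter
from nltk import Tree

def production_counts(parsed_sentences):
    """Count grammatical productions (e.g. 'S -> NP VP') over a corpus.

    parsed_sentences: iterable of bracketed parse strings, e.g.
    "(S (NP (DT The) (NN market)) (VP (VBD fell)))".
    """
    counts = Counter()
    for parse in parsed_sentences:
        tree = Tree.fromstring(parse)
        for prod in tree.productions():   # every rule used in the parse
            if prod.is_nonlexical():      # skip POS -> word rules
                counts[str(prod)] += 1
    return counts

def frequent_productions(parsed_sentences, min_count=25):
    """Keep only productions observed at least min_count times."""
    counts = production_counts(parsed_sentences)
    return {p for p, c in counts.items() if c >= min_count}
```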
Some of these other sequences can be explained by the fact that these articles come from the finance domain: they involve productions containing numbers and quantities. Here the intentional structure is INTRODUCE X / STATEMENT BY X. In the remainder of the paper we formalize our representation of syntax and the derived model of coherence, and test its efficacy in three domains. We present two coherence models: a local model which captures the co-occurrence of structural features in adjacent sentences, and a global one which learns from clusters of sentences with similar syntax. The main verb of a sentence is central to its structure, so the parameter d is always set to be greater than the depth of the main verb and is tuned to optimize performance for coherence prediction.
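The excerpt does not spell out how a d-sequence is built, so the sketch below commits to one plausible reading: cut the parse tree at depth d and read off, left to right, the labels at the cut, with part-of-speech nodes standing in wherever the tree bottoms out earlier. Treat this as an assumption rather than the paper's exact definition.

```python
from nltk import Tree

def d_sequence(parse_str, d):
    """One plausible d-sequence: the left-to-right labels obtained by
    cutting the parse tree at depth d; where the tree bottoms out
    earlier, the part-of-speech node stands in for its word."""
    tree = Tree.fromstring(parse_str)

    def collect(node, depth):
        # Stop at the cut depth, or at a preterminal (POS over a word).
        if depth == d or all(not isinstance(c, Tree) for c in node):
            return [node.label()]
        out = []
        for child in node:
            if isinstance(child, Tree):
                out.extend(collect(child, depth + 1))
        return out

    return collect(tree, 0)

# d_sequence("(S (NP (DT The) (NN market)) (VP (VBD fell)))", 1)
# -> ['NP', 'VP']
```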
3.2 Implementing the model

We adapt two models of coherence to operate over the two syntactic representations. The local model allows us to test the assumption that coherent discourse is characterized by syntactic regularities in adjacent sentences. We estimate the probabilities of pairs of syntactic items from adjacent sentences in the training data and use these probabilities to compute the coherence of new texts. The coherence of a text T containing n sentences (S1, ..., Sn) is computed from the probabilities of the syntactic item pairs observed in its adjacent sentences. Items are either productions or syntactic word unigrams, depending on the representation.
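The exact scoring formula is not reproduced in this excerpt, so the sketch below makes one concrete, hypothetical choice: train pair probabilities with simple additive smoothing and score a text by the average log probability of item pairs drawn from adjacent sentences.

```python
import math
from collections import Counter
from itertools import product

class LocalCoherenceModel:
    """Pair model over syntactic items (productions or d-sequence items)
    in adjacent sentences.  The normalization below (average log
    probability over pairs) is an assumption, not the paper's formula."""

    def __init__(self, smoothing=1.0):
        self.pair_counts = Counter()
        self.left_counts = Counter()
        self.vocab = set()
        self.smoothing = smoothing

    def train(self, documents):
        # documents: list of texts; each text is a list of sentences;
        # each sentence is a list of syntactic items (strings).
        for doc in documents:
            for prev, curr in zip(doc, doc[1:]):
                for x, y in product(prev, curr):
                    self.pair_counts[(x, y)] += 1
                    self.left_counts[x] += 1
                    self.vocab.update((x, y))

    def _log_p(self, x, y):
        num = self.pair_counts[(x, y)] + self.smoothing
        den = self.left_counts[x] + self.smoothing * len(self.vocab)
        return math.log(num / den)

    def score(self, doc):
        """Average log probability of item pairs in adjacent sentences."""
        logps = [self._log_p(x, y)
                 for prev, curr in zip(doc, doc[1:])
                 for x, y in product(prev, curr)]
        return sum(logps) / max(len(logps), 1)
```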
The top two descriptive productions for each cluster are also listed.

3.2.2 Global structure

Now we turn to a global coherence approach that implements the assumption that sentences with similar syntax have the same communicative goal, and that also captures the patterns of communicative goals in the discourse. This approach uses a Hidden Markov Model (HMM), which has been a popular implementation for modeling coherence (Barzilay and Lee, 2004; Fung and Ngai, 2006; Elsner et al., 2007). The hidden states in our model depict communicative goals by encoding a probability distribution over syntactic items. This distribution gives higher weight to syntactic items that are more likely for that communicative goal. Transitions between states record the common patterns in intentional structure for the domain. For the productions representation of syntax, the features for clustering are the number of times a given production appeared in the parse of the sentence. Table 4 shows sentences from two clusters formed on the abstracts of journal articles using the productions representation. For the productions representation, the emission distribution of a state hk is the unigram distribution of productions from the sentences in hk.
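Under stated assumptions (k-means as the clustering step and a particular smoothing value, neither of which the excerpt specifies), the global model can be sketched as follows: cluster sentences by production counts, treat clusters as hidden states with smoothed unigram emissions, and estimate transitions from the cluster sequences of training documents.

```python
import math
from collections import Counter, defaultdict
from sklearn.cluster import KMeans
from sklearn.feature_extraction import DictVectorizer

def train_syntax_hmm(documents, k=8, alpha=0.1):
    """documents: list of docs; each doc is a list of sentences; each
    sentence is a Counter of production counts.  Returns per-state
    emission and transition log-probability tables."""
    sentences = [s for doc in documents for s in doc]
    X = DictVectorizer().fit_transform(sentences)
    states = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

    vocab = {p for s in sentences for p in s}
    emit_counts = defaultdict(Counter)
    trans_counts = defaultdict(Counter)
    idx = 0
    for doc in documents:
        labels = [int(l) for l in states[idx:idx + len(doc)]]
        idx += len(doc)
        for lab, sent in zip(labels, doc):
            emit_counts[lab].update(sent)   # emissions: productions per state
        for a, b in zip(labels, labels[1:]):
            trans_counts[a][b] += 1         # transitions within a document

    def log_dist(counter, support):
        total = sum(counter.values()) + alpha * len(support)
        return {x: math.log((counter[x] + alpha) / total) for x in support}

    emissions = {h: log_dist(emit_counts[h], vocab) for h in range(k)}
    transitions = {h: log_dist(trans_counts[h], range(k)) for h in range(k)}
    return emissions, transitions
```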
4 Content and entity grid models

We compare the syntax model with the content model and entity grid methods.

5 Evaluating syntactic coherence

We follow the common approach from prior work and use pairs of articles, where one has the original document order and the other is a random permutation of the sentences from the same document. This setting is not ideal but has become the de facto standard for evaluation of coherence models (Barzilay and Lee, 2004; Elsner et al., 2007). A recent study (Lin et al., 2011) shows that people identify the original article as more coherent than its permutations with over 90% accuracy, and assessors also have high agreement.
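The pairwise evaluation can be set up as below; `model.score` is an assumed interface standing in for any of the coherence models, and the number of permutations per document is illustrative.

```python
import random

def permutation_accuracy(model, documents, perms_per_doc=20, seed=0):
    """Fraction of (original, permutation) pairs where the model scores
    the original ordering higher.  `model.score(list_of_sentences)` is
    an assumed interface."""
    rng = random.Random(seed)
    correct = total = 0
    for doc in documents:
        for _ in range(perms_per_doc):
            perm = doc[:]
            rng.shuffle(perm)
            if perm == doc:           # skip accidental identity permutations
                continue
            total += 1
            if model.score(doc) > model.score(perm):
                correct += 1
    return correct / total if total else 0.0
```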
Later, we present an experiment distinguishing conference from workshop articles as a more realistic evaluation. We use two corpora that are widely employed for coherence prediction (Barzilay and Lee, 2004; Elsner et al., 2007). These corpora were chosen since, within each dataset, the articles have the same intentional structure. Further, these corpora are also standard ones used in prior work on lexical, entity and discourse-relation-based coherence models. Later, in Section 6, we show that the models perform well on the academic genre and longer articles too. For each of the two corpora, we have 100 articles for training, and 100 (accidents) and 99 (earthquakes) for testing. The articles were parsed using the Stanford parser (Klein and Manning, 2003). After tuning, the final model was trained on all 100 articles in the training set.
Overall, the syntax models work quite well, with accuracies at least 15% absolute above the baseline. In the local co-occurrence approach, both productions and d-sequences provide 72% accuracy on the accidents corpus. Overall, both productions and d-sequences work competitively and give the best accuracies when implemented with the global approach.

5.2 Comparison with other approaches

For our implementations of the content and entity grid models, the best accuracies are 71% on the accidents corpus and 85% on the earthquakes one, similar to the syntactic models. For instance, to combine content models and the entity grid, two features are created: one of these records the difference in log probabilities for the two articles from the content model; the other indicates the difference in probabilities from the entity grid.
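Concretely, the combination reduces to a pairwise classifier over per-model score differences. The sketch below uses a linear SVM, which is our choice for illustration; the excerpt specifies only the difference features.

```python
import numpy as np
from sklearn.svm import LinearSVC

def difference_features(score_pairs):
    """score_pairs: list of dicts such as
    {"content": (score_a, score_b), "egrid": (score_a, score_b)}.
    Each feature records the difference between the scores that one
    model assigns to the two articles of a pair."""
    names = sorted(score_pairs[0])
    return np.array([[p[m][0] - p[m][1] for m in names] for p in score_pairs])

# Training (labels = 1 when article a of the pair is the original):
# X = difference_features(pairs)
# clf = LinearSVC().fit(X, labels)
```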
In each fold, the training is done using the pairs from 90 articles and tested on permutations from the remaining 10 articles. We find that syntax supplements both content and entity grid methods. While on the airplane corpus syntax only combines well with the entity grid, on the earthquake corpus both entity and content approaches give better accuracies when combined with syntax. In prior work, content and entity grid methods have been combined generatively (Elsner et al., 2007).
6 Predictions on academic articles

The distinctive intentional structure of academic articles has motivated several proposals to define and annotate the communicative purpose (argumentative zone) of each sentence (Swales, 1990; Teufel et al., 1999). So we expect that these articles form a good testbed for our models. In the remainder of the paper, we examine how unsupervised patterns discovered by our approach relate to zones and how well our models predict coherence for articles from this genre. ART Corpus: contains a set of 225 Chemistry journal articles that were manually annotated for intentional structure (Liakata and Soldatova, 2008). We create two test sets: one has 500 ACL-NAACL conference articles and the other has 500 articles from ACL-sponsored workshops. We only choose articles in which all three sections—abstract, introduction and related work—are present. (Some articles did not have labelled 'introduction' sections, resulting in fewer examples for this setup.)
For each corpus and each section, we train all our syntactic models: the two local coherence models using the production and d-sequence representations, and the HMM models with the two representations. The zone annotations present in this corpus allow us to directly test our first assumption in this work: that sentences with similar syntax have the same communicative goal. We examine the clusters created by these models on the training data and check whether there are clusters which strongly involve sentences from some particular annotated zone. For each possible pair of cluster and zone (Ci, Zj), we compute c(Ci, Zj): the number of sentences in Ci that are annotated as zone Zj.
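Computing these counts is a single contingency pass over parallel lists of cluster assignments and gold zones; the significance check in the trailing comments is one standard option (chi-square over the full table), not necessarily the paper's exact test.

```python
from collections import Counter

def cluster_zone_counts(cluster_ids, zone_labels):
    """c(Ci, Zj): the number of sentences assigned to cluster Ci whose
    gold annotation is zone Zj; inputs are parallel per-sentence lists."""
    return Counter(zip(cluster_ids, zone_labels))

# One standard significance check over the full table:
# from scipy.stats import chi2_contingency
# table = [[counts[(c, z)] for z in zones] for c in clusters]
# chi2, p, dof, expected = chi2_contingency(table)
```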
The HMM-prod model for abstracts has 9 clusters (named Clus0 to Clus8) and the HMM-d-seq model for introductions has 6 clusters (Clus0 to Clus5). Not associated: Clus1 - Motivation, Clus2 - Goal, Clus4 - Background, Clus5 - Model (Table 7: Cluster-Zone mappings on the ART Corpus). The presence of significant associations validates our intuition that syntax provides clues about communicative goals. Other clusters have high recall of a zone: 55% of all Goal sentences from the abstracts training data are captured by Clus7. It is particularly interesting to see that Clus7 of abstracts captures both Objective and Goal zone sentences, and for introductions, Clus4 is a mix of Hypothesis and Goal sentences, which intuitively are closely related categories.
6.2 Original versus permuted sections

We also explore the accuracy of the syntax models for predicting the coherence of articles from the test set of the ART corpus and the 500 test articles from ACL-NAACL conferences. For testing, we assume that the oracle zone is provided for each sentence and use the model to predict the likelihood of the zone sequence.
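With oracle zones, scoring a section reduces to the likelihood of its zone label sequence under a first-order Markov chain; a sketch follows, where the smoothing constant is our own choice.

```python
import math
from collections import Counter, defaultdict

def train_zone_chain(zone_sequences, alpha=0.1):
    """Estimate a first-order Markov chain over zone labels from training
    documents (each a list of zone labels, one per sentence).  Returns a
    function scoring the log likelihood of a new zone sequence."""
    zones = {z for seq in zone_sequences for z in seq}
    start, trans = Counter(), defaultdict(Counter)
    for seq in zone_sequences:
        start[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1

    def log_p(counter, x):
        total = sum(counter.values()) + alpha * len(zones)
        return math.log((counter[x] + alpha) / total)

    def score(seq):
        lp = log_p(start, seq[0])
        for a, b in zip(seq, seq[1:]):
            lp += log_p(trans[a], b)
        return lp

    return score
```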
These results are lower than those obtained on the earthquake/accident corpora, but the task here is much harder: the articles are longer, and the ACL corpus also has OCR errors which affect sentence segmentation and parsing accuracy. When the oracle zones are known, the accuracies are much higher on the ART corpus, indicating that the intentional structure of academic articles is very predictive of their coherence. So the way information is conveyed in the abstracts and introductions would vary in these articles. We perform this analysis on the ACL corpus; no permutations are used, only the original text of the 500 articles each in the conference and workshop test sets. In the permutation setting, for example, both original and permuted articles have the same length. We use perplexity rather than probability because the lengths of the articles vary widely, in contrast to the previous permutation-based tests, where both the permutation and the original article have the same length.
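Perplexity normalizes the model score for length. Given a document's total log likelihood and the number of scored items, it is simply:

```python
import math

def perplexity(log_likelihood, n_items):
    """exp of the negative average log likelihood (natural log assumed),
    so that articles of very different lengths become comparable."""
    return math.exp(-log_likelihood / n_items)
```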
We represent sentences in the training set as either productions or d-sequence items and compute pairs of associated items (xi, xj) from adjacent sentences using the same chi-square test as in our pilot study.
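The association test can be sketched as a 2x2 chi-square per item pair over adjacent-sentence co-occurrence; the significance threshold below (3.84, i.e. p < 0.05 with one degree of freedom) is illustrative.

```python
from collections import Counter
from itertools import product
from scipy.stats import chi2_contingency

def associated_pairs(documents, threshold=3.84):
    """Item pairs (xi, xj) that co-occur in adjacent sentences more often
    than chance, via a 2x2 chi-square test per pair.  documents: list of
    docs, each a list of sentences (lists of syntactic items)."""
    pair, left, right, n = Counter(), Counter(), Counter(), 0
    for doc in documents:
        for prev, curr in zip(doc, doc[1:]):
            n += 1
            for x in set(prev):
                left[x] += 1
            for y in set(curr):
                right[y] += 1
            for x, y in product(set(prev), set(curr)):
                pair[(x, y)] += 1

    result = []
    for (x, y), a in pair.items():
        b = left[x] - a           # x in first sentence, y absent from second
        c = right[y] - a          # y in second sentence, x absent from first
        d = n - a - b - c         # neither item present
        if min(a + b, c + d, a + c, b + d) == 0:
            continue              # degenerate table
        chi2, p, dof, exp = chi2_contingency([[a, b], [c, d]],
                                             correction=False)
        if chi2 >= threshold and a * d > b * c:   # positively associated
            result.append((x, y, chi2))
    return result
```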
For abstracts and related work, these accuracies are significantly better than baseline (95% confidence level from a two-sided paired t-test comparing the accuracies from the 10 folds).
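The significance test corresponds to a routine paired t-test over the per-fold accuracies, e.g.:

```python
from scipy.stats import ttest_rel

def significantly_better(model_accs, baseline_accs, alpha=0.05):
    """Two-sided paired t-test over per-fold accuracies (here 10 folds)."""
    t, p = ttest_rel(model_accs, baseline_accs)
    return p < alpha and t > 0
```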
Accuracies at different classifier confidence levels are shown in Table 9; the proportion of examples under each setting is also indicated.

Table 9: Accuracy (% of examples) above each confidence level for the conference versus workshop task.

Confidence   Abstract       Intro          Rel wk
all          59.3 (100.0)   50.3 (100.0)   55.4 (100.0)
>= 0.6       63.8 (67.2)    50.8 (71.1)    58.6 (75.9)
>= 0.7       67.2 (32.0)    54.4 (38.6)    63.3 (52.8)
>= 0.8       74.0 (10.0)    51.6 (22.0)    63.0 (25.7)
>= 0.9       91.7 (2.0)     30.6 (5.0)     68.1 (7.2)

When only examples above 0.6 confidence are examined, the classifier has a higher accuracy of 63.8% for abstracts and covers close to 70% of the examples. Similarly, when a cutoff of 0.7 is applied to the confidence for predicting related work sections, we achieve 63.3% accuracy for 53% of the examples. So we can consider that 30 to 47% of the examples in the two sections respectively are harder to tell apart. Interestingly, however, even high-confidence predictions on introductions remain inaccurate. These results show that our model can successfully distinguish the structure of articles beyond just clearly incoherent permutation examples.
7 Conclusion

Our work is the first to develop an unsupervised model for intentional structure and to show that it has good accuracy for coherence prediction and also complements entity and lexical structure of discourse. This result raises interesting questions about how the patterns captured by these different coherence metrics vary and how they can be combined usefully for predicting coherence. We plan to explore these ideas in future work. We also want to analyze genre differences to understand if the strength of these coherence dimensions varies with genre.

Acknowledgements

This work is partially supported by a Google research grant and NSF CAREER 0953445 award.
References

Regina Barzilay and Mirella Lapata. 2008. Modeling local coherence: An entity-based approach. Computational Linguistics, 34(1):1-34.
Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of NAACL-HLT, pages 113-120.
Xavier Carreras, Michael Collins, and Terry Koo. 2008. TAG, dynamic programming, and the perceptron for efficient, feature-rich parsing. In Proceedings of CoNLL, pages 9-16.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of ACL, pages 173-180.
Jackie C.K. Cheung and Gerald Penn. 2010. Utilizing extra-sentential context for parsing. In Proceedings of EMNLP, pages 23-33.
Christelle Cocco, Raphaël Pittier, François Bavaud, and Aris Xanthos. 2011. Segmentation and clustering of textual sequences: a typological approach. In Proceedings of RANLP, pages 427-433.
Michael Collins and Terry Koo. 2005. Discriminative reranking for natural language parsing. Computational Linguistics, 31(1):25-70.
Isaac G. Councill, C. Lee Giles, and Min-Yen Kan. 2008. ParsCit: An open-source CRF reference string parsing package. In Proceedings of LREC, pages 661-667.
Micha Elsner and Eugene Charniak. 2008. Coreference-inspired coherence modeling. In Proceedings of ACL-HLT, Short Papers, pages 41-44.
Micha Elsner and Eugene Charniak. 2011. Extending the entity grid with entity-specific features. In Proceedings of ACL-HLT, pages 125-129.
Micha Elsner, Joseph Austerweil, and Eugene Charniak. 2007. A unified local and global model for discourse coherence. In Proceedings of NAACL-HLT, pages 436-443.
Pascale Fung and Grace Ngai. 2006. One story, one flow: Hidden Markov story models for multilingual multidocument summarization. ACM Transactions on Speech and Language Processing, 3(2):1-16.
Barbara J. Grosz and Candace L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.
Yufan Guo, Anna Korhonen, and Thierry Poibeau. 2011. A weakly-supervised approach to argumentative zoning of scientific documents. In Proceedings of EMNLP, pages 273-283.
Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL-HLT, pages 586-594.
Nikiforos Karamanis, Chris Mellish, Massimo Poesio, and Jon Oberlander. 2009. Evaluating centering for information ordering using corpora. Computational Linguistics, 35(1):29-46.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of ACL, pages 423-430.
Mirella Lapata and Regina Barzilay. 2005. Automatic evaluation of text coherence: Models and representations. In Proceedings of IJCAI.
Mirella Lapata. 2003. Probabilistic text structuring: Experiments with sentence ordering. In Proceedings of ACL, pages 545-552.
Maria Liakata and Larisa Soldatova. 2008. Guidelines for the annotation of general scientific concepts. JISC Project Report.
Maria Liakata, Simone Teufel, Advaith Siddharthan, and Colin Batchelor. 2010. Corpora for the conceptualisation and zoning of scientific papers. In Proceedings of LREC.
Ziheng Lin, Min-Yen Kan, and Hwee Tou Ng. 2009. Recognizing implicit discourse relations in the Penn Discourse Treebank. In Proceedings of EMNLP, pages 343-351.
Ziheng Lin, Hwee Tou Ng, and Min-Yen Kan. 2011. Automatically evaluating text coherence using discourse relations. In Proceedings of ACL-HLT, pages 997-1006.
Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1994. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313-330.
Emily Pitler and Ani Nenkova. 2008. Revisiting readability: A unified framework for predicting text quality. In Proceedings of EMNLP, pages 186-195.
Dragomir R. Radev, Mark Thomas Joseph, Bryan Gibson, and Pradeep Muthukrishnan. 2009. A bibliometric and network analysis of the field of Computational Linguistics. Journal of the American Society for Information Science and Technology.
David Reitter, Johanna D. Moore, and Frank Keller. 2006. Priming of syntactic rules in task-oriented dialogue and spontaneous conversation. In Proceedings of the 28th Annual Conference of the Cognitive Science Society, pages 685-690.
Jeffrey C. Reynar and Adwait Ratnaparkhi. 1997. A maximum entropy approach to identifying sentence boundaries. In Proceedings of the Fifth Conference on Applied Natural Language Processing, pages 16-19.
Radu Soricut and Daniel Marcu. 2006. Discourse generation using utility-trained coherence models. In Proceedings of COLING-ACL, pages 803-810.
John Swales. 1990. Genre analysis: English in academic and research settings, volume 11. Cambridge University Press.
Simone Teufel and Marc Moens. 2000. What's yours and what's mine: determining intellectual attribution in scientific text. In Proceedings of EMNLP, pages 9-17.
Simone Teufel, Jean Carletta, and Marc Moens. 1999. An annotation scheme for discourse-level argumentation in research articles. In Proceedings of EACL, pages 110-117.
Ying Zhao, George Karypis, and Usama Fayyad. 2005. Hierarchical clustering algorithms for document datasets. Data Mining and Knowledge Discovery, 10:141-168.
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
Author: Annie Louis ; Ani Nenkova
Abstract: We introduce a model of coherence which captures the intentional discourse structure in text. Our work is based on the hypothesis that syntax provides a proxy for the communicative goal of a sentence and therefore the sequence of sentences in a coherent discourse should exhibit detectable structural patterns. Results show that our method has high discriminating power for separating out coherent and incoherent news articles reaching accuracies of up to 90%. We also show that our syntactic patterns are correlated with manual annotations of intentional structure for academic conference articles and can successfully predict the coherence of abstract, introduction and related work sections of these articles. 59.3 (100.0) Intro 50.3 (100.0) 1166 Rel wk 55.4 (100.0) >= 0.663.8 (67.2)50.8 (71.1)58.6 (75.9) >= 0.7 67.2 (32.0) 54.4 (38.6) 63.3 (52.8) >= 0.8 74.0 (10.0) 51.6 (22.0) 63.0 (25.7) >= 0.9 91.7 (2.0) 30.6 (5.0) 68.1 (7.2) Table 9: Accuracy (% examples) above each confidence level for the conference versus workshop task. These results are shown in Table 9. The proportion of examples under each setting is also indicated. When only examples above 0.6 confidence are examined, the classifier has a higher accuracy of63.8% for abstracts and covers close to 70% of the examples. Similarly, when a cutoff of 0.7 is applied to the confidence for predicting related work sections, we achieve 63.3% accuracy for 53% of examples. So we can consider that 30 to 47% of the examples in the two sections respectively are harder to tell apart. Interestingly however even high confidence predictions on introductions remain incorrect. These results show that our model can successfully distinguish the structure of articles beyond just clearly incoherent permutation examples. 7 Conclusion Our work is the first to develop an unsupervised model for intentional structure and to show that it has good accuracy for coherence prediction and also complements entity and lexical structure of discourse. This result raises interesting questions about how patterns captured by these different coherence metrics vary and how they can be combined usefully for predicting coherence. We plan to explore these ideas in future work. We also want to analyze genre differences to understand if the strength of these coherence dimensions varies with genre. Acknowledgements This work is partially supported by a Google research grant and NSF CAREER 0953445 award. References Regina Barzilay and Mirella Lapata. 2008. Modeling local coherence: An entity-based approach. Computa- tional Linguistics, 34(1): 1–34. Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of NAACL-HLT, pages 113–120. Xavier Carreras, Michael Collins, and Terry Koo. 2008. Tag, dynamic programming, and the perceptron for efficient, feature-rich parsing. In Proceedings of CoNLL, pages 9–16. Eugene Charniak and Mark Johnson. 2005. Coarse-tofine n-best parsing and maxent discriminative reranking. In Proceedings of ACL, pages 173–180. Jackie C.K. Cheung and Gerald Penn. 2010. Utilizing extra-sentential context for parsing. In Proceedings of EMNLP, pages 23–33. Christelle Cocco, Rapha ¨el Pittier, Fran ¸cois Bavaud, and Aris Xanthos. 2011. Segmentation and clustering of textual sequences: a typological approach. In Proceedings of RANLP, pages 427–433. Michael Collins and Terry Koo. 2005. Discriminative reranking for natural language parsing. 
Computational Linguistics, 3 1:25–70. Isaac G. Councill, C. Lee Giles, and Min-Yen Kan. 2008. Parscit: An open-source crf reference string parsing package. In Proceedings of LREC, pages 661–667. Micha Elsner and Eugene Charniak. 2008. Coreferenceinspired coherence modeling. In Proceedings of ACLHLT, Short Papers, pages 41–44. Micha Elsner and Eugene Charniak. 2011. Extending the entity grid with entity-specific features. In Proceedings of ACL-HLT, pages 125–129. Micha Elsner, Joseph Austerweil, and Eugene Charniak. 2007. A unified local and global model for discourse coherence. In Proceedings of NAACL-HLT, pages 436–443. Pascale Fung and Grace Ngai. 2006. One story, one flow: Hidden markov story models for multilingual multidocument summarization. ACM Transactions on Speech and Language Processing, 3(2): 1–16. Barbara J. Grosz and Candace L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 3(12): 175–204. Yufan Guo, Anna Korhonen, and Thierry Poibeau. 2011. A weakly-supervised approach to argumentative zoning of scientific documents. In Proceedings of EMNLP, pages 273–283. Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL-HLT, pages 586–594, June. 1167 Nikiforos Karamanis, Chris Mellish, Massimo Poesio, and Jon Oberlander. 2009. Evaluating centering for information ordering using corpora. Computational Linguistics, 35(1):29–46. Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of ACL, pages 423–430. Mirella Lapata and Regina Barzilay. 2005. Automatic evaluation of text coherence: Models and representations. In Proceedings of IJCAI. Mirella Lapata. 2003. Probabilistic text structuring: Experiments with sentence ordering. In Proceedings of ACL, pages 545–552. Maria Liakata and Larisa Soldatova. 2008. Guidelines for the annotation of general scientific concepts. JISC Project Report. Maria Liakata, Simone Teufel, Advaith Siddharthan, and Colin Batchelor. 2010. Corpora for the conceptualisation and zoning of scientific papers. In Proceedings of LREC. Ziheng Lin, Min-Yen Kan, and Hwee Tou Ng. 2009. Recognizing implicit discourse relations in the Penn Discourse Treebank. In Proceedings of EMNLP, pages 343–351. Ziheng Lin, Hwee Tou Ng, and Min-Yen Kan. 2011. Automatically evaluating text coherence using discourse relations. In Proceedings of ACL-HLT, pages 997– 1006. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1994. Building a large annotated corpus of english: The penn treebank. Computational Linguistics, 19(2):313–330. Emily Pitler and Ani Nenkova. 2008. Revisiting readability: A unified framework for predicting text quality. In Proceedings of EMNLP, pages 186–195. Dragomir R. Radev, Mark Thomas Joseph, Bryan Gibson, and Pradeep Muthukrishnan. 2009. A Bibliometric and Network Analysis ofthe field of Computational Linguistics. Journal of the American Society for Information Science and Technology. David Reitter, Johanna D. Moore, and Frank Keller. 2006. Priming of Syntactic Rules in Task-Oriented Dialogue and Spontaneous Conversation. In Proceedings of the 28th Annual Conference of the Cognitive Science Society, pages 685–690. Jeffrey C. Reynar and Adwait Ratnaparkhi. 1997. A maximum entropy approach to identifying sentence boundaries. In Proceedings of the fifth conference on Applied natural language processing, pages 16–19. Radu Soricut and Daniel Marcu. 2006. Discourse generation using utility-trained coherence models. 
In Proceedings of COLING-ACL, pages 803–810. John Swales. 1990. Genre analysis: English in academic and research settings, volume 11. Cambridge University Press. Simone Teufel and Marc Moens. 2000. What’s yours and what’s mine: determining intellectual attribution in scientific text. In Proceedings of EMNLP, pages 9– 17. Simone Teufel, Jean Carletta, and Marc Moens. 1999. An annotation scheme for discourse-level argumentation in research articles. In Proceedings of EACL, pages 110–1 17. Ying Zhao, George Karypis, and Usama Fayyad. 2005. Hierarchical clustering algorithms for document datasets. Data Mining and Knowledge Discovery, 10: 141–168. 1168
2 0.11945637 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
Author: Keith Stevens ; Philip Kegelmeyer ; David Andrzejewski ; David Buttler
Abstract: We apply two new automated semantic evaluations to three distinct latent topic models. Both metrics have been shown to align with human evaluations and provide a balance between internal measures of information gain and comparisons to human ratings of coherent topics. We improve upon the measures by introducing new aggregate measures that allows for comparing complete topic models. We further compare the automated measures to other metrics for topic models, comparison to manually crafted semantic tests and document classification. Our experiments reveal that LDA and LSA each have different strengths; LDA best learns descriptive topics while LSA is best at creating a compact semantic representation ofdocuments and words in a corpus.
3 0.11756707 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
Author: Michael Roth ; Anette Frank
Abstract: Generating coherent discourse is an important aspect in natural language generation. Our aim is to learn factors that constitute coherent discourse from data, with a focus on how to realize predicate-argument structures in a model that exceeds the sentence level. We present an important subtask for this overall goal, in which we align predicates across comparable texts, admitting partial argument structure correspondence. The contribution of this work is two-fold: We first construct a large corpus resource of comparable texts, including an evaluation set with manual predicate alignments. Secondly, we present a novel approach for aligning predicates across comparable texts using graph-based clustering with Mincuts. Our method significantly outperforms other alignment techniques when applied to this novel alignment task, by a margin of at least 6.5 percentage points in F1-score.
4 0.10480966 19 emnlp-2012-An Entity-Topic Model for Entity Linking
Author: Xianpei Han ; Le Sun
Abstract: Entity Linking (EL) has received considerable attention in recent years. Given many name mentions in a document, the goal of EL is to predict their referent entities in a knowledge base. Traditionally, there have been two distinct directions of EL research: one focusing on the effects of mention’s context compatibility, assuming that “the referent entity of a mention is reflected by its context”; the other dealing with the effects of document’s topic coherence, assuming that “a mention ’s referent entity should be coherent with the document’ ’s main topics”. In this paper, we propose a generative model called entitytopic model, to effectively join the above two complementary directions together. By jointly modeling and exploiting the context compatibility, the topic coherence and the correlation between them, our model can – accurately link all mentions in a document using both the local information (including the words and the mentions in a document) and the global knowledge (including the topic knowledge, the entity context knowledge and the entity name knowledge). Experimental results demonstrate the effectiveness of the proposed model. 1
5 0.10220888 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
Author: Jonathan K. Kummerfeld ; David Hall ; James R. Curran ; Dan Klein
Abstract: Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for stateof-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
6 0.096733376 10 emnlp-2012-A Statistical Relational Learning Approach to Identifying Evidence Based Medicine Categories
7 0.09619464 27 emnlp-2012-Characterizing Stylistic Elements in Syntactic Structure
8 0.095337011 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
9 0.081301108 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
10 0.076842949 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
11 0.076125823 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
12 0.071590647 65 emnlp-2012-Improving NLP through Marginalization of Hidden Syntactic Structure
13 0.070639625 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
14 0.068983614 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
15 0.067403823 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
16 0.063886352 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
17 0.063588597 81 emnlp-2012-Learning to Map into a Universal POS Tagset
18 0.063186973 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
19 0.06286791 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
20 0.062300853 97 emnlp-2012-Natural Language Questions for the Web of Data
topicId topicWeight
[(0, 0.263), (1, 0.028), (2, 0.038), (3, 0.003), (4, -0.027), (5, 0.093), (6, -0.011), (7, -0.039), (8, -0.095), (9, 0.032), (10, -0.067), (11, -0.002), (12, -0.114), (13, 0.142), (14, 0.052), (15, 0.017), (16, 0.086), (17, -0.113), (18, -0.025), (19, 0.075), (20, 0.084), (21, 0.048), (22, 0.156), (23, -0.026), (24, 0.091), (25, 0.096), (26, 0.071), (27, -0.137), (28, 0.135), (29, -0.169), (30, -0.035), (31, 0.059), (32, 0.092), (33, -0.043), (34, 0.005), (35, 0.074), (36, -0.018), (37, 0.102), (38, -0.161), (39, -0.237), (40, 0.085), (41, -0.241), (42, -0.034), (43, 0.085), (44, 0.121), (45, 0.045), (46, -0.121), (47, 0.042), (48, -0.006), (49, 0.009)]
simIndex simValue paperId paperTitle
same-paper 1 0.94342923 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
Author: Annie Louis ; Ani Nenkova
Abstract: We introduce a model of coherence which captures the intentional discourse structure in text. Our work is based on the hypothesis that syntax provides a proxy for the communicative goal of a sentence and therefore the sequence of sentences in a coherent discourse should exhibit detectable structural patterns. Results show that our method has high discriminating power for separating out coherent and incoherent news articles reaching accuracies of up to 90%. We also show that our syntactic patterns are correlated with manual annotations of intentional structure for academic conference articles and can successfully predict the coherence of abstract, introduction and related work sections of these articles. 59.3 (100.0) Intro 50.3 (100.0) 1166 Rel wk 55.4 (100.0) >= 0.663.8 (67.2)50.8 (71.1)58.6 (75.9) >= 0.7 67.2 (32.0) 54.4 (38.6) 63.3 (52.8) >= 0.8 74.0 (10.0) 51.6 (22.0) 63.0 (25.7) >= 0.9 91.7 (2.0) 30.6 (5.0) 68.1 (7.2) Table 9: Accuracy (% examples) above each confidence level for the conference versus workshop task. These results are shown in Table 9. The proportion of examples under each setting is also indicated. When only examples above 0.6 confidence are examined, the classifier has a higher accuracy of63.8% for abstracts and covers close to 70% of the examples. Similarly, when a cutoff of 0.7 is applied to the confidence for predicting related work sections, we achieve 63.3% accuracy for 53% of examples. So we can consider that 30 to 47% of the examples in the two sections respectively are harder to tell apart. Interestingly however even high confidence predictions on introductions remain incorrect. These results show that our model can successfully distinguish the structure of articles beyond just clearly incoherent permutation examples. 7 Conclusion Our work is the first to develop an unsupervised model for intentional structure and to show that it has good accuracy for coherence prediction and also complements entity and lexical structure of discourse. This result raises interesting questions about how patterns captured by these different coherence metrics vary and how they can be combined usefully for predicting coherence. We plan to explore these ideas in future work. We also want to analyze genre differences to understand if the strength of these coherence dimensions varies with genre. Acknowledgements This work is partially supported by a Google research grant and NSF CAREER 0953445 award. References Regina Barzilay and Mirella Lapata. 2008. Modeling local coherence: An entity-based approach. Computa- tional Linguistics, 34(1): 1–34. Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of NAACL-HLT, pages 113–120. Xavier Carreras, Michael Collins, and Terry Koo. 2008. Tag, dynamic programming, and the perceptron for efficient, feature-rich parsing. In Proceedings of CoNLL, pages 9–16. Eugene Charniak and Mark Johnson. 2005. Coarse-tofine n-best parsing and maxent discriminative reranking. In Proceedings of ACL, pages 173–180. Jackie C.K. Cheung and Gerald Penn. 2010. Utilizing extra-sentential context for parsing. In Proceedings of EMNLP, pages 23–33. Christelle Cocco, Rapha ¨el Pittier, Fran ¸cois Bavaud, and Aris Xanthos. 2011. Segmentation and clustering of textual sequences: a typological approach. In Proceedings of RANLP, pages 427–433. Michael Collins and Terry Koo. 2005. Discriminative reranking for natural language parsing. 
Computational Linguistics, 3 1:25–70. Isaac G. Councill, C. Lee Giles, and Min-Yen Kan. 2008. Parscit: An open-source crf reference string parsing package. In Proceedings of LREC, pages 661–667. Micha Elsner and Eugene Charniak. 2008. Coreferenceinspired coherence modeling. In Proceedings of ACLHLT, Short Papers, pages 41–44. Micha Elsner and Eugene Charniak. 2011. Extending the entity grid with entity-specific features. In Proceedings of ACL-HLT, pages 125–129. Micha Elsner, Joseph Austerweil, and Eugene Charniak. 2007. A unified local and global model for discourse coherence. In Proceedings of NAACL-HLT, pages 436–443. Pascale Fung and Grace Ngai. 2006. One story, one flow: Hidden markov story models for multilingual multidocument summarization. ACM Transactions on Speech and Language Processing, 3(2): 1–16. Barbara J. Grosz and Candace L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 3(12): 175–204. Yufan Guo, Anna Korhonen, and Thierry Poibeau. 2011. A weakly-supervised approach to argumentative zoning of scientific documents. In Proceedings of EMNLP, pages 273–283. Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL-HLT, pages 586–594, June. 1167 Nikiforos Karamanis, Chris Mellish, Massimo Poesio, and Jon Oberlander. 2009. Evaluating centering for information ordering using corpora. Computational Linguistics, 35(1):29–46. Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of ACL, pages 423–430. Mirella Lapata and Regina Barzilay. 2005. Automatic evaluation of text coherence: Models and representations. In Proceedings of IJCAI. Mirella Lapata. 2003. Probabilistic text structuring: Experiments with sentence ordering. In Proceedings of ACL, pages 545–552. Maria Liakata and Larisa Soldatova. 2008. Guidelines for the annotation of general scientific concepts. JISC Project Report. Maria Liakata, Simone Teufel, Advaith Siddharthan, and Colin Batchelor. 2010. Corpora for the conceptualisation and zoning of scientific papers. In Proceedings of LREC. Ziheng Lin, Min-Yen Kan, and Hwee Tou Ng. 2009. Recognizing implicit discourse relations in the Penn Discourse Treebank. In Proceedings of EMNLP, pages 343–351. Ziheng Lin, Hwee Tou Ng, and Min-Yen Kan. 2011. Automatically evaluating text coherence using discourse relations. In Proceedings of ACL-HLT, pages 997– 1006. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1994. Building a large annotated corpus of english: The penn treebank. Computational Linguistics, 19(2):313–330. Emily Pitler and Ani Nenkova. 2008. Revisiting readability: A unified framework for predicting text quality. In Proceedings of EMNLP, pages 186–195. Dragomir R. Radev, Mark Thomas Joseph, Bryan Gibson, and Pradeep Muthukrishnan. 2009. A Bibliometric and Network Analysis ofthe field of Computational Linguistics. Journal of the American Society for Information Science and Technology. David Reitter, Johanna D. Moore, and Frank Keller. 2006. Priming of Syntactic Rules in Task-Oriented Dialogue and Spontaneous Conversation. In Proceedings of the 28th Annual Conference of the Cognitive Science Society, pages 685–690. Jeffrey C. Reynar and Adwait Ratnaparkhi. 1997. A maximum entropy approach to identifying sentence boundaries. In Proceedings of the fifth conference on Applied natural language processing, pages 16–19. Radu Soricut and Daniel Marcu. 2006. Discourse generation using utility-trained coherence models. 
In Proceedings of COLING-ACL, pages 803–810. John Swales. 1990. Genre analysis: English in academic and research settings, volume 11. Cambridge University Press. Simone Teufel and Marc Moens. 2000. What’s yours and what’s mine: determining intellectual attribution in scientific text. In Proceedings of EMNLP, pages 9– 17. Simone Teufel, Jean Carletta, and Marc Moens. 1999. An annotation scheme for discourse-level argumentation in research articles. In Proceedings of EACL, pages 110–1 17. Ying Zhao, George Karypis, and Usama Fayyad. 2005. Hierarchical clustering algorithms for document datasets. Data Mining and Knowledge Discovery, 10: 141–168. 1168
2 0.60717672 10 emnlp-2012-A Statistical Relational Learning Approach to Identifying Evidence Based Medicine Categories
Author: Mathias Verbeke ; Vincent Van Asch ; Roser Morante ; Paolo Frasconi ; Walter Daelemans ; Luc De Raedt
Abstract: Evidence-based medicine is an approach whereby clinical decisions are supported by the best available findings gained from scientific research. This requires efficient access to such evidence. To this end, abstracts in evidence-based medicine can be labeled using a set of predefined medical categories, the socalled PICO criteria. This paper presents an approach to automatically annotate sentences in medical abstracts with these labels. Since both structural and sequential information are important for this classification task, we use kLog, a new language for statistical relational learning with kernels. Our results show a clear improvement with respect to state-of-the-art systems.
3 0.51074719 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
Author: Shafiq Joty ; Giuseppe Carenini ; Raymond Ng
Abstract: We propose a complete probabilistic discriminative framework for performing sentencelevel discourse analysis. Our framework comprises a discourse segmenter, based on a binary classifier, and a discourse parser, which applies an optimal CKY-like parsing algorithm to probabilities inferred from a Dynamic Conditional Random Field. We show on two corpora that our approach outperforms the state-of-the-art, often by a wide margin.
4 0.48372397 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
Author: Michael Roth ; Anette Frank
Abstract: Generating coherent discourse is an important aspect in natural language generation. Our aim is to learn factors that constitute coherent discourse from data, with a focus on how to realize predicate-argument structures in a model that exceeds the sentence level. We present an important subtask for this overall goal, in which we align predicates across comparable texts, admitting partial argument structure correspondence. The contribution of this work is two-fold: We first construct a large corpus resource of comparable texts, including an evaluation set with manual predicate alignments. Secondly, we present a novel approach for aligning predicates across comparable texts using graph-based clustering with Mincuts. Our method significantly outperforms other alignment techniques when applied to this novel alignment task, by a margin of at least 6.5 percentage points in F1-score.
5 0.4728232 27 emnlp-2012-Characterizing Stylistic Elements in Syntactic Structure
Author: Song Feng ; Ritwik Banerjee ; Yejin Choi
Abstract: Much of the writing styles recognized in rhetorical and composition theories involve deep syntactic elements. However, most previous research for computational stylometric analysis has relied on shallow lexico-syntactic patterns. Some very recent work has shown that PCFG models can detect distributional difference in syntactic styles, but without offering much insights into exactly what constitute salient stylistic elements in sentence structure characterizing each authorship. In this paper, we present a comprehensive exploration of syntactic elements in writing styles, with particular emphasis on interpretable characterization of stylistic elements. We present analytic insights with respect to the authorship attribution task in two different domains. ,
6 0.4037042 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
7 0.38236722 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
8 0.38062376 121 emnlp-2012-Supervised Text-based Geolocation Using Language Models on an Adaptive Grid
9 0.34229693 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
10 0.33351746 19 emnlp-2012-An Entity-Topic Model for Entity Linking
11 0.29880828 9 emnlp-2012-A Sequence Labelling Approach to Quote Attribution
12 0.28988135 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
13 0.28827959 50 emnlp-2012-Extending Machine Translation Evaluation Metrics with Lexical Cohesion to Document Level
14 0.28324047 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
15 0.28287718 17 emnlp-2012-An "AI readability" Formula for French as a Foreign Language
16 0.27477872 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
17 0.27446547 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
18 0.26143166 81 emnlp-2012-Learning to Map into a Universal POS Tagset
19 0.25202537 80 emnlp-2012-Learning Verb Inference Rules from Linguistically-Motivated Evidence
20 0.24641392 107 emnlp-2012-Polarity Inducing Latent Semantic Analysis
topicId topicWeight
[(2, 0.013), (10, 0.307), (16, 0.04), (25, 0.024), (29, 0.018), (34, 0.071), (45, 0.017), (60, 0.093), (63, 0.062), (64, 0.025), (65, 0.02), (70, 0.018), (73, 0.016), (74, 0.073), (76, 0.05), (80, 0.022), (86, 0.027), (95, 0.044)]
simIndex simValue paperId paperTitle
1 0.76338583 66 emnlp-2012-Improving Transition-Based Dependency Parsing with Buffer Transitions
Author: Daniel Fernandez-Gonzalez ; Carlos Gomez-Rodriguez
Abstract: In this paper, we show that significant improvements in the accuracy of well-known transition-based parsers can be obtained, without sacrificing efficiency, by enriching the parsers with simple transitions that act on buffer nodes. First, we show how adding a specific transition to create either a left or right arc of length one between the first two buffer nodes produces improvements in the accuracy of Nivre’s arc-eager projective parser on a number of datasets from the CoNLL-X shared task. Then, we show that accuracy can also be improved by adding transitions involving the topmost stack node and the second buffer node (allowing a limited form of non-projectivity). None of these transitions has a negative impact on the computational complexity of the algorithm. Although the experiments in this paper use the arc-eager parser, the approach is generic enough to be applicable to any stackbased dependency parser.
same-paper 2 0.71256757 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
Author: Annie Louis ; Ani Nenkova
Abstract: We introduce a model of coherence which captures the intentional discourse structure in text. Our work is based on the hypothesis that syntax provides a proxy for the communicative goal of a sentence and therefore the sequence of sentences in a coherent discourse should exhibit detectable structural patterns. Results show that our method has high discriminating power for separating out coherent and incoherent news articles reaching accuracies of up to 90%. We also show that our syntactic patterns are correlated with manual annotations of intentional structure for academic conference articles and can successfully predict the coherence of abstract, introduction and related work sections of these articles. 59.3 (100.0) Intro 50.3 (100.0) 1166 Rel wk 55.4 (100.0) >= 0.663.8 (67.2)50.8 (71.1)58.6 (75.9) >= 0.7 67.2 (32.0) 54.4 (38.6) 63.3 (52.8) >= 0.8 74.0 (10.0) 51.6 (22.0) 63.0 (25.7) >= 0.9 91.7 (2.0) 30.6 (5.0) 68.1 (7.2) Table 9: Accuracy (% examples) above each confidence level for the conference versus workshop task. These results are shown in Table 9. The proportion of examples under each setting is also indicated. When only examples above 0.6 confidence are examined, the classifier has a higher accuracy of63.8% for abstracts and covers close to 70% of the examples. Similarly, when a cutoff of 0.7 is applied to the confidence for predicting related work sections, we achieve 63.3% accuracy for 53% of examples. So we can consider that 30 to 47% of the examples in the two sections respectively are harder to tell apart. Interestingly however even high confidence predictions on introductions remain incorrect. These results show that our model can successfully distinguish the structure of articles beyond just clearly incoherent permutation examples. 7 Conclusion Our work is the first to develop an unsupervised model for intentional structure and to show that it has good accuracy for coherence prediction and also complements entity and lexical structure of discourse. This result raises interesting questions about how patterns captured by these different coherence metrics vary and how they can be combined usefully for predicting coherence. We plan to explore these ideas in future work. We also want to analyze genre differences to understand if the strength of these coherence dimensions varies with genre. Acknowledgements This work is partially supported by a Google research grant and NSF CAREER 0953445 award. References Regina Barzilay and Mirella Lapata. 2008. Modeling local coherence: An entity-based approach. Computa- tional Linguistics, 34(1): 1–34. Regina Barzilay and Lillian Lee. 2004. Catching the drift: Probabilistic content models, with applications to generation and summarization. In Proceedings of NAACL-HLT, pages 113–120. Xavier Carreras, Michael Collins, and Terry Koo. 2008. Tag, dynamic programming, and the perceptron for efficient, feature-rich parsing. In Proceedings of CoNLL, pages 9–16. Eugene Charniak and Mark Johnson. 2005. Coarse-tofine n-best parsing and maxent discriminative reranking. In Proceedings of ACL, pages 173–180. Jackie C.K. Cheung and Gerald Penn. 2010. Utilizing extra-sentential context for parsing. In Proceedings of EMNLP, pages 23–33. Christelle Cocco, Rapha ¨el Pittier, Fran ¸cois Bavaud, and Aris Xanthos. 2011. Segmentation and clustering of textual sequences: a typological approach. In Proceedings of RANLP, pages 427–433. Michael Collins and Terry Koo. 2005. Discriminative reranking for natural language parsing. 
3 0.48178431 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
Author: Jayant Krishnamurthy ; Tom Mitchell
Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.
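The core idea of combining semantic supervision (known relation instances from a knowledge base) with syntactic supervision (dependency paths between entity mentions) can be illustrated with a minimal sketch. This is not the authors' parser: the toy knowledge-base tuples, dependency-path strings, and co-occurrence counting scheme below are all invented for illustration.

```python
# Toy sketch of weak supervision from two sources. Hypothetical data:
# neither the KB tuples nor the dependency paths come from the paper.
from collections import Counter

# Semantic supervision: relation instances from a (toy) knowledge base.
kb_instances = {("einstein", "ulm"): "person_born_in_city"}

# Syntactic supervision: dependency paths observed between entity mentions
# in an unlabeled, dependency-parsed corpus.
observed_paths = [
    (("einstein", "ulm"), "nsubjpass<-born->prep_in"),
    (("einstein", "princeton"), "nsubj<-died->prep_in"),
]

# Count how often each dependency path co-occurs with a known KB relation;
# paths aligned with known instances become weak evidence for that relation.
path_relation_counts = Counter()
for (e1, e2), path in observed_paths:
    relation = kb_instances.get((e1, e2))
    if relation is not None:
        path_relation_counts[(path, relation)] += 1

print(path_relation_counts.most_common())
```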
4 0.47177878 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
Author: Lizhen Qu ; Rainer Gemulla ; Gerhard Weikum
Abstract: We propose the weakly supervised Multi-Experts Model (MEM) for analyzing the semantic orientation of opinions expressed in natural language reviews. In contrast to most prior work, MEM predicts both opinion polarity and opinion strength at the level of individual sentences; such fine-grained analysis helps to understand better why users like or dislike the entity under review. A key challenge in this setting is that it is hard to obtain sentence-level training data for both polarity and strength. For this reason, MEM is weakly supervised: It starts with potentially noisy indicators obtained from coarse-grained training data (i.e., document-level ratings), a small set of diverse base predictors, and, if available, small amounts of fine-grained training data. We integrate these noisy indicators into a unified probabilistic framework using ideas from ensemble learning and graph-based semi-supervised learning. Our experiments indicate that MEM outperforms state-of-the-art methods by a significant margin.
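The general setup of fusing noisy sentence-level indicators with a coarse document-level signal can be sketched as a simple weighted vote. This is only an illustration of the motivation, not MEM itself; the lexicon, the two base predictors, and the weights are made up for the example.

```python
# Minimal sketch, assuming two hypothetical base predictors and fixed weights.
def lexicon_predictor(sentence):
    # A tiny, hypothetical polarity lexicon as one noisy base predictor.
    positive = {"great", "good", "love"}
    negative = {"bad", "poor", "hate"}
    words = sentence.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return (score > 0) - (score < 0)  # -1, 0, or +1

def document_rating_prior(doc_rating):
    # Coarse-grained signal: the document-level star rating as a prior.
    return 1 if doc_rating >= 4 else -1

def combine(sentence, doc_rating, w_lex=0.7, w_doc=0.3):
    # Weighted vote over the noisy indicators; MEM instead integrates such
    # signals in a probabilistic framework with graph-based propagation.
    vote = w_lex * lexicon_predictor(sentence) + w_doc * document_rating_prior(doc_rating)
    return "positive" if vote > 0 else "negative"

# A positive sentence inside a negatively rated review: the sentence-level
# evidence can override the document-level prior.
print(combine("the battery life is great", doc_rating=2))
```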
5 0.46493831 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
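Intuition (i), that sentence-boundary word distributions resemble constituent edges, can be made concrete with a toy sketch. This is not the authors' Dependency and Boundary models; the three-sentence corpus and the boundary counting are hypothetical.

```python
# Toy sketch: collect distributions of sentence-initial and sentence-final
# words as constituent-edge cues. Corpus is invented for illustration.
from collections import Counter

sentences = [
    ["The", "dog", "barked", "."],
    ["A", "cat", "slept", "."],
    ["The", "rain", "stopped", "."],
]

# Determiners dominating the sentence-initial slot suggest left constituent
# edges; verbs at the right edge (before final punctuation) suggest the other.
initial = Counter(s[0].lower() for s in sentences)
final = Counter(s[-2].lower() for s in sentences)  # skip final punctuation

print("left-edge cues:", initial.most_common())
print("right-edge cues:", final.most_common())
```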
6 0.46097127 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
7 0.46087891 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
8 0.45995727 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
9 0.45988956 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
10 0.45569408 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
11 0.45408684 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
12 0.45337728 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
13 0.45273697 130 emnlp-2012-Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
14 0.45194376 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
15 0.45171478 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
16 0.45107809 5 emnlp-2012-A Discriminative Model for Query Spelling Correction with Latent Structural SVM
17 0.45052606 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
18 0.45040128 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
19 0.45004848 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
20 0.44946155 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries