acl acl2013 acl2013-172 knowledge-graph by maker-knowledge-mining

172 acl-2013-Graph-based Local Coherence Modeling


Source: pdf

Author: Camille Guinaudeau ; Michael Strube

Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We propose a computationally efficient graph-based approach for local coherence modeling. [sent-4, score-0.561]

2 We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. [sent-5, score-0.641]

3 The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems. [sent-6, score-0.515]

4 information about which entities occur in which sentence and how the entities are distributed in the text. [sent-9, score-0.259]

5 Centering theory (Grosz et al., 1995) uses a ranking of discourse entities realized in particular sentences and computes transitions between adjacent sentences to provide insight into the felicity of texts. [sent-12, score-0.371]

6 Centering models local coherence rather generally and has been applied to the generation of referring expressions (Kibble and Power, 2004), to resolve pronouns (Brennan et al. [sent-13, score-0.533]

7 This led them to propose a local coherence model relying on a more parsimonious representation, the entity grid model. [sent-19, score-1.009]

8 The entity grid is a two-dimensional array where the rows represent sentences and the columns discourse entities. [sent-20, score-0.572]

9 From this grid Barzilay and Lapata (2008) derive probabilities of transitions between adjacent sentences which are used as features for machine learning algorithms. [sent-21, score-0.431]

10 They evaluate this approach successfully on sentence ordering, summary coherence rating, and readability assessment. [sent-22, score-0.572]

11 In order to overcome these problems we propose to represent entities in a graph and then model local coherence by applying centrality measures to the nodes in the graph (Section 3). [sent-29, score-0.861]

12 We claim that a graph is a more powerful representation for local coherence than the entity grid (Barzilay and Lapata, 2008) which is restricted to transitions between adjacent sentences. [sent-30, score-1.197]

13 work on word sense disambiguation by Navigli and Lapata (2010); for an overview of graph-based methods in NLP see Mihalcea and Radev (2011)) we model local coherence by relying only on centrality measures applied to the nodes in the graph. [sent-34, score-0.6]

14 We apply our graph-based model to the three tasks handled by Barzilay and Lapata (2008) to show that it provides the same flexibility over disparate tasks as the entity grid model: sentence ordering (Section 4. [sent-35, score-0.625]

15 (Table 1: Excerpt of a manual summary M from DUC2003.) In the experiments sections, we discuss the impact of genre and stylistic properties of documents on the local coherence computation. [sent-45, score-0.658]

16 From this we conclude that a graph is an alternative to the entity grid model: it is computationally more tractable for modeling local coherence and does not suffer from data sparsity problems (Section 5). [sent-47, score-1.122]

17 2 The Entity Grid Model Barzilay and Lapata (2005; 2008) introduced the entity grid, a method for local coherence modeling that captures the distribution of discourse entities across sentences in a text. [sent-48, score-0.928]

18 An entity grid is a two-dimensional array, where rows correspond to sentences and columns to discourse entities. [sent-49, score-0.572]

19 For each discourse entity e_j and each sentence s_i in the text, the corresponding grid cell c_ij contains information about the presence or absence of the entity in the sentence. [sent-50, score-0.87]

20 If the entity does not appear in the sentence, the corresponding grid cell contains an absence marker “−”. [sent-51, score-0.477]

21 If the entity is present in the sentence, the cell contains a representation of the entity's syntactic role: "S" if the entity is a subject, "O" if it is an object and "X" for all other syntactic roles (cf. [sent-52, score-0.432]
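
To make the grid concrete, here is a minimal Python sketch of its construction. It assumes each sentence is given as a mapping from discourse entities to their syntactic roles; the input format, function name and example entities are illustrative assumptions, not taken from the paper.

```python
def build_entity_grid(sentences):
    """Build an entity grid: grid[i][j] is 'S', 'O', 'X' or '-'.

    `sentences` is a list of dicts mapping entity -> syntactic role,
    an assumed input format for this sketch.
    """
    entities = sorted({e for sent in sentences for e in sent})
    grid = [[sent.get(e, "-") for e in entities] for sent in sentences]
    return grid, entities

# Tiny example with two entities over three sentences (hypothetical data).
sents = [{"department": "S", "trial": "O"},
         {"trial": "S"},
         {"department": "X"}]
grid, entities = build_entity_grid(sents)
for row in grid:
    print(row)  # ['S', 'O'], then ['-', 'S'], then ['X', '-']
```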

22 Barzilay and Lapata (2008) capture local coherence by means of local entity transitions, i. [sent-55, score-0.832]

23 The coherence of a sentence in relation to its local context is determined by the local entity transitions of the entities present or absent in the sentence. (Footnote 1: for complexity reasons, Barzilay and Lapata consider only transitions between at most three sentences.) [sent-64, score-0.658] [sent-65, score-0.481]

25 The entity grid approach has already been applied to many applications relying on local coherence estimation: summary rating (Barzilay and Lapata, 2005), essay scoring (Burstein et al. [sent-68, score-1.095]

26 Soricut and Marcu (2006) show that the entity grid model is a critical component in their sentence ordering model for discourse generation. [sent-71, score-0.683]

27 Barzilay and Lapata (2008) combine the entity grid with readability-related features to discriminate documents between easy- and difficult-to-read categories. [sent-72, score-0.52]

28 Lin et al. (2011) use discourse relations to transform the entity grid representation into a discourse role matrix that is used to generate feature vectors for machine learning algorithms similarly to Barzilay and Lapata (2008). [sent-74, score-0.673]

29 Several studies propose to extend the entity grid model using different strategies for entity selection. [sent-75, score-0.641]

30 Filippova and Strube (2007) aim to improve the entity grid model performance by grouping entities by means of semantic relatedness. [sent-76, score-0.589]

31 In their studies, Elsner and Charniak extend the number and type of entities selected and consider that each entity has to be dealt with according to its information status (Elsner et al. [sent-77, score-0.278]

32 (Section 3: Method) Our model is based on the insight that the entity grid (Barzilay and Lapata, 2008) corresponds to the incidence matrix of a bipartite graph representing the text (see Newman (2010) for more details on graph representation). [sent-83, score-0.781]

33 A fundamental assumption underlying our model is that this bipartite graph contains the entity transition information needed for local coherence computation, rendering feature vectors and learning phase unnecessary. [sent-84, score-0.873]

34 The bipartite graph G = (V_s, V_e, L, w) is defined by two independent sets of nodes that correspond to the set of sentences V_s and the set of entities V_e of the text, and a set of edges L associated with weights w. [sent-85, score-0.378]

35 An edge between a sentence node s_i and an entity node e_j is created in the bipartite graph if the corresponding cell c_ij in the entity grid is not equal to "−". [sent-86, score-0.972]
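
A hedged sketch of this construction, continuing the running example: an edge (s_i, e_j) is added exactly when c_ij is not "−". The syntactic edge weights used here (3 for subjects, 2 for objects, 1 for other roles) are stated as an assumption of this sketch, mirroring the paper's use of syntactic roles as weights.

```python
ROLE_WEIGHT = {"S": 3, "O": 2, "X": 1}  # assumed S/O/X weighting

def grid_to_bipartite(grid, entities):
    """Edges of the bipartite graph as (sentence_index, entity) -> weight."""
    return {(i, entities[j]): ROLE_WEIGHT[cell]
            for i, row in enumerate(grid)
            for j, cell in enumerate(row)
            if cell != "-"}
```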

36 In contrast to Barzilay and Lapata's entity grid that contains information about absent entities, our graph-based representation only contains "positive" information. [sent-89, score-0.487]

37 Figure 1(a) shows an example of the bipartite graph that corresponds to the grid in Table 2. [sent-90, score-0.444]

38 The incidence matrix of this graph (Figure 1(d)) is very similar to the entity grid. [sent-91, score-0.316]

39 By modeling entity transitions, Barzilay and Lapata rely on links that exist between sentences to model local coherence. [sent-93, score-0.358]

40 In the same spirit, we apply different kinds of one-mode projections to the sentence node set V_s of the bipartite graph to represent the connections that exist between potentially non-adjacent sentences in the graph. [sent-94, score-0.301]
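
The three projections used in the experiments (written PU, PW and PAcc below) can be sketched as follows, reusing the grid and ROLE_WEIGHT from the snippets above: PU links two sentences that share at least one entity, PW weights the link by the number of shared entities, and PAcc accumulates the products of the syntactic weights of the shared entities; dividing by the sentence distance gives the Dist variants. The function name and data layout are assumptions of this sketch.

```python
def project(grid, entities, mode="PU", dist=False):
    """One-mode projection onto sentence nodes: (i, k) -> edge weight, i < k."""
    n, weights = len(grid), {}
    for i in range(n):
        for k in range(i + 1, n):
            shared = [j for j in range(len(entities))
                      if grid[i][j] != "-" and grid[k][j] != "-"]
            if not shared:
                continue
            if mode == "PU":      # unweighted: sentences share >= 1 entity
                w = 1.0
            elif mode == "PW":    # weight = number of shared entities
                w = float(len(shared))
            else:                 # "PAcc": accumulate syntactic weights
                w = float(sum(ROLE_WEIGHT[grid[i][j]] * ROLE_WEIGHT[grid[k][j]]
                              for j in shared))
            weights[(i, k)] = w / (k - i) if dist else w
    return weights
```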

41 From this graph-based representation, the local coherence of a text T can be measured by computing the average outdegree of a projection graph P. [sent-107, score-0.709]

42 Second, compared to other centrality measures, the computational complexity of the average outdegree is low (O(N(N-1)/2) for a document composed of N sentences), keeping the local coherence estimation feasible on large documents and on large corpora. [sent-110, score-0.838]

43 Formally, the local coherence of a text T is equal to: LocalCoherence(T) = AvgOutDegree(P) = (1/N) \sum_{i=1}^{N} OutDegree(s_i). [sent-111, score-0.533]
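
The formula maps directly onto the projection dictionary from the previous sketch; the outdegree of a sentence node is the sum of the weights of its outgoing edges.

```python
def local_coherence(projection, n_sentences):
    """LocalCoherence(T) = AvgOutDegree(P) = (1/N) * sum_i OutDegree(s_i)."""
    outdegree = [0.0] * n_sentences
    for (i, _k), w in projection.items():
        outdegree[i] += w
    return sum(outdegree) / n_sentences
```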

44 (Section 4: Experiments) We compare our model with the entity grid approach and evaluate the influence of the different weighting schemes used in the projection graphs, either PW or PAcc, where weights are potentially decreased by distance information Dist. [sent-115, score-0.626]

45 Our baseline corresponds to local coherence computation based on the unweighted projection graph PU. [sent-116, score-0.664]

46 For graph construction, all nouns in a document are considered as discourse entities, even those which do not head NPs, as this is beneficial for the entity grid model as described in Elsner and Charniak (2011). [sent-117, score-0.702]

47 We also propose to use a coreference resolution system and consider coreferent entities to be the same discourse entity. [sent-118, score-0.398]

48 As the coreference resolution system is trained on well-formed textual documents and expects a correct sentence ordering, we use in all our experiments only features that do not rely on sentence order (e. [sent-121, score-0.337]

49 Moreover, as the local coherence computation is a linear combination of the syntactic weights, the function is smooth and no large variations of the local coherence values are observed for small changes of the weight values. [sent-129, score-1.124]

50 We evaluate the ability of our graph-based model to estimate the local coherence of a textual document with three different experiments. [sent-131, score-0.627]

51 Then, as the first task uses “artificial” documents, we also work on two other tasks that involve “real” documents: summary coherence rating (Section 4. [sent-134, score-0.527]

52 For this, our system associates local coherence values with the original document and its permutation, the output of our system being considered correct if the score for the original document is higher than the score of its permutation. [sent-145, score-0.817]
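
Putting the illustrative helpers above together, the discrimination decision reduces to a single comparison. The random shuffle shown here is only a stand-in for the precomputed permutations used in the experiments.

```python
import random

def coherence_of(sentences, mode="PAcc", dist=True):
    grid, entities = build_entity_grid(sentences)
    return local_coherence(project(grid, entities, mode, dist), len(grid))

def discriminate(original, seed=0):
    """Correct when the original document outscores its permutation."""
    permuted = list(original)
    random.Random(seed).shuffle(permuted)
    return coherence_of(original) > coherence_of(permuted)
```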

53 For this, each sentence is removed in turn and a local coherence score is computed for every possible reinsertion position. [sent-147, score-0.591]

54 The system output is considered as correct if the document associated with the highest local coherence score is the one in which the sentence is reinserted in the correct position. [sent-148, score-0.715]
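
The insertion variant can be sketched the same way with the assumed helpers above: remove one sentence, score every reinsertion point, and check whether the best-scoring position is the original one.

```python
def insertion_correct(sentences, removed_index):
    """Correct if the argmax reinsertion position equals the original slot."""
    sent = sentences[removed_index]
    rest = sentences[:removed_index] + sentences[removed_index + 1:]
    scores = [coherence_of(rest[:pos] + [sent] + rest[pos:])
              for pos in range(len(rest) + 1)]
    return max(range(len(scores)), key=scores.__getitem__) == removed_index
```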

55 Results for the entity grid models described by Barzilay and Lapata (2008) and Elsner and Charniak (2011) are obtained by using Micha Elsner's reimplementation in the Brown Coherence Toolkit. [sent-173, score-0.455]

56 Distance, however, can detect changes in the distribution of entities within the document, as the space between entities is significantly modified when sentence order is permuted. [sent-180, score-0.332]

57 (Footnote: org/melsner/browncoherence; B&L is Elsner's "baseline entity grid" (command line option '-n'), E&C is Elsner's "extended entity grid" ('-f').) Using coreference resolution improves the performance of the system when distance information is used alone in the system (Table 3). [sent-187, score-0.607]

58 We can see that the accuracy value obtained with our system is higher than the one provided with the entity grid model. [sent-200, score-0.524]

59 However, the entity grid model reaches a significantly higher insertion score. [sent-201, score-0.558]

60 This means that, even if it makes more mistakes than our system, the position chosen by the entity grid model is usually closer to the correct position. [sent-202, score-0.476]

61 When the coreference resolution system is used, the best accuracy value decreases while the insertion score increases from 0. [sent-205, score-0.318]

62 (Section 4.2: Summary Coherence Rating) To reconfirm the hypothesis that our model can estimate the local coherence of a textual document, we perform a second experiment, summary coherence rating. [sent-210, score-1.013]

63 As the objective of our model is to estimate the coherence of a summary, we prefer this dataset to other summarization evaluation task corpora, as these account for other dimensions of the summaries: content selection, fluency, etc. [sent-212, score-0.42]

64 (Section 4.2.1: Experimental Settings) For the summary coherence rating experiment, pairs to be ordered are composed of summaries extracted from the Document Understanding Conference (DUC 2003). [sent-216, score-0.619]

65 Summaries, provided either by humans or by automatic systems, were judged by seven human annotators and associated with a coherence score (for more details on this score see Barzilay and Lapata (2008)). [sent-217, score-0.478]

66 80 pairs were then created, each composed of two summaries of the same document, where the score of one summary is significantly higher than the score of the other. [sent-218, score-0.317]

67 This table also shows that, contrary to the sentence ordering task, accounting for the distance between two sentences (Dist) tends to decrease the results. [sent-229, score-0.294]

68 As adding distance information decreases the value of our local coherence score, our graph-based model gives better results without it. [sent-231, score-0.603]

69 This means that, in these documents, two sentences tend to share a larger number of entities and therefore have a higher local coherence score when the PW projection graph is used. [sent-235, score-0.861]

70 Finally, Table 5 also shows that using a coreference resolution system for document representation does not improve the performance of our system. [sent-237, score-0.311]

71 (Section 4.3: Readability Assessment) Barzilay and Lapata (2008) argue that grid models are domain and style dependent. [sent-243, score-0.312]

72 Therefore they proposed a readability assessment task to test if the entity grid model can be used for style classification. [sent-244, score-0.637]

73 In order to estimate the complexity of a document, our model computes the local coherence score for each article in the two categories. [sent-256, score-0.602]

74 The article associated with the higher score is considered to be the more readable as it is more coherent, needing less interpretation from the reader than a document associated with a lower local coherence score. [sent-257, score-0.71]

75 Values presented in the following section correspond to accuracy (the system is correct if it assigns the higher local coherence score to the most "easy to read" document) and F-measure. [sent-258, score-0.601]

76 (Section 4.3.2: Results; S&O: Schwarm & Ostendorf (2005)) In order to compare our results with those reported by Barzilay and Lapata (2008), entities used for the graph-based representation are discourse entities that head NPs. [sent-262, score-0.337]

77 This increases the local coherence value of difficult documents more than the value of "easy to read" documents, which contain fewer entities. [sent-270, score-0.598]

78 The poor performance of our system in this case can be explained by the fact that the coreference resolution system merges more entities in Encyclopedia Britannica documents than in Britannica Elementary ones. [sent-272, score-0.406]

79 Therefore, the number of entities "shared" by two sentences increases more in the Encyclopedia Britannica corpus, while the distance between two occurrences of one entity decreases more significantly. [sent-273, score-0.365]

80 For these reasons, the coherence scores associated with “difficult to read” documents tend to be higher when coreference resolution is performed on our data, which reduces the performance of our system. [sent-274, score-0.698]

81 Compared to the results provided by Barzilay and Lapata (2008) with the entity grid model alone, our representation outperforms their model significantly. [sent-276, score-0.529]

82 We believe that this difference is caused by how syntactic information is introduced in the document representation and by the fact that our system can deal with entities that appear throughout the whole document, while the entity grid model only looks at entities within a three-sentence window. [sent-277, score-0.975]

83 Our model, which captures exclusively local coherence, is almost on par with the results reported for Schwarm & Ostendorf's (2005) system, which relies on a wide range of lexical, syntactic and semantic features. [sent-278, score-0.611]

84 Only when Barzilay and Lapata (2008) combine the entity grid with Schwarm & Ostendorf's features do they reach performance considerably better than ours. [sent-279, score-0.455]

85 As before, coreference resolution tends to lower the results; therefore only values obtained without coreference resolution are reported in the table. [sent-285, score-0.311]

86 As the length of the two kinds of articles is similar, the distance between entities in Britannica Elementary documents is larger. [sent-291, score-0.273]

87 As a result, accounting for distance information lowers the local coherence values for the more coherent document, which reduces the performance of our model. [sent-292, score-0.707]

88 (Section 5: Conclusions) In this paper, we proposed an unsupervised and computationally efficient graph-based local coherence model. [sent-299, score-0.561]

89 766 for sentence ordering, summary coherence rating and readability assessment tasks respectively (PAcc, Dist)). [sent-305, score-0.699]

90 Moreover, our model can be optimized and obtains results comparable with entity grid based methods when proper settings are used for each task. [sent-306, score-0.476]

91 Our model has two main advantages over the entity grid model. [sent-307, score-0.476]

92 First, as the graph used for document representation contains information about entity transitions, our model does not need a learning phase. [sent-308, score-0.365]

93 Indeed, our model only uses entities as indications of sentence connection although it has been shown that distinguishing important from unimportant entities, according to their named-entity category, has a positive impact on local coherence computation (Elsner and Charniak, 2011). [sent-312, score-0.7]

94 This can be easily done by adding edges in the projection graphs when sentences contain entities related from a discourse point of view. [sent-315, score-0.321]

95 Lin et al.'s approach, in contrast, suffers from complexity and data sparsity problems similar to the entity grid model. [sent-316, score-0.51]

96 Finally, these promising results on local coherence modeling make us believe that our graph-based representation can be used without much modification for other tasks, e. [sent-317, score-0.589]

97 Using entity-based features to model coherence in student essays. [sent-348, score-0.488]

98 Extending the entity-grid coherence model to semantically related entities. [sent-378, score-0.42]

99 Centering: A framework for modeling the local coherence of discourse. [sent-384, score-0.533]

100 Evaluation of text coherence for electronic essay scoring systems. [sent-419, score-0.399]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('britannica', 0.526), ('coherence', 0.399), ('grid', 0.29), ('barzilay', 0.252), ('elsner', 0.199), ('lapata', 0.198), ('entity', 0.165), ('local', 0.134), ('pacc', 0.12), ('entities', 0.113), ('centering', 0.105), ('coreference', 0.104), ('encyclopedia', 0.101), ('charniak', 0.086), ('bipartite', 0.08), ('resolution', 0.08), ('readability', 0.08), ('discourse', 0.079), ('elementary', 0.077), ('graph', 0.074), ('ordering', 0.074), ('document', 0.073), ('transitions', 0.069), ('student', 0.068), ('schwarm', 0.066), ('documents', 0.065), ('insertion', 0.061), ('pw', 0.06), ('summaries', 0.06), ('summary', 0.06), ('assessment', 0.059), ('discrimination', 0.059), ('projection', 0.057), ('july', 0.055), ('accounting', 0.054), ('composed', 0.053), ('june', 0.052), ('distance', 0.049), ('incidence', 0.049), ('coherent', 0.048), ('si', 0.047), ('rating', 0.047), ('centrality', 0.046), ('articles', 0.046), ('outdegree', 0.045), ('weights', 0.044), ('micha', 0.044), ('projections', 0.042), ('accounted', 0.038), ('ostendorf', 0.038), ('sentences', 0.038), ('shared', 0.037), ('syntactic', 0.035), ('cij', 0.035), ('ej', 0.034), ('graphs', 0.034), ('adjacent', 0.034), ('sentence', 0.033), ('permutations', 0.033), ('read', 0.032), ('representation', 0.032), ('permutation', 0.032), ('sparsity', 0.032), ('bonferroni', 0.03), ('kibble', 0.03), ('secular', 0.03), ('associated', 0.029), ('matrix', 0.028), ('computationally', 0.028), ('mirella', 0.027), ('edge', 0.027), ('brennan', 0.027), ('eik', 0.027), ('karamanis', 0.027), ('martschat', 0.027), ('mcintyre', 0.027), ('pu', 0.026), ('heidelberg', 0.026), ('accuracy', 0.026), ('regina', 0.026), ('score', 0.025), ('accidents', 0.025), ('earthquakes', 0.025), ('filippova', 0.025), ('grosz', 0.025), ('tends', 0.024), ('graphbased', 0.024), ('poesio', 0.023), ('burstein', 0.023), ('values', 0.023), ('complexity', 0.023), ('cell', 0.022), ('equals', 0.022), ('system', 0.022), ('decrease', 0.022), ('leaders', 0.022), ('style', 0.022), ('model', 0.021), ('higher', 0.021), ('tasks', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 172 acl-2013-Graph-based Local Coherence Modeling

Author: Camille Guinaudeau ; Michael Strube

Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.

2 0.17794625 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?

Author: Romain Deveaud ; Eric SanJuan ; Patrice Bellot

Abstract: The current topic modeling approaches for Information Retrieval do not allow to explicitly model query-oriented latent topics. More, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. We perform a first experimental evaluation using two major TREC test collections. Results show that retrieval perfor- mances tend to be better when using topics with higher semantic coherence.

3 0.14457938 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

Author: Sebastian Martschat

Abstract: We present an unsupervised model for coreference resolution that casts the problem as a clustering task in a directed labeled weighted multigraph. The model outperforms most systems participating in the English track of the CoNLL’ 12 shared task.

4 0.14432311 225 acl-2013-Learning to Order Natural Language Texts

Author: Jiwei Tan ; Xiaojun Wan ; Jianguo Xiao

Abstract: Ordering texts is an important task for many NLP applications. Most previous works on summary sentence ordering rely on the contextual information (e.g. adjacent sentences) of each sentence in the source document. In this paper, we investigate a more challenging task of ordering a set of unordered sentences without any contextual information. We introduce a set of features to characterize the order and coherence of natural language texts, and use the learning to rank technique to determine the order of any two sentences. We also propose to use the genetic algorithm to determine the total order of all sentences. Evaluation results on a news corpus show the effectiveness of our proposed method. 1

5 0.11079607 139 acl-2013-Entity Linking for Tweets

Author: Xiaohua Liu ; Yitong Li ; Haocheng Wu ; Ming Zhou ; Furu Wei ; Yi Lu

Abstract: We study the task of entity linking for tweets, which tries to associate each mention in a tweet with a knowledge base entry. Two main challenges of this task are the dearth of information in a single tweet and the rich entity mention variations. To address these challenges, we propose a collective inference method that simultaneously resolves a set of mentions. Particularly, our model integrates three kinds of similarities, i.e., mention-entry similarity, entry-entry similarity, and mention-mention similarity, to enrich the context for entity linking, and to address irregular mentions that are not covered by the entity-variation dictionary. We evaluate our method on a publicly available data set and demonstrate the effectiveness of our method.

6 0.09884771 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features

7 0.094339602 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

8 0.09140256 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

9 0.088123344 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction

10 0.086550102 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

11 0.085547075 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities

12 0.078781515 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

13 0.077604227 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

14 0.077196151 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

15 0.073342882 179 acl-2013-HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text

16 0.070805185 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models

17 0.070736721 71 acl-2013-Bootstrapping Entity Translation on Weakly Comparable Corpora

18 0.070584126 219 acl-2013-Learning Entity Representation for Entity Disambiguation

19 0.067461573 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis

20 0.067379221 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.177), (1, 0.06), (2, -0.014), (3, -0.089), (4, 0.038), (5, 0.092), (6, 0.043), (7, 0.039), (8, -0.091), (9, -0.025), (10, 0.06), (11, -0.048), (12, -0.077), (13, 0.056), (14, -0.079), (15, 0.088), (16, -0.034), (17, 0.048), (18, -0.135), (19, 0.012), (20, -0.059), (21, 0.06), (22, -0.015), (23, -0.056), (24, -0.035), (25, 0.001), (26, 0.077), (27, 0.032), (28, -0.006), (29, 0.031), (30, -0.045), (31, -0.107), (32, -0.002), (33, -0.047), (34, -0.023), (35, -0.117), (36, -0.022), (37, 0.021), (38, -0.005), (39, -0.004), (40, 0.021), (41, -0.045), (42, -0.035), (43, 0.059), (44, -0.035), (45, 0.037), (46, -0.003), (47, 0.054), (48, -0.031), (49, 0.034)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94417316 172 acl-2013-Graph-based Local Coherence Modeling

Author: Camille Guinaudeau ; Michael Strube

Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.

2 0.6923331 225 acl-2013-Learning to Order Natural Language Texts

Author: Jiwei Tan ; Xiaojun Wan ; Jianguo Xiao

Abstract: Ordering texts is an important task for many NLP applications. Most previous works on summary sentence ordering rely on the contextual information (e.g. adjacent sentences) of each sentence in the source document. In this paper, we investigate a more challenging task of ordering a set of unordered sentences without any contextual information. We introduce a set of features to characterize the order and coherence of natural language texts, and use the learning to rank technique to determine the order of any two sentences. We also propose to use the genetic algorithm to determine the total order of all sentences. Evaluation results on a news corpus show the effectiveness of our proposed method. 1

3 0.67543381 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

Author: Emmanuel Lassalle ; Pascal Denis

Abstract: This paper proposes a new method for significantly improving the performance of pairwise coreference models. Given a set of indicators, our method learns how to best separate types of mention pairs into equivalence classes for which we construct distinct classification models. In effect, our approach finds an optimal feature space (derived from a base feature set and indicator set) for discriminating coreferential mention pairs. Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators. Our exper- iments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features. Our best system obtains a competitive 67.2 of average F1 over MUC, and CEAF which, despite its simplicity, places it above the mean score of other systems on these datasets. B3,

4 0.64672589 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

Author: Sebastian Martschat

Abstract: We present an unsupervised model for coreference resolution that casts the problem as a clustering task in a directed labeled weighted multigraph. The model outperforms most systems participating in the English track of the CoNLL’ 12 shared task.

5 0.60582983 178 acl-2013-HEADY: News headline abstraction through event pattern clustering

Author: Enrique Alfonseca ; Daniele Pighin ; Guillermo Garrido

Abstract: This paper presents HEADY: a novel, abstractive approach for headline generation from news collections. From a web-scale corpus of English news, we mine syntactic patterns that a Noisy-OR model generalizes into event descriptions. At inference time, we query the model with the patterns observed in an unseen news collection, identify the event that better captures the gist of the collection and retrieve the most appropriate pattern to generate a headline. HEADY improves over a state-of-theart open-domain title abstraction method, bridging half of the gap that separates it from extractive methods using humangenerated titles in manual evaluations, and performs comparably to human-generated headlines as evaluated with ROUGE.

6 0.6017006 139 acl-2013-Entity Linking for Tweets

7 0.60163033 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision

8 0.57870597 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features

9 0.56770182 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution

10 0.56023031 219 acl-2013-Learning Entity Representation for Entity Disambiguation

11 0.53816915 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

12 0.53377748 179 acl-2013-HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text

13 0.52853644 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

14 0.52723521 177 acl-2013-GuiTAR-based Pronominal Anaphora Resolution in Bengali

15 0.51245123 182 acl-2013-High-quality Training Data Selection using Latent Topics for Graph-based Semi-supervised Learning

16 0.50934273 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

17 0.49733454 280 acl-2013-Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation

18 0.49641982 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization

19 0.49403405 21 acl-2013-A Statistical NLG Framework for Aggregated Planning and Realization

20 0.49295431 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.04), (6, 0.051), (11, 0.063), (24, 0.052), (26, 0.041), (35, 0.137), (42, 0.07), (48, 0.055), (64, 0.014), (70, 0.053), (71, 0.024), (72, 0.131), (88, 0.061), (90, 0.035), (95, 0.061)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93256378 30 acl-2013-A computational approach to politeness with application to social factors

Author: Cristian Danescu-Niculescu-Mizil ; Moritz Sudhof ; Dan Jurafsky ; Jure Leskovec ; Christopher Potts

Abstract: We propose a computational framework for identifying linguistic aspects of politeness. Our starting point is a new corpus of requests annotated for politeness, which we use to evaluate aspects of politeness theory and to uncover new interactions between politeness markers and context. These findings guide our construction of a classifier with domain-independent lexical and syntactic features operationalizing key components of politeness theory, such as indirection, deference, impersonalization and modality. Our classifier achieves close to human performance and is effective across domains. We use our framework to study the relationship between po- liteness and social power, showing that polite Wikipedia editors are more likely to achieve high status through elections, but, once elevated, they become less polite. We see a similar negative correlation between politeness and power on Stack Exchange, where users at the top of the reputation scale are less polite than those at the bottom. Finally, we apply our classifier to a preliminary analysis of politeness variation by gender and community.

2 0.91718227 332 acl-2013-Subtree Extractive Summarization via Submodular Maximization

Author: Hajime Morita ; Ryohei Sasano ; Hiroya Takamura ; Manabu Okumura

Abstract: This study proposes a text summarization model that simultaneously performs sentence extraction and compression. We translate the text summarization task into a problem of extracting a set of dependency subtrees in the document cluster. We also encode obligatory case constraints as must-link dependency constraints in order to guarantee the readability of the generated summary. In order to handle the subtree extraction problem, we investigate a new class of submodular maximization problem, and a new algorithm that has the approximation ratio 21 (1 − e−1). Our experiments with the NTC(1IR − −A eCLIA test collections show that our approach outperforms a state-of-the-art algorithm.

3 0.89963591 165 acl-2013-General binarization for parsing and translation

Author: Matthias Buchse ; Alexander Koller ; Heiko Vogler

Abstract: Binarization ofgrammars is crucial for improving the complexity and performance of parsing and translation. We present a versatile binarization algorithm that can be tailored to a number of grammar formalisms by simply varying a formal parameter. We apply our algorithm to binarizing tree-to-string transducers used in syntax-based machine translation.

same-paper 4 0.8845641 172 acl-2013-Graph-based Local Coherence Modeling

Author: Camille Guinaudeau ; Michael Strube

Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.

5 0.81868583 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval

Author: K. Tamsin Maxwell ; Jon Oberlander ; W. Bruce Croft

Abstract: Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections.

6 0.81636935 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

7 0.81397414 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics

8 0.81263798 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference

9 0.80897176 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors

10 0.80851203 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

11 0.80590707 231 acl-2013-Linggle: a Web-scale Linguistic Search Engine for Words in Context

12 0.80474561 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

13 0.80425984 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

14 0.8039453 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

15 0.80101609 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models

16 0.80095363 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

17 0.800951 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension

18 0.80040085 238 acl-2013-Measuring semantic content in distributional vectors

19 0.79944062 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval

20 0.79766178 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model