acl acl2013 acl2013-18 knowledge-graph by maker-knowledge-mining

18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization


Source: pdf

Author: Lu Wang ; Hema Raghavan ; Vittorio Castelli ; Radu Florian ; Claire Cardie

Abstract: We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task.

Reference: text


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. [sent-7, score-0.741]

2 We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. [sent-8, score-0.609]

3 An innovative beam search decoder is proposed to efficiently find highly probable compressions. [sent-9, score-0.197]

4 Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. [sent-10, score-1.316]

5 Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task. [sent-15, score-0.134]

6 Query-focused multi-document summarization (MDS) methods have been proposed as one such technique and have attracted significant attention in recent years. [sent-17, score-0.134]

7 The goal of query-focused MDS is to synthesize a brief (often fixed-length) and well-organized summary from a set of topic-related documents that answer a complex question or address a topic statement. [sent-18, score-0.098]

8 The resulting summaries, in turn, can support a number of information analysis applications including open-ended question answering, recommender systems, and summarization of search engine results. [sent-19, score-0.17]

9 As further evidence of its importance, the Document Understanding Conference (DUC) has used query-focused MDS as its main task since 2004 to foster … [sent-20, score-0.126]

10 … new research on automatic summarization in the context of users’ needs. [sent-22, score-0.134]

11 First, lengthy sentences that are partly relevant are either excluded from the summary or (if selected) can block the selection of other important sentences, due to summary length constraints. [sent-25, score-0.159]

12 In news articles, for example, most sentences are lengthy and contain both potentially useful information for a summary and unnecessary details that are better omitted. [sent-27, score-0.095]

13 Consider the following DUC query as input for an MDS system: "In what ways have stolen artworks been recovered?" [sent-28, score-0.105]

14 In this example, the compressed sentence is rela… (footnote 1: From DUC 2005, query for topic d422g.) [sent-31, score-0.231]

15 Sentence compression techniques (Knight and Marcu, 2000; Clarke and Lapata, 2008) are the standard for producing a compact and grammatical version of a sentence while preserving relevance, and prior research (e. [sent-41, score-0.688]

16 Similarly, strides have been made to incorporate sentence compression into query-focused MDS systems (Zajic et al. [sent-44, score-0.645]

17 Most attempts, however, fail to produce better results than those of the best systems built on pure extraction-based approaches that use no sentence compression. [sent-46, score-0.122]

18 In this paper we investigate the role of sentence compression techniques for query-focused MDS. [sent-47, score-0.645]

19 We extend existing work in the area first by investigating the role of learning-based sentence compression techniques. [sent-48, score-0.645]

20 Our top-performing sentence compression algorithm incorporates measures of query relevance, content importance, redundancy and language quality, among others. [sent-50, score-0.831]

21 We evaluate the summarization models on the standard Document Understanding Conference (DUC) 2006 and 2007 corpora for query-focused MDS and find that all of our compression-based summarization models achieve statistically significantly better performance than the best DUC 2006 systems. [sent-52, score-0.415]

22 TAC-08’s opinion summarization or TAC-09’s update summarization) or domains (e. [sent-59, score-0.134]

23 , 2006)); human annotators furthermore rate our system-generated summaries as having less redundancy and comparable quality w. [sent-75, score-0.104]

24 With these results we believe we are the first to successfully show that sentence compression can provide statistically significant improvements over pure extraction-based approaches for query-focused MDS. [sent-79, score-0.842]

25 Related Work (Section 2): Existing research on query-focused multi-document summarization (MDS) largely relies on extractive approaches, where systems usually take as input a set of documents and select the top relevant sentences for inclusion in the final summary. [sent-80, score-0.265]

26 , 2006), combining query similarity and document centrality within a graph-based model (Otterbacher et al. [sent-83, score-0.14]

27 Our work is more related to the less studied area of sentence compression as applied to (single) document summarization. [sent-90, score-0.68]

28 (2006) tackle the query-focused MDS problem using a compress-first strategy: they develop heuristics to generate multiple alternative compressions of all sentences in the original document; these then become the candidates for extraction. [sent-92, score-0.094]

29 A similar idea has been studied for MDS (Lin, 2003; Gillick and Favre, 2009), but limited improvement is observed over extractive baselines with simple compression rules. [sent-94, score-0.659]

30 Finally, although learning-based compression methods are promising (Martins and Smith, 2009; Berg-Kirkpatrick et al. [sent-95, score-0.573]

31 Rather than attempt to derive a new parse tree like Knight and Marcu (2000) and Galley and McKeown (2007), we learn to safely remove a set of constituents in our parse tree-based compression model while preserving grammatical structure and essential content. [sent-98, score-0.733]

32 Sentence-level compression has also been examined via a discriminative model (McDonald, 2006), and Clarke and Lapata (2008) also incorporate discourse information by using integer linear programming. [sent-99, score-0.573]

33 First, sentence ranking determines the importance of each sentence given the query. [sent-101, score-0.178]

34 Then, a sentence compressor iteratively generates the most likely succinct versions of the ranked sentences, which are cumulatively added to the summary, until a length limit is reached. [sent-102, score-0.119]

35 Finally, the postprocessing stage applies coreference resolution and sentence reordering to build the summary. [sent-103, score-0.165]
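
The three stages described in sentences 33 through 35 (sentence ranking, iterative compression, post-processing) suggest a simple control loop for the middle stage. The following is only a minimal sketch of that greedy compress-and-add loop under a word budget; the compress callable and the word limit are hypothetical placeholders, not the paper's actual components.

```python
def build_summary(ranked_sentences, compress, word_limit=250):
    """Greedy assembly: walk sentences in rank order, compress each one,
    and add the compressed version while it still fits the length budget."""
    summary, used = [], 0
    for sentence in ranked_sentences:            # already sorted by relevance
        candidate = compress(sentence)            # most likely succinct version
        length = len(candidate.split())
        if used + length <= word_limit:
            summary.append(candidate)
            used += length
    return " ".join(summary)


# Toy usage with an identity "compressor" standing in for the learned models.
ranked = ["Stolen artworks were recovered by police in Rome after a long inquiry.",
          "The recovery followed an undercover investigation spanning two years."]
print(build_summary(ranked, compress=lambda s: s, word_limit=15))
```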

36 Table 1), we describe next the query-relevant features used for sentence ranking as these are the most important for our summarization setting. [sent-119, score-0.24]

37 The goal of this feature subset is to determine the similarity between the query and each candidate sentence. [sent-120, score-0.105]

38 Then we conduct simple query expansion based on the title of the topic and cross-document coreference resolution. [sent-123, score-0.16]

39 Finally, we compute two versions of the features—one based on the original query and another on the expanded one. [sent-126, score-0.105]

40 We also derive the semantic role overlap and relation instance overlap between the query and each sentence. [sent-127, score-0.105]
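
Sentences 37 to 40 describe query-sentence similarity features computed for both the original and the expanded query. As a rough illustration, the sketch below computes only lexical overlap and count-based cosine similarity; the semantic role and relation-instance overlap features depend on the full IE pipeline and are not reproduced here.

```python
import math
from collections import Counter

def tokens(text):
    stripped = (t.lower().strip('?.,;:!()"') for t in text.split())
    return [t for t in stripped if t]

def query_overlap(query, sentence):
    """Fraction of query tokens that also appear in the sentence."""
    q, s = set(tokens(query)), set(tokens(sentence))
    return len(q & s) / len(q) if q else 0.0

def cosine(query, sentence):
    """Cosine similarity over raw term counts (tf-idf weighting omitted)."""
    q, s = Counter(tokens(query)), Counter(tokens(sentence))
    dot = sum(q[t] * s[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in s.values()))
    return dot / norm if norm else 0.0

query = "In what ways have stolen artworks been recovered?"
sent = "Police recovered the stolen paintings after a tip-off."
print(query_overlap(query, sent), cosine(query, sent))
```

Both feature versions (original versus expanded query) would simply call the same functions with the two query strings.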

41 As the main focus of this paper, we propose three types of compression methods, described in detail in Section 4 below. [sent-130, score-0.573]

42 For sentence ordering, each compressed sentence is assigned to the most similar (tf-idf) query sentence. [sent-135, score-0.303]

43 , 2002) sorts the sentences for each query based first on the time stamp, and then the position in the source document. [sent-137, score-0.105]
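
A minimal sketch of this ordering step (sentences 42 and 43); the Sentence record and its fields are invented stand-ins for whatever article metadata the real system carries.

```python
from dataclasses import dataclass

@dataclass
class Sentence:
    text: str
    timestamp: str   # e.g. article date "1999-03-17"
    position: int    # index of the source sentence within its article

def order_chronologically(sentences):
    """Earliest article first, then original position within the article."""
    return sorted(sentences, key=lambda s: (s.timestamp, s.position))

docs = [Sentence("The paintings resurfaced in 1999.", "1999-05-02", 3),
        Sentence("Thieves took the works in 1994.", "1994-11-20", 0)]
print([s.text for s in order_chronologically(docs)])
```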

44 Sentence Compression (Section 4): Sentence compression is typically formulated as the problem of removing secondary information from a sentence while maintaining its grammaticality and semantic structure (Knight and Marcu, 2000; McDonald, 2006; Galley and McKeown, 2007; Clarke and Lapata, 2008). [sent-138, score-0.681]

45 Below we describe the sentence compression approaches developed in this research: RULE-BASED COMPRESSION, SEQUENCE-BASED COMPRESSION, and TREE-BASED COMPRESSION. [sent-140, score-0.645]

46 , 2007) to create the linguistically-motivated compression rules of Table 2. [sent-145, score-0.573]

47 To avoid ill-formed output, we disallow compressions of more than 10 words by each rule. [sent-146, score-0.094]

48 Sequence-based Compression (Section 4.2): As in McDonald (2006) and Clarke and Lapata (2008), our sequence-based compression model makes a binary "keep-or-delete" decision for each word in the sentence. [sent-148, score-0.573]

49 We view compression as a sequential tagging problem and make use of linear-chain Conditional Random Fields (CRFs) (Lafferty et al. [sent-154, score-0.573]
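
The paper trains its sequence model with Mallet CRFs; purely to illustrate the keep-or-delete formulation, here is a sketch using the third-party sklearn_crfsuite package with toy features (an assumed stand-in, not the authors' toolchain or feature set).

```python
# pip install sklearn-crfsuite
import sklearn_crfsuite

def word_features(sent, i):
    """Minimal per-token features; the real model also uses syntactic,
    rule-based, and query-relevance signals."""
    word = sent[i]
    return {
        "lower": word.lower(),
        "is_stopword": word.lower() in {"the", "a", "an", "of", "in", "on"},
        "prev": sent[i - 1].lower() if i > 0 else "<s>",
        "next": sent[i + 1].lower() if i < len(sent) - 1 else "</s>",
    }

def featurize(sent):
    return [word_features(sent, i) for i in range(len(sent))]

# One toy training pair: every word labeled KEEP or DELETE.
train_sent = "The police , acting on a tip , recovered the stolen painting".split()
train_tags = ["KEEP", "KEEP", "DELETE", "DELETE", "DELETE", "DELETE", "DELETE",
              "DELETE", "KEEP", "KEEP", "KEEP", "KEEP"]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit([featurize(train_sent)], [train_tags])
print(crf.predict([featurize(train_sent)]))
```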

50 Tree-based Compression (Section 4.3): Our tree-based compression methods are in line with syntax-driven approaches (Galley and McKeown, 2007), where operations are carried out on parse tree constituents. [sent-171, score-0.654]

51 Formally, given a parse tree T of the sentence to be compressed and a tree traversal algorithm, T can be presented as a list of ordered constituent nodes, T = t0 t1 … tn. [sent-174, score-0.313]

52 RETAIN (RET) and REMOVE (REM) denote whether the node ti is retained or removed. [sent-182, score-0.13]

53 Labels are identified, in order, according to the tree traversal algorithm. [sent-186, score-0.106]

54 Every node label needs to be compatible with the labeling history: given a node ti, and a set of labels l0 … [sent-187, score-0.129]

55 … ti−1, li = RET or li = REM is compatible with the history when all children of ti are labeled as RET or REM, respectively; li = PAR is compatible when ti has at least two descendants tj and tk (j < i and k < i), one of which is RETained and the other REMoved. [sent-193, score-0.154]
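
A small sketch of this compatibility constraint; the label names are from the paper, but restating the check in terms of already-labeled children is a simplification of the labeling-history formulation.

```python
RET, REM, PAR = "RETAIN", "REMOVE", "PARTIAL"

def compatible_labels(child_labels):
    """Labels allowed for a constituent given the labels already assigned to
    its children (leaves have no children and may be retained or removed)."""
    if not child_labels:
        return {RET, REM}
    allowed = set()
    if all(l == RET for l in child_labels):
        allowed.add(RET)
    if all(l == REM for l in child_labels):
        allowed.add(REM)
    # PARTIAL: some material below is kept and some is dropped.
    if any(l in (RET, PAR) for l in child_labels) and \
       any(l in (REM, PAR) for l in child_labels):
        allowed.add(PAR)
    return allowed

print(compatible_labels([RET, REM]))   # {'PARTIAL'}
print(compatible_labels([RET, RET]))   # {'RETAIN'}
```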

56 As the space of possible compressions is exponential in the number of leaves in the parse tree, instead of looking for the globally optimal solution, we use beam search to find a set of highly likely compressions and employ a language model trained on a large corpus for evaluation. [sent-195, score-0.374]

57 The beam search decoder (see Algorithm 1) takes as input the sentence’s parse tree T = t0 t1 … [sent-197, score-0.278]

58 postorder) as a sequence of nodes in T, the set L of possible node labels, a scoring function S for evaluating each sentence compression hypothesis, and a beam size N. [sent-202, score-0.874]

59 In iteration i, all existing sentence compression hypotheses are expanded by one node, tOi, labeling it with all compatible labels. [sent-211, score-0.68]

60 Our BASIC Tree-based Compression instantiates the beam search decoder with postorder traversal and a hypothesis scorer that takes a possible sentence compression— a sequence of nodes (e. [sent-214, score-0.479]

61 Figure 2: Example of beam search decoding. [sent-226, score-0.15]

62 For postorder traversal, the three nodes are visited in a bottom-up order. [sent-227, score-0.112]

63 The associated compression hypotheses (boxed) are ranked based on the scores in parentheses. [sent-228, score-0.573]
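
Algorithm 1 itself is not reproduced in this extract, so the following is only a stripped-down sketch of such a beam decoder: nodes arrive in traversal order, each hypothesis is extended with every label a compatibility test accepts, and only the N highest-scoring partial hypotheses survive each iteration. The node representation, scorer, and compatibility callable are illustrative assumptions.

```python
def beam_decode(ordered_nodes, labels, score, compatible, beam_size=5):
    """ordered_nodes: parse-tree constituents in traversal order.
    labels: candidate labels, e.g. ("RETAIN", "REMOVE", "PARTIAL").
    score: maps a partial labeling (dict node -> label) to a float.
    compatible: callable(node, label, partial_labeling) -> bool."""
    beam = [({}, 0.0)]                            # (partial labeling, score)
    for node in ordered_nodes:
        expanded = []
        for labeling, _ in beam:
            for label in labels:
                if not compatible(node, label, labeling):
                    continue
                new_labeling = {**labeling, node: label}
                expanded.append((new_labeling, score(new_labeling)))
        # keep only the N most promising hypotheses
        beam = sorted(expanded, key=lambda h: h[1], reverse=True)[:beam_size]
    return beam                                   # N-best full labelings

# Toy run: three constituents, a scorer that prefers keeping content-bearing ones.
nodes = ["NP_police", "VP_recovered_art", "PP_in_1999"]
keep_bonus = {"NP_police": 1.0, "VP_recovered_art": 2.0, "PP_in_1999": -0.5}
scorer = lambda lab: sum(keep_bonus[n] for n, l in lab.items() if l == "RETAIN")
always_ok = lambda node, label, lab: True
print(beam_decode(nodes, ("RETAIN", "REMOVE"), scorer, always_ok, beam_size=3))
```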

64 HEAD-driven search modifies the BASIC postorder tree traversal by visiting the head node first at each level, leaving other orders unchanged. [sent-232, score-0.266]
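
One plausible reading of this modification, sketched below, is a bottom-up traversal in which the head child's subtree is visited before its siblings' subtrees; the Node class and manual head marking are assumptions, since the head-finding rules are not given in this extract.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_head: bool = False          # marks the head child among its siblings

def head_first_postorder(node):
    """Bottom-up traversal that visits the head child's subtree first."""
    order = []
    kids = sorted(node.children, key=lambda c: not c.is_head)  # head first
    for child in kids:
        order.extend(head_first_postorder(child))
    order.append(node)
    return order

# Toy tree: S -> NP(DT, NN*) VP*(VBD*, NP), asterisks marking heads.
tree = Node("S", [Node("NP", [Node("DT"), Node("NN", is_head=True)]),
                  Node("VP", [Node("VBD", is_head=True), Node("NP")], is_head=True)])
print([n.label for n in head_first_postorder(tree)])
```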

65 Given the N-best compressions from the decoder, we evaluate the yield of the trimmed trees using a language model trained on the Gigaword (Graff, 2003) corpus and return the compression with the highest probability. [sent-239, score-0.701]
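
The paper scores candidate yields with a 5-gram SRILM model trained on Gigaword; the sketch below shows only the selection step, with a tiny add-one-smoothed bigram model standing in for the real language model (all strings invented).

```python
import math
from collections import Counter

def train_bigram_lm(corpus_sentences):
    """Tiny add-one-smoothed bigram model; a stand-in for a 5-gram SRILM model."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus_sentences:
        toks = ["<s>"] + sent.lower().split() + ["</s>"]
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks[:-1], toks[1:]))
    vocab = len(unigrams) + 1
    def logprob(sentence):
        toks = ["<s>"] + sentence.lower().split() + ["</s>"]
        return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
                   for a, b in zip(toks[:-1], toks[1:]))
    return logprob

def select_most_fluent(candidates, lm_logprob):
    """Return the compression with the best per-word language model score."""
    return max(candidates, key=lambda c: lm_logprob(c) / max(len(c.split()), 1))

lm = train_bigram_lm(["the painting was recovered by police",
                      "the police recovered the stolen painting"])
nbest = ["the painting was recovered by police",
         "painting recovered police by was the"]
print(select_most_fluent(nbest, lm))
```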

66 Thus, the decoder is quite flexible: its learned scoring function allows us to incorporate features salient for sentence compression while its language model guarantees the linguistic quality of the compressed string. [sent-240, score-0.779]

67 Towards this goal, we construct a compression scoring function—the multi-scorer (MULTI)—that allows the incorporation of multiple task-specific scorers. [sent-246, score-0.606]
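
A bare-bones sketch of a linear multi-scorer of this kind: several task-specific sub-scores combined with weights, score(c) = sum_i w_i * score_i(c). The particular sub-scorers and weights below are invented for illustration and are not the paper's trained components.

```python
def make_multi_scorer(sub_scorers, weights):
    """Combine task-specific scorers: score(c) = sum_i w_i * scorer_i(c)."""
    def multi(compression, context):
        return sum(w * s(compression, context)
                   for s, w in zip(sub_scorers, weights))
    return multi

# Illustrative sub-scorers over a compression string and a context dict.
def query_relevance(c, ctx):
    q = set(ctx["query"].lower().replace("?", "").split())
    return len(q & set(c.lower().split())) / max(len(q), 1)

def redundancy_penalty(c, ctx):
    seen = set(" ".join(ctx["summary_so_far"]).lower().split())
    return -len(seen & set(c.lower().split())) / max(len(c.split()), 1)

def brevity(c, ctx):
    return -len(c.split()) / 100.0

multi = make_multi_scorer([query_relevance, redundancy_penalty, brevity],
                          weights=[1.0, 0.5, 0.2])
ctx = {"query": "How were stolen artworks recovered?", "summary_so_far": []}
print(multi("Stolen artworks were recovered in police raids.", ctx))
```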

68 The query Q is expanded as described in Section 3. [sent-253, score-0.105]

69 Relevant documents for each query are provided along with 4 to 9 human MDS abstracts. [sent-271, score-0.105]

70 We split DUC 2005 into two parts: 40 topics to train the sentence ranking models, and 10 for ranking algorithm selection and parameter tuning for the multi-scorer. [sent-273, score-0.14]

71 It includes 82 newswire articles with one manually produced compression aligned to each sentence. [sent-277, score-0.573]

72 Documents are processed by a full NLP pipeline, including token and sentence segmentation, parsing, semantic role labeling, and an information extraction pipeline consisting of mention detection, NP coreference, cross-document resolution, and relation detection (Florian et al. [sent-279, score-0.103]

73 For sequence-based compression using CRFs, we employ Mallet (McCallum, 2002) and integrate the Table 2 rules during inference. [sent-286, score-0.573]

74 Sentence compressions are evaluated with a 5-gram language model trained on Gigaword (Graff, 2003) using SRILM (Stolcke, 2002). [sent-292, score-0.094]

75 (2012) that report the best R-2 score on DUC 2006 and 2007 thus far, and to the purely extractive methods of SVR and LambdaMART. [sent-297, score-0.086]

76 Our sentence-compression-based systems (marked with †) show statistically significant improvements over pure extractive summarization for both R-2 and R-SU4 (paired t-test, p < 0. [sent-298, score-0.101]
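
The significance claim rests on a paired t-test over per-topic scores; a minimal sketch of that test with SciPy (the ROUGE-2 values below are made up for illustration).

```python
from scipy import stats

# Hypothetical per-topic ROUGE-2 scores for two systems on the same DUC topics.
compression_system = [0.091, 0.102, 0.088, 0.110, 0.095, 0.101]
extractive_baseline = [0.085, 0.097, 0.080, 0.104, 0.090, 0.094]

t_stat, p_value = stats.ttest_rel(compression_system, extractive_baseline)
print(f"paired t = {t_stat:.3f}, p = {p_value:.4f}")
```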

77 This means our systems can effectively remove redundancy within the summary through compression. [sent-300, score-0.109]

78 Furthermore, our HEAD-driven beam search method with MULTI-scorer beats all systems on DUC 2006 and all systems on DUC 2007 except the best system in terms of R-2 (p < 0. [sent-301, score-0.15]

79 (p < 0.01) better than extractive methods, rule-based and sequence-based compression methods on both DUC 2006 and 2007. [sent-304, score-0.134]

80 Moreover, our systems with learning-based compression have considerable compression rates, indicating their capability to remove superfluous words as well as improve summary quality. [sent-305, score-1.21]

81 BASIC, CONTEXT and HEAD represent the basic beam search decoder, context-aware and head-driven search extensions respectively. [sent-319, score-0.186]

82 Four native speakers who are undergraduate students in computer science (none are authors) performed the task. We compare our system based on HEAD-driven beam search with MULTI-scorer to the best systems in DUC 2006 achieving top ROUGE scores (Jagarlamudi et al. [sent-353, score-0.15]

83 (2006), which either uses minimal non-learning-based compression rules or is a pure extractive system. [sent-366, score-0.709]

84 However, our compression system sometimes generates less grammatical sentences, and those are mostly due to parsing errors. [sent-367, score-0.616]

85 A sample summary from our multi-scorer-based system is in Figure 3. [sent-370, score-0.102]

86 We also evaluate sentence compression separately on the Clarke and Lapata (2008) corpus, adopting the same partitions as Martins and Smith (2009), i.e. [sent-372, score-0.645]

87 Our compression models are compared with Hedge Trimmer (Dorr et al. [sent-375, score-0.573]

88 There is no statistically significant difference between our models and McDonald (2006) / M & S (2009); unigram F1 (Uni-F1) scores are statistically indistinguishable (p > 0. [sent-380, score-0.102]
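
Unigram F1 compares a system compression with the human gold compression by token overlap; a small self-contained sketch (the example strings are invented).

```python
from collections import Counter

def unigram_f1(system, gold):
    """Clipped-count token overlap, turned into precision, recall, and F1."""
    sys_c = Counter(system.lower().split())
    gold_c = Counter(gold.lower().split())
    overlap = sum(min(sys_c[t], gold_c[t]) for t in sys_c)
    if not overlap:
        return 0.0
    precision = overlap / sum(sys_c.values())
    recall = overlap / sum(gold_c.values())
    return 2 * precision * recall / (precision + recall)

print(unigram_f1("police recovered the stolen painting",
                 "the police recovered the painting"))
```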

89 Figure 3: Part of the summary generated by the multi-scorer-based summarizer for topic D0626H (DUC 2006). [sent-389, score-0.102]

90 In Table 7, our context-aware and head-driven tree-based compression systems show statistically significantly (p < 0. [sent-395, score-0.624]

91 For grammatical relation evaluation, our head-driven tree-based system obtains statistically significantly (p < 0. [sent-403, score-0.094]

92 Conclusion (Section 7): We have presented a framework for query-focused multi-document summarization based on sentence compression. [sent-405, score-0.206]

93 Our tree-based compression method can easily incorporate measures of query relevance, content importance, redundancy and language quality into the compression process. [sent-407, score-1.332]

94 Inferring strategies for sentence ordering in multidocument news summarization. [sent-425, score-0.148]

95 Global inference for sentence compression: an integer linear programming approach. [sent-477, score-0.645]

96 In Proceedings of the HLT-NAACL 03 Text Summarization Workshop - Volume 5, HLT-NAACL-DUC '03, pages 1–8, Stroudsburg, PA, USA. [sent-527, score-0.134]

97 Support vector machines for query-focused summarization trained and evaluated on pyramid data. [sent-544, score-0.24]

98 Improving summarization performance by sentence compression: a pilot study. [sent-622, score-0.206]

99 A mention-synchronous coreference resolution algorithm based on the bell tree. [sent-631, score-0.093]

100 Sentence compression as a component of a multidocument summarization system. [sent-705, score-0.752]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('compression', 0.573), ('duc', 0.424), ('mds', 0.191), ('jagarlamudi', 0.142), ('summarization', 0.134), ('beam', 0.114), ('pyramid', 0.106), ('query', 0.105), ('rem', 0.102), ('lacatusu', 0.096), ('queryfocused', 0.096), ('compressions', 0.094), ('extractive', 0.086), ('clarke', 0.082), ('rouge', 0.078), ('davis', 0.078), ('lambdamart', 0.077), ('postorder', 0.077), ('ret', 0.074), ('sentence', 0.072), ('dang', 0.069), ('stroudsburg', 0.069), ('luo', 0.067), ('conroy', 0.065), ('summary', 0.064), ('martins', 0.063), ('lq', 0.063), ('traversal', 0.061), ('summaries', 0.059), ('ouyang', 0.058), ('pingali', 0.058), ('zajic', 0.056), ('coreference', 0.055), ('pa', 0.055), ('compressed', 0.054), ('xiaoqiang', 0.054), ('marcu', 0.052), ('statistically', 0.051), ('pure', 0.05), ('decoder', 0.047), ('galley', 0.047), ('compressor', 0.047), ('node', 0.047), ('redundancy', 0.045), ('tree', 0.045), ('lapata', 0.045), ('multidocument', 0.045), ('turner', 0.044), ('grammatical', 0.043), ('erkan', 0.042), ('mckeown', 0.042), ('ti', 0.042), ('retained', 0.041), ('gillick', 0.04), ('knight', 0.039), ('fuentes', 0.038), ('mozer', 0.038), ('multiscorer', 0.038), ('rankers', 0.038), ('scorebasic', 0.038), ('scoreim', 0.038), ('scorelm', 0.038), ('scoreq', 0.038), ('scorered', 0.038), ('sumbasic', 0.038), ('resolution', 0.038), ('scorer', 0.037), ('parse', 0.036), ('grammaticality', 0.036), ('content', 0.036), ('search', 0.036), ('document', 0.035), ('nodes', 0.035), ('compatible', 0.035), ('burges', 0.034), ('lin', 0.034), ('ranking', 0.034), ('aho', 0.034), ('grayed', 0.034), ('kincaid', 0.034), ('synthesize', 0.034), ('toi', 0.034), ('trimmed', 0.034), ('trimmer', 0.034), ('gray', 0.034), ('scoring', 0.033), ('mcdonald', 0.032), ('relevance', 0.032), ('nenkova', 0.031), ('ordering', 0.031), ('vasudeva', 0.031), ('crossdocument', 0.031), ('langauge', 0.031), ('lengthy', 0.031), ('nanda', 0.031), ('otterbacher', 0.031), ('varma', 0.031), ('radu', 0.03), ('retain', 0.03), ('arrested', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000011 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

Author: Lu Wang ; Hema Raghavan ; Vittorio Castelli ; Radu Florian ; Claire Cardie

Abstract: We consider the problem of using sentence compression techniques to facilitate queryfocused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task. ,

2 0.28797209 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning

Author: Miguel Almeida ; Andre Martins

Abstract: We present a dual decomposition framework for multi-document summarization, using a model that jointly extracts and compresses sentences. Compared with previous work based on integer linear programming, our approach does not require external solvers, is significantly faster, and is modular in the three qualities a summary should have: conciseness, informativeness, and grammaticality. In addition, we propose a multi-task learning framework to take advantage of existing data for extractive summarization and sentence compression. Experiments in the TAC2008 dataset yield the highest published ROUGE scores to date, with runtimes that rival those of extractive summarizers.

3 0.23477224 50 acl-2013-An improved MDL-based compression algorithm for unsupervised word segmentation

Author: Ruey-Cheng Chen

Abstract: We study the mathematical properties of a recently proposed MDL-based unsupervised word segmentation algorithm, called regularized compression. Our analysis shows that its objective function can be efficiently approximated using the negative empirical pointwise mutual information. The proposed extension improves the baseline performance in both efficiency and accuracy on a standard benchmark.

4 0.22355177 332 acl-2013-Subtree Extractive Summarization via Submodular Maximization

Author: Hajime Morita ; Ryohei Sasano ; Hiroya Takamura ; Manabu Okumura

Abstract: This study proposes a text summarization model that simultaneously performs sentence extraction and compression. We translate the text summarization task into a problem of extracting a set of dependency subtrees in the document cluster. We also encode obligatory case constraints as must-link dependency constraints in order to guarantee the readability of the generated summary. In order to handle the subtree extraction problem, we investigate a new class of submodular maximization problem, and a new algorithm that has the approximation ratio 1/2 (1 − e^-1). Our experiments with the NTCIR ACLIA test collections show that our approach outperforms a state-of-the-art algorithm.

5 0.1621491 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art

Author: Peter A. Rankel ; John M. Conroy ; Hoa Trang Dang ; Ani Nenkova

Abstract: How good are automatic content metrics for news summary evaluation? Here we provide a detailed answer to this question, with a particular focus on assessing the ability of automatic evaluations to identify statistically significant differences present in manual evaluation of content. Using four years of data from the Text Analysis Conference, we analyze the performance of eight ROUGE variants in terms of accuracy, precision and recall in finding significantly different systems. Our experiments show that some of the neglected variants of ROUGE, based on higher order n-grams and syntactic dependencies, are most accurate across the years; the commonly used ROUGE-1 scores find too many significant differences between systems which manual evaluation would deem comparable. We also test combinations ofROUGE variants and find that they considerably improve the accuracy of automatic prediction.

6 0.15125659 333 acl-2013-Summarization Through Submodularity and Dispersion

7 0.14920481 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization

8 0.14412877 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization

9 0.1330916 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

10 0.13170682 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus

11 0.12091849 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

12 0.11074369 225 acl-2013-Learning to Order Natural Language Texts

13 0.10340738 178 acl-2013-HEADY: News headline abstraction through event pattern clustering

14 0.092551276 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

15 0.091592155 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

16 0.080230907 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

17 0.078528695 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization

18 0.078148939 319 acl-2013-Sequential Summarization: A New Application for Timely Updated Twitter Trending Topics

19 0.077196151 172 acl-2013-Graph-based Local Coherence Modeling

20 0.075597294 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.21), (1, 0.004), (2, -0.031), (3, -0.071), (4, 0.009), (5, 0.057), (6, 0.173), (7, -0.036), (8, -0.226), (9, -0.07), (10, -0.068), (11, 0.056), (12, -0.212), (13, -0.023), (14, -0.083), (15, 0.189), (16, 0.183), (17, -0.105), (18, -0.031), (19, 0.112), (20, -0.035), (21, -0.023), (22, -0.032), (23, -0.019), (24, 0.017), (25, -0.085), (26, -0.002), (27, 0.02), (28, 0.072), (29, 0.02), (30, -0.022), (31, 0.029), (32, -0.03), (33, 0.009), (34, 0.047), (35, -0.026), (36, 0.069), (37, -0.027), (38, 0.0), (39, -0.05), (40, -0.06), (41, -0.046), (42, 0.041), (43, 0.107), (44, 0.034), (45, -0.015), (46, -0.033), (47, 0.023), (48, 0.023), (49, -0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9272455 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

Author: Lu Wang ; Hema Raghavan ; Vittorio Castelli ; Radu Florian ; Claire Cardie

Abstract: We consider the problem of using sentence compression techniques to facilitate queryfocused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task. ,

2 0.85845071 332 acl-2013-Subtree Extractive Summarization via Submodular Maximization

Author: Hajime Morita ; Ryohei Sasano ; Hiroya Takamura ; Manabu Okumura

Abstract: This study proposes a text summarization model that simultaneously performs sentence extraction and compression. We translate the text summarization task into a problem of extracting a set of dependency subtrees in the document cluster. We also encode obligatory case constraints as must-link dependency constraints in order to guarantee the readability of the generated summary. In order to handle the subtree extraction problem, we investigate a new class of submodular maximization problem, and a new algorithm that has the approximation ratio 1/2 (1 − e^-1). Our experiments with the NTCIR ACLIA test collections show that our approach outperforms a state-of-the-art algorithm.

3 0.85407126 333 acl-2013-Summarization Through Submodularity and Dispersion

Author: Anirban Dasgupta ; Ravi Kumar ; Sujith Ravi

Abstract: We propose a new optimization framework for summarization by generalizing the submodular framework of (Lin and Bilmes, 2011). In our framework the summarization desideratum is expressed as a sum of a submodular function and a nonsubmodular function, which we call dispersion; the latter uses inter-sentence dissimilarities in different ways in order to ensure non-redundancy of the summary. We consider three natural dispersion functions and show that a greedy algorithm can obtain an approximately optimal summary in all three cases. We conduct experiments on two corpora—DUC 2004 and user comments on news articles—and show that the performance of our algorithm outperforms those that rely only on submodularity.

4 0.82463413 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning

Author: Miguel Almeida ; Andre Martins

Abstract: We present a dual decomposition framework for multi-document summarization, using a model that jointly extracts and compresses sentences. Compared with previous work based on integer linear programming, our approach does not require external solvers, is significantly faster, and is modular in the three qualities a summary should have: conciseness, informativeness, and grammaticality. In addition, we propose a multi-task learning framework to take advantage of existing data for extractive summarization and sentence compression. Experiments in the TAC2008 dataset yield the highest published ROUGE scores to date, with runtimes that rival those of extractive summarizers.

5 0.797306 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization

Author: Chen Li ; Xian Qian ; Yang Liu

Abstract: In this paper, we propose a bigram based supervised method for extractive document summarization in the integer linear programming (ILP) framework. For each bigram, a regression model is used to estimate its frequency in the reference summary. The regression model uses a variety ofindicative features and is trained discriminatively to minimize the distance between the estimated and the ground truth bigram frequency in the reference summary. During testing, the sentence selection problem is formulated as an ILP problem to maximize the bigram gains. We demonstrate that our system consistently outperforms the previous ILP method on different TAC data sets, and performs competitively compared to the best results in the TAC evaluations. We also conducted various analysis to show the impact of bigram selection, weight estimation, and ILP setup.

6 0.79656506 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

7 0.73490173 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art

8 0.65757751 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization

9 0.6439631 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization

10 0.62249458 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

11 0.59157622 50 acl-2013-An improved MDL-based compression algorithm for unsupervised word segmentation

12 0.57686162 225 acl-2013-Learning to Order Natural Language Texts

13 0.52240878 319 acl-2013-Sequential Summarization: A New Application for Timely Updated Twitter Trending Topics

14 0.51043844 178 acl-2013-HEADY: News headline abstraction through event pattern clustering

15 0.5098871 375 acl-2013-Using Integer Linear Programming in Concept-to-Text Generation to Produce More Compact Texts

16 0.48695379 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

17 0.46869785 172 acl-2013-Graph-based Local Coherence Modeling

18 0.42683959 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors

19 0.41627887 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus

20 0.3914313 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.051), (6, 0.064), (11, 0.082), (15, 0.015), (16, 0.014), (24, 0.039), (26, 0.056), (34, 0.163), (35, 0.077), (42, 0.067), (48, 0.04), (60, 0.012), (64, 0.015), (70, 0.048), (88, 0.045), (90, 0.041), (95, 0.062)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.8663733 21 acl-2013-A Statistical NLG Framework for Aggregated Planning and Realization

Author: Ravi Kondadadi ; Blake Howald ; Frank Schilder

Abstract: We present a hybrid natural language generation (NLG) system that consolidates macro and micro planning and surface realization tasks into one statistical learning process. Our novel approach is based on deriving a template bank automatically from a corpus of texts from a target domain. First, we identify domain specific entity tags and Discourse Representation Structures on a per sentence basis. Each sentence is then organized into semantically similar groups (representing a domain specific concept) by k-means clustering. After this semi-automatic processing (human review of cluster assignments), a number of corpus–level statistics are compiled and used as features by a ranking SVM to develop model weights from a training corpus. At generation time, a set of input data, the collection of semantically organized templates, and the model weights are used to select optimal templates. Our system is evaluated with automatic, non–expert crowdsourced and expert evaluation metrics. We also introduce a novel automatic metric syntactic variability that represents linguistic variation as a measure of unique template sequences across a collection of automatically generated documents. The metrics for generated weather and biography texts fall within acceptable ranges. In sum, we argue that our statistical approach to NLG reduces the need for complicated knowledge-based architectures and readily adapts to different domains with reduced development time. – – *∗Ravi Kondadadi is now affiliated with Nuance Communications, Inc.

same-paper 2 0.85163337 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

Author: Lu Wang ; Hema Raghavan ; Vittorio Castelli ; Radu Florian ; Claire Cardie

Abstract: We consider the problem of using sentence compression techniques to facilitate queryfocused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task. ,

3 0.75439799 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

Author: Xipeng Qiu ; Qi Zhang ; Xuanjing Huang

Abstract: The growing need for Chinese natural language processing (NLP) is largely in a range of research and commercial applications. However, most of the currently Chinese NLP tools or components still have a wide range of issues need to be further improved and developed. FudanNLP is an open source toolkit for Chinese natural language processing (NLP) , which uses statistics-based and rule-based methods to deal with Chinese NLP tasks, such as word segmentation, part-ofspeech tagging, named entity recognition, dependency parsing, time phrase recognition, anaphora resolution and so on.

4 0.74083757 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

Author: Ulle Endriss ; Raquel Fernandez

Abstract: Crowdsourcing, which offers new ways of cheaply and quickly gathering large amounts of information contributed by volunteers online, has revolutionised the collection of labelled data. Yet, to create annotated linguistic resources from this data, we face the challenge of having to combine the judgements of a potentially large group of annotators. In this paper we investigate how to aggregate individual annotations into a single collective annotation, taking inspiration from the field of social choice theory. We formulate a general formal model for collective annotation and propose several aggregation methods that go beyond the commonly used majority rule. We test some of our methods on data from a crowdsourcing experiment on textual entailment annotation.

5 0.73060668 333 acl-2013-Summarization Through Submodularity and Dispersion

Author: Anirban Dasgupta ; Ravi Kumar ; Sujith Ravi

Abstract: We propose a new optimization framework for summarization by generalizing the submodular framework of (Lin and Bilmes, 2011). In our framework the summarization desideratum is expressed as a sum of a submodular function and a nonsubmodular function, which we call dispersion; the latter uses inter-sentence dissimilarities in different ways in order to ensure non-redundancy of the summary. We consider three natural dispersion functions and show that a greedy algorithm can obtain an approximately optimal summary in all three cases. We conduct experiments on two corpora—DUC 2004 and user comments on news articles—and show that the performance of our algorithm outperforms those that rely only on submodularity.

6 0.72857964 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing

7 0.72665977 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study

8 0.72592258 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

9 0.72555768 225 acl-2013-Learning to Order Natural Language Texts

10 0.72501922 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

11 0.72474569 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

12 0.72466165 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

13 0.72448486 318 acl-2013-Sentiment Relevance

14 0.72230291 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

15 0.72087538 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

16 0.71917617 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions

17 0.71836168 275 acl-2013-Parsing with Compositional Vector Grammars

18 0.71732724 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

19 0.71699107 63 acl-2013-Automatic detection of deception in child-produced speech using syntactic complexity features

20 0.7153939 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation