acl acl2011 acl2011-4 knowledge-graph by maker-knowledge-mining

4 acl-2011-A Class of Submodular Functions for Document Summarization


Source: pdf

Author: Hui Lin ; Jeff Bilmes

Abstract: We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and submodular, which means that an efficient scalable greedy optimization scheme has a constant factor guarantee of optimality. When evaluated on DUC 2004-2007 corpora, we obtain better than existing state-of-art results in both generic and query-focused document summarization. Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We design a class of submodular functions meant for document summarization tasks. [sent-4, score-1.251]

2 These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. [sent-5, score-0.21]

3 Critically, our functions are monotone nondecreasing and submodular, which means that an efficient scalable greedy optimization scheme has a constant factor guarantee of optimality. [sent-6, score-0.418]

4 When evaluated on DUC 2004-2007 corpora, we obtain better than existing state-of-art results in both generic and query-focused document summarization. [sent-7, score-0.118]

5 Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization. [sent-8, score-2.19]

6 1 Introduction In this paper, we address the problem of generic and query-based extractive summarization from collections of related documents, a task commonly known as multi-document summarization. [sent-9, score-0.379]

7 We treat this task as monotone submodular function maximization (to be defined in Section 2). [sent-10, score-1.007]

8 On the one hand, there exists a simple greedy algorithm for monotone submodular function maximization where the summary solution obtained (say Ŝ) is guaranteed to be almost as good as the best possible solution (say Sopt) according to an objective F. [sent-12, score-1.216]
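
To make the guarantee concrete, here is a minimal sketch (not the paper's implementation) of the plain greedy scheme for maximizing a monotone submodular F under a cardinality budget; the paper itself uses a modified greedy for knapsack (byte-budget) constraints, and the objective, ground set, and budget below are placeholders.

```python
def greedy_submodular_max(ground_set, F, budget):
    """Greedy maximization of a monotone submodular set function F
    under a cardinality constraint |S| <= budget.

    For monotone submodular F, the returned set achieves at least
    (1 - 1/e) of the optimal value (Nemhauser et al., 1978)."""
    S = set()
    while len(S) < budget:
        # Pick the element with the largest marginal gain F(S + v) - F(S).
        best_gain, best_elem = 0.0, None
        for v in ground_set - S:
            gain = F(S | {v}) - F(S)
            if gain > best_gain:
                best_gain, best_elem = gain, v
        if best_elem is None:  # no element adds positive value
            break
        S.add(best_elem)
    return S

# Toy usage: F = coverage of a family of sets (monotone submodular).
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
F = lambda S: len(set().union(*(sets[v] for v in S))) if S else 0
print(greedy_submodular_max(set(sets), F, budget=2))  # e.g. {1, 3}
```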

9 Of course, none of this is useful if the objective function F is inappropriate for the summarization task. [sent-21, score-0.371]

10 In this paper, we argue that monotone nondecreasing submodular functions F are an ideal class of functions to investigate for document summarization. [sent-22, score-1.24]

11 We show, in fact, that many well-established methods for summarization (Carbonell and Goldstein, 1998; Filatova and Hatzivassiloglou, 2004; Takamura and Okumura, 2009; Riedhammer et al. [sent-23, score-0.254]

12 , 2010; Shen and Li, 2010) correspond to submodular function optimization, a property not explicitly mentioned in these publications. [sent-24, score-0.892]

13 We take this fact, however, as a testament to the value of submodular functions for summarization: if summarization algorithms are repeatedly developed that, by chance, happen to be an instance of submodular function optimization, this suggests that submodular functions are a natural fit. [sent-25, score-2.988]

14 On the other hand, other authors have started realizing explicitly the value of submodular functions for summarization (Lin and Bilmes, 2010; Qazvinian et al. [sent-26, score-1.198]

15 Submodular functions share many properties in common with convex functions, one of which is that they are closed under a number of common combination operations (summation, certain compositions, restrictions, and so on). [sent-28, score-0.172]

16 These operations give us the tools necessary to design a powerful submodular objective for submodular document summarization that extends beyond any previous work. [sent-29, score-2.008]

17 We demonstrate this by carefully crafting a class of submodular functions [sent-30, score-0.801]

18 we feel are ideal for extractive summarization tasks, both generic and query-focused. [sent-32, score-0.379]

19 In doing so, we demonstrate better than existing state-of-the-art performance on a number of standard summarization evaluation tasks, namely DUC-04 through DUC-07. [sent-33, score-0.254]

20 We believe our work, moreover, might act as a springboard for researchers in summarization to consider the problem of “how to design a submodular function” for the summarization task. [sent-34, score-1.309]

21 In Section 2, we provide a brief background on submodular functions and their optimization. [sent-35, score-0.924]

22 Section 3 describes how the task of extractive summarization can be viewed as a problem of submodular function maximization. [sent-36, score-1.2]

23 We also in this section show that many standard methods for summarization are, in fact, already performing submodular function optimization. [sent-37, score-1.12]

24 In Section 4, we present our own submodular functions. [sent-38, score-0.801]

25 Section 5 presents results on both generic and query-focused summarization tasks, showing as far as we know the best known ROUGE results for DUC-04 through DUC-06, and the best known precision results for DUC-07, and the best recall DUC-07 results among those that do not use a web search engine. [sent-39, score-0.321]

26 , vn} and a function F : 2^V → R that returns a real value for any subset S ⊆ V. [sent-44, score-0.085]

27 For example, F might correspond to the value or coverage of a set of sensor locations in an environment, and the goal is to find the best locations for a fixed number of sensors (Krause et al. [sent-49, score-0.098]

28 If the function F is monotone submodular then the maximization is sFtill NP complete, but it was shown in (Nemhauser et al. [sent-51, score-1.007]

29 Submodular functions are those that satisfy the property of diminishing returns: for any A ⊆ B ⊆ V \ v, a submodular function F must satisfy F(A + v) − F(A) ≥ F(B + v) − F(B). [sent-56, score-1.015]

30 A set function F is monotone nondecreasing if ∀A ⊆ B, F(A) ≤ F(B). [sent-60, score-0.095]

31 As shorthand, in this paper, monotone nondecreasing submodular functions will simply be referred to as monotone submodular. [sent-61, score-1.215]
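
As an illustrative sketch (not from the paper), both properties can be checked by brute force for a small coverage-style function; the "concepts" data below is made up for the example.

```python
from itertools import combinations

# Toy ground set and a coverage-style function: F(S) = number of
# distinct "concepts" covered by the elements of S.
concepts = {"s1": {"c1", "c2"}, "s2": {"c2", "c3"}, "s3": {"c3"}}
F = lambda S: len(set().union(*(concepts[s] for s in S))) if S else 0

def is_monotone_submodular(F, V):
    """Brute-force check of monotonicity and diminishing returns:
    for all A <= B and v outside B, F(A+v) - F(A) >= F(B+v) - F(B)."""
    V = set(V)
    subsets = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            if F(A) > F(B):                                    # monotonicity
                return False
            for v in V - B:
                if F(A | {v}) - F(A) < F(B | {v}) - F(B):      # diminishing returns
                    return False
    return True

print(is_monotone_submodular(F, concepts))  # True for this coverage function
```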

32 Historically, submodular functions have their roots in economics, game theory, combinatorial optimization, and operations research. [sent-62, score-0.951]

33 More recently, submodular functions have started receiving attention in the machine learning and computer vision community (Kempe et al. [sent-63, score-0.944]

34 , 2008; Kolmogorov and Zabin, 2004) and have recently been introduced to natural language processing for the tasks of document summarization (Lin and Bilmes, 2010) and word alignment (Lin and Bilmes, 2011). [sent-65, score-0.327]

35 For example, if a collection of functions {Fi} is submodular, then so is their weighted sum F = Σ_i αi Fi, where the αi are nonnegative weights. [sent-67, score-0.123]

36 It is not hard to show that submodular functions also have the following composition property with concave functions: Theorem 1. [sent-68, score-1.061]

37 Given functions F : 2^V → R and f : R → R, the composition F′ = f ◦ F : 2^V → R (i. [sent-69, score-0.147]

38 e., F′(S) = f(F(S))) is nondecreasing submodular, if f is non-decreasing concave and F is nondecreasing submodular. [sent-71, score-0.277]

39 This property will be quite useful when defining submodular functions for document summarization. [sent-72, score-1.023]

40 1 Summarization with knapsack constraint Let the ground set V represent all the sentences (or other linguistic units) in a document (or document collection, in the multi-document summarization case). [sent-74, score-0.476]

41 The task of extractive document summarization is to select a subset S ⊆ V to represent the entirety (ground set V). [sent-75, score-0.407]

42 In other words, adding k3 achieves a greater reward as it increases the diversity of the summary (by choosing from a different cluster). [sent-78, score-0.306]

43 It is easy to show that R(S) is submodular by using the composition rule from Theorem 1. [sent-80, score-0.825]

44 Inside each square root lies a modular function with non-negative weights (and thus is monotone). [sent-82, score-0.151]

45 Applying the square root to such a monotone submodular function yields a submodular function, and summing them all together retains submodularity, as mentioned in Section 2. [sent-83, score-1.824]

46 , 2009)), recently shown to be related to submodularity (Bach, 2010; Jegelka and Bilmes, 2011). [sent-89, score-0.095]

47 5 can be replaced with any other non-decreasing concave functions (e. [sent-93, score-0.21]

48 g., f(x) = log(1 + x)) while preserving the desired property of R(S), and the curvature of the concave function then determines the rate at which the reward diminishes. [sent-95, score-0.339]
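
A tiny numeric illustration of this point (ours, not the paper's): the same unit of reward mass is worth less once a cluster is already well covered, and how much less depends on the curvature of the concave function.

```python
import math

def marginal_gain(f, covered, added):
    """Gain f(covered + added) - f(covered) of adding `added` reward mass
    on top of `covered` mass already present in a cluster."""
    return f(covered + added) - f(covered)

for name, f in [("sqrt", math.sqrt), ("log1p", math.log1p)]:
    print(name,
          round(marginal_gain(f, 0.0, 1.0), 3),   # first unit of reward
          round(marginal_gain(f, 4.0, 1.0), 3))   # same unit after 4 are covered
# sqrt: 1.0 then ~0.236;  log1p: ~0.693 then ~0.182
```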

49 5 Experiments The Document Understanding Conference (DUC) (http://duc. [sent-104, score-0.21]

50 The tasks in DUC evolved from single-document summarization to multi-document summarization, and from generic summarization (2001–2004) to query-focused summarization (2005–2007). [sent-107, score-0.807]

51 In all experiments, the modified greedy algorithm (Lin and Bilmes, 2010) was used for summary generation. [sent-110, score-0.109]

52 1 Generic summarization Summarization tasks in DUC-03 and DUC-04 are multi-document summarization on English news articles. [sent-112, score-0.508]

53 For each document cluster, the system-generated summary may not be longer than 665 bytes including spaces and punctuation. [sent-114, score-0.144]

54 We first tested our coverage and diversity reward objectives separately. [sent-134, score-0.321]

55 When α = 1, L1(S) reduces to Σ_{i∈V, j∈S} w_{i,j}, which measures the overall similarity of summary set S to ground set V. [sent-138, score-0.095]

56 1, using such a similarity measurement could possibly over-concentrate on a small portion of the document and result in poor coverage of the whole document. [sent-140, score-0.113]
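
The following is a rough sketch of a truncated coverage objective in this spirit; the exact truncated form is our assumption, guided by the α = 1 reduction described above, and the similarity matrix w is invented for the example.

```python
# Hypothetical pairwise sentence similarities w[i][j] (symmetric, >= 0).
w = {
    "s1": {"s1": 1.0, "s2": 0.8, "s3": 0.1},
    "s2": {"s1": 0.8, "s2": 1.0, "s3": 0.2},
    "s3": {"s1": 0.1, "s2": 0.2, "s3": 1.0},
}
V = list(w)

def coverage(S, alpha=0.5):
    """Truncated coverage: sum_i min( sum_{j in S} w[i][j],
                                      alpha * sum_{j in V} w[i][j] ).

    With alpha = 1 this reduces to the plain overall similarity of the
    summary to the ground set; alpha < 1 caps how much any single unit i
    can contribute, so the summary cannot over-concentrate on one part of
    the document. min(modular, constant) stays monotone submodular."""
    return sum(min(sum(w[i][j] for j in S),
                   alpha * sum(w[i][j] for j in V)) for i in V)

print(coverage({"s1"}), coverage({"s1", "s3"}))
```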

57 As shown in Table 1, optimizing this objective function gives a ROUGE-1 F-measure score 32. [sent-141, score-0.117]

58 65% Figure 1: ROUGE-1 F-measure scores on DUC-03 when α and K vary in objective function L1(S) + λR1(S), where λ = 6 and α = a/N. [sent-144, score-0.117]

59 As for the diversity reward objective, we define the singleton reward as r_i = (1/N) Σ_j w_{i,j}, which is the average similarity of sentence i to the rest of the document. [sent-146, score-0.492]

60 It basically states that the more similar to the whole document a sentence is, the more reward there will be by adding this sentence to an empty summary set. [sent-147, score-0.331]

61 By using this singleton reward, we have the following diversity reward function: R1(S) = Σ_{k=1}^{K} √( Σ_{j∈S∩P_k} (1/N) Σ_{i∈V} w_{i,j} ). [sent-148, score-0.308]
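
A minimal sketch of this diversity reward, with an invented similarity matrix and clustering standing in for the real data:

```python
import math

# Invented pairwise similarities and a partition of sentences into clusters.
w = {
    "s1": {"s1": 1.0, "s2": 0.8, "s3": 0.1, "s4": 0.1},
    "s2": {"s1": 0.8, "s2": 1.0, "s3": 0.2, "s4": 0.1},
    "s3": {"s1": 0.1, "s2": 0.2, "s3": 1.0, "s4": 0.7},
    "s4": {"s1": 0.1, "s2": 0.1, "s3": 0.7, "s4": 1.0},
}
V = list(w)
N = len(V)
clusters = [{"s1", "s2"}, {"s3", "s4"}]

# Singleton reward r_j = (1/N) * sum_i w[i][j]: average similarity of
# sentence j to the whole document.
r = {j: sum(w[i][j] for i in V) / N for j in V}

def diversity_reward_R1(S):
    """R1(S) = sum_k sqrt( sum_{j in S ∩ P_k} r_j ).
    The square root saturates the reward within a cluster, so adding a
    sentence from an uncovered cluster pays more than adding another
    sentence from an already covered one."""
    return sum(math.sqrt(sum(r[j] for j in S & P_k)) for P_k in clusters)

# Choosing from two clusters earns more than two picks from one cluster.
print(diversity_reward_R1({"s1", "s3"}) > diversity_reward_R1({"s1", "s2"}))  # True
```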

62 And as we can see in Table 1, optimizing the diversity reward function alone achieves comparable performance to the DUC-04 best system. [sent-153, score-0.32]

63 Combining L1(S) and R1(S), our system outperforms the best system in DUC-04 significantly, and it also outperforms several recent systems, including a concept-based summarization approach (Takamura and Okumura, 2009), a sentence topic model based system (Wang et al. [sent-154, score-0.383]

64 , 2009), and our MMR-styled submodular system (Lin and Bilmes, 2010). [sent-155, score-0.821]

65 2 Query-focused summarization We evaluated our approach on the task of query-focused summarization using DUC 05-07 data. [sent-172, score-0.54]

66 In DUC-05 and DUC-06, participants were given 50 document clusters, where each cluster contains 25 news articles related to the same topic. [sent-173, score-0.106]

67 We used both the title and the narrative as query, where stop words, including some function words (e. [sent-181, score-0.094]

68 Note that there are several ways to incorporate query-focused information into both the coverage and diversity reward objectives. [sent-185, score-0.295]

69 Also, the coefficient α could be query and sentence dependent, where it takes a larger value when a sentence is more relevant to the query (i. [sent-187, score-0.19]

70 Similarly, sentence clustering and singleton rewards in the diversity function can also [sent-190, score-0.304]

71 In this experiment, we explore an objective with a query-independent coverage function (R1(S)), indicating prior importance, combined with a query-dependent diversity reward function, where the latter is defined as: RQ(S) = Σ_{k=1}^{K} √( Σ_{j∈S∩P_k} ( (β/N) Σ_{i∈V} w_{i,j} + (1 − β) r_{j,Q} ) ). [sent-210, score-0.412]

72 where 0 ≤ β ≤ 1, and r_{j,Q} represents the relevance between sentence j and query Q. [sent-211, score-0.106]

73 This query-dependent reward function is derived by using a singleton reward that is expressed as a convex combination of the query-independent score ((1/N) Σ_{i∈V} w_{i,j}) and the query-dependent score (r_{j,Q}) of a sentence. [sent-212, score-0.44]
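
A sketch of that convex combination (invented similarities, clusters, and query relevance scores; β is a free parameter):

```python
import math

# Invented data: pairwise similarities w, query relevance r_query[j], clusters.
w = {
    "s1": {"s1": 1.0, "s2": 0.8, "s3": 0.1},
    "s2": {"s1": 0.8, "s2": 1.0, "s3": 0.2},
    "s3": {"s1": 0.1, "s2": 0.2, "s3": 1.0},
}
V = list(w)
N = len(V)
r_query = {"s1": 0.9, "s2": 0.1, "s3": 0.6}   # relevance of sentence j to query Q
clusters = [{"s1", "s2"}, {"s3"}]

def query_reward(S, beta=0.7):
    """RQ(S) = sum_k sqrt( sum_{j in S ∩ P_k}
                 beta * (1/N) * sum_i w[i][j] + (1 - beta) * r_query[j] ).
    The singleton reward is a convex combination of a query-independent
    score (average similarity to the document) and a query-dependent one."""
    def singleton(j):
        return beta * sum(w[i][j] for i in V) / N + (1 - beta) * r_query[j]
    return sum(math.sqrt(sum(singleton(j) for j in S & P_k)) for P_k in clusters)

print(query_reward({"s1", "s3"}), query_reward({"s2", "s3"}))
```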

74 To better estimate the relevance between query and sentences, we further expanded sentences with synonyms and hypernyms of their constituent words. [sent-217, score-0.161]

75 In particular, part-of-speech tags were obtained for each sentence using the maximum entropy part-of-speech tagger (Ratnaparkhi, 1996), and all nouns were then expanded with their synonyms and hypernyms using WordNet (Fellbaum, 1998). [sent-218, score-0.101]
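
A rough sketch of this kind of noun expansion, using NLTK's tagger and WordNet interface rather than the exact tools used in the paper (Ratnaparkhi's maximum entropy tagger); treat it as illustrative only.

```python
import nltk
from nltk.corpus import wordnet as wn

# One-time downloads (uncomment on first run):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger"); nltk.download("wordnet")

def expand_nouns(sentence):
    """Return the sentence tokens plus WordNet synonyms and hypernyms of
    every noun, a rough stand-in for the expansion step described above."""
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)
    expanded = list(tokens)
    for word, tag in tagged:
        if not tag.startswith("NN"):        # expand only nouns
            continue
        for synset in wn.synsets(word, pos=wn.NOUN):
            expanded.extend(l.name() for l in synset.lemmas())       # synonyms
            for hyper in synset.hypernyms():                         # hypernyms
                expanded.extend(l.name() for l in hyper.lemmas())
    return expanded

print(expand_nouns("The summary covers the document collection."))
```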

76 Note that these expanded documents were only used in the estimation of r_{j,Q}, and we plan to further explore whether there is a benefit to using the expanded documents either in sentence similarity estimation or in sentence clustering in our future work. [sent-219, score-0.187]

77 We also tried to expand the query with synonyms and observed a performance decrease, presumably due to noisy information in a query expression. [sent-220, score-0.147]

78 , 2007) to learn the coefficients in our objective function, we trained all coefficients to maximize ROUGE-2 F-measure score using the Nelder-Mead (derivative-free) method. [sent-222, score-0.096]
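
A minimal sketch of derivative-free coefficient tuning in that style, using scipy's Nelder-Mead with a stand-in evaluation function in place of the actual summarizer and ROUGE-2 scorer:

```python
from scipy.optimize import minimize

def rouge2_f_for_coefficients(coeffs):
    """Placeholder: in the real setup this would run the summarizer with
    the given objective coefficients on the training documents, score the
    output against reference summaries, and return the mean ROUGE-2
    F-measure. Here we fake it with a smooth function so the example runs."""
    lam, alpha = coeffs
    return -((lam - 6.0) ** 2 + (alpha - 0.5) ** 2)   # peak at lam=6, alpha=0.5

# Nelder-Mead is derivative-free, which fits: the ROUGE score of a greedy
# summarizer is not differentiable in the objective's coefficients.
result = minimize(lambda c: -rouge2_f_for_coefficients(c),
                  x0=[1.0, 0.1], method="Nelder-Mead")
print(result.x)   # approximately [6.0, 0.5]
```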

79 Using L1(S) λRQ (S) as the objective and with the same sentence clustering algorithm as in the generic summarization experiment (K = 0. [sent-223, score-0.407]

80 2N), our system, + when both trained and tested on DUC-05 (results in Table 2), outperforms the Bayesian query-focused summarization approach and the search-based structured prediction approach, which were also trained and tested on DUC-05 (Daume´ et al. [sent-224, score-0.329]

81 24% in ROUGE-2 recall) is a so called “vine-growth” system, which can be seen as an abstractive approach, whereas our system is purely an extractive system. [sent-228, score-0.1]

82 Comparing to the extractive system in (Daume´ et al. [sent-229, score-0.1]

83 , 2007), a state-of-the-art supervised summarization system, as well as two recent systems including a generative summarization system based on topic models (Haghighi and Vanderwende, 2009), and a hybrid hierarchical summarization system (Celikyilmaz and Hakkani-Tür, 2010). [sent-241, score-0.802]

84 To further improve the performance of our system, we used both DUC-05 and DUC-06 as a training set, and introduced three diversity reward terms into the objective where three different sentence clusterings with different resolutions were produced (with sizes 0. [sent-248, score-0.33]

85 Denoting a diversity reward corresponding to clustering κ as RQ,κ(S), we model the summary quality as L1(S) + Σ_{κ=1}^{3} λκ RQ,κ(S). [sent-252, score-0.339]
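
A sketch of how such a multi-resolution objective can be assembled; the clusterings, coefficients, singleton rewards, and coverage stand-in below are all placeholders.

```python
import math

# Placeholder: three clusterings of the same sentences at different resolutions.
clusterings = [
    [{"s1", "s2", "s3", "s4"}],                      # coarse: one cluster
    [{"s1", "s2"}, {"s3", "s4"}],                    # medium
    [{"s1"}, {"s2"}, {"s3"}, {"s4"}],                # fine
]
lambdas = [1.0, 2.0, 3.0]                            # made-up trade-off coefficients
r = {"s1": 0.5, "s2": 0.5, "s3": 0.5, "s4": 0.5}     # placeholder singleton rewards

def coverage_L1(S):
    return 0.0      # stand-in for the coverage term L1(S) sketched earlier

def objective(S):
    """L1(S) + sum_kappa lambda_kappa * R_{Q,kappa}(S): a nonnegative
    weighted sum of submodular terms, hence itself submodular."""
    total = coverage_L1(S)
    for lam, clustering in zip(lambdas, clusterings):
        total += lam * sum(math.sqrt(sum(r[j] for j in S & P)) for P in clustering)
    return total

print(objective({"s1", "s3"}))
```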

86 As shown in Table 5, using this objective function with multi-resolution diversity rewards improves our results further, and outperforms the best system in DUC-07 in terms of ROUGE-2 F-measure score. [sent-253, score-0.202]

87 6 Conclusion and discussion In this paper, we show that submodularity naturally arises in document summarization. [sent-254, score-0.168]

88 Not only do many existing automatic summarization methods correspond to submodular function optimization, but also the widely used ROUGE evaluation is closely related to submodular functions. [sent-255, score-1.921]

89 As the corresponding submodular optimization problem can be solved efficiently and effectively, the remaining question is then how to design a submodular objective that best models the task. [sent-256, score-1.698]

90 To address this problem, we introduce a powerful class of monotone submodular functions that are well suited to document summarization by modeling two important properties of a summary, fidelity and diversity. [sent-257, score-1.349]

91 While more advanced NLP techniques could be easily incorporated into our functions (e. [sent-258, score-0.123]

92 , language models could define a better Ci (S), more advanced relevance estimations for the singleton rewards ri, and better and/or overlapping clustering algorithms for our diversity reward), we already show top results on standard benchmark evaluations using fairly basic NLP methods (e. [sent-260, score-0.237]

93 , term weighting and WordNet expansion), all, we believe, thanks to the power and generality of submodular functions. [sent-262, score-0.822]

94 Submodularity beyond submodular energies: coupling edges in graph cuts. [sent-326, score-0.801]

95 What energy functions can be minimized via graph cuts? [sent-346, score-0.123]

96 An analysis of approximations for maximizing submodular set functions I. [sent-432, score-0.958]

97 A note on maximizing a submodular set function subject to a knapsack constraint. [sent-471, score-0.932]

98 Text summarization model based on maximum coverage problem and its variant. [sent-477, score-0.294]

99 The PYTHY summarization system: Microsoft research at DUC 2007. [sent-488, score-0.254]

100 An analysis of the greedy algorithm for the submodular set covering problem. [sent-503, score-0.859]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('submodular', 0.801), ('summarization', 0.254), ('reward', 0.161), ('duc', 0.137), ('functions', 0.123), ('bilmes', 0.115), ('monotone', 0.098), ('nondecreasing', 0.095), ('submodularity', 0.095), ('diversity', 0.094), ('concave', 0.087), ('extractive', 0.08), ('krause', 0.079), ('rq', 0.078), ('document', 0.073), ('daume', 0.068), ('function', 0.065), ('narasimhan', 0.063), ('query', 0.062), ('greedy', 0.058), ('sopt', 0.054), ('singleton', 0.053), ('objective', 0.052), ('summary', 0.051), ('celikyilmaz', 0.048), ('generic', 0.045), ('optimization', 0.044), ('ground', 0.044), ('rouge', 0.043), ('maximization', 0.043), ('pi', 0.042), ('electrical', 0.041), ('pingali', 0.041), ('coverage', 0.04), ('square', 0.039), ('lin', 0.039), ('kx', 0.039), ('lova', 0.036), ('nemhauser', 0.036), ('jagarlamudi', 0.036), ('rewards', 0.036), ('takamura', 0.035), ('maximizing', 0.034), ('cluster', 0.033), ('clustering', 0.033), ('budgeted', 0.032), ('jegelka', 0.032), ('knapsack', 0.032), ('pythy', 0.032), ('queryfocused', 0.032), ('expanded', 0.031), ('narrative', 0.029), ('kolmogorov', 0.029), ('riedhammer', 0.029), ('truncation', 0.029), ('uc', 0.028), ('sd', 0.028), ('guestrin', 0.027), ('kempe', 0.027), ('keyphrase', 0.027), ('modular', 0.027), ('operations', 0.027), ('filatova', 0.026), ('porter', 0.026), ('tested', 0.026), ('property', 0.026), ('ifs', 0.025), ('monotonicity', 0.025), ('qazvinian', 0.025), ('solution', 0.024), ('composition', 0.024), ('toutanova', 0.024), ('hypernyms', 0.024), ('okumura', 0.024), ('synonyms', 0.023), ('pk', 0.023), ('sentence', 0.023), ('documents', 0.023), ('outperforms', 0.023), ('convex', 0.022), ('sin', 0.022), ('recall', 0.022), ('coefficients', 0.022), ('pj', 0.022), ('carbonell', 0.021), ('generality', 0.021), ('hyderabad', 0.021), ('norms', 0.021), ('theorem', 0.021), ('relevance', 0.021), ('tp', 0.021), ('ci', 0.021), ('root', 0.02), ('ix', 0.02), ('sj', 0.02), ('system', 0.02), ('value', 0.02), ('vision', 0.02), ('xs', 0.019), ('locations', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 4 acl-2011-A Class of Submodular Functions for Document Summarization

Author: Hui Lin ; Jeff Bilmes

Abstract: We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and submodular, which means that an efficient scalable greedy optimization scheme has a constant factor guarantee of optimality. When evaluated on DUC 2004-2007 corpora, we obtain better than existing state-of-art results in both generic and query-focused document summarization. Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization.

2 0.49651638 340 acl-2011-Word Alignment via Submodular Maximization over Matroids

Author: Hui Lin ; Jeff Bilmes

Abstract: We cast the word alignment problem as maximizing a submodular function under matroid constraints. Our framework is able to express complex interactions between alignment components while remaining computationally efficient, thanks to the power and generality of submodular functions. We show that submodularity naturally arises when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates compared to conventional matching based approaches.

3 0.1744159 255 acl-2011-Query Snowball: A Co-occurrence-based Approach to Multi-document Summarization for Question Answering

Author: Hajime Morita ; Tetsuya Sakai ; Manabu Okumura

Abstract: We propose a new method for query-oriented extractive multi-document summarization. To enrich the information need representation of a given query, we build a co-occurrence graph to obtain words that augment the original query terms. We then formulate the summarization problem as a Maximum Coverage Problem with Knapsack Constraints based on word pairs rather than single words. Our experiments with the NTCIR ACLIA question answering test collections show that our method achieves a pyramid F3-score of up to 0.3 13, a 36% improvement over a baseline using Maximal Marginal Relevance. 1

4 0.14990728 21 acl-2011-A Pilot Study of Opinion Summarization in Conversations

Author: Dong Wang ; Yang Liu

Abstract: This paper presents a pilot study of opinion summarization on conversations. We create a corpus containing extractive and abstractive summaries of speaker’s opinion towards a given topic using 88 telephone conversations. We adopt two methods to perform extractive summarization. The first one is a sentence-ranking method that linearly combines scores measured from different aspects including topic relevance, subjectivity, and sentence importance. The second one is a graph-based method, which incorporates topic and sentiment information, as well as additional information about sentence-to-sentence relations extracted based on dialogue structure. Our evaluation results show that both methods significantly outperform the baseline approach that extracts the longest utterances. In particular, we find that incorporating dialogue structure in the graph-based method contributes to the improved system performance.

5 0.12798893 251 acl-2011-Probabilistic Document Modeling for Syntax Removal in Text Summarization

Author: William M. Darling ; Fei Song

Abstract: Statistical approaches to automatic text summarization based on term frequency continue to perform on par with more complex summarization methods. To compute useful frequency statistics, however, the semantically important words must be separated from the low-content function words. The standard approach of using an a priori stopword list tends to result in both undercoverage, where syntactical words are seen as semantically relevant, and overcoverage, where words related to content are ignored. We present a generative probabilistic modeling approach to building content distributions for use with statistical multi-document summarization where the syntax words are learned directly from the data with a Hidden Markov Model and are thereby deemphasized in the term frequency statistics. This approach is compared to both a stopword-list and POS-tagging approach and our method demonstrates improved coverage on the DUC 2006 and TAC 2010 datasets using the ROUGE metric.

6 0.12173167 270 acl-2011-SciSumm: A Multi-Document Summarization System for Scientific Articles

7 0.11210857 187 acl-2011-Jointly Learning to Extract and Compress

8 0.10511693 326 acl-2011-Using Bilingual Information for Cross-Language Document Summarization

9 0.10148533 98 acl-2011-Discovery of Topically Coherent Sentences for Extractive Summarization

10 0.095433734 76 acl-2011-Comparative News Summarization Using Linear Programming

11 0.094610006 18 acl-2011-A Latent Topic Extracting Method based on Events in a Document and its Application

12 0.088385694 47 acl-2011-Automatic Assessment of Coverage Quality in Intelligence Reports

13 0.07457754 149 acl-2011-Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation

14 0.073704109 308 acl-2011-Towards a Framework for Abstractive Summarization of Multimodal Documents

15 0.065126732 256 acl-2011-Query Weighting for Ranking Model Adaptation

16 0.063266955 156 acl-2011-IMASS: An Intelligent Microblog Analysis and Summarization System

17 0.062388349 201 acl-2011-Learning From Collective Human Behavior to Introduce Diversity in Lexical Choice

18 0.059291672 280 acl-2011-Sentence Ordering Driven by Local and Global Coherence for Summary Generation

19 0.057402484 71 acl-2011-Coherent Citation-Based Summarization of Scientific Papers

20 0.05225873 137 acl-2011-Fine-Grained Class Label Markup of Search Queries


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.133), (1, 0.052), (2, -0.045), (3, 0.108), (4, -0.073), (5, -0.074), (6, -0.103), (7, 0.138), (8, 0.042), (9, 0.003), (10, 0.013), (11, 0.055), (12, -0.123), (13, 0.05), (14, -0.231), (15, -0.032), (16, 0.057), (17, 0.021), (18, -0.026), (19, 0.057), (20, -0.076), (21, -0.09), (22, 0.074), (23, 0.022), (24, -0.051), (25, -0.059), (26, 0.069), (27, -0.171), (28, 0.119), (29, -0.195), (30, 0.042), (31, -0.073), (32, 0.148), (33, -0.115), (34, 0.083), (35, 0.254), (36, -0.127), (37, 0.0), (38, 0.043), (39, -0.05), (40, -0.066), (41, -0.084), (42, 0.033), (43, -0.04), (44, 0.151), (45, 0.304), (46, -0.067), (47, 0.044), (48, -0.123), (49, -0.071)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93595803 4 acl-2011-A Class of Submodular Functions for Document Summarization

Author: Hui Lin ; Jeff Bilmes

Abstract: We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and submodular, which means that an efficient scalable greedy optimization scheme has a constant factor guarantee of optimality. When evaluated on DUC 2004-2007 corpora, we obtain better than existing state-of-art results in both generic and query-focused document summarization. Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization.

2 0.73462987 340 acl-2011-Word Alignment via Submodular Maximization over Matroids

Author: Hui Lin ; Jeff Bilmes

Abstract: We cast the word alignment problem as maximizing a submodular function under matroid constraints. Our framework is able to express complex interactions between alignment components while remaining computationally efficient, thanks to the power and generality of submodular functions. We show that submodularity naturally arises when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates compared to conventional matching based approaches.

3 0.55145401 255 acl-2011-Query Snowball: A Co-occurrence-based Approach to Multi-document Summarization for Question Answering

Author: Hajime Morita ; Tetsuya Sakai ; Manabu Okumura

Abstract: We propose a new method for query-oriented extractive multi-document summarization. To enrich the information need representation of a given query, we build a co-occurrence graph to obtain words that augment the original query terms. We then formulate the summarization problem as a Maximum Coverage Problem with Knapsack Constraints based on word pairs rather than single words. Our experiments with the NTCIR ACLIA question answering test collections show that our method achieves a pyramid F3-score of up to 0.3 13, a 36% improvement over a baseline using Maximal Marginal Relevance. 1

4 0.48342305 187 acl-2011-Jointly Learning to Extract and Compress

Author: Taylor Berg-Kirkpatrick ; Dan Gillick ; Dan Klein

Abstract: We learn a joint model of sentence extraction and compression for multi-document summarization. Our model scores candidate summaries according to a combined linear model whose features factor over (1) the n-gram types in the summary and (2) the compressions used. We train the model using a marginbased objective whose loss captures end summary quality. Because of the exponentially large set of candidate summaries, we use a cutting-plane algorithm to incrementally detect and add active constraints efficiently. Inference in our model can be cast as an ILP and thereby solved in reasonable time; we also present a fast approximation scheme which achieves similar performance. Our jointly extracted and compressed summaries outperform both unlearned baselines and our learned extraction-only system on both ROUGE and Pyramid, without a drop in judged linguistic quality. We achieve the highest published ROUGE results to date on the TAC 2008 data set.

5 0.45899725 270 acl-2011-SciSumm: A Multi-Document Summarization System for Scientific Articles

Author: Nitin Agarwal ; Ravi Shankar Reddy ; Kiran GVR ; Carolyn Penstein Rose

Abstract: In this demo, we present SciSumm, an interactive multi-document summarization system for scientific articles. The document collection to be summarized is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic based clustering of fragments extracted from each article based on queries generated from the context surrounding the co-cited list of papers. This analysis enables the generation of an overview of common themes from the co-cited papers that relate to the context in which the co-citation was found. SciSumm is currently built over the 2008 ACL Anthology, however the gen- eralizable nature of the summarization techniques and the extensible architecture makes it possible to use the system with other corpora where a citation network is available. Evaluation results on the same corpus demonstrate that our system performs better than an existing widely used multi-document summarization system (MEAD).

6 0.41783467 308 acl-2011-Towards a Framework for Abstractive Summarization of Multimodal Documents

7 0.38850603 21 acl-2011-A Pilot Study of Opinion Summarization in Conversations

8 0.37882766 251 acl-2011-Probabilistic Document Modeling for Syntax Removal in Text Summarization

9 0.37197077 201 acl-2011-Learning From Collective Human Behavior to Introduce Diversity in Lexical Choice

10 0.36560854 47 acl-2011-Automatic Assessment of Coverage Quality in Intelligence Reports

11 0.32267648 326 acl-2011-Using Bilingual Information for Cross-Language Document Summarization

12 0.31392536 76 acl-2011-Comparative News Summarization Using Linear Programming

13 0.30844852 291 acl-2011-SystemT: A Declarative Information Extraction System

14 0.28043804 98 acl-2011-Discovery of Topically Coherent Sentences for Extractive Summarization

15 0.26614353 149 acl-2011-Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation

16 0.26547015 51 acl-2011-Automatic Headline Generation using Character Cross-Correlation

17 0.25287145 18 acl-2011-A Latent Topic Extracting Method based on Events in a Document and its Application

18 0.23716846 335 acl-2011-Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity

19 0.23288797 102 acl-2011-Does Size Matter - How Much Data is Required to Train a REG Algorithm?

20 0.23020585 256 acl-2011-Query Weighting for Ranking Model Adaptation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.025), (17, 0.045), (25, 0.078), (26, 0.021), (37, 0.074), (39, 0.049), (41, 0.068), (50, 0.168), (55, 0.03), (59, 0.045), (72, 0.021), (76, 0.021), (91, 0.029), (96, 0.209), (97, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.92723441 89 acl-2011-Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity

Author: Tony Veale

Abstract: Information retrieval (IR) and figurative language processing (FLP) could scarcely be more different in their treatment of language and meaning. IR views language as an open-ended set of mostly stable signs with which texts can be indexed and retrieved, focusing more on a text’s potential relevance than its potential meaning. In contrast, FLP views language as a system of unstable signs that can be used to talk about the world in creative new ways. There is another key difference: IR is practical, scalable and robust, and in daily use by millions of casual users. FLP is neither scalable nor robust, and not yet practical enough to migrate beyond the lab. This paper thus presents a mutually beneficial hybrid of IR and FLP, one that enriches IR with new operators to enable the non-literal retrieval of creative expressions, and which also transplants FLP into a robust, scalable framework in which practical applications of linguistic creativity can be implemented. 1

2 0.87961841 284 acl-2011-Simple Unsupervised Grammar Induction from Raw Text with Cascaded Finite State Models

Author: Elias Ponvert ; Jason Baldridge ; Katrin Erk

Abstract: We consider a new subproblem of unsupervised parsing from raw text, unsupervised partial parsing—the unsupervised version of text chunking. We show that addressing this task directly, using probabilistic finite-state methods, produces better results than relying on the local predictions of a current best unsupervised parser, Seginer’s (2007) CCL. These finite-state models are combined in a cascade to produce more general (full-sentence) constituent structures; doing so outperforms CCL by a wide margin in unlabeled PARSEVAL scores for English, German and Chinese. Finally, we address the use of phrasal punctuation as a heuristic indicator of phrasal boundaries, both in our system and in CCL.

3 0.87913489 187 acl-2011-Jointly Learning to Extract and Compress

Author: Taylor Berg-Kirkpatrick ; Dan Gillick ; Dan Klein

Abstract: We learn a joint model of sentence extraction and compression for multi-document summarization. Our model scores candidate summaries according to a combined linear model whose features factor over (1) the n-gram types in the summary and (2) the compressions used. We train the model using a marginbased objective whose loss captures end summary quality. Because of the exponentially large set of candidate summaries, we use a cutting-plane algorithm to incrementally detect and add active constraints efficiently. Inference in our model can be cast as an ILP and thereby solved in reasonable time; we also present a fast approximation scheme which achieves similar performance. Our jointly extracted and compressed summaries outperform both unlearned baselines and our learned extraction-only system on both ROUGE and Pyramid, without a drop in judged linguistic quality. We achieve the highest published ROUGE results to date on the TAC 2008 data set.

same-paper 4 0.87242281 4 acl-2011-A Class of Submodular Functions for Document Summarization

Author: Hui Lin ; Jeff Bilmes

Abstract: We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and submodular, which means that an efficient scalable greedy optimization scheme has a constant factor guarantee of optimality. When evaluated on DUC 2004-2007 corpora, we obtain better than existing state-of-art results in both generic and query-focused document summarization. Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization.

5 0.83939445 340 acl-2011-Word Alignment via Submodular Maximization over Matroids

Author: Hui Lin ; Jeff Bilmes

Abstract: We cast the word alignment problem as maximizing a submodular function under matroid constraints. Our framework is able to express complex interactions between alignment components while remaining computationally efficient, thanks to the power and generality of submodular functions. We show that submodularity naturally arises when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates compared to conventional matching based approaches.

6 0.80135745 262 acl-2011-Relation Guided Bootstrapping of Semantic Lexicons

7 0.791587 137 acl-2011-Fine-Grained Class Label Markup of Search Queries

8 0.7913661 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

9 0.79111713 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

10 0.79096204 11 acl-2011-A Fast and Accurate Method for Approximate String Search

11 0.79007411 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

12 0.78791946 175 acl-2011-Integrating history-length interpolation and classes in language modeling

13 0.78751659 18 acl-2011-A Latent Topic Extracting Method based on Events in a Document and its Application

14 0.78709412 117 acl-2011-Entity Set Expansion using Topic information

15 0.78698361 251 acl-2011-Probabilistic Document Modeling for Syntax Removal in Text Summarization

16 0.7867654 240 acl-2011-ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation

17 0.78614312 318 acl-2011-Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models

18 0.78552645 286 acl-2011-Social Network Extraction from Texts: A Thesis Proposal

19 0.7852385 53 acl-2011-Automatically Evaluating Text Coherence Using Discourse Relations

20 0.78520012 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering