acl acl2013 acl2013-23 knowledge-graph by maker-knowledge-mining

23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords


Source: pdf

Author: Rahul Jha ; Amjad Abu-Jbara ; Dragomir Radev

Abstract: In this paper, we investigate the problem of automatic generation of scientific surveys starting from keywords provided by a user. We present a system that can take a topic query as input and generate a survey of the topic by first selecting a set of relevant documents, and then selecting relevant sentences from those documents. We discuss the issues of robust evaluation of such systems and describe an evaluation corpus we generated by manually extracting factoids, or information units, from 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing. We have manually annotated 2,625 sentences with these factoids (around 375 sentences per topic) to build an evaluation corpus for this task. We present evaluation results for the performance of our system using this annotated data.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A System for Summarizing Scientific Topics Starting from Keywords Rahul Jha Department of EECS University of Michigan Ann Arbor, MI, USA rahuljha@umich.edu [sent-1, score-0.152]

2 Abstract In this paper, we investigate the problem of automatic generation of scientific surveys starting from keywords provided by a user. [sent-2, score-0.565]

3 We present a system that can take a topic query as input and generate a survey of the topic by first selecting a set of relevant documents, and then selecting relevant sentences from those documents. [sent-3, score-0.63]

4 We discuss the issues of robust evaluation of such systems and describe an evaluation corpus we generated by manually extracting factoids, or information units, from 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing. [sent-4, score-0.268]

5 We have manually annotated 2,625 sentences with these factoids (around 375 sentences per topic) to build an evaluation corpus for this task. [sent-5, score-0.572]

6 1 Introduction The rising number of publications in all scientific fields is making it increasingly difficult to get quickly acquainted with the new developments in a new area. [sent-7, score-0.235]

7 One way to wade through this huge amount of scholarly information is to consult topical surveys written by experts in an area. [sent-8, score-0.319]

8 Such surveys can be very helpful when available, but unfortunately, may not be available for all areas. [sent-10, score-0.319]

9 Additionally, the manual Amjad Abu-Jbara Department of EECS University of Michigan Ann Arbor, MI, USA amjbara@umich.edu [sent-11, score-0.039]

10 Dragomir Radev Department of EECS and School of Information University of Michigan Ann Arbor, MI, USA radev@umich.edu [sent-12, score-0.039]

11 Thus, a system that can generate such surveys automatically would be a useful tool. [sent-13, score-0.354]

12 Short summaries in the form of abstracts are available for individual papers, but no such information is available for scientific topics. [sent-14, score-0.247]

13 In this paper, we explore strategies for generating and evaluating such surveys of scientific topics automatically starting from a phrase representing a topic area. [sent-15, score-0.772]

14 We evaluate our system on a set of topics in the field of Natural Language Processing. [sent-16, score-0.101]

15 In earlier work, Teufel and Moens (2002) examined the problem of summarizing scientific articles using rhetorical analysis of sentences. [sent-17, score-0.291]

16 Nanba and Okumura (1999) have also discussed the problem of generating surveys of multiple papers. [sent-18, score-0.319]

17 Mohammad et al. (2009) presented experiments on generating surveys of scientific topics starting from papers to be summarized. [sent-20, score-0.856]

18 More recently, Hoang and Kan (2010) have presented initial results on automatically generating the related work section for a target paper by taking a hierarchical topic tree as input. [sent-21, score-0.138]

19 In this paper, we tackle the more challenging problem of summarizing a topic starting from a topic query. [sent-22, score-0.375]

20 Our system takes as input a string describing the topic area, selects the relevant papers from a corpus of papers, and then selects from the sentences citing these papers to generate a survey of the topic. [sent-23, score-1.157]

21 A sample output of our system for the topic of “Word Sense Disambiguation” is shown in Figure 1. [sent-24, score-0.138]

22 2 Candidate Document Selection Given a query representing the topic to be summarized, our first task is to find the set of relevant documents from the corpus. [sent-25, score-0.308]

23 The simplest way to do this for a corpus of scientific publications ... [sent-26, score-0.166]

24 ... rank in the bibliometric citation network, and select ... [sent-28, score-0.093]

25 Most researchers working on word sense disambiguation (WSD) use manually sense tagged data such as SemCor (Miller et al. [sent-39, score-0.307]

26 , 1993) to train statistical classifiers, but also use the information in SemCor on the overall sense distribution for each word as a backoff model. [sent-40, score-0.104]

27 Yarowsky (1995) has proposed a bootstrapping method for word sense disambiguation. [sent-41, score-0.104]

28 For example, the use of parallel corpora for sense tagging can help with word sense disambiguation (Brown et al. [sent-44, score-0.307]

29 Figure 1: A sample output survey of our system on the topic of “Word Sense Disambiguation” produced by paper selection using Restricted Expansion and sentence selection using Lexrank. [sent-46, score-0.396]

30 In our evaluations, this survey achieved a pyramid score of 0.9. [sent-47, score-0.182]

31 We evaluate document selection by measuring the Cumulative Gain (CG) of top 5, 10 and 20 results. [sent-53, score-0.144]

32 Comparing the results of these techniques with the papers covered by gold standard surveys on a few topics, we found that some important papers are missed by these simple approaches. [sent-54, score-0.836]

33 One reason for this is that early papers in a field might use non-standard terms in the absence of a stable, accepted terminology. [sent-55, score-0.222]

34 Additionally, papers might use alternative forms or abbreviations of topics in their titles and abstracts, e.g. [sent-57, score-0.323]

35 for input query “Semantic Role Labelling”, papers such as (Dahlmeier et al. [sent-59, score-0.319]

36 , 2009) titled “Joint Learning of Preposition Senses and Semantic Roles of Prepositional Phrases” and (Che and Liu, 2010) titled “Jointly Modeling WSD and SRL with Markov Logic” might be missed. [sent-60, score-0.108]

37 In this method, which we call Restricted Expansion, we first create a base set B by finding papers with an exact match to the query. [sent-62, score-0.222]

38 This is a high-precision set, since a paper whose title contains the exact query phrase is very likely to be relevant to the topic. [sent-63, score-0.095]

39 We then find additional papers by expanding in the citation network around B, that is, by finding all the papers that are cited by or cite the papers in B, to create an extended set E. [sent-64, score-0.969]

40 From this combined set (B ∪ E), we create a new set F by filtering out the papers that neither cite nor are cited by at least a threshold tinit of papers in B. [sent-65, score-0.631]

41 If the total number of papers is lower than fmin or higher than fmax, we iteratively increase or decrease the threshold t until fmin ≤ |F| ≤ fmax. [sent-66, score-0.42]

42 The values for our current experiments are: tinit = 5, fmin = 150, fmax = 250. [sent-68, score-0.223]
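
As a concrete reading of this procedure, here is a minimal sketch of Restricted Expansion. The corpus accessors `title_match` (exact-phrase title search) and `neighbors` (papers that cite or are cited by a given paper) are hypothetical stand-ins for whatever index is available, and the iteration cap is an added safety guard rather than part of the paper's description.

```python
def restricted_expansion(title_match, neighbors, query,
                         t_init=5, f_min=150, f_max=250):
    """Sketch of Restricted Expansion candidate-document selection.

    title_match(query): papers whose titles contain the exact query phrase.
    neighbors(paper): set of papers that cite or are cited by `paper`.
    Both are hypothetical corpus accessors.
    """
    base = set(title_match(query))              # high-precision base set B
    extended = set()                            # extended set E
    for paper in base:
        extended |= neighbors(paper)
    candidates = base | extended                # B ∪ E

    t = t_init
    for _ in range(100):                        # guard against oscillation
        # keep papers that cite or are cited by at least t papers in B
        filtered = {p for p in candidates if len(neighbors(p) & base) >= t}
        if len(filtered) > f_max:
            t += 1                              # too many papers: tighten
        elif len(filtered) < f_min and t > 1:
            t -= 1                              # too few papers: loosen
        else:
            break
    return filtered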

43 ... collected for the topic of “Word Sense Disambiguation”. [sent-69, score-0.138]

44 Sizes for surveys are expressed in number of pages; sizes for tutorials, in number of slides. [sent-70, score-0.44]

45 To evaluate different methods of candidate document selection, we use Cumulative Gain (CG), where the weight for each paper is estimated by the fraction of surveys it appears in. [sent-71, score-0.374]

46 Table 1 shows the average Cumulative Gain of top 5, 10 and 20 documents for each of eight methods we tried. [sent-72, score-0.075]
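
Cumulative Gain here reduces to a weighted sum over the top-k retrieved papers, with each weight the fraction of gold surveys covering that paper. A small sketch under that weighting (names are illustrative; each gold survey is represented as the set of papers it covers):

```python
def paper_weight(paper, gold_surveys):
    """Weight of a paper: the fraction of gold-standard surveys covering it."""
    return sum(paper in survey for survey in gold_surveys) / len(gold_surveys)

def cumulative_gain(ranked_papers, gold_surveys, k):
    """CG@k: total weight of the top-k papers returned by a selection method."""
    return sum(paper_weight(p, gold_surveys) for p in ranked_papers[:k])

# Table 1 then averages cumulative_gain(ranked, surveys, k) for k in (5, 10, 20).
```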

47 Once we obtain a set of papers, we select the top n most cited papers in the document set as the papers to be summarized, and extract the set S of sentences from all the papers in the document set that cite these n papers. [sent-74, score-1.344]

48 S is the input for our sentence selection algorithms, described in Section 4. [sent-75, score-0.129]

49 We built a factoid inventory for seven topics in NLP based on manually written surveys in the following way. [sent-79, score-0.612]

50 For each topic, we found at least 3 recent tutorials and 3 recent surveys on the topic and extracted the factoids that are covered in each of them. [sent-80, score-1.017]

51 Table 2 shows the complete list of material collected for the topic of “Word Sense Disambiguation”. [sent-81, score-0.138]

52 We found around 80 factoids per topic on average. [sent-82, score-0.543]

53 Once the factoids were extracted, each factoid was assigned a weight based on the number of documents it appears in, and any factoids with weight one were removed. [sent-83, score-1.024]
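
This weighting step is a simple document-frequency count over the gold documents; a minimal sketch, assuming each survey or tutorial is represented as a set of factoid IDs:

```python
from collections import Counter

def factoid_weights(gold_docs):
    """gold_docs: one set of factoid IDs per gold survey/tutorial.

    A factoid's weight is the number of gold documents it appears in;
    factoids with weight one are removed, as described above.
    """
    counts = Counter(f for doc in gold_docs for f in doc)
    return {f: w for f, w in counts.items() if w > 1}
```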

54 Table 3 shows the top ten factoids in the topic of Word Sense Disambiguation along with their distribution across the different surveys and tutorials and final weight. [sent-84, score-0.983]

55 For each of the topics, we used the method described in Section 2 to create a candidate document set and extracted the candidate citing sentences to be used as the input for the content selection component. [sent-85, score-0.419]

56 Each sentence in each topic was then annotated by a human judge against the factoid list for that topic. [sent-86, score-0.312]

57 On average, 375 citing sentences were annotated for each topic, with 2,625 sentences being annotated in total. [sent-89, score-0.38]

58 4 Content Models Once we have the set of input sentences, our system must select the sentences that should be part of the survey. [sent-91, score-0.141]

59 For this task, we experimented with three content models, described below. [sent-95, score-0.057]

60 1 Centroid The centroid of a set of documents is a set of words that are statistically important to the cluster of documents. [sent-97, score-0.229]

61 Centroid based summarization of a document set involves first creating the centroid of the documents, and then judging the salience of each document based on its similarity to the centroid of the document set. [sent-98, score-0.601]

62 In our case, the input citing sentences represent the documents from which we extract the centroid. [sent-99, score-0.359]

63 We use the centroid implementation from the publicly available summarization toolkit, MEAD (Radev et al., 2004). [sent-100, score-0.224]
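
A compact approximation of the centroid model (not the MEAD implementation itself): build tf-idf vectors over the citing sentences, take their mean as the centroid, and rank each sentence by cosine similarity to it.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def centroid_rank(sentences):
    """Rank citing sentences by similarity to their tf-idf centroid."""
    tfidf = TfidfVectorizer().fit_transform(sentences)  # one row per sentence
    centroid = np.asarray(tfidf.mean(axis=0))           # mean tf-idf vector
    scores = cosine_similarity(tfidf, centroid).ravel()
    order = np.argsort(scores)[::-1]                    # most salient first
    return [(scores[i], sentences[i]) for i in order]
```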

64 2 Lexrank LexRank (Erkan and Radev, 2004) is a network based content selection algorithm that works by first building a graph of all the documents in a cluster. [sent-103, score-0.31]

65 Once the network is built, the algorithm computes the salience of sentences in this graph based on their eigenvector centrality in the network. [sent-105, score-0.254]
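
The following sketch captures the LexRank idea rather than the authors' implementation: sentences become nodes, pairs above a cosine-similarity threshold (0.1 here is an illustrative choice) become weighted edges, and PageRank over the weighted graph serves as the eigenvector-centrality computation.

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexrank_scores(sentences, threshold=0.1):
    """Centrality-based sentence salience over a cosine-similarity graph."""
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    graph = nx.Graph()
    graph.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if sim[i, j] > threshold:            # keep only strong edges
                graph.add_edge(i, j, weight=sim[i, j])
    # PageRank over the undirected weighted graph plays the role of
    # eigenvector centrality in the LexRank formulation
    return nx.pagerank(graph, weight="weight")
```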

66 3 C-Lexrank C-Lexrank is another network based content selection algorithm that focuses on diversity (Qazvinian and Radev, 2008). [sent-107, score-0.293]

67 Given a set of sentences, it first creates a network using these sentences and then runs a clustering algorithm to partition the network into smaller clusters that represent different aspects of the paper. [sent-108, score-0.244]
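
C-Lexrank's diversity step can be approximated by partitioning the same similarity graph into communities and taking the most central sentence of each; greedy modularity clustering below is a stand-in for the paper's clustering algorithm, not the method it actually uses.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def c_lexrank_pick(sentences, threshold=0.1):
    """Pick one representative sentence per cluster of the similarity graph."""
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    graph = nx.Graph()
    graph.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if sim[i, j] > threshold:
                graph.add_edge(i, j, weight=sim[i, j])
    picks = []
    for cluster in greedy_modularity_communities(graph):
        # the most central sentence in each cluster covers one "aspect"
        centrality = nx.pagerank(graph.subgraph(cluster), weight="weight")
        picks.append(max(centrality, key=centrality.get))
    return [sentences[i] for i in picks]
```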

68 5 Experiments and Results To do an evaluation of our different content selection methods, we first select the documents using our Restricted Expansion method, and then pick [sent-110, score-0.256]

69 the citing sentences to be used as the input to the summarization module as described in Section 2. [sent-115, score-0.354]

70 Given this input, we generate 500-word summaries for each of the seven topics using the four methods: Centroid, Lexrank, C-Lexrank and a random baseline. [sent-116, score-0.263]

71 The first is the Pyramid score (Nenkova and Passonneau, 2004), computed by treating the factoids as Summary Content Units (SCUs). [sent-118, score-0.405]

72 The Pyramid score for each summary is shown in Table 4. [sent-119, score-0.061]
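
For reference, one common formulation of the Pyramid score with factoids as SCUs: the total weight of factoids the summary covers, normalized by the best weight attainable with the same number of factoids. The paper may normalize slightly differently, so treat this as a sketch.

```python
def pyramid_score(summary_factoids, weights):
    """summary_factoids: factoid IDs covered by the summary (a set).
    weights: dict mapping every factoid in the topic inventory to its weight.
    """
    observed = sum(weights[f] for f in summary_factoids)
    # an ideal summary covers the same number of factoids, drawn from
    # the heaviest ones in the inventory
    heaviest = sorted(weights.values(), reverse=True)
    optimal = sum(heaviest[:len(summary_factoids)])
    return observed / optimal if optimal else 0.0
```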

73 The second metric is an Unnormalized Relative Utility score (Radev and Tam, 2003), computed using the factoid scores of sentences based on the method presented in (Qazvinian, 2012). [sent-120, score-0.205]

74 We call this Unnormalized RU since we are not able to normalize the scores with human generated gold summaries. [sent-121, score-0.039]

75 The parameter α is the RU penalty for including a redundant sentence subsumed by an earlier sentence. [sent-123, score-0.112]

76 If the summary chooses a sentence si with score worig that is subsumed by an earlier summary sentence, the score is reduced as wsubsumed = (α ∗ worig). [sent-124, score-0.223]

77 We approximate subsumption by marking a sentence sj as being subsumed by si if Fj ⊂ Fi, where Fi and Fj are the sets of factoids covered in each sentence. [sent-125, score-0.485]
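
The subsumption-penalized utility then amounts to a single ordered pass over the summary; the value of α below is purely illustrative, since the page does not report the one actually used.

```python
def unnormalized_ru(summary, factoid_sets, utilities, alpha=0.5):
    """summary: sentence IDs in summary order.
    factoid_sets: F_i, the set of factoids each sentence covers.
    utilities: w_orig, the factoid-based utility of each sentence.
    alpha: subsumption penalty; 0.5 is an illustrative assumption.
    """
    total, earlier = 0.0, []
    for s_j in summary:
        # s_j is subsumed if F_j is a strict subset of some earlier F_i
        subsumed = any(factoid_sets[s_j] < factoid_sets[s_i] for s_i in earlier)
        total += alpha * utilities[s_j] if subsumed else utilities[s_j]
        earlier.append(s_j)
    return total
```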

78 The reason for the relatively high scores for the random baseline is that our process to select the initial set of sentences eliminates many bad sentences. [sent-132, score-0.131]

79 For example, for a subset of 5 topics, the total input set contains 1508 sentences, out of which 922 of the sentences (60%) have at least one factoid. [sent-133, score-0.106]

80 This makes it highly likely to pick good content sentences even when we are picking sentences at random. [sent-134, score-0.189]

81 We find that the Lexrank method outperforms other sentence selection methods on both evaluation metrics. [sent-135, score-0.089]

82 The reason for the lower performance of C-Lexrank compared to Lexrank on this data set is that the input sentence set is derived from a much more diverse set of papers, which can vary widely in lexical choice when describing the same factoid. [sent-137, score-0.32]

83 The lower Unnormalized RU scores compared to Pyramid scores indicate that we are selecting sentences containing highly weighted factoids, but we do not select the most informative sentences that contain a large number of factoids. [sent-139, score-0.167]

84 This also shows that we select some redundant factoids, since Unnormalized RU contains a penalty for redundancy. [sent-140, score-0.066]

85 This, again, is explained by the fact that the simple lexical-diversity-based model in C-Lexrank is not able to detect the same factoid being present in two sentences. [sent-141, score-0.463]

86 Despite these shortcomings, our system works quite well in terms of content selection for unseen topics; Figure 2 shows the top 5 sentences for the query “Conditional Random Fields”. [sent-142, score-0.269]

87 6 Conclusion and Future Work In this paper, we described a pipeline for the generation of scientific surveys starting from a topic query. [sent-143, score-0.671]

88 The first component finds the set of papers from the corpus relevant to the query using a simple heuristic called Restricted Expansion. [sent-145, score-0.317]

89 The second component selects sentences from these papers to generate a survey of the topic. [sent-146, score-0.439]

90 We collected 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing and extracted factoids for each topic. [sent-148, score-0.673]

91 Each factoid is given an importance score based on the number of gold standard documents it appears in. [sent-149, score-0.253]

92 575 In recent years, conditional random fields (CRFs) (Lafferty et al. [sent-150, score-0.07]

93 , 2001) have shown success on a number of natural language processing (NLP) tasks, including shallow parsing (Sha and Pereira, 2003), named entity recognition (McCallum and Li, 2003) and information extraction from research papers (Peng and McCallum, 2004). [sent-151, score-0.222]

94 Figure 2: A sample output survey produced by our system on the topic of “Conditional Random Fields” using Restricted Expansion and Lexrank. [sent-161, score-0.218]

95 Additionally, we manually annotated 2,625 input sentences, about 375 sentences per topic, with the factoids extracted from the gold standard documents for each topic. [sent-162, score-0.66]

96 Using this corpus, we presented experimental results for the performance of our document selection component and three sentence selection strategies. [sent-163, score-0.233]

97 We plan to look at better models of diversity in sentence selection, since methods based on simple lexical similarity do not seem to work well. [sent-165, score-0.058]

98 The low factoid recall shown by the low Unnormalized RU scores suggests integrating the full text of papers with citation-based summaries, which might help us find factoids, such as topic definitions, that are unlikely to be present in citing sentences. [sent-166, score-1.357]

99 Dragomir R. Radev, Pradeep Muthukrishnan, and Vahed Qazvinian. The ACL Anthology Network corpus. [sent-220, score-0.143]

100 Simone Teufel and Marc Moens. Summarizing scientific articles: experiments with relevance and rhetorical status. [sent-222, score-0.039]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('factoids', 0.405), ('surveys', 0.319), ('papers', 0.222), ('radev', 0.194), ('citing', 0.178), ('lexrank', 0.169), ('scientific', 0.166), ('centroid', 0.154), ('factoid', 0.139), ('unnormalized', 0.138), ('topic', 0.138), ('qazvinian', 0.134), ('ru', 0.129), ('tutorials', 0.121), ('vahed', 0.116), ('dragomir', 0.115), ('sha', 0.108), ('sense', 0.104), ('pyramid', 0.102), ('topics', 0.101), ('fmin', 0.099), ('disambiguation', 0.099), ('mccallum', 0.093), ('citation', 0.093), ('selection', 0.089), ('network', 0.089), ('yarowsky', 0.085), ('survey', 0.08), ('crfs', 0.077), ('umi', 0.076), ('documents', 0.075), ('erkan', 0.072), ('summarization', 0.07), ('wsd', 0.068), ('cited', 0.067), ('tinit', 0.066), ('worig', 0.066), ('sentences', 0.066), ('eecs', 0.065), ('teufel', 0.063), ('pereira', 0.061), ('summary', 0.061), ('cumulative', 0.059), ('fmax', 0.058), ('mead', 0.058), ('nanba', 0.058), ('semcor', 0.058), ('salience', 0.058), ('diversity', 0.058), ('restricted', 0.058), ('query', 0.057), ('content', 0.057), ('document', 0.055), ('lafferty', 0.054), ('amjad', 0.054), ('anthology', 0.054), ('cite', 0.054), ('titled', 0.054), ('seven', 0.053), ('peng', 0.053), ('summarizing', 0.051), ('michigan', 0.05), ('cg', 0.048), ('starting', 0.048), ('expansion', 0.047), ('subsumed', 0.046), ('pradeep', 0.046), ('iarpa', 0.045), ('summaries', 0.044), ('arbor', 0.043), ('settles', 0.042), ('centrality', 0.041), ('input', 0.04), ('fields', 0.04), ('fj', 0.04), ('gold', 0.039), ('ch', 0.039), ('rhetorical', 0.039), ('relevant', 0.038), ('rahul', 0.037), ('abstracts', 0.037), ('summarized', 0.037), ('selects', 0.036), ('nenkova', 0.036), ('select', 0.035), ('annotated', 0.035), ('earlier', 0.035), ('generate', 0.035), ('mohammad', 0.034), ('covered', 0.034), ('mi', 0.034), ('simone', 0.033), ('keywords', 0.032), ('redundant', 0.031), ('dagan', 0.031), ('random', 0.03), ('acquainted', 0.029), ('blairgoldensohn', 0.029), ('citations', 0.029), ('dimitrov', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999958 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

Author: Rahul Jha ; Amjad Abu-Jbara ; Dragomir Radev

Abstract: In this paper, we investigate the problem of automatic generation of scientific surveys starting from keywords provided by a user. We present a system that can take a topic query as input and generate a survey of the topic by first selecting a set of relevant documents, and then selecting relevant sentences from those documents. We discuss the issues of robust evaluation of such systems and describe an evaluation corpus we generated by manually extracting factoids, or information units, from 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing. We have manually annotated 2,625 sentences with these factoids (around 375 sentences per topic) to build an evaluation corpus for this task. We present evaluation results for the performance of our system using this annotated data.

2 0.3266899 293 acl-2013-Random Walk Factoid Annotation for Collective Discourse

Author: Ben King ; Rahul Jha ; Dragomir Radev ; Robert Mankoff

Abstract: In this paper, we study the problem of automatically annotating the factoids present in collective discourse. Factoids are information units that are shared between instances of collective discourse and may have many different ways ofbeing realized in words. Our approach divides this problem into two steps, using a graph-based approach for each step: (1) factoid discovery, finding groups of words that correspond to the same factoid, and (2) factoid assignment, using these groups of words to mark collective discourse units that contain the respective factoids. We study this on two novel data sets: the New Yorker caption contest data set, and the crossword clues data set.

3 0.13004117 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?

Author: Romain Deveaud ; Eric SanJuan ; Patrice Bellot

Abstract: The current topic modeling approaches for Information Retrieval do not allow to explicitly model query-oriented latent topics. More, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. We perform a first experimental evaluation using two major TREC test collections. Results show that retrieval perfor- mances tend to be better when using topics with higher semantic coherence.

4 0.12008629 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

Author: Yukari Ogura ; Ichiro Kobayashi

Abstract: In this paper, we propose a method to raise the accuracy of text classification based on latent topics, reconsidering the techniques necessary for good classification for example, to decide important sentences in a document, the sentences with important words are usually regarded as important sentences. In this case, tf.idf is often used to decide important words. On the other hand, we apply the PageRank algorithm to rank important words in each document. Furthermore, before clustering documents, we refine the target documents by representing them as a collection of important sentences in each document. We then classify the documents based on latent information in the documents. As a clustering method, we employ the k-means algorithm and inves– tigate how our proposed method works for good clustering. We conduct experiments with Reuters-21578 corpus under various conditions of important sentence extraction, using latent and surface information for clustering, and have confirmed that our proposed method provides better result among various conditions for clustering.

5 0.11130255 54 acl-2013-Are School-of-thought Words Characterizable?

Author: Xiaorui Jiang ; Xiaoping Sun ; Hai Zhuge

Abstract: School of thought analysis is an important yet not-well-elaborated scientific knowledge discovery task. This paper makes the first attempt at this problem. We focus on one aspect of the problem: do characteristic school-of-thought words exist and whether they are characterizable? To answer these questions, we propose a probabilistic generative School-Of-Thought (SOT) model to simulate the scientific authoring process based on several assumptions. SOT defines a school of thought as a distribution of topics and assumes that authors determine the school of thought for each sentence before choosing words to deliver scientific ideas. SOT distinguishes between two types of school-ofthought words for either the general background of a school of thought or the original ideas each paper contributes to its school of thought. Narrative and quantitative experiments show positive and promising results to the questions raised above. 1

6 0.10935772 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

7 0.10506506 74 acl-2013-Building Comparable Corpora Based on Bilingual LDA Model

8 0.1044388 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation

9 0.1009622 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

10 0.098295279 351 acl-2013-Topic Modeling Based Classification of Clinical Reports

11 0.092551276 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

12 0.090651989 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions

13 0.084745616 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

14 0.084255345 121 acl-2013-Discovering User Interactions in Ideological Discussions

15 0.083600685 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art

16 0.080893524 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization

17 0.079641044 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization

18 0.077067375 62 acl-2013-Automatic Term Ambiguity Detection

19 0.074875012 333 acl-2013-Summarization Through Submodularity and Dispersion

20 0.073081411 197 acl-2013-Incremental Topic-Based Translation Model Adaptation for Conversational Spoken Language Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.19), (1, 0.078), (2, 0.025), (3, -0.073), (4, 0.084), (5, -0.076), (6, 0.04), (7, -0.007), (8, -0.164), (9, -0.071), (10, 0.032), (11, 0.104), (12, -0.076), (13, 0.078), (14, 0.007), (15, 0.059), (16, 0.063), (17, -0.001), (18, -0.115), (19, -0.045), (20, -0.008), (21, -0.042), (22, -0.032), (23, -0.055), (24, 0.016), (25, -0.091), (26, 0.049), (27, -0.104), (28, 0.066), (29, -0.115), (30, 0.046), (31, -0.011), (32, -0.005), (33, -0.026), (34, -0.025), (35, -0.108), (36, -0.027), (37, -0.006), (38, 0.039), (39, -0.029), (40, 0.056), (41, 0.067), (42, 0.178), (43, -0.06), (44, 0.15), (45, -0.005), (46, -0.009), (47, 0.03), (48, -0.039), (49, 0.074)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92394096 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

Author: Rahul Jha ; Amjad Abu-Jbara ; Dragomir Radev

Abstract: In this paper, we investigate the problem of automatic generation of scientific surveys starting from keywords provided by a user. We present a system that can take a topic query as input and generate a survey of the topic by first selecting a set of relevant documents, and then selecting relevant sentences from those documents. We discuss the issues of robust evaluation of such systems and describe an evaluation corpus we generated by manually extracting factoids, or information units, from 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing. We have manually annotated 2,625 sentences with these factoids (around 375 sentences per topic) to build an evaluation corpus for this task. We present evaluation results for the performance of our system using this annotated data.

2 0.83615023 293 acl-2013-Random Walk Factoid Annotation for Collective Discourse

Author: Ben King ; Rahul Jha ; Dragomir Radev ; Robert Mankoff

Abstract: In this paper, we study the problem of automatically annotating the factoids present in collective discourse. Factoids are information units that are shared between instances of collective discourse and may have many different ways ofbeing realized in words. Our approach divides this problem into two steps, using a graph-based approach for each step: (1) factoid discovery, finding groups of words that correspond to the same factoid, and (2) factoid assignment, using these groups of words to mark collective discourse units that contain the respective factoids. We study this on two novel data sets: the New Yorker caption contest data set, and the crossword clues data set.

3 0.63183033 54 acl-2013-Are School-of-thought Words Characterizable?

Author: Xiaorui Jiang ; Xiaoping Sun ; Hai Zhuge

Abstract: School of thought analysis is an important yet not-well-elaborated scientific knowledge discovery task. This paper makes the first attempt at this problem. We focus on one aspect of the problem: do characteristic school-of-thought words exist and whether they are characterizable? To answer these questions, we propose a probabilistic generative School-Of-Thought (SOT) model to simulate the scientific authoring process based on several assumptions. SOT defines a school of thought as a distribution of topics and assumes that authors determine the school of thought for each sentence before choosing words to deliver scientific ideas. SOT distinguishes between two types of school-ofthought words for either the general background of a school of thought or the original ideas each paper contributes to its school of thought. Narrative and quantitative experiments show positive and promising results to the questions raised above. 1

4 0.61518037 126 acl-2013-Diverse Keyword Extraction from Conversations

Author: Maryam Habibi ; Andrei Popescu-Belis

Abstract: A new method for keyword extraction from conversations is introduced, which preserves the diversity of topics that are mentioned. Inspired from summarization, the method maximizes the coverage of topics that are recognized automatically in transcripts of conversation fragments. The method is evaluated on excerpts of the Fisher and AMI corpora, using a crowdsourcing platform to elicit comparative relevance judgments. The results demonstrate that the method outperforms two competitive baselines.

5 0.60879803 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization

Author: Jiwei Li ; Sujian Li

Abstract: Timeline summarization aims at generating concise summaries and giving readers a faster and better access to understand the evolution of news. It is a new challenge which combines salience ranking problem with novelty detection. Previous researches in this field seldom explore the evolutionary pattern of topics such as birth, splitting, merging, developing and death. In this paper, we develop a novel model called Evolutionary Hierarchical Dirichlet Process(EHDP) to capture the topic evolution pattern in time- line summarization. In EHDP, time varying information is formulated as a series of HDPs by considering time-dependent information. Experiments on 6 different datasets which contain 3 156 documents demonstrates the good performance of our system with regard to ROUGE scores.

6 0.5691098 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

7 0.55404669 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

8 0.5325768 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

9 0.52782726 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization

10 0.52714324 62 acl-2013-Automatic Term Ambiguity Detection

11 0.51449114 333 acl-2013-Summarization Through Submodularity and Dispersion

12 0.51385039 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?

13 0.50929075 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

14 0.50477457 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art

15 0.48908088 53 acl-2013-Annotation of regular polysemy and underspecification

16 0.48817438 351 acl-2013-Topic Modeling Based Classification of Clinical Reports

17 0.47558793 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain

18 0.45831576 182 acl-2013-High-quality Training Data Selection using Latent Topics for Graph-based Semi-supervised Learning

19 0.45723251 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

20 0.45487943 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.044), (6, 0.049), (11, 0.045), (15, 0.024), (24, 0.046), (26, 0.05), (35, 0.079), (42, 0.073), (48, 0.077), (57, 0.241), (70, 0.036), (88, 0.066), (90, 0.048), (95, 0.054)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78191894 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

Author: Rahul Jha ; Amjad Abu-Jbara ; Dragomir Radev

Abstract: In this paper, we investigate the problem of automatic generation of scientific surveys starting from keywords provided by a user. We present a system that can take a topic query as input and generate a survey of the topic by first selecting a set of relevant documents, and then selecting relevant sentences from those documents. We discuss the issues of robust evaluation of such systems and describe an evaluation corpus we generated by manually extracting factoids, or information units, from 47 gold standard documents (surveys and tutorials) on seven topics in Natural Language Processing. We have manually annotated 2,625 sentences with these factoids (around 375 sentences per topic) to build an evaluation corpus for this task. We present evaluation results for the performance of our system using this annotated data.

2 0.77855891 230 acl-2013-Lightly Supervised Learning of Procedural Dialog Systems

Author: Svitlana Volkova ; Pallavi Choudhury ; Chris Quirk ; Bill Dolan ; Luke Zettlemoyer

Abstract: Procedural dialog systems can help users achieve a wide range of goals. However, such systems are challenging to build, currently requiring manual engineering of substantial domain-specific task knowledge and dialog management strategies. In this paper, we demonstrate that it is possible to learn procedural dialog systems given only light supervision, of the type that can be provided by non-experts. We consider domains where the required task knowledge exists in textual form (e.g., instructional web pages) and where system builders have access to statements of user intent (e.g., search query logs or dialog interactions). To learn from such textual resources, we describe a novel approach that first automatically extracts task knowledge from instructions, then learns a dialog manager over this task knowledge to provide assistance. Evaluation in a Microsoft Office domain shows that the individual components are highly accurate and can be integrated into a dialog system that provides effective help to users.

3 0.7687034 325 acl-2013-Smoothed marginal distribution constraints for language modeling

Author: Brian Roark ; Cyril Allauzen ; Michael Riley

Abstract: We present an algorithm for re-estimating parameters of backoff n-gram language models so as to preserve given marginal distributions, along the lines of wellknown Kneser-Ney (1995) smoothing. Unlike Kneser-Ney, our approach is designed to be applied to any given smoothed backoff model, including models that have already been heavily pruned. As a result, the algorithm avoids issues observed when pruning Kneser-Ney models (Siivola et al., 2007; Chelba et al., 2010), while retaining the benefits of such marginal distribution constraints. We present experimental results for heavily pruned backoff ngram models, and demonstrate perplexity and word error rate reductions when used with various baseline smoothing methods. An open-source version of the algorithm has been released as part of the OpenGrm ngram library.1

4 0.67602271 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering

Author: Anthony Fader ; Luke Zettlemoyer ; Oren Etzioni

Abstract: We study question answering as a machine learning problem, and induce a function that maps open-domain questions to queries over a database of web extractions. Given a large, community-authored, question-paraphrase corpus, we demonstrate that it is possible to learn a semantic lexicon and linear ranking function without manually annotating questions. Our approach automatically generalizes a seed lexicon and includes a scalable, parallelized perceptron parameter estimation scheme. Experiments show that our approach more than quadruples the recall of the seed lexicon, with only an 8% loss in precision.

5 0.60626066 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

Author: Jinho D. Choi ; Andrew McCallum

Abstract: We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach. Selectional branching is guaranteed to perform a fewer number of transitions than beam search yet performs as accurately. We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and an expected linear time speed for non-projective parsing. With the standard setup, our parser shows an unlabeled attachment score of 92.96% and a parsing speed of 9 milliseconds per sentence, which is faster and more accurate than the current state-of-the-art transitionbased parser that uses beam search.

6 0.58983344 225 acl-2013-Learning to Order Natural Language Texts

7 0.58682138 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

8 0.5856517 275 acl-2013-Parsing with Compositional Vector Grammars

9 0.58546084 172 acl-2013-Graph-based Local Coherence Modeling

10 0.58377326 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

11 0.5832327 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

12 0.5823437 62 acl-2013-Automatic Term Ambiguity Detection

13 0.58211803 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics

14 0.58017927 318 acl-2013-Sentiment Relevance

15 0.57988983 47 acl-2013-An Information Theoretic Approach to Bilingual Word Clustering

16 0.57970726 82 acl-2013-Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation

17 0.57731789 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

18 0.57657248 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

19 0.57645589 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

20 0.57592696 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays