acl acl2010 acl2010-196 knowledge-graph by maker-knowledge-mining

196 acl-2010-Plot Induction and Evolutionary Search for Story Generation


Source: pdf

Author: Neil McIntyre ; Mirella Lapata

Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms previous data-driven approaches.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. [sent-5, score-0.419]

2 A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. [sent-6, score-0.465]

3 Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. [sent-8, score-0.727]

4 Experiments on generating simple children’s stories show that our system outperforms previous data-driven approaches. [sent-9, score-0.268]

5 1 Introduction Computer story generation has met with fascination since the early days of artificial intelligence. [sent-10, score-0.393]

6 Indeed, over the years, several generators have been developed capable of creating stories that resemble human output. [sent-11, score-0.295]

7 A large body of more recent work views story generation as a form of agent-based planning (Swartjes and Theune, 2008; Pizzi et al. [sent-13, score-0.435]

8 Interesting stories emerge as plans interact and cause failures and possible replanning. [sent-17, score-0.268]

9 The broader appeal of computational story generation lies in its application potential. [sent-23, score-0.393]

10 , 2004); rendering video games more interesting by allowing the plot to adapt dynamically to the players’ actions (Barros and Musse, 2007); and assisting teachers to create or personalize stories for their students (Riedl and Young, 2004). [sent-25, score-0.437]

11 A major stumbling block for the widespread use of computational story generators is their reliance on expensive, manually created resources. [sent-26, score-0.398]

12 A typical story generator will make use of a knowledge base for providing detailed domain-specific information about the characters and objects involved in the story and their relations. [sent-27, score-0.765]

13 It will also have a story planner that specifies how these characters interact, what their goals are and how their actions result in different story plots. [sent-28, score-0.836]

14 Finally, a sentence planner (coupled with a surface realizer) will render an abstract story specification into natural language text. [sent-29, score-0.488]

15 ... likely to be part of the same story or narrative. [sent-36, score-0.346]

16 The latter could be used to construct or enrich the knowledge base of a story generator. [sent-38, score-0.346]

17 In McIntyre and Lapata (2009) we presented a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. [sent-39, score-0.419]

18 These are used to produce a large set of candidate stories which are subsequently ranked based on their interestingness and coherence. [sent-41, score-0.29]

19 The approach is unusual in that it does not involve an explicit story planning component. [sent-42, score-0.406]

20 In this work we develop a story generator that is also data-driven but crucially relies on a story planner for creating meaningful stories. [sent-44, score-0.861]

21 Inspired by Chambers and Jurafsky (2009) we acquire story plots automatically by recording events, their participants, and their precedence relationships as attested in a training corpus. [sent-45, score-0.462]

22 Instead, we search the space of possible stories using Genetic Algorithms (GAs) which we argue are advantageous in the story generation setting, as they can search large fitness landscapes while greatly reducing the risk of getting stuck in local optima. [sent-48, score-0.826]

23 By virtue of exploring the search space more broadly, we are able to generate creative stories without an explicit interest scoring module. [sent-49, score-0.314]

24 Next, we detail our approach, specifically how plots are created and used in conjunction with genetic search (Sections 3 and 4). [sent-51, score-0.256]

25 2 Related Work Our work builds on and extends the story generator developed in McIntyre and Lapata (2009). [sent-53, score-0.419]

26 The system creates simple children’s stories in an interactive context: the user supplies the topic of the story and its desired length (number of sentences). [sent-54, score-0.666]

27 The generator creates a story following a pipeline architecture typical of natural language generation systems (Reiter and Dale, 2000) consisting of content selection, sentence planning, and surface realization. [sent-55, score-0.54]

28 The content of a story is determined by consulting a data-driven knowledge base that records the entities (i. [sent-56, score-0.389]

29 The sentence planner aggregates together entities and their actions into a sentence using phrase structure rules. [sent-63, score-0.239]

30 The system searches for the best story overall as well as the best sentences that can be generated from the knowledge base. [sent-65, score-0.346]

31 In addition, stories are reranked using two scoring functions based on coherence and interest. [sent-67, score-0.337]

32 , stories labeled with numeric values for interest and coherence. [sent-70, score-0.268]

33 Cheng and Mellish (2000) focus on the interaction of aggregation and text planning and use genetic algorithms to search for the best aggregated document that satisfies coherence con- straints. [sent-76, score-0.289]

34 The application of genetic algorithms to story generation is novel to our knowledge. [sent-77, score-0.523]

35 Secondly, our search procedure is simpler and more global; instead of searching for the best story twice (i. [sent-82, score-0.374]

36 , by first finding the n-best stories and then subsequently reranking them based on coherence and interest), our genetic algorithm explores the space of possible stories once. [sent-84, score-0.757]

37 , the princess loves the prince) from which the system creates a story. [sent-89, score-0.615]

38 Plots are generated by merging the entity-specific narrative schemas which subsequently serve as the input to the genetic algorithm. [sent-97, score-0.287]

39 In the following we describe how the narrative schemas are extracted and plots merged, and then discuss our evolutionary search procedure. [sent-98, score-0.274]

40 Entity-based Schema Extraction Before we can generate a plot for a story we must have an idea of the actions associated with the entities in the story, the order in which these actions are per- formed and also which other entities can participate. [sent-99, score-0.649]

41 The schema for princess after processing the first document is given on the left hand side. [sent-114, score-0.611]

42 In our example, the nodes “prince marry princess in castle” and “prince marry princess in temple” can be merged as they contain the same verb and number of similar arguments. [sent-125, score-1.389]
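
To make the merging criterion concrete, here is a minimal sketch of such a test; the SchemaNode encoding and the matching threshold are illustrative assumptions, not the authors' actual representation:

```python
# Hypothetical node representation; the paper does not specify its
# data structures at this level of detail.
from dataclasses import dataclass

@dataclass(frozen=True)
class SchemaNode:
    verb: str     # e.g. "marry"
    args: tuple   # e.g. ("prince", "princess", "castle")

def can_merge(a: SchemaNode, b: SchemaNode) -> bool:
    """Merge candidates share the verb, the argument count, and most
    argument fillers (here: all but one, an assumed threshold)."""
    if a.verb != b.verb or len(a.args) != len(b.args):
        return False
    shared = sum(x == y for x, y in zip(a.args, b.args))
    return shared >= len(a.args) - 1

castle = SchemaNode("marry", ("prince", "princess", "castle"))
temple = SchemaNode("marry", ("prince", "princess", "temple"))
assert can_merge(castle, temple)  # differ only in the location argument
```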

43 [Figure 1 residue: schema node labels "goblin hold princess in lair", "prince rescue princess", "prince marry princess in castle", "princess have influence", "princess love prince", "prince marry princess in ?"] [sent-134, score-6.644]

44 [Figure 1 residue: node label "princess have baby"] Figure 1: Example of schema construction for the entity princess. The schema construction algorithm terminates when graphs like the ones shown in Figure 1 (right hand side) have been created for all entities in the corpus. [sent-136, score-1.385]

45 We achieve this by merging the schemas associated with the entities in the sentence into a plot graph. [sent-138, score-0.259]

46 As an example, consider again the sentence the princess loves the prince, which requires combining the schemas representing prince and princess shown in Figures 2 and 1 (right hand side), respectively. [sent-139, score-2.083]

47 Once the plot graph is created, a depth-first search starting from the node corresponding to the input sentence finds all paths with length matching the desired story length (cycles are disallowed). [sent-144, score-0.567]
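
A minimal sketch of this search, assuming the plot graph is encoded as an adjacency dictionary over event strings (an illustrative encoding, not the authors'):

```python
def find_plots(graph, start, length):
    """Enumerate all cycle-free paths of exactly `length` nodes,
    starting from the node for the input sentence."""
    plots = []

    def dfs(node, path):
        if len(path) == length:
            plots.append(path)
            return
        for succ in graph.get(node, []):
            if succ not in path:        # cycles are disallowed
                dfs(succ, path + [succ])

    dfs(start, [start])
    return plots

# A toy graph that, like Figure 3, yields four three-node plots:
graph = {
    "princess love prince": ["prince slay dragon", "prince rescue princess"],
    "prince slay dragon": ["prince rescue princess",
                           "prince marry princess in ?"],
    "prince rescue princess": ["prince slay dragon",
                               "prince marry princess in ?"],
}
for plot in find_plots(graph, "princess love prince", 3):
    print(" -> ".join(plot))
```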

48 Assuming we wish to generate a story consisting of three sentences, the graph in Figure 3 would create four plots. [sent-145, score-0.398]

49 Each of these plots represents two different stories, one with castle and one with temple in it. [sent-147, score-0.479]

50 Sentence Planning The sentence planner is interleaved with the story planner and influences the final structure of each sentence in the story. [sent-148, score-0.59]

51 To avoid generating short sentences (note that nodes in the plot graph consist of a single action and would otherwise correspond to a sentence with a single clause), we combine pairs of nodes within the same graph by looking at intrasentential verb-verb co-occurrences in the training corpus. [sent-149, score-0.345]

52 For example, the nodes (prince have problem, prince keep secret) could become the sentence the prince has a problem keeping a secret. [sent-150, score-0.908]
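
A small sketch of collecting those statistics, assuming the training corpus has already been reduced to one list of verb lemmas per sentence (the parsing step, and the combination threshold, are assumptions):

```python
# Count how often two verbs occur in the same training sentence; only
# node pairs whose verbs co-occur often enough become combinable.
from collections import Counter
from itertools import combinations

def verb_cooccurrences(parsed_sentences):
    """parsed_sentences: one list of verb lemmas per corpus sentence."""
    counts = Counter()
    for verbs in parsed_sentences:
        for pair in combinations(sorted(set(verbs)), 2):
            counts[pair] += 1
    return counts

counts = verb_cooccurrences([["have", "keep"],
                             ["have", "keep"],
                             ["have", "lose"]])
min_count = 2                                    # assumed threshold
combinable = counts[("have", "keep")] >= min_count   # True
```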

53 [Figure 2 residue: schema node labels "princess love prince", "prince slay dragon", "prince rescue princess", "prince ask king’s permission", "prince marry princess in ?"] [sent-157, score-4.137]

54 [Figure 2 residue: node label "prince rule country"] Figure 2: Narrative schema for the entity prince. [sent-159, score-0.514]

55 The latter would normally contain hundreds of nodes and give rise to thousands of stories once lexical variables have been expanded. [sent-161, score-0.322]

56 Searching the story space is a difficult optimization problem that must satisfy several constraints: the story should be of a certain length, overall coherent, creative, display some form of event progression, and generally make sense. [sent-162, score-0.692]

57 An initial population is randomly created containing a predefined number of individuals (or solutions), each represented by a genetic string (e. [sent-166, score-0.283]

58 A number of individuals are then chosen as parents from the population according to their fitness, and undergo crossover (also called recombination) and mutation in order to develop the new population. [sent-170, score-0.443]

59 The algorithm thus identifies the individuals with the best fitness values, and those with lower fitness will naturally get discarded from the population. [sent-172, score-0.248]
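
The three sentences above outline the standard GA loop (seed population, fitness-based parent selection, crossover and mutation, survival of the fittest). A minimal sketch of that loop follows; the fitness function, the two operators, and the fitness-proportionate selection scheme are placeholder assumptions, since the extracted fragments do not specify them:

```python
# A generic GA loop over story plots (lists of event strings). The
# fitness, crossover, and mutate callables stand in for the
# coherence-based fitness and plot-graph-aware operators the paper
# describes; selection here is one common, assumed choice.
import random

def evolve(population, fitness, crossover, mutate,
           generations=50, mutation_rate=0.5):
    for _ in range(generations):
        weights = [fitness(p) for p in population]  # assumed positive
        offspring = []
        while len(offspring) < len(population):
            # Fitness-proportionate ("roulette wheel") parent selection.
            mum, dad = random.choices(population, weights=weights, k=2)
            for child in crossover(mum, dad):       # yields two children
                if random.random() < mutation_rate:
                    child = mutate(child)
                offspring.append(child)
        # Survivor selection: lower-fitness individuals are discarded.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:len(population)]
    return max(population, key=fitness)
```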

60 [Figure 3 residue: plot node labels "prince rescue princess", "prince slay dragon", "princess love prince"] Figure 3: Plot graph for the input sentence the princess loves the prince. [sent-177, score-3.292]

61 We describe below how we developed a genetic algorithm for our story generation problem. [sent-180, score-0.523]

62 Initial Population Rather than start with a random population, we seed the initial population with story plots generated from our plot graph. [sent-181, score-0.638]

63 Figure 4a shows two parents (prince rescue princess, prince marry princess in castle, princess have baby) and (prince rescue princess, prince love princess, princess kiss prince) and how two new plots are created by swapping their last nodes. [sent-189, score-3.135]
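
The swap described above amounts to a single-point crossover over plot lists. A minimal sketch using the two example parents from the passage (the list encoding of plots is an assumption):

```python
def crossover(parent_a, parent_b, point=-1):
    """Single-point crossover: swap the tails of two plots at `point`
    (here the last node, as in the Figure 4a example above)."""
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

a = ["prince rescue princess", "prince marry princess in castle",
     "princess have baby"]
b = ["prince rescue princess", "prince love princess",
     "princess kiss prince"]
child_a, child_b = crossover(a, b)
# child_a now ends with "princess kiss prince",
# child_b with "princess have baby", as in Figure 4a.
```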

64 Verbs, however, have structural importance in the stories and we cannot simply replace them without taking account of their arguments. [sent-193, score-0.268]

65 For instance, through crossover it is possible to create a plot in which all or some nodes are identical. [sent-202, score-0.274]

66 The latter can be easily created by permuting the sentences of coherent stories (assuming that the original story is more coherent than its permutations). [sent-212, score-0.719]
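
A short sketch of this permutation trick for building coherence training pairs; the pair format, the permutation count, and the seeding are illustrative assumptions:

```python
# Pair each original story with shuffled versions of itself; the
# original is assumed more coherent than any permutation.
import random

def permutation_pairs(story, n_permutations=5, seed=0):
    """story: list of sentences with at least two distinct members
    (otherwise no genuine permutation exists)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_permutations):
        shuffled = story[:]
        while shuffled == story:          # ensure a genuine permutation
            rng.shuffle(shuffled)
        pairs.append((story, shuffled))   # (coherent, less coherent)
    return pairs
```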

67 However, in the context of genetic search such a function seems redundant as interesting stories emerge naturally through the operations of crossover and mutation. [sent-215, score-0.545]

68 5 Surface Realization Once the final generation of the population has been reached, the fittest story is selected for surface realization. [sent-216, score-0.545]

69 The realizer takes each sentence in the story and reformulates it into input compatible with the RealPro (Lavoie and Rambow, 1997) text generation engine. [sent-217, score-0.461]

70 RealPro creates several variants of the same story differing in the choice of determiners, number (singular or plural), and prepositions. [sent-218, score-0.374]

71 6 Experimental Setup In this section we present our experimental set-up for assessing the performance of our story generator. [sent-222, score-0.346]

72 Corpus The generator was trained on the same corpus used in McIntyre and Lapata (2009), 437 stories from the Andrew Lang fairy tales collection. [sent-224, score-0.411]

73 These involve the population size, crossover, and mutation rates. [sent-235, score-0.246]

74 To evaluate which setting was best, we asked two human evaluators to rate (on a 1–5 scale) stories produced with a population size ranging from 1,000 to 10,000, crossover rate of 0. [sent-236, score-0.539]

75 The human ratings revealed that the best stories were produced for a population size of 10,000, a crossover rate of 0. [sent-242, score-0.535]

76 , Karamanis and Manurung 2002) our crossover rate may seem low and the mutation rate high. [sent-247, score-0.321]

77 Evaluation We compared the stories generated by the GA against those produced by the rank-based system described in McIntyre and Lapata (2009) and a system that creates stories from the plot graph, without any stochastic search. [sent-261, score-0.685]

78 After expanding all lexical variables, the chosen plot graph will give rise to different stories (e. [sent-263, score-0.461]

79 We select the story ranked highest according to our coherence function. [sent-266, score-0.415]

80 In addition, we included a baseline which randomly selects sentences from the training corpus provided they contain either of the story protagonists (i. [sent-267, score-0.376]

81 Each system created stories for 12 input sentences, resulting in 48 (4 × 12) stories for evaluation. [sent-271, score-0.561]

82 The stories were split into three sets containing four stories from each system but with only one story from each input sentence. [sent-277, score-0.882]

83 All stories had the same length, namely five sentences. [sent-278, score-0.268]

84 Human judges were presented with one of the three sets and asked to rate the stories on a scale of 1 to 5 for fluency (was the sentence grammatical? [sent-279, score-0.35]

85 The stories were presented in random order and participants were told that all of them were generated by a computer program. [sent-283, score-0.268]

86 They were instructed to rate more favorably stories that were interesting, comprehensible, and overall grammatical. [sent-284, score-0.295]

87 We performed an Analysis of Variance (ANOVA) to examine the effect of system type on the story generation task. [sent-288, score-0.393]

88 [Table 1 residue: garbled numeric cells] Table 1: Human evaluation results: mean story ratings for four story generators; ∗: p < 0. [sent-293, score-0.715]

89 Overall our results indicate that an explicit story planner improves the quality of the generated stories, especially when coupled with a search mechanism that advantageously explores the search space. [sent-308, score-0.516]

90 It is worth noting that the Plot-based system is relatively simple; however, the explicit use of a story plot seems to make up for the lack of sophisticated search and more elaborate linguistic information. [sent-309, score-0.392]

91 Example stories generated by the four systems are shown in Table 2 for the input sentences The emperor rules the kingdom and The child watches the bird. [sent-310, score-0.311]

92 In the future we plan to explore multiple objectives, such as whether the story is verbose, readable (using existing readability metrics), has too many or too few protagonists, and so on. [sent-315, score-0.346]

93 Thirdly, our stories would benefit from some explicit modeling of discourse structure. [sent-316, score-0.286]

94 Although the plot graph captures the progression of the actions in a story, we would also like to know where in the story these actions are likely to occur— some tend to appear in the beginning and others in the end. [sent-317, score-0.615]

95 Such information would allow us to structure the stories better and render them more natural sounding. [sent-318, score-0.268]

96 For example, an improvement would be the inclusion of proper endings, as the stories are currently cut off at an arbitrary point when the desired maximum length is reached. [sent-319, score-0.268]

97 Finally, the fluency of the stories would benefit from generating referring expressions, multiple tense forms, indirect speech, aggregation and generally more elaborate syntactic structure. [sent-320, score-0.317]

98 A case based reasoning approach to story plot generation. [sent-323, score-0.467]

99 Learning to tell tales: A data-driven approach to story generation. [sent-430, score-0.346]

100 A planning approach to story generation and history education. [sent-460, score-0.435]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('princess', 0.553), ('prince', 0.424), ('story', 0.346), ('stories', 0.268), ('mcintyre', 0.158), ('mutation', 0.148), ('rescue', 0.147), ('genetic', 0.13), ('plot', 0.121), ('crossover', 0.119), ('fitness', 0.109), ('marry', 0.109), ('population', 0.098), ('planner', 0.096), ('castle', 0.082), ('generator', 0.073), ('plots', 0.073), ('coherence', 0.069), ('schemas', 0.069), ('narrative', 0.066), ('baby', 0.063), ('love', 0.059), ('lapata', 0.058), ('schema', 0.058), ('temple', 0.056), ('graph', 0.052), ('actions', 0.048), ('generation', 0.047), ('kiss', 0.045), ('princeprminarcreyssincastle', 0.045), ('subclause', 0.045), ('ga', 0.045), ('chambers', 0.044), ('entities', 0.043), ('realizer', 0.042), ('planning', 0.042), ('coherent', 0.04), ('commonsense', 0.04), ('evolutionary', 0.038), ('gas', 0.036), ('tales', 0.036), ('fairy', 0.034), ('fittest', 0.034), ('lavoie', 0.034), ('loves', 0.034), ('realpro', 0.034), ('nodes', 0.034), ('entity', 0.032), ('merged', 0.031), ('individuals', 0.03), ('events', 0.03), ('karamanis', 0.03), ('protagonists', 0.03), ('fluency', 0.029), ('creates', 0.028), ('search', 0.028), ('mellish', 0.027), ('generators', 0.027), ('rate', 0.027), ('action', 0.026), ('children', 0.026), ('sentence', 0.026), ('neil', 0.025), ('undergo', 0.025), ('created', 0.025), ('interactive', 0.024), ('ratings', 0.023), ('parents', 0.023), ('precedence', 0.023), ('briscoe', 0.023), ('agudo', 0.023), ('barros', 0.023), ('conceptnet', 0.023), ('manurung', 0.023), ('meehan', 0.023), ('pizzi', 0.023), ('princekispsrincess', 0.023), ('princelovperincess', 0.023), ('riedl', 0.023), ('shim', 0.023), ('slay', 0.023), ('swartjes', 0.023), ('tceamsptlele', 0.023), ('watches', 0.023), ('subsequently', 0.022), ('solutions', 0.021), ('argument', 0.021), ('rise', 0.02), ('candidates', 0.02), ('attested', 0.02), ('aggregation', 0.02), ('node', 0.02), ('recombination', 0.02), ('webexp', 0.02), ('turner', 0.02), ('dragon', 0.02), ('emperor', 0.02), ('surface', 0.02), ('goldberg', 0.019), ('explicit', 0.018)]
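
These weights appear to explain the sentence scores above: a sentence's sentScore roughly equals the sum of the listed weights of the top-N words it contains (e.g. sentence 1 scores 0.419 = 0.346 for "story" plus 0.073 for "generator"). A hedged reconstruction of that additive scheme follows; the mining pipeline's actual tokenization and weighting are not published here, so the details are assumptions:

```python
# Illustrative only: reconstructs sentScore as the sum of top-N word
# weights present in a sentence.
weights = {"story": 0.346, "generator": 0.073, "planner": 0.096,
           "stories": 0.268, "precedence": 0.023}

def sent_score(sentence, weights):
    tokens = set(sentence.lower().split())
    return sum(w for word, w in weights.items() if word in tokens)

s1 = ("In this paper we develop a story generator that leverages "
      "knowledge inherent in corpora without requiring extensive "
      "manual involvement.")
print(round(sent_score(s1, weights), 3))  # 0.419 = 0.346 + 0.073
```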

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation

Author: Neil McIntyre ; Mirella Lapata

Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms previous data-driven approaches.

2 0.095519803 11 acl-2010-A New Approach to Improving Multilingual Summarization Using a Genetic Algorithm

Author: Marina Litvak ; Mark Last ; Menahem Friedman

Abstract: Automated summarization methods can be defined as “language-independent,” if they are not based on any language-specific knowledge. Such methods can be used for multilingual summarization defined by Mani (2001) as “processing several languages, with summary in the same language as input.” In this paper, we introduce MUSE, a language-independent approach for extractive summarization based on the linear optimization of several sentence ranking measures using a genetic algorithm. We tested our methodology on two languages—English and Hebrew—and evaluated its performance with ROUGE-1 Recall vs. state-of-the-art extractive summarization approaches. Our results show that MUSE performs better than the best known multilingual approach (TextRank1) in both languages. Moreover, our experimental results on a bilingual (English and Hebrew) document collection suggest that MUSE does not need to be retrained on each language and the same model can be used across at least two different languages.

3 0.088041849 231 acl-2010-The Prevalence of Descriptive Referring Expressions in News and Narrative

Author: Raquel Hervas ; Mark Finlayson

Abstract: Generating referring expressions is a key step in Natural Language Generation. Researchers have focused almost exclusively on generating distinctive referring expressions, that is, referring expressions that uniquely identify their intended referent. While undoubtedly one of their most important functions, referring expressions can be more than distinctive. In particular, descriptive referring expressions, those that provide additional information not required for distinction, are critical to fluent, efficient, well-written text. We present a corpus analysis in which approximately one-fifth of 7,207 referring expressions in 24,422 words of news and narrative are descriptive. These data show that if we are ever to fully master natural language generation, especially for the genres of news and narrative, researchers will need to devote more attention to understanding how to generate descriptive, and not just distinctive, referring expressions.

4 0.073120162 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure

Author: Minwoo Jeong ; Ivan Titov

Abstract: Documents often have inherently parallel structure: they may consist of a text and commentaries, or an abstract and a body, or parts presenting alternative views on the same problem. Revealing relations between the parts by jointly segmenting and predicting links between the segments, would help to visualize such documents and construct friendlier user interfaces. To address this problem, we propose an unsupervised Bayesian model for joint discourse segmentation and alignment. We apply our method to the “English as a second language” podcast dataset where each episode is composed of two parallel parts: a story and an explanatory lecture. The predicted topical links uncover hidden re- lations between the stories and the lectures. In this domain, our method achieves competitive results, rivaling those of a previously proposed supervised technique.

5 0.072352074 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

Author: Emily Pitler ; Annie Louis ; Ani Nenkova

Abstract: To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference information, and summarization specific features. Our best results are 90% accuracy for pairwise comparisons of competing systems over a test set of several inputs and 70% for ranking summaries of a specific input.

6 0.068963796 165 acl-2010-Learning Script Knowledge with Web Experiments

7 0.064448975 39 acl-2010-Automatic Generation of Story Highlights

8 0.05566768 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

9 0.045624278 112 acl-2010-Extracting Social Networks from Literary Fiction

10 0.039888818 35 acl-2010-Automated Planning for Situated Natural Language Generation

11 0.039051425 85 acl-2010-Detecting Experiences from Weblogs

12 0.038532201 158 acl-2010-Latent Variable Models of Selectional Preference

13 0.037885219 127 acl-2010-Global Learning of Focused Entailment Graphs

14 0.036453139 125 acl-2010-Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining

15 0.034949765 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

16 0.034183033 14 acl-2010-A Risk Minimization Framework for Extractive Speech Summarization

17 0.033752739 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

18 0.032908399 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features

19 0.032655235 216 acl-2010-Starting from Scratch in Semantic Role Labeling

20 0.031880129 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.114), (1, 0.049), (2, -0.016), (3, -0.05), (4, -0.005), (5, -0.003), (6, -0.002), (7, -0.054), (8, 0.0), (9, -0.024), (10, -0.015), (11, -0.007), (12, -0.02), (13, 0.008), (14, 0.023), (15, -0.011), (16, 0.066), (17, 0.034), (18, 0.007), (19, -0.005), (20, 0.045), (21, 0.006), (22, 0.008), (23, 0.006), (24, 0.057), (25, -0.015), (26, 0.002), (27, 0.029), (28, 0.049), (29, 0.013), (30, -0.009), (31, -0.028), (32, -0.011), (33, -0.058), (34, -0.016), (35, -0.006), (36, 0.085), (37, 0.018), (38, -0.071), (39, -0.021), (40, 0.015), (41, 0.047), (42, 0.054), (43, -0.147), (44, 0.05), (45, 0.08), (46, -0.083), (47, 0.118), (48, -0.022), (49, 0.078)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.89093029 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation

Author: Neil McIntyre ; Mirella Lapata

Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms previous data-driven approaches.

2 0.51904273 231 acl-2010-The Prevalence of Descriptive Referring Expressions in News and Narrative

Author: Raquel Hervas ; Mark Finlayson

Abstract: Generating referring expressions is a key step in Natural Language Generation. Researchers have focused almost exclusively on generating distinctive referring expressions, that is, referring expressions that uniquely identify their intended referent. While undoubtedly one of their most important functions, referring expressions can be more than distinctive. In particular, descriptive referring expressions, those that provide additional information not required for distinction, are critical to fluent, efficient, well-written text. We present a corpus analysis in which approximately one-fifth of 7,207 referring expressions in 24,422 words of news and narrative are descriptive. These data show that if we are ever to fully master natural language generation, especially for the genres of news and narrative, researchers will need to devote more attention to understanding how to generate descriptive, and not just distinctive, referring expressions.
But when the expression is descriptive, as opposed to distinctive, this additional information is not required for identifying the referent of the expression, and it is these sorts of referring expressions that we will be concerned with here. 49 Uppsala,P Srwoce de dni,n 1g1s- 1of6 t Jhuely AC 20L1 20 .1 ?0c 2 C0o1n0fe Aresnsoceci Sathio rnt f Poarp Ceorsm,p paugteastio 4n9a–l5 L4i,nguistics Although these sorts of referring expression have been mostly ignored by researchers in this area1 , we show in this corpus study that descriptive expressions are in fact quite prevalent: nearly one-fifth of referring expressions in news and narrative are descriptive. In particular, our data, the trained judgments of native English speakers, show that 18% of all distinctive referring expressions in news and 17% of those in narrative folktales are descriptive. With this as motivation, we argue that descriptive referring expressions must be studied more carefully, especially as the field progresses from referring in a physical, immediate context (like that in the REG Challenges) to generating more literary forms of text. 2 Corpus Annotation This is a corpus study; our procedure was therefore to define our annotation guidelines (Section 2.1), select texts to annotate (2.2), create an annotation tool for our annotators (2.3), and, finally, train annotators, have them annotate referring expressions’ constituents and function, and then adjudicate the double-annotated texts into a gold standard (2.4). 2.1 Definitions We wrote an annotation guide explaining the difference between distinctive and descriptive referring expressions. We used the guide when training annotators, and it was available to them while annotating. With limited space here we can only give an outline of what is contained in the guide; for full details see (Finlayson and Herv a´s, 2010a). Referring Expressions We defined referring expressions as referential noun phrases and their coreferential expressions, e.g., “John kissed Mary. She blushed.”. This included referring expressions to generics (e.g., “Lions are fierce”), dates, times, and numbers, as well as events if they were referred to using a noun phrase. We included in each referring expression all the determiners, quantifiers, adjectives, appositives, and prepositional phrases that syntactically attached to that expression. When referring expressions were nested, all the nested referring expressions were also marked separately. Nuclei vs. Modifiers In the only previous corpus study of descriptive referring expressions, on 1With the exception of a small amount of work, discussed in Section 4. museum labels, Cheng et al. (2001) noted that descriptive information is often integrated into referring expressions using modifiers to the head noun. To study this, and to allow our results to be more closely compared with Cheng’s, we had our annotators split referring expressions into their constituents, portions called either nuclei or modifiers. The nuclei were the portions of the referring expression that performed the ‘core’ referring function; the modifiers were those portions that could be varied, syntactically speaking, independently of the nuclei. Annotators then assigned a distinctive or descriptive function to each constituent, rather than the referring expression as a whole. Normally, the nuclei corresponded to the head of the noun phrase. In (1), the nucleus is the token king, which we have here surrounded with square brackets. 
The modifiers, surrounded by parentheses, are The and old.

(1) (The) (old) [king] was wise.

Phrasal modifiers were marked as single modifiers, for example, in (2).

(2) (The) [roof] (of the house) collapsed.

It is significant that we had our annotators mark and tag the nuclei of referring expressions. Cheng and colleagues only mentioned the possibility that additional information could be introduced in the modifiers. However, O'Donnell et al. (1998) observed that often the choice of head noun can also influence the function of a referring expression. Consider (3), in which the word villain is used to refer to the King.

(3) The King assumed the throne today. I don't trust (that) [villain] one bit.

The speaker could have merely used him to refer to the King; the choice of that particular head noun villain gives us additional information about the disposition of the speaker. Thus villain is descriptive.

Function: Distinctive vs. Descriptive. As already noted, instead of tagging the whole referring expression, annotators tagged each constituent (nuclei and modifiers) as distinctive or descriptive. The two main tests for determining descriptiveness were (a) whether the presence of the constituent was unnecessary for identifying the referent, or (b) whether the constituent was expressed using unusual or ostentatious word choice. If either was true, the constituent was considered descriptive; otherwise, it was tagged as distinctive. In cases where the constituent was completely irrelevant to identifying the referent, it was tagged as descriptive. For example, in the folktale The Princess and the Pea, from which (1) was extracted, there is only one king in the entire story. Thus, in that story, the king is sufficient for identification, and therefore the modifier old is descriptive. This points out the importance of context in determining distinctiveness or descriptiveness; if there had been a roomful of kings, the tags on those modifiers would have been reversed.

There is some question as to whether copular predicates, such as the plumber in (4), are actually referring expressions.

(4) John is the plumber.

Our annotators marked and tagged these constructions as normal referring expressions, but they added an additional flag to identify them as copular predicates. We then excluded these constructions from our final analysis. Note that copular predicates were treated differently from appositives: in appositives the predicate was included in the referring expression, and in most cases (again, depending on context) was marked descriptive (e.g., John, the plumber, slept.).

2.2 Text Selection

Our corpus comprised 62 texts, all originally written in English, from two different genres, news and folktales. We began with 30 folktales of different sizes, totaling 12,050 words. These texts were used in previous work on the influence of dialogues on anaphora resolution algorithms (Aggarwal et al., 2009); they were assembled with an eye toward including different styles, different authors, and different time periods. Following this, we matched, approximately, the number of words in the folktales by selecting 32 texts from the Wall Street Journal section of the Penn Treebank (Marcus et al., 1993). These texts were selected at random from the first 200 texts in the corpus.

2.3 The Story Workbench

We used the Story Workbench application (Finlayson, 2008) to actually perform the annotation.
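Before turning to the tool itself, here is a minimal sketch of how an annotated referring expression of this kind might be represented in code. This is our own illustration, not the Story Workbench's actual data model; the class and field names are hypothetical. The descriptiveness rule at the bottom mirrors the criterion used in Section 3: an expression counts as descriptive if any of its constituents does.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical representation of one annotated referring expression,
# split into constituents (one nucleus plus zero or more modifiers),
# each tagged with a distinctive or descriptive function.

@dataclass
class Constituent:
    text: str        # surface string of the constituent
    kind: str        # "nucleus" or "modifier"
    function: str    # "distinctive" or "descriptive"

@dataclass
class ReferringExpression:
    constituents: List[Constituent]

    def is_descriptive(self) -> bool:
        # An expression is descriptive if any constituent is descriptive.
        return any(c.function == "descriptive" for c in self.constituents)

# Example (1): "(The) (old) [king] was wise." in a story with only one king,
# so "old" is unnecessary for identification and hence descriptive.
example_1 = ReferringExpression([
    Constituent("The", "modifier", "distinctive"),
    Constituent("old", "modifier", "descriptive"),
    Constituent("king", "nucleus", "distinctive"),
])

assert example_1.is_descriptive()
```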
The Story Workbench is a semantic annotation program that, among other things, includes the ability to annotate referring expressions and coreferential relationships. We added the ability to annotate nuclei, modifiers, and their functions by writing a workbench "plugin" in Java that could be installed in the application. The Story Workbench is not yet available to the public at large, being in a limited distribution beta testing phase. The developers plan to release it as free software within the next year. At that time, we also plan to release our plugin as free, downloadable software.

2.4 Annotation & Adjudication

The main task of the study was the annotation of the constituents of each referring expression, as well as the function (distinctive or descriptive) of each constituent. The system generated a first pass of constituent analysis, but did not mark functions. We hired two native English annotators, neither of whom had any linguistics background, who corrected these automatically generated constituent analyses and tagged each constituent as descriptive or distinctive. Every text was annotated by both annotators. Adjudication of the differences was conducted by discussion between the two annotators; the second author moderated these discussions and settled irreconcilable disagreements. We followed a "train-as-you-go" paradigm, where there was no distinct training period; rather, adjudication proceeded in step with annotation, and annotators received feedback during those sessions.

We calculated two measures of inter-annotator agreement: a kappa statistic and an f-measure, shown in Table 1. All of our f-measures indicated that annotators agreed almost perfectly on the location of referring expressions and their breakdown into constituents. These agreement calculations were performed on the annotators' original corrected texts. All the kappa statistics were calculated for two tags (nucleus vs. modifier for the constituents, and distinctive vs. descriptive for the functions) over both each token assigned to a nucleus or modifier and each referring expression pair. Our kappas indicate moderate to good agreement, especially for the folktales. These results are expected because of the inherent subjectivity of language. During the adjudication sessions it became clear that different people do not consider the same information as obvious or descriptive for the same concepts, and even the contexts deduced by each annotator from the texts were sometimes substantially different.

3 Results

Table 2 lists the primary results of the study. We considered a referring expression descriptive if any of its constituents were descriptive. Thus, 18% of the referring expressions in the corpus added additional information beyond what was required to unambiguously identify their referent. The results were similar in both genres.

Table 2: Primary results.

                     Tales     Articles   Total
  Texts              30        32         62
  Words              12,050    12,372     24,422
  Sentences          904       571        1,475
  Ref. Exp.          3,681     3,526      7,207
  Dist. Ref. Exp.    3,057     2,830      5,887
  Desc. Ref. Exp.    609       672        1,281
  % Dist. Ref.       83%       81%        82%
  % Desc. Ref.       17%       19%        18%

Table 3 contains the percentages of descriptive and distinctive tags broken down by constituent. Like Cheng's results, our analysis shows that descriptive referring expressions make up a significant fraction of all referring expressions. Although Cheng did not examine nuclei, our results show that the use of descriptive nuclei is small but not negligible.
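As a quick sanity check on Table 2, the following sketch recomputes the percentage rows from the raw distinctive and descriptive counts. The numbers come directly from the table above; the helper function and dictionary layout are our own illustration.

```python
# Recompute the percentage rows of Table 2 from the raw counts.
counts = {
    # genre: (distinctive, descriptive)
    "Tales": (3057, 609),
    "Articles": (2830, 672),
    "Total": (5887, 1281),
}

def percentages(dist, desc):
    total = dist + desc
    return round(100 * dist / total), round(100 * desc / total)

for genre, (dist, desc) in counts.items():
    p_dist, p_desc = percentages(dist, desc)
    print(f"{genre}: {p_dist}% distinctive, {p_desc}% descriptive")
# Tales: 83% distinctive, 17% descriptive
# Articles: 81% distinctive, 19% descriptive
# Total: 82% distinctive, 18% descriptive
```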
4 Relation to the Field

Researchers working on generating referring expressions typically acknowledge that referring expressions can perform functions other than distinction. Despite this widespread acknowledgment, researchers have, for the most part, explicitly ignored these functions. Exceptions to this trend are three. First is the general study of aggregation in the process of referring expression generation. Second and third are corpus studies by Cheng et al. (2001) and Jordan (2000a) that bear on the prevalence of descriptive referring expressions.

Table 3: Breakdown of constituent tags.

                     Tales     Articles   Total
  Nuclei             3,666     3,502      7,168
  Max. Nuc/Ref       1         1          1
  Dist. Nuc.         95%       97%        96%
  Desc. Nuc.         5%        3%         4%
  Modifiers          2,277     3,627      5,904
  Avg. Mod/Ref       0.6       1.0        0.8
  Max. Mod/Ref       4         6          6
  Dist. Mod.         78%       81%        80%
  Desc. Mod.         22%       19%        20%

The NLG subtask of aggregation can be used to imbue referring expressions with a descriptive function (Reiter and Dale, 2000, §5.3). There is a specific kind of aggregation called embedding that moves information from one clause to another inside the structure of a separate noun phrase. This type of aggregation can be used to transform two sentences such as "The princess lived in a castle. She was pretty" into "The pretty princess lived in a castle". The adjective pretty, previously a copular predicate, becomes a descriptive modifier of the reference to the princess, making the second text more natural and fluent. This kind of aggregation is widely used by humans for making the discourse more compact and efficient.

In order to create NLG systems with this ability, we must take into account the caveat, noted by Cheng (1998), that any non-distinctive information in a referring expression must not lead to confusion about the distinctive function of the referring expression. This is by no means a trivial problem: this sort of aggregation interferes with referring and coherence planning at both a local and a global level (Cheng and Mellish, 2000; Cheng et al., 2001). It is clear, from the current state of the art of NLG, that we have not yet obtained a deep enough understanding of aggregation to enable us to handle these interactions. More research on the topic is needed.

Two previous corpus studies have looked at the use of descriptive referring expressions. The first showed explicitly that people craft descriptive referring expressions to accomplish different goals. Jordan and colleagues (Jordan, 2000b; Jordan, 2000a) examined the use of referring expressions using the COCONUT corpus (Eugenio et al., 1998). They tested how domain and discourse goals can influence the content of non-pronominal referring expressions in a dialogue context, checking whether or not a subject's goals led them to include non-referring information in a referring expression. Their results are intriguing because they point toward heretofore unexamined constraints, utilities, and expectations (possibly genre- or style-dependent) that may underlie the use of descriptive information to perform different functions, and are not yet captured by aggregation modules in particular or NLG systems in general.

In the other corpus study, which partially inspired this work, Cheng and colleagues analyzed a set of museum descriptions, the GNOME corpus (Poesio, 2004), for the pragmatic functions of referring expressions. They had three functions in their study, in contrast to our two. Their first function (marked by their uniq tag) was equivalent to our distinctive function.
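To illustrate the embedding aggregation described above, here is a minimal string-level sketch that folds a copular-predicate adjective from a second sentence into the noun phrase of the first. This is a toy demonstration of the idea only, not a real NLG aggregation module; a production system would operate over syntactic or semantic representations rather than raw strings, and the function name and pronoun pattern are our own assumptions.

```python
import re

def embed_copular_adjective(first: str, second: str) -> str:
    """Toy embedding aggregation: given "The X ..." and "She/He/It was ADJ.",
    move ADJ into the noun phrase of the first sentence."""
    match = re.fullmatch(r"(She|He|It) was (\w+)\.", second.strip())
    if not match:
        return first + " " + second  # no aggregation possible; leave as-is
    adjective = match.group(2)
    # Insert the adjective after the initial determiner of the first sentence.
    return re.sub(r"^(The|A|An)\s+", r"\1 " + adjective + " ", first, count=1)

print(embed_copular_adjective("The princess lived in a castle.", "She was pretty."))
# -> "The pretty princess lived in a castle."
```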
The other two were specializations of our descriptive tag, where they differentiated between additional information that helped the reader understand the text (int) and additional information not necessary for understanding (attr). Despite their annotators seeming to have trouble distinguishing between the latter two tags, they did achieve good overall inter-annotator agreement. They identified 1,863 modifiers to referring expressions in their corpus, of which 47.3% fulfilled a descriptive (attr or int) function. This is supportive of our main assertion, namely, that descriptive referring expressions, not only crucial for efficient and fluent text, are actually a significant phenomenon. It is interesting, though, that Cheng's fraction of descriptive referring expressions was so much higher than ours (47.3% versus our 18%). We attribute this substantial difference to genre, in that Cheng studied museum labels, in which the writer is space-constrained, having to pack a lot of information into a small label. The issue bears further study, and perhaps will lead to insights into differences in writing style that may be attributed to author or genre.

5 Contributions

We make two contributions in this paper.

First, we assembled, double-annotated, and adjudicated into a gold standard a corpus of 24,422 words. We marked all referring expressions, coreferential relations, and referring expression constituents, and tagged each constituent as having a descriptive or distinctive function. We wrote an annotation guide and created software that allows the annotation of this information in free text. The corpus and the guide are available on-line in a permanent digital archive (Finlayson and Hervás, 2010a; Finlayson and Hervás, 2010b). The software will also be released in the same archive when the Story Workbench annotation application is released to the public. This corpus will be useful for the automatic generation and analysis of both descriptive and distinctive referring expressions. Any kind of system intended to generate text as humans do must take into account that identification is not the only function of referring expressions. Many analysis applications would benefit from the automatic recognition of descriptive referring expressions.

Second, we demonstrated that descriptive referring expressions comprise a substantial fraction (18%) of the referring expressions in news and narrative. Along with museum descriptions, studied by Cheng, it seems that news and narrative are genres where authors naturally use a large number of descriptive referring expressions. Given that so little work has been done on descriptive referring expressions, this indicates that the field would be well served by focusing more attention on this phenomenon.

Acknowledgments

This work was supported in part by the Air Force Office of Scientific Research under grant number A9550-05-1-0321, as well as by the Office of Naval Research under award number N00014091059. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the Office of Naval Research. This research is also partially funded by the Spanish Ministry of Education and Science (TIN2009-14659-C03-01) and Universidad Complutense de Madrid (GR58/08). We also thank Whitman Richards, Ozlem Uzuner, Peter Szolovits, Patrick Winston, Pablo Gervás, and Mark Seifter for their helpful comments and discussion, and thank our annotators Saam Batmanghelidj and Geneva Trotter.
References

Alaukik Aggarwal, Pablo Gervás, and Raquel Hervás. 2009. Measuring the influence of errors induced by the presence of dialogues in reference clustering of narrative text. In Proceedings of ICON-2009: 7th International Conference on Natural Language Processing, India. Macmillan Publishers.

Douglas E. Appelt. 1985. Planning English referring expressions. Artificial Intelligence, 26:1–33.

Hua Cheng and Chris Mellish. 2000. Capturing the interaction between aggregation and text planning in two generation systems. In INLG '00: First International Conference on Natural Language Generation, pages 186–193, Morristown, NJ, USA. Association for Computational Linguistics.

Hua Cheng, Massimo Poesio, Renate Henschel, and Chris Mellish. 2001. Corpus-based NP modifier generation. In NAACL '01: Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies, pages 1–8, Morristown, NJ, USA. Association for Computational Linguistics.

Hua Cheng. 1998. Embedding new information into referring expressions. In ACL-36: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, pages 1478–1480, Morristown, NJ, USA. Association for Computational Linguistics.

Barbara Di Eugenio, Johanna D. Moore, Pamela W. Jordan, and Richmond H. Thomason. 1998. An empirical investigation of proposals in collaborative dialogues. In Proceedings of the 17th International Conference on Computational Linguistics, pages 325–329, Morristown, NJ, USA. Association for Computational Linguistics.

Mark A. Finlayson and Raquel Hervás. 2010a. Annotation guide for the UCM/MIT indications, referring expressions, and coreference corpus (UMIREC corpus). Technical Report MIT-CSAIL-TR-2010-025, MIT Computer Science and Artificial Intelligence Laboratory. http://hdl.handle.net/1721.1/54765.

Mark A. Finlayson and Raquel Hervás. 2010b. UCM/MIT indications, referring expressions, and coreference corpus (UMIREC corpus). Work product, MIT Computer Science and Artificial Intelligence Laboratory. http://hdl.handle.net/1721.1/54766.

Mark A. Finlayson. 2008. Collecting semantics in the wild: The Story Workbench. In Proceedings of the AAAI Fall Symposium on Naturally-Inspired Artificial Intelligence, pages 46–53, Menlo Park, CA, USA. AAAI Press.

Albert Gatt, Anja Belz, and Eric Kow. 2009. The TUNA-REG challenge 2009: overview and evaluation results. In ENLG '09: Proceedings of the 12th European Workshop on Natural Language Generation, pages 174–182, Morristown, NJ, USA. Association for Computational Linguistics.

Pamela W. Jordan. 2000a. Can nominal expressions achieve multiple goals?: an empirical study. In ACL '00: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, pages 142–149, Morristown, NJ, USA. Association for Computational Linguistics.

Pamela W. Jordan. 2000b. Influences on attribute selection in redescriptions: A corpus study. In Proceedings of CogSci2000, pages 250–255.

Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330.

Michael O'Donnell, Hua Cheng, and Janet Hitzeman. 1998. Integrating referring and informing in NP planning. In Proceedings of COLING-ACL'98 Workshop on the Computational Treatment of Nominals, pages 46–56.

Massimo Poesio. 2004. Discourse annotation and semantic annotation in the GNOME corpus.
In DiscAnnotation '04: Proceedings of the 2004 ACL Workshop on Discourse Annotation, pages 72–79, Morristown, NJ, USA. Association for Computational Linguistics.

Ehud Reiter and Robert Dale. 1992. A fast algorithm for the generation of referring expressions. In Proceedings of the 14th Conference on Computational Linguistics, Nantes, France.

Ehud Reiter and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press.

Kees van Deemter, Ielka van der Sluis, and Albert Gatt. 2006. Building a semantically transparent corpus for the generation of referring expressions. In Proceedings of the 4th International Conference on Natural Language Generation (Special Session on Data Sharing and Evaluation), INLG-06.

Jette Viethen and Robert Dale. 2008. The use of spatial relations in referring expressions. In Proceedings of the 5th International Conference on Natural Language Generation.

3 0.49657044 139 acl-2010-Identifying Generic Noun Phrases

Author: Nils Reiter ; Anette Frank

Abstract: This paper presents a supervised approach for identifying generic noun phrases in context. Generic statements express rulelike knowledge about kinds or events. Therefore, their identification is important for the automatic construction of knowledge bases. In particular, the distinction between generic and non-generic statements is crucial for the correct encoding of generic and instance-level information. Generic expressions have been studied extensively in formal semantics. Building on this work, we explore a corpus-based learning approach for identifying generic NPs, using selections of linguistically motivated features. Our results perform well above the baseline and existing prior work.

4 0.45203021 11 acl-2010-A New Approach to Improving Multilingual Summarization Using a Genetic Algorithm

Author: Marina Litvak ; Mark Last ; Menahem Friedman

Abstract: Automated summarization methods can be defined as “language-independent,” if they are not based on any languagespecific knowledge. Such methods can be used for multilingual summarization defined by Mani (2001) as “processing several languages, with summary in the same language as input.” In this paper, we introduce MUSE, a languageindependent approach for extractive summarization based on the linear optimization of several sentence ranking measures using a genetic algorithm. We tested our methodology on two languages—English and Hebrew—and evaluated its performance with ROUGE-1 Recall vs. state- of-the-art extractive summarization approaches. Our results show that MUSE performs better than the best known multilingual approach (TextRank1) in both languages. Moreover, our experimental results on a bilingual (English and Hebrew) document collection suggest that MUSE does not need to be retrained on each language and the same model can be used across at least two different languages.

5 0.422979 106 acl-2010-Event-Based Hyperspace Analogue to Language for Query Expansion

Author: Tingxu Yan ; Tamsin Maxwell ; Dawei Song ; Yuexian Hou ; Peng Zhang

Abstract: Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and validated semantic space model that captures statistical dependencies between words by considering their co-occurrences in a surrounding window of text. HAL has been successfully applied to query expansion in IR, but has several limitations, including high processing cost and use of distributional statistics that do not exploit syntax. In this paper, we pursue two methods for incorporating syntactic-semantic information from textual ‘events’ into HAL. We build the HAL space directly from events to investigate whether processing costs can be reduced through more careful definition of word co-occurrence, and improve the quality of the pseudo-relevance feedback by applying event information as a constraint during HAL construction. Both methods significantly improve performance results in comparison with original HAL, and interpolation of HAL and relevance model expansion outperforms either method alone.

6 0.41608426 149 acl-2010-Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue

7 0.4127239 199 acl-2010-Preferences versus Adaptation during Referring Expression Generation

8 0.41081515 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

9 0.40653485 165 acl-2010-Learning Script Knowledge with Web Experiments

10 0.40444666 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure

11 0.39870647 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.

12 0.39578629 7 acl-2010-A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices

13 0.39088061 225 acl-2010-Temporal Information Processing of a New Language: Fast Porting with Minimal Resources

14 0.38532516 39 acl-2010-Automatic Generation of Story Highlights

15 0.38281912 64 acl-2010-Complexity Assumptions in Ontology Verbalisation

16 0.36509788 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

17 0.3617411 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

18 0.3601476 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two

19 0.35984209 130 acl-2010-Hard Constraints for Grammatical Function Labelling

20 0.35057837 126 acl-2010-GernEdiT - The GermaNet Editing Tool


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(4, 0.025), (14, 0.02), (25, 0.041), (39, 0.014), (42, 0.029), (44, 0.024), (59, 0.055), (72, 0.013), (73, 0.039), (76, 0.014), (78, 0.035), (80, 0.01), (83, 0.094), (84, 0.034), (97, 0.357), (98, 0.099)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77171105 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation

Author: Neil McIntyre ; Mirella Lapata

Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms pre- vious data-driven approaches.

2 0.70609492 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages

Author: Steven Abney ; Steven Bird

Abstract: We present a grand challenge to build a corpus that will include all of the world’s languages, in a consistent structure that permits large-scale cross-linguistic processing, enabling the study of universal linguistics. The focal data types, bilingual texts and lexicons, relate each language to one of a set of reference languages. We propose that the ability to train systems to translate into and out of a given language be the yardstick for determining when we have successfully captured a language. We call on the computational linguistics community to begin work on this Universal Corpus, pursuing the many strands of activity described here, as their contribution to the global effort to document the world’s linguistic heritage before more languages fall silent.

3 0.69336683 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood

Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui

Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.

4 0.56269503 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

5 0.43672338 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People

Author: Nancy Ide ; Collin Baker ; Christiane Fellbaum ; Rebecca Passonneau

Abstract: The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, communitybased effort to create much needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.

6 0.42348206 231 acl-2010-The Prevalence of Descriptive Referring Expressions in News and Narrative

7 0.41453487 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices

8 0.41164556 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

9 0.41145849 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

10 0.41050184 71 acl-2010-Convolution Kernel over Packed Parse Forest

11 0.41032004 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."

12 0.40927505 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

13 0.40810844 158 acl-2010-Latent Variable Models of Selectional Preference

14 0.40782416 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

15 0.40698063 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

16 0.40595588 39 acl-2010-Automatic Generation of Story Highlights

17 0.40562394 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

18 0.40535066 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar

19 0.40507525 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

20 0.40482241 195 acl-2010-Phylogenetic Grammar Induction