acl acl2013 acl2013-253 knowledge-graph by maker-knowledge-mining

253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts


Source: pdf

Author: Zornitsa Kozareva

Abstract: Metaphor is an important way of conveying the affect of people, hence understanding how people use metaphors to convey affect is important for the communication between individuals and increases cohesion if the perceived affect of the concrete example is the same for the two individuals. Therefore, building computational models that can automatically identify the affect in metaphor-rich texts like “The team captain is a rock.”, “Time is money.”, “My lawyer is a shark.” is an important challenging problem, which has been of great interest to the research community. To solve this task, we have collected and manually annotated the affect of metaphor-rich texts for four languages. We present novel algorithms that integrate triggers for cognitive, affective, perceptual and social processes with stylistic and lexical information. By running evaluations on datasets in English, Spanish, Russian and Farsi, we show that the developed affect polarity and valence prediction technology of metaphor-rich texts is portable and works equally well for different languages.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Metaphor is an important way of conveying the affect of people, hence understanding how people use metaphors to convey affect is important for the communication between individuals and increases cohesion if the perceived affect of the concrete example is the same for the two individuals. [sent-2, score-0.723]

2 Therefore, building computational models that can automatically identify the affect in metaphor-rich texts like “The team captain is a rock. [sent-3, score-0.228]

3 To solve this task, we have collected and manually annotated the affect of metaphor-rich texts for four languages. [sent-7, score-0.228]

4 By running evaluations on datasets in English, Spanish, Russian and Farsi, we show that the developed affect polarity and valence prediction technology of metaphor-rich texts is portable and works equally well for different languages. [sent-9, score-0.977]

5 For instance, in “My lawyer is a shark” the speaker may want to communicate that his/her lawyer is strong and aggressive, and that he will attack in court and persist until the goals are achieved. [sent-11, score-0.151]

6 By using the metaphor, the speaker actually conveys positive affect because having an aggressive lawyer is good if one is being sued. [sent-12, score-0.285]

7 There has been a substantial body of work on metaphor identification and interpretation (Wilks, 2007; Shutova et al. [sent-13, score-0.589]

8 However, in this paper we focus on an equally interesting, challenging and important problem, which concerns the automatic identification of affect carried by metaphors. [sent-15, score-0.264]

9 Building such computational models is important to understand how people use metaphors to convey affect and how affect is expressed using metaphors. [sent-16, score-0.549]

10 The existence of such models can be also used to improve the communication between individuals and to make sure that the speakers perceived the affect of the concrete metaphor example in the same way. [sent-17, score-0.692]

11 The questions we address in this paper are: “How can we build computational models that can identify the polarity and valence associated with metaphor-rich texts? [sent-18, score-0.644]

12 We have proposed and developed automated methods for solving the polarity and valence tasks for all four languages. [sent-22, score-0.318]

13 We model the polarity task as a classification problem, while the valence task as a regression problem. [sent-23, score-0.775]
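The split described in sentence 13 can be sketched as follows. This is a minimal illustration, not the paper's actual setup: the feature vectors are invented toy values, and scikit-learn's LogisticRegression and Ridge stand in for whichever learners the paper used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

# Toy feature vectors for six metaphor examples (made up for illustration).
X = np.array([[1, 0, 2], [0, 1, 0], [2, 0, 1],
              [0, 2, 0], [1, 1, 1], [0, 3, 0]], dtype=float)
polarity = np.array([1, 0, 1, 0, 1, 0])                 # 1 = Positive, 0 = Negative
valence = np.array([2.0, -1.0, 3.0, -2.0, 1.0, -3.0])   # intensity on the [-3, +3] scale

clf = LogisticRegression().fit(X, polarity)   # polarity task: classification
reg = Ridge(alpha=1.0).fit(X, valence)        # valence task: regression

print(clf.predict(X[:1]))   # predicted polarity class for the first example
print(reg.predict(X[:1]))   # predicted real-valued valence for the first example
```

The same feature representation can feed both models; only the target (a class label vs. a real value) changes.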

14 We have studied the influence of different information sources like the metaphor itself, the context in which it resides, and the source and target domains. [sent-24, score-0.632]

15 Sections 4 and 5 describe the polarity classification and valence prediction tasks for affect of metaphor-rich texts. [sent-30, score-0.951]

16 2 Related Work A substantial body of work has been done on determining the affect (sentiment analysis) of texts (Kim and Hovy, 2004; Strapparava and Mihalcea, 2007; Wiebe and Cardie, 2005; Yessenalina and Cardie, 2011; Breck et al. [sent-33, score-0.228]

17 Various tasks have been solved, among which polarity and valence identification are the most common. [sent-35, score-0.329]

18 While polarity identification aims at finding the Positive and Negative affect, valence is more challenging as it has to map the affect on a [−3, +3] scale depending on its intensity (Polanyi and Zaenen, 2004; Strapparava and Mihalcea, 2007). [sent-36, score-0.939]

19 Over the years researchers have developed various approaches to identify polarity of words (Esuli and Sebastiani, 2006), phrases (Turney, 2002; Wilson et al. [sent-37, score-0.318]

20 , 2005; Pang and Lee, 2008), but what is still missing is an affect analyzer for metaphor-rich texts. [sent-41, score-0.174]

21 While the affect of metaphors is well studied from its linguistic and psychological aspects (Blanchette et al. [sent-42, score-0.404]

22 , 2001 ; Tomlinson and Love, 2006; Crawdord, 2009), to our knowledge the building of computational models for polarity and valence identification in metaphor-rich texts is still a novel task (Smith et al. [sent-43, score-0.735]

23 Little (almost no) effort has been put into multilingual computational affect models of metaphor-rich texts. [sent-46, score-0.174]

24 3 Metaphors Although there are different views on metaphor in linguistics and philosophy (Black, 1962; Lakoff and Johnson, 1980; Gentner, 1983; Wilks, 2007), what is common among all approaches is the idea of an interconceptual mapping that underlies the production of metaphorical expressions. [sent-49, score-0.518]

25 There are two concepts or conceptual domains: the target (also called topic in the linguistics literature) and the source (or vehicle), and the existence of a link between them gives rise to metaphors. [sent-50, score-0.085]

26 As we mentioned before, there has been a lot of work on the automatic identification of metaphors (Wilks, 2007; Shutova et al. [sent-55, score-0.238]

27 Instead we focus on an equally interesting, challenging and important problem, which concerns the automatic identification of affect carried by metaphors. [sent-57, score-0.264]

28 To conduct our study, we use human annotators to collect metaphor-rich texts (Shutova and Teufel, 2010) and tag each metaphor with its corresponding polarity (Positive/Negative) and valence [−3, +3] scores. [sent-58, score-1.295]

29 The next sections describe the affect polarity and valence tasks we have defined, the collected and annotated metaphor-rich data for each one of the English, Spanish, Russian and Farsi languages, the conducted experiments and obtained results. [sent-59, score-0.292] [sent-60, score-0.377]

31 For instance, the metaphor “tough pill to swallow” has Negative polarity as it stands for something being hard to digest or comprehend, while the metaphor “values that gave our nation birth” has a Positive polarity as giving birth is like starting a new beginning. [sent-66, score-1.776]

32 Evaluation Measures: To evaluate the goodness of the polarity classification algorithms, we calculate the f-score and accuracy on 10-fold cross validation. [sent-71, score-0.372]
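The evaluation protocol in sentence 32 (f-score and accuracy on 10-fold cross validation) can be sketched as below. The data is synthetic and the classifier is an assumed stand-in; only the measurement procedure mirrors the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic stand-in data: 100 examples, labels learnable from feature 0.
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = (X[:, 0] > 0.5).astype(int)

# Accuracy and f-score, each averaged over 10 cross-validation folds.
scores = cross_validate(LogisticRegression(), X, y, cv=10,
                        scoring=("accuracy", "f1"))
print(round(scores["test_accuracy"].mean(), 3))
print(round(scores["test_f1"].mean(), 3))
```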

33 An annotation toolkit was developed specifically for the task of metaphor detection, interpretation and affect assignment. [sent-75, score-0.726]

34 The domain for which the metaphors were collected was Governance. [sent-77, score-0.201]

35 The metaphors were collected from political speeches, political websites, online newspapers among others (Mohler et al. [sent-79, score-0.249]

36 The annotation toolkit allowed annotators to provide for each metaphor the following information: the metaphor, the context in which the metaphor was found, the meaning of the metaphor in the source and target domains from the perspective of a native speaker. [sent-81, score-1.666]

37 The annotators tagged the Metaphor: values that gave our nation birth; and listed as Source: mother gave birth to baby; and Target: values of freedom and equality motivated the creation of America. [sent-83, score-0.248]

38 The same annotators also provided the affect associated with the metaphor. [sent-84, score-0.23]

39 In our study, the maximum length of a metaphor is a sentence, but typically it has the span of a phrase. [sent-90, score-0.518]

40 In our study, the source and target domains are provided by the human annotators, who agree on these definitions; however, the source and target can also be automatically generated by an interpretation system or a concept mapper. [sent-92, score-0.202]

41 The generation of source and target information is beyond the scope of this paper, but studying their impact on affect is important. [sent-93, score-0.23]

42 At the same time, we want to show how far one can get by using the metaphor itself and the context around it, when the technology for source/target detection and interpretation is not yet available. [sent-94, score-0.552]

43 In the experimental sections, we show how the individual information sources and their combination affect the resolution of the metaphor polarity and valence prediction tasks. [sent-96, score-1.294]

44 Table 1 shows the positive and negative class 684 distribution for each one of the four languages. [sent-97, score-0.108]

45 4 N-gram Evaluation and Results N-gram features are widely used in a variety of classification tasks, therefore we also use them in our polarity classification task. [sent-101, score-0.4]
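The n-gram feature extraction described in sentence 45 can be illustrated as below. The paper does not specify its tooling, so scikit-learn's CountVectorizer is an assumed stand-in, and the texts are the paper's own metaphor examples.

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = ["my lawyer is a shark",
         "the team captain is a rock",
         "time is money"]

# Unigram, bigram and trigram count features, one row per text.
vec = CountVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(texts)

print(X.shape[0])                        # 3 rows, one per text
print("lawyer is" in vec.vocabulary_)    # True: the bigram was extracted
```

The resulting sparse matrix can then be fed to any of the classifiers above.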

46 Figure 2 shows a study of the influence of the different information sources and their combination with n-gram features for English. [sent-104, score-0.11]

47 The results from this study show that for English, the more information sources one combines, the higher the classification accuracy becomes. [sent-120, score-0.133]

48 For instance, the LIWC category discrepancy contains words like should, could among others, while the LIWC category inhibition contains words like block, stop, constrain. [sent-131, score-0.117]

49 When LIWC analyzes texts it generates statistics like number of words found in category Ci divided by the total number of words in the text. [sent-136, score-0.09]

50 For our metaphor polarity task, we use LIWC’s statistics of all 64 categories and feed this information as features for the machine learning classifiers. [sent-137, score-0.858]

51 LIWC repository contains conceptual categories (dictionaries) both for the English and Spanish languages. [sent-138, score-0.101]

52 LIWC Evaluation and Results: In our experiments LIWC is applied to English and Spanish metaphor-rich texts since the LIWC category dictionaries are available for both languages. [sent-139, score-0.09]

53 52F390-score on 10-fold validation for English and Spanish The best performances are reached with individual information sources like metaphor, context, source or target instead of their combinations. [sent-145, score-0.109]

54 LIWC Category Relevance to Metaphor Polarity: We also study the importance and relevance of the LIWC categories for the metaphor polarity task. [sent-147, score-0.884]

55 We use information gain (IG) to measure the amount of information in bits about the polarity class prediction, if the only information available is the presence of a given LIWC category (feature) and the corresponding polarity class distribution. [sent-148, score-0.67]
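The information-gain ranking described in sentence 55 can be sketched with made-up counts: IG is the entropy of the polarity class distribution minus the entropy conditioned on whether the category (binary feature) is present.

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def info_gain(labels, feature):
    """IG in bits of a binary feature with respect to the class labels."""
    n = len(labels)
    h = entropy([labels.count(c) / n for c in set(labels)])
    cond = 0.0
    for v in (0, 1):  # feature absent / present
        subset = [c for c, f in zip(labels, feature) if f == v]
        if subset:
            cond += (len(subset) / n) * entropy(
                [subset.count(c) / len(subset) for c in set(subset)])
    return h - cond

labels = ["pos", "pos", "neg", "neg"]   # polarity classes (made up)
feature = [1, 1, 0, 0]                   # category presence predicts the class
print(info_gain(labels, feature))        # 1.0 bit
```

Categories are then ranked by this score, highest first.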

56 Figure 3 illustrates how certain categories occur more with the positive (in red color) vs negative (in green color) class. [sent-150, score-0.131]

57 With the positive metaphors we observe the LIWC categories for present tense, social, affect and family, while for the negative metaphors we see LIWC categories for past tense, inhibition and anger. [sent-151, score-0.8]

58 Our study shows that some of the LIWC categories are important across all information sources, but overall different triggers activate depending on the information source and the length of the text used. [sent-169, score-0.134]

59 We compare the performance of the algorithms with a majority baseline, which assigns the majority class to each example. [sent-173, score-0.097]
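The majority baseline of sentence 59 can be sketched in a few lines; the labels are invented.

```python
from collections import Counter

def majority_baseline(train_labels, test_size):
    """Assign the most frequent training class to every test example."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * test_size

preds = majority_baseline(["pos", "pos", "pos", "neg"], test_size=3)
print(preds)   # ['pos', 'pos', 'pos']
```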

60 Since the positive class is the predominant one for this language and dataset, a majority classifier reaches .59 accuracy in returning the positive class as an answer. [sent-175, score-0.111]

Figure 3: LIWC category relevance to Metaphor Polarity

61 In addition, we show in Figure 4 examples of the top LIWC categories according to IG ranking. [sent-178, score-0.159]

62 7 Lessons Learned To summarize, in this section we have defined the task of polarity classification and we have presented a machine learning solution. [sent-196, score-0.346]

63 Figure 6 shows an example of the valence prediction task, in which the metaphor-rich texts must be arranged by the intensity of the emotional state provoked by the texts. [sent-203, score-0.571]

64 For instance, −3 corresponds to very strong negativity, −2 to strong negativity, −1 to weak negativity (similarly for the positive classes). [sent-204, score-0.092]

65 In this task we also consider metaphors with neutral affect. [sent-205, score-0.201]

66 They are annotated with the 0 label and the prediction model should be able to predict such intensity as well. [sent-206, score-0.137]

67 For instance, the metaphor “values that gave our nation birth” is considered by American annotators as setting a new beginning (giving birth) and receives a positive score of +1, while “budget knife” is more positive (+3) since a tax cut is seen as more important. [sent-207, score-0.774]

68 As in any sentiment analysis task, affect assignment of metaphors is also subjective, and the produced annotations express the values, beliefs and understanding of the annotators. [sent-208, score-0.419]

69 2 Regression Model We model the valence task as a regression problem, in which for a given metaphor m, we seek to predict the valence v of m. [sent-210, score-1.299]

70 The objective is to learn a prediction model from a collection of N training examples {(m_i, v_i)}, i = 1..N, where m_i are the metaphor examples and v_i are their real-valued valence scores. [sent-212, score-0.547]
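The regression setup can be sketched as below. The feature vectors for the metaphors are hypothetical, and support vector regression is used as one possible learner, not necessarily the paper's exact choice.

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical 1-D feature vectors m_i and gold valence scores v_i in [-3, +3].
M = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
v = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])

# Fit a regressor on the training pairs {(m_i, v_i)}.
model = SVR(kernel="linear", C=10.0).fit(M, v)

# Predict a real-valued valence for a new metaphor representation.
print(round(float(model.predict([[3.0]])[0]), 1))
```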

71 Evaluation Measures: To evaluate the quality of the valence prediction model, we compare the actual valence score of the metaphor given by human annotators denoted with y against those valence scores predicted by the regression model denoted with x. [sent-219, score-0.079]

72 We estimate the goodness of the regression model by calculating both the correlation coefficient c_{x,y} = (n Σ x_i y_i − Σ x_i Σ y_i) / ( sqrt(n Σ x_i^2 − (Σ x_i)^2) · sqrt(n Σ y_i^2 − (Σ y_i)^2) ) and the mean squared error mse_{x,y} = (1/n) Σ_{i=1..n} (x_i − y_i)^2. [sent-220, score-0.103]

73 Intuitively the higher the correlation score is, the better the correlation between the actual and the predicted valence scores will be. [sent-222, score-0.352]

74 Similarly the smaller the mean squared error rate, the better the regression model fits the valence predictions to the actual score. [sent-223, score-0.458]
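The two evaluation measures (the correlation between the predicted scores x and the gold scores y, and the mean squared error) can be computed directly in NumPy; the score vectors below are invented.

```python
import numpy as np

x = np.array([2.0, -1.0, 0.0, 3.0, -2.0])   # predicted valence scores
y = np.array([2.0, -2.0, 1.0, 3.0, -3.0])   # annotator (gold) valence scores

n = len(x)
# Pearson-style correlation coefficient between x and y.
corr = (n * (x * y).sum() - x.sum() * y.sum()) / (
    np.sqrt(n * (x**2).sum() - x.sum()**2) *
    np.sqrt(n * (y**2).sum() - y.sum()**2))
# Mean squared error between predicted and gold scores.
mse = ((x - y) ** 2).sum() / n

print(round(float(corr), 3))
print(round(float(mse), 3))
```

Higher correlation and lower MSE indicate a better-fitting valence model.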

75 3 Data Annotation To conduct our valence prediction study, we used the same human annotators from the polarity classification task for each one of the English, Spanish, Russian and Farsi languages. [sent-225, score-0.833]

76 We asked the annotators to map each metaphor on a [−3, +3] scale depending on the intensity of the affect associated with the metaphor. [sent-226, score-0.632]

77 Table 4 shows the distribution (number of examples) for each valence class and for each language. [sent-227, score-0.377]

78 4 Empirical Evaluation and Results For each language and information source we built separate valence prediction regression models. [sent-229, score-0.538]

79 We used the same features for the regression task as we have used in the classification task. [sent-230, score-0.131]

80 The Farsi and Russian regression models are based only on n-gram features, while the English and Spanish regression models have both n-gram and LIWC features. [sent-233, score-0.154]

81 This means that the LIWC-based valence regression model approximates the human annotators' scores more closely. [sent-235, score-0.429]

82 The best valence prediction is obtained when LIWC is applied to the metaphor itself. [sent-236, score-0.949]

83 In Russian and Farsi the lowest MSE is when the combined metaphor, source and target information sources are used. [sent-238, score-0.109]

84 5 Lessons Learned To summarize, in this section we have defined the task of valence prediction of metaphor-rich texts and we have described a regression model for its solution. [sent-245, score-0.562]

85 6 Conclusion People use metaphor-rich language to express affect and often affect is expressed through the usage of metaphors. [sent-249, score-0.348]

86 Therefore, understanding that the metaphor “I was boiling inside when I saw him. [sent-250, score-0.518]

87 ” has Negative polarity as it conveys feeling of anger is very important for interpersonal or multicultural communications. [sent-251, score-0.326]

88 In this paper, we have introduced a novel corpus of metaphor-rich texts for the English, Spanish, Russian and Farsi languages, which was manually annotated with the polarity and valence scores of the affect conveyed by the metaphors. [sent-252, score-0.898]

89 We have studied individual information sources and their combination for English, Spanish, Russian and Farsi in order to understand how such information helps and impacts the interpretation of the affect associated with the metaphor. [sent-265, score-0.208]

90 We have conducted exhaustive evaluation with multiple machine learning classifiers and different features sets spanning from lexical information to psychological categories developed by (Tausczik and Pennebaker, 2010). [sent-266, score-0.179]

91 Through experiments carried out on the developed datasets, we showed that the proposed polarity classification and valence regression models significantly improve baselines (from 11. [sent-267, score-0.801]

92 From the two tasks, the valence prediction problem was more challenging both for the human annotators and the automated system. [sent-270, score-0.513]

93 The mean squared error in valence prediction in the range [−3, +3], where −3 indicates strong negative and +3 indicates strong positive affect, for English, Spanish and Russian was around 1. [sent-271, score-0.543]

94 In the future we are interested in studying the affect of metaphors for domains different than Governance. [sent-274, score-0.375]

95 We want to conduct studies with the help of social scientists, who would research whether the tagging of affect in metaphors depends on the political affiliation, age, gender or culture of the annotators. [sent-275, score-0.425]

96 Last but not least, we would like to improve the valence prediction models we built and to collect more data for Spanish, Russian and Farsi. [sent-276, score-0.454]

97 Adapting a polarity lexicon using integer linear programming for domain-specific sentiment classification. [sent-299, score-0.336]

98 Sentiment classification of movie and product reviews using contextual valence shifters. [sent-325, score-0.406]

99 Don’t worry about metaphor: affect extraction for conversational agents. [sent-412, score-0.174]

100 Specifying viewpoint and information need with affective metaphors: a system demonstration of the metaphor magnet web app/service. [sent-446, score-0.55]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('metaphor', 0.518), ('valence', 0.352), ('liwc', 0.348), ('polarity', 0.292), ('farsi', 0.25), ('metaphors', 0.201), ('affect', 0.174), ('russian', 0.165), ('spanish', 0.158), ('shutova', 0.084), ('prediction', 0.079), ('regression', 0.077), ('birth', 0.075), ('reyes', 0.069), ('lessons', 0.063), ('lawyer', 0.061), ('intensity', 0.058), ('tausczik', 0.056), ('annotators', 0.056), ('texts', 0.054), ('classification', 0.054), ('sources', 0.053), ('lakoff', 0.05), ('ekaterina', 0.05), ('positive', 0.05), ('categories', 0.048), ('mse', 0.048), ('inhibition', 0.045), ('wilks', 0.045), ('nation', 0.045), ('sentiment', 0.044), ('lcc', 0.042), ('negativity', 0.042), ('claire', 0.04), ('veale', 0.039), ('pennebaker', 0.038), ('identification', 0.037), ('majority', 0.036), ('gave', 0.036), ('category', 0.036), ('tense', 0.035), ('ig', 0.035), ('blanchette', 0.034), ('mohler', 0.034), ('quercia', 0.034), ('tomlinson', 0.034), ('interpretation', 0.034), ('anger', 0.034), ('negative', 0.033), ('cardie', 0.033), ('choi', 0.032), ('affective', 0.032), ('yejin', 0.032), ('wiebe', 0.032), ('influence', 0.031), ('unigrams', 0.031), ('swear', 0.03), ('triggers', 0.03), ('source', 0.03), ('strapparava', 0.029), ('tony', 0.029), ('mi', 0.029), ('english', 0.029), ('conceptual', 0.029), ('psychological', 0.029), ('squared', 0.029), ('attack', 0.029), ('janyce', 0.028), ('polanyi', 0.028), ('irony', 0.028), ('gonz', 0.028), ('emotional', 0.028), ('concerns', 0.027), ('conveyed', 0.026), ('battle', 0.026), ('goodness', 0.026), ('developed', 0.026), ('target', 0.026), ('challenging', 0.026), ('aaai', 0.026), ('classifiers', 0.026), ('social', 0.026), ('study', 0.026), ('exhaustive', 0.025), ('class', 0.025), ('rosso', 0.025), ('ez', 0.025), ('conducted', 0.025), ('repository', 0.024), ('defense', 0.024), ('ott', 0.024), ('sarcasm', 0.024), ('breck', 0.024), ('yessenalina', 0.024), ('political', 0.024), ('bernhard', 0.023), ('collect', 0.023), ('kennedy', 0.023), ('iarpa', 0.023), 
('adaboost', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.000001 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts

Author: Zornitsa Kozareva


2 0.41379282 116 acl-2013-Detecting Metaphor by Contextual Analogy

Author: Eirini Florou

Abstract: As one of the most challenging issues in NLP, metaphor identification and its interpretation have seen many models and methods proposed. This paper presents a study on metaphor identification based on the semantic similarity between literal and non literal meanings of words that can appear at the same context.

3 0.16769619 88 acl-2013-Computational considerations of comparisons and similes

Author: Vlad Niculae ; Victoria Yaneva

Abstract: This paper presents work in progress towards automatic recognition and classification of comparisons and similes. Among possible applications, we discuss the place of this task in text simplification for readers with Autism Spectrum Disorders (ASD), who are known to have deficits in comprehending figurative language. We propose an approach to comparison recognition through the use of syntactic patterns. Keeping in mind the requirements of autistic readers, we discuss the properties relevant for distinguishing semantic criteria like figurativeness and abstractness.

4 0.16729069 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky

Abstract: We study subjective language media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, highprecision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.

5 0.15998404 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li

Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.

6 0.14943795 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

7 0.10071521 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

8 0.090453699 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

9 0.08499179 96 acl-2013-Creating Similarity: Lateral Thinking for Vertical Similarity Judgments

10 0.083706297 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions

11 0.083474807 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction

12 0.081235252 310 acl-2013-Semantic Frames to Predict Stock Price Movement

13 0.079870515 63 acl-2013-Automatic detection of deception in child-produced speech using syntactic complexity features

14 0.076954909 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset

15 0.076317571 135 acl-2013-English-to-Russian MT evaluation campaign

16 0.075835221 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection

17 0.075067855 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

18 0.072152086 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts

19 0.068974622 318 acl-2013-Sentiment Relevance

20 0.06874989 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.154), (1, 0.162), (2, -0.007), (3, 0.071), (4, -0.072), (5, -0.095), (6, -0.03), (7, 0.064), (8, 0.069), (9, 0.075), (10, 0.041), (11, 0.022), (12, -0.117), (13, -0.028), (14, -0.02), (15, 0.011), (16, -0.0), (17, 0.016), (18, 0.056), (19, 0.037), (20, 0.044), (21, -0.026), (22, 0.039), (23, -0.054), (24, 0.056), (25, 0.015), (26, -0.175), (27, -0.007), (28, -0.061), (29, 0.064), (30, -0.003), (31, -0.258), (32, 0.023), (33, -0.005), (34, 0.115), (35, 0.178), (36, 0.194), (37, -0.101), (38, 0.133), (39, 0.028), (40, -0.04), (41, 0.175), (42, 0.064), (43, 0.185), (44, -0.06), (45, -0.078), (46, 0.016), (47, 0.141), (48, 0.181), (49, 0.092)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93382645 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts


2 0.80866283 88 acl-2013-Computational considerations of comparisons and similes


3 0.69918364 116 acl-2013-Detecting Metaphor by Contextual Analogy


4 0.43550637 96 acl-2013-Creating Similarity: Lateral Thinking for Vertical Similarity Judgments

Author: Tony Veale ; Guofu Li

Abstract: Just as observing is more than just seeing, comparing is far more than mere matching. It takes understanding, and even inventiveness, to discern a useful basis for judging two ideas as similar in a particular context, especially when our perspective is shaped by an act of linguistic creativity such as metaphor, simile or analogy. Structured resources such as WordNet offer a convenient hierarchical means for converging on a common ground for comparison, but offer little support for the divergent thinking that is needed to creatively view one concept as another. We describe such a means here, by showing how the web can be used to harvest many divergent views for many familiar ideas. These lateral views complement the vertical views of WordNet, and support a system for idea exploration called Thesaurus Rex. We show also how Thesaurus Rex supports a novel, generative similarity measure for WordNet.
This convenient perspective is poorly suited to creative or insightful comparisons, but it is sufficient for the many mundane comparisons we often perform in daily life, such as when we organize books or look for items in a supermarket. So if we do not know in which aisle to locate a given item (such as oatmeal), we may tacitly know how to locate a similar product (such as cornflakes) and orient ourselves accordingly. Yet there are occasions when the recognition of similarities spurs the creation of similarities, when the act of comparison spurs us to invent new ways of looking at an idea. By placing pop tarts in the breakfast aisle, food manufacturers encourage us to view them as a breakfast food that is not dissimilar to oatmeal or cornflakes. When ex-PM Tony Blair published his memoirs, a mischievous activist encouraged others to move his book from Biography to Fiction in bookshops, in the hope that buyers would see it in a new light. Whenever we use a novel metaphor to convey a non-obvious viewpoint on a topic, such as “cigarettes are time bombs”, the comparison may spur us to insight, to see aspects of the topic that make it more similar to the vehicle (see Ortony, 1979; Veale & Hao, 2007). In formal terms, assume agent A has an insight about concept X, and uses the metaphor X is a Y to also provoke this insight in agent B. To arrive at this insight for itself, B must intuit what X and Y have in common. But this commonality is surely more than a standard categorization of X, or else it would not count as an insight about X. To understand the metaphor, B must place X in a new category, so that X can be seen as more similar to Y. Metaphors shape the way we perceive the world by re-shaping the way we make similarity judgments. 
So if we want to imbue computers with the ability to make and to understand creative metaphors, we must first give them the ability to look beyond the narrow viewpoints of conventional resources. Any measure that models similarity as an objective function of a conventional worldview employs a convergent thought process. Using WordNet, for instance, a similarity measure can vertically converge on a common superordinate category of both inputs, and generate a single numeric result based on their distance to, and the information content of, this common generalization. So to find the most conventional ways of seeing a lexical concept, one simply ascends a narrowing concept hierarchy, using a process de Bono (1970) calls vertical thinking. To find novel, non-obvious and useful ways of looking at a lexical concept, one must use what Guilford (1967) calls divergent thinking and what de Bono calls lateral thinking. These processes cut across familiar category boundaries, to simultaneously place a concept in many different categories so that we can see it in many different ways. de Bono argues that vertical thinking is selective while lateral thinking is generative. Whereas vertical thinking concerns itself with the “right” way or a single “best” way of looking at things, lateral thinking focuses on producing alternatives to the status quo. To be as useful for creative tasks as they are for conventional tasks, we need to re-imagine our computational similarity measures as generative rather than selective, expansive rather than reductive, divergent as well as convergent and lateral as well as vertical. Though WordNet is ideally structured to support vertical, convergent reasoning, its comprehensive nature means it can also be used as a solid foundation for building a more lateral and divergent model of similarity. Here we will use the web as a source of diverse perspectives on familiar ideas, to complement the conventional and often narrow views codified by WordNet. 
Section 2 provides a brief overview of past work in the area of similarity measurement, before section 3 describes a simple bootstrapping loop for acquiring richly diverse perspectives from the web for a wide variety of familiar ideas. These perspectives are used to enhance a WordNet-based measure of lexico-semantic similarity in section 4, by broadening the range of informative viewpoints the measure can select from. Similarity is thus modeled as a process that is both generative and selective. This lateral-and-vertical approach is evaluated in section 5, on the Miller & Charles (1991) data-set. A web app for the lateral exploration of diverse viewpoints, named Thesaurus Rex, is also presented, before closing remarks are offered in section 6. 2 Related Work and Ideas WordNet’s taxonomic organization of noun-senses and verb-senses – in which very general categories are successively divided into increasingly informative sub-categories or instance-level ideas – allows us to gauge the overlap in information content, and thus of meaning, of two lexical concepts. We need only identify the deepest point in the taxonomy at which this content starts to diverge. This point of divergence is often called the LCS, or least common subsumer, of two concepts (Pederson et al., 2004). Since sub-categories add new properties to those they inherit from their parents – Aristotle called these properties the differentia that stop a category system from trivially collapsing into itself – the depth of a lexical concept in a taxonomy is an intuitive proxy for its information content. Wu & Palmer (1994) use the depth of a lexical concept in the WordNet hierarchy as such a proxy, and thereby estimate the similarity of two lexical concepts as twice the depth of their LCS divided by the sum of their individual depths. Leacock and Chodorow (1998) instead use the length of the shortest path between two concepts as a proxy for the conceptual distance between them. 
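The Wu & Palmer formula described above can be sketched on a toy IS-A taxonomy. This is a minimal illustration, not the papers' code: the taxonomy and concept names are invented for the example.

```python
# Toy IS-A taxonomy: each concept maps to its single parent (root maps to None).
TOY_PARENTS = {
    "entity": None,
    "animal": "entity",
    "canine": "animal",
    "dog": "canine",
    "wolf": "canine",
    "feline": "animal",
    "cat": "feline",
}

def ancestors(concept):
    """Return the path from a concept up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = TOY_PARENTS[concept]
    return path

def depth(concept):
    return len(ancestors(concept))  # the root has depth 1

def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    shared = set(ancestors(a)) & set(ancestors(b))
    return max(shared, key=depth)

def wu_palmer(a, b):
    """Wu & Palmer (1994): sim = 2 * depth(LCS) / (depth(a) + depth(b))."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))

print(wu_palmer("dog", "wolf"))  # 0.75 — LCS is canine, depth 3
print(wu_palmer("dog", "cat"))   # 0.5  — LCS is animal, depth 2
```

The measure rewards pairs whose shared generalization sits deep in the hierarchy: dog/wolf converge at canine, while dog/cat only converge at the shallower animal.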
To connect any two ideas in a hierarchical system, one must vertically ascend the hierarchy from one concept, change direction at a potential LCS, and then descend the hierarchy to reach the second concept. (Aristotle was also first to suggest this approach in his Poetics). Leacock and Chodorow normalize the length of this path by dividing its size (in nodes) by twice the depth of the deepest concept in the hierarchy; the latter is an upper bound on the distance between any two concepts in the hierarchy. Negating the log of this normalized length yields a corresponding similarity score. While the role of an LCS is merely implied in Leacock and Chodorow’s use of a shortest path, the LCS is pivotal nonetheless, and like that of Wu & Palmer, the approach uses an essentially vertical reasoning process to identify a single “best” generalization. Depth is a convenient proxy for information content, but more nuanced proxies can yield more rounded similarity measures. Resnick (1995) draws on information theory to define the information content of a lexical concept as the negative log likelihood of its occurrence in a corpus, either explicitly (via a direct mention) or by presupposition (via a mention of any of its sub-categories or instances). Since the likelihood of a general category occurring in a corpus is higher than that of any of its sub-categories or instances, such categories are more predictable, and less informative, than rarer categories whose occurrences are less predictable and thus more informative. The negative log likelihood of the most informative LCS of two lexical concepts offers a reliable estimate of the amount of information shared by those concepts, and thus a good estimate of their similarity. 
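The Resnick-style notion of information content can be sketched with toy corpus counts. This is an invented mini-example, not the paper's data: a category's occurrence count includes the counts of all its descendants, reflecting mention by presupposition.

```python
import math

# Toy taxonomy (child -> parent) and invented direct mention counts.
PARENTS = {"entity": None, "animal": "entity", "plant": "entity",
           "dog": "animal", "cat": "animal"}
DIRECT_COUNTS = {"entity": 0, "animal": 10, "plant": 40, "dog": 60, "cat": 30}

def subsumed_count(concept):
    """Occurrences of a concept plus all of its descendants."""
    total = DIRECT_COUNTS[concept]
    for child, parent in PARENTS.items():
        if parent == concept:
            total += subsumed_count(child)
    return total

TOTAL = subsumed_count("entity")  # every mention is a mention of the root

def information_content(concept):
    """IC(c) = -log p(c), written as log(N / count) to keep the sign tidy."""
    return math.log(TOTAL / subsumed_count(concept))

print(information_content("entity"))  # 0.0 — the root is maximally predictable
print(information_content("dog"))    # higher: rarer, so more informative
```

Because counts percolate upward, general categories like animal are highly probable and carry little information, while leaf concepts like dog are rarer and more informative — exactly the asymmetry the abstract describes.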
Lin (1998) combines the intuitions behind Resnick’s metric and that of Wu and Palmer to estimate the similarity of two lexical concepts as an information ratio: twice the information content of their LCS divided by the sum of their individual information contents. Jiang and Conrath (1997) consider the converse notion of dissimilarity, noting that two lexical concepts are dissimilar to the extent that each contains information that is not shared by the other. So if the information content of their most informative LCS is a good measure of what they do share, then the sum of their individual information contents, minus twice the content of their most informative LCS, is a reliable estimate of their dissimilarity. Seco et al. (2006) present a minor innovation, showing how Resnick’s notion of information content can be calculated without the use of an external corpus. Rather, when using Resnick’s metric (or that of Lin, or Jiang and Conrath) for measuring the similarity of lexical concepts in WordNet, one can use the category structure of WordNet itself to estimate information content. Typically, the more general a concept, the more descendants it will possess. Seco et al. thus estimate the information content of a lexical concept as the log of the sum of all its unique descendants (both direct and indirect), divided by the log of the total number of concepts in the entire hierarchy. Not only is this intrinsic view of information content convenient to use, without recourse to an external corpus, Seco et al. show that it offers a better estimate of information content than its extrinsic, corpus-based alternatives, as measured relative to average human similarity ratings for the 30 word-pairs in the Miller & Charles (1991) test set. A similarity measure can draw on other sources of information besides WordNet’s category structures. 
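The Lin and Jiang–Conrath formulas can be sketched directly from precomputed information-content values. The IC numbers here are invented placeholders, not values from any real corpus or taxonomy:

```python
def lin(ic_a, ic_b, ic_lcs):
    """Lin (1998): ratio of shared information to total individual information."""
    return 2 * ic_lcs / (ic_a + ic_b)

def jiang_conrath_distance(ic_a, ic_b, ic_lcs):
    """Jiang & Conrath (1997): the information NOT shared by the two concepts."""
    return ic_a + ic_b - 2 * ic_lcs

# Two concepts with a fairly informative LCS (invented IC values):
print(lin(5.0, 4.0, 3.0))                     # ~0.667
print(jiang_conrath_distance(5.0, 4.0, 3.0))  # 3.0
```

The two metrics are two views of the same quantities: Lin's score approaches 1 as the LCS accounts for nearly all of each concept's information, while Jiang–Conrath's distance approaches 0 in the same limit.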
One might eke out additional information from WordNet’s textual glosses, as in Lesk (1986), or use category structures other than those offered by WordNet. Looking beyond WordNet, entries in the online encyclopedia Wikipedia are not only connected by a dense topology of lateral links, they are also organized by a rich hierarchy of overlapping categories. Strube and Ponzetto (2006) show how Wikipedia can support a measure of similarity (and relatedness) that better approximates human judgments than many WordNet-based measures. Nonetheless, WordNet can be a valuable component of a hybrid measure, and Agirre et al. (2009) use an SVM (support vector machine) to combine information from WordNet with information harvested from the web. Their best similarity measure achieves a remarkable 0.93 correlation with human judgments on the Miller & Charles word-pair set. Similarity is not always applied to pairs of concepts; it is sometimes analogically applied to pairs of pairs of concepts, as in proportional analogies of the form A is to B as C is to D (e.g., hacks are to writers as mercenaries are to soldiers, or chisels are to sculptors as scalpels are to surgeons). In such analogies, one is really assessing the similarity of the unstated relationship between each pair of concepts: thus, mercenaries are soldiers whose allegiance is paid for, much as hacks are writers with income-driven loyalties; sculptors use chisels to carve stone, while surgeons use scalpels to cut or carve flesh. Veale (2004) used WordNet to assess the similarity of A:B to C:D as a function of the combined similarity of A to C and of B to D. In contrast, Turney (2005) used the web to pursue a more divergent course, to represent the tacit relationships of A to B and of C to D as points in a high-dimensional space. The dimensions of this space initially correspond to linking phrases on the web, before these dimensions are significantly reduced using singular value decomposition. 
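The idea of representing a word pair's tacit relation as a vector over linking phrases can be sketched as below, in the spirit of Turney (2005). The phrase counts are invented, and the SVD dimensionality-reduction step is omitted for brevity:

```python
import math

# Invented counts of "A <phrase> B" patterns, standing in for web evidence.
RELATION_VECTORS = {
    ("mercenary", "soldier"): {"is a paid": 40, "works as a": 25, "uses a": 1},
    ("hack", "writer"): {"is a paid": 35, "works as a": 30, "uses a": 2},
    ("chisel", "sculptor"): {"is a paid": 1, "works as a": 2, "uses a": 50},
}

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (dicts)."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm

# hack:writer :: mercenary:soldier holds if those relation vectors are
# closer to each other than to an unrelated pair like chisel:sculptor.
same = cosine(RELATION_VECTORS[("hack", "writer")],
              RELATION_VECTORS[("mercenary", "soldier")])
diff = cosine(RELATION_VECTORS[("hack", "writer")],
              RELATION_VECTORS[("chisel", "sculptor")])
print(same > diff)  # True
```

The point of the sketch is the representational move: the analogy is scored by comparing relations (distributions over linking phrases), not by comparing the concepts themselves.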
In the infamous SAT test, an analogy A:B::C:D has four other pairs of concepts that serve as likely distractors (e.g. singer:songwriter for hack:writer) and the goal is to choose the most appropriate C:D pair for a given A:B pairing. Using variants of Wu and Palmer (1994) on the 374 SAT analogies of Turney (2005), Veale (2004) reports a success rate of 38–44% using only WordNet-based similarity. In contrast, Turney (2005) reports up to 55% success on the same analogies, partly because his approach aims to match implicit relations rather than explicit concepts, and in part because it uses a divergent process to gather from the web as rich a perspective as it can on these latent relationships. 2.1 Clever Comparisons Create Similarity Each of these approaches to similarity is a user of information, rather than a creator, and each fails to capture how a creative comparison (such as a metaphor) can spur a listener to view a topic from an atypical perspective. Camac & Glucksberg (1984) provide experimental evidence for the claim that “metaphors do not use preexisting associations to achieve their effects […] people use metaphors to create new relations between concepts.” They also offer a salutary reminder of an often overlooked fact: every comparison exploits information, but each is also a source of new information in its own right. Thus, “this cola is acid” reveals a different perspective on cola (e.g. as a corrosive substance or an irritating food) than “this acid is cola” highlights for acid (e.g., as a familiar substance). Veale & Keane (1994) model the role of similarity in realizing the long-term perlocutionary effect of an informative comparison. For example, to compare surgeons to butchers is to encourage one to see all surgeons as more bloody, crude or careless. The reverse comparison, of butchers to surgeons, encourages one to see butchers as more skilled and precise. 
Veale & Keane present a network model of memory, called Sapper, in which activation can spread between related concepts, thus allowing one concept to prime the properties of a neighbor. To interpret an analogy, Sapper lays down new activation-carrying bridges in memory between analogical counterparts, such as between surgeon & butcher, flesh & meat, and scalpel & cleaver. Comparisons can thus have lasting effects on how Sapper sees the world, changing the pattern of activation that arises when it primes a concept. Veale (2003) adopts a similarly dynamic view of similarity in WordNet, showing how an analogical comparison can result in the automatic addition of new categories and relations to WordNet itself. Veale considers the problem of finding an analogical mapping between different parts of WordNet’s noun-sense hierarchy, such as between instances of Greek god and Norse god, or between the letters of different alphabets, such as of Greek and Hebrew. But no structural similarity measure for WordNet exhibits enough discernment to e.g. assign a higher similarity to Zeus & Odin (each is the supreme deity of its pantheon) than to a pairing of Zeus and any other Norse god, just as no structural measure will assign a higher similarity to Alpha & Aleph or to Beta & Beth than to any random letter pairing. A fine-grained category hierarchy permits fine-grained similarity judgments, and though WordNet is useful, its sense hierarchies are not especially fine-grained. However, we can automatically make WordNet subtler and more discerning, by adding new fine-grained categories to unite lexical concepts whose similarity is not reflected by any existing categories. Veale (2003) shows how a property that is found in the glosses of two lexical concepts, of the same depth, can be combined with their LCS to yield a new fine-grained parent category, so e.g. “supreme” + deity = Supreme-deity (for Odin, Zeus, Jupiter, etc.) and “1st” + letter = 1st-letter (for Alpha, Aleph, etc.) 
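The category-coining step just described — combining a shared gloss property with the LCS to mint a new fine-grained parent — can be sketched as follows. The gloss-property sets are invented stand-ins for properties mined from real WordNet glosses:

```python
# Invented gloss properties for two same-depth lexical concepts.
GLOSS_PROPERTIES = {
    "Zeus": {"supreme", "Greek"},
    "Odin": {"supreme", "Norse"},
}
LCS = "deity"  # their least common subsumer in the toy setup

def coin_categories(a, b, lcs):
    """Coin a new parent category like 'supreme-deity' per shared property."""
    shared = GLOSS_PROPERTIES[a] & GLOSS_PROPERTIES[b]
    return {f"{prop.lower()}-{lcs}" for prop in shared}

print(coin_categories("Zeus", "Odin", LCS))  # {'supreme-deity'}
```

Once such a category exists, a depth-based measure like Wu & Palmer automatically scores Zeus & Odin higher than Zeus paired with an arbitrary Norse god, since their new LCS sits one level deeper.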
Selected aspects of the textual similarity of two WordNet glosses – the key to similarity in Lesk (1986) – can thus be reified into an explicitly categorical WordNet form. 3 Divergent (Re)Categorization To tap into a richer source of concept properties than WordNet’s glosses, we can use web n-grams. Consider these descriptions of a cowboy from the Google n-grams (Brants & Franz, 2006). The numbers to the right are Google frequency counts.

a lonesome cowboy 432
a mounted cowboy 122
a grizzled cowboy 74
a swaggering cowboy 68

To find the stable properties that can underpin a meaningful fine-grained category for cowboy, we must seek out the properties that are so often presupposed to be salient of all cowboys that one can use them to anchor a simile, such as
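The harvesting of candidate properties from "a ADJ cowboy" n-gram patterns can be sketched as below. The frequency table reproduces the counts quoted in the text; the salience cutoff `min_count` is an invented parameter, not one from the paper:

```python
# Web n-gram frequencies for "a ADJ cowboy" patterns (counts from the text).
NGRAM_COUNTS = {
    "a lonesome cowboy": 432,
    "a mounted cowboy": 122,
    "a grizzled cowboy": 74,
    "a swaggering cowboy": 68,
}

def harvest_properties(noun, min_count=50):
    """Extract adjectives from 'a ADJ <noun>' n-grams above a frequency cutoff."""
    props = {}
    for ngram, count in NGRAM_COUNTS.items():
        words = ngram.split()
        if len(words) == 3 and words[0] == "a" and words[2] == noun:
            if count >= min_count:
                props[words[1]] = count
    return props

print(harvest_properties("cowboy"))
# {'lonesome': 432, 'mounted': 122, 'grizzled': 74, 'swaggering': 68}
```

Frequently attested modifiers like "lonesome" are candidates for the stereotypical properties that can anchor a simile or a new fine-grained category; a higher cutoff keeps only the most strongly presupposed ones.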

5 0.43446785 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong

Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification.
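The polarity-reversion step at the heart of DTDP can be sketched as below. This is a simplified illustration of the idea, not the authors' implementation: the antonym lexicon is a tiny invented stand-in, and the label encoding (1 = positive, 0 = negative) is assumed:

```python
# Tiny invented antonym lexicon standing in for a real sentiment resource.
ANTONYMS = {"good": "bad", "bad": "good", "great": "terrible", "terrible": "great"}

def reverse_polarity(tokens, label):
    """Build the artificial polarity-opposite sample used for dual training:
    swap sentiment words for antonyms and flip the class label."""
    flipped = [ANTONYMS.get(tok, tok) for tok in tokens]
    return flipped, 1 - label

original = (["the", "movie", "was", "great"], 1)
opposite = reverse_polarity(*original)
print(opposite)  # (['the', 'movie', 'was', 'terrible'], 0)
```

Training on both the original and the reversed sample gives the classifier explicit evidence about how polarity shifts, which a plain BOW model cannot capture.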

6 0.43068561 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

7 0.41450211 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

8 0.39945787 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection

9 0.39299652 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

10 0.38773075 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

11 0.36584836 302 acl-2013-Robust Automated Natural Language Processing with Multiword Expressions and Collocations

12 0.36072594 278 acl-2013-Patient Experience in Online Support Forums: Modeling Interpersonal Interactions and Medication Use

13 0.32153901 61 acl-2013-Automatic Interpretation of the English Possessive

14 0.30845124 318 acl-2013-Sentiment Relevance

15 0.30613056 30 acl-2013-A computational approach to politeness with application to social factors

16 0.30108789 310 acl-2013-Semantic Frames to Predict Stock Price Movement

17 0.29840273 49 acl-2013-An annotated corpus of quoted opinions in news articles

18 0.29817811 53 acl-2013-Annotation of regular polysemy and underspecification

19 0.29450807 42 acl-2013-Aid is Out There: Looking for Help from Tweets during a Large Scale Disaster

20 0.28617013 236 acl-2013-Mapping Source to Target Strings without Alignment by Analogical Learning: A Case Study with Transliteration


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.073), (6, 0.022), (7, 0.249), (11, 0.056), (15, 0.022), (24, 0.065), (26, 0.062), (28, 0.013), (35, 0.07), (42, 0.052), (48, 0.036), (70, 0.032), (88, 0.068), (90, 0.031), (95, 0.045)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.801292 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts

Author: Zornitsa Kozareva

Abstract: Metaphor is an important way of conveying the affect of people, hence understanding how people use metaphors to convey affect is important for the communication between individuals and increases cohesion if the perceived affect of the concrete example is the same for the two individuals. Therefore, building computational models that can automatically identify the affect in metaphor-rich texts like “The team captain is a rock.”, “Time is money.”, “My lawyer is a shark.” is an important challenging problem, which has been of great interest to the research community. To solve this task, we have collected and manually annotated the affect of metaphor-rich texts for four languages. We present novel algorithms that integrate triggers for cognitive, affective, perceptual and social processes with stylistic and lexical information. By running evaluations on datasets in English, Spanish, Russian and Farsi, we show that the developed affect polarity and valence prediction technology of metaphor-rich texts is portable and works equally well for different languages.

2 0.75466573 346 acl-2013-The Impact of Topic Bias on Quality Flaw Prediction in Wikipedia

Author: Oliver Ferschke ; Iryna Gurevych ; Marc Rittberger

Abstract: With the increasing amount of user-generated reference texts in the web, automatic quality assessment has become a key challenge. However, only a small amount of annotated data is available for training quality assessment systems. Wikipedia contains a large amount of texts annotated with cleanup templates which identify quality flaws. We show that the distribution of these labels is topically biased, since they cannot be applied freely to any arbitrary article. We argue that it is necessary to consider the topical restrictions of each label in order to avoid a sampling bias that results in a skewed classifier and overly optimistic evaluation results. We factor out the topic bias by extracting reliable training instances from the revision history which have a topic distribution similar to the labeled articles. This approach better reflects the situation a classifier would face in a real-life application.
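The topic-bias control described here — keeping only candidate training instances whose topic distribution resembles that of the labeled articles — can be sketched as below. The topic vectors, revision IDs, and the similarity threshold are all invented for the example:

```python
import math

def cosine(u, v):
    """Cosine similarity of two dense topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Invented topic mix of the flaw-labeled articles (3 topics).
labeled_topic_profile = [0.6, 0.3, 0.1]

# Invented candidate instances mined from the revision history.
candidates = {
    "rev_101": [0.55, 0.35, 0.10],  # topically similar -> keep
    "rev_102": [0.05, 0.10, 0.85],  # topically distant -> drop
}

def filter_by_topic(candidates, profile, threshold=0.9):
    """Keep candidates whose topic distribution is close to the labeled set."""
    return [rid for rid, vec in candidates.items()
            if cosine(vec, profile) >= threshold]

print(filter_by_topic(candidates, labeled_topic_profile))  # ['rev_101']
```

Filtering this way keeps the unlabeled training pool topically matched to the labeled pool, so the classifier learns flaw cues rather than topic cues.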

3 0.73550487 153 acl-2013-Extracting Events with Informal Temporal References in Personal Histories in Online Communities

Author: Miaomiao Wen ; Zeyu Zheng ; Hyeju Jang ; Guang Xiang ; Carolyn Penstein Rose

Abstract: We present a system for extracting the dates of illness events (year and month of the event occurrence) from posting histories in the context of an online medical support community. A temporal tagger retrieves and normalizes dates mentioned informally in social media to actual month and year referents. Building on this, an event date extraction system learns to integrate the likelihood of candidate dates extracted from time-rich sentences with temporal constraints extracted from eventrelated sentences. Our integrated model achieves 89.7% of the maximum performance given the performance of the temporal expression retrieval step.

4 0.69957727 259 acl-2013-Non-Monotonic Sentence Alignment via Semisupervised Learning

Author: Xiaojun Quan ; Chunyu Kit ; Yan Song

Abstract: This paper studies the problem of nonmonotonic sentence alignment, motivated by the observation that coupled sentences in real bitexts do not necessarily occur monotonically, and proposes a semisupervised learning approach based on two assumptions: (1) sentences with high affinity in one language tend to have their counterparts with similar relatedness in the other; and (2) initial alignment is readily available with existing alignment techniques. They are incorporated as two constraints into a semisupervised learning framework for optimization to produce a globally optimal solution. The evaluation with real-world legal data from a comprehensive legislation corpus shows that while existing alignment algorithms suffer severely from non-monotonicity, this approach can work effectively on both monotonic and non-monotonic data.

5 0.60048234 264 acl-2013-Online Relative Margin Maximization for Statistical Machine Translation

Author: Vladimir Eidelman ; Yuval Marton ; Philip Resnik

Abstract: Recent advances in large-margin learning have shown that better generalization can be achieved by incorporating higher order information into the optimization, such as the spread of the data. However, these solutions are impractical in complex structured prediction problems such as statistical machine translation. We present an online gradient-based algorithm for relative margin maximization, which bounds the spread of the projected data while maximizing the margin. We evaluate our optimizer on Chinese-English and Arabic-English translation tasks, each with small and large feature sets, and show that our learner is able to achieve significant improvements of 1.2-2 BLEU and 1.7-4.3 TER on average over state-of-the-art optimizers with the large feature set.

6 0.56275064 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

7 0.56156456 318 acl-2013-Sentiment Relevance

8 0.5557847 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

9 0.55443656 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

10 0.5533725 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

11 0.55255634 183 acl-2013-ICARUS - An Extensible Graphical Search Tool for Dependency Treebanks

12 0.552212 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages

13 0.55198425 225 acl-2013-Learning to Order Natural Language Texts

14 0.54658151 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis

15 0.54532743 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions

16 0.54479599 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study

17 0.54397255 233 acl-2013-Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media

18 0.54330403 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

19 0.54233533 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

20 0.54186469 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation