acl acl2013 acl2013-91 knowledge-graph by maker-knowledge-mining

91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning


Source: pdf

Author: Song Feng ; Jun Seok Kang ; Polina Kuznetsova ; Yejin Choi

Abstract: Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond the denotative or surface meaning of text, as seemingly objective statements often allude to nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence”, “human”, and “cheesecake”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence”, “human”, and “cheesecake”. [sent-4, score-0.446]

2 We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon. [sent-5, score-0.785]

3 1 Introduction There has been a substantial body of research in sentiment analysis over the last decade (Pang and Lee, 2008), where a considerable amount of work has focused on recognizing sentiment that is generally explicit and pronounced rather than implied and subdued. [sent-6, score-0.33]

4 However in many real-world texts, even seemingly objective statements can be opinion-laden in that they often allude to nuanced sentiment of the writer (Greene and Resnik, 2009), or purposefully conjure emotion from the readers’ minds (Mohammad and Turney, 2010). [sent-7, score-0.459]

5 Although some researchers have explored formal and statistical treatments of those implicit and implied sentiments (e. [sent-8, score-0.098]

6 In this paper, we concentrate on understanding the connotative sentiments of words, as they play an important role in interpreting subtle shades of sentiment beyond denotative or surface meaning of text. [sent-13, score-0.768]

7 Although this sentence could be considered a factual statement from a general standpoint, its subtle effect may not be entirely objective: this sentence is likely to have an influence on readers’ minds in regard to their opinion toward “geothermal”. [sent-15, score-0.125]

8 In order to sense the subtle overtone of sentiments, one needs to know that the word “emissions ” has generally negative connotation, which geothermal reduces. [sent-16, score-0.249]

9 The main contribution of this paper is a broad-coverage connotation lexicon that determines the connotative polarity of even those words with ever so subtle connotation beneath their surface meaning, such as “Literature”, “Mediterranean”, and “wine”. [sent-18, score-1.847]

10 Although there has been a number of previous works that constructed sentiment lexicons (e. [sent-19, score-0.199]

11 (2009)), which seem to be increasingly and inevitably expanding over words with (strongly) connotative sentiments rather than explicit sentiments alone (e. [sent-23, score-0.498]

12 , “gun”), little prior work has directly tackled this problem of learning connotation,2 and much of the subtle connotation of many seemingly objective words is yet to be determined. [sent-25, score-0.76]

13 1Our learned lexicon correctly assigns negative polarity to emission. [sent-26, score-0.324]

14 A central premise of our approach is that it is the collocational statistics of words that affect and shape the polarity of connotation. [sent-31, score-0.187]

15 It is important to clarify, however, that we do not simply assume that words that collocate share the same polarity of connotation. [sent-33, score-0.156]

16 Although such an assumption played a key role in previous work for the analogous task of learning sentiment lexicon (Velikovich et al. [sent-34, score-0.239]

17 , 2010), we expect that the same assumption would be less reliable in drawing subtle connotative sentiments of words. [sent-35, score-0.49]

18 As one example, the predicate “cure”, which has a positive connotation, typically takes arguments with negative connotation, e. [sent-36, score-0.783]

19 We cast the connotation lexicon induction task as a collective inference problem, and consider approaches based on three distinct types of algorithmic frameworks that have been shown successful for conventional sentiment lexicon induction: random walk based on HITS/PageRank (e. [sent-42, score-0.971]

20 In this work, we assume the general connotation of each word over statistically prevailing senses, leaving a more cautious handling of WSD as future work. [sent-54, score-0.58]

21 We provide comparative empirical results over several variants of these approaches with comprehensive evaluations including lexicon-based, human judgments, and extrinsic evaluations. [sent-60, score-0.065]

22 It is worthwhile to note that not all words have connotative meanings that are distinct from denotational meanings, and in some cases, it can be difficult to determine whether the overall sentiment is drawn from denotational or connotative meanings exclusively, or both. [sent-61, score-0.935]

23 Therefore, we encompass any sentiment from either type of meaning into the lexicon, where a non-neutral polarity prevails over the neutral one if some meanings lead to neutral while others lead to non-neutral. [sent-62, score-0.511]
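The aggregation rule in the sentence above (non-neutral prevails over neutral across a word's meanings) can be sketched in a few lines of Python. This is a minimal sketch; the tie-break for the rare positive/negative conflicts (cf. footnote 4 below) is an assumption, not the paper's stated policy.

    def aggregate_polarity(sense_polarities):
        """Collapse per-sense labels ('positive'/'negative'/'neutral')
        into one word-level label: non-neutral prevails over neutral."""
        polar = [p for p in sense_polarities if p != "neutral"]
        if not polar:
            return "neutral"
        # Assumed tie-break for conflicting polar senses: majority label.
        return max(set(polar), key=polar.count)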

24 Our work results in the first broad-coverage connotation lexicon,5 significantly improving both the coverage and the precision of Feng et al. [sent-63, score-0.617]

25 As an interesting by-product, our algorithm can also be used as a proxy to measure the general connotation of real-world named entities based on their collocational statistics. [sent-65, score-0.611]

26 In §2 we describe three types of induction algorithms, followed by evaluations in §3. [sent-68, score-0.078]

27 Then we revisit the induction algorithms based on the results in §3. [sent-69, score-0.039]

28 4In general, polysemous words do not seem to have conflicting non-neutral polarities over different senses, though there are many exceptions, e. [sent-74, score-0.087]

29 We treat each word in each part-of-speech as a separate word to reduce such cases, and otherwise aim to learn the most prevalent polarity in the corpus with respect to each part-of-speech of each word. [sent-77, score-0.156]

30 [Figure 1: Graph for Graph Propagation (§2.2); sub-graph labels: pred-arg, Arg-Arg, distr sim] [sent-82, score-0.038]

31 2 Connotation Induction Algorithms: We develop induction algorithms based on three distinct types of algorithmic frameworks that have been shown successful for the analogous task of sentiment lexicon induction: HITS & PageRank (§2. [sent-84, score-0.356]

32 , 1999) to induce the general connotation of words hinging on the linguistic phenomena of selectional preference and semantic prosody, i. [sent-96, score-0.58]

33 For example, the object of the connotative predicate “cure” is likely to have negative connotation, e. [sent-99, score-0.536]

34 The bipartite graph structure for this approach corresponds to the left-most box (labeled as “pred-arg”) in Figure 1. [sent-102, score-0.094]
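A hedged sketch of this step: seed predicates carry +1/-1 polarity, and each argument inherits the weight-normalized average polarity of the predicates that select it. This illustrates the spirit of the HITS/PageRank-style pass over the bipartite pred-arg graph, not the paper's exact update rule; non-negative edge weights are assumed.

    import numpy as np

    def propagate_pred_to_arg(W, pred_polarity):
        """One pass over the bipartite pred-arg graph.

        W: (num_predicates x num_arguments) matrix of non-negative edge
        weights; pred_polarity: numpy array of +1/-1 seed polarities."""
        col_norm = W.sum(axis=0).astype(float)
        col_norm[col_norm == 0] = 1.0  # avoid division by zero
        return (W * pred_polarity[:, None]).sum(axis=0) / col_norm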

35 2.2 Label Propagation: With the goal of obtaining a broad-coverage lexicon in mind, we find that relying only on the structure of semantic prosody is limiting, due to the relatively small set of connotative predicates available. [sent-104, score-0.577]

36 6 Therefore, we extend the graph structure as an overlay of two sub-graphs (Figure 1) as described below: 6For connotative predicates, we use the seed predicate set of Feng et al. [sent-105, score-0.573]

37 (2011), which comprises 20 positive and 20 negative predicates. [sent-106, score-0.157]

38 [Figure 2: Graph for ILP/LP (§2.3); edge labels: distr sim, antonyms] [sent-107, score-0.082]

39 Sub-graph #1: Predicate–Argument Graph: This sub-graph is the bipartite graph that encodes the selectional preference of connotative predicates over their arguments. [sent-110, score-0.44]

40 In this graph, connotative predicates p reside on one side of the graph and their co-occurring arguments a reside on the other side, based on the Google Web 1T corpus. [sent-111, score-0.596]

41 The weight on the edge between a predicate p and an argument a is defined using Point-wise Mutual Information (PMI) as follows: w(p → a) := PMI(p, a) = log2 [ P(p, a) / (P(p) P(a)) ]. PMI scores have been widely used in previous studies to measure association between words (e. [sent-112, score-0.044]
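A small sketch of computing these PMI edge weights from co-occurrence counts (function names and toy counts are illustrative; the paper derives counts from verb-object pairs in the Google Web 1T corpus):

    import math
    from collections import Counter

    def pmi_edge_weights(pair_counts, total):
        """w(p -> a) = log2( P(p, a) / (P(p) * P(a)) ), estimated from counts."""
        pred_counts, arg_counts = Counter(), Counter()
        for (p, a), c in pair_counts.items():
            pred_counts[p] += c
            arg_counts[a] += c
        return {
            (p, a): math.log2((c / total) /
                              ((pred_counts[p] / total) * (arg_counts[a] / total)))
            for (p, a), c in pair_counts.items()
        }

    # Toy usage with made-up counts:
    pairs = Counter({("cure", "cancer"): 50, ("cure", "wine"): 1, ("drink", "wine"): 40})
    weights = pmi_edge_weights(pairs, total=sum(pairs.values()))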

42 One possible way of constructing such a graph is simply connecting all nodes and assigning edge weights proportional to the word association scores, such as PMI or distributional similarity. [sent-116, score-0.175]

43 However, such a completely connected graph can be susceptible to propagating noise, and does not scale well to a very large vocabulary. [sent-117, score-0.094]

44 We therefore reduce the graph connectivity by exploiting the semantic parallelism of coordination (Bock (1986), Hatzivassiloglou and McKeown (1997)). 7We restrict predicate-argument pairs to verb-object pairs in this study. [sent-118, score-0.267]
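With the overlay of the two sub-graphs in place, label propagation from the seed predicates is straightforward. A minimal sketch, assuming a simple clamped-seed averaging update rather than the paper's exact propagation variant:

    def label_propagate(neighbors, seeds, n_iter=20):
        """neighbors[w]: list of (neighbor, weight) pairs drawn from both
        sub-graphs (pred-arg and coordination edges); seeds: dict mapping
        seed words to +1.0 / -1.0. Seed labels are clamped."""
        scores = {w: 0.0 for w in neighbors}
        scores.update(seeds)
        for _ in range(n_iter):
            for w, nbrs in neighbors.items():
                if w in seeds:
                    continue  # keep seed labels fixed
                total = sum(wt for _, wt in nbrs)
                if total > 0:
                    scores[w] = sum(scores.get(v, 0.0) * wt
                                    for v, wt in nbrs) / total
        return scores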

45 That is, we only allow edges to go from a predicate to an argument. [sent-122, score-0.046]

46 Therefore, we can encode only positive (supportive) relations among words (e. [sent-135, score-0.093]

47 , distributionally similar words will endorse each other with the same polarity), while missing out on exploiting negative relations (e. [sent-137, score-0.153]

48 , antonyms may drive each other into the opposite polarity). [sent-139, score-0.044]

49 They induce positive and negative polarities in isolation via separate graphs. [sent-141, score-0.244]

50 However, we expect that a more effective algorithm should induce both polarities simultaneously. [sent-142, score-0.087]

51 9Note that cosine similarity does not make sense for the first sub-graph as there is no reason why a predicate and an argument should be distributionally similar. [sent-146, score-0.105]

52 We experimented with many different variations on the graph structure and edge weights, including ones that connect any word pairs that occurred together frequently enough. [sent-147, score-0.145]

53 2), we propose an induction algorithm based on Integer Linear Programming (ILP). [sent-151, score-0.078]

54 Definition of variables: For each word i, we define binary variables xi, yi, zi ∈ {0, 1}, where xi = 1 (yi = 1, zi = 1) if and only if i has a positive (negative, neutral) connotation, respectively. [sent-169, score-0.989]

55 For every pair of words i and j, we define binary variables d^{p,q}_{ij}, where p, q ∈ {+, −, 0} and d^{p,q}_{ij} = 1 if and only if the polarities of i and j are p and q respectively. [sent-170, score-0.328]

56 Objective function: We aim to maximize F = Φ_prosody + Φ_coord + Φ_neu, where Φ_prosody is the score based on semantic prosody, Φ_coord captures the distributional similarity over coordination, and Φ_neu controls the sensitivity of connotation detection between positive (negative) and neutral. [sent-171, score-0.673]

57 α controls the sensitivity of connotation detection such that a higher value of α will promote neutral connotation over polar ones. [sent-174, score-1.26]

58 Each word i has exactly one of {+, −, 0} as polarity: ∀i, xi + yi + zi = 1. [sent-176, score-0.359]

59 2. Variable consistency between d^{p,q}_{ij} and xi, yi, zi:
xi + xj − 1 ≤ 2 d^{+,+}_{ij} ≤ xi + xj
yi + yj − 1 ≤ 2 d^{−,−}_{ij} ≤ yi + yj
zi + zj − 1 ≤ 2 d^{0,0}_{ij} ≤ zi + zj
xi + yj − 1 ≤ 2 d^{+,−}_{ij} ≤ xi + yj
yi + xj − 1 ≤ 2 d^{−,+}_{ij} ≤ yi + xj
Hard constraints for WordNet relations: 1. [sent-177, score-2.669]

60 C_ant: Antonym pairs will not have the same positive or negative polarity: ∀(i, j) ∈ R_ant, xi + xj ≤ 1, yi + yj ≤ 1. For this constraint, we only consider antonym pairs that share the same root, e. [sent-178, score-0.684]

61 , “sufficient” and “insufficient”, as those pairs are more likely to have the opposite polarities than pairs without sharing the same root, e. [sent-180, score-0.087]

62 C_syn: Synonym pairs will not have the opposite polarity: ∀(i, j) ∈ R_syn, xi + yj ≤ 1, xj + yi ≤ 1. 3 Experimental Results I: We provide comprehensive comparisons over variants of the three types of algorithms proposed in §2. [sent-184, score-0.59]
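A hedged sketch of encoding this program with the PuLP modeling library (our choice of front-end; the paper solves the ILP with CPLEX). The prosody scores, coordination weights, and the exact form of the objective F shown here are assumptions for illustration, and only the d^{+,+} consistency constraints are spelled out:

    import pulp

    def build_connotation_ilp(words, prosody, coord, ant_pairs, syn_pairs, alpha=0.5):
        """words: vocabulary; prosody[w]: assumed prosody score in [-1, 1];
        coord[(i, j)]: coordination similarity weight; alpha tunes Phi_neu."""
        prob = pulp.LpProblem("connotation", pulp.LpMaximize)
        x = pulp.LpVariable.dicts("x", words, cat="Binary")  # positive
        y = pulp.LpVariable.dicts("y", words, cat="Binary")  # negative
        z = pulp.LpVariable.dicts("z", words, cat="Binary")  # neutral

        d_pp = {}
        for (i, j) in coord:
            d = pulp.LpVariable(f"dpp_{i}_{j}", cat="Binary")
            prob += x[i] + x[j] - 1 <= 2 * d   # consistency, lower bound
            prob += 2 * d <= x[i] + x[j]       # consistency, upper bound
            d_pp[i, j] = d

        # Assumed objective: prosody term + coordination agreement + neutrality.
        prob += (pulp.lpSum(prosody[w] * (x[w] - y[w]) for w in words)
                 + pulp.lpSum(wt * d_pp[i, j] for (i, j), wt in coord.items())
                 + alpha * pulp.lpSum(z[w] for w in words))

        for w in words:
            prob += x[w] + y[w] + z[w] == 1    # exactly one polarity per word
        for i, j in ant_pairs:                 # C_ant
            prob += x[i] + x[j] <= 1
            prob += y[i] + y[j] <= 1
        for i, j in syn_pairs:                 # C_syn
            prob += x[i] + y[j] <= 1
            prob += x[j] + y[i] <= 1
        return prob, x, y, z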

63 3.1 Comparison against Conventional Sentiment Lexicons: Note that we consider the connotation lexicon to be inclusive of a sentiment lexicon for two practical reasons: first, it is highly unlikely that any word with non-neutral sentiment (i. [sent-189, score-1.058]

64 , positive or negative) would carry connotation of the opposite, i. [sent-191, score-0.643]

65 Second, for some words with distinct sentiment or strong connotation, it can be difficult or even unnatural to draw a precise distinction between connotation and sentiment, e. [sent-194, score-0.745]

66 Therefore, sentiment lexicons can serve as a surrogate to measure a subset of connotation words induced by the algorithms, as shown in Table 3 with respect to General Inquirer (Stone and Hunt (1963)) and MPQA (Wilson et al. [sent-197, score-0.779]

67 Discussion: Table 3 shows the agreement statistics with respect to the two conventional sentiment lexicons. [sent-199, score-0.165]

68 We find that the use of label propagation alone [PRED-ARG (CP)] improves the performance substantially over comparable graph constructions with different graph analysis algorithms, in particular the HITS and PageRank approaches of Feng et al. [sent-200, score-0.255]

69 The two completely connected variants of the graph propagation on the Pred-Arg graph, [PRED-ARG (PMI)]N and [PRED-ARG (CP)]N, do not necessarily improve the performance over the simpler and computationally lighter alternative, [PRED-ARG (CP)]. [sent-202, score-0.195]

70 This result suggests: (1) the sub-graph #2, based on the semantic parallelism of coordination, is simple and yet very powerful as an inductive bias. [sent-205, score-0.058]

71 (2) The performance of graph propagation varies significantly depending on the graph topology and the corresponding edge weights. [sent-206, score-0.306]

72 Only for comparison purposes, however, we assign 10We consider “positive” and “negative” polarities to conflict, but “neutral” polarity does not conflict with any. [sent-208, score-0.243]

73 11In the case of General Inquirer, we use words in POSITIV and NEGATIV sets as words with positive and negative labels respectively. [sent-209, score-0.157]

74 Importantly, when evaluated over more than the top 5k words, ILP is overall the top performer considering both precision (shown in Table 3) and coverage (omitted for brevity). [sent-212, score-0.037]

75 4 Precision, Coverage, and Efficiency: In this section, we address three important aspects of an ideal induction algorithm: precision, coverage, and efficiency. [sent-213, score-0.078]

76 For brevity, the remainder of the paper will focus on the algorithms based on constraint optimization, as it turned out to be the most effective one from the empirical results in §3. [sent-214, score-0.039]

77 Precision In order to see the effectiveness of the induction algorithms more sharply, we had used a limited set of seed words in §3. [sent-215, score-0.183]

78 To enhance precision, we will use as large a seed set as possible, e. [sent-217, score-0.066]

79 Broad coverage: Although the statistics in the Google 1T corpus represent a very large amount of text, the words that appear in pred-arg and coordination relations are still limited. [sent-220, score-0.182]

80 12In fact, the performance of PRED-ARG variants for top 10K w. [sent-226, score-0.034]

81 13Note that doing so will prevent us from evaluating against the same sentiment lexicon used as a seed set. [sent-230, score-0.305]

82 Since the variables are binary integers, those constraints are not as meaningful when considered for real numbers. [sent-237, score-0.033]

83 Therefore we revise those hard constraints to encode various semantic relations (WordNet and semantic coordination) more directly. [sent-238, score-0.063]

84 Definition of variables: For each word i, we define variables xi, yi, zi ∈ [0, 1]. [sent-239, score-0.14]

85 i has a positive (negative) connotation if and only if xi (yi) is assigned the greatest value among the three variables; otherwise, i is neutral. [sent-240, score-0.786]

86 Constraints for semantic coordination Rcoord can be defined similarly. [sent-243, score-0.115]

87 Lastly, the following constraints encode antonym relations: for (i, j) ∈ R_ant,
d^{a+,+}_{ij} ≤ xi − (1 − xj), d^{a+,+}_{ij} ≤ (1 − xj) − xi,
d^{a−,−}_{ij} ≤ yi − (1 − yj), d^{a−,−}_{ij} ≤ (1 − yj) − yi.
Interpretation: Unlike ILP, some of the variables result in fractional values. [sent-244, score-0.607]

88 We consider a word to have positive or negative polarity only if the assignment indicates 1 for the corresponding polarity and 0 for the rest. [sent-245, score-0.469]

89 In other words, we treat all words with fractional assignments over different polarities as neutral. [sent-246, score-0.126]
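A tiny sketch of this interpretation rule; the numerical tolerance is an assumption, since LP solvers return floating-point values:

    def interpret_lp(x_val, y_val, z_val, eps=1e-6):
        """Polar only when the assignment is (effectively) integral;
        any fractional assignment is treated as neutral."""
        if x_val > 1 - eps and y_val < eps and z_val < eps:
            return "positive"
        if y_val > 1 - eps and x_val < eps and z_val < eps:
            return "negative"
        return "neutral"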

90 We find that the LP variants achieve much better recall and F-score, while maintaining comparable precision. [sent-256, score-0.034]

91 Therefore, we choose the connotation lexicon induced by LP (C-LP) for the following evaluations in §5. [sent-257, score-0.654]

92 We compare the connotative lexicons from §2 & §4, namely C-LP, OVERLAY, and PRED-ARG (CP), and two popular sentiment lexicons: SentiWordNet (Baccianella et al. [sent-261, score-0.165]

93 14 Note that C-LP is the largest among all connotation lexicons, including ∼70,000 polar words. [sent-263, score-0.625]

94 Because we expect that judging a connotation can be dependent on one’s cultural background, personality and value systems, we gather judgements from 5 people for each word, from which we hope to draw a more general judgement of connotative polarity. [sent-266, score-0.882]

95 We gather the gold standard only for those words for which more than half of the judges agreed on the same polarity. [sent-268, score-0.039]

96 Figure 3 shows a part of the AMT task, where Turkers are presented with questions that help judges determine the subtle connotative polarity of each word, and are then asked to rate the degree of connotation on a scale from −5 (most negative) to 5 (most positive). [sent-270, score-1.167]

97 For SentiWordNet, to retrieve the polarity of a given word, we sum over the polarity scores over all senses, where positive (negative) values correspond to positive (negative) polarity. [sent-273, score-0.438]

98 17We allow Turkers to mark words that can be used with both positive and negative connotation, which results in about 7% of words that are excluded from the gold standard set. [sent-277, score-0.157]

99 To build the gold standard, we consider two different voting schemes: • ΩVote: The judgement of each Turker is mapped to neutral for −1 ≤ score ≤ 1, positive for score ≥ 2, and negative for score ≤ −2; then we take the majority vote. [sent-283, score-0.149]

100 • ΩScore: Let σ(i) be the sum (weighted vote) of the scores given by the 5 judges for word i. [sent-284, score-0.039]
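The two voting schemes can be sketched as follows; the thresholds for ΩVote come from the reconstructed sentence above, while the sign-of-sum rule for ΩScore is an assumption, since its thresholds are truncated in this summary:

    def omega_vote(scores):
        """Map each judge's score in [-5, 5] to a label, then majority-vote;
        return None when no label wins more than half of the judges."""
        labels = ["positive" if s >= 2 else "negative" if s <= -2 else "neutral"
                  for s in scores]
        top = max(set(labels), key=labels.count)
        return top if labels.count(top) > len(labels) / 2 else None

    def omega_score(scores):
        """Assumed ΩScore rule: polarity of the summed (weighted) vote sigma(i)."""
        sigma = sum(scores)
        return "positive" if sigma > 0 else "negative" if sigma < 0 else "neutral"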


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('connotation', 0.58), ('connotative', 0.302), ('dsi', 0.267), ('sentiment', 0.165), ('prosody', 0.157), ('polarity', 0.156), ('yj', 0.144), ('yi', 0.118), ('ilp', 0.116), ('xj', 0.116), ('coordination', 0.115), ('xi', 0.108), ('sentiments', 0.098), ('zi', 0.098), ('negative', 0.094), ('graph', 0.094), ('subtle', 0.09), ('dai', 0.09), ('coord', 0.088), ('polarities', 0.087), ('induction', 0.078), ('lexicon', 0.074), ('zj', 0.07), ('neu', 0.07), ('propagation', 0.067), ('feng', 0.067), ('seed', 0.066), ('lp', 0.065), ('dipjq', 0.065), ('geothermal', 0.065), ('overlay', 0.065), ('positive', 0.063), ('pmi', 0.06), ('parallelism', 0.058), ('cure', 0.057), ('neutral', 0.055), ('mpqa', 0.053), ('hits', 0.053), ('edge', 0.051), ('pagerank', 0.048), ('inquirer', 0.047), ('objective', 0.046), ('predicate', 0.046), ('polar', 0.045), ('turkers', 0.045), ('nuanced', 0.045), ('predicates', 0.044), ('cp', 0.044), ('seemingly', 0.044), ('antonyms', 0.044), ('allude', 0.043), ('conjure', 0.043), ('dci', 0.043), ('denotational', 0.043), ('negativ', 0.043), ('positiv', 0.043), ('rpxred', 0.043), ('rrpred', 0.043), ('rsyn', 0.043), ('shades', 0.043), ('wcoord', 0.043), ('di', 0.042), ('variables', 0.042), ('antonym', 0.041), ('meanings', 0.04), ('amt', 0.04), ('algorithms', 0.039), ('fractional', 0.039), ('judges', 0.039), ('greene', 0.038), ('brook', 0.038), ('cplex', 0.038), ('denotative', 0.038), ('distr', 0.038), ('onybrook', 0.038), ('purposefully', 0.038), ('velikovich', 0.038), ('brevity', 0.037), ('coverage', 0.037), ('rant', 0.035), ('ihas', 0.035), ('minds', 0.035), ('stony', 0.035), ('programming', 0.034), ('variants', 0.034), ('lexicons', 0.034), ('constraints', 0.033), ('beneath', 0.033), ('readers', 0.033), ('surface', 0.032), ('reside', 0.031), ('collocational', 0.031), ('google', 0.031), ('comprehensive', 0.031), ('gi', 0.031), ('argument', 0.03), ('relations', 0.03), ('distributional', 0.03), ('kleinberg', 0.029), ('distributionally', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999976 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

Author: Song Feng ; Jun Seok Kang ; Polina Kuznetsova ; Yejin Choi

Abstract: Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond denotative or surface meaning of text, as seemingly objective statements often allude nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence ”, “human ”, and “cheesecake ”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.

2 0.22749662 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li

Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.

3 0.14688778 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky

Abstract: We study subjective language media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, highprecision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.

4 0.13507605 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

Author: Angeliki Lazaridou ; Ivan Titov ; Caroline Sporleder

Abstract: We propose a joint model for unsupervised induction of sentiment, aspect and discourse information and show that by incorporating a notion of latent discourse relations in the model, we improve the prediction accuracy for aspect and sentiment polarity on the sub-sentential level. We deviate from the traditional view of discourse, as we induce types of discourse relations and associated discourse cues relevant to the considered opinion analysis task; consequently, the induced discourse relations play the role of opinion and aspect shifters. The quantitative analysis that we conducted indicated that the integration of a discourse model increased the prediction accuracy results with respect to the discourse-agnostic approach and the qualitative analysis suggests that the induced representations encode a meaningful discourse structure.

5 0.11653449 318 acl-2013-Sentiment Relevance

Author: Christian Scheible ; Hinrich Schutze

Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.

6 0.11074382 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

7 0.09388677 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis

8 0.091373682 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

9 0.090581849 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays

10 0.088708133 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts

11 0.087901309 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset

12 0.086372904 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

13 0.086108923 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions

14 0.08348731 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

15 0.081564255 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction

16 0.079623207 173 acl-2013-Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

17 0.078604594 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions

18 0.076504409 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization

19 0.075584106 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews

20 0.075067855 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.165), (1, 0.17), (2, -0.022), (3, 0.09), (4, -0.064), (5, -0.069), (6, -0.009), (7, 0.049), (8, -0.011), (9, 0.077), (10, 0.105), (11, -0.026), (12, -0.02), (13, -0.074), (14, 0.044), (15, 0.039), (16, 0.076), (17, -0.002), (18, 0.053), (19, 0.085), (20, -0.001), (21, 0.049), (22, -0.013), (23, 0.055), (24, 0.004), (25, 0.01), (26, -0.04), (27, 0.086), (28, -0.043), (29, 0.0), (30, -0.012), (31, -0.098), (32, -0.051), (33, 0.028), (34, 0.054), (35, 0.011), (36, -0.066), (37, -0.037), (38, 0.016), (39, -0.032), (40, 0.068), (41, 0.027), (42, 0.059), (43, -0.03), (44, -0.076), (45, -0.001), (46, 0.001), (47, 0.014), (48, -0.001), (49, -0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95515102 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

Author: Song Feng ; Jun Seok Kang ; Polina Kuznetsova ; Yejin Choi

Abstract: Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond denotative or surface meaning of text, as seemingly objective statements often allude nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence ”, “human ”, and “cheesecake ”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.

2 0.81746668 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li

Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.

3 0.74736112 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong

Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification. 1

4 0.72918361 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

Author: Ankit Ramteke ; Akshat Malu ; Pushpak Bhattacharyya ; J. Saketha Nath

Abstract: Thwarting and sarcasm are two uncharted territories in sentiment analysis, the former because of the lack of training corpora and the latter because of the enormous amount of world knowledge it demands. In this paper, we propose a working definition of thwarting amenable to machine learning and create a system that detects if the document is thwarted or not. We focus on identifying thwarting in product reviews, especially in the camera domain. An ontology of the camera domain is created. Thwarting is looked upon as the phenomenon of polarity reversal at a higher level of ontology compared to the polarity expressed at the lower level. This notion of thwarting defined with respect to an ontology is novel, to the best of our knowledge. A rule based implementation building upon this idea forms our baseline. We show that machine learning with annotated corpora (thwarted/nonthwarted) is more effective than the rule based system. Because of the skewed distribution of thwarting, we adopt the Areaunder-the-Curve measure of performance. To the best of our knowledge, this is the first attempt at the difficult problem of thwarting detection, which we hope will at Akshat Malu Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India. akshatmalu@ cse .i itb .ac .in J. Saketha Nath Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India. s aketh@ cse .i itb .ac .in least provide a baseline system to compare against. 1 Credits The authors thank the lexicographers at Center for Indian Language Technology (CFILT) at IIT Bombay for their support for this work. 2

5 0.70928442 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky

Abstract: We study subjective language media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, highprecision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.

6 0.68237966 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays

7 0.67982739 318 acl-2013-Sentiment Relevance

8 0.60132766 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions

9 0.55249017 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset

10 0.52465987 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews

11 0.51967239 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts

12 0.51759487 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

13 0.49481347 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis

14 0.45593575 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction

15 0.45274559 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

16 0.44925168 326 acl-2013-Social Text Normalization using Contextual Graph Random Walks

17 0.44754386 42 acl-2013-Aid is Out There: Looking for Help from Tweets during a Large Scale Disaster

18 0.43554386 310 acl-2013-Semantic Frames to Predict Stock Price Movement

19 0.43241432 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

20 0.42243293 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.048), (6, 0.038), (11, 0.072), (15, 0.015), (24, 0.038), (26, 0.057), (35, 0.087), (42, 0.03), (48, 0.061), (63, 0.01), (70, 0.059), (80, 0.265), (88, 0.044), (90, 0.028), (95, 0.061)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88307303 14 acl-2013-A Novel Classifier Based on Quantum Computation

Author: Ding Liu ; Xiaofang Yang ; Minghu Jiang

Abstract: In this article, we propose a novel classifier based on quantum computation theory. Different from existing methods, we consider the classification as an evolutionary process of a physical system and build the classifier by using the basic quantum mechanics equation. The performance of the experiments on two datasets indicates feasibility and potentiality of the quantum classifier.

2 0.82893562 227 acl-2013-Learning to lemmatise Polish noun phrases

Author: Adam Radziszewski

Abstract: We present a novel approach to noun phrase lemmatisation where the main phase is cast as a tagging problem. The idea draws on the observation that the lemmatisation of almost all Polish noun phrases may be decomposed into transformation of singular words (tokens) that make up each phrase. We perform evaluation, which shows results similar to those obtained earlier by a rule-based system, while our approach allows to separate chunking from lemmatisation.

same-paper 3 0.81777561 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

Author: Song Feng ; Jun Seok Kang ; Polina Kuznetsova ; Yejin Choi

Abstract: Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond denotative or surface meaning of text, as seemingly objective statements often allude nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence ”, “human ”, and “cheesecake ”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.

4 0.72306597 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

Author: Emmanuel Lassalle ; Pascal Denis

Abstract: This paper proposes a new method for significantly improving the performance of pairwise coreference models. Given a set of indicators, our method learns how to best separate types of mention pairs into equivalence classes for which we construct distinct classification models. In effect, our approach finds an optimal feature space (derived from a base feature set and indicator set) for discriminating coreferential mention pairs. Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators. Our exper- iments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features. Our best system obtains a competitive 67.2 of average F1 over MUC, and CEAF which, despite its simplicity, places it above the mean score of other systems on these datasets. B3,

5 0.67813385 135 acl-2013-English-to-Russian MT evaluation campaign

Author: Pavel Braslavski ; Alexander Beloborodov ; Maxim Khalilov ; Serge Sharoff

Abstract: This paper presents the settings and the results of the ROMIP 2013 MT shared task for the English→Russian language directfioorn. t Teh Een quality Rofu generated utraagnsel datiiroencswas assessed using automatic metrics and human evaluation. We also discuss ways to reduce human evaluation efforts using pairwise sentence comparisons by human judges to simulate sort operations.

6 0.66565108 281 acl-2013-Post-Retrieval Clustering Using Third-Order Similarity Measures

7 0.55975497 318 acl-2013-Sentiment Relevance

8 0.55374974 275 acl-2013-Parsing with Compositional Vector Grammars

9 0.55210298 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering

10 0.55152661 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

11 0.54978657 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

12 0.54858661 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

13 0.54789168 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

14 0.54776162 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

15 0.54686165 62 acl-2013-Automatic Term Ambiguity Detection

16 0.54683578 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

17 0.54629076 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension

18 0.54591829 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models

19 0.54556364 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

20 0.54464263 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing