
292 acl-2011-Target-dependent Twitter Sentiment Classification

Source: pdf

Author: Long Jiang ; Mo Yu ; Ming Zhou ; Xiaohua Liu ; Tiejun Zhao

Abstract: Sentiment analysis on Twitter data has attracted much attention recently. In this paper, we focus on target-dependent Twitter sentiment classification; namely, given a query, we classify the sentiments of the tweets as positive, negative or neutral according to whether they contain positive, negative or neutral sentiments about that query. Here the query serves as the target of the sentiments. The state-of-the-art approaches for solving this problem always adopt the target-independent strategy, which may assign irrelevant sentiments to the given target. Moreover, the state-of-the-art approaches only take the tweet to be classified into consideration when classifying the sentiment; they ignore its context (i.e., related tweets). However, because tweets are usually short and more ambiguous, sometimes it is not enough to consider only the current tweet for sentiment classification. In this paper, we propose to improve target-dependent Twitter sentiment classification by 1) incorporating target-dependent features; and 2) taking related tweets into consideration. According to the experimental results, our approach greatly improves the performance of target-dependent sentiment classification.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In this paper, we focus on target-dependent Twitter sentiment classification; namely, given a query, we classify the sentiments of the tweets as positive, negative or neutral according to whether they contain positive, negative or neutral sentiments about that query. [sent-3, score-1.818]

2 Moreover, the state-of-the-art approaches only take the tweet to be classified into consideration when classifying the sentiment; they ignore its context (i. [sent-6, score-0.508]

3 However, because tweets are usually short and more ambiguous, sometimes it is not enough to consider only the current tweet for sentiment classification. [sent-9, score-1.392]

4 In this paper, we propose to improve target-dependent Twitter sentiment classification by 1) incorporating target-dependent features; and 2) taking related tweets into consideration. [sent-10, score-1.064]

5 1 Introduction Twitter, as a micro-blogging system, allows users to publish tweets of up to 140 characters in length to tell others what they are doing, what they are thinking, or what is happening around them. [sent-12, score-0.503]

6 Wikipedia, the number of Twitter users has climbed to 190 million and the number of tweets published on Twitter every day is over 65 million. [sent-17, score-0.546]

7 As a result of the rapidly increasing number of tweets, mining people’s sentiments expressed in tweets has attracted more and more attention. [sent-18, score-0.744]

8 In those web sites, the user can input a sentiment target as a query, and search for tweets containing positive or negative sentiments towards the target. [sent-20, score-1.477]

9 , 2002), who utilize machine learning based classifiers for the sentiment classification of texts. [sent-25, score-0.621]

10 However, their classifiers actually work in a target-independent way: all the features used in the classifiers are independent of the target, so the sentiment is decided no matter what the target is. [sent-26, score-0.74]

11 , 20095; 2002) (or later research on sentiment classification 1 http://en. [sent-28, score-0.561]

12 However, for target-dependent sentiment classification of tweets, it is not suitable to exactly adopt that approach. [sent-39, score-0.579]

13 Because people may mention multiple targets in one tweet or comment on a target in a tweet while saying many other unrelated things in the same tweet, target-independent approaches are likely to yield unsatisfactory results: 1. [sent-40, score-1.111]

14 Tweets that do not express any sentiments to the given target but express sentiments to other things will be considered as being opinionated about the target. [sent-41, score-0.634]

15 For example, the following tweet expresses no sentiment to Bill Gates but is very likely to be classified as positive about Bill Gates by target-independent approaches. [sent-42, score-1.119]

16 The polarities of some tweets towards the given target are misclassified because of the interference from sentiments towards other targets in the tweets. [sent-45, score-1.026]

17 For example, the following tweet expresses a positive sentiment to Windows 7 and a negative sentiment to Vista. [sent-46, score-1.543]

18 However, with target-independent sentiment classification, both of the targets would get positive polarity. [sent-47, score-0.753]

19 In addition, tweets are usually shorter and more ambiguous than other sentiment data commonly used for sentiment analysis, such as reviews and blogs. [sent-52, score-1.518]

20 Consequently, it is more difficult to classify the sentiment of a tweet only based on its content. [sent-53, score-0.93]

21 For instance, for the following tweet, which contains only three words, it is difficult for any existing approaches to classify its sentiment correctly. [sent-54, score-0.515]

22 However, relations between individual tweets are more common than those in other sentiment data. [sent-56, score-0.998]

23 We can easily find many related tweets of a given tweet, such as the tweets published by the same person, the tweets replying to or replied by the given tweet, and retweets of the given tweet. [sent-57, score-1.667]

24 These related tweets provide rich information about what the given tweet expresses and should definitely be taken into consideration for classifying the sentiment of the given tweet. [sent-58, score-1.497]

25 In this paper, we propose to improve target-dependent sentiment classification of tweets by using both target-dependent and context-aware approaches. [sent-59, score-1.154]

26 Specifically, the target-dependent approach refers to incorporating syntactic features generated using words syntactically connected with the given target in the tweet to decide whether or not the sentiment is about the given target. [sent-60, score-1.086]

27 By learning from training data, we can probably predict that “Windows 7” should get a positive sentiment and “Vista” should get a negative sentiment. [sent-62, score-0.617]

28 In addition, we also propose to incorporate the contexts of tweets into classification, which we call a context-aware approach. [sent-63, score-0.503]

29 By considering the sentiment labels of the related tweets, we can further boost the performance of the sentiment classification, especially for very short and ambiguous tweets. [sent-64, score-0.969]

30 For example, in the third example we mentioned above, if we find that the previous and following tweets published by the same person are both positive about the Lakers, we can confidently classify this tweet as positive. [sent-65, score-1.131]

31 , 2002) treat the sentiment classification of movie reviews simply as a special case of a topic-based text categorization problem and investigate three classification algorithms: Naive Bayes, Maximum Entropy, and Support Vector Machines. [sent-79, score-0.729]

32 2 Target-dependent SA Besides the above mentioned work for target-independent sentiment classification, there are also several approaches proposed for target-dependent classification, such as (Nasukawa and Yi, 2003; Hu and Liu, 2004; Ding and Liu, 2007). [sent-82, score-0.589]

33 Given a sentiment target and its context, part-of-speech tagging and dependency parsing are first performed on the context. [sent-84, score-0.583]

34 Then predefined rules are matched in the context to determine the sentiment about the target. [sent-85, score-0.505]

35 The sentiment about each target in each sentence of the review is determined based on the dominant orientation of the opinion words appearing in the sentence. [sent-87, score-0.665]

36 As mentioned in Section 1, target-dependent sentiment classification of review sentences is quite different from that of tweets. [sent-88, score-0.586]

37 In reviews, if any sentiment is expressed in a sentence containing a feature, it is very likely that the sentiment is about the feature. [sent-89, score-0.948]

38 , 2010) all follow the machine learning based approach for sentiment classification of tweets. [sent-95, score-0.561]

39 , 2010) propose to classify tweets into multiple sentiment types using hashtags and smileys as labels. [sent-97, score-1.058]

40 In contrast, (Barbosa and Feng, 2010) propose a two-step approach to classify the sentiments of tweets using SVM classifiers with abstract features. [sent-99, score-0.819]

41 The training data is collected from the outputs of three existing Twitter sentiment classification web sites. [sent-100, score-0.561]

42 As mentioned above, these approaches work in a target-independent way, and so need to be adapted for target-dependent sentiment classification. [sent-101, score-0.499]

43 3 Approach Overview The problem we address in this paper is target-dependent sentiment classification of tweets. [sent-102, score-0.651]

44 So the input of our task is a collection of tweets containing the target and the output is labels assigned to each of the tweets. [sent-103, score-0.612]

45 Subjectivity classification as the first step to decide if the tweet is subjective or neutral about the target; 2. [sent-105, score-0.677]

46 Polarity classification as the second step to decide if the tweet is positive or negative about the target if it is classified as subjective in Step 1; 3. [sent-106, score-0.846]

47 Graph-based optimization as the third step to further boost the performance by taking the related tweets into consideration. [sent-107, score-0.527]
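
The first two steps listed above form a cascade: polarity is only decided for tweets that were first judged subjective. A minimal sketch of that control flow follows, assuming two already-trained binary classifiers are supplied as plain functions. The names classify_tweet, toy_subjective and toy_positive are hypothetical and the toy rules are stand-ins for illustration, not the classifiers used in the paper.

```python
from typing import Callable

# A classifier here is any function (tweet, target) -> bool.
Classifier = Callable[[str, str], bool]


def classify_tweet(tweet: str, target: str,
                   is_subjective: Classifier,
                   is_positive: Classifier) -> str:
    """Step 1 decides subjective vs. neutral; step 2 runs only if subjective."""
    if not is_subjective(tweet, target):
        return "neutral"
    return "positive" if is_positive(tweet, target) else "negative"


# Hypothetical keyword rules standing in for trained classifiers.
def toy_subjective(tweet: str, target: str) -> bool:
    return any(word in tweet.lower() for word in ("love", "hate"))


def toy_positive(tweet: str, target: str) -> bool:
    return "love" in tweet.lower()


if __name__ == "__main__":
    for text in ["I love my iPhone", "I hate my iPhone", "Bought an iPhone today"]:
        print(text, "->", classify_tweet(text, "iphone", toy_subjective, toy_positive))
```

The third step (graph-based optimization) would then take the scores of these classifiers as its starting point.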

48 , 2010) has discovered many effective features for sentiment analysis of tweets, such as emoticons, punctuation, prior subjectivity and polarity of a word. [sent-122, score-0.673]

49 Sentiment lexicon features, indicating how many positive or negative words are included in the tweet according to a predefined lexicon. [sent-129, score-0.642]
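
As a rough illustration of the sentiment lexicon features described above, the sketch below counts how many tokens of a tweet appear in a positive and a negative word list. The two tiny word sets and the function name lexicon_features are assumptions made for this example; the paper's actual lexicon and tokenization are not given in this extract.

```python
import re

# Hypothetical miniature lexicon, for illustration only.
POSITIVE_WORDS = {"good", "great", "love", "better"}
NEGATIVE_WORDS = {"bad", "hate", "worse", "terrible"}


def lexicon_features(tweet: str) -> dict:
    """Count positive and negative lexicon words in a lower-cased tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return {
        "num_positive": sum(token in POSITIVE_WORDS for token in tokens),
        "num_negative": sum(token in NEGATIVE_WORDS for token in tokens),
    }


if __name__ == "__main__":
    print(lexicon_features("I love Windows 7, it is better than Vista. Vista was terrible"))
```

Note that these counts ignore the target entirely, which is why they are grouped with the target-independent features.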

50 1 Extended Targets It is quite common that people express their sentiments about a target by commenting not on the target itself but on some related things of the target. [sent-140, score-0.557]

51 For example, one may express a sentiment about a company by commenting on its products or technologies. [sent-141, score-0.537]

52 To express a sentiment about a product, one may choose to comment on the features or functionalities of the product. [sent-142, score-0.544]

53 It is assumed that readers or audiences can clearly infer the sentiment about the target based on those sentiments about the related things. [sent-143, score-0.798]

54 As shown in the tweet below, the author expresses a positive sentiment about “Microsoft” by expressing a positive sentiment directly about “Microsoft technologies”. [sent-144, score-1.576]

55 Tweets expressing positive or negative sentiments towards the extended targets are also regarded as positive or negative about the target. [sent-147, score-0.752]

56 Therefore, for target-dependent sentiment classification of tweets, the first thing is identifying all extended targets in the input tweet collection. [sent-148, score-1.264]

57 However, it would be interesting to know under what circumstances the sentiment towards the target is truly consistent with that towards its extended targets. [sent-150, score-0.762]

58 For example, a sentiment about someone’s behavior usually means a sentiment about the person, while a sentiment about someone’s colleague usually has nothing to do with the person. [sent-151, score-1.422]

59 It is common that people use definite or demonstrative noun phrases or pronouns referring to the target in a tweet and express sentiments directly on them. [sent-155, score-0.828]

60 ”, the author expresses a positive sentiment to “you” which actually refers to “Jon Stewart”. [sent-158, score-0.589]

61 2 Target-dependent Features Target-dependent sentiment classification needs to distinguish the expressions describing the target from other expressions. [sent-172, score-0.67]

62 For example, for the target iPhone in the tweet “iPhone works better with the CellBand”, we will generate the feature “arg1_v_well”. [sent-186, score-0.569]

63 For example, for the target iPhone in the tweet “iPhone does not work better with the CellBand”, we will generate the features “arg1_v_neg-well” and “neg-work_it_arg1”. [sent-188, score-0.587]
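
The two iPhone examples above suggest how the feature strings are assembled. The sketch below reproduces them for one case only: a target that is the subject of an intransitive verb with an optional adverb. It assumes the dependency parse and lemmatization (for instance "better" to "well") were done beforehand, so the function receives lemmas directly. The function name target_features is hypothetical, and the non-negated verb feature "work_it_arg1" is inferred from the naming pattern rather than stated in the text.

```python
from typing import List, Optional


def target_features(verb: str, adverb: Optional[str], negated: bool) -> List[str]:
    """Build feature strings for a target that is the subject of an intransitive verb.

    verb and adverb are assumed to be lemmas from an upstream parser.
    A "neg-" prefix is attached to each word when the verb is negated.
    """
    prefix = "neg-" if negated else ""
    features = [f"{prefix}{verb}_it_arg1"]
    if adverb is not None:
        features.append(f"arg1_v_{prefix}{adverb}")
    return features


if __name__ == "__main__":
    # "iPhone works better with the CellBand"
    print(target_features("work", "well", negated=False))
    # "iPhone does not work better with the CellBand"
    print(target_features("work", "well", negated=True))
```

Keeping negation inside the feature name lets a linear classifier learn separate weights for "well" and "neg-well" without any extra machinery.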

64 5 Graph-based Sentiment Optimization As we mentioned in Section 1, since tweets are usually shorter and more ambiguous, it would be useful to take their contexts into consideration when classifying the sentiments. [sent-193, score-0.596]

65 In this paper, we regard the following three kinds of related tweets as context for a tweet. [sent-194, score-0.525]

66 So retweets usually have the same sentiment as the original tweets. [sent-199, score-0.514]

67 Intuitively, the tweets published by the same person within a short timeframe should have a consistent sentiment about the same target. [sent-202, score-1.046]

68 An example graph of tweets about a target If we consider that the sentiment of a tweet only depends on its content and immediate neighbors, we can leverage a graph-based method for sentiment classification of tweets. [sent-209, score-2.11]

69 We can convert the output scores of a tweet by the subjectivity and polarity classifiers into probabilistic form and use them to approximate p(c| τ). [sent-211, score-0.637]

70 After the iteration ends, for any tweet in the graph, the sentiment label that has the maximum p(c| τ, G) is considered the final label. [sent-213, score-0.889]
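
The iterative procedure is only outlined in the sentences above, so the following is a simplified sketch of the general idea rather than the paper's update rule. It assumes each tweet starts with classifier probabilities p(c| τ) and, in every round, multiplies them by the average current belief of its neighbors in the graph and renormalizes. The function name refine_labels, the update formula, and the example numbers are all assumptions made for illustration.

```python
from typing import Dict, List

LABELS = ("positive", "negative", "neutral")


def refine_labels(prior: Dict[str, Dict[str, float]],
                  neighbors: Dict[str, List[str]],
                  iterations: int = 5) -> Dict[str, str]:
    """Iteratively combine each tweet's own scores with its neighbors' beliefs."""
    current = {tweet: dict(p) for tweet, p in prior.items()}
    for _ in range(iterations):
        updated = {}
        for tweet, p in prior.items():
            linked = neighbors.get(tweet, [])
            if not linked:
                # Isolated tweets keep their classifier probabilities.
                updated[tweet] = dict(p)
                continue
            scores = {
                c: p[c] * sum(current[n][c] for n in linked) / len(linked)
                for c in LABELS
            }
            total = sum(scores.values())
            updated[tweet] = ({c: s / total for c, s in scores.items()}
                              if total > 0 else dict(p))
        current = updated
    # Final label: the class with the highest refined probability.
    return {tweet: max(LABELS, key=lambda c: current[tweet][c]) for tweet in current}


if __name__ == "__main__":
    # Tweet "b" is ambiguous on its own; "a" and "c" are by the same author.
    prior = {
        "a": {"positive": 0.8, "negative": 0.1, "neutral": 0.1},
        "b": {"positive": 0.3, "negative": 0.3, "neutral": 0.4},
        "c": {"positive": 0.7, "negative": 0.2, "neutral": 0.1},
        "d": {"positive": 0.2, "negative": 0.2, "neutral": 0.6},
    }
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(refine_labels(prior, neighbors))
```

In this toy graph the ambiguous tweet "b" is pulled towards the label of its two confident neighbors, while the unconnected tweet "d" keeps the label its own scores give it.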

71 6 Experiments Because there is no annotated tweet corpus publicly available for evaluation of target-dependent Twitter sentiment classification, we have to create our own. [sent-214, score-0.889]

72 We manually classify each tweet as positive, negative or neutral towards the query with which it is downloaded. [sent-217, score-0.696]

73 Among the 14 tweets which the two annotators disagree on, only 1 case is a positive-negative disagreement (one annotator considers it positive while the other negative), and the other 13 are all neutral-subjective disagreement. [sent-221, score-0.581]

74 This probably indicates that it is harder for humans to decide if a tweet is neutral or subjective than to decide if it is positive or negative. [sent-222, score-0.716]

75 10 In this paper, we use sentiment classification of English tweets as a case study; however, our approach is applicable to other languages as well. [sent-223, score-1.064]

76 For each query, we randomly select 20 tweets labeled as positive or negative by TS. [sent-226, score-0.646]

77 We also manually classify each tweet as positive, negative or neutral about the corresponding query. [sent-227, score-0.629]

78 Then, we analyze those tweets that get different labels from TS and humans. [sent-228, score-0.503]

79 Finally we find two major types of error: 1) Tweets which are totally neutral (for any target) are classified as subjective by TS; 2) sentiments in some tweets are classified correctly but the sentiments are not truly about the query. [sent-229, score-1.157]

80 After further checking those tweets of the second type, we found that most of them are actually neutral for the target, which means that the dominant error in Twitter Sentiment is classifying neutral tweets as subjective. [sent-232, score-1.267]

81 In the experiments, we consider the positive and negative tweets annotated by humans as subjective tweets (i. [sent-242, score-1.21]

82 , 2002), we balance the evaluation data set by randomly selecting 727 tweets from all neutral tweets annotated by humans and consider them as objective tweets (i. [sent-246, score-1.638]
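
Balancing by random selection, as described above, amounts to downsampling the larger class. The sketch below assumes the neutral tweets are sampled down to the number of subjective tweets; that reading of the 727 figure is an inference from the word "balance", and the function name balance_classes and the fixed seed are choices made for this example.

```python
import random
from typing import List, Tuple


def balance_classes(subjective: List[str], neutral: List[str],
                    seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly downsample neutral tweets to match the subjective count."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    size = min(len(subjective), len(neutral))
    return subjective, rng.sample(neutral, k=size)


if __name__ == "__main__":
    subjective = [f"subjective tweet {i}" for i in range(3)]
    neutral = [f"neutral tweet {i}" for i in range(10)]
    kept_subjective, kept_neutral = balance_classes(subjective, neutral)
    print(len(kept_subjective), len(kept_neutral))
```

Sampling without replacement keeps every selected neutral tweet distinct, so the balanced set contains no duplicates.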

83 Adding sentiment lexicon features improves the accuracy to 63. [sent-259, score-0.54]

84 3 Evaluation of Polarity Classification Similarly, we conduct several experiments on positive and negative tweets to compare the polarity classifiers with different features, where we use 268 negative and 268 randomly selected positive tweets. [sent-281, score-0.922]

85 The results show that our system using both content features and sentiment lexicon features performs slightly better than (Barbosa and Feng, 2010). [sent-290, score-0.604]

86 Both the classifiers with all features and with the combination of content and sentiment lexicon features are significantly better than that with only the content features (p < 0. [sent-293, score-0.728]

87 However, the classifier with all features does not significantly outperform that using the combination of content and sentiment lexicon features. [sent-295, score-0.595]

88 We also note that all numbers in Table 3 are much bigger than those in Table 1, which suggests that subjectivity classification of tweets is more difficult than polarity classification. [sent-300, score-0.752]

89 4 Evaluation of Graph-based Optimization As seen in Figure 1, there are several tweets which are not connected with any other tweets. [sent-305, score-0.527]

90 The following table shows the percentages of the tweets in our evaluation data set which have at least one related tweet according to various relation types. [sent-307, score-0.942]

91 Percentages of tweets having at least one related tweet according to various relation types. [sent-310, score-0.942]

92 2% of the tweets concerning the test queries, we can find at least one related tweet. [sent-312, score-0.503]

93 From the detailed improvement for each sentiment class, we find that the context-aware approach is especially helpful for positive and negative classes. [sent-326, score-0.617]

94 Clearly, being published by the same person is the most useful relation for sentiment classification, which is consistent with the percentage distribution of the tweets over relation types; using retweet only does not help. [sent-331, score-1.046]

95 One possible reason for this is that the retweets and their original tweets are nearly the same, so it is very likely that they have already got the same labels in previous classifications. [sent-332, score-0.543]

96 7 Conclusions and Future Work Twitter sentiment analysis has attracted much attention recently. [sent-333, score-0.5]

97 In this paper, we address target-dependent sentiment classification of tweets. [sent-334, score-0.651]

98 Different from previous work using target-independent classification, we propose to incorporate syntactic features to distinguish texts used for expressing sentiments towards different targets in a tweet. [sent-335, score-0.506]

99 In addition, different from previous work using only information on the current tweet for sentiment classification, we propose to take the related tweets of the current tweet into consideration by utilizing graph-based optimization. [sent-337, score-1.83]

100 We are also interested in exploring relations between Twitter accounts for classifying the sentiments of the tweets published by them. [sent-341, score-0.827]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('tweets', 0.503), ('sentiment', 0.474), ('tweet', 0.415), ('twitter', 0.223), ('sentiments', 0.215), ('barbosa', 0.134), ('targets', 0.111), ('target', 0.109), ('neutral', 0.108), ('iphone', 0.093), ('targetdependent', 0.09), ('targetindependent', 0.09), ('subjectivity', 0.089), ('classification', 0.087), ('extended', 0.087), ('feng', 0.084), ('positive', 0.078), ('polarity', 0.073), ('negative', 0.065), ('classifiers', 0.06), ('davidiv', 0.06), ('lakers', 0.053), ('gaga', 0.053), ('pang', 0.05), ('lady', 0.049), ('windows', 0.047), ('reviews', 0.046), ('classifying', 0.045), ('replying', 0.045), ('ts', 0.043), ('published', 0.043), ('classify', 0.041), ('hashtags', 0.04), ('subjective', 0.04), ('retweets', 0.04), ('vista', 0.04), ('wi', 0.039), ('expresses', 0.037), ('orientation', 0.037), ('features', 0.037), ('gates', 0.036), ('microsoft', 0.036), ('movie', 0.035), ('query', 0.034), ('nasukawa', 0.034), ('towards', 0.033), ('express', 0.033), ('people', 0.032), ('pmi', 0.031), ('predefined', 0.031), ('product', 0.031), ('cellband', 0.03), ('commenting', 0.03), ('lovegame', 0.03), ('nplaeft', 0.03), ('parikh', 0.03), ('replied', 0.03), ('tbu', 0.03), ('lexicon', 0.029), ('things', 0.029), ('classifier', 0.028), ('sentimental', 0.028), ('content', 0.027), ('sa', 0.027), ('love', 0.027), ('decide', 0.027), ('attracted', 0.026), ('angelova', 0.026), ('person', 0.026), ('generate', 0.026), ('truly', 0.026), ('classified', 0.025), ('mentioned', 0.025), ('optimization', 0.024), ('connected', 0.024), ('ipad', 0.024), ('harbin', 0.024), ('copula', 0.024), ('retweeting', 0.024), ('thumbs', 0.024), ('according', 0.024), ('noun', 0.024), ('consideration', 0.023), ('soon', 0.023), ('appearing', 0.023), ('kinds', 0.022), ('polarities', 0.022), ('opinion', 0.022), ('graph', 0.021), ('humans', 0.021), ('relations', 0.021), ('ambiguous', 0.021), ('bill', 0.021), ('intransitive', 0.021), ('expressing', 0.02), ('week', 0.02), ('ding', 0.02), ('feature', 0.019), ('price', 0.019), 
('adopt', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 292 acl-2011-Target-dependent Twitter Sentiment Classification

Author: Long Jiang ; Mo Yu ; Ming Zhou ; Xiaohua Liu ; Tiejun Zhao

2 0.6708855 64 acl-2011-C-Feel-It: A Sentiment Analyzer for Micro-blogs

Author: Aditya Joshi ; Balamurali AR ; Pushpak Bhattacharyya ; Rajat Mohanty

Abstract: Social networking and micro-blogging sites are stores of opinion-bearing content created by human users. We describe C-Feel-It, a system which can tap opinion content in posts (called tweets) from the micro-blogging website, Twitter. This web-based system categorizes tweets pertaining to a search string as positive, negative or objective and gives an aggregate sentiment score that represents a sentiment snapshot for a search string. We present a qualitative evaluation of this system based on a human-annotated tweet corpus.

3 0.35403419 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification

Author: Danushka Bollegala ; David Weir ; John Carroll

Abstract: We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains. We automatically create a sentiment sensitive thesaurus using both labeled and unlabeled data from multiple source domains to find the association between words that express similar sentiments in different domains. The created thesaurus is then used to expand feature vectors to train a binary classifier. Unlike previous cross-domain sentiment classification methods, our method can efficiently learn from multiple source domains. Our method significantly outperforms numerous baselines and returns results that are better than or comparable to previous cross-domain sentiment classification methods on a benchmark dataset containing Amazon user reviews for different types of products.

4 0.34685415 204 acl-2011-Learning Word Vectors for Sentiment Analysis

Author: Andrew L. Maas ; Raymond E. Daly ; Peter T. Pham ; Dan Huang ; Andrew Y. Ng ; Christopher Potts

Abstract: Unsupervised vector-based approaches to semantics can model rich lexical meanings, but they largely fail to capture sentiment information that is central to many word meanings and important for a wide range of NLP tasks. We present a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term–document information as well as rich sentiment content. The proposed model can leverage both continuous and multi-dimensional sentiment information as well as non-sentiment annotations. We instantiate the model to utilize the document-level sentiment polarity annotations present in many online documents (e.g. star ratings). We evaluate the model using small, widely used sentiment and subjectivity corpora and find it outperforms several previously introduced methods for sentiment classification. We also introduce a large dataset of movie reviews to serve as a more robust benchmark for work in this area.

5 0.34564659 242 acl-2011-Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments

Author: Kevin Gimpel ; Nathan Schneider ; Brendan O'Connor ; Dipanjan Das ; Daniel Mills ; Jacob Eisenstein ; Michael Heilman ; Dani Yogatama ; Jeffrey Flanigan ; Noah A. Smith

Abstract: We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.

6 0.33669412 160 acl-2011-Identifying Sarcasm in Twitter: A Closer Look

7 0.31505099 261 acl-2011-Recognizing Named Entities in Tweets

8 0.29551211 177 acl-2011-Interactive Group Suggesting for Twitter

9 0.28741983 281 acl-2011-Sentiment Analysis of Citations using Sentence Structure-Based Features

10 0.28405732 218 acl-2011-MemeTube: A Sentiment-based Audiovisual System for Analyzing and Displaying Microblog Messages

11 0.25937411 253 acl-2011-PsychoSentiWordNet

12 0.25873911 279 acl-2011-Semi-supervised latent variable models for sentence-level sentiment analysis

13 0.25552779 105 acl-2011-Dr Sentiment Knows Everything!

14 0.24390543 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

15 0.22181515 54 acl-2011-Automatically Extracting Polarity-Bearing Topics for Cross-Domain Sentiment Classification

16 0.21109134 45 acl-2011-Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews

17 0.15704587 159 acl-2011-Identifying Noun Product Features that Imply Opinions

18 0.15698776 211 acl-2011-Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates

19 0.15049993 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

20 0.14257155 208 acl-2011-Lexical Normalisation of Short Text Messages: Makn Sens a #twitter

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.227), (1, 0.449), (2, 0.404), (3, -0.134), (4, 0.11), (5, 0.006), (6, 0.002), (7, -0.237), (8, -0.056), (9, 0.085), (10, -0.185), (11, 0.181), (12, 0.156), (13, -0.117), (14, -0.125), (15, -0.066), (16, -0.043), (17, 0.036), (18, -0.043), (19, 0.044), (20, -0.122), (21, -0.031), (22, -0.019), (23, -0.021), (24, 0.0), (25, 0.064), (26, -0.049), (27, -0.091), (28, 0.052), (29, 0.04), (30, -0.021), (31, 0.023), (32, -0.007), (33, -0.026), (34, -0.025), (35, 0.014), (36, 0.073), (37, -0.079), (38, 0.003), (39, -0.011), (40, -0.046), (41, -0.063), (42, 0.013), (43, 0.005), (44, -0.01), (45, 0.004), (46, -0.022), (47, -0.057), (48, 0.01), (49, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.97436768 64 acl-2011-C-Feel-It: A Sentiment Analyzer for Micro-blogs

Author: Aditya Joshi ; Balamurali AR ; Pushpak Bhattacharyya ; Rajat Mohanty

same-paper 2 0.9592945 292 acl-2011-Target-dependent Twitter Sentiment Classification

Author: Long Jiang ; Mo Yu ; Ming Zhou ; Xiaohua Liu ; Tiejun Zhao

3 0.84917635 160 acl-2011-Identifying Sarcasm in Twitter: A Closer Look

Author: Roberto Gonzalez-Ibanez ; Smaranda Muresan ; Nina Wacholder

Abstract: Sarcasm transforms the polarity of an apparently positive or negative utterance into its opposite. We report on a method for constructing a corpus of sarcastic Twitter messages in which determination of the sarcasm of each message has been made by its author. We use this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm. We investigate the impact of lexical and pragmatic factors on machine learning effectiveness for identifying sarcastic utterances and we compare the performance of machine learning techniques and human judges on this task. Perhaps unsurprisingly, neither the human judges nor the machine learning techniques perform very well.

4 0.66586745 218 acl-2011-MemeTube: A Sentiment-based Audiovisual System for Analyzing and Displaying Microblog Messages

Author: Cheng-Te Li ; Chien-Yuan Wang ; Chien-Lin Tseng ; Shou-De Lin

Abstract: Micro-blogging services provide platforms for users to share their feelings and ideas on the move. In this paper, we present a search-based demonstration system, called MemeTube, to summarize the sentiments of microblog messages in an audiovisual manner. MemeTube provides three main functions: (1) recognizing the sentiments of messages (2) generating music melody automatically based on detected sentiments, and (3) produce an animation of real-time piano playing for audiovisual display. Our MemeTube system can be accessed via: http://mslab.csie.ntu.edu.tw/memetube/ .

5 0.65559977 242 acl-2011-Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments

Author: Kevin Gimpel ; Nathan Schneider ; Brendan O'Connor ; Dipanjan Das ; Daniel Mills ; Jacob Eisenstein ; Michael Heilman ; Dani Yogatama ; Jeffrey Flanigan ; Noah A. Smith

6 0.63792479 261 acl-2011-Recognizing Named Entities in Tweets

7 0.58507895 279 acl-2011-Semi-supervised latent variable models for sentence-level sentiment analysis

8 0.57864422 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification

9 0.56362772 204 acl-2011-Learning Word Vectors for Sentiment Analysis

10 0.545654 177 acl-2011-Interactive Group Suggesting for Twitter

11 0.52174467 281 acl-2011-Sentiment Analysis of Citations using Sentence Structure-Based Features

12 0.50993568 45 acl-2011-Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews

13 0.49485371 253 acl-2011-PsychoSentiWordNet

14 0.48733935 105 acl-2011-Dr Sentiment Knows Everything!

15 0.48210019 54 acl-2011-Automatically Extracting Polarity-Bearing Topics for Cross-Domain Sentiment Classification

16 0.43487042 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

17 0.41279832 211 acl-2011-Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates

18 0.40744686 305 acl-2011-Topical Keyphrase Extraction from Twitter

19 0.39997345 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

20 0.39808342 82 acl-2011-Content Models with Attitude

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.023), (17, 0.026), (26, 0.057), (31, 0.026), (37, 0.172), (39, 0.038), (41, 0.051), (53, 0.029), (55, 0.017), (59, 0.045), (60, 0.141), (72, 0.12), (88, 0.014), (91, 0.026), (96, 0.122)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.85511971 292 acl-2011-Target-dependent Twitter Sentiment Classification

Author: Long Jiang ; Mo Yu ; Ming Zhou ; Xiaohua Liu ; Tiejun Zhao

2 0.83566308 120 acl-2011-Even the Abstract have Color: Consensus in Word-Colour Associations

Author: Saif Mohammad

Abstract: Colour is a key component in the successful dissemination of information. Since many real-world concepts are associated with colour, for example danger with red, linguistic information is often complemented with the use of appropriate colours in information visualization and product marketing. Yet, there is no comprehensive resource that captures concept–colour associations. We present a method to create a large word–colour association lexicon by crowdsourcing. A word-choice question was used to obtain sense-level annotations and to ensure data quality. We focus especially on abstract concepts and emotions to show that even they tend to have strong colour associations. Thus, using the right colours can not only improve semantic coherence, but also inspire the desired emotional response.

3 0.80983448 261 acl-2011-Recognizing Named Entities in Tweets

Author: Xiaohua LIU ; Shaodian ZHANG ; Furu WEI ; Ming ZHOU

Abstract: The challenges of Named Entities Recognition (NER) for tweets lie in the insufficient information in a tweet and the unavailability of training data. We propose to combine a K-Nearest Neighbors (KNN) classifier with a linear Conditional Random Fields (CRF) model under a semi-supervised learning framework to tackle these challenges. The KNN based classifier conducts pre-labeling to collect global coarse evidence across tweets while the CRF model conducts sequential labeling to capture fine-grained information encoded in a tweet. The semi-supervised learning plus the gazetteers alleviate the lack of training data. Extensive experiments show the advantages of our method over the baselines as well as the effectiveness of KNN and semi-supervised learning.

4 0.7855022 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Author: Bin Lu ; Chenhao Tan ; Claire Cardie ; Benjamin K. Tsou

Abstract: Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data. We rely on the intuition that the sentiment labels for parallel sentences should be similar and present a model that jointly learns improved monolingual sentiment classifiers for each language. Experiments on multiple data sets show that the proposed approach (1) outperforms the monolingual baselines, significantly improving the accuracy for both languages by 3.44%-8.12%; (2) outperforms two standard approaches for leveraging unlabeled data; and (3) produces (albeit smaller) performance gains when employing pseudo-parallel data from machine translation engines.

5 0.78528666 256 acl-2011-Query Weighting for Ranking Model Adaptation

Author: Peng Cai ; Wei Gao ; Aoying Zhou ; Kam-Fai Wong

Abstract: We propose to directly measure the importance of queries in the source domain to the target domain where no rank labels of documents are available, which is referred to as query weighting. Query weighting is a key step in ranking model adaptation. As the learning object of ranking algorithms is divided by query instances, we argue that it’s more reasonable to conduct importance weighting at query level than document level. We present two query weighting schemes. The first compresses the query into a query feature vector, which aggregates all document instances in the same query, and then conducts query weighting based on the query feature vector. This method can efficiently estimate query importance by compressing query data, but the potential risk is information loss resulted from the compression. The second measures the similarity between the source query and each target query, and then combines these fine-grained similarity values for its importance estimation. Adaptation experiments on LETOR3.0 data set demonstrate that query weighting significantly outperforms document instance weighting methods.

6 0.7839601 122 acl-2011-Event Extraction as Dependency Parsing

7 0.78304589 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

8 0.7816785 32 acl-2011-Algorithm Selection and Model Adaptation for ESL Correction Tasks

9 0.78019071 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification

10 0.77951545 334 acl-2011-Which Noun Phrases Denote Which Concepts?

11 0.77877724 48 acl-2011-Automatic Detection and Correction of Errors in Dependency Treebanks

12 0.77811956 147 acl-2011-Grammatical Error Correction with Alternating Structure Optimization

13 0.77712691 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

14 0.77631819 85 acl-2011-Coreference Resolution with World Knowledge

15 0.77533942 333 acl-2011-Web-Scale Features for Full-Scale Parsing

16 0.77329421 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

17 0.77297723 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

18 0.77266234 127 acl-2011-Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing

19 0.76814497 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines

20 0.7667821 204 acl-2011-Learning Word Vectors for Sentiment Analysis