emnlp emnlp2011 emnlp2011-30 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ainur Yessenalina ; Claire Cardie
Abstract: We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. This paper presents the first such algorithm for learning a matrix-space model for semantic composition. In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. [sent-4, score-1.451]
2 Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. [sent-5, score-0.245]
3 For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. [sent-10, score-0.31]
4 Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. [sent-11, score-0.426]
5 Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. [sent-12, score-0.231]
6 In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model. [sent-14, score-0.507]
7 Work in the area ranges from identifying the sentiment of individual words to determining the sentiment of phrases, sentences and documents. [sent-16, score-1.014]
8 Most previous work, however, treats the task as a binary decision: positive vs. negative sentiment, collapsing positive (or negative) words, phrases and documents of differing intensities into just one positive (or negative) class. [sent-21, score-0.251]
9 For word-level sentiment, therefore, these methods would not recognize a difference in sentiment between words like “good” and “great”, which have the same direction of polarity (i. [sent-22, score-0.673]
10 At the phrase level, the methods will fail to register compositional effects in sentiment brought about by intensifiers like “very”, “absolutely”, “extremely”, etc. [sent-25, score-0.84]
11 In real-world settings, on the other hand, sentiment values extend across a polarity spectrum from very negative, to neutral, to very positive. [sent-27, score-0.673]
12 This paper describes a general approach for phrase-level sentiment analysis that takes these real-world requirements into account: we adopt a five-level ordinal sentiment scale and present a learning-based method that assigns ordinal sentiment scores to phrases. [sent-30, score-1.992]
13 Consider, for example, combining an adverb like “very” with a polar adjective like “good”. [sent-35, score-0.21]
14 Combining “very” with a negative adjective, like “bad”, produces a phrase (“very bad”) that should be characterized as more negative than the original adjective. [sent-37, score-0.24]
15 Thus, it is convenient to think of the effect of combining an intensifying adverb with a polar adjective as being multiplicative in nature, if we assume the adjectives (“good” and “bad”) to have positive and negative sentiment scores, respectively. [sent-38, score-1.0]
16 When modeling only positive and negative labels for sentiment, negators are generally treated as flipping the polarity of the adjective they modify (Choi and Cardie, 2008; Nakagawa et al., 2010). [sent-40, score-0.507]
17 However, recent work (Taboada et al., 2011; Liu and Seneff, 2009) suggests that the effect of the negator when ordinal sentiment scores are employed is more akin to dampening the adjective’s polarity rather than flipping it. [sent-43, score-0.977]
18 For these cases, it is convenient to view “not” as shifting polarity to the opposite side of the polarity scale by some value. [sent-46, score-0.368]
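To make the two views above concrete, here is a toy numeric sketch; the sentiment values and the “very”/“not” constants are invented purely for illustration and are not learned values from the paper.

```python
# Toy illustration of the multiplicative (intensifier) and shift (negator)
# views of sentiment composition. All numbers are invented for illustration.

def intensify(score, factor=1.5):
    """Multiplicative view: "very" scales the adjective's score."""
    return factor * score

def negate(score, shift=1.3):
    """Shift view: "not" moves the score to the other side of neutral."""
    return score - shift if score > 0 else score + shift

good, bad = 1.0, -1.0
print(intensify(good))  # 1.5  -> "very good" is more positive than "good"
print(intensify(bad))   # -1.5 -> "very bad" is more negative than "bad"
print(negate(good))     # -0.3 -> "not good" is mildly negative, not "bad"
print(negate(bad))      # 0.3  -> "not bad" is mildly positive
```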
19 There are, of course, more interesting examples of compositional semantic effects on sentiment: e. [sent-47, score-0.245]
20 Here, the verbs prevent and ease act as content-word negators (Choi and Cardie, 2008) in that they modify the negative sentiment of their direct object arguments so that the phrase as a whole is perceived as somewhat positive. [sent-50, score-0.761]
21 One notable exception is Moilanen and Pulman (2007), who propose a compositional semantic approach to assign a positive or negative sentiment to newspaper article titles. [sent-56, score-0.853]
22 However, their knowledge-based approach presupposes the existence of a sentiment lexicon and a set of symbolic compositional rules. [sent-57, score-0.708]
23 But learning-based compositional approaches for sentiment analysis also exist. [sent-58, score-0.708]
24 Choi and Cardie (2008), for example, propose an algorithm for phrase-based sentiment analysis that learns proper assignments of intermediate sentiment analysis decision variables given the a priori (i.e. [sent-59, score-1.014]
25 , out of context) polarity of the words in the phrase and the (correct) phrase-level polarity. [sent-61, score-0.224]
26 As in Moilanen and Pulman (2007), semantic inference is based on (a small set of) hand-written compositional rules. [sent-62, score-0.201]
27 Nakagawa et al. (2010) use a dependency parse tree to guide the learning of compositional effects. [sent-64, score-0.201]
28 Each of the above, however, uses a binary rather than an ordinal sentiment scale. [sent-65, score-0.707]
29 In contrast, our proposed method for phrase-level sentiment analysis is inspired by recent work on distributional approaches to compositionality. [sent-66, score-0.548]
30 In particular, Baroni and Zamparelli (2010) tackle adjective-noun compositions using a vector representation for nouns and learning a matrix representation for each adjective. [sent-67, score-0.177]
31 The adjective matrices are then applied as functions over the meanings of nouns via matrix-vector multiplication to derive the meaning of adjective-noun combinations. [sent-68, score-0.299]
32 In the sections below, we propose a learning-based approach to assign ordinal sentiment scores to sentiment-bearing phrases using a general compositional matrix-space model of language. [sent-71, score-0.943]
33 In contrast to previous work, all words are modeled as matrices, independent of their part-of-speech, and compositional inference is uniformly modeled as matrix multiplication. [sent-72, score-0.337]
34 To predict an ordinal scale sentiment value, we employ Ordered Logistic Regression, introducing a novel training algorithm to accommodate our compositional matrix-space representations (Section 2). [sent-73, score-0.944]
35 We evaluate the approach on a standard sentiment corpus (Wiebe et al. [sent-75, score-0.507]
36 We show (Section 4) that our matrix-space model significantly outperforms a bag-of-words model for the ordinal scale sentiment prediction task. [sent-77, score-0.743]
37 2 The Model for Ordinal Scale Sentiment Prediction As described above, our task is to predict an ordinal scale sentiment value for a phrase. [sent-78, score-0.743]
38 To this end, we employ a sentiment scale with five ordinal values: VERY NEGATIVE, NEGATIVE, NEUTRAL, POSITIVE and VERY POSITIVE. [sent-79, score-0.743]
39 Given a set of phrase-level training examples with their gold-standard ordinal sentiment values, we then use an Ordered Logistic Regression (OLogReg) model for prediction. [sent-80, score-0.748]
40 In the next subsections, we instantiate OLogReg for our sentiment prediction task using a matrix-space word model (2. [sent-86, score-0.507]
41 Let $x_i$ be the i-th phrase and $y_i$ the label of $x_i$, where $y_i$ takes one of r values, $y_i \in \{0, \ldots, r-1\}$. [sent-96, score-0.387]
42 Note that, unlike the bag-of-words model, the matrix-space model takes word order into account, since matrix multiplication is not a commutative operation. [sent-121, score-0.203]
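A minimal sketch of this composition step, assuming for illustration that the scalar phrase score ξ is read out of the composed matrix via two fixed vectors u and v; the paper's equation (2) defines its actual read-out, which may differ.

```python
import numpy as np
from functools import reduce

def phrase_score(word_matrices, u, v):
    """Compose word matrices left-to-right by matrix multiplication,
    then read out a scalar via fixed vectors u and v (an assumed
    read-out for illustration; the paper's may differ)."""
    composed = reduce(np.matmul, word_matrices)  # order matters
    return u @ composed @ v

m = 3
rng = np.random.default_rng(0)
W_very = rng.normal(size=(m, m))
W_good = rng.normal(size=(m, m))
u, v = np.ones(m), np.ones(m)

# Word order is preserved: matrix multiplication is not commutative.
print(phrase_score([W_very, W_good], u, v))
print(phrase_score([W_good, W_very], u, v))
```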
43 Then the constraints in (3) become $\tau_j \geq 0$, $1 \leq j \leq r-2$ (4). To simplify the equations we can rewrite the negative log-likelihood as follows: $\mathcal{L} = -\sum_{i=1}^{n} \sum_{k=0}^{r-1} \ln(A_{ik} - B_{ik}) \, I(y_i = k)$ (5), where $A_{ik} = F(\kappa_0 + \sum_{j=1}^{k} \tau_j - \xi_i)$ for $k = 0, \ldots, r-2$, $A_{ik} = 1$ for $k = r-1$, and $B_{ik}$ is defined analogously with the threshold index shifted down by one ($B_{i0} = 0$). [sent-138, score-0.179]
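A direct transcription of objective (5) as reconstructed above, assuming F is the logistic function and parameterizing the thresholds as $\kappa_k = \kappa_0 + \sum_{j \leq k} \tau_j$; the variable names are ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(xi, y, kappa0, tau, r):
    """Negative log-likelihood of objective (5).
    xi: phrase scores, y: integer labels in {0,...,r-1},
    kappa0: first threshold, tau: nonnegative gaps (length r-2)."""
    kappas = kappa0 + np.concatenate(([0.0], np.cumsum(tau)))  # r-1 thresholds
    nll = 0.0
    for score, label in zip(xi, y):
        A = 1.0 if label == r - 1 else sigmoid(kappas[label] - score)
        B = 0.0 if label == 0 else sigmoid(kappas[label - 1] - score)
        nll -= np.log(A - B)
    return nll

print(neg_log_likelihood(np.array([-2.0, 0.0, 2.0]),
                         np.array([0, 2, 4]),
                         kappa0=-1.5, tau=np.array([1.0, 1.0, 1.0]), r=5))
```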
44 $W_j \in \mathbb{R}^{m \times m}$, $W_j$ affine (7). We choose the class of affine matrices since, for affine matrices, matrix multiplication represents both operations: linear transformation and translation. [sent-150, score-1.152]
45 Linear transformation is important for modeling changes in sentiment; translation is also useful (we make use of a translation vector during initialization; see Section 2). [sent-151, score-0.591]
46 Applying the affine transformation W to the vector $[x, 1]^T$ is equivalent to applying the linear transformation A and the translation b to x. [sent-156, score-0.359]
47 Also, the product of affine matrices is an affine matrix. [sent-162, score-0.528]
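Both facts are easy to verify numerically; a small sketch with arbitrary values:

```python
import numpy as np

def affine(A, b):
    """Build the (m+1)x(m+1) affine matrix W = [[A, b], [0, 1]]."""
    m = A.shape[0]
    W = np.zeros((m + 1, m + 1))
    W[:m, :m], W[:m, m], W[m, m] = A, b, 1.0
    return W

rng = np.random.default_rng(1)
A1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)
A2, b2 = rng.normal(size=(2, 2)), rng.normal(size=2)
x = rng.normal(size=2)

W1 = affine(A1, b1)
# W1 @ [x, 1] equals [A1 @ x + b1, 1]: linear transformation plus translation.
assert np.allclose(W1 @ np.append(x, 1.0), np.append(A1 @ x + b1, 1.0))
# The product of two affine matrices is again affine (last row stays (0, 1)).
assert np.allclose((W1 @ affine(A2, b2))[-1], [0.0, 0.0, 1.0])
print("both affine properties hold")
```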
48 If a certain word appears multiple times in the phrase, the derivative with respect to that word is the sum of the derivatives with respect to each appearance of the word, with all other appearances held fixed. [sent-190, score-0.208]
49 So, given the negative log-likelihood and the derivatives with respect to $\kappa_0$, the $\tau_j$'s, and the word matrices W, we optimize objective (5) subject to $\tau_j \geq 0$. [sent-212, score-0.405]
50 First, we make the matrix affine by updating the last row; the updated matrix $\hat{W}_j$ then has last row $(0, \ldots, 0, 1)$. [sent-222, score-0.327]
51 It can be proven that such a projection returns the closest affine matrix in Frobenius norm. [sent-224, score-0.327]
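A sketch of the projection as we read it: overwrite the last row with $(0, \ldots, 0, 1)$. Since the squared Frobenius distance decomposes row by row, keeping the other rows unchanged is the exact minimizer over affine matrices.

```python
import numpy as np

def project_to_affine(W):
    """Return the closest affine matrix to W in Frobenius norm by
    replacing the last row with (0, ..., 0, 1). Row-wise separability
    of the Frobenius norm makes this the exact minimizer."""
    W_hat = W.copy()
    W_hat[-1, :] = 0.0
    W_hat[-1, -1] = 1.0
    return W_hat

W = np.random.default_rng(2).normal(size=(3, 3))
print(project_to_affine(W))
```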
52 More specifically, we first use the Singular Value Decomposition (SVD) of the block $A_{11}$: $U \Sigma V^T = A_{11}$, where U and V are orthogonal matrices and $\Sigma$ is a diagonal matrix of singular values. [sent-228, score-0.18]
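A minimal NumPy sketch of that decomposition; only the stated properties are checked, and how the paper subsequently uses U, Σ, and V in its initialization is not reproduced here.

```python
import numpy as np

A11 = np.random.default_rng(3).normal(size=(3, 3))
U, S, Vt = np.linalg.svd(A11)                 # A11 = U @ diag(S) @ Vt
assert np.allclose(U @ np.diag(S) @ Vt, A11)  # reconstruction holds
assert np.allclose(U.T @ U, np.eye(3))        # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(3))      # V is orthogonal
```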
53 L-BFGS-B returns a solution that is not necessarily an affine matrix. [sent-240, score-0.191]
54 After projecting to the space of affine matrices we start L-BFGS-B from a better initial point. [sent-241, score-0.337]
55 We minimize the negative log-likelihood using L-BFGS-B, subject to $\tau_j \geq 0$. [sent-255, score-0.179]
56 The L2-regularized negative log-likelihood will consist of the expression in (5) and an additional term $\frac{\lambda}{2}\|w\|_2^2$, where $\|\cdot\|_2$ is the L2-norm of a vector. [sent-258, score-0.212]
57 The derivative of the additional term with respect to $w$ is $\frac{\partial}{\partial w}\,\frac{\lambda}{2}\|w\|_2^2 = \lambda w$; hence the partial derivative with respect to $w_{x_i j}$ will have an additional term $\lambda w_{x_i j}$. [sent-259, score-0.219]
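In code, the regularizer amounts to a one-line addition to the objective and to the gradient (a sketch; the names are ours):

```python
import numpy as np

def l2_regularized(nll, grad, w, lam):
    """Add (lam/2)*||w||^2 to the objective and lam*w to its gradient."""
    return nll + 0.5 * lam * np.dot(w, w), grad + lam * w

nll, grad = 1.7, np.array([0.2, -0.5])
w = np.array([1.0, 2.0])
print(l2_regularized(nll, grad, w, lam=0.1))
```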
58 As discussed in Section 2.1, applying the transformation A of an affine matrix W can model a linear transformation, while the vector b represents a translation. [sent-272, score-0.411]
59 Since the matrix-space model can encode a vector-space model (Rudolph and Giesbrecht, 2010), we can initialize the matrices to exactly mimic the bag-of-words model. [sent-273, score-0.2]
60 For a phrase $x_1 x_2$, the bag-of-words model computes the polarity score by summing the weights of these two words: $w_{x_1} + w_{x_2}$. [sent-276, score-0.224]
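One way such an encoding can work, in the spirit of Rudolph and Giesbrecht (2010): store each bag-of-words weight in the translation slot of an otherwise-identity affine matrix, so that matrix products sum the weights. A sketch; the dimension and the slot used for the read-out are our choices.

```python
import numpy as np

def bow_init(weight, m=3):
    """Identity affine matrix whose translation slot carries one
    bag-of-words weight; products of such matrices sum the weights."""
    W = np.eye(m + 1)
    W[0, m] = weight
    return W

w_x1, w_x2 = 0.8, -0.3
composed = bow_init(w_x1) @ bow_init(w_x2)
# The translation entry of the product is the bag-of-words sum.
assert np.isclose(composed[0, -1], w_x1 + w_x2)
print(composed[0, -1])  # 0.5
```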
61 Table 1: Mapping of combinations of polarities and intensities from the MPQA dataset to our ordinal sentiment scale. [sent-291, score-0.759]
62 The schematic mapping of phrase polarity and intensity values onto the ordinal sentiment scale is shown in Table 1. [sent-293, score-0.495]
63 With this value of λ fixed, the final model is the one with the lowest negative log-likelihood on the training set. [sent-309, score-0.216]
64 In Table 3 we show the sentiment scores of the best-performing bag-of-words OLogReg model and the best-performing matrix-space model. Table 3: Phrases and their sentiment scores under the two models, Matrix-space OLogReg+BowInit and Bag-of-words OLogReg, respectively. [sent-345, score-1.072]
65 By sentiment score, we mean equation (1) for Bag-of-words OLogReg and equation (2) for Matrix-space OLogReg+BowInit. [sent-347, score-0.507]
66 Here we choose two popular adjectives, ‘good’ and ‘bad’, that appeared in the training data, and examine the effect of applying the intensifier ‘very’ on the sentiment score. [sent-348, score-0.6]
67 As we can see, the matrix-space model learns a matrix that intensifies both ‘bad’ and ‘good’ in the correct direction on the sentiment scale, i.e. [sent-349, score-0.643]
68 , ξ(good) < ξ(very good) and ξ(very bad) < ξ(bad), while the bag-of-words model gets the sentiment of ‘very bad’ wrong: it is more positive than ‘bad’. [sent-351, score-0.561]
69 The matrix-space model correctly encodes the effect of the negator for both positive and negative adjectives, such that ξ(not good) < ξ(good) and ξ(bad) < ξ(not bad). [sent-353, score-0.214]
70 Since each word is represented as a function, more specifically a linear operator, and function composition is defined as matrix multiplication, we can think of “not very” as being an operator itself, that is, a composition of the operator “not” and the operator “very”. [sent-358, score-0.491]
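Because composition is plain matrix multiplication, the compound modifier can be precomputed once and reused; a sketch with random stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(4)
W_not, W_very, W_good = (rng.normal(size=(3, 3)) for _ in range(3))

# "not very" is itself an operator: the composition of "not" and "very".
W_not_very = W_not @ W_very
assert np.allclose(W_not_very @ W_good, W_not @ W_very @ W_good)
```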
71 There has been a lot of research in determining the sentiment of words and constructing polarity dictionaries (Hatzivassiloglou and McKeown, 1997; Wiebe, 2000; Rao and Ravichandran, 2009; Mohammad et al., 2009). [sent-360, score-0.673]
72 Some recent work is trying to identify the degree of sentiment of adjectives and adverbs from text using co-occurrence statistics. [sent-363, score-0.57]
73 Taboada et al. (2011) and Liu and Seneff (2009) suggest ways of computing the sentiment of adjectives from data, modeling the effect of combining an adjective with an adverb as multiplicative and the effect of combining an adjective with negation as additive. [sent-365, score-1.025]
74 However, these models require knowledge of the part of speech of the given words and a list of negators (since a negator is an adverb as well). [sent-366, score-0.23]
75 On the other hand, there has been some research in trying to model compositional effects for sentiment at the phrase- and sentence-level. [sent-368, score-0.752]
76 Choi and Cardie (2008) hand-code compositional rules in order to model compositional effects of combining different words in the phrase. [sent-369, score-0.446]
77 Another recent work that tries to model the compositional semantics of combining different words is Nakagawa et al. [sent-371, score-0.232]
78 (2010), which proposes a model that learns the effects of combining different words using phrase/sentence dependency parse trees and an initial polarity dictionary. [sent-373, score-0.241]
79 They present a learning method that employs hidden variables for sentiment classification: given the polarity of a sentence and the a priori polarities of its words, they learn how to model the interactions between words with head-modifier relations in the dependency tree. [sent-374, score-0.673]
80 Our task is different: we classify phrases according to a single ordinal scale that combines both polarity and strength. [sent-379, score-0.402]
81 In the current work we look at fine-grained sentiment analysis; more specifically, we study word representations for use in true compositional semantic settings. [sent-381, score-0.507]
82 Mitchell and Lapata (2008), for instance, define composition as an additive or multiplicative function of two vectors and show that compositional approaches generally outperform non-compositional approaches that treat the phrase as the union of single lexical items. [sent-385, score-0.497]
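The additive and multiplicative vector compositions referenced here are one-liners; a sketch with arbitrary vectors:

```python
import numpy as np

u = np.array([0.2, 0.9, 0.1])
v = np.array([0.5, 0.4, 0.7])
additive = u + v        # p = u + v
multiplicative = u * v  # p = u (*) v, element-wise product
print(additive, multiplicative)
```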
83 Baroni and Zamparelli (2010) show that modeling adjectives as linear transformations and applying those linear transformations to nouns results in final vectors for adjective-noun compositions that are close in semantic space to other similar phrases. [sent-387, score-0.176]
84 The authors argue that modeling adjectives as a linear transformation is a better idea than using additive vector-space models. [sent-388, score-0.188]
85 In their work, a separate matrix for each adjective is learned using the Partial Least Squares method in a completely unsupervised way. [sent-389, score-0.222]
86 6 Conclusions and Future Work. In the current work we present a novel matrix-space model for ordinal scale sentiment prediction and an algorithm for learning such a model. [sent-393, score-0.743]
87 The proposed model learns a matrix for each word; the composition of words is modeled as iterated matrix multiplication. [sent-394, score-0.432]
88 The matrix-space framework with iterated matrix multiplication defines an elegant framework for modeling composition; it is also quite general. [sent-395, score-0.241]
89 We use the matrix-space framework in the context of sentiment prediction, a domain where interesting compositional effects can be observed. [sent-396, score-0.752]
90 In particular, we study word representations (i.e., each word represented as a matrix) for use in true compositional semantic settings. [sent-398, score-0.201]
91 One of the benefits of the proposed approach is that by learning matrices for words, the model can handle unseen word compositions (e. [sent-399, score-0.187]
92 Though in our model the order of composition is the same as the word order, we believe that a linguistically informed order of composition can give us further performance gains. [sent-406, score-0.244]
93 One interesting direction to explore might be to use non-negative matrix factorization (Lee and Seung, 2001) or co-clustering techniques (Dhillon, 2001) to better initialize words that share similar contexts. [sent-414, score-0.19]
94 The other possible direction is to use existing sentiment lexicons and employ a “curriculum learning” strategy (Bengio et al., 2009). [sent-415, score-0.507]
95 Learning with compositional semantics as structural inference for subsentential sentiment analysis. [sent-440, score-0.708]
96 What’s great and what’s not: learning to classify the scope of negation for improved sentiment analysis. [sent-445, score-0.573]
97 Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. [sent-470, score-0.548]
98 Dependency tree-based sentiment classification using CRFs with hidden variables. [sent-518, score-0.507]
99 Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. [sent-522, score-0.546]
100 Assessing sentiment of text by semantic dependency and contextual valence analysis. [sent-553, score-0.546]
wordName wordTfidf (topN-words)
[('sentiment', 0.507), ('ologreg', 0.417), ('compositional', 0.201), ('ordinal', 0.2), ('affine', 0.191), ('polarity', 0.166), ('matrices', 0.146), ('matrix', 0.136), ('composition', 0.122), ('bik', 0.122), ('aik', 0.105), ('negative', 0.091), ('bad', 0.091), ('derivative', 0.089), ('loglikelihood', 0.088), ('adjective', 0.086), ('yi', 0.085), ('transformation', 0.084), ('rm', 0.082), ('wj', 0.078), ('multiplicative', 0.075), ('nakagawa', 0.075), ('mpqa', 0.075), ('negators', 0.075), ('xi', 0.074), ('cardie', 0.074), ('negator', 0.069), ('taboada', 0.069), ('xji', 0.069), ('rudolph', 0.068), ('multiplication', 0.067), ('negation', 0.066), ('adjectives', 0.063), ('wilson', 0.061), ('choi', 0.061), ('phrase', 0.058), ('positive', 0.054), ('initialize', 0.054), ('polar', 0.054), ('wiebe', 0.052), ('cornell', 0.052), ('ijk', 0.052), ('intensities', 0.052), ('moilanen', 0.052), ('seneff', 0.052), ('wxij', 0.052), ('wxji', 0.052), ('giesbrecht', 0.05), ('initialization', 0.046), ('nonconvex', 0.045), ('prank', 0.045), ('det', 0.045), ('effects', 0.044), ('singular', 0.044), ('stars', 0.041), ('compositions', 0.041), ('derivatives', 0.041), ('phraselevel', 0.041), ('additive', 0.041), ('adverb', 0.039), ('janyce', 0.039), ('valence', 0.039), ('respect', 0.039), ('iterated', 0.038), ('pulman', 0.038), ('operator', 0.037), ('lowest', 0.037), ('transformations', 0.036), ('scale', 0.036), ('wp', 0.035), ('intensity', 0.035), ('ainur', 0.035), ('corne', 0.035), ('councill', 0.035), ('curriculum', 0.035), ('flipping', 0.035), ('hardin', 0.035), ('ixi', 0.035), ('learningbased', 0.035), ('matrixspace', 0.035), ('pranking', 0.035), ('utw', 0.035), ('yjx', 0.035), ('optimization', 0.034), ('pang', 0.034), ('logistic', 0.033), ('expression', 0.033), ('combining', 0.031), ('validation', 0.031), ('theresa', 0.03), ('baroni', 0.03), ('good', 0.03), ('thresholds', 0.03), ('prevent', 0.03), ('byrd', 0.03), ('intensifier', 0.03), ('intensifiers', 0.03), ('ithaca', 0.03), ('shaikh', 0.03), ('terrible', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999997 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
Author: Ainur Yessenalina ; Claire Cardie
Abstract: We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. This paper presents the first such algorithm for learning a matrix-space model for semantic composition. In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model.
2 0.32339135 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
Author: Richard Socher ; Jeffrey Pennington ; Eric H. Huang ; Andrew Y. Ng ; Christopher D. Manning
Abstract: We introduce a novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions. Our method learns vector space representations for multi-word phrases. In sentiment prediction tasks these representations outperform other state-of-the-art approaches on commonly used datasets, such as movie reviews, without using any pre-defined sentiment lexica or polarity shifting rules. We also evaluate the model’s ability to predict sentiment distributions on a new dataset based on confessions from the experience project. The dataset consists of personal user stories annotated with multiple labels which, when aggregated, form a multinomial distribution that captures emotional reactions. Our algorithm can more accurately predict distributions over such labels compared to several competitive baselines.
Author: Samuel Brody ; Nicholas Diakopoulos
Abstract: We present an automatic method which leverages word lengthening to adapt a sentiment lexicon specifically for Twitter and similar social messaging networks. The contributions of the paper are as follows. First, we call attention to lengthening as a widespread phenomenon in microblogs and social messaging, and demonstrate the importance of handling it correctly. We then show that lengthening is strongly associated with subjectivity and sentiment. Finally, we present an automatic method which leverages this association to detect domain-specific sentiment- and emotion-bearing words. We evaluate our method by comparison to human judgments, and analyze its strengths and weaknesses. Our results are of interest to anyone analyzing sentiment in microblogs and social networks, whether for research or commercial purposes.
4 0.25690591 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
Author: Balamurali AR ; Aditya Joshi ; Pushpak Bhattacharyya
Abstract: Traditional approaches to sentiment classification rely on lexical features, syntax-based features or a combination of the two. We propose semantic features using word senses for a supervised document-level sentiment classifier. To highlight the benefit of sense-based features, we compare word-based representation of documents with a sense-based representation where WordNet senses of the words are used as features. In addition, we highlight the benefit of senses by presenting a part-of-speech-wise effect on sentiment classification. Finally, we show that even if a WSD engine disambiguates between a limited set of words in a document, a sentiment classifier still performs better than what it does in absence of sense annotation. Since word senses used as features show promise, we also examine the possibility of using similarity metrics defined on WordNet to address the problem of not finding a sense in the training corpus. We perform experiments using three popular similarity metrics to mitigate the effect of unknown synsets in a test corpus by replacing them with similar synsets from the training corpus. The results show promising improvement with respect to the baseline.
5 0.17943345 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms
Author: Song Feng ; Ritwik Bose ; Yejin Choi
Abstract: In this paper, we introduce a connotation lexicon, a new type of lexicon that lists words with connotative polarity, i.e., words with positive connotation (e.g., award, promotion) and words with negative connotation (e.g., cancer, war). Connotation lexicons differ from much studied sentiment lexicons: the latter concerns words that express sentiment, while the former concerns words that evoke or associate with a specific polarity of sentiment. Understanding the connotation of words would seem to require common sense and world knowledge. However, we demonstrate that much of the connotative polarity of words can be inferred from natural language text in a nearly unsupervised manner. The key linguistic insight behind our approach is selectional preference of connotative predicates. We present graph-based algorithms using PageRank and HITS that collectively learn connotation lexicon together with connotative predicates. Our empirical study demonstrates that the resulting connotation lexicon is of great value for sentiment analysis complementing existing sentiment lexicons.
6 0.16030636 17 emnlp-2011-Active Learning with Amazon Mechanical Turk
7 0.1547185 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
8 0.12651584 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
9 0.11438113 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
10 0.097461157 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
11 0.072956875 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
12 0.070118263 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
13 0.068779141 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context
14 0.068639763 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices
15 0.06721808 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
16 0.065992996 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
17 0.063831106 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
18 0.059256129 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
19 0.055333238 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
20 0.055256147 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition
topicId topicWeight
[(0, 0.201), (1, -0.213), (2, 0.128), (3, 0.079), (4, 0.552), (5, 0.067), (6, 0.095), (7, 0.096), (8, 0.02), (9, 0.061), (10, 0.04), (11, 0.133), (12, 0.038), (13, 0.054), (14, 0.002), (15, -0.008), (16, 0.072), (17, 0.016), (18, -0.003), (19, 0.028), (20, -0.038), (21, -0.063), (22, -0.02), (23, -0.069), (24, 0.106), (25, 0.03), (26, 0.049), (27, 0.0), (28, -0.044), (29, 0.025), (30, 0.021), (31, -0.054), (32, 0.054), (33, -0.095), (34, 0.015), (35, 0.057), (36, 0.007), (37, 0.054), (38, -0.01), (39, 0.001), (40, -0.08), (41, -0.043), (42, 0.06), (43, 0.023), (44, -0.021), (45, 0.037), (46, 0.011), (47, -0.028), (48, -0.016), (49, -0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.96062714 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
Author: Ainur Yessenalina ; Claire Cardie
Abstract: We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. This paper presents the first such algorithm for learning a matrix-space model for semantic composition. In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model.
2 0.87965798 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
Author: Richard Socher ; Jeffrey Pennington ; Eric H. Huang ; Andrew Y. Ng ; Christopher D. Manning
Abstract: We introduce a novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions. Our method learns vector space representations for multi-word phrases. In sentiment prediction tasks these representations outperform other state-of-the-art approaches on commonly used datasets, such as movie reviews, without using any pre-defined sentiment lexica or polarity shifting rules. We also evaluate the model’s ability to predict sentiment distributions on a new dataset based on confessions from the experience project. The dataset consists of personal user stories annotated with multiple labels which, when aggregated, form a multinomial distribution that captures emotional reactions. Our algorithm can more accurately predict distributions over such labels compared to several competitive baselines.
3 0.81304848 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
Author: Samuel Brody ; Nicholas Diakopoulos
Abstract: We present an automatic method which leverages word lengthening to adapt a sentiment lexicon specifically for Twitter and similar social messaging networks. The contributions of the paper are as follows. First, we call attention to lengthening as a widespread phenomenon in microblogs and social messaging, and demonstrate the importance of handling it correctly. We then show that lengthening is strongly associated with subjectivity and sentiment. Finally, we present an automatic method which leverages this association to detect domain-specific sentiment- and emotion-bearing words. We evaluate our method by comparison to human judgments, and analyze its strengths and weaknesses. Our results are of interest to anyone analyzing sentiment in microblogs and social networks, whether for research or commercial purposes.
4 0.68251771 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
Author: Balamurali AR ; Aditya Joshi ; Pushpak Bhattacharyya
Abstract: Traditional approaches to sentiment classification rely on lexical features, syntax-based features or a combination of the two. We propose semantic features using word senses for a supervised document-level sentiment classifier. To highlight the benefit of sense-based features, we compare word-based representation of documents with a sense-based representation where WordNet senses of the words are used as features. In addition, we highlight the benefit of senses by presenting a part-of-speech-wise effect on sentiment classification. Finally, we show that even if a WSD engine disambiguates between a limited set of words in a document, a sentiment classifier still performs better than what it does in absence of sense annotation. Since word senses used as features show promise, we also examine the possibility of using similarity metrics defined on WordNet to address the problem of not finding a sense in the training corpus. We perform experiments using three popular similarity metrics to mitigate the effect of unknown synsets in a test corpus by replacing them with similar synsets from the training corpus. The results show promising improvement with respect to the baseline.
5 0.65875506 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms
Author: Song Feng ; Ritwik Bose ; Yejin Choi
Abstract: In this paper, we introduce a connotation lexicon, a new type of lexicon that lists words with connotative polarity, i.e., words with positive connotation (e.g., award, promotion) and words with negative connotation (e.g., cancer, war). Connotation lexicons differ from much studied sentiment lexicons: the latter concerns words that express sentiment, while the former concerns words that evoke or associate with a specific polarity of sentiment. Understanding the connotation of words would seem to require common sense and world knowledge. However, we demonstrate that much of the connotative polarity of words can be inferred from natural language text in a nearly unsupervised manner. The key linguistic insight behind our approach is selectional preference of connotative predicates. We present graph-based algorithms using PageRank and HITS that collectively learn connotation lexicon together with connotative predicates. Our empirical study demonstrates that the resulting connotation lexicon is of great value for sentiment analysis complementing existing sentiment lexicons.
6 0.52206725 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
7 0.42211258 17 emnlp-2011-Active Learning with Amazon Mechanical Turk
8 0.34917465 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
9 0.30952787 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
10 0.27464122 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
11 0.27154726 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices
12 0.26108277 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context
13 0.25753623 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
14 0.23555315 107 emnlp-2011-Probabilistic models of similarity in syntactic context
15 0.2333802 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition
16 0.23144858 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
17 0.2193599 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
18 0.21413422 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
19 0.18685588 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
20 0.1853393 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
topicId topicWeight
[(15, 0.013), (23, 0.081), (36, 0.027), (37, 0.025), (45, 0.058), (53, 0.022), (54, 0.024), (57, 0.013), (62, 0.025), (64, 0.03), (66, 0.046), (79, 0.046), (82, 0.015), (87, 0.013), (90, 0.018), (96, 0.053), (97, 0.397), (98, 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.73071373 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
Author: Ainur Yessenalina ; Claire Cardie
Abstract: We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. This paper presents the first such algorithm for learning a matrix-space model for semantic composition. In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model.
2 0.68094575 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online News service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
3 0.44127491 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
Author: Samuel Brody ; Nicholas Diakopoulos
Abstract: We present an automatic method which leverages word lengthening to adapt a sentiment lexicon specifically for Twitter and similar social messaging networks. The contributions of the paper are as follows. First, we call attention to lengthening as a widespread phenomenon in microblogs and social messaging, and demonstrate the importance of handling it correctly. We then show that lengthening is strongly associated with subjectivity and sentiment. Finally, we present an automatic method which leverages this association to detect domain-specific sentiment- and emotion-bearing words. We evaluate our method by comparison to human judgments, and analyze its strengths and weaknesses. Our results are of interest to anyone analyzing sentiment in microblogs and social networks, whether for research or commercial purposes.
4 0.43983519 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms
Author: Song Feng ; Ritwik Bose ; Yejin Choi
Abstract: In this paper, we introduce a connotation lexicon, a new type of lexicon that lists words with connotative polarity, i.e., words with positive connotation (e.g., award, promotion) and words with negative connotation (e.g., cancer, war). Connotation lexicons differ from much studied sentiment lexicons: the latter concerns words that express sentiment, while the former concerns words that evoke or associate with a specific polarity of sentiment. Understanding the connotation of words would seem to require common sense and world knowledge. However, we demonstrate that much of the connotative polarity of words can be inferred from natural language text in a nearly unsupervised manner. The key linguistic insight behind our approach is selectional preference of connotative predicates. We present graph-based algorithms using PageRank and HITS that collectively learn connotation lexicon together with connotative predicates. Our empirical study demonstrates that the resulting connotation lexicon is of great value for sentiment analysis complementing existing sentiment lexicons.
5 0.41564649 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
Author: Richard Socher ; Jeffrey Pennington ; Eric H. Huang ; Andrew Y. Ng ; Christopher D. Manning
Abstract: We introduce a novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions. Our method learns vector space representations for multi-word phrases. In sentiment prediction tasks these representations outperform other state-of-the-art approaches on commonly used datasets, such as movie reviews, without using any pre-defined sentiment lexica or polarity shifting rules. We also evaluate the model’s ability to predict sentiment distributions on a new dataset based on confessions from the experience project. The dataset consists of personal user stories annotated with multiple labels which, when aggregated, form a multinomial distribution that captures emotional reactions. Our algorithm can more accurately predict distributions over such labels compared to several competitive baselines.
6 0.36575127 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
7 0.35680434 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
8 0.33325306 17 emnlp-2011-Active Learning with Amazon Mechanical Turk
9 0.33196217 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
10 0.33040774 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
11 0.3293848 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
12 0.32622668 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
13 0.32531583 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
14 0.32507068 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
15 0.32389545 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
16 0.31964585 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
17 0.31853694 136 emnlp-2011-Training a Parser for Machine Translation Reordering
18 0.31527027 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
19 0.31464881 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
20 0.31404814 128 emnlp-2011-Structured Relation Discovery using Generative Models