acl acl2013 acl2013-318 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Christian Scheible ; Hinrich Schütze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. [sent-3, score-0.464]
2 We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. [sent-4, score-1.004]
3 We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. [sent-5, score-1.116]
4 We show that both methods learn sentiment relevance classifiers that perform well. [sent-7, score-0.732]
5 1 Introduction It is generally recognized in sentiment analysis that only a subset of the content of a document contributes to the sentiment it conveys. [sent-8, score-1.019]
6 For this reason, some authors distinguish the categories subjective and objective (Wilson and Wiebe, 2003). [sent-9, score-0.316]
7 Subjective statements refer to the internal state of mind of a person, which cannot be observed. [sent-10, score-0.094]
8 In contrast, objective statements can be verified by observing and checking reality. [sent-11, score-0.143]
9 Some sentiment analysis systems filter out objective language and predict sentiment based on subjective language only because objective statements do not directly reveal sentiment. [sent-12, score-1.355]
10 Even though the categories subjective/objective are well-established in philosophy, we argue that they are not optimal for sentiment analysis. [sent-13, score-0.540]
11 We instead introduce the notion of sentiment relevance (S-relevance or SR for short). [sent-14, score-0.794]
12 A sentence or linguistic expression is S-relevant if it contains information about the sentiment the document conveys; it is S-nonrelevant (SNR) otherwise. [sent-15, score-0.518]
13 Ideally, we would like to have at our disposal a large annotated training set for our new concept of sentiment relevance. [sent-16, score-0.51]
14 For this reason, we investigate two semi-supervised approaches to S-relevance classification that do not require S-relevance-labeled data. [sent-18, score-0.069]
15 We create an initial labeling based on domain-specific metadata that we extract from a public database and show that this improves performance by 5. [sent-20, score-0.044]
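One way such metadata-based seeding could look, as a rough sketch (the entity list, polar-word list, and labeling rule below are invented illustrations, not the actual database, lexicon, or patterns used in the paper):

```python
import re

# Hypothetical seed resources: domain entities from a movie database and
# a small polarity lexicon. Both are invented for illustration.
METADATA_ENTITIES = {"tim roth", "bruce banner"}
POLAR_WORDS = {"bad", "impressive", "horrible", "glowing"}

def seed_label(sentence):
    """Distant-supervision seed rule: entity mention -> SNR seed,
    polar word -> SR seed, otherwise leave the sentence unlabeled."""
    lowered = sentence.lower()
    words = re.findall(r"[a-z']+", lowered)
    if any(entity in lowered for entity in METADATA_ENTITIES):
        return "SNR"
    if any(word in POLAR_WORDS for word in words):
        return "SR"
    return None  # unlabeled; left to the semi-supervised learner

labels = [seed_label(s) for s in
          ["Tim Roth delivers the acting job.",
           "Then, the situation turns bad.",
           "The movie opened last week."]]
```

The seeded labels would then serve as noisy training data for a classifier, rather than being used as final predictions.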
16 The second approach is transfer learning (TL) (Thrun, 1996). [sent-22, score-0.092]
17 6% for sentiment relevance classification when we use a feature representation based on lexical taxonomies that supports knowledge transfer. [sent-24, score-0.851]
18 In our approach, we classify sentences as S(non)relevant because this is the most fine-grained level at which S-relevance manifests itself; at the word or phrase level, S-relevance classification is not possible because of scope and context effects. [sent-25, score-0.145]
19 However, S-relevance is also a discourse phenomenon: authors tend to structure documents into S-relevant passages and S-nonrelevant passages. [sent-26, score-0.075]
20 To impose this discourse constraint, we employ a sequence model. [sent-27, score-0.154]
21 We represent each document as a graph of sentences and apply a minimum cut method. [sent-28, score-0.279]
22 Section 2 introduces the concept of sentiment relevance and relates it to subjectivity. [sent-30, score-0.778]
23 In Section 3, we review previous work related to sentiment relevance. [sent-31, score-0.537]
24 © 2013 Association for Computational Linguistics, pages 954–963, results of our experiments on distant supervision (Section 6) and transfer learning (Section 7). [sent-35, score-0.283]
25 2 Sentiment Relevance Sentiment relevance is a concept for distinguishing content that is informative for determining the sentiment of a document from uninformative content. [sent-37, score-0.564]
26 This is in contrast to the usual distinction between subjective and objective content. [sent-38, score-0.316]
27 Consider the following examples for subjective and objective sentences: (1) Subjective example: Bruce Banner, a genetics researcher with a tragic past, suffers a horrible accident. [sent-40, score-0.421]
28 (2) Objective example: The movie won a Golden Globe for best foreign film and an Oscar. [sent-41, score-0.156]
29 Sentence (1) is subjective because assessments like tragic past and horrible accident are subjective to the reader and writer. [sent-42, score-0.569]
30 Sentence (2) is objective since we can check the truth of the statement. [sent-43, score-0.086]
31 However, even though sentence (1) has negative subjective content, it is not S-relevant because it is about the plot of the movie and can appear in a glowingly positive review. [sent-44, score-0.486]
32 Subjectivity and S-relevance are two distinct concepts that do not imply each other: Generally neutral and objective sentences can be S-relevant while certain subjective content is S-nonrelevant. [sent-46, score-0.395]
33 Below, we first describe the annotation procedure for the sentiment relevance corpus and then demonstrate empirically that subjectivity and S-relevance differ. [sent-47, score-1.062]
34 1 Sentiment Relevance Corpus For our initial experiments, we focus on sentiment relevance classification in the movie domain. [sent-49, score-0.957]
35 To create a sentiment-relevance-annotated corpus, the SR corpus, we randomly selected 125 documents from the movie review data set (Pang et al. [sent-50, score-0.229]
36 1 Two annotators annotated the sentences for S-relevance, using the labels SR and SNR. [sent-52, score-0.045]
37 We excluded 360 sentences that were labeled uncertain from the corpus. (Footnote 1: We used the texts from the raw HTML files since the processed version does not have capitalization.) [sent-54, score-0.045]
38 We had 762 sentences annotated for S-relevance by both annotators with an agreement (Fleiss’ κ) of . [sent-60, score-0.077]
39 In addition, we collected subjectivity annotations for the same data on Amazon Mechanical Turk, obtaining each label through a vote of three annotators, with an agreement of κ = . [sent-62, score-0.362]
40 However, the agreement of the subjectivity and relevance labelings after voting, assuming that subjectivity equals relevance, is only at κ = . [sent-64, score-0.96]
41 This suggests that there is indeed a measurable difference between subjectivity and relevance. [sent-66, score-0.366]
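For two annotators, chance-corrected agreement of this kind reduces to Cohen's κ, which can be computed from the label counts; a minimal sketch with toy label sequences (not the actual corpus annotations):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement of independent annotators with the same marginals.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[l] * counts_b[l]
              for l in set(labels_a) | set(labels_b)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Toy annotation example, not the corpus data.
a = ["SR", "SR", "SNR", "SR", "SNR", "SNR"]
b = ["SR", "SR", "SNR", "SNR", "SNR", "SNR"]
kappa = cohen_kappa(a, b)  # 5/6 observed vs. 1/2 expected agreement
```

A high κ between the two S-relevance annotators together with a low κ between the voted subjectivity and relevance labelings is exactly the pattern that indicates two reliable but distinct concepts.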
42 2 Contrastive Classification Experiment We will now examine the similarities of Srelevance and an existing subjectivity dataset. [sent-70, score-0.33]
43 Pang and Lee (2004) introduced subjectivity data (henceforth P&L corpus) that consists of 5000 highly subjective (quote) review snippets from rottentomatoes. [sent-71, score-0.670]
44 com and 5000 objective (plot) sentences from IMDb plot descriptions. [sent-72, score-0.263]
45 We now show that although the P&L selection criteria (quotes, plot) bear resemblance to the definition of S-relevance, the two concepts are different. [sent-73, score-0.066]
46 We use quote as S-relevant and plot as S-nonrelevant data in TL. [sent-74, score-0.214]
47 We divide both the SR and P&L corpora into training (50%) and test sets (50%) and train a Maximum Entropy (MaxEnt) classifier (Manning and Klein, 2003) with bag-of-word features. [sent-75, score-0.069]
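The contrastive setup can be sketched with scikit-learn, using logistic regression as the MaxEnt model (the paper used the Stanford classifier of Manning and Klein; the sentences and labels below are invented stand-ins, not corpus data):

```python
# Bag-of-words MaxEnt sketch: train on one labeled set, then apply the
# classifier to held-out sentences. Logistic regression is the standard
# MaxEnt formulation; the examples are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["the acting is superb",
               "a dreadful bore of a film",
               "bruce banner suffers an accident",
               "the film won an oscar"]
train_labels = ["SR", "SR", "SNR", "SNR"]

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

predictions = clf.predict(["superb acting throughout",
                           "the movie won an oscar"])
```

Training on one corpus and testing on the other (and vice versa) then quantifies how far apart the two label definitions are.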
48 The results clearly show that the classes defined by the two labeled sets are different. [sent-77, score-0.052]
49 A classifier trained on P&L performs worse by about 8% on SR than a classifier trained on SR (68. [sent-78, score-0.138]
50 A classifier trained on SR performs worse by more than 20% on P&L than a classifier trained on P&L (67. [sent-82, score-0.138]
51 Note that the classes are not balanced in the S-relevance data while they are balanced in the subjectivity data. [sent-86, score-0.456]
52 de/forschung/ressourcen/korpora/sentimentrelevance/ Figure 1: Example data from the SR corpus with subjectivity (S/O) and S-relevance (SR/SNR) annotations. [sent-90, score-0.437]
53 Indeed, if we either balance the S-relevance data or unbalance the subjectivity data, we can significantly increase F1 to 74. [sent-97, score-0.33]
54 We will show in Section 7 that using an unsupervised sequence model is superior to artificial manipulation of class-imbalances. [sent-102, score-0.073]
55 An error analysis for the classifier trained on P&L shows that many sentences misclassified as S-relevant (fpSR) contain polar words; for example, Then, the situation turns bad. [sent-103, score-0.253]
56 In contrast, sentences misclassified as S-nonrelevant (fpSNR) contain named entities or plot and movie business vocabulary; for example, Tim Roth delivers the most impressive acting job by getting the body language right. [sent-104, score-0.51]
57 The word count statistics in Table 2 show this for three polar words and for three plot/movie business words. [sent-105, score-0.164]
58 These snippets rarely contain plot/movie-business words, so that the P&L-trained classifier assigns almost all sentences with such words to the category S-nonrelevant. [sent-107, score-0.215]
59 3 Related Work Many publications have addressed subjectivity in sentiment analysis. [sent-108, score-0.794]
60 Two important papers that are based on the original philosophical definition of the term (internal state of mind vs. [sent-109, score-0.091]
61 As we argue above, if the goal is to identify parts of a document that are useful/nonuseful for sentiment analysis, then S-relevance is a better notion to use. [sent-111, score-0.624]
62 Researchers have implicitly deviated from the philosophical definition because they were primarily interested in satisfying the needs of a particular task. [sent-112, score-0.087]
63 For example, Pang and Lee (2004) use a minimum cut graph model for review summarization. [sent-113, score-0.253]
64 Because they do not directly evaluate the results of subjectivity classification, it is not clear to what extent their method is able to identify subjectivity correctly. [sent-114, score-0.66]
65 In general, it is not possible to know what the underlying concepts of a statistical classification are if no detailed annotation guidelines exist and no direct evaluation of manually labeled data is performed. [sent-115, score-0.203]
66 , 2009) who define a fine-grained classification that is similar to sentiment relevance on the highest level. [sent-117, score-0.732]
67 However, unlike our study, they do not experimentally compare their classification scheme to prior work or show that this scheme is different. [sent-118, score-0.193]
68 We use the minimum cut method and are therefore able to incorporate discourse-level constraints in a more flexible fashion, giving preference to “relevance-uniform” paragraphs without mandating them. [sent-121, score-0.26]
69 However, they do not use the category S-nonrelevance directly in their experiments and do not evaluate classification accuracy for it. [sent-123, score-0.101]
70 We do not use their data set as it would cause domain mismatch between the product reviews they use and the available movie review subjectivity data (Pang and Lee, 2004) in the TL approach. [sent-124, score-0.604]
71 , 2007) has some overlap with our notion of sentiment relevance. [sent-127, score-0.526]
72 (2010) use rationales in a multi-level model to integrate sentence-level information into a document classifier. [sent-129, score-0.125]
73 In summary, no direct evaluation of sentiment relevance has been performed previously. [sent-131, score-0.767]
74 One contribution in this paper is that we provide a single-domain gold standard for sentiment relevance, created based on clear annotation guidelines, and use it for direct evaluation. [sent-132, score-0.499]
75 Sentiment relevance is also related to review mining (e. [sent-133, score-0.341]
76 , (Eguchi and Lavrenko, 2006)) in that they aim to find phrases, sentences or snippets that are relevant for sentiment, either with respect to certain features or with a focus on high-precision retrieval (cf. [sent-138, score-0.114]
77 However, finding a few S-relevant items with high precision is much easier than the task we address: exhaustive classification of all sentences. [sent-140, score-0.069]
78 Another contribution is that we show that generalization based on semantic classes improves S-relevance classification. [sent-141, score-0.114]
79 While previous work has shown the utility of other types of feature generalization for sentiment and subjectivity analysis (e. [sent-142, score-0.856]
80 , syntax and part-of-speech (Riloff and Wiebe, 2003)), semantic classes have so far not been exploited. [sent-144, score-0.052]
81 Named-entity features in movie reviews were first used by Zhuang et al. [sent-145, score-0.201]
82 Täckström and McDonald (2011) also solve a similar sequence problem by applying a distantly supervised classifier with an unsupervised hidden sequence component. [sent-159, score-0.185]
83 Their setup differs from ours as our focus lies on pattern-based distant supervision instead of distant supervision using documents for sentence classification. [sent-160, score-0.382]
84 Transfer learning has been applied previously in sentiment analysis (Tan and Cheng, 2009), targeting polarity detection. [sent-161, score-0.464]
85 (2009)), we impose the discourse constraint that an S-relevant (resp. [sent-164, score-0.149]
86 Following Pang and Lee (2004), we use minimum cut (MinCut) to formalize this discourse constraint. [sent-167, score-0.255]
87 For a document with n sentences, we create a graph with n + 2 nodes: n sentence nodes and source and sink nodes. [sent-168, score-0.12]
88 We define source and sink to represent the classes S-relevance and Snonrelevance, respectively, and refer to them as SR and SNR. [sent-169, score-0.118]
89 The minimum cut is a tradeoff between the confidence of the classification decisions and “discourse coherence”. [sent-173, score-0.249]
90 The discourse constraint often has the effect that high-confidence labels are propagated over the sequence. [sent-174, score-0.11]
91 To compute minimum cuts, we use the push-relabel maximum flow method (Cherkassky and Goldberg, 1995). [sent-176, score-0.096]
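The construction described above can be sketched with a general-purpose max-flow/min-cut solver (networkx here in place of the push-relabel implementation; the per-sentence SR probabilities and the association weight are invented values):

```python
# Minimum-cut sequence labeling sketch: source = SR, sink = SNR, one node
# per sentence, plus association edges between adjacent sentences that
# penalize label changes (the "discourse coherence" term).
import networkx as nx

def mincut_labels(sr_scores, assoc=0.6):
    """Label sentences SR/SNR, trading per-sentence confidence against
    agreement between adjacent sentences."""
    g = nx.DiGraph()
    n = len(sr_scores)
    for i, p in enumerate(sr_scores):
        g.add_edge("SR", i, capacity=p)       # cost of labeling i as SNR
        g.add_edge(i, "SNR", capacity=1 - p)  # cost of labeling i as SR
    for i in range(n - 1):                    # discourse-coherence edges
        g.add_edge(i, i + 1, capacity=assoc)
        g.add_edge(i + 1, i, capacity=assoc)
    _, (source_side, _) = nx.minimum_cut(g, "SR", "SNR")
    return ["SR" if i in source_side else "SNR" for i in range(n)]

# A weakly SNR-leaning sentence (0.45) sits between confident neighbors;
# the cut places the boundary at the cheapest position.
labels = mincut_labels([0.9, 0.8, 0.45, 0.2, 0.1])
```

With `assoc=0` the cut degenerates to thresholding each sentence at 0.5; larger values smooth isolated low-confidence decisions toward their neighbors.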
92 We therefore resort to a proxy measure, the run count. [sent-179, score-0.047]
93 A run is a sequence of sentences with the same label. [sent-180, score-0.132]
94 We set each parameter p to the value that produces a median run count that is closest to the true median run count (or, in case of a tie, closest to the true mean run count). [sent-181, score-0.309]
95 We assume that the optimal median/mean run count is known. [sent-182, score-0.089]
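The run-count proxy for parameter selection can be sketched as follows (the candidate parameters, the labelings they produce, and the assumed true median are invented for illustration):

```python
# Select the MinCut association weight whose labelings have a median run
# count closest to the assumed true median run count.
from statistics import median

def run_count(labels):
    """Number of maximal runs of identical labels in a sequence."""
    return sum(1 for i, l in enumerate(labels) if i == 0 or l != labels[i - 1])

def pick_parameter(labelings_by_param, true_median):
    """Pick the parameter whose median run count over all documents is
    closest to the assumed true median."""
    return min(labelings_by_param,
               key=lambda p: abs(median(map(run_count, labelings_by_param[p]))
                                 - true_median))

# Hypothetical labelings produced by two candidate association weights.
labelings = {
    0.2: [["SR", "SNR", "SR", "SNR"], ["SNR", "SR", "SNR"]],  # choppy
    0.8: [["SR", "SR", "SNR", "SNR"], ["SNR", "SNR", "SNR"]], # smooth
}
best = pick_parameter(labelings, true_median=2)
```

The tie-breaking on the mean run count mentioned in the text could be added as a secondary sort key in the same fashion.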
96 We propose two linguistic feature types for S-relevance classification that meet these requirements. [sent-187, score-0.069]
97 1 Generalization through Semantic Features Distant supervision and transfer learning are settings where exact training data is unavailable. [sent-189, score-0.179]
98 We therefore introduce generalization features which are more likely to support knowledge transfer. [sent-190, score-0.062]
99 A set of generalizations can be induced by making a cut in the taxonomy and defining the concepts there as base classes. [sent-192, score-0.257]
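Such a taxonomy cut can be sketched with a toy hierarchy (the hierarchy, the cut, and the word-to-class mappings below are invented; the paper draws its base classes from lexical taxonomies such as CoreLex and VerbNet):

```python
# Generalization features from a taxonomy cut: each word is walked up the
# hierarchy until a concept at the cut is reached; that concept replaces
# the word as a feature. Out-of-taxonomy words map to themselves.
TOY_TAXONOMY = {            # child -> parent (invented hierarchy)
    "actor": "human", "director": "human",
    "oscar": "award", "globe": "award",
    "human": "entity", "award": "entity",
}

def base_class(word, cut=frozenset({"human", "award"})):
    """Walk up the toy taxonomy until a concept at the cut is reached."""
    node = word
    while node in TOY_TAXONOMY:
        node = TOY_TAXONOMY[node]
        if node in cut:
            return node
    return word

def generalized_features(tokens):
    return [base_class(t) for t in tokens]

feats = generalized_features(["the", "actor", "won", "an", "oscar"])
```

Because the features are class labels rather than surface words, a classifier trained on one corpus can transfer to another whose vocabulary differs but whose class distribution is similar.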
100 2 Named Entities As standard named entity recognition (NER) systems do not capture categories that are relevant to the movie domain, we opt for a lexicon-based approach similar to (Zhuang et al. [sent-205, score-0.221]
wordName wordTfidf (topN-words)
[('sentiment', 0.464), ('subjectivity', 0.33), ('sr', 0.27), ('relevance', 0.268), ('subjective', 0.198), ('movie', 0.156), ('plot', 0.132), ('corelex', 0.121), ('srelevance', 0.121), ('cut', 0.117), ('verbnet', 0.105), ('distant', 0.104), ('pang', 0.103), ('transfer', 0.092), ('supervision', 0.087), ('objective', 0.086), ('polar', 0.085), ('quote', 0.082), ('imdb', 0.082), ('tl', 0.082), ('snonrelevance', 0.081), ('snonrelevant', 0.081), ('snr', 0.081), ('discourse', 0.075), ('wiebe', 0.074), ('review', 0.073), ('tragic', 0.071), ('rationales', 0.071), ('classification', 0.069), ('snippets', 0.069), ('classifier', 0.069), ('concepts', 0.066), ('horrible', 0.066), ('zhuang', 0.066), ('sink', 0.066), ('minimum', 0.063), ('generalization', 0.062), ('notion', 0.062), ('taboada', 0.062), ('vn', 0.062), ('statements', 0.057), ('experimentally', 0.054), ('philosophical', 0.054), ('misclassified', 0.054), ('document', 0.054), ('acting', 0.053), ('classes', 0.052), ('taxonomies', 0.05), ('lee', 0.047), ('run', 0.047), ('concept', 0.046), ('ackstr', 0.046), ('notions', 0.046), ('reviews', 0.045), ('sentences', 0.045), ('argue', 0.044), ('metadata', 0.044), ('paragraphs', 0.044), ('median', 0.042), ('count', 0.042), ('riloff', 0.041), ('ims', 0.04), ('sequence', 0.04), ('impose', 0.039), ('wilson', 0.039), ('ent', 0.038), ('balanced', 0.037), ('mind', 0.037), ('taxonomy', 0.037), ('business', 0.037), ('sj', 0.037), ('base', 0.037), ('contributes', 0.037), ('germany', 0.036), ('emphatic', 0.036), ('accident', 0.036), ('distantly', 0.036), ('ourcen', 0.036), ('measurable', 0.036), ('mandating', 0.036), ('constraint', 0.035), ('scheme', 0.035), ('direct', 0.035), ('named', 0.033), ('guidelines', 0.033), ('iment', 0.033), ('manipulation', 0.033), ('relabel', 0.033), ('thrun', 0.033), ('exam', 0.033), ('deviated', 0.033), ('agreement', 0.032), ('categories', 0.032), ('category', 0.032), ('distinction', 0.032), ('manifests', 0.031), ('assoc', 0.031), ('smoother', 0.031), 
('kipper', 0.031), ('mincut', 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schütze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
2 0.33703265 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams
Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky
Abstract: We study subjective language media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, highprecision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.
3 0.32363597 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
Author: Angeliki Lazaridou ; Ivan Titov ; Caroline Sporleder
Abstract: We propose a joint model for unsupervised induction of sentiment, aspect and discourse information and show that by incorporating a notion of latent discourse relations in the model, we improve the prediction accuracy for aspect and sentiment polarity on the sub-sentential level. We deviate from the traditional view of discourse, as we induce types of discourse relations and associated discourse cues relevant to the considered opinion analysis task; consequently, the induced discourse relations play the role of opinion and aspect shifters. The quantitative analysis that we conducted indicated that the integration of a discourse model increased the prediction accuracy results with respect to the discourse-agnostic approach and the qualitative analysis suggests that the induced representations encode a meaningful discourse structure.
4 0.30769137 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li
Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.
5 0.24888572 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
Author: Mohamed Aly ; Amir Atiya
Abstract: We introduce LABR, the largest sentiment analysis dataset to-date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the the dataset, and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and rating classification. We provide standard splits of the dataset into training and testing, for both polarity and rating classification, in both balanced and unbalanced settings. We run baseline experiments on the dataset to establish a benchmark.
6 0.20914422 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
7 0.19448984 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
8 0.19344544 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays
9 0.18661503 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts
10 0.18496548 67 acl-2013-Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
11 0.17145255 121 acl-2013-Discovering User Interactions in Ideological Discussions
12 0.17137502 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
13 0.15829168 49 acl-2013-An annotated corpus of quoted opinions in news articles
14 0.15313603 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions
15 0.15160932 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
16 0.14302967 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews
17 0.136198 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
18 0.13194863 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
19 0.12153488 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction
20 0.11765342 294 acl-2013-Re-embedding words
topicId topicWeight
[(0, 0.229), (1, 0.341), (2, -0.052), (3, 0.255), (4, -0.101), (5, -0.085), (6, 0.038), (7, 0.021), (8, 0.047), (9, 0.133), (10, 0.189), (11, -0.059), (12, -0.068), (13, -0.009), (14, -0.005), (15, 0.05), (16, 0.021), (17, -0.006), (18, 0.002), (19, 0.058), (20, -0.031), (21, 0.054), (22, -0.077), (23, 0.081), (24, 0.049), (25, -0.082), (26, -0.015), (27, -0.023), (28, -0.048), (29, 0.012), (30, -0.035), (31, -0.007), (32, -0.036), (33, -0.072), (34, -0.025), (35, -0.007), (36, 0.028), (37, 0.019), (38, -0.057), (39, -0.074), (40, -0.019), (41, -0.001), (42, -0.009), (43, -0.02), (44, 0.038), (45, 0.081), (46, 0.033), (47, -0.045), (48, -0.013), (49, -0.004)]
simIndex simValue paperId paperTitle
same-paper 1 0.96662718 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schütze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
2 0.91572994 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li
Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.
3 0.87795103 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
Author: Ankit Ramteke ; Akshat Malu ; Pushpak Bhattacharyya ; J. Saketha Nath
Abstract: Thwarting and sarcasm are two uncharted territories in sentiment analysis, the former because of the lack of training corpora and the latter because of the enormous amount of world knowledge it demands. In this paper, we propose a working definition of thwarting amenable to machine learning and create a system that detects if the document is thwarted or not. We focus on identifying thwarting in product reviews, especially in the camera domain. An ontology of the camera domain is created. Thwarting is looked upon as the phenomenon of polarity reversal at a higher level of ontology compared to the polarity expressed at the lower level. This notion of thwarting defined with respect to an ontology is novel, to the best of our knowledge. A rule based implementation building upon this idea forms our baseline. We show that machine learning with annotated corpora (thwarted/nonthwarted) is more effective than the rule based system. Because of the skewed distribution of thwarting, we adopt the Areaunder-the-Curve measure of performance. To the best of our knowledge, this is the first attempt at the difficult problem of thwarting detection, which we hope will at Akshat Malu Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India. akshatmalu@ cse .i itb .ac .in J. Saketha Nath Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India. s aketh@ cse .i itb .ac .in least provide a baseline system to compare against. 1 Credits The authors thank the lexicographers at Center for Indian Language Technology (CFILT) at IIT Bombay for their support for this work. 2
4 0.86257231 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays
Author: Eric T. Nalisnick ; Henry S. Baird
Abstract: We present an automatic method for analyzing sentiment dynamics between characters in plays. This literary format’s structured dialogue allows us to make assumptions about who is participating in a conversation. Once we have an idea of who a character is speaking to, the sentiment in his or her speech can be attributed accordingly, allowing us to generate lists of a character’s enemies and allies as well as pinpoint scenes critical to a character’s emotional development. Results of experiments on Shakespeare’s plays are presented along with discussion of how this work can be extended to unstructured texts (i.e. novels).
5 0.84383148 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams
Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky
Abstract: We study subjective language media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, highprecision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.
6 0.80421853 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
7 0.80088949 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
8 0.75053692 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
9 0.71157944 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
10 0.69810307 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
11 0.65943307 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
12 0.63143605 49 acl-2013-An annotated corpus of quoted opinions in news articles
13 0.59308833 232 acl-2013-Linguistic Models for Analyzing and Detecting Biased Language
14 0.59131867 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews
15 0.58776999 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction
16 0.58682144 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions
17 0.57884485 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts
18 0.55181348 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
19 0.54987305 67 acl-2013-Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
20 0.53621799 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
topicId topicWeight
[(0, 0.044), (6, 0.045), (8, 0.181), (11, 0.075), (15, 0.025), (24, 0.07), (26, 0.092), (35, 0.086), (42, 0.057), (48, 0.048), (64, 0.014), (70, 0.062), (88, 0.053), (90, 0.016), (95, 0.052)]
simIndex simValue paperId paperTitle
same-paper 1 0.84729385 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schütze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
2 0.74867737 109 acl-2013-Decipherment Complexity in 1:1 Substitution Ciphers
Author: Malte Nuhn; Hermann Ney
Abstract: In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before.
3 0.72739732 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
Author: Angeliki Lazaridou; Ivan Titov; Caroline Sporleder
Abstract: We propose a joint model for unsupervised induction of sentiment, aspect and discourse information and show that by incorporating a notion of latent discourse relations in the model, we improve the prediction accuracy for aspect and sentiment polarity on the sub-sentential level. We deviate from the traditional view of discourse, as we induce types of discourse relations and associated discourse cues relevant to the considered opinion analysis task; consequently, the induced discourse relations play the role of opinion and aspect shifters. The quantitative analysis that we conducted indicated that the integration of a discourse model increased the prediction accuracy results with respect to the discourse-agnostic approach and the qualitative analysis suggests that the induced representations encode a meaningful discourse structure.
4 0.71584713 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
Author: Matt Post; Shane Bergsma
Abstract: Syntactic features are useful for many text classification tasks. Among these, tree kernels (Collins and Duffy, 2001) have been perhaps the most robust and effective syntactic tool, appealing for their empirical success, but also because they do not require an answer to the difficult question of which tree features to use for a given task. We compare tree kernels to different explicit sets of tree features on five diverse tasks, and find that explicit features often perform as well as tree kernels on accuracy and always in orders of magnitude less time, and with smaller models. Since explicit features are easy to generate and use (with publicly available tools), we suggest they should always be included as baseline comparisons in tree kernel method evaluations.
5 0.71503335 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
Author: Oleg Rokhlenko; Idan Szpektor
Abstract: We introduce the novel task of automatically generating questions that are relevant to a text but do not appear in it. One motivating example of its application is for increasing user engagement around news articles by suggesting relevant comparable questions, such as “is Beyonce a better singer than Madonna?”, for the user to answer. We present the first algorithm for the task, which consists of: (a) offline construction of a comparable question template database; (b) ranking of relevant templates to a given article; and (c) instantiation of templates only with entities in the article whose comparison under the template’s relation makes sense. We tested the suggestions generated by our algorithm via a Mechanical Turk experiment, which showed a significant improvement over the strongest baseline of more than 45% in all metrics.
6 0.70898283 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages
7 0.70877236 225 acl-2013-Learning to Order Natural Language Texts
8 0.70720786 233 acl-2013-Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media
9 0.70458162 82 acl-2013-Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation
10 0.70336592 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
11 0.7029708 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
12 0.69997281 224 acl-2013-Learning to Extract International Relations from Political Context
13 0.69994175 275 acl-2013-Parsing with Compositional Vector Grammars
14 0.69945657 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
15 0.6993022 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
16 0.6988194 80 acl-2013-Chinese Parsing Exploiting Characters
17 0.69832093 276 acl-2013-Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation
18 0.69798887 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
19 0.69710147 373 acl-2013-Using Conceptual Class Attributes to Characterize Social Media Users
20 0.69701719 333 acl-2013-Summarization Through Submodularity and Dispersion