acl acl2013 acl2013-131 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong
Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification.
Reference: text
sentIndex sentText sentNum sentScore
1 Dual Training and Dual Prediction for Polarity Classification Rui Xia, Tao Wang, Xuelei Hu Department of Computer Science Nanjing University of Science and Technology rxia@njust.cn [sent-1, score-0.07]
2 Abstract Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. [sent-6, score-0.246]
3 However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. [sent-7, score-0.023]
4 In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. [sent-8, score-1.599]
5 The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. [sent-9, score-1.192]
6 Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification. [sent-10, score-0.352]
7 Although the BOW model is simple and has achieved great success in topic-based text classification, it disrupts word order, breaks syntactic structures and discards some kinds of semantic information that can be very important for sentiment classification. [sent-12, score-0.226]
8 Such disadvantages sometimes limit the performance of sentiment classification systems. [sent-13, score-0.315]
9 Much subsequent work has focused on feature engineering, which aims to find a set of effective features based on the BOW representation. [sent-14, score-0.025]
10 However, there still remain some problems that are not well addressed. [sent-15, score-0.021]
11 Among them, the polarity shift problem is the most significant. [sent-16, score-0.525]
12 Shoushan Li (NLP Lab, Department of Computer Science, Soochow University) and Chengqing Zong (National Laboratory of Pattern Recognition, Institute of Automation, CAS) shoushan.cn [sent-17, score-0.05]
13 We refer to “polarity shift” as the linguistic phenomenon in which the sentiment orientation of a text is reversed (from positive to negative or vice versa) by particular expressions called polarity shifters. [sent-22, score-0.915]
14 Negation words (e.g., “no”, “not” and “don’t”) are the most important type of polarity shifter. [sent-25, score-0.307]
15 For example, adding the negation word “don’t” in front of “like” in the positive text “I like this book” reverses the orientation of the text from positive to negative. [sent-26, score-0.549]
16 Naturally, handling polarity shift is very important for sentiment classification. [sent-27, score-0.79]
17 Two sentences such as “I like this book” and “I don’t like this book” are considered to be very similar by most machine learning algorithms. [sent-30, score-0.07]
18 Although some methods have been proposed in the literature to address the polarity shift problem (Das and Chen, 2001; Pang et al. [sent-31, score-0.573]
19 , 2010), the state-of-the-art results are still far from satisfactory. [sent-35, score-0.021]
20 For example, the improvements are less than 2% after considering polarity shift in Li et al. (2010). [sent-36, score-0.597]
21 In this work, we propose a novel approach, called dual training and dual prediction (DTDP), to address the polarity shift problem. [sent-38, score-1.599]
22 By taking advantage of the unique nature of polarity classification, DTDP first generates artificial samples that are polarity-opposite to the original ones. [sent-39, score-0.554]
23 For example, given the original sample “I don’t like this book. [sent-40, score-0.206]
24 It is boring,” its polarity-opposite version, “I like this book. It is interesting,” is generated. [sent-41, score-0.035]
25 Second, the original and opposite training samples are used together for training a sentiment classifier (called dual training), and the original and opposite test samples are used together for prediction (called dual prediction). [sent-43, score-2.242]
26 Experimental results show that the procedure of DTDP is very effective at correcting the training and prediction errors caused [sent-44, score-0.203]
27 by polarity shift, and that it outperforms other alternative methods of handling polarity shift. [sent-46, score-0.672]
28 2 Related Work. Lexicon-based sentiment classification systems can easily be modified to account for polarity shift. [sent-47, score-0.622]
29 One common way is to directly reverse the sentiment orientation of polarity-shifted words, and then sum up the orientations word by word (Hu and Liu, 2004; Kim and Hovy, 2004; Polanyi and Zaenen, 2004; Kennedy and Inkpen, 2006). [sent-48, score-0.278]
30 (2005) discussed other complex negation effects by using conjunctive and dependency relations among polarity words. [sent-50, score-0.53]
31 Although handling polarity shift is easy and effective in term-counting systems, such systems rarely outperform the baselines of machine learning methods (Kennedy and Inkpen, 2006). [sent-51, score-0.589]
32 The machine learning methods are generally more effective for sentiment classification. [sent-52, score-0.251]
33 However, it is difficult to handle polarity shift based on the BOW model. [sent-53, score-0.525]
34 Das and Chen (2001) proposed a method that simply attaches “NOT” to words in the scope of negation, so that in the text “I don’t like book”, the word “like” is changed to the new word “like-NOT”. [sent-54, score-0.12]
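As a rough illustration, the transform can be sketched as below; the negation-word list and the rule that punctuation closes the negation scope are simplifying assumptions of this sketch, not the exact implementation of Das and Chen (2001).

```python
# Sketch of "NOT" tagging: append "-NOT" to tokens inside a negation scope.
NEGATIONS = {"no", "not", "don't", "doesn't", "didn't", "never"}
SCOPE_END = {".", ",", "!", "?", ";"}

def not_tag(tokens):
    tagged, in_scope = [], False
    for tok in tokens:
        if tok in SCOPE_END:
            in_scope = False              # punctuation closes the scope
            tagged.append(tok)
        elif tok in NEGATIONS:
            in_scope = True               # a negation word opens a new scope
            tagged.append(tok)
        else:
            tagged.append(tok + "-NOT" if in_scope else tok)
    return tagged

# -> ['I', "don't", 'like-NOT', 'this-NOT', 'book-NOT']
print(not_tag("I don't like this book".split()))
```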
35 There were also some attempts to model polarity shift by using more complex linguistic features (Na et al., 2004). [sent-55, score-0.525]
36 But the improvements over the baselines of machine learning systems are very slight (less than 1%). [sent-57, score-0.073]
37 Ikeda et al. (2008) proposed a machine learning method to model polarity shifters for both word-wise and sentence-wise sentiment classification, based on a dictionary extracted from the General Inquirer. [sent-59, score-0.274]
38 Li and Huang (2009) proposed a method that first classifies each sentence in a text into a polarity-unshifted part and a polarity-shifted part according to certain rules, and then represents them as two bags-of-words for sentiment classification. [sent-60, score-0.25]
39 Li et al. (2010) further proposed a method to separate the shifted and unshifted text by training a binary detector. [sent-62, score-0.07]
40 An ensemble of the two component parts is then used to obtain the final polarity of the whole text. [sent-64, score-0.307]
41 3 The Proposed Approach. We first present the method for generating artificial polarity-opposite samples, and then introduce the algorithm of dual training and dual prediction (DTDP). [sent-65, score-1.043]
42 3.1 Generating Artificial Polarity-Opposite Samples. Given an original sample and an antonym dictionary (e.g., WordNet, http://wordnet.), an opposite sample is generated by three rules: 1) Sentiment reversion: each sentiment word is reversed to its antonym; 2) Negation removal: negation words are removed, and [sent-67, score-0.231]
43 the sentiment words in the scope of negation are not reversed; 3) Label reversion: the class label of the labeled sample is also reversed to its opposite (i.e., [sent-72, score-1.036]
44 Positive to Negative, or vice versa) as the class label of the newly generated samples (called polarity-opposite samples). [sent-74, score-0.258]
45 Given the original sample: Text: I don’t like this book. It is boring. [sent-76, score-0.278]
46 Label: Negative. According to Rule 1, “boring” is reversed to its antonym “interesting”; according to Rule 2, the negation word “don’t” is removed, and “like” is not reversed; according to Rule 3, the class label Negative is reversed to Positive. [sent-78, score-0.623]
47 Finally, an artificial polarity-opposite sample is generated: Text: I like this book. It is interesting. [sent-79, score-0.508]
48 Label: Positive. All samples in the training and test sets are reversed to their polarity-opposite versions. [sent-81, score-0.379]
49 We refer to them as “opposite training set” and “opposite test set”, respectively. [sent-82, score-0.07]
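A minimal sketch of Rules 1-3 follows; the toy antonym dictionary and the punctuation-bounded negation scope stand in for WordNet and proper scope detection, and both stand-ins are assumptions for illustration only.

```python
# Sketch of polarity reversion: Rules 1 (antonyms), 2 (drop negation), 3 (flip label).
ANTONYMS = {"boring": "interesting", "interesting": "boring",
            "good": "bad", "bad": "good"}
NEGATIONS = {"no", "not", "don't", "doesn't", "never"}
SCOPE_END = {".", ",", "!", "?"}

def reverse_sample(tokens, label):
    out, in_scope = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            in_scope = True                       # Rule 2: drop the negation word ...
            continue
        if in_scope:
            out.append(tok)                       # ... and leave its scope unreversed
            if tok in SCOPE_END:
                in_scope = False
        else:
            out.append(ANTONYMS.get(tok, tok))    # Rule 1: reverse sentiment words
    opposite = "Positive" if label == "Negative" else "Negative"  # Rule 3
    return out, opposite

tokens = "I don't like this book . It is boring .".split()
print(reverse_sample(tokens, "Negative"))
# -> (['I', 'like', 'this', 'book', '.', 'It', 'is', 'interesting', '.'], 'Positive')
```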
50 3.2 Dual Training and Dual Prediction. In this part, we introduce how to use the original and opposite training/test data together for dual training and dual prediction (DTDP). [sent-84, score-1.333]
51 Dual Training: Let $D = \{(x_i, y_i)\}_{i=1}^{N}$ and $\tilde{D} = \{(\tilde{x}_i, \tilde{y}_i)\}_{i=1}^{N}$ be the original and opposite training sets respectively, where $x$ denotes the feature vector, $y$ denotes the class label, and $N$ denotes the size of the training set. [sent-85, score-0.471]
52 In dual training, $D \cup \tilde{D}$ is used together as the training data for learning. [sent-86, score-0.496]
53 The size of training data is doubled in dual training. [sent-89, score-0.497]
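A sketch of dual training under these definitions is given below; scikit-learn, the bag-of-words vectorizer and the Naïve Bayes learner are illustrative choices for this sketch, not necessarily the exact setup of the paper.

```python
# Dual training: fit one classifier on the union of the original set D
# and the opposite set D~, which doubles the size of the training data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def dual_train(orig_texts, orig_labels, opp_texts, opp_labels):
    texts = orig_texts + opp_texts        # D union D~
    labels = orig_labels + opp_labels
    vec = CountVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(texts), labels)
    return vec, clf
```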
54 When only the original sample (“I don’t like this book. It is boring. [sent-92, score-0.206]
55 ”) is considered, the feature “like” will be improperly recognized as a negative indicator (since the class label is Negative), ignoring the expression of negation. [sent-94, score-0.132]
56 Nevertheless, if the generated opposite sample (“I like this book. It is interesting. [sent-95, score-0.382]
57 ”) is also used for training, “like” will be learned correctly, due to the removal of negation in sample reversion. [sent-97, score-0.3]
58 Therefore, the procedure of dual training can correct some learning errors caused by polarity shift. [sent-98, score-0.802]
59 Dual Prediction: Given an already-trained classification model, in dual prediction, the original and opposite test samples are used together for prediction. [sent-99, score-1.007]
60 In dual prediction, when we predict the positive degree of a test sample, we measure not only how positive the original test sample is, but also how negative the opposite sample is. [sent-100, score-1.114]
61 Let $x$ and $\tilde{x}$ denote the feature vectors of the original and opposite test samples respectively; let $p_d(c|x)$ and $p_d(c|\tilde{x})$ denote the predictions for the original and opposite test samples, based on the dual training model. [sent-101, score-1.263]
62 The dual predicting function is defined as $p_d(+|x, \tilde{x}) = (1-a)\,p_d(+|x) + a\,p_d(-|\tilde{x})$ and $p_d(-|x, \tilde{x}) = (1-a)\,p_d(-|x) + a\,p_d(+|\tilde{x})$, where $a$ $(0 \le a \le 1)$ is the weight of the opposite prediction. [sent-102, score-0.653]
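In code, the dual predicting function is a convex combination of the two scores; the default weight a = 0.5 below is an arbitrary placeholder, since the text does not state how a is chosen.

```python
def dual_predict(p_orig_pos, p_opp_pos, a=0.5):
    """p_orig_pos = p_d(+|x), p_opp_pos = p_d(+|x~)."""
    pos = (1.0 - a) * p_orig_pos + a * (1.0 - p_opp_pos)   # p_d(+|x, x~)
    neg = (1.0 - a) * (1.0 - p_orig_pos) + a * p_opp_pos   # p_d(-|x, x~)
    return pos, neg   # the two dual scores still sum to 1
```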
63 When only the original test sample (“I don’t like this book. It is boring. [sent-105, score-0.23]
64 ”) is used for prediction, it is very likely to be falsely predicted as Positive, since “like” is a strong positive feature, even though it is in the scope of negation. [sent-107, score-0.111]
65 In dual prediction, however, we also measure the “sentiment-opposite” degree of the opposite test sample (“I like this book. It is interesting.”). [sent-108, score-0.832]
66 Since negation is removed, it is very likely that the opposite test sample is assigned a high positive score, which can compensate for the prediction errors of the original test sample. [sent-111, score-0.806]
67 Final Output: It should be noted that although the artificially generated training and testing data are helpful in most cases, they still introduce some noise (e.g. [sent-112, score-0.148]
68 , some poorly generated samples may degrade the quality of the original data set). [sent-114, score-0.244]
69 Therefore, instead of using all dual predictions as the final output, we use the original prediction $p_o(c|x)$ as an alternative when the dual prediction $p_d(c|x, \tilde{x})$ is not confident enough, according to a confidence threshold $t$. [sent-115, score-1.154]
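A sketch of this final-output rule is given below; using the maximum class probability as the confidence measure and t = 0.7 as the threshold are assumptions of this sketch, since the text does not define them explicitly.

```python
def final_prediction(p_dual, p_orig, t=0.7):
    """p_dual, p_orig: dicts mapping class label -> probability."""
    # Fall back to the original prediction p_o(c|x) when the dual
    # prediction p_d(c|x, x~) is not confident enough.
    scores = p_dual if max(p_dual.values()) >= t else p_orig
    return max(scores, key=scores.get)

# final_prediction({"+": 0.55, "-": 0.45}, {"+": 0.2, "-": 0.8}) -> "-"
```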
70 The datasets consist of product reviews collected from four different domains: Book, DVD, Electronics and Kitchen. [sent-119, score-0.042]
71 Each of them contains 1,000 positive and 1,000 negative reviews. [sent-120, score-0.093]
72 Each of the datasets is randomly split into 5 folds, with four folds serving as training data and the remaining fold serving as test data. [sent-121, score-0.203]
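This split protocol can be sketched with scikit-learn's KFold, an illustrative choice since the paper does not name its tooling; the texts below are toy stand-ins for one domain's 1,000 positive and 1,000 negative reviews.

```python
from sklearn.model_selection import KFold
import numpy as np

texts = np.array(["good book"] * 10 + ["bad book"] * 10)     # toy data
labels = np.array(["Positive"] * 10 + ["Negative"] * 10)

for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(texts):
    # four folds train, the remaining fold tests
    X_train, y_train = texts[train_idx], labels[train_idx]
    X_test, y_test = texts[test_idx], labels[test_idx]
```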
73 The details of the algorithm are introduced in the related work section; 4) DTDP: our approach, proposed in Section 3. [sent-126, score-0.024]
74 4.3 Comparison of the Evaluated Systems. In Table 1, we report the classification accuracy of the four evaluated systems using unigram features. [sent-130, score-0.11]
75 We consider two widely-used classification algorithms: SVM and Naïve Bayes. [sent-131, score-0.089]
76 (Table 1 header: Dataset; Baseline, Das-2001, Li-2010 and DTDP under SVM; Baseline, Das-2001, Li-2010 and DTDP under Naïve Bayes.) Compared to the Baseline system, the Das-2001 approach achieves very slight improvements (less than 1%). [sent-144, score-0.093]
77 As for our approach (DTDP), the improvements are remarkable. [sent-149, score-0.049]
78 Compared to the Baseline system, the average improvements are 4. [sent-150, score-0.049]
79 We also report, in Table 2, the classification accuracy of the four systems using both unigram and bigram features. [sent-156, score-0.22]
80 It is now relatively difficult to show improvements by incorporating polarity shift, because bigrams already capture a part of the negations (e. [sent-158, score-0.021]
81 The Das-2001 approach still shows very limited improvements (less than 0. [sent-161, score-0.07]
82 Although the improvements of the previous two systems are both limited, the performance of our approach (DTDP) is still sound. [sent-166, score-0.07]
83 5 Conclusions. In this work, we propose a method, called dual training and dual prediction (DTDP), to address the polarity shift problem in sentiment classification. [sent-172, score-1.825]
84 The basic idea of DTDP is to generate artificial samples that are polarity-opposite to the original samples, and to make use of both the original and opposite samples for dual training and dual prediction. [sent-173, score-1.595]
85 Experimental studies show that our DTDP algorithm is very effective for sentiment classification and that it outperforms other alternative methods of handling polarity shift. [sent-174, score-0.705]
86 Sentiment classification of movie reviews using contextual valence shifters. [sent-197, score-0.142]
87 Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews. [sent-221, score-0.315]
wordName wordTfidf (topN-words)
[('dtdp', 0.452), ('dual', 0.429), ('polarity', 0.307), ('sentiment', 0.226), ('opposite', 0.224), ('shift', 0.218), ('negation', 0.201), ('cjx', 0.169), ('reversed', 0.161), ('samples', 0.148), ('bow', 0.143), ('pd', 0.129), ('na', 0.114), ('prediction', 0.112), ('sample', 0.099), ('classification', 0.089), ('ve', 0.086), ('reversion', 0.085), ('bayes', 0.081), ('kennedy', 0.076), ('ikeda', 0.075), ('original', 0.072), ('jx', 0.071), ('book', 0.07), ('svm', 0.062), ('scope', 0.061), ('artificially', 0.057), ('apd', 0.056), ('gin', 0.056), ('jiangsu', 0.056), ('inkpen', 0.055), ('orientation', 0.052), ('proceeding', 0.051), ('positive', 0.05), ('shoushan', 0.05), ('improvements', 0.049), ('thumbs', 0.047), ('polanyi', 0.046), ('ust', 0.046), ('training', 0.046), ('negative', 0.043), ('li', 0.041), ('serving', 0.04), ('boring', 0.04), ('handling', 0.039), ('label', 0.038), ('nlpr', 0.037), ('pang', 0.036), ('antonym', 0.036), ('like', 0.035), ('beats', 0.035), ('called', 0.034), ('das', 0.032), ('folds', 0.032), ('valence', 0.032), ('versa', 0.032), ('program', 0.032), ('artificial', 0.027), ('chen', 0.027), ('class', 0.026), ('po', 0.026), ('improperly', 0.025), ('sui', 0.025), ('dvd', 0.025), ('khoo', 0.025), ('effective', 0.025), ('generated', 0.024), ('slight', 0.024), ('nj', 0.024), ('lab', 0.024), ('test', 0.024), ('china', 0.024), ('dictionary', 0.024), ('address', 0.024), ('proposed', 0.024), ('considering', 0.023), ('nanjing', 0.023), ('deficiencies', 0.023), ('hu', 0.022), ('huang', 0.022), ('vice', 0.022), ('zaenen', 0.022), ('cqz', 0.022), ('conjunctive', 0.022), ('doubled', 0.022), ('bigrams', 0.021), ('still', 0.021), ('reviews', 0.021), ('together', 0.021), ('soochow', 0.021), ('electronics', 0.021), ('four', 0.021), ('suppose', 0.02), ('caused', 0.02), ('safety', 0.02), ('universities', 0.02), ('pf', 0.02), ('negations', 0.02), ('cn', 0.02), ('denotes', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong
Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification.
2 0.27625054 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li
Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing research exploits seed words, which leads to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.
3 0.17823702 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
Author: Angeliki Lazaridou ; Ivan Titov ; Caroline Sporleder
Abstract: We propose a joint model for unsupervised induction of sentiment, aspect and discourse information and show that by incorporating a notion of latent discourse relations in the model, we improve the prediction accuracy for aspect and sentiment polarity on the sub-sentential level. We deviate from the traditional view of discourse, as we induce types of discourse relations and associated discourse cues relevant to the considered opinion analysis task; consequently, the induced discourse relations play the role of opinion and aspect shifters. The quantitative analysis that we conducted indicated that the integration of a discourse model increased the prediction accuracy results with respect to the discourse-agnostic approach and the qualitative analysis suggests that the induced representations encode a meaningful discourse structure.
4 0.17527977 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams
Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky
Abstract: We study subjective language in social media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, high-precision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.
5 0.1517857 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
Author: Mohamed Aly ; Amir Atiya
Abstract: We introduce LABR, the largest sentiment analysis dataset to-date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the the dataset, and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and rating classification. We provide standard splits of the dataset into training and testing, for both polarity and rating classification, in both balanced and unbalanced settings. We run baseline experiments on the dataset to establish a benchmark.
6 0.14943795 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts
7 0.136198 318 acl-2013-Sentiment Relevance
8 0.1361798 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
9 0.12097979 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
10 0.1199669 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
11 0.11710606 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays
12 0.11373106 334 acl-2013-Supervised Model Learning with Feature Grouping based on a Discrete Constraint
13 0.11177961 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts
14 0.11074382 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
15 0.11008187 310 acl-2013-Semantic Frames to Predict Stock Price Movement
16 0.10481744 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
17 0.098169476 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions
18 0.095377862 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
19 0.094332039 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning
20 0.092585027 362 acl-2013-Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
topicId topicWeight
[(0, 0.15), (1, 0.187), (2, -0.047), (3, 0.195), (4, -0.065), (5, -0.09), (6, 0.059), (7, 0.03), (8, 0.017), (9, 0.112), (10, 0.157), (11, -0.068), (12, -0.087), (13, -0.088), (14, 0.026), (15, 0.089), (16, 0.039), (17, -0.01), (18, 0.065), (19, 0.066), (20, 0.004), (21, 0.106), (22, -0.012), (23, 0.115), (24, -0.011), (25, -0.005), (26, -0.128), (27, -0.037), (28, -0.025), (29, 0.005), (30, 0.02), (31, -0.048), (32, 0.005), (33, 0.023), (34, -0.011), (35, 0.077), (36, -0.029), (37, -0.01), (38, 0.078), (39, 0.043), (40, -0.025), (41, 0.043), (42, -0.076), (43, -0.041), (44, -0.051), (45, -0.011), (46, -0.011), (47, 0.018), (48, -0.059), (49, -0.045)]
simIndex simValue paperId paperTitle
same-paper 1 0.94998723 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong
Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification.
2 0.81566161 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li
Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing research exploits seed words, which leads to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.
3 0.7519961 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
Author: Ankit Ramteke ; Akshat Malu ; Pushpak Bhattacharyya ; J. Saketha Nath
Abstract: Thwarting and sarcasm are two uncharted territories in sentiment analysis, the former because of the lack of training corpora and the latter because of the enormous amount of world knowledge it demands. In this paper, we propose a working definition of thwarting amenable to machine learning and create a system that detects if the document is thwarted or not. We focus on identifying thwarting in product reviews, especially in the camera domain. An ontology of the camera domain is created. Thwarting is looked upon as the phenomenon of polarity reversal at a higher level of ontology compared to the polarity expressed at the lower level. This notion of thwarting defined with respect to an ontology is novel, to the best of our knowledge. A rule based implementation building upon this idea forms our baseline. We show that machine learning with annotated corpora (thwarted/non-thwarted) is more effective than the rule based system. Because of the skewed distribution of thwarting, we adopt the Area-under-the-Curve measure of performance. To the best of our knowledge, this is the first attempt at the difficult problem of thwarting detection, which we hope will at least provide a baseline system to compare against. (Authors: Akshat Malu, Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India, akshatmalu@cse.iitb.ac.in; J. Saketha Nath, Dept. of Computer Science & Engg., Indian Institute of Technology Bombay, Mumbai, India, saketh@cse.iitb.ac.in.) 1 Credits: The authors thank the lexicographers at Center for Indian Language Technology (CFILT) at IIT Bombay for their support for this work.
4 0.72179776 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schutze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
5 0.70559537 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
Author: Song Feng ; Jun Seok Kang ; Polina Kuznetsova ; Yejin Choi
Abstract: Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond denotative or surface meaning of text, as seemingly objective statements often allude nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative sentiments from even those words that are objective on the surface, such as “intelligence ”, “human ”, and “cheesecake ”. We propose induction algorithms encoding a diverse set of linguistic insights (semantic prosody, distributional similarity, semantic parallelism of coordination) and prior knowledge drawn from lexical resources, resulting in the first broad-coverage connotation lexicon.
6 0.68910474 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
7 0.68881178 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays
8 0.68241334 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams
9 0.54888421 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
10 0.5443784 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction
11 0.52399611 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
12 0.50487709 253 acl-2013-Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts
13 0.50192606 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions
14 0.47121185 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
15 0.46437931 310 acl-2013-Semantic Frames to Predict Stock Price Movement
16 0.45670629 334 acl-2013-Supervised Model Learning with Feature Grouping based on a Discrete Constraint
17 0.43831488 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews
18 0.42928219 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts
19 0.42383218 237 acl-2013-Margin-based Decomposed Amortized Inference
20 0.40961099 42 acl-2013-Aid is Out There: Looking for Help from Tweets during a Large Scale Disaster
topicId topicWeight
[(0, 0.075), (6, 0.051), (11, 0.065), (24, 0.06), (26, 0.088), (28, 0.012), (35, 0.042), (42, 0.025), (48, 0.044), (63, 0.014), (70, 0.059), (88, 0.081), (90, 0.014), (91, 0.215), (95, 0.062)]
simIndex simValue paperId paperTitle
1 0.76245242 52 acl-2013-Annotating named entities in clinical text by combining pre-annotation and active learning
Author: Maria Skeppstedt
Abstract: For expanding a corpus of clinical text, annotated for named entities, a method that combines pre-tagging with a version of active learning is proposed. In order to facilitate annotation and to avoid bias, two alternative automatic pre-taggings are presented to the annotator, without revealing which of them is given a higher confidence by the pre-tagging system. The task of the annotator is to select the correct version among these two alternatives. To minimise the instances in which none of the presented pre-taggings is correct, the texts presented to the annotator are actively selected from a pool of unlabelled text, with the selection criterion that one of the presented pre-taggings should have a high probability of being correct, while still being useful for improving the result of an automatic classifier.
same-paper 2 0.75928283 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
Author: Rui Xia ; Tao Wang ; Xuelei Hu ; Shoushan Li ; Chengqing Zong
Abstract: Bag-of-words (BOW) is now the most popular way to model text in machine learning based sentiment classification. However, the performance of such an approach sometimes remains rather limited due to some fundamental deficiencies of the BOW model. In this paper, we focus on the polarity shift problem, and propose a novel approach, called dual training and dual prediction (DTDP), to address it. The basic idea of DTDP is to first generate artificial samples that are polarity-opposite to the original samples by polarity reversion, and then leverage both the original and opposite samples for (dual) training and (dual) prediction. Experimental results on four datasets demonstrate the effectiveness of the proposed approach for polarity classification.
3 0.68348646 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
Author: Oleg Rokhlenko ; Idan Szpektor
Abstract: We introduce the novel task of automatically generating questions that are relevant to a text but do not appear in it. One motivating example of its application is for increasing user engagement around news articles by suggesting relevant comparable questions, such as “is Beyonce a better singer than Madonna?”, for the user to answer. We present the first algorithm for the task, which consists of: (a) offline construction of a comparable question template database; (b) ranking of relevant templates to a given article; and (c) instantiation of templates only with entities in the article whose comparison under the template’s relation makes sense. We tested the suggestions generated by our algorithm via a Mechanical Turk experiment, which showed a significant improvement over the strongest baseline of more than 45% in all metrics.
4 0.63441062 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages
Author: Young-Bum Kim ; Benjamin Snyder
Abstract: In this paper, we present a solution to one aspect of the decipherment task: the prediction of consonants and vowels for an unknown language and alphabet. Adopting a classical Bayesian perspective, we perform posterior inference over hundreds of languages, leveraging knowledge of known languages and alphabets to uncover general linguistic patterns of typologically coherent language clusters. We achieve average accuracy in the unsupervised consonant/vowel prediction task of 99% across 503 languages. We further show that our methodology can be used to predict more fine-grained phonetic distinctions. On a three-way classification task between vowels, nasals, and non-nasal consonants, our model yields unsupervised accuracy of 89% across the same set of languages.
5 0.6260764 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schutze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
6 0.6221137 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
7 0.62159055 333 acl-2013-Summarization Through Submodularity and Dispersion
8 0.62119758 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
9 0.61940134 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
10 0.61605811 373 acl-2013-Using Conceptual Class Attributes to Characterize Social Media Users
11 0.61234033 41 acl-2013-Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation
12 0.61091602 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
13 0.60874009 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts
14 0.60810912 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
15 0.60793704 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
16 0.60675937 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction
17 0.60571861 136 acl-2013-Enhanced and Portable Dependency Projection Algorithms Using Interlinear Glossed Text
18 0.60538101 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
19 0.60359955 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
20 0.60359126 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words