acl acl2013 acl2013-151 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kazi Saidul Hasan ; Vincent Ng
Abstract: Determining the stance expressed by an author from a post written for a two-sided debate in an online debate forum is a relatively new problem. We seek to improve Anand et al.’s (2011) approach to debate stance classification by modeling two types of soft extra-linguistic constraints on the stance labels of debate posts, user-interaction constraints and ideology constraints. Experimental results on four datasets demonstrate the effectiveness of these inter-post constraints in improving debate stance classification.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Determining the stance expressed by an author from a post written for a two-sided debate in an online debate forum is a relatively new problem. [sent-3, score-1.759]
2 We seek to improve Anand et al.’s (2011) approach to debate stance classification by modeling two types of soft extra-linguistic constraints on the stance labels of debate posts, user-interaction constraints and ideology constraints. [sent-5, score-2.452]
3 Experimental results on four datasets demonstrate the effectiveness of these inter-post constraints in improving debate stance classification. [sent-6, score-1.135]
4 1 Introduction While a lot of work on document-level opinion mining has involved determining the polarity expressed in a customer review (e. [sent-7, score-0.135]
5 , whether a review is “thumbs up” or “thumbs down”) (see Pang and Lee (2008) and Liu (2012) for an overview of the field), researchers have begun exploring new opinion mining tasks in recent years. [sent-9, score-0.04]
6 One such task is debate stance classification: given a post written for a two-sided topic discussed in an online debate forum (e. [sent-10, score-1.676]
7 Debate stance classification is potentially more interesting and challenging than polarity classification for at least two reasons. [sent-16, score-0.783]
8 First, while in polarity classification sentiment-bearing words and phrases have proven to be useful (e. [sent-17, score-0.085]
9 , “excellent” correlates strongly with the positive polarity), in debate stance classification it is not uncommon to find debate posts where stances are not expressed in terms of sentiment words, as exemplified in Figure 1, where the author is for abortion. [sent-19, score-1.782]
10 Second, while customer reviews are typically written independently of other reviews in an online forum, the same is not true for debate posts. [sent-20, score-0.435]
11 In a debate forum, debate posts form threads, where later posts often support or oppose the viewpoints raised in earlier posts in the same thread. [sent-25, score-1.287]
12 Previous approaches to debate stance classification have focused on three debate settings, namely congressional floor debates (Thomas et al. [sent-26, score-1.579]
13 , 2011), company-internal discussions (Murakami and Raymond, 2010), and online social, political, and ideological debates in public forums (Agrawal et al. [sent-31, score-0.288]
14 (2012) point out, debates in public forums differ from congressional debates and company-internal discussions in terms of language use. [sent-34, score-0.335]
15 Specifically, online debaters use colorful and emotional language to express their points, which may involve sarcasm, insults, and questioning another debater’s assumptions and evidence. [sent-35, score-0.034]
16 These properties can potentially make stance classification of online debates more challenging than that of the other two types of debates. [sent-36, score-0.859]
17 Our goal in this paper is to improve the state-of-the-art supervised learning approach to debate stance classification of online debates proposed by Anand et al. [sent-37, score-1.205]
18 Specifically, we hypothesize that there are two types of soft extra-linguistic constraints on the stance labels of debate posts that, [page footer, p. 816: Proceedings of the 51st Annual Meeting of t…, Sofia, Bulgaria, August 4-9, 2013] [sent-39, score-1.368]
19 if explicitly modeled, could improve a learningbased stance classification system. [sent-45, score-0.698]
20 We refer to these two types of inter-post constraints as user-interaction constraints and ideology constraints. [sent-46, score-0.388]
21 We show how they can be learned from stance-annotated debate posts in Sections 4. [sent-47, score-0.566]
22 2 Datasets For our experiments, we collect debate posts from four popular domains, Abortion (ABO), Gay Rights (GAY), Obama (OBA), and Marijuana (MAR), from an online debate forum (footnote 1). [sent-50, score-0.912]
23 All debates are two-sided, so each post receives one of two domain labels, for or against, depending on whether the author of the post supports or opposes abortion, gay rights, Obama, or the legalization of marijuana. [sent-51, score-0.85]
24 The fourth column of the table shows the percentage of posts in each domain that appear in a thread. [sent-53, score-0.186]
25 More precisely, a thread is a tree with one or more nodes such that (1) each node corresponds to a debate post, and (2) a post yi is the parent of another post yj if yj is a reply to yi. [sent-54, score-0.937]
26 Given a thread, we can generate post sequences, each of which is a path from the root of the thread to one of its leaves. [sent-55, score-0.255]
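The thread definition above (a tree of posts, with post sequences as root-to-leaf paths) can be illustrated with a minimal sketch. The `children` mapping, the post identifiers, and the function name `post_sequences` are hypothetical names chosen for this example; they do not come from the paper.

```python
from typing import Dict, List


def post_sequences(children: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Return every root-to-leaf path of a reply tree.

    `children` maps a post id to the ids of its direct replies
    (hypothetical representation, assumed for illustration).
    """
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        replies = children.get(path[-1], [])
        if not replies:
            # A post with no replies is a leaf, so the path is complete.
            paths.append(path)
        for reply in replies:
            stack.append(path + [reply])
    return sorted(paths)


# p1 is the root; p2 and p3 reply to p1; p4 replies to p2.
thread = {"p1": ["p2", "p3"], "p2": ["p4"]}
sequences = post_sequences(thread, "p1")
```

A thread with a single node yields one sequence of length 1, which matches the "one or more nodes" wording of the definition.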
27 3 Baseline Systems We employ as baselines two stance classification systems, Anand et al. [sent-56, score-0.731]
28 ’s approach is a supervised method that trains a stance classifier for determining whether the stance expressed in a debate post is for or against the topic. [sent-59, score-1.933]
29 Hence, we create one training instance from each post in the training set, using the stance it expresses as its class label. [sent-60, score-0.867]
30 After training, we can apply the classifier to classify the test instances, which are generated in the same way as the training instances. [sent-68, score-0.05]
31 Related work on stance classification of congressional debates has found that enforcing author constraints (ACs) can improve classification performance (e. [sent-69, score-1.15]
32 ACs are a type of inter-post constraints that specify that two posts written by the same author for the same debate domain should have the same stance. [sent-77, score-0.794]
33 We hypothesize that ACs could similarly be used to improve stance classification of ideological debates, and therefore propose a second baseline where we enhance the first baseline with ACs. [sent-78, score-0.806]
34 We first use the learned stance classifier to classify the test posts as in the first baseline, and then postprocess the labels of the test posts. [sent-80, score-0.938]
35 Specifically, we sum up the confidence values2 assigned to the set of test posts written by the same author for the same debate domain. [sent-81, score-0.632]
36 If the sum is positive, then we label all the posts in this set as for; otherwise we label them as against. [sent-82, score-0.186]
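The author-constraint postprocessing described above (sum the confidence values per author and domain, then label by the sign of the sum) might look like the following sketch. The tuple layout and the name `apply_author_constraints` are assumptions made for illustration; the confidence values stand in for signed classifier scores.

```python
from collections import defaultdict


def apply_author_constraints(posts):
    """Relabel posts so each (author, domain) group shares one stance.

    `posts` is a list of (post_id, author, domain, confidence) tuples,
    a hypothetical layout assumed for this sketch.
    """
    totals = defaultdict(float)
    for _, author, domain, confidence in posts:
        totals[(author, domain)] += confidence
    # Positive sum -> "for"; otherwise (zero or negative) -> "against".
    return {
        post_id: "for" if totals[(author, domain)] > 0 else "against"
        for post_id, author, domain, _ in posts
    }


posts = [
    ("a1", "ann", "ABO", 0.9),
    ("a2", "ann", "ABO", -0.2),
    ("b1", "bob", "ABO", -0.5),
    ("a3", "ann", "GAY", -0.1),
]
labels = apply_author_constraints(posts)
```

Here post `a2` flips to for because the same author's more confident post `a1` dominates the sum, while `a3` is unaffected because it belongs to a different domain.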
37 4 Extra-Linguistic Constraints In this section, we introduce two types of inter-post constraints on debate stance classification. [sent-83, score-1.152]
38 1 User-Interaction Constraints We call the first type of constraints user-interaction constraints (UCs). [sent-85, score-0.29]
39 UCs are motivated by the observation that the stance labels of the posts in a post sequence are not independent of each other. [sent-86, score-1.124]
40 Consider the post sequence in Figure 2, where each post is a response to the preceding post. [sent-87, score-0.521]
41 It shows an opening anti-abortion post (P1), followed by a pro-abortion comment (P2), which is in turn followed by another anti-abortion view (P3). [sent-88, score-0.223]
42 While this sequence contains alternating posts from opposing stances, in general there is no hard constraint on the stance of a post given [footnote 2: We use as the confidence value the signed distance of the associated test point from the SVM hyperplane.] [sent-89, score-1.158]
43 Nevertheless, we found that in our training data, a for (against) post is followed by an against (for) post 80% of the time. [sent-93, score-0.446]
44 UCs aim to model the regularities in how users interact with each other in a post sequence as soft constraints. [sent-94, score-0.31]
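A regularity of this kind can be measured by counting how often adjacent posts in a sequence carry opposite labels. The sketch below is one simple way to do that; the function name `alternation_rate` and the input layout are hypothetical, and a post that appears in several sequences is counted once per sequence, which may differ from how the paper's figure was computed.

```python
def alternation_rate(label_sequences):
    """Fraction of adjacent post pairs whose stance labels differ."""
    pairs = 0
    flips = 0
    for labels in label_sequences:
        for previous, current in zip(labels, labels[1:]):
            pairs += 1
            flips += previous != current
    return flips / pairs if pairs else 0.0


# Two toy sequences: three adjacent pairs in total, two of which alternate.
rate = alternation_rate([["against", "for", "against"], ["for", "for"]])
```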
45 These kinds of soft constraints can be naturally encoded as factors over adjacent posts in a post sequence (see Kschischang et al. [sent-95, score-0.624]
46 (2001)), which can in turn be learned by recasting stance classification as a sequence labeling task. [sent-96, score-0.745]
47 In our experiments, we seek to derive the best sequence of stance labels for each post sequence of length ≥ 1 using a Conditional Random Field (CRF) (Lafferty et al. [sent-97, score-0.985]
48 Each post in a sequence is represented using the same set of features as in the baselines. [sent-101, score-0.27]
49 After training, the resulting CRF model can be used to assign a stance sequence to each test post sequence. [sent-102, score-0.914]
50 Since a given test post may appear in more than one sequence, different occurrences of it may be assigned different stance labels by the CRF. [sent-104, score-0.891]
51 To determine the final stance label for the post, we average the probabilities assigned to the for stance over all its occurrences; if the average is ≥ 0. [sent-105, score-1.288]
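The averaging step can be sketched as follows. The threshold is truncated in the extracted sentence above, so it is left as a required argument here; the value 0.5 in the usage lines is purely an illustrative assumption, as are the function name `final_labels` and the input layout.

```python
from collections import defaultdict


def final_labels(occurrences, threshold):
    """Average P(for) over all occurrences of each post, then threshold.

    `occurrences` is an iterable of (post_id, p_for) pairs, one pair per
    sequence in which the post appears (hypothetical layout).
    """
    probabilities = defaultdict(list)
    for post_id, p_for in occurrences:
        probabilities[post_id].append(p_for)
    return {
        post_id: "for" if sum(ps) / len(ps) >= threshold else "against"
        for post_id, ps in probabilities.items()
    }


# 0.5 is an assumed threshold, not a value confirmed by the text above.
occurrences = [("p1", 0.75), ("p1", 0.25), ("p2", 0.25), ("p2", 0.5)]
decided = final_labels(occurrences, threshold=0.5)
```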
52 2 Ideology Constraints Next, we introduce our second type of inter-post constraints, ideology constraints (ICs). [sent-108, score-0.226]
53 ICs are cross-domain, author-based constraints: they are only applicable to debate posts written by the same author in different domains. [sent-109, score-0.632]
54 ICs model the fact that for some authors, their stances on various issues are determined in part by their ideological values, and in particular, their stances on different issues may be correlated. [sent-110, score-0.3]
55 For example, someone who opposes abortion is likely to be a conservative and has a good chance of opposing gay rights. [sent-111, score-0.316]
56 1 Implementing Ideology Constraints We first compute a set of conditional probabilities, P(stance(dq)=sd|stance(dp)=sc), where (1) dp, dq ∈ Domains (i. [sent-116, score-0.052]
57 , the set of four domains), (2) sc, sd ∈ {for, against}, and (3) dp ≠ dq. [sent-118, score-0.122]
58 It should be fairly easy to see that these conditional probabilities measure the degree of correlation between the stances in different domains. [sent-120, score-0.096]
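One plausible way to estimate these conditional probabilities is by relative frequency over authors who posted in both domains. The sketch below assumes each author has a single known stance per domain; that assumption, the dictionary layout, and the name `ideology_probabilities` are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def ideology_probabilities(author_stances):
    """Estimate P(stance(dq)=sd | stance(dp)=sc) by relative frequency.

    `author_stances` maps author -> {domain: stance} (assumed layout).
    Returns a dict keyed by (dp, sc, dq, sd).
    """
    joint = Counter()      # counts of (dp, sc, dq, sd)
    condition = Counter()  # counts of (dp, sc, dq)
    for stances in author_stances.values():
        # Ordered pairs of distinct domains, so dp != dq by construction.
        for dp, dq in permutations(stances, 2):
            joint[(dp, stances[dp], dq, stances[dq])] += 1
            condition[(dp, stances[dp], dq)] += 1
    return {key: count / condition[key[:3]] for key, count in joint.items()}


authors = {
    "u1": {"ABO": "for", "GAY": "for"},
    "u2": {"ABO": "for", "GAY": "against"},
    "u3": {"ABO": "against", "GAY": "against"},
    "u4": {"ABO": "for", "GAY": "for"},
}
probs = ideology_probabilities(authors)
```

In this toy data, two of the three authors who are for in ABO are also for in GAY, so the estimate for that pair is 2/3.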
59 2 Inference Using ILP Recall that in our second baseline, we employ ACs to postprocess the output of the stance classifier simply by summing up the confidence values assigned to the posts written by the same author for the same debate domain. [sent-123, score-1.372]
60 However, since we now want to enforce two types of inter-post constraints (namely, ACs and ICs), we will have to employ a more sophisticated inference mechanism. [sent-124, score-0.179]
61 Previous work has focused on employing graph minimum cut (MinCut) as the inference algorithm. [sent-125, score-0.044]
62 However, since MinCut suffers from the weakness of not being able to enforce negative constraints (i. [sent-126, score-0.128]
63 , two posts cannot receive the same label) (Bansal et al. [sent-128, score-0.186]
64 Let pi = P(for|yi) be the “benefit” of setting xi to 1, where P(for|yi) is provided by the CRF. [sent-139, score-0.035]
65 Consequently, after optimization, yi’s stance is for if its xi is set to 1. [sent-140, score-0.679]
66 If yi and yj are composed by the same author, we ensure that xi and xj will be assigned the same value by employing the linear constraint |xi − xj | = 0. [sent-143, score-0.123]
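To make the role of the binary variables xi and the same-author constraint concrete, here is a toy stand-in that uses exhaustive search rather than an actual ILP solver. The objective (maximize the sum of pi for posts labeled for plus 1 − pi for posts labeled against) is an assumption, since the full objective is not visible in the extracted sentences; all posts are assumed to belong to one domain, and the function name is hypothetical.

```python
from itertools import product


def solve_with_author_constraints(p_for, authors):
    """Pick x_i in {0, 1} maximizing total benefit subject to
    |x_i - x_j| = 0 whenever posts i and j share an author.

    Exhaustive search over all assignments: only viable for tiny inputs,
    and used here in place of an ILP solver for illustration.
    """
    n = len(p_for)
    best, best_score = None, float("-inf")
    for x in product((0, 1), repeat=n):
        violates = any(
            authors[i] == authors[j] and abs(x[i] - x[j]) != 0
            for i in range(n)
            for j in range(i + 1, n)
        )
        if violates:
            continue
        # Assumed objective: p_i if labeled for, 1 - p_i if labeled against.
        score = sum(p if xi else 1 - p for p, xi in zip(p_for, x))
        if score > best_score:
            best, best_score = x, score
    return list(best)


# Posts 0 and 1 share an author, so they must receive the same label.
assignment = solve_with_author_constraints([0.9, 0.4, 0.3], ["ann", "ann", "bob"])
```

The second post would be labeled against on its own (0.4 < 0.6), but the constraint ties it to the first post, and labeling both for gives the higher total.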
67 Since all experiments require the use of development data for parameter tuning, we use three folds for model training, one fold for development, and one fold for testing in each fold experiment. [sent-163, score-0.107]
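The three/one/one fold arrangement can be sketched as a rotation over five folds. Which fold serves as development data in each experiment is not stated above; taking the fold that follows the test fold is an assumption of this sketch, as is the function name `fold_splits`.

```python
def fold_splits(n_folds=5):
    """Rotate test and development folds; the remaining folds are for training."""
    splits = []
    for test in range(n_folds):
        dev = (test + 1) % n_folds  # assumed choice of development fold
        train = [f for f in range(n_folds) if f not in (test, dev)]
        splits.append({"train": train, "dev": dev, "test": test})
    return splits


splits = fold_splits()
```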
68 (2011) baseline (see Section 3) on the four datasets, obtained by training an SVM stance classifier using the SVMlight software. [sent-167, score-0.673]
69 Overall, our inter-post constraints yield a stance classification system that significantly outperforms the better baseline on all four datasets, with an average improvement of 6. [sent-173, score-0.826]
70 We find three pairs of ICs involving the other three domains ABO, GAY, and OBA in our training data. [sent-182, score-0.046]
71 More specifically, the stances of the posts written by an author for these three domains are all positively correlated. [sent-183, score-0.428]
72 In other words, if an author supports abortion, it is likely that she supports both gay rights and Obama as well. [sent-184, score-0.328]
73 This means that no ICs can be established between the posts in MAR and those in the remaining domains. [sent-186, score-0.186]
74 Specifically, ICs are seen in all five folds of the data in the first two pairs of domains, whereas they are seen in only two folds in the last pair of domains. [sent-189, score-0.058]
75 6 Related Work Previous work has investigated the use of extra-linguistic constraints to improve stance classification. [sent-190, score-0.772]
76 First, ICs are softer than ACs, so accurate modeling of ICs has to be based on stance-annotated data. [sent-199, score-0.034]
77 Although we employ ICs as hard constraints (owing in part to our use of the ILP framework), they can be used directly as soft constraints in other frameworks, such as MinCut. [sent-200, score-0.329]
78 To our knowledge, this is the first time inter-domain constraints are employed for stance classification. [sent-202, score-0.772]
79 There has been work related to the modeling of user interaction in a post sequence. [sent-203, score-0.223]
80 Recall that between two adjacent posts in a post sequence that have opposing stances, there exists a rebuttal link. [sent-204, score-0.582]
81 (2012) employ manually identified rebuttal links as hard inter-post constraints during inference. [sent-206, score-0.229]
82 However, since automatic discovery of rebuttal links is a non-trivial problem, employing gold rebuttal links substantially simplifies the stance classification task. [sent-207, score-0.86]
83 Instead, hypothesizing that the content of the preceding post in a post sequence would be useful for predicting the stance of the current post, they employ features computed based on the preceding post when training a stance classifier. [sent-212, score-2.093]
84 Hence, unlike us, they classify each post independently of the others, whereas we classify the posts in a sequence in dependent relation to each other. [sent-213, score-0.498]
85 The ILP framework has been applied to perform joint inference for a variety of stance prediction tasks. [sent-214, score-0.662]
86 (2012) address the task of discovering opposing opinion networks, where the goal is to partition the authors in a debate (e. [sent-216, score-0.444]
87 , gay rights) based on whether they support or oppose the given issue. [sent-218, score-0.203]
88 To this end, they employ ILP to coordinate different sources of information. [sent-219, score-0.061]
89 In our previous work on debate stance classification (Hasan and Ng, 2012), we employ ILP to coordinate the output of two classifiers: a post-stance classifier, which determines the stance of a debate post written for a domain (e. [sent-220, score-2.356]
90 , gay rights); and a topic-stance classifier, which determines the author’s stance on each topic mentioned in her post (e. [sent-222, score-1.033]
91 7 Conclusions We examined the under-studied task of stance classification of ideological debates. [sent-226, score-0.806]
92 Employing our two types of extra-linguistic constraints yields a system that outperforms an improved version of Anand et al. [sent-227, score-0.128]
93 While the effectiveness of ideology constraints depends to some extent on the “relatedness” of the underlying ideological domains, we believe that the gains they offer will increase with the number of authors posting in different domains and the number of related domains. [sent-230, score-0.38]
94 6 6Only a small fraction of the authors posted in multiple domains in our datasets: 12% and 5% of them posted in two and three domains, respectively. [sent-231, score-0.106]
95 Determining the polarity and source of opinions expressed in political debates. [sent-244, score-0.052]
96 The power of negative thinking: Exploiting label disagreement in the min-cut classification framework. [sent-248, score-0.071]
97 Predicting stance in ideological debate with rich linguistic knowledge. [sent-260, score-1.098]
98 Conditional random fields: Probabilistic models for segmenting and labeling sequence data. [sent-277, score-0.047]
99 Unsupervised discovery of opposing opinion networks from forum discussions. [sent-285, score-0.143]
100 Classifying positions in online debates from reply activities and opinion expressions. [sent-297, score-0.223]
wordName wordTfidf (topN-words)
[('stance', 0.644), ('debate', 0.346), ('ics', 0.245), ('post', 0.223), ('acs', 0.211), ('posts', 0.186), ('gay', 0.166), ('anand', 0.131), ('constraints', 0.128), ('debates', 0.127), ('ideological', 0.108), ('ilp', 0.1), ('ideology', 0.098), ('stances', 0.096), ('abo', 0.085), ('oba', 0.085), ('dp', 0.081), ('rebuttal', 0.068), ('rights', 0.062), ('congressional', 0.062), ('abortion', 0.062), ('author', 0.062), ('opposing', 0.058), ('walker', 0.057), ('burfoot', 0.056), ('bansal', 0.054), ('classification', 0.054), ('dq', 0.052), ('sc', 0.049), ('sequence', 0.047), ('domains', 0.046), ('ucs', 0.045), ('forum', 0.045), ('hasan', 0.042), ('sd', 0.041), ('mar', 0.041), ('soft', 0.04), ('opinion', 0.04), ('written', 0.038), ('oppose', 0.037), ('xi', 0.035), ('online', 0.034), ('interpost', 0.034), ('postprocess', 0.034), ('stanceannotated', 0.034), ('userinteraction', 0.034), ('yi', 0.033), ('employ', 0.033), ('thread', 0.032), ('polarity', 0.031), ('obama', 0.031), ('lu', 0.031), ('opposes', 0.03), ('kschischang', 0.03), ('kazi', 0.03), ('pranav', 0.03), ('saidul', 0.03), ('posted', 0.03), ('folds', 0.029), ('classifier', 0.029), ('yj', 0.029), ('preceding', 0.028), ('coordinate', 0.028), ('abbott', 0.028), ('sentiment', 0.027), ('posters', 0.026), ('murakami', 0.026), ('agrawal', 0.026), ('mincut', 0.026), ('crf', 0.026), ('fold', 0.026), ('determining', 0.026), ('employing', 0.026), ('biran', 0.025), ('balahur', 0.025), ('implementing', 0.025), ('labels', 0.024), ('ros', 0.024), ('owing', 0.024), ('yessenalina', 0.024), ('marilyn', 0.023), ('lillian', 0.022), ('pang', 0.022), ('thomas', 0.022), ('rob', 0.022), ('reply', 0.022), ('svmlight', 0.021), ('classify', 0.021), ('expressed', 0.021), ('somasundaran', 0.02), ('forums', 0.019), ('supports', 0.019), ('enforcing', 0.019), ('thumbs', 0.019), ('svm', 0.019), ('mallet', 0.019), ('inference', 0.018), ('datasets', 0.017), ('customer', 0.017), ('disagreement', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 151 acl-2013-Extra-Linguistic Constraints on Stance Recognition in Ideological Debates
Author: Kazi Saidul Hasan ; Vincent Ng
Abstract: Determining the stance expressed by an author from a post written for a two-sided debate in an online debate forum is a relatively new problem. We seek to improve Anand et al.’s (2011) approach to debate stance classification by modeling two types of soft extra-linguistic constraints on the stance labels of debate posts, user-interaction constraints and ideology constraints. Experimental results on four datasets demonstrate the effectiveness of these inter-post constraints in improving debate stance classification.
2 0.19043432 121 acl-2013-Discovering User Interactions in Ideological Discussions
Author: Arjun Mukherjee ; Bing Liu
Abstract: Online discussion forums are a popular platform for people to voice their opinions on any subject matter and to discuss or debate any issue of interest. In forums where users discuss social, political, or religious issues, there are often heated debates among users or participants. Existing research has studied mining of user stances or camps on certain issues, opposing perspectives, and contention points. In this paper, we focus on identifying the nature of interactions among user pairs. The central questions are: How does each pair of users interact with each other? Does the pair of users mostly agree or disagree? What is the lexicon that people often use to express agreement and disagreement? We present a topic model based approach to answer these questions. Since agreement and disagreement expressions are usually multiword phrases, we propose to employ a ranking method to identify highly relevant phrases prior to topic modeling. After modeling, we use the modeling results to classify the nature of interaction of each user pair. Our evaluation results using real-life discussion/debate posts demonstrate the effectiveness of the proposed techniques.
3 0.13611557 287 acl-2013-Public Dialogue: Analysis of Tolerance in Online Discussions
Author: Arjun Mukherjee ; Vivek Venkataraman ; Bing Liu ; Sharon Meraz
Abstract: Social media platforms have enabled people to freely express their views and discuss issues of interest with others. While it is important to discover the topics in discussions, it is equally useful to mine the nature of such discussions or debates and the behavior of the participants. There are many questions that can be asked. One key question is whether the participants give reasoned arguments with justifiable claims via constructive debates or exhibit dogmatism and egotistic clashes of ideologies. The central idea of this question is tolerance, which is a key concept in the field of communications. In this work, we perform a computational study of tolerance in the context of online discussions. We aim to identify tolerant vs. intolerant participants and investigate how disagreement affects tolerance in discussions in a quantitative framework. To the best of our knowledge, this is the first such study. Our experiments using real-life discussions demonstrate the effectiveness of the proposed technique and also provide some key insights into the psycholinguistic phenomenon of tolerance in online discussions.
4 0.13493295 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
Author: Amjad Abu-Jbara ; Ben King ; Mona Diab ; Dragomir Radev
Abstract: In this paper, we use Arabic natural language processing techniques to analyze Arabic debates. The goal is to identify how the participants in a discussion split into subgroups with contrasting opinions. The members of each subgroup share the same opinion with respect to the discussion topic and an opposing opinion to the members of other subgroups. We use opinion mining techniques to identify opinion expressions and determine their polarities and their targets. We use opinion predictions to represent the discussion in one of two formal representations: signed attitude network or a space of attitude vectors. We identify opinion subgroups by partitioning the signed network representation or by clustering the vector space representation. We evaluate the system using a data set of labeled discussions and show that it achieves good results.
5 0.11232881 49 acl-2013-An annotated corpus of quoted opinions in news articles
Author: Tim O'Keefe ; James R. Curran ; Peter Ashwell ; Irena Koprinska
Abstract: Quotes are used in news articles as evidence of a person’s opinion, and thus are a useful target for opinion mining. However, labelling each quote with a polarity score directed at a textually-anchored target can ignore the broader issue that the speaker is commenting on. We address this by instead labelling quotes as supporting or opposing a clear expression of a point of view on a topic, called a position statement. Using this we construct a corpus covering 7 topics with 2,228 quotes.
6 0.096156046 232 acl-2013-Linguistic Models for Analyzing and Detecting Biased Language
7 0.064720884 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction
8 0.059541315 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization
9 0.049437366 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning
10 0.045716897 244 acl-2013-Mining Opinion Words and Opinion Targets in a Two-Stage Framework
11 0.04394757 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
12 0.043062214 318 acl-2013-Sentiment Relevance
13 0.042689893 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities
14 0.042227615 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
15 0.040180486 240 acl-2013-Microblogs as Parallel Corpora
16 0.039738279 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
17 0.039544966 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
18 0.039134912 336 acl-2013-Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
19 0.039057501 278 acl-2013-Patient Experience in Online Support Forums: Modeling Interpersonal Interactions and Medication Use
20 0.03874917 237 acl-2013-Margin-based Decomposed Amortized Inference
topicId topicWeight
[(0, 0.102), (1, 0.093), (2, -0.013), (3, 0.07), (4, -0.012), (5, 0.024), (6, -0.009), (7, -0.038), (8, -0.047), (9, 0.009), (10, -0.013), (11, 0.015), (12, -0.03), (13, -0.015), (14, -0.052), (15, -0.023), (16, -0.008), (17, 0.046), (18, 0.023), (19, -0.013), (20, 0.009), (21, -0.009), (22, -0.005), (23, 0.022), (24, -0.022), (25, -0.001), (26, -0.079), (27, -0.027), (28, -0.008), (29, -0.054), (30, 0.039), (31, -0.0), (32, 0.054), (33, 0.022), (34, 0.005), (35, 0.053), (36, 0.104), (37, -0.011), (38, -0.101), (39, -0.027), (40, 0.116), (41, 0.034), (42, 0.107), (43, -0.117), (44, 0.04), (45, 0.0), (46, -0.06), (47, -0.234), (48, 0.053), (49, -0.043)]
simIndex simValue paperId paperTitle
same-paper 1 0.92432976 151 acl-2013-Extra-Linguistic Constraints on Stance Recognition in Ideological Debates
Author: Kazi Saidul Hasan ; Vincent Ng
Abstract: Determining the stance expressed by an author from a post written for a two-sided debate in an online debate forum is a relatively new problem. We seek to improve Anand et al.’s (2011) approach to debate stance classification by modeling two types of soft extra-linguistic constraints on the stance labels of debate posts, user-interaction constraints and ideology constraints. Experimental results on four datasets demonstrate the effectiveness of these inter-post constraints in improving debate stance classification.
2 0.85898381 287 acl-2013-Public Dialogue: Analysis of Tolerance in Online Discussions
Author: Arjun Mukherjee ; Vivek Venkataraman ; Bing Liu ; Sharon Meraz
Abstract: Social media platforms have enabled people to freely express their views and discuss issues of interest with others. While it is important to discover the topics in discussions, it is equally useful to mine the nature of such discussions or debates and the behavior of the participants. There are many questions that can be asked. One key question is whether the participants give reasoned arguments with justifiable claims via constructive debates or exhibit dogmatism and egotistic clashes of ideologies. The central idea of this question is tolerance, which is a key concept in the field of communications. In this work, we perform a computational study of tolerance in the context of online discussions. We aim to identify tolerant vs. intolerant participants and investigate how disagreement affects tolerance in discussions in a quantitative framework. To the best of our knowledge, this is the first such study. Our experiments using real-life discussions demonstrate the effectiveness of the proposed technique and also provide some key insights into the psycholinguistic phenomenon of tolerance in online discussions.
3 0.71779883 121 acl-2013-Discovering User Interactions in Ideological Discussions
Author: Arjun Mukherjee ; Bing Liu
Abstract: Online discussion forums are a popular platform for people to voice their opinions on any subject matter and to discuss or debate any issue of interest. In forums where users discuss social, political, or religious issues, there are often heated debates among users or participants. Existing research has studied mining of user stances or camps on certain issues, opposing perspectives, and contention points. In this paper, we focus on identifying the nature of interactions among user pairs. The central questions are: How does each pair of users interact with each other? Does the pair of users mostly agree or disagree? What is the lexicon that people often use to express agreement and disagreement? We present a topic model based approach to answer these questions. Since agreement and disagreement expressions are usually multiword phrases, we propose to employ a ranking method to identify highly relevant phrases prior to topic modeling. After modeling, we use the modeling results to classify the nature of interaction of each user pair. Our evaluation results using real-life discussion/debate posts demonstrate the effectiveness of the proposed techniques.
4 0.63408256 232 acl-2013-Linguistic Models for Analyzing and Detecting Biased Language
Author: Marta Recasens ; Cristian Danescu-Niculescu-Mizil ; Dan Jurafsky
Abstract: Unbiased language is a requirement for reference sources like encyclopedias and scientific texts. Bias is, nonetheless, ubiquitous, making it crucial to understand its nature and linguistic realization and hence detect bias automatically. To this end we analyze real instances of human edits designed to remove bias from Wikipedia articles. The analysis uncovers two classes of bias: framing bias, such as praising or perspective-specific words, which we link to the literature on subjectivity; and epistemological bias, related to whether propositions that are presupposed or entailed in the text are uncontroversially accepted as true. We identify common linguistic cues for these classes, including factive verbs, implicatives, hedges, and subjective intensifiers. These insights help us develop features for a model to solve a new prediction task of practical importance: given a biased sentence, identify the bias-inducing word. Our linguistically-informed model performs almost as well as humans tested on the same task.
5 0.56209999 49 acl-2013-An annotated corpus of quoted opinions in news articles
Author: Tim O'Keefe ; James R. Curran ; Peter Ashwell ; Irena Koprinska
Abstract: Quotes are used in news articles as evidence of a person’s opinion, and thus are a useful target for opinion mining. However, labelling each quote with a polarity score directed at a textually-anchored target can ignore the broader issue that the speaker is commenting on. We address this by instead labelling quotes as supporting or opposing a clear expression of a point of view on a topic, called a position statement. Using this we construct a corpus covering 7 topics with 2,228 quotes.
6 0.49449638 30 acl-2013-A computational approach to politeness with application to social factors
7 0.47324237 278 acl-2013-Patient Experience in Online Support Forums: Modeling Interpersonal Interactions and Medication Use
8 0.41189492 67 acl-2013-Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
9 0.39990094 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
10 0.39897421 298 acl-2013-Recognizing Rare Social Phenomena in Conversation: Empowerment Detection in Support Group Chatrooms
11 0.37787583 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
12 0.35528231 33 acl-2013-A user-centric model of voting intention from Social Media
13 0.32629937 318 acl-2013-Sentiment Relevance
14 0.31507394 54 acl-2013-Are School-of-thought Words Characterizable?
15 0.31091446 184 acl-2013-Identification of Speakers in Novels
16 0.30947328 52 acl-2013-Annotating named entities in clinical text by combining pre-annotation and active learning
17 0.2887111 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
18 0.28576186 346 acl-2013-The Impact of Topic Bias on Quality Flaw Prediction in Wikipedia
19 0.27925098 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures
20 0.26746982 277 acl-2013-Part-of-speech tagging with antagonistic adversaries
topicId topicWeight
[(0, 0.034), (4, 0.044), (6, 0.032), (11, 0.026), (24, 0.026), (26, 0.074), (35, 0.098), (42, 0.032), (48, 0.035), (63, 0.333), (70, 0.024), (88, 0.05), (90, 0.023), (95, 0.068)]
simIndex simValue paperId paperTitle
same-paper 1 0.7960037 151 acl-2013-Extra-Linguistic Constraints on Stance Recognition in Ideological Debates
Author: Kazi Saidul Hasan ; Vincent Ng
Abstract: Determining the stance expressed by an author from a post written for a two-sided debate in an online debate forum is a relatively new problem. We seek to improve Anand et al.’s (2011) approach to debate stance classification by modeling two types of soft extra-linguistic constraints on the stance labels of debate posts, user-interaction constraints and ideology constraints. Experimental results on four datasets demonstrate the effectiveness of these inter-post constraints in improving debate stance classification.
2 0.65255421 219 acl-2013-Learning Entity Representation for Entity Disambiguation
Author: Zhengyan He ; Shujie Liu ; Mu Li ; Ming Zhou ; Longkai Zhang ; Houfeng Wang
Abstract: We propose a novel entity disambiguation model, based on Deep Neural Network (DNN). Instead of utilizing simple similarity measures and their disjoint combinations, our method directly optimizes document and entity representations for a given similarity measure. Stacked Denoising Auto-encoders are first employed to learn an initial document representation in an unsupervised pre-training stage. A supervised fine-tuning stage follows to optimize the representation towards the similarity measure. Experiment results show that our method achieves state-of-the-art performance on two public datasets without any manually designed features, even beating complex collective approaches.
3 0.62125909 140 acl-2013-Evaluating Text Segmentation using Boundary Edit Distance
Author: Chris Fournier
Abstract: This work proposes a new segmentation evaluation metric, named boundary similarity (B), an inter-coder agreement coefficient adaptation, and a confusion-matrix for segmentation that are all based upon an adaptation of the boundary edit distance in Fournier and Inkpen (2012). Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credit for near misses between boundaries, but are biased towards segmentations containing few or tightly clustered boundaries. Despite S’s improvements, its normalization also produces cosmetically high values that overestimate agreement & performance, leading this work to propose a solution.
4 0.59497845 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words
Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li
Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing research exploits seed words, which leads to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.
5 0.46039402 49 acl-2013-An annotated corpus of quoted opinions in news articles
Author: Tim O'Keefe ; James R. Curran ; Peter Ashwell ; Irena Koprinska
Abstract: Quotes are used in news articles as evidence of a person’s opinion, and thus are a useful target for opinion mining. However, labelling each quote with a polarity score directed at a textually-anchored target can ignore the broader issue that the speaker is commenting on. We address this by instead labelling quotes as supporting or opposing a clear expression of a point of view on a topic, called a position statement. Using this we construct a corpus covering 7 topics with 2,228 quotes.
6 0.44889903 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
7 0.44773221 287 acl-2013-Public Dialogue: Analysis of Tolerance in Online Discussions
8 0.44226426 315 acl-2013-Semi-Supervised Semantic Tagging of Conversational Understanding using Markov Topic Regression
9 0.43133247 121 acl-2013-Discovering User Interactions in Ideological Discussions
10 0.43030334 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
11 0.42389867 351 acl-2013-Topic Modeling Based Classification of Clinical Reports
12 0.42305914 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
13 0.42289403 295 acl-2013-Real-World Semi-Supervised Learning of POS-Taggers for Low-Resource Languages
14 0.41945851 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
15 0.41895905 294 acl-2013-Re-embedding words
16 0.41827008 172 acl-2013-Graph-based Local Coherence Modeling
17 0.41788292 312 acl-2013-Semantic Parsing as Machine Translation
18 0.41770759 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
19 0.41761476 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
20 0.41656643 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning