acl acl2012 acl2012-187 knowledge-graph by maker-knowledge-mining

187 acl-2012-Subgroup Detection in Ideological Discussions


Source: pdf

Author: Amjad Abu-Jbara ; Pradeep Dasigi ; Mona Diab ; Dragomir Radev

Abstract: The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. [sent-9, score-1.214]

2 In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. [sent-10, score-0.411]

3 We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. [sent-11, score-0.923]

4 We use attitude predictions to construct an attitude vector for each discussant. [sent-12, score-1.193]

5 We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. [sent-13, score-0.464]

6 The members of each subgroup carry the same opinion common1 1www. [sent-17, score-0.505]

7 The member of a subgroup is more likely to show positive attitude to the members of the same subgroup, and negative attitude to the members of opposing subgroups. [sent-24, score-1.687]

8 For example, let us consider the following two snippets from a debate about the enforcement of a new immigration law in Arizona state in the United States: (1) Discussant 1: Arizona immigration law is good. [sent-25, score-0.533]

9 Arizona immigration law is blatant racism, and quite unconstitutional. [sent-28, score-0.269]

10 In (1), the writer is expressing positive attitude regarding the immigration law and negative attitude regarding illegal immigration. [sent-29, score-1.634]

11 The writer of (2) is expressing negative attitude towards the writer of (1) and negative attitude regarding the immigration law. [sent-30, score-1.531]

12 It is clear from this short dialog that the writer of (1) and the writer of (2) are members of two opposing subgroups. [sent-31, score-0.298]

13 In this paper, we present an unsupervised approach for determining the subgroup membership of each participant in a discussion. [sent-33, score-0.407]

14 We use linguistic techniques to identify attitude expressions, their polarities, and their targets. [sent-34, score-0.631]

15 The target of attitude could be another discussant or an entity mentioned in the discussion. [sent-35, score-1.063]

16 We use sentiment analysis techniques to identify opinion expressions. [sent-36, score-0.381]

17 ... named entity recognition and noun phrase chunking to identify the entities mentioned in the discussion. (Page footer: © 2012 Association for Computational Linguistics, pages 399–409.) [sent-39, score-0.212]

18 For each participant in the discussion, we construct a vector of attitude features. [sent-41, score-0.739]

19 The attitude profile of a discussant contains an entry for every other discussant and an entry for every entity mentioned in the discussion. [sent-43, score-1.386]

20 We use clustering techniques to cluster the attitude vector space. [sent-44, score-0.765]

21 We use the clustering results to determine the subgroup structure of the discussion group and the subgroup membership of each participant. [sent-45, score-0.67]

22 We use previous work on subjectivity and polarity prediction to identify opinion words in discussions. [sent-67, score-0.45]

23 (2010) presents a method for identifying sentences that display an attitude from the text writer toward the text recipient. [sent-74, score-0.72]

24 They define attitude as the mental position of one participant with regard to another participant. [sent-75, score-0.578]

25 A very detailed survey that covers techniques and approaches in sentiment analysis and opinion mining can be found in (Pang and Lee, 2008). [sent-76, score-0.37]

26 2 Opinion Target Extraction Several methods have been proposed to identify the target of an opinion expression. [sent-78, score-0.314]

27 In this context, opinion targets usually refer to product features (i. [sent-82, score-0.313]

28 In another related work, Jakob and Gurevych (2010) showed that resolving the anaphoric links in the text significantly improves opinion target extraction. [sent-91, score-0.292]

29 Table 1: Example posts from the Arizona Immigration Law thread. ... pairing as shown in Section 3 below. [sent-101, score-0.197]

30 Somasundaran and Wiebe (2009) present an unsupervised opinion analysis method for debate-side classification. [sent-104, score-0.213]

31 They mine the web to learn associations that are indicative of opinion stances in debates and combine this knowledge with discourse information. [sent-105, score-0.263]

32 They use a number of linguistic and structural features such as unigrams, bigrams, cue words, repeated punctuation, and opinion dependencies to build a stance classification model. [sent-108, score-0.273]

33 Our work is characterized by handling multi-side debates and by regarding the problem as a clustering problem where the number of sides is not known by the algorithm. [sent-110, score-0.187]

34 This work also utilizes only discussant-to-topic attitude predictions for debate-side classification. [sent-111, score-0.578]

35 Our work utilizes both discussant-to-topic and discussant-to-discussant attitude predictions. [sent-112, score-0.578]

36 Moreover, although this work is related to ours at the goal level, it does not involve any opinion analysis. [sent-116, score-0.213]

37 The posts cover 12 disputed political and ideological topics. [sent-130, score-0.2]

38 The poll asked them to determine their stance on the discussion topic by choosing one item from a list of possible arguments. [sent-132, score-0.217]

39 The people who participated in the poll were allowed to post text to that thread to justify their choices and to argue with other participants. [sent-135, score-0.27]

40 We collected the votes and the discussion thread of each poll. [sent-136, score-0.204]

41 We used the votes to identify the subgroup membership of each participant. [sent-137, score-0.336]

42 When a new participant enters the discussion, she explicitly picks a position and posts text to support it, support a post written by another participant who took the same position, or to dispute a post written by another participant who took an opposing position. [sent-146, score-0.637]

43 We collected the discussion thread and the participant positions for each debate. [sent-147, score-0.328]

44 Table 1 shows a portion of discussion thread between three participants about enforcing a new immigration law in Arizona. [sent-155, score-0.513]

45 This means that A and B belong to the same opinion subgroup, while C belongs to an opposing subgroup. [sent-160, score-0.311]

46 3 Approach In this section, we describe a system that takes a discussion thread as input and outputs the subgroup membership of each discussant. [sent-163, score-0.487]

47 1 Thread Parsing We start by parsing the thread to identify posts, participants, and the reply structure of the thread (i. [sent-167, score-0.321]

48 2 Opinion Word Identification The next step is to identify the words that express opinion and determine their polarity (positive or negative). [sent-174, score-0.391]

49 OpinionFinder uses a large set of features to identify the contextual polarity of a given polarized word given its isolated polarity and the sentence in which it appears (Wilson et al. [sent-181, score-0.303]

50 A target could be another discussant or an entity mentioned in the discussion. [sent-188, score-0.485]

51 When the target of opinion is another discussant, either the discussant name is mentioned explicitly or a second person pronoun is used to indicate that the opinion is targeting the recipient of the post. [sent-189, score-0.908]

52 For example, in snippet (2) above the second person pronoun you indicates that the opinion word disagree is targeting Discussant 1, the recipient of the post. [sent-190, score-0.392]

53 The target of opinion can also be an entity mentioned in the discussion. [sent-191, score-0.377]

54 For example, the noun group Arizona immigration law is mentioned by Discussant 1 and Discussant 2 in snippets 1 and 2 above respectively. [sent-198, score-0.315]

55 Illegal immigration is bad. (Table caption fragment: NP Chunking in a discussion thread about the US 2012 elections.) [sent-201, score-0.204]

56 The final set of entities identified in a thread is the union of the entities identified by the two aforementioned methods. [sent-212, score-0.296]

57 Previous work has shown that For example, the following snippet contains an explicit mention of the entity Obama in the first sentence, and then uses a pronoun to refer to the same entity in the second sentence. [sent-214, score-0.249]

58 The opinion word unbeatable appears in the second sentence and is syntactically related to the pronoun He. [sent-215, score-0.258]

59 Jakob and Gurevych (2010) showed experimentally that resolving the anaphoric links in the text significantly improves opinion target extraction. [sent-223, score-0.292]

60 4 Opinion-Target Pairing At this point, we have all the opinion words and the potential targets identified separately. [sent-230, score-0.313]

61 The next step is to determine which opinion word is targeting which target. [sent-231, score-0.244]

62 An opinion word and a target form a pair if they satisfy at least one of our dependency rules. [sent-235, score-0.261]

63 The rules examine the types of the dependencies on the shortest path that connects the opinion word and the target in the dependency parse tree. [sent-237, score-0.261]
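The shortest-path step in sentences 62 and 63 can be illustrated with a small sketch. The dependency rules themselves are not given in this summary, so the code only extracts the relation labels along the shortest path between an opinion word and a candidate target. The function name `dependency_path`, the edge format, and the hand-written parse of "Arizona immigration law is good" are assumptions made for illustration, not the authors' implementation.

```python
from collections import deque

def dependency_path(edges, source, target):
    """Return the dependency labels on the shortest path between two words.

    `edges` is a list of (head, relation, dependent) triples. The tree is
    treated as undirected, and None is returned when no path exists.
    """
    graph = {}
    for head, relation, dependent in edges:
        graph.setdefault(head, []).append((dependent, relation))
        graph.setdefault(dependent, []).append((head, relation))
    if source not in graph or target not in graph:
        return None
    queue = deque([(source, [])])
    seen = {source}
    while queue:  # breadth-first search finds a shortest path first
        node, path = queue.popleft()
        if node == target:
            return path
        for neighbor, relation in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [relation]))
    return None

# Hypothetical, hand-written parse of "Arizona immigration law is good".
edges = [
    ("good", "nsubj", "law"),
    ("good", "cop", "is"),
    ("law", "nn", "immigration"),
    ("law", "nn", "Arizona"),
]
path_to_head = dependency_path(edges, "good", "law")
path_to_modifier = dependency_path(edges, "good", "Arizona")
```

A rule set could then be expressed as a collection of allowed label sequences, with an opinion-target pair accepted when its path matches one of them.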

64 If a sentence S in a post written by participant Pi contains an opinion word OPj and a target TRk, and if the opinion-target pair satisfies one of our dependency rules, we say that Pi expresses an attitude towards TRk. [sent-239, score-1.015]

65 The polarity of the attitude is determined by the polarity of OPj. [sent-240, score-0.828]

66 It is likely that the same participant Pi expresses sentiment toward the same target TRk multiple times in different sentences in different posts. [sent-242, score-0.367]

67 We keep track of the counts of all the instances of positive/negative attitude Pi expresses toward TRk. [sent-243, score-0.658]

68 We represent this as $P_i \xrightarrow[n^-]{m^+} TR_k$ where m (n) is the number of times Pi expressed positive (negative) attitude toward TRk. [sent-244, score-0.688]

69 5 Discussant Attitude Profile We propose a representation of discussants' attitudes towards the identified targets in the discussion thread. [sent-246, score-0.286]
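The bookkeeping described in sentences 66 to 68 amounts to counting positive and negative attitude instances per (participant, target) pair. A minimal sketch follows, assuming the opinion-target pairing step has already produced (participant, target, polarity) triples; the function name `tally_attitudes` and the sample triples are hypothetical.

```python
from collections import defaultdict

def tally_attitudes(instances):
    """Count attitude instances per (participant, target) pair.

    `instances` is an iterable of (participant, target, polarity) triples
    with polarity "+" or "-". The result maps each pair to (m, n), where
    m is the number of positive and n the number of negative instances.
    """
    counts = defaultdict(lambda: [0, 0])
    for participant, target, polarity in instances:
        if polarity == "+":
            counts[(participant, target)][0] += 1
        elif polarity == "-":
            counts[(participant, target)][1] += 1
        else:
            raise ValueError(f"unknown polarity: {polarity!r}")
    return {pair: tuple(mn) for pair, mn in counts.items()}

# Hypothetical instances loosely modeled on the Arizona example.
instances = [
    ("P1", "Arizona immigration law", "+"),
    ("P1", "illegal immigration", "-"),
    ("P2", "P1", "-"),
    ("P2", "Arizona immigration law", "-"),
    ("P2", "Arizona immigration law", "-"),
]
attitude_counts = tally_attitudes(instances)
```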

70 As stated above, a target could be another discussant or an entity mentioned in the discussion. [sent-247, score-0.485]

71 The values correspond to the counts of positive/negative attitudes expressed by the discussant toward each of the targets. [sent-249, score-0.401]

72 We call this vector the discussant attitude profile (DAP). [sent-250, score-0.986]

73 Given a discussion thread with d discussants and e entity targets, each attitude profile vector has n = (d + e) ∗ 3 dimensions. [sent-252, score-1.064]
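The dimensionality in sentence 73 can be checked with a small sketch that lays out one profile as a flat list. It assumes that the three values per target are the positive count, the negative count, and an interaction count; the summary mentions all three kinds of features but does not state the exact layout, so the ordering here is an assumption, and `build_dap` is a hypothetical name.

```python
def build_dap(discussant, discussants, entities, attitudes, interactions):
    """Build one discussant attitude profile (DAP) as a flat list.

    Each target (every discussant, then every entity) contributes three
    values: positive count, negative count, interaction count. The entry
    for the discussant itself is kept so the length is (d + e) * 3; it
    simply stays at zero.
    """
    vector = []
    for target in list(discussants) + list(entities):
        positive, negative = attitudes.get((discussant, target), (0, 0))
        interaction_count = interactions.get((discussant, target), 0)
        vector.extend([positive, negative, interaction_count])
    return vector

# Hypothetical thread with d = 3 discussants and e = 2 entity targets.
discussants = ["A", "B", "C"]
entities = ["Arizona immigration law", "illegal immigration"]
attitudes = {("A", "Arizona immigration law"): (2, 0), ("A", "C"): (0, 1)}
interactions = {("A", "C"): 3}
dap_a = build_dap("A", discussants, entities, attitudes, interactions)
```

With d = 3 and e = 2 the profile has (3 + 2) * 3 = 15 entries, most of them zero, which is typical when each discussant addresses only a few targets.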

74 6 Clustering At this point, we have an attitude profile (or vector) constructed for each discussant. [sent-257, score-0.628]

75 Our goal is to use these attitude profiles to determine the subgroup membership of each discussant. [sent-258, score-0.893]

76 We can achieve this goal by noticing that the attitude profiles of discussants who share the same opinion are more likely to be similar to each other than to the attitude profiles of discussants with opposing opinions. [sent-259, score-1.763]

77 This suggests that clustering the attitude vector space will achieve the goal and split the discussants into subgroups according to their opinion. [sent-260, score-0.922]

78 , 2008), each cluster is assigned the class of the majority vote within the cluster, and then the accuracy of this assignment is measured by dividing the number of correctly assigned members by the total number of instances. [sent-272, score-0.179]
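The majority-vote measure described in sentence 78 (commonly called purity) can be sketched as follows. The function name and the toy labels are illustrative assumptions.

```python
from collections import Counter

def purity(cluster_ids, gold_labels):
    """Fraction of items whose gold label is the majority label of their cluster."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    labels_by_cluster = {}
    for cluster, label in zip(cluster_ids, gold_labels):
        labels_by_cluster.setdefault(cluster, Counter())[label] += 1
    # Each cluster is credited with the size of its majority class.
    correct = sum(c.most_common(1)[0][1] for c in labels_by_cluster.values())
    return correct / len(cluster_ids)

# Toy example: cluster 0 holds two "a" and one "b"; cluster 1 holds two "b".
score = purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"])
```

Here four of the five items carry the majority label of their cluster, so the score is 4/5.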

79 The second baseline (TC) is based on the premise that the members of the same subgroup are more likely to use vocabulary drawn from the same language model. [sent-293, score-0.216]

80 We collect all the text posted by each participant and create a tf-idf representation of the text in a high dimensional vector space. [sent-294, score-0.201]
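A rough sketch of the text-clustering representation in sentence 80, assuming whitespace tokenization and the plain weighting tf × log(N / df). The summary does not say which tf-idf variant or tokenizer was used, so both are assumptions, as is the function name `tfidf_vectors`.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df) weights.

    Each element of `docs` is the concatenated text of one participant.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(docs)
    # Document frequency: number of participants whose text contains the token.
    df = Counter(token for tokens in tokenized for token in set(tokens))
    idf = {token: math.log(n_docs / df[token]) for token in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([tf[token] * idf[token] for token in vocab])
    return vocab, vectors

# Hypothetical per-participant texts.
docs = ["law is good", "law is racism"]
vocab, vectors = tfidf_vectors(docs)
```

Words used by every participant, such as "law" here, receive zero weight, which illustrates why opposing subgroups that share topic vocabulary can be hard to separate by text alone.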

81 We use k-means (MacQueen, 1967) as our clustering algorithm in this experiment (comparison of various clustering algorithms is presented in the next subsection). [sent-296, score-0.202]

82 We believe that the baselines performed poorly because the interaction frequency and the text similarity are not key factors in identifying subgroup structures. [sent-312, score-0.282]

83 Also, people in opposing subgroups tend to use very similar text when discussing the same topic and hence text clustering does not work as well. [sent-314, score-0.358]

84 2 Choice of the clustering algorithm We experimented with three different clustering algorithms: expectation maximization (EM), k-means (MacQueen, 1967), and FarthestFirst (FF) (Hochbaum and Shmoys, 1985; Dasgupta, 2002). [sent-316, score-0.202]
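Of the three algorithms named in sentence 84, farthest-first traversal is the simplest to sketch. The version below is a generic illustration using Euclidean distance, not the implementation used in the experiments; it assumes the first center is fixed at index 0 (implementations often pick it at random), and `farthest_first` is a hypothetical name.

```python
import math

def farthest_first(points, k, first=0):
    """Pick k centers by farthest-first traversal, then assign points.

    Returns (center_indices, assignment), where assignment[i] is the
    position in center_indices of the center nearest to points[i].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [first]
    # nearest[i] is the distance from point i to its closest chosen center.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        farthest = max(range(len(points)), key=nearest.__getitem__)
        centers.append(farthest)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[farthest]))
    assignment = [
        min(range(k), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, assignment

# Toy attitude vectors forming two well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, assignment = farthest_first(points, k=2)
```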

85 We also experimented with using Manhattan distance and cosine similarity instead of Euclidean distance to measure the distance between attitude vectors. [sent-321, score-0.578]
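For reference, the three measures named in sentence 85 can be written as below. Cosine is a similarity rather than a distance, so a clustering routine would typically use 1 minus the similarity; treating a zero vector as having similarity 0 is a convention chosen for this sketch.

```python
import math

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Cosine similarity ignores vector length, so a discussant who posts often and one who posts rarely can still look alike if their attitude counts are proportional.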

86 2) We run the system and include only discussant-to-discussant attitude features in the attitude vectors (DAPC-DD). [sent-327, score-1.187]

87 3) We include only discussant-to-entity attitude features in the attitude vectors (DAPC-DE). [sent-328, score-1.187]

88 4) We include only sentiment features in the attitude vector; i. [sent-329, score-0.693]

89 5) We include only interaction count features in the attitude vector; i. [sent-332, score-0.644]

90 7) We only use named entity recognition to identify entity targets; i. [sent-336, score-0.211]

91 8) Finally, we only use noun phrase chunking to identify entity targets (DAPC-NP). [sent-339, score-0.273]

92 We also notice that the performance drops significantly in DAPC-DD and DAPC-DE, which also supports our hypotheses that both the sentiment discussants show toward one another and the sentiment they show toward the aspects of the discussed topic are important for the task. [sent-345, score-0.542]

93 Finally, the results support the findings of Jakob and Gurevych (2010) that anaphora resolution aids opinion mining systems. [sent-347, score-0.34]

94 5 Conclusions In this paper, we presented an approach for subgroup detection in ideological discussions. [sent-348, score-0.28]

95 Our system uses linguistic analysis techniques to identify the attitude the participants of online discussions carry toward each other and toward the aspects of the discussion topic. [sent-349, score-0.355]

96 Attitude prediction as well as interaction frequency are used to construct an attitude vector for each participant. [sent-350, score-0.681]

97 The attitude vectors of discussants are then clustered to form subgroups. [sent-351, score-0.725]

98 All statements of fact, opinion or conclusions contained herein are those of the authors and should not be construed as representing the official views or policies of IARPA, the ODNI or the U. [sent-357, score-0.213]

99 Using anaphora resolution to improve opinion target identification in movie reviews. [sent-449, score-0.346]

100 Towards answering opinion questions: separating facts from opinions and identifying the polarity of opinion sentences. [sent-571, score-0.551]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('attitude', 0.578), ('discussant', 0.321), ('subgroup', 0.216), ('opinion', 0.213), ('immigration', 0.145), ('thread', 0.134), ('polarity', 0.125), ('participant', 0.124), ('discussants', 0.116), ('sentiment', 0.115), ('arizona', 0.113), ('clustering', 0.101), ('opposing', 0.098), ('law', 0.092), ('subgroups', 0.09), ('toward', 0.08), ('entity', 0.079), ('members', 0.076), ('purity', 0.075), ('participants', 0.072), ('trk', 0.07), ('discussion', 0.07), ('membership', 0.067), ('interaction', 0.066), ('janyce', 0.065), ('ideological', 0.064), ('targets', 0.064), ('posts', 0.063), ('hassan', 0.062), ('threads', 0.062), ('writer', 0.062), ('stance', 0.06), ('debate', 0.059), ('subjectivity', 0.059), ('pi', 0.057), ('disagree', 0.057), ('dap', 0.056), ('vote', 0.054), ('identify', 0.053), ('wiebe', 0.052), ('post', 0.052), ('hatzivassiloglou', 0.052), ('poll', 0.051), ('debates', 0.05), ('profile', 0.05), ('anaphora', 0.05), ('cluster', 0.049), ('createdebate', 0.048), ('politicalforum', 0.048), ('target', 0.048), ('radev', 0.048), ('dragomir', 0.048), ('snippet', 0.046), ('entities', 0.045), ('orientation', 0.045), ('pronoun', 0.045), ('claire', 0.042), ('illegal', 0.042), ('disputed', 0.042), ('opj', 0.042), ('mining', 0.042), ('noun', 0.041), ('posted', 0.04), ('amjad', 0.039), ('opinionfinder', 0.039), ('mentioned', 0.037), ('vector', 0.037), ('topic', 0.036), ('product', 0.036), ('chunking', 0.036), ('jakob', 0.036), ('subsection', 0.036), ('wilson', 0.036), ('identified', 0.036), ('regarding', 0.036), ('pages', 0.035), ('resolution', 0.035), ('hu', 0.035), ('negative', 0.035), ('people', 0.033), ('vasileios', 0.032), ('profiles', 0.032), ('abujbara', 0.032), ('blatant', 0.032), ('clairlib', 0.032), ('dapc', 0.032), ('eculidean', 0.032), ('grover', 0.032), ('hochbaum', 0.032), ('isupport', 0.032), ('racism', 0.032), ('vectors', 0.031), ('political', 0.031), ('entropy', 0.031), ('gurevych', 0.031), ('somasundaran', 0.031), ('anaphoric', 0.031), ('targeting', 0.031), ('ny', 0.031), ('positive', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999994 187 acl-2012-Subgroup Detection in Ideological Discussions

2 0.82601637 188 acl-2012-Subgroup Detector: A System for Detecting Subgroups in Online Discussions

Author: Amjad Abu-Jbara ; Dragomir Radev

Abstract: We present Subgroup Detector, a system for analyzing threaded discussions and identifying the attitude of discussants towards one another and towards the discussion topic. The system uses attitude predictions to detect the split of discussants into subgroups of opposing views. The system uses an unsupervised approach based on rule-based opinion target detecting and unsupervised clustering techniques. The system is open source and is freely available for download. An online demo of the system is available at: http://clair.eecs.umich.edu/SubgroupDetector/

3 0.60676849 102 acl-2012-Genre Independent Subgroup Detection in Online Discussion Threads: A Study of Implicit Attitude using Textual Latent Semantics

Author: Pradeep Dasigi ; Weiwei Guo ; Mona Diab

Abstract: We describe an unsupervised approach to the problem of automatically detecting subgroups of people holding similar opinions in a discussion thread. An intuitive way of identifying this is to detect the attitudes of discussants towards each other or named entities or topics mentioned in the discussion. Sentiment tags play an important role in this detection, but we also note another dimension to the detection of people’s attitudes in a discussion: if two persons share the same opinion, they tend to use similar language content. We consider the latter to be an implicit attitude. In this paper, we investigate the impact of implicit and explicit attitude in two genres of social media discussion data, more formal wikipedia discussions and a debate discussion forum that is much more informal. Experimental results strongly suggest that implicit attitude is an important complement for explicit attitudes (expressed via sentiment) and it can improve the sub-group detection performance independent of genre.

4 0.15293391 161 acl-2012-Polarity Consistency Checking for Sentiment Dictionaries

Author: Eduard Dragut ; Hong Wang ; Clement Yu ; Prasad Sistla ; Weiyi Meng

Abstract: Polarity classification of words is important for applications such as Opinion Mining and Sentiment Analysis. A number of sentiment word/sense dictionaries have been manually or (semi)automatically constructed. The dictionaries have substantial inaccuracies. Besides obvious instances, where the same word appears with different polarities in different dictionaries, the dictionaries exhibit complex cases, which cannot be detected by mere manual inspection. We introduce the concept of polarity consistency of words/senses in sentiment dictionaries in this paper. We show that the consistency problem is NP-complete. We reduce the polarity consistency problem to the satisfiability problem and utilize a fast SAT solver to detect inconsistencies in a sentiment dictionary. We perform experiments on four sentiment dictionaries and WordNet.

5 0.14771245 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu

Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the performance of supervised methods highly relies on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic- lexicon co-extraction in a domain of interest where we do not require any labeled data, but have lots of labeled data in another related domain. The framework is twofold. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.

6 0.12722495 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System

7 0.11945499 151 acl-2012-Multilingual Subjectivity and Sentiment Analysis

8 0.11631646 115 acl-2012-Identifying High-Impact Sub-Structures for Convolution Kernels in Document-level Sentiment Classification

9 0.11198985 100 acl-2012-Fine Granular Aspect Analysis using Latent Structural Models

10 0.10576659 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification

11 0.099331066 177 acl-2012-Sentence Dependency Tagging in Online Question Answering Forums

12 0.097729504 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle

13 0.093353279 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation

14 0.087695971 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling

15 0.085323505 37 acl-2012-Baselines and Bigrams: Simple, Good Sentiment and Topic Classification

16 0.083170407 144 acl-2012-Modeling Review Comments

17 0.064721696 58 acl-2012-Coreference Semantics from Web Features

18 0.064577371 171 acl-2012-SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations

19 0.063701071 150 acl-2012-Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia

20 0.062031932 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.22), (1, 0.314), (2, 0.25), (3, -0.335), (4, 0.352), (5, -0.065), (6, -0.341), (7, -0.073), (8, 0.459), (9, 0.061), (10, -0.158), (11, -0.02), (12, 0.048), (13, 0.119), (14, 0.037), (15, 0.013), (16, -0.016), (17, -0.012), (18, 0.048), (19, 0.01), (20, -0.008), (21, -0.021), (22, 0.016), (23, 0.032), (24, 0.016), (25, -0.01), (26, 0.027), (27, 0.014), (28, 0.003), (29, -0.012), (30, -0.022), (31, 0.011), (32, 0.008), (33, -0.037), (34, -0.016), (35, 0.002), (36, -0.012), (37, 0.005), (38, 0.004), (39, -0.003), (40, 0.006), (41, 0.02), (42, -0.013), (43, -0.003), (44, 0.0), (45, 0.006), (46, 0.015), (47, -0.021), (48, 0.015), (49, -0.011)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.96122462 188 acl-2012-Subgroup Detector: A System for Detecting Subgroups in Online Discussions

Author: Amjad Abu-Jbara ; Dragomir Radev

Abstract: We present Subgroup Detector, a system for analyzing threaded discussions and identifying the attitude of discussants towards one another and towards the discussion topic. The system uses attitude predictions to detect the split of discussants into subgroups of opposing views. The system uses an unsupervised approach based on rule-based opinion target detecting and unsupervised clustering techniques. The system is open source and is freely available for download. An online demo of the system is available at: http://clair.eecs.umich.edu/SubgroupDetector/

same-paper 2 0.93791717 187 acl-2012-Subgroup Detection in Ideological Discussions

3 0.9169513 102 acl-2012-Genre Independent Subgroup Detection in Online Discussion Threads: A Study of Implicit Attitude using Textual Latent Semantics

Author: Pradeep Dasigi ; Weiwei Guo ; Mona Diab

Abstract: We describe an unsupervised approach to the problem of automatically detecting subgroups of people holding similar opinions in a discussion thread. An intuitive way of identifying this is to detect the attitudes of discussants towards each other or named entities or topics mentioned in the discussion. Sentiment tags play an important role in this detection, but we also note another dimension to the detection of people’s attitudes in a discussion: if two persons share the same opinion, they tend to use similar language content. We consider the latter to be an implicit attitude. In this paper, we investigate the impact of implicit and explicit attitude in two genres of social media discussion data, more formal wikipedia discussions and a debate discussion forum that is much more informal. Experimental results strongly suggest that implicit attitude is an important complement for explicit attitudes (expressed via sentiment) and it can improve the sub-group detection performance independent of genre.

4 0.37764388 161 acl-2012-Polarity Consistency Checking for Sentiment Dictionaries

Author: Eduard Dragut ; Hong Wang ; Clement Yu ; Prasad Sistla ; Weiyi Meng

Abstract: Polarity classification of words is important for applications such as Opinion Mining and Sentiment Analysis. A number of sentiment word/sense dictionaries have been manually or (semi)automatically constructed. The dictionaries have substantial inaccuracies. Besides obvious instances, where the same word appears with different polarities in different dictionaries, the dictionaries exhibit complex cases, which cannot be detected by mere manual inspection. We introduce the concept of polarity consistency of words/senses in sentiment dictionaries in this paper. We show that the consistency problem is NP-complete. We reduce the polarity consistency problem to the satisfiability problem and utilize a fast SAT solver to detect inconsistencies in a sentiment dictionary. We perform experiments on four sentiment dictionaries and WordNet.

5 0.30900115 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System

Author: Wen-Tai Hsieh ; Chen-Ming Wu ; Tsun Ku ; Seng-cho T. Chou

Abstract: Social Event Radar is a new social networking-based service platform, that aim to alert as well as monitor any merchandise flaws, food-safety related issues, unexpected eruption of diseases or campaign issues towards to the Government, enterprises of any kind or election parties, through keyword expansion detection module, using bilingual sentiment opinion analysis tool kit to conclude the specific event social dashboard and deliver the outcome helping authorities to plan “risk control” strategy. With the rapid development of social network, people can now easily publish their opinions on the Internet. On the other hand, people can also obtain various opinions from others in a few seconds even though they do not know each other. A typical approach to obtain required information is to use a search engine with some relevant keywords. We thus take the social media and forum as our major data source and aim at collecting specific issues efficiently and effectively in this work.

6 0.24606901 115 acl-2012-Identifying High-Impact Sub-Structures for Convolution Kernels in Document-level Sentiment Classification

7 0.24093577 37 acl-2012-Baselines and Bigrams: Simple, Good Sentiment and Topic Classification

8 0.23673742 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

9 0.23303962 151 acl-2012-Multilingual Subjectivity and Sentiment Analysis

10 0.21406905 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle

11 0.20920105 195 acl-2012-The Creation of a Corpus of English Metalanguage

12 0.20417553 6 acl-2012-A Comprehensive Gold Standard for the Enron Organizational Hierarchy

13 0.20361486 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information

14 0.20099001 120 acl-2012-Information-theoretic Multi-view Domain Adaptation

15 0.19895293 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation

16 0.19263491 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification

17 0.18289587 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs

18 0.17630374 100 acl-2012-Fine Granular Aspect Analysis using Latent Structural Models

19 0.17302734 58 acl-2012-Coreference Semantics from Web Features

20 0.17229541 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.016), (25, 0.015), (26, 0.137), (28, 0.036), (30, 0.029), (37, 0.039), (39, 0.075), (57, 0.014), (59, 0.013), (74, 0.011), (82, 0.204), (84, 0.019), (85, 0.019), (90, 0.105), (91, 0.027), (92, 0.062), (94, 0.012), (99, 0.06)]
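
The topic list above is a sparse vector of (topicId, topicWeight) pairs. The page does not say how simValue is derived from such vectors, so the following is only a minimal sketch, assuming cosine similarity as the measure; the function name `cosine_similarity` and the second vector `other_topics` are hypothetical and used for illustration only. The same-paper entry in the list below has a simValue under 1, so the site's actual computation evidently differs from this sketch.

```python
import math

# Topic vector for this paper, copied from the listing above.
paper_topics = [
    (0, 0.016), (25, 0.015), (26, 0.137), (28, 0.036), (30, 0.029),
    (37, 0.039), (39, 0.075), (57, 0.014), (59, 0.013), (74, 0.011),
    (82, 0.204), (84, 0.019), (85, 0.019), (90, 0.105), (91, 0.027),
    (92, 0.062), (94, 0.012), (99, 0.06),
]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse (topicId, weight) lists.

    Assumption: cosine is used here only as an example measure; the
    listing does not state which measure produced its simValue column.
    """
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb.get(t, 0.0) for t, w in wa.items())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic vector for another paper (made-up illustration values).
other_topics = [(26, 0.2), (82, 0.1), (90, 0.05)]

print(cosine_similarity(paper_topics, other_topics))
```

Storing only the non-zero topics keeps the comparison cheap, since topics absent from either vector add nothing to the dot product.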

similar papers list:

simIndex simValue paperId paperTitle

1 0.88798982 188 acl-2012-Subgroup Detector: A System for Detecting Subgroups in Online Discussions

Author: Amjad Abu-Jbara ; Dragomir Radev

Abstract: We present Subgroup Detector, a system for analyzing threaded discussions and identifying the attitude of discussants towards one another and towards the discussion topic. The system uses attitude predictions to detect the split of discussants into subgroups of opposing views. The system uses an unsupervised approach based on rule-based opinion target detecting and unsupervised clustering techniques. The system is open source and is freely available for download. An online demo of the system is available at: http://clair.eecs.umich.edu/SubgroupDetector/

2 0.87699538 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures

Author: Oleksandr Kolomiyets ; Steven Bethard ; Marie-Francine Moens

Abstract: We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations like BEFORE, AFTER, OVERLAP and IDENTITY. We annotate a corpus of children’s stories with temporal dependency trees, achieving agreement (Krippendorff’s Alpha) of 0.856 on the event words, 0.822 on the links between events, and of 0.700 on the ordering relation labels. We compare two parsing models for temporal dependency structures, and show that a deterministic non-projective dependency parser outperforms a graph-based maximum spanning tree parser, achieving labeled attachment accuracy of 0.647 and labeled tree edit distance of 0.596. Our analysis of the dependency parser errors gives some insights into future research directions.

same-paper 3 0.84837812 187 acl-2012-Subgroup Detection in Ideological Discussions

Author: Amjad Abu-Jbara ; Pradeep Dasigi ; Mona Diab ; Dragomir Radev

Abstract: The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.

4 0.84692246 12 acl-2012-A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction

Author: Seokhwan Kim ; Gary Geunbae Lee

Abstract: Although researchers have conducted extensive studies on relation extraction in the last decade, supervised approaches are still limited because they require large amounts of training data to achieve high performances. To build a relation extractor without significant annotation effort, we can exploit cross-lingual annotation projection, which leverages parallel corpora as external resources for supervision. This paper proposes a novel graph-based projection approach and demonstrates the merits of it by using a Korean relation extraction system based on projected dataset from an English-Korean parallel corpus.

5 0.84598565 57 acl-2012-Concept-to-text Generation via Discriminative Reranking

Author: Ioannis Konstas ; Mirella Lapata

Abstract: This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem. We define a probabilistic context-free grammar that describes the structure of the input (a corpus of database records and text describing some of them) and represent it compactly as a weighted hypergraph. The hypergraph structure encodes exponentially many derivations, which we rerank discriminatively using local and global features. We propose a novel decoding algorithm for finding the best scoring derivation and generating in this setting. Experimental evaluation on the ATIS domain shows that our model outperforms a competitive discriminative system both using BLEU and in a judgment elicitation study.

6 0.75815928 91 acl-2012-Extracting and modeling durations for habits and events from Twitter

7 0.73446286 191 acl-2012-Temporally Anchored Relation Extraction

8 0.73231733 102 acl-2012-Genre Independent Subgroup Detection in Online Discussion Threads: A Study of Implicit Attitude using Textual Latent Semantics

9 0.70196587 41 acl-2012-Bootstrapping a Unified Model of Lexical and Phonetic Acquisition

10 0.69496608 209 acl-2012-Unsupervised Semantic Role Induction with Global Role Ordering

11 0.69463402 43 acl-2012-Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench

12 0.68872428 31 acl-2012-Authorship Attribution with Author-aware Topic Models

13 0.6860159 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base

14 0.6823889 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive

15 0.68139488 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction

16 0.67303503 99 acl-2012-Finding Salient Dates for Building Thematic Timelines

17 0.67178988 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers

18 0.66494983 134 acl-2012-Learning to Find Translations and Transliterations on the Web

19 0.66063213 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars

20 0.6602065 139 acl-2012-MIX Is Not a Tree-Adjoining Language