
22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval


Source: pdf

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. [sent-2, score-1.263]

2 A fundamental challenge in opinion retrieval is information representation. [sent-3, score-1.012]

3 However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. [sent-5, score-0.977]

4 In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. [sent-8, score-0.898]

5 tion, opinion extraction, opinion question answering, and opinion summarization, etc. [sent-18, score-2.421]

6 In this paper, we focus on opinion retrieval, whose goal is to find a set of documents containing not only the query keyword(s) but also the relevant opinions. [sent-24, score-1.088]

7 This requirement brings about the challenge on how to represent information needs for effective opinion retrieval. [sent-25, score-0.833]

8 In the second stage, an opinion score is generated for each relevant document (Macdonald and Ounis, 2007; Oard et al. [sent-30, score-1.039]

9 The opinion score can be acquired by either machine learning-based sentiment classifiers, such as SVM (Zhang and Yu, 2007), or a sentiment lexicon with weighted scores from training documents (Amati et al. [sent-32, score-1.471]

10 This representation, however, can only ensure that there is at least one opinion in each relevant document, but it cannot determine the relevance pairing of individual opinion to its target. [sent-40, score-1.836]

11 This may result in possible mismatch between an opinion and a target and in turn affects opinion retrieval performance. [sent-44, score-1.857]

12 bag-of-word, cannot satisfy the information needs for opinion retrieval. [sent-51, score-0.833]

13 In this paper, we propose to handle opinion retrieval in the granularity of sentence. [sent-52, score-0.863]

14 It is observed that a complete opinion is always expressed in one sentence, and the relevant target of the opinion is mostly the one found in it. [sent-53, score-1.77]

15 Therefore, it is crucial to maintain the associative information between an opinion and its target within a sentence. [sent-54, score-0.977]

16 Finally, we combine both intra-sentence and inter-sentence contextual information to construct a unified undirected graph to achieve effective opinion retrieval. [sent-63, score-0.962]

17 Section 3 presents a novel unified graph-based model for opinion retrieval. [sent-66, score-0.873]

18 We review related works on opinion retrieval in Section 5. [sent-68, score-1.012]

19 2 Motivation In this section, we start by briefly describing the objective of opinion retrieval. [sent-70, score-0.807]

20 We then illustrate the limitations of current opinion retrieval approaches, and analyze the motivation of our method. [sent-71, score-1.031]

21 1 Formal Description of Problem Opinion retrieval was first presented in the TREC 2006 Blog track, and the objective is to retrieve documents that express an opinion about a given target. [sent-73, score-1.14]

22 The opinion target can be a “traditional” named entity (e. [sent-74, score-0.845]

23 The topic of the document is not required to be the same as the target, but an opinion about the target has to be presented in the document or one of the comments to the document (Macdonald and Ounis, 2006). [sent-82, score-1.125]

24 Therefore, in this paper we regard the information needs for opinion retrieval as relevant opinion. [sent-83, score-1.156]

25 However, in opinion retrieval, the information need targets relevant opinions, and this renders bag-of-word representation ineffective. [sent-86, score-0.983]

26 According to the conventional 2-stage opinion retrieval approach, di is represented by a bag-of-word. [sent-90, score-1.036]

27 Although bag-of-word representation achieves good performance in retrieving relevant documents, our study shows that it cannot satisfy the information needs for retrieval of relevant opinion. [sent-204, score-0.487]

28 It suffers from the following limitations: (1) It cannot maintain contextual information; thus, the fact that an opinion may not be related to the target of the retrieved document is neglected. [sent-205, score-1.066]

29 In this example, only the opinion favorite (o2) on Avatar in C is the relevant opinion. [sent-206, score-0.982]

30 But due to loss of contextual information between the opinion and its corresponding target, Avatar in A and comfortable (o1) are also mistakenly regarded as a relevant opinion, creating a false positive. [sent-207, score-1.819]

31 In reality comfortable (o1) describes “the seats in IMAX”, which is an irrelevant opinion, and sentence A is a factual statement rather than an opinion statement. [sent-208, score-0.863]

32 (a) (b) Figure 2: Two kinds of information representation of opinion retrieval. [sent-209, score-0.827]

33 Suppose there is another document including sentence C which expresses the same opinion on Avatar. [sent-211, score-0.919]

34 In this paper, we process opinion retrieval in the granularity of sentence as we observe that a complete opinion always exists within a sentence (refer to Figure 2 (b)). [sent-214, score-1.898]

35 To represent a relevant opinion, we define the notion of topic-sentiment word pair, which consists of a topic term and a sentiment word. [sent-215, score-0.553]

36 A word pair maintains the associative information between the two words, and enables systems to draw up the relationship among all the sentences with the same opinion on an identical target. [sent-216, score-0.986]

37 Furthermore, based on word pairs, we designed a unified graph-based method for opinion retrieval (see later in Section 3). [sent-218, score-1.108]

38 1 Graph-based model Basic Idea Different from existing approaches which simply make use of document relevance to reflect the relevance of opinions embedded in them, our approach is more concerned with identifying the relevance of individual opinions. [sent-220, score-0.516]

39 Intuitively, we believed that the more relevant opinions appear in a document, the more relevant is that document for subsequent opinion analysis operations. [sent-221, score-1.226]

40 Further, since the lexical scope of an opinion does not usually go beyond a sentence, we propose to handle opinion retrieval in the granularity of sentence. [sent-222, score-1.85]

41 determines relevance by the query term matching, and the sentiment word from ? [sent-317, score-0.498]

42 We use the word pair to maintain the associative information between the topic term and the opinion word (also referred to as sentiment word). [sent-320, score-1.442]

43 The word pair is used to identify a relevant opinion in a sentence. [sent-321, score-0.993]

44 Avatar in C, is a topic term relevant to the query, and o2 (‘favorite’) is supposed to be an opinion; and the word pair < t1, o2> indicates sentence C contains a relevant opinion. [sent-324, score-0.479]

45 In practice, not all word pairs carry equal weights to express a relevant opinion, as the contribution of an opinion word differs across target topics, and vice versa. [sent-355, score-1.917]

46 For example, the word pair < t1, o2> should be more probable as a relevant opinion than < t1, o1>. [sent-356, score-0.993]

47 We believe that the more a word pair appears the higher should be the weight between the opinion and the target in the context. [sent-359, score-0.946]

48 2 HITS Model We propose an opinion retrieval model based on HITS, a popular graph ranking algorithm (Kleinberg, 1999). [sent-362, score-1.04]

49 In Figure 3, we can see that the word pair that has links to many documents can be assigned a high weight to denote a strong associative degree between the topic term and a sentiment word, and it likely expresses a relevant opinion. [sent-398, score-0.842]

50 4 Experiment We performed the experiments on the Chinese benchmark dataset to verify our proposed approach for opinion retrieval. [sent-947, score-0.836]

51 To demonstrate the effectiveness of our opinion retrieval model, we compared its performance with that of other approaches. [sent-950, score-0.832]

52 COAE dataset is the benchmark data set for the opinion retrieval track in the Chinese Opinion Analysis Evaluation (COAE) workshop, consisting of blogs and reviews. [sent-958, score-1.065]

53 Since polarity is not considered, all relevant documents with opinion are classified into the same level. [sent-961, score-1.025]

54 It contains 1,836 positive sentiment words, 3,730 positive comments, 1,254 negative sentiment words and 3,116 negative comment words. [sent-966, score-0.508]

55 (2) Doc: The 2-stage document-based opinion retrieval model was adopted. [sent-1009, score-1.012]

56 The model used sentiment lexicon-based method for opinion identification and a conventional information retrieval method for relevance detection. [sent-1010, score-1.394]

57 (4) ROCC: This model was similar to ROSC, but it considered the factor of sentence and regarded the count of relevant opinionated sentence to be the opinion score (Zhang and Yu, 2007). [sent-1013, score-1.089]

58 (5) GORM: our proposed graph-based opinion retrieval model. [sent-1015, score-1.012]

59 Most of the above models were originally designed for opinion retrieval in English, and we re-designed them to handle Chinese opinionated documents. [sent-1021, score-1.076]

60 In order to solve this problem, we extracted the topic term with the highest relevance weight in the sentence to form word pairs, so that it reduces the impact on the topic terms in common. [sent-1039, score-0.499]

61 ‘指环王’ (The Lord of the Rings), there were only 8 relevant documents without any opinion and 14 documents with relevant opinions. [sent-1043, score-1.243]

62 Furthermore, since word pairs can indicate relevant opinions effectively, it is worth further study on how they could be applied to other opinion oriented applications, e. [sent-1049, score-1.121]

63 5 Related Work Our research focuses on relevant opinion rather than on relevant document retrieval. [sent-1052, score-1.131]

64 We, therefore, review related works in opinion identification research. [sent-1053, score-0.807]

65 Furthermore, we do not support the conventional 2-stage opinion retrieval approach. [sent-1054, score-1.036]

66 We conducted literature review on unified opinion retrieval models and related work in this area is presented in the section. [sent-1055, score-1.078]

67 1 Lexicon-based Opinion Identification Different from traditional IR, opinion retrieval focuses on the opinion nature of documents. [sent-1057, score-1.819]

68 During the last three years, NTCIR and TREC evaluations have shown that sentiment lexicon-based methods led to good performance in opinion identification. [sent-1058, score-1.061]

69 In this method, the distribution of terms in relevant opinionated documents was compared to their distribution in relevant fact-based documents to calculate an opinion weight. [sent-1061, score-1.307]

70 These weights were used to compute opinion scores for each retrieved document. [sent-1062, score-0.866]

71 This dictionary was submitted as a query to a search engine to get an initial query-independent opinion score of all retrieved documents. [sent-1065, score-0.944]

72 Similarly, a pseudo opinionated word composed of all opinion words was first created, and then used to estimate the opinion score of a document (Na et al. [sent-1066, score-1.846]

73 In our approach, we also adopt sentiment lexicon-based method for opinion identification. [sent-1071, score-1.061]

74 Unlike the above methods, we generate a weight to a sentiment word for each target (associated topic term) rather than assign a unified weight or an equal weight to the sentiment word for the whole topics. [sent-1072, score-0.875]

75 2 Unified Opinion Retrieval Model In addition to conventional 2-stage approach, there has been some research on unified opinion retrieval models. [sent-1076, score-1.102]

76 Eguchi and Lavrenko proposed an opinion retrieval model in the framework of generative language modeling (Eguchi and Lavrenko, 2006). [sent-1077, score-1.012]

77 The sentiment was either represented by a group of predefined seed words, or extracted from a training sentiment corpus. [sent-1079, score-0.508]

78 tried to build a fine-grained opinion retrieval system for consumer products (Mei et al. [sent-1082, score-1.012]

79 The opinion score for a product was a mixture of several facets. [sent-1084, score-0.852]

80 Zhang and Ye proposed a generative model to unify topic relevance and opinion generation (Zhang and Ye, 2008). [sent-1086, score-1.036]

81 This model led to satisfactory performance, but an intensive computation load was inevitable during retrieval, since for each possible candidate document, an opinion score was summed up from the generative probability of thousands of sentiment words. [sent-1087, score-1.087]

82 Huang and Croft proposed a unified opinion retrieval model according to the Kullback-Leibler divergence between the two probability distributions of opinion relevance model and document model (Huang and Croft, 2009). [sent-1088, score-2.077]

83 They divided the sentiment words into query-dependent and query-independent by utilizing several sentiment expansion techniques, and integrated them into a mixed model. [sent-1089, score-0.531]

84 Different from the above opinion retrieval approaches, our proposed graph-based model processes opinion retrieval in the granularity of sentence. [sent-1092, score-2.055]

85 On the one hand, word pair can identify the relevant opinion according to intra-sentence contextual information. [sent-1094, score-1.054]

86 On the other hand, it can measure the degree of a relevant opinion by considering the inter-sentence contextual information. [sent-1095, score-1.015]

87 6 Conclusion and Future Work In this work we focus on the problem of opinion retrieval. [sent-1096, score-0.807]

88 Different from existing approaches, which regard document relevance as the key indicator of opinion relevance, we propose to explore the relevance of individual opinion. [sent-1097, score-1.124]

89 To do that, opinion retrieval is performed in the granularity of sentence. [sent-1098, score-1.043]

90 We define the notion of word pair, which can not only maintain the association between the opinion and the corresponding target in the sentence, but it can also build up the relationship among sentences through the same word pair. [sent-1099, score-0.97]

91 Furthermore, we convert the relationships between word pairs and sentences into a unified graph, and use the HITS algorithm to achieve document ranking for opinion retrieval. [sent-1100, score-1.03]

92 The novelty of our work lies in using word pairs to represent the information needs for opinion retrieval. [sent-1103, score-0.902]

93 On the one hand, word pairs can identify the relevant opinion according to intra-sentence contextual information. [sent-1104, score-1.055]

94 On the other hand, word pairs can measure the degree of a relevant opinion by taking inter-sentence contextual information into consideration. [sent-1105, score-1.084]

95 With the help of word pairs, the information needs for opinion retrieval can be represented appropriately. [sent-1106, score-1.068]

96 In the future, more research is required in the following directions: (1) Since word pairs can indicate relevant opinions effectively, it is worth further study on how they could be applied to other opinion oriented applications, e. [sent-1107, score-1.121]

97 (3) Opinion holder is another important role of an opinion, and the identification of opinion holder is a main task in NTCIR. [sent-1113, score-0.849]

98 It would be interesting to study opinion holders, e. [sent-1114, score-0.807]

99 Opinion observer: Analyzing and comparing opinions on the web. [sent-1164, score-0.807]

100 A generation model to unify topic relevance and lexicon-based sentiment for opinion retrieval. [sent-1218, score-1.29]
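Sentences 48 and 49 above describe ranking with HITS over links between word pairs and documents. The following is a minimal sketch of that idea under assumptions the excerpt does not confirm: word pairs are treated as hubs and documents as authorities, edge weights are simple co-occurrence counts, and the function name `hits_bipartite` and the toy edges are hypothetical. It is plain HITS with L2 normalization, not the authors' GORM implementation.

```python
from collections import defaultdict
from math import sqrt


def hits_bipartite(edges, iterations=50):
    """Run HITS on a bipartite graph of (word_pair, doc_id, weight) edges.

    Assumption: word pairs play the hub role and documents the authority role.
    Returns two dicts: hub scores per word pair, authority scores per document.
    """
    edges = list(edges)  # the edges are traversed several times
    pairs = {p for p, _, _ in edges}
    docs = {d for _, d, _ in edges}
    hub = {p: 1.0 for p in pairs}
    auth = {d: 1.0 for d in docs}

    for _ in range(iterations):
        # Authority update: a document collects the scores of its word pairs.
        new_auth = defaultdict(float)
        for p, d, w in edges:
            new_auth[d] += w * hub[p]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {d: new_auth[d] / norm for d in docs}

        # Hub update: a word pair collects the scores of its documents.
        new_hub = defaultdict(float)
        for p, d, w in edges:
            new_hub[p] += w * auth[d]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: new_hub[p] / norm for p in pairs}

    return hub, auth


if __name__ == "__main__":
    # Toy data (hypothetical): <avatar, favorite> occurs in two documents,
    # <seats, comfortable> in only one.
    toy_edges = [
        (("avatar", "favorite"), "d1", 1.0),
        (("avatar", "favorite"), "d2", 1.0),
        (("seats", "comfortable"), "d1", 1.0),
    ]
    hub_scores, auth_scores = hits_bipartite(toy_edges)
    for doc, score in sorted(auth_scores.items(), key=lambda kv: -kv[1]):
        print(doc, round(score, 3))
```

In this toy graph the word pair linked to more documents ends up with the higher hub score, which matches the intuition stated in sentence 49. A real system would also need the query-dependent weighting of word pairs mentioned in sentences 45 to 47, which this sketch leaves out.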


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('opinion', 0.807), ('sentiment', 0.254), ('retrieval', 0.205), ('relevant', 0.118), ('avatar', 0.107), ('topic', 0.104), ('relevance', 0.104), ('documents', 0.1), ('opinions', 0.095), ('associative', 0.089), ('document', 0.088), ('gorm', 0.085), ('hub', 0.085), ('blog', 0.081), ('trec', 0.067), ('unified', 0.066), ('opinionated', 0.064), ('query', 0.063), ('contextual', 0.061), ('favorite', 0.057), ('authority', 0.053), ('macdonald', 0.053), ('chinese', 0.052), ('term', 0.047), ('maintain', 0.043), ('rosc', 0.043), ('pairs', 0.039), ('xu', 0.039), ('target', 0.038), ('pair', 0.038), ('hits', 0.037), ('iadh', 0.037), ('eguchi', 0.037), ('hannah', 0.037), ('hubs', 0.037), ('lexicon', 0.034), ('amati', 0.034), ('oard', 0.034), ('sigir', 0.034), ('weight', 0.033), ('median', 0.032), ('comfortable', 0.032), ('oriented', 0.032), ('map', 0.032), ('granularity', 0.031), ('word', 0.03), ('scores', 0.03), ('huang', 0.029), ('degree', 0.029), ('craig', 0.029), ('retrieved', 0.029), ('benchmark', 0.029), ('coae', 0.028), ('cuhk', 0.028), ('hongbo', 0.028), ('xuanjing', 0.028), ('graph', 0.028), ('mei', 0.028), ('erkan', 0.028), ('bipartite', 0.028), ('express', 0.028), ('zhang', 0.028), ('queries', 0.027), ('entry', 0.027), ('score', 0.026), ('regarded', 0.026), ('needs', 0.026), ('gunes', 0.025), ('tsunami', 0.025), ('trieval', 0.025), ('yeha', 0.025), ('sentence', 0.024), ('page', 0.024), ('iteration', 0.024), ('conventional', 0.024), ('blogs', 0.024), ('croft', 0.024), ('pseudo', 0.024), ('otterbacher', 0.023), ('expansion', 0.023), ('relationship', 0.022), ('facets', 0.021), ('unify', 0.021), ('ounis', 0.021), ('holder', 0.021), ('existing', 0.021), ('ranked', 0.021), ('na', 0.02), ('enterprise', 0.02), ('nam', 0.02), ('influenced', 0.02), ('contribution', 0.02), ('representation', 0.02), ('computed', 0.019), ('motivation', 0.019), ('mixture', 0.019), ('lavrenko', 0.019), ('submitted', 0.019), ('achieved', 0.018), ('iterative', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999946 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

2 0.55464411 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

Author: Niklas Jakob ; Iryna Gurevych

Abstract: unknown-abstract

3 0.453908 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

Author: Valentin Jijkoun ; Maarten de Rijke ; Wouter Weerkamp

Abstract: We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

4 0.4330577 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

Author: Yejin Choi ; Claire Cardie

Abstract: Automatic opinion recognition involves a number of related tasks, such as identifying the boundaries of opinion expression, determining their polarity, and determining their intensity. Although much progress has been made in this area, existing research typically treats each of the above tasks in isolation. In this paper, we apply a hierarchical parameter sharing technique using Conditional Random Fields for fine-grained opinion analysis, jointly detecting the boundaries of opinion expressions as well as determining two of their key attributes, polarity and intensity. Our experimental results show that our proposed approach improves the performance over a baseline that does not exploit hierarchical structure among the classes. In addition, we find that the joint approach outperforms a baseline that is based on cascading two separate components.

5 0.39925045 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

Author: Cigdem Toprak ; Niklas Jakob ; Iryna Gurevych

Abstract: In this paper, we introduce a corpus of consumer reviews from the rateitall and the eopinions websites annotated with opinion-related information. We present a two-level annotation scheme. In the first stage, the reviews are analyzed at the sentence level for (i) relevancy to a given topic, and (ii) expressing an evaluation about the topic. In the second stage, on-topic sentences containing evaluations about the topic are further investigated at the expression level for pinpointing the properties (semantic orientation, intensity), and the functional components of the evaluations (opinion terms, targets and holders). We discuss the annotation scheme, the inter-annotator agreement for different subtasks and our observations.

6 0.21288849 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree

7 0.20782641 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

8 0.16733301 210 acl-2010-Sentiment Translation through Lexicon Induction

9 0.12145557 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization

10 0.10725599 204 acl-2010-Recommendation in Internet Forums and Blogs

11 0.095480114 42 acl-2010-Automatically Generating Annotator Rationales to Improve Sentiment Classification

12 0.090592682 105 acl-2010-Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems

13 0.08884719 79 acl-2010-Cross-Lingual Latent Topic Extraction

14 0.08836931 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach

15 0.086127281 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

16 0.079969145 77 acl-2010-Cross-Language Document Summarization Based on Machine Translation Quality Prediction

17 0.079028539 14 acl-2010-A Risk Minimization Framework for Extractive Speech Summarization

18 0.078924231 141 acl-2010-Identifying Text Polarity Using Random Walks

19 0.078466542 78 acl-2010-Cross-Language Text Classification Using Structural Correspondence Learning

20 0.072993033 80 acl-2010-Cross Lingual Adaptation: An Experiment on Sentiment Classifications


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.189), (1, 0.185), (2, -0.372), (3, 0.42), (4, -0.3), (5, 0.092), (6, -0.053), (7, 0.199), (8, 0.038), (9, -0.059), (10, 0.059), (11, -0.084), (12, 0.035), (13, 0.103), (14, 0.024), (15, -0.052), (16, -0.244), (17, 0.078), (18, -0.059), (19, 0.003), (20, 0.007), (21, -0.017), (22, -0.013), (23, 0.096), (24, 0.057), (25, -0.018), (26, -0.046), (27, 0.007), (28, 0.095), (29, 0.022), (30, 0.036), (31, -0.037), (32, 0.078), (33, -0.01), (34, -0.019), (35, 0.008), (36, -0.056), (37, -0.061), (38, 0.001), (39, 0.014), (40, -0.025), (41, -0.055), (42, 0.023), (43, -0.024), (44, -0.023), (45, 0.028), (46, -0.016), (47, -0.015), (48, -0.004), (49, 0.05)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9778446 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

2 0.92879248 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

Author: Niklas Jakob ; Iryna Gurevych

Abstract: unknown-abstract

3 0.85319465 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

Author: Yejin Choi ; Claire Cardie

Abstract: Automatic opinion recognition involves a number of related tasks, such as identifying the boundaries of opinion expression, determining their polarity, and determining their intensity. Although much progress has been made in this area, existing research typically treats each of the above tasks in isolation. In this paper, we apply a hierarchical parameter sharing technique using Conditional Random Fields for fine-grained opinion analysis, jointly detecting the boundaries of opinion expressions as well as determining two of their key attributes, polarity and intensity. Our experimental results show that our proposed approach improves the performance over a baseline that does not exploit hierarchical structure among the classes. In addition, we find that the joint approach outperforms a baseline that is based on cascading two separate components.

4 0.83722574 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

Author: Valentin Jijkoun ; Maarten de Rijke ; Wouter Weerkamp

Abstract: We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

5 0.79526663 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

Author: Cigdem Toprak ; Niklas Jakob ; Iryna Gurevych

Abstract: In this paper, we introduce a corpus of consumer reviews from the rateitall and the eopinions websites annotated with opinion-related information. We present a two-level annotation scheme. In the first stage, the reviews are analyzed at the sentence level for (i) relevancy to a given topic, and (ii) expressing an evaluation about the topic. In the second stage, on-topic sentences containing evaluations about the topic are further investigated at the expression level for pinpointing the properties (semantic orientation, intensity), and the functional components of the evaluations (opinion terms, targets and holders). We discuss the annotation scheme, the inter-annotator agreement for different subtasks and our observations.

6 0.50526828 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

7 0.45720717 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree

8 0.44244972 42 acl-2010-Automatically Generating Annotator Rationales to Improve Sentiment Classification

9 0.31830364 105 acl-2010-Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems

10 0.31425807 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

11 0.30538309 141 acl-2010-Identifying Text Polarity Using Random Walks

12 0.30169889 204 acl-2010-Recommendation in Internet Forums and Blogs

13 0.27479988 210 acl-2010-Sentiment Translation through Lexicon Induction

14 0.26493204 122 acl-2010-Generating Fine-Grained Reviews of Songs from Album Reviews

15 0.26008606 176 acl-2010-Mood Patterns and Affective Lexicon Access in Weblogs

16 0.25479946 8 acl-2010-A Hybrid Hierarchical Model for Multi-Document Summarization

17 0.2512773 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification

18 0.24341592 79 acl-2010-Cross-Lingual Latent Topic Extraction

19 0.23652883 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization

20 0.23127328 177 acl-2010-Multilingual Pseudo-Relevance Feedback: Performance Study of Assisting Languages


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(9, 0.012), (25, 0.061), (33, 0.012), (39, 0.013), (42, 0.173), (44, 0.01), (59, 0.073), (64, 0.112), (72, 0.04), (73, 0.058), (76, 0.01), (78, 0.029), (83, 0.076), (84, 0.025), (98, 0.175), (99, 0.012)]
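The sparse (topicId, topicWeight) list above can be compared against another paper's list to produce a single similarity score. The page does not state which measure yields the simValue column, so the sketch below assumes plain cosine similarity over the sparse topic vectors. The function name `cosine_similarity` and the weights in `other_paper` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as {topic_id: weight} dicts."""
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Topic weights for this paper, copied from the list above.
this_paper = dict([
    (9, 0.012), (25, 0.061), (33, 0.012), (39, 0.013), (42, 0.173),
    (44, 0.01), (59, 0.073), (64, 0.112), (72, 0.04), (73, 0.058),
    (76, 0.01), (78, 0.029), (83, 0.076), (84, 0.025), (98, 0.175),
    (99, 0.012),
])

# Hypothetical topic weights for another paper (illustrative values only).
other_paper = {42: 0.20, 98: 0.15, 64: 0.05}

similarity = cosine_similarity(this_paper, other_paper)
print(f"assumed cosine similarity: {similarity:.4f}")
```

Under this assumption, ranking candidate papers by the score in descending order would give a list shaped like the one that follows, although the actual values depend on whichever measure the site used.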

similar papers list:

simIndex simValue paperId paperTitle

1 0.9131074 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

Author: Valentin Jijkoun ; Maarten de Rijke ; Wouter Weerkamp

Abstract: We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

same-paper 2 0.89651465 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

3 0.86433744 178 acl-2010-Non-Cooperation in Dialogue

Author: Brian Pluss

Abstract: This paper presents ongoing research on computational models for non-cooperative dialogue. We start by analysing different levels of cooperation in conversation. Then, inspired by findings from an empirical study, we propose a technique for measuring non-cooperation in political interviews. Finally, we describe a research programme towards obtaining a suitable model and discuss previous accounts for conflictive dialogue, identifying the differences with our work.

4 0.86224329 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing

Author: Ruihong Huang ; Ellen Riloff

Abstract: This research explores the idea of inducing domain-specific semantic class taggers using only a domain-specific text collection and seed words. The learning process begins by inducing a classifier that only has access to contextual features, forcing it to generalize beyond the seeds. The contextual classifier then labels new instances, to expand and diversify the training set. Next, a cross-category bootstrapping process simultaneously trains a suite of classifiers for multiple semantic classes. The positive instances for one class are used as negative instances for the others in an iterative bootstrapping cycle. We also explore a one-semantic-class-per-discourse heuristic, and use the classifiers to dynamically create semantic features. We evaluate our approach by inducing six semantic taggers from a collection of veterinary medicine message board posts.

5 0.85363698 149 acl-2010-Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue

Author: Ryu Iida ; Syumpei Kobayashi ; Takenobu Tokunaga

Abstract: This paper proposes an approach to reference resolution in situated dialogues by exploiting extra-linguistic information. Recently, investigations of referential behaviours involved in situations in the real world have received increasing attention by researchers (Di Eugenio et al., 2000; Byron, 2005; van Deemter, 2007; Spanger et al., 2009). In order to create an accurate reference resolution model, we need to handle extra-linguistic information as well as textual information examined by existing approaches (Soon et al., 2001; Ng and Cardie, 2002, etc.). In this paper, we incorporate extra-linguistic information into an existing corpus-based reference resolution model, and investigate its effects on reference resolution problems within a corpus of Japanese dialogues. The results demonstrate that our proposed model achieves an accuracy of 79.0% for this task.

6 0.85241115 214 acl-2010-Sparsity in Dependency Grammar Induction

7 0.84645534 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

8 0.82483512 85 acl-2010-Detecting Experiences from Weblogs

9 0.82462001 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

10 0.80839592 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems

11 0.80318129 231 acl-2010-The Prevalence of Descriptive Referring Expressions in News and Narrative

12 0.7828747 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

13 0.76980174 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization

14 0.7694428 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree

15 0.76578045 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery

16 0.76015854 204 acl-2010-Recommendation in Internet Forums and Blogs

17 0.75416911 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities

18 0.75405234 42 acl-2010-Automatically Generating Annotator Rationales to Improve Sentiment Classification

19 0.75379342 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons

20 0.7534771 5 acl-2010-A Framework for Figurative Language Detection Based on Sense Differentiation