159 acl-2011-Identifying Noun Product Features that Imply Opinions

Source: pdf

Author: Lei Zhang ; Bing Liu

Abstract: Identifying domain-dependent opinion words is a key problem in opinion mining and has been studied by several researchers. However, existing work has been focused on adjectives and to some extent verbs. Limited work has been done on nouns and noun phrases. In our work, we used the feature-based opinion mining model, and we found that in some domains nouns and noun phrases that indicate product features may also imply opinions. In many such cases, these nouns are not subjective but objective. Their involved sentences are also objective sentences and imply positive or negative opinions. Identifying such nouns and noun phrases and their polarities is very challenging but critical for effective opinion mining in these domains. To the best of our knowledge, this problem has not been studied in the literature. This paper proposes a method to deal with the problem. Experimental results based on real-life datasets show promising results.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Identifying domain-dependent opinion words is a key problem in opinion mining and has been studied by several researchers. [sent-3, score-1.43]

2 Limited work has been done on nouns and noun phrases. [sent-5, score-0.291]

3 In our work, we used the feature-based opinion mining model, and we found that in some domains nouns and noun phrases that indicate product features may also imply opinions. [sent-6, score-1.371]

4 In many such cases, these nouns are not subjective but objective. [sent-7, score-0.184]

5 Their involved sentences are also objective sentences and imply positive or negative opinions. [sent-8, score-0.44]

6 Identifying such nouns and noun phrases and their polarities is very challenging but critical for effective opinion mining in these domains. [sent-9, score-1.124]

7 1 Introduction Opinion words are words that convey positive or negative polarities. [sent-13, score-0.214]

8 They are critical for opinion mining (Pang et al. [sent-14, score-0.73]

9 The key difficulty in finding such words is that opinions expressed by many of them are domain or context dependent. [sent-25, score-0.236]

10 Several researchers have studied the problem of finding opinion words (Liu, 2010). [sent-26, score-0.7]

11 Dictionary-based approaches are generally not suitable for finding domain specific opinion words as dictionaries contain little domain specific information. [sent-32, score-0.72]

12 The approach exploits some conjunctive patterns, involving and, or, but, either-or, or neither-nor, with the intuition that the conjoined adjectives are subject to linguistic constraints on the orientation or polarity of the adjectives involved. [sent-34, score-0.362]

13 Using these constraints, one can infer opinion polarities of unknown adjectives based on the known ones. [sent-35, score-0.869]

14 (2008) introduced the concept of feature context because the polarities of many opinion bearing words are sentence context dependent rather than just domain dependent. [sent-39, score-0.925]

15 (2009) proposed a method called double propagation that uses dependency relations to extract both opinion words and product features. [sent-41, score-0.841]

16 [Association for Computational Linguistics: short papers, pages 575–580] However, none of these approaches handle nouns or noun phrases. [sent-44, score-0.291]

17 Our work uses the feature-based opinion mining model in (Hu and Liu, 2004) to mine opinions in product reviews. [sent-48, score-1.087]

18 We found that in some application domains product features which are indicated by nouns have implied opinions although they are not subjective words. [sent-49, score-0.68]

19 This paper aims to identify such opinionated noun features. [sent-50, score-0.343]

20 To make this concrete, let us see an example from a mattress review: “Within a month, a valley formed in the middle of the mattress. [sent-51, score-0.412]

21 ” Here “valley” indicates the quality of the mattress (a product feature) and also implies a negative opinion. [sent-52, score-0.383]

22 The opinion implied by “valley” cannot be found by current techniques. [sent-53, score-0.739]

23 (2003) proposed a method to extract subjective nouns, our work is very different because many nouns implying opinions are not subjective nouns, but objective nouns, e. [sent-55, score-0.658]

24 Those sentences involving such nouns are usually also objective sentences. [sent-58, score-0.173]

25 As much of the existing opinion mining research focuses on subjective sentences, we believe it is high time to study objective words and sentences that imply opinions as well. [sent-59, score-1.21]

26 Objective words (or sentences) that imply opinions are very difficult to recognize because their recognition typically requires the commonsense or world knowledge of the application domain. [sent-61, score-0.345]

27 In this paper, we propose a method to deal with the problem, specifically, finding product features which are nouns or noun phrases and imply positive or negative opinions. [sent-62, score-0.829]

28 For a product feature (or feature for short) with an implied opinion, there is either no adjective opinion word that modifies it directly or the opinion words that modify it usually have the same opinion. [sent-65, score-1.839]

29 Example 1: No opinion adjective word modifies the opinionated product feature (“valley”): “Within a month, a valley formed in the middle of the mattress. [sent-66, score-1.455]

30 ” Example 2: An opinion adjective modifies the opinionated product feature: “Within a month, a bad valley formed in the middle of the mattress. [sent-67, score-1.419]

31 It is unlikely that a positive opinion word will modify “valley”, e. [sent-69, score-0.819]

32 Thus, if a product feature is modified by both positive and negative opinion adjectives, it is unlikely to be an opinionated product feature. [sent-72, score-1.441]

33 Based on these examples, we designed the following two steps to identify noun product features which imply positive or negative opinions: 1. [sent-73, score-0.718]

34 Candidate Identification: This step determines the surrounding sentiment context of each noun feature. [sent-74, score-0.31]

35 The intuition is that if a feature occurs in negative (respectively positive) opinion contexts significantly more frequently than in positive (or negative) opinion contexts, we can infer that its polarity is negative (or positive). [sent-75, score-1.833]

36 This step thus produces a list of candidate features with positive opinions and a list of candidate features with negative opinions. [sent-77, score-0.577]

37 The idea is that when a noun product feature is directly modified by both positive and negative opinion words, it is unlikely to be an opinionated product feature. [sent-80, score-1.621]

38 Basically, step 1 needs the feature-based sentiment analysis capability. [sent-81, score-0.13]

39 1 Feature-Based Sentiment Analysis To use the lexicon-based sentiment analysis method, we need a list of opinion words, i. [sent-85, score-0.796]

40 Opinion words are words that express positive or negative sentiments. [sent-88, score-0.214]

41 As noted earlier, there are also many words whose polarities depend on the contexts in which they appear. [sent-89, score-0.132]

42 Researchers have compiled sets of opinion words for adjectives, adverbs, verbs and nouns respectively, called the opinion lexicon. [sent-90, score-1.443]

43 In this paper, we used the opinion lexicon compiled by Ding et al. [sent-91, score-0.719]

44 It is worth mentioning that our task is to find nouns which imply opinions in a specific domain, and such nouns do not appear in any general opinion lexicon. [sent-93, score-1.233]

45 Aggregating Opinions on a Feature Using the opinion lexicon, we can identify opinion polarity expressed on each product feature in a sentence. [sent-97, score-1.622]

46 2008) basically combines opinion words in the sentence to assign a sentiment to each product feature. [sent-99, score-0.969]

47 Given a sentence s which contains a product feature f, opinion words in the sentence are first identified by matching with the words in the opinion lexicon. [sent-101, score-1.604]

48 A positive word is assigned the semantic orientation (polarity) score of +1, and a negative word is assigned the semantic orientation score of -1. [sent-103, score-0.402]

49 $score(f) = \sum_{w_i : w_i \in s \wedge w_i \in L} \frac{w_i.SO}{dis(w_i, f)}$, (1) where wi is an opinion word with semantic orientation score wi.SO, L is the set of all opinion words (including idioms) and s is the sentence that contains the feature f, and dis(wi, f) is the distance between feature f and opinion word wi in s. [sent-105, score-2.255]

50 The multiplicative inverse in the formula is used to give low weights to opinion words that are far away from the feature f. [sent-108, score-0.74]

51 If the final score is positive, then the opinion on the feature in s is positive. [sent-109, score-0.74]

52 If the score is negative, then the opinion on the feature in s is negative. [sent-110, score-0.74]
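The aggregation in equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregate_opinion`, the toy lexicon, and the use of token-position difference as the distance dis(wi, f) are all hypothetical choices, not details confirmed by the text above.

```python
def aggregate_opinion(tokens, feature_index, lexicon):
    """Sum (polarity / distance) over all opinion words in one sentence.

    tokens        -- list of word tokens for sentence s
    feature_index -- position of the product feature f in tokens
    lexicon       -- dict mapping opinion word -> +1 or -1 (assumed format)
    """
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = lexicon.get(tok.lower())
        # Skip non-opinion words and the feature token itself (distance 0).
        if polarity is None or i == feature_index:
            continue
        # Assumption: distance is the absolute difference in token positions.
        score += polarity / abs(i - feature_index)
    return score


# Toy lexicon and sentence, invented for illustration only.
toy_lexicon = {"great": 1, "bad": -1}
toy_tokens = "the picture is great but the battery is bad".split()

picture_score = aggregate_opinion(toy_tokens, 1, toy_lexicon)  # feature "picture"
battery_score = aggregate_opinion(toy_tokens, 6, toy_lexicon)  # feature "battery"
```

A positive return value corresponds to a positive opinion on the feature in that sentence and a negative value to a negative opinion; nearby opinion words dominate because of the inverse-distance weight.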

53 A rule of opinion is an implication with an expression on the left and an implied opinion on the right. [sent-116, score-1.405]

54 A negation word or phrase usually reverses the opinion expressed in a sentence. [sent-119, score-0.734]
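As a rough illustration of the negation rule, the sketch below flips the polarity of an opinion word when a negation word appears shortly before it. The negation list, the three-token window, and the function name are assumptions made for this example; the text above does not specify how negation scope is determined.

```python
# Hypothetical negation words; a real system would use a fuller list.
NEGATION_WORDS = {"not", "no", "never", "n't"}


def polarity_with_negation(tokens, index, lexicon, window=3):
    """Return the polarity of tokens[index], reversed if a negation precedes it.

    window -- how many tokens to look back for a negation word (assumed value).
    """
    base = lexicon.get(tokens[index].lower(), 0)
    start = max(0, index - window)
    negated = any(t.lower() in NEGATION_WORDS for t in tokens[start:index])
    return -base if negated else base


negated_example = polarity_with_negation(
    "this mattress is not good".split(), 4, {"good": 1}
)
plain_example = polarity_with_negation(
    "this mattress is good".split(), 3, {"good": 1}
)
```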

55 For example, “I am not bothered by the hump on the mattress” is a sentence from a mattress review. [sent-122, score-0.192]

56 However, it also implies a negative opinion about “hump,” which indicates a product feature. [sent-124, score-0.93]

57 We call sentences of this kind negated feeling response sentences. [sent-125, score-0.143]

58 A sentence like this normally expresses the feeling of a person or a group of persons towards some items which generally have positive or negative connotations in the sentence context or the application domain. [sent-126, score-0.404]

59 Such a sentence usually consists of four components: a noun representing a person or a group of persons (which includes personal pronouns and proper nouns), a negation word, a feeling verb, and a stimulus word. [sent-127, score-0.467]

60 Instead, we regard the sentence bearing the same opinion about the stimulus word as the opinion of the feeling verb. [sent-131, score-1.587]

61 These opinion contexts will help the statistical test later. [sent-132, score-0.695]
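The four-component pattern for negated feeling response sentences might be detected along the following lines. Everything here is a hedged sketch: the word lists, the polarity values attached to the feeling verbs, and the function name are hypothetical, and a real implementation would likely rely on part-of-speech tags rather than fixed word sets.

```python
# Hypothetical word lists used only for illustration.
PERSON_WORDS = {"i", "we", "he", "she", "they", "you"}
NEGATIONS = {"not", "never", "n't"}
FEELING_VERBS = {"bothered": -1, "annoyed": -1, "disappointed": -1}


def negated_feeling_response(tokens):
    """Return the feeling verb's polarity if the sentence matches the pattern
    person + negation + feeling verb + stimulus; otherwise return None.

    The polarity is NOT reversed by the negation: the sentence is taken to
    carry the feeling verb's opinion about the stimulus word.
    """
    lowered = [t.lower() for t in tokens]
    for i, tok in enumerate(lowered):
        if tok not in FEELING_VERBS:
            continue
        before = lowered[:i]
        has_person = any(t in PERSON_WORDS for t in before)
        has_negation = any(t in NEGATIONS for t in before)
        has_stimulus = i + 1 < len(lowered)  # something follows the verb
        if has_person and has_negation and has_stimulus:
            return FEELING_VERBS[tok]
    return None


hump_context = negated_feeling_response(
    "I am not bothered by the hump on the mattress".split()
)
```

For the mattress sentence, the sketch yields a negative context for the stimulus "hump", matching the reading described above.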

62 The opinions before “but” and after “but” are usually opposite to each other. [sent-135, score-0.666]

63 These rules say that decreasing or increasing of some quantities associated with opinionated items may change the orientations of the opinions. [sent-138, score-0.269]

64 Here “pain” is a negative opinion word in the opinion lexicon, and the reduction of “pain” indicates a desirable effect of the drug. [sent-140, score-1.448]

65 ” Neg and Pos represent respectively a negative and a positive opinion word. [sent-145, score-0.88]

66 Increasing rules do not change opinion directions (Liu, 2010). [sent-146, score-0.69]
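The decreasing and increasing rules can be illustrated with a very small function. The list of decreasing words and the function name are assumptions for this sketch; only the behavior (decreasing reverses the orientation, increasing leaves it unchanged) comes from the sentences above.

```python
# Hypothetical list of words that signal a decrease in some quantity.
DECREASING_WORDS = {"reduce", "reduced", "decrease", "decreased", "lessened"}


def apply_quantity_rule(modifier, polarity):
    """Adjust the polarity of an opinionated item given a quantity modifier.

    A decreasing modifier reverses the orientation (e.g. reduced + Neg -> Pos).
    Any other modifier, including increasing ones, leaves it unchanged.
    """
    if modifier.lower() in DECREASING_WORDS:
        return -polarity
    return polarity


# "pain" is negative (-1) in the lexicon; reducing it is a desirable effect.
reduced_pain = apply_quantity_rule("reduced", -1)
increased_pain = apply_quantity_rule("increased", -1)
```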

67 Handling Context-Dependent Opinions: As mentioned earlier, the polarities of context-dependent opinion words (only adjectives and adverbs) must be determined by their contexts. [sent-150, score-0.766]

68 For example, if someone writes a sentence like “This camera is very nice and has a long battery life”, we can infer that “long” is positive for “battery life” because it is conjoined with the positive word “nice. [sent-153, score-0.273]
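A minimal sketch of this conjunction-based inference follows. It assumes the (adjective, feature) pair and the conjoined known opinion word have already been extracted; the function name and the dictionary keyed by (adjective, feature) are hypothetical. The handling of "but" follows the earlier observation that opinions on either side of "but" are usually opposite.

```python
def infer_from_conjunction(known_polarity, conjunction):
    """Infer the polarity of a context-dependent word from a conjoined
    opinion word whose polarity is already known.

    "and" keeps the orientation, "but" reverses it; other conjunctions
    give no evidence in this simplified sketch (returns 0).
    """
    conj = conjunction.lower()
    if conj == "and":
        return known_polarity
    if conj == "but":
        return -known_polarity
    return 0


# Context-dependent polarities are stored per (adjective, feature) pair.
context_polarity = {}
# "This camera is very nice and has a long battery life": "nice" is +1.
context_polarity[("long", "battery life")] = infer_from_conjunction(+1, "and")
```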

69 1, we can identify opinion sentences for each product feature in context, which contain both positive-opinionated sentences and negative-opinionated sentences. [sent-157, score-0.944]

70 We then determine candidate product features implying opinions by checking the percentage of either positive-opinionated sentences or negative-opinionated sentences among all opinionated sentences. [sent-158, score-0.841]

71 Through experiments, we make an empirical assumption that if either the positive-opinionated sentence percentage or the negative-opinionated sentence percentage is significantly greater than 70%, we regard this noun feature as a noun feature implying an opinion. [sent-159, score-0.807]

72 The basic heuristic for our idea is that if a noun feature is more likely to occur in positive (or negative) opinion contexts (sentences), it is more likely to be an opinionated noun feature. [sent-160, score-1.39]

73 , the percentage of positive (or negative) opinions in our case, and n is the sample size, which is the total number of opinionated sentences that contain the noun feature. [sent-167, score-0.708]

74 It means that Z score for an opinionated feature must be no less than -1. [sent-171, score-0.237]

75 Otherwise we do not regard it as a feature implying opinion. [sent-173, score-0.263]
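The candidate test can be sketched as follows. Equation 2 is not reproduced in the sentences above, so this example assumes the standard one-sample test statistic for a population proportion, with the hypothesized proportion p0 = 0.7 taken from the 70% figure. The Z cutoff is cut off in the extracted text, so it is left as a required argument rather than guessed. Function names are hypothetical.

```python
import math


def proportion_z(p, p0, n):
    """Standard one-sample Z statistic for a proportion (assumed form)."""
    return (p - p0) / math.sqrt(p0 * (1.0 - p0) / n)


def is_candidate(pos_count, neg_count, z_threshold, p0=0.7):
    """Return "positive" or "negative" if the feature passes the test,
    otherwise None.

    pos_count, neg_count -- numbers of positive- and negative-opinionated
                            sentences that contain the noun feature
    z_threshold          -- minimum Z score; supplied by the caller
    """
    n = pos_count + neg_count  # sample size: all opinionated sentences
    if n == 0:
        return None
    dominant = "positive" if pos_count >= neg_count else "negative"
    p = max(pos_count, neg_count) / n  # sample proportion of dominant side
    return dominant if proportion_z(p, p0, n) >= z_threshold else None
```

A feature seen in nine negative contexts and one positive context has p = 0.9, which lies above p0 and therefore yields a positive Z score; a feature split evenly has p = 0.5 and a negative Z score.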

76 3 Pruning Non-Opinionated Features Many of the candidate noun features with opinions may not indicate any opinion. [sent-175, score-0.466]

77 Then, we need to distinguish features which have implied opinions and normal features which have no opinions, e. [sent-176, score-0.362]

78 ” However, for features with context dependent opinions, people often have a fixed opinion, either positive or negative but not both. [sent-181, score-0.254]

79 With this observation in mind, we can detect features with no opinion by finding direct modification relations using a dependency parser. [sent-182, score-0.706]

80 ” Here O is an opinion word, O-Dep / F-Dep is a dependency relation, which describes a relation between words, and includes mod, pnmod, subj, s, obj, obj2 and desc (detailed explanations can be found in http://www. [sent-189, score-0.666]

81 For the first example, given feature “picture quality”, we can extract its modification opinion word “good”. [sent-196, score-0.74]

82 For the second example, given feature “springs”, we can get opinion word “bad”. [sent-197, score-0.74]

83 Among these extracted opinion words for the feature noun, if some belong to the positive opinion lexicon and some belong to the negative opinion lexicon, we conclude the noun feature is not an opinionated feature and is thus pruned. [sent-199, score-2.83]
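The pruning rule can be sketched as a simple filter. The example assumes that the opinion words directly modifying each feature have already been collected with a dependency parser (that step is not shown), and the function and argument names are hypothetical.

```python
def prune_candidates(candidates, modifiers_by_feature, positive_words, negative_words):
    """Drop candidate features that are directly modified by both positive
    and negative opinion words.

    candidates           -- list of candidate noun features
    modifiers_by_feature -- dict: feature -> set of opinion words that
                            directly modify it (assumed parser output)
    """
    kept = []
    for feature in candidates:
        mods = modifiers_by_feature.get(feature, set())
        has_pos = any(m in positive_words for m in mods)
        has_neg = any(m in negative_words for m in mods)
        if has_pos and has_neg:
            continue  # mixed modifiers: treated as a normal feature
        kept.append(feature)
    return kept


# Toy input invented for illustration.
toy_modifiers = {"valley": {"bad"}, "picture quality": {"good", "bad"}}
remaining = prune_candidates(
    ["valley", "picture quality", "hump"], toy_modifiers, {"good"}, {"bad"}
)
```

Features with no direct opinion modifiers at all, such as "hump" in the toy input, are kept, which is consistent with the earlier observation that opinionated features are often not modified by any opinion adjective.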

84 Table 1 shows the domains (based on their names) of the datasets, the number of sentences, and the number of noun features. [sent-201, score-0.206]

85 The first two datasets were obtained from a commercial company that provides opinion mining services, and the other two were crawled by us. [sent-202, score-0.769]

86 Experimental datasets An issue for judging noun features implying opinions is that it can be subjective. [sent-204, score-0.626]

87 For comparison, we also implemented a baseline method, which decides a noun feature’s polarity only by its modifying opinion words (adjectives). [sent-206, score-0.914]

88 If its corresponding adjective is positive-orientated, then the noun feature is positive-orientated. [sent-207, score-0.31]

89 3 for statistical test (in this case, n in equation 2 is the total number of sentences containing the noun feature) and for pruning, we can determine noun features implying opinions from the data corpus. [sent-210, score-0.795]

90 It indicates many noun features that imply opinions are not directly modified by adjective opinion words. [sent-214, score-1.287]

91 Table 3 and Table 4 give the results of noun features implying positive and negative opinions separately. [sent-221, score-0.801]

92 Because for some datasets, there is no noun feature implying a positive/negative opinion, their precision and recall are zero. [sent-223, score-0.443]

93 The purpose is to rank correct noun features that imply opinions at the top of the list, so as to improve the precision of the top-ranked candidates. [sent-232, score-0.632]

94 It gives the percentage of correct noun features implying opinions at the rank position N. [sent-241, score-0.653]
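The rank-based measure can be illustrated with a short function. The name `precision_at_n` and the toy ranking are assumptions for this sketch; how the ranking itself is produced is not covered here.

```python
def precision_at_n(ranked_features, correct_features, n):
    """Fraction of the top-n ranked features that are correct.

    If fewer than n features are available, the fraction is taken over
    the features actually present.
    """
    top = ranked_features[:n]
    if not top:
        return 0.0
    hits = sum(1 for f in top if f in correct_features)
    return hits / len(top)


# Toy ranking and gold set, invented for illustration.
toy_ranking = ["valley", "hump", "voice", "springs"]
toy_gold = {"valley", "hump", "springs"}
p_at_2 = precision_at_n(toy_ranking, toy_gold, 2)
p_at_4 = precision_at_n(toy_ranking, toy_gold, 4)
```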

95 Because some domains may not contain positive or negative noun features, we combine positive and negative candidate features together for an overall ranking for each dataset. [sent-242, score-0.735]

96 Note that in Table 6, there is no result for the Drug dataset because no noun features implying opinions were found beyond the top 10 results, as there are not many such noun features in the drug domain. [sent-253, score-0.859]

97 4 Conclusions This paper proposed a method to identify noun product features that imply opinions. [sent-254, score-0.504]

98 Conceptually, this work studied the problem of objective nouns and sentences with implied opinions. [sent-255, score-0.28]

99 This problem is important because without identifying such opinions, the recall of opinion mining suffers. [sent-257, score-0.73]

100 Our proposed method determines feature polarity not only by opinion words that modify the features but also by its surrounding context. [sent-258, score-0.875]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('opinion', 0.666), ('valley', 0.238), ('opinions', 0.209), ('noun', 0.18), ('opinionated', 0.163), ('implying', 0.158), ('product', 0.148), ('imply', 0.136), ('sentiment', 0.13), ('mattress', 0.119), ('negative', 0.116), ('feeling', 0.115), ('ding', 0.111), ('nouns', 0.111), ('polarities', 0.103), ('adjectives', 0.1), ('positive', 0.098), ('orientation', 0.094), ('feature', 0.074), ('subjective', 0.073), ('implied', 0.073), ('esuli', 0.069), ('negation', 0.068), ('polarity', 0.068), ('mining', 0.064), ('bing', 0.063), ('kanayama', 0.058), ('adjective', 0.056), ('modifies', 0.055), ('qiu', 0.054), ('stimulus', 0.054), ('nasukawa', 0.054), ('lexicon', 0.053), ('drug', 0.052), ('battery', 0.052), ('chicago', 0.05), ('pain', 0.049), ('liu', 0.049), ('deceasing', 0.048), ('dragut', 0.048), ('hump', 0.048), ('zagibalov', 0.048), ('month', 0.046), ('voice', 0.045), ('springs', 0.042), ('kamps', 0.042), ('wi', 0.042), ('features', 0.04), ('hatzivassiloglou', 0.04), ('pang', 0.04), ('datasets', 0.039), ('thumbs', 0.038), ('hu', 0.038), ('bad', 0.038), ('candidate', 0.037), ('kobayashi', 0.036), ('rank', 0.036), ('janyce', 0.035), ('maarten', 0.034), ('fabrizio', 0.034), ('orientations', 0.034), ('objective', 0.034), ('studied', 0.034), ('andreevskaia', 0.033), ('breck', 0.033), ('sebastiani', 0.033), ('neg', 0.033), ('regard', 0.031), ('takamura', 0.031), ('inui', 0.031), ('illinois', 0.031), ('south', 0.031), ('precision', 0.031), ('percentage', 0.03), ('wiebe', 0.03), ('bearing', 0.03), ('contexts', 0.029), ('unlikely', 0.028), ('formed', 0.028), ('picture', 0.028), ('sentences', 0.028), ('theresa', 0.027), ('titov', 0.027), ('statistic', 0.027), ('middle', 0.027), ('double', 0.027), ('customer', 0.027), ('decreased', 0.027), ('modify', 0.027), ('domain', 0.027), ('domains', 0.026), ('pruning', 0.026), ('popescu', 0.025), ('os', 0.025), ('sentence', 0.025), ('persons', 0.025), ('life', 0.024), ('adverbs', 0.024), ('ranking', 0.024), ('rules', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999982 159 acl-2011-Identifying Noun Product Features that Imply Opinions

Author: Lei Zhang ; Bing Liu

2 0.3254011 131 acl-2011-Extracting Opinion Expressions and Their Polarities - Exploration of Pipelines and Joint Models

Author: Richard Johansson ; Alessandro Moschitti

Abstract: We investigate systems that identify opinion expressions and assign polarities to the extracted expressions. In particular, we demonstrate the benefit of integrating opinion extraction and polarity classification into a joint model using features reflecting the global polarity structure. The model is trained using large-margin structured prediction methods. The system is evaluated on the MPQA opinion corpus, where we compare it to the only previously published end-to-end system for opinion expression extraction and polarity classification. The results show an improvement of between 10 and 15 absolute points in F-measure.

3 0.30245999 211 acl-2011-Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates

Author: Paula Carvalho ; Luis Sarmento ; Jorge Teixeira ; Mario J. Silva

Abstract: We investigate the expression of opinions about human entities in user-generated content (UGC). A set of 2,800 online news comments (8,000 sentences) was manually annotated, following a rich annotation scheme designed for this purpose. We conclude that the challenge in performing opinion mining in such type of content is correctly identifying the positive opinions, because (i) they are much less frequent than negative opinions and (ii) they are particularly exposed to verbal irony. We also show that the recognition of human targets poses additional challenges on mining opinions from UGC, since they are frequently mentioned by pronouns, definite descriptions and nicknames.

4 0.26261705 45 acl-2011-Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews

Author: Jianxing Yu ; Zheng-Jun Zha ; Meng Wang ; Tat-Seng Chua

Abstract: In this paper, we dedicate to the topic of aspect ranking, which aims to automatically identify important product aspects from online consumer reviews. The important aspects are identified according to two observations: (a) the important aspects of a product are usually commented by a large number of consumers; and (b) consumers’ opinions on the important aspects greatly influence their overall opinions on the product. In particular, given consumer reviews of a product, we first identify the product aspects by a shallow dependency parser and determine consumers’ opinions on these aspects via a sentiment classifier. We then develop an aspect ranking algorithm to identify the important aspects by simultaneously considering the aspect frequency and the influence of consumers’ opinions given to each aspect on their overall opinions. The experimental results on 11 popular products in four domains demonstrate the effectiveness of our approach. We further apply the aspect ranking results to the application of document-level sentiment classification, and improve the performance significantly.

5 0.2535181 21 acl-2011-A Pilot Study of Opinion Summarization in Conversations

Author: Dong Wang ; Yang Liu

Abstract: This paper presents a pilot study of opinion summarization on conversations. We create a corpus containing extractive and abstractive summaries of speaker’s opinion towards a given topic using 88 telephone conversations. We adopt two methods to perform extractive summarization. The first one is a sentence-ranking method that linearly combines scores measured from different aspects including topic relevance, subjectivity, and sentence importance. The second one is a graph-based method, which incorporates topic and sentiment information, as well as additional information about sentence-to-sentence relations extracted based on dialogue structure. Our evaluation results show that both methods significantly outperform the baseline approach that extracts the longest utterances. In particular, we find that incorporating dialogue structure in the graph-based method contributes to the improved system performance.

6 0.19976354 281 acl-2011-Sentiment Analysis of Citations using Sentence Structure-Based Features

7 0.16381383 162 acl-2011-Identifying the Semantic Orientation of Foreign Words

8 0.15815169 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification

9 0.15704587 292 acl-2011-Target-dependent Twitter Sentiment Classification

10 0.15329935 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

11 0.1397388 136 acl-2011-Finding Deceptive Opinion Spam by Any Stretch of the Imagination

12 0.13023485 253 acl-2011-PsychoSentiWordNet

13 0.12934291 82 acl-2011-Content Models with Attitude

14 0.1275012 204 acl-2011-Learning Word Vectors for Sentiment Analysis

15 0.12497229 64 acl-2011-C-Feel-It: A Sentiment Analyzer for Micro-blogs

16 0.11601381 279 acl-2011-Semi-supervised latent variable models for sentence-level sentiment analysis

17 0.10853101 105 acl-2011-Dr Sentiment Knows Everything!

18 0.10476293 288 acl-2011-Subjective Natural Language Problems: Motivations, Applications, Characterizations, and Implications

19 0.088166952 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

20 0.087343767 218 acl-2011-MemeTube: A Sentiment-based Audiovisual System for Analyzing and Displaying Microblog Messages

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.186), (1, 0.238), (2, 0.182), (3, -0.063), (4, 0.038), (5, 0.039), (6, 0.006), (7, 0.029), (8, -0.011), (9, -0.084), (10, 0.035), (11, -0.048), (12, -0.138), (13, -0.011), (14, -0.008), (15, -0.035), (16, 0.039), (17, -0.001), (18, -0.009), (19, -0.045), (20, -0.045), (21, 0.09), (22, -0.003), (23, 0.03), (24, 0.042), (25, 0.027), (26, 0.246), (27, -0.021), (28, -0.041), (29, 0.203), (30, 0.411), (31, 0.004), (32, -0.153), (33, 0.088), (34, 0.022), (35, 0.003), (36, -0.022), (37, 0.059), (38, 0.066), (39, -0.147), (40, -0.09), (41, 0.047), (42, 0.015), (43, -0.012), (44, 0.05), (45, -0.019), (46, 0.02), (47, 0.025), (48, 0.065), (49, 0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97399884 159 acl-2011-Identifying Noun Product Features that Imply Opinions

Author: Lei Zhang ; Bing Liu

2 0.91404665 211 acl-2011-Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political Debates

Author: Paula Carvalho ; Luis Sarmento ; Jorge Teixeira ; Mario J. Silva

3 0.89332217 131 acl-2011-Extracting Opinion Expressions and Their Polarities - Exploration of Pipelines and Joint Models

Author: Richard Johansson ; Alessandro Moschitti

4 0.83963341 136 acl-2011-Finding Deceptive Opinion Spam by Any Stretch of the Imagination

Author: Myle Ott ; Yejin Choi ; Claire Cardie ; Jeffrey T. Hancock

Abstract: Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam—fictitious opinions that have been deliberately written to sound authentic. Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90% accurate on our gold-standard opinion spam dataset. Based on feature analysis of our learned models, we additionally make several theoretical contributions, including revealing a relationship between deceptive opinions and imaginative writing.

5 0.7158401 45 acl-2011-Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews

Author: Jianxing Yu ; Zheng-Jun Zha ; Meng Wang ; Tat-Seng Chua

6 0.54046541 21 acl-2011-A Pilot Study of Opinion Summarization in Conversations

7 0.51899165 162 acl-2011-Identifying the Semantic Orientation of Foreign Words

8 0.48216581 288 acl-2011-Subjective Natural Language Problems: Motivations, Applications, Characterizations, and Implications

9 0.45336464 297 acl-2011-That's What She Said: Double Entendre Identification

10 0.45310253 84 acl-2011-Contrasting Opposing Views of News Articles on Contentious Issues

11 0.44926018 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

12 0.43912053 281 acl-2011-Sentiment Analysis of Citations using Sentence Structure-Based Features

13 0.43629366 82 acl-2011-Content Models with Attitude

14 0.42197937 55 acl-2011-Automatically Predicting Peer-Review Helpfulness

15 0.37598202 194 acl-2011-Language Use: What can it tell us?

16 0.37164664 292 acl-2011-Target-dependent Twitter Sentiment Classification

17 0.34377325 130 acl-2011-Extracting Comparative Entities and Predicates from Texts Using Comparative Type Classification

18 0.3376638 332 acl-2011-Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification

19 0.33616391 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

20 0.33362636 156 acl-2011-IMASS: An Intelligent Microblog Analysis and Summarization System

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.032), (17, 0.043), (18, 0.02), (26, 0.037), (37, 0.099), (39, 0.034), (41, 0.058), (53, 0.306), (55, 0.032), (59, 0.046), (72, 0.03), (91, 0.021), (96, 0.136)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82894087 159 acl-2011-Identifying Noun Product Features that Imply Opinions

Author: Lei Zhang ; Bing Liu

2 0.79574335 132 acl-2011-Extracting Paraphrases from Definition Sentences on the Web

Author: Chikara Hashimoto ; Kentaro Torisawa ; Stijn De Saeger ; Jun'ichi Kazama ; Sadao Kurohashi

Abstract: We propose an automatic method of extracting paraphrases from definition sentences, which are also automatically acquired from the Web. We observe that a huge number of concepts are defined in Web documents, and that the sentences that define the same concept tend to convey mostly the same information using different expressions and thus contain many paraphrases. We show that a large number of paraphrases can be automatically extracted with high precision by regarding the sentences that define the same concept as parallel corpora. Experimental results indicated that with our method it was possible to extract about 300,000 paraphrases from 6 × 10^8 Web documents with a precision rate of about 94%.

3 0.77171445 323 acl-2011-Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections

Author: Dipanjan Das ; Slav Petrov

Abstract: We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (Berg-Kirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm.

4 0.74443251 66 acl-2011-Chinese sentence segmentation as comma classification

Author: Nianwen Xue ; Yaqin Yang

Abstract: We describe a method for disambiguating Chinese commas that is central to Chinese sentence segmentation. Chinese sentence segmentation is viewed as the detection of loosely coordinated clauses separated by commas. Trained and tested on data derived from the Chinese Treebank, our model achieves a classification accuracy of close to 90% overall, which translates to an F1 score of 70% for detecting commas that signal sentence boundaries.

5 0.6800406 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules

Author: Qin Gao ; Stephan Vogel

Abstract: We present an approach of expanding parallel corpora for machine translation. By utilizing Semantic role labeling (SRL) on one side of the language pair, we extract SRL substitution rules from existing parallel corpus. The rules are then used for generating new sentence pairs. An SVM classifier is built to filter the generated sentence pairs. The filtered corpus is used for training phrase-based translation models, which can be used directly in translation tasks or combined with baseline models. Experimental results on Chinese-English machine translation tasks show an average improvement of 0.45 BLEU and 1.22 TER points across 5 different NIST test sets.

6 0.6660096 225 acl-2011-Monolingual Alignment by Edit Rate Computation on Sentential Paraphrase Pairs

7 0.61325836 327 acl-2011-Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment

8 0.60807979 37 acl-2011-An Empirical Evaluation of Data-Driven Paraphrase Generation Techniques

9 0.60742807 72 acl-2011-Collecting Highly Parallel Data for Paraphrase Evaluation

10 0.60415137 131 acl-2011-Extracting Opinion Expressions and Their Polarities - Exploration of Pipelines and Joint Models

11 0.59327728 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

12 0.58992803 136 acl-2011-Finding Deceptive Opinion Spam by Any Stretch of the Imagination

13 0.58568037 45 acl-2011-Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews

14 0.5774107 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates

15 0.57018828 235 acl-2011-Optimal and Syntactically-Informed Decoding for Monolingual Phrase-Based Alignment

16 0.56988901 162 acl-2011-Identifying the Semantic Orientation of Foreign Words

17 0.56980604 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

18 0.56542963 183 acl-2011-Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

19 0.55904585 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

20 0.55884719 292 acl-2011-Target-dependent Twitter Sentiment Classification