acl acl2010 acl2010-210 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Christian Scheible
Abstract: The translation of sentiment information is a task from which sentiment analysis systems can benefit. We present a novel, graph-based approach using SimRank, a well-established vertex similarity algorithm to transfer sentiment information between a source language and a target language graph. We evaluate this method in comparison with SO-PMI.
Reference: text
sentIndex sentText sentNum sentScore
1 Sentiment Translation through Lexicon Induction Christian Scheible Institute for Natural Language Processing University of Stuttgart scheibcn@ims Abstract The translation of sentiment information is a task from which sentiment analysis systems can benefit. [sent-1, score-1.23]
2 We present a novel, graph-based approach using SimRank, a well-established vertex similarity algorithm to transfer sentiment information between a source language and a target language graph. [sent-2, score-0.813]
3 Usually, two aspects are of importance in sentiment analysis. [sent-5, score-0.556]
4 The first is whether a text or an expression is meant to express sentiment at all; the second is the determination of sentiment orientation, i.e. [sent-8, score-1.112]
5 what sentiment is to be expressed in a structure that is considered subjective. [sent-10, score-0.556]
6 Work on sentiment analysis most often covers resources or analysis methods in a single language, usually English. [sent-11, score-0.556]
7 However, the transfer of sentiment analysis between languages can be advantageous by making use of resources for a source language to improve the analysis of the target language. [sent-12, score-0.759]
8 This paper presents an approach to the transfer of sentiment information between languages. [sent-13, score-0.652]
9 It is built around an algorithm that has been successfully applied for the acquisition of bilingual lexicons. [sent-14, score-0.082]
10 Our experiments are carried out using English as a source language and German as a target language. [sent-16, score-0.078]
11 2 Related Work The translation of sentiment information has been the topic of multiple publications. [sent-19, score-0.595]
12 The first method simply uses bilingual dictionaries to translate an English sentiment lexicon. [sent-22, score-0.638]
13 A sentence-based classifier built with this list achieved high precision but low recall on a small Romanian test set. [sent-23, score-0.038]
14 The source language in the corpus is annotated with sentiment information, and the information is then projected to the target language. [sent-25, score-0.634]
15 Given a corpus annotated with sentiment information in one language, machine translation is used to produce an annotated corpus in the target language, by preserving the annotations. [sent-31, score-0.636]
16 The original annotations can be produced either manually or automatically. [sent-32, score-0.032]
17 Wan (2009) constructs a multilingual classifier using co-training. [sent-33, score-0.038]
18 In co-training, one classifier produces additional training data for a second classifier. [sent-34, score-0.038]
19 In this case, an English classifier assists in training a Chinese classifier. [sent-35, score-0.038]
20 The induction of a sentiment lexicon is the subject of early work by Hatzivassiloglou and McKeown (1997). [sent-36, score-0.721]
21 They construct graphs from coordination data from large corpora based on the intuition that adjectives with the same sentiment orientation are likely to be coordinated. [sent-37, score-1.084]
22 For example, fresh and delicious is more likely than rotten and delicious. [sent-38, score-0.038]
23 They then apply a graph clustering algorithm to find groups of adjectives with the same orientation. [sent-39, score-0.276]
24 Finally, they assign the same label to all adjectives that belong to the same cluster. [sent-40, score-0.171]
25 The authors note that some words cannot be assigned a unique label since their sentiment depends on context. [sent-41, score-0.556]
26 Turney (2002) suggests a corpus-based extraction method based on his pointwise mutual information (PMI) synonymy measure. He assumes that the sentiment orientation of a phrase can be determined by comparing its pointwise mutual information with a positive phrase (excellent) and a negative phrase (poor). [sent-44, score-0.893]
27 3 Bilingual Lexicon Induction Typical approaches to the induction of bilingual lexicons involve gathering new information from a small set of known identities between the languages, called a seed lexicon, and incorporating intralingual sources of information (e.g. [sent-46, score-0.536]
28 SimRank is an iterative algorithm that measures the similarity between all vertices in a graph. [sent-53, score-0.199]
29 In SimRank, two nodes are similar if their neighbors are similar. [sent-54, score-0.08]
30 This defines a recursive process that ends when the two nodes compared are identical. [sent-55, score-0.032]
31 (2009), we will apply it to a graph G in which vertices represent words and edges represent relations between words. [sent-57, score-0.221]
32 SimRank will then yield similarity values between vertices that indicate the degree of relatedness between them with regard to the property encoded through the edges. [sent-58, score-0.238]
33 For two nodes i and j in G, similarity according to SimRank is defined as sim(i, j) = c / (|N(i)| · |N(j)|) · Σ_{k ∈ N(i), l ∈ N(j)} sim(k, l), where N(x) is the neighborhood of x and c is a weight factor that determines the influence of neighbors that are farther away. [sent-59, score-0.163]
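To make the recursion above concrete, here is a minimal sketch of the basic SimRank iteration in Python. The plain-dictionary graph representation, the decay constant c = 0.8, and the fixed iteration count are illustrative assumptions, not values taken from this extract.

```python
from itertools import product

# Minimal SimRank sketch (assumed plain-Python implementation, not the paper's code).
# graph: dict mapping each node to the set of its neighbors N(x); all neighbors
# are assumed to appear as keys of the dict as well.
def simrank(graph, c=0.8, iterations=7):
    nodes = list(graph)
    # sim(i, i) = 1 is the base case of the recursion; everything else starts at 0.
    sim = {(i, j): 1.0 if i == j else 0.0 for i, j in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for i, j in product(nodes, nodes):
            if i == j:
                new_sim[(i, j)] = 1.0
                continue
            ni, nj = graph[i], graph[j]
            if not ni or not nj:
                new_sim[(i, j)] = 0.0
                continue
            total = sum(sim[(k, l)] for k in ni for l in nj)
            new_sim[(i, j)] = c / (len(ni) * len(nj)) * total
        sim = new_sim
    return sim
```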
34 (2009) further propose the application of the SimRank algorithm for the calculation of similarities between a source graph S and a target graph T. [sent-62, score-0.344]
35 Initially, some relations between the two graphs need to be known. [sent-63, score-0.082]
36 When operating on word graphs, these can be taken from a bilingual lexicon. [sent-65, score-0.082]
37 This provides us with a framework for the induction of a bilingual lexicon which can be constructed based on the obtained similarity values between the vertices of the two graphs. [sent-66, score-0.485]
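The following sketch illustrates one way the bilingual variant described above could look: similarities are computed only between nodes of the source graph S and nodes of the target graph T, and known seed translations are clamped to similarity 1. The seeding scheme, names, and constants are assumptions for illustration, not the authors' implementation.

```python
# Cross-graph SimRank sketch (assumed formulation): similarity is only defined
# between a source-graph node and a target-graph node, seeded by a bilingual lexicon.
def bilingual_simrank(source, target, seeds, c=0.8, iterations=7):
    # source, target: neighbor dictionaries; seeds: set of (source_node, target_node) pairs.
    sim = {(s, t): 1.0 if (s, t) in seeds else 0.0 for s in source for t in target}
    for _ in range(iterations):
        new_sim = {}
        for s in source:
            for t in target:
                if (s, t) in seeds:
                    new_sim[(s, t)] = 1.0  # seed translations stay fixed
                    continue
                ns, nt = source[s], target[t]
                if not ns or not nt:
                    new_sim[(s, t)] = 0.0
                    continue
                total = sum(sim[(k, l)] for k in ns for l in nt)
                new_sim[(s, t)] = c / (len(ns) * len(nt)) * total
        sim = new_sim
    return sim
```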
38 (2010) was that while words with high similarity were semantically related, they often were not exact translations of each other but instead fell into the categories of hyponymy, hypernymy, holonymy, or meronymy. [sent-68, score-0.122]
39 However, this makes the similarity values applicable for the translation of sentiment since it is a property that does not depend on exact synonymy. [sent-69, score-0.717]
40 4 Sentiment Transfer Although unsupervised methods for the design of sentiment analysis systems exist, any approach can benefit from using resources that have been established in other languages. [sent-70, score-0.556]
41 The main problem that we aim to deal with in this paper is the transfer of such information between languages. [sent-71, score-0.096]
42 The SimRank lexicon induction method is suitable for this purpose since it can produce useful similarity values even with a small seed lexicon. [sent-72, score-0.428]
43 The vertices of these graphs will represent adjectives while the edges are coordination relations between these adjectives. [sent-74, score-0.562]
44 An example for such a graph is given in Figure 1. [sent-75, score-0.105]
45 Figure 1: Sample graph showing English coordination relations. [sent-76, score-0.264]
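A small sketch of how such a coordination graph could be assembled from extracted adjective pairs; the pair list and the use of a plain adjacency dictionary are illustrative assumptions.

```python
from collections import defaultdict

# Build an undirected coordination graph from (adjective, adjective) pairs,
# e.g. extracted from phrases like "fresh and delicious".
def build_coordination_graph(pairs):
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph

example_pairs = [("fresh", "delicious"), ("fresh", "tasty")]  # hypothetical data
graph = build_coordination_graph(example_pairs)
```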
46 The use of coordination information has been shown to be beneficial for example in early work by Hatzivassiloglou and McKeown (1997). [sent-77, score-0.159]
47 Seed links between those graphs will be taken from a universal dictionary. [sent-78, score-0.082]
48 Here, intralingual coordination relations are represented as black lines, seed relations as solid grey lines, and relations that are induced through SimRank as dashed grey lines. [sent-80, score-0.948]
49 After computing similarities in this graph, we transfer sentiment scores from the source to the target graph as defined below. Figure 2: Sample graph showing English and German coordination relations. [sent-81, score-0.32]
50 Solid black lines represent coordinations, solid grey lines represent seed relations, and dashed grey lines show induced relations. [sent-82, score-0.802]
51 We will define the sentiment score (sent) as sent(nt) = Σ_{ns ∈ S} simnorm(ns, nt) · sent(ns), where nt is a node in the target graph T, and S is the source graph. [sent-84, score-0.806]
52 In this way, the sentiment score of each node in the target graph T is an average over all nodes in S weighted by their normalized similarity, simnorm. [sent-85, score-0.1]
53 We define the normalized similarity as simnorm(ns, nt) = sim(ns, nt) / Σ_{n ∈ S} sim(n, nt). [sent-86, score-0.083]
54 Normalization guarantees that all sentiment scores lie within a specified range. [sent-87, score-0.585]
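A direct transcription of the two definitions above into code; the similarity dictionary is assumed to come from the cross-graph SimRank computation, and the names are illustrative.

```python
# Transfer sentiment scores from source-graph nodes to a target-graph node
# using similarity-weighted averaging, as in the sent/simnorm definitions above.
def transfer_sentiment(target_node, source_sentiment, sim):
    # source_sentiment: dict source_node -> sentiment value in [-1, 1]
    # sim: dict (source_node, target_node) -> SimRank similarity
    total_sim = sum(sim[(ns, target_node)] for ns in source_sentiment)
    if total_sim == 0.0:
        return 0.0  # no evidence for this target node
    return sum(
        (sim[(ns, target_node)] / total_sim) * source_sentiment[ns]
        for ns in source_sentiment
    )
```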
55 Scores are not a direct indicator for orientation since the similarities still include a lot of noise. [sent-88, score-0.172]
56 Therefore, we interpret the scores by assigning each word to a category by finding score thresholds between the categories. [sent-89, score-0.112]
57 1 Baseline Method (SO-PMI) We will compare our method to the well-established SO-PMI algorithm by Turney (2002) to show an improvement over an unsupervised method. [sent-91, score-0.032]
58 The algorithm works with cooccurrence counts on large corpora. [sent-92, score-0.041]
59 To determine the semantic orientation of a word w, the number of hits near positive (Pwords) and negative (Nwords) seed words is used. [sent-93, score-0.428]
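For reference, a sketch of an SO-PMI score computed from hit counts in the spirit of Turney (2002); hits() is a placeholder for whatever co-occurrence count source is available (search-engine queries in this paper), and the smoothing constant is an assumption.

```python
import math

# SO-PMI sketch: the orientation of a word is its association with positive
# seed words relative to its association with negative seed words, estimated
# from hit counts. hits() is a placeholder count source.
def so_pmi(word, pwords, nwords, hits, smoothing=0.01):
    def h(*args):
        return hits(*args) + smoothing  # smoothing avoids log(0); value is an assumption
    num = math.prod(h(word, p) for p in pwords) * math.prod(h(n) for n in nwords)
    den = math.prod(h(word, n) for n in nwords) * math.prod(h(p) for p in pwords)
    return math.log2(num / den)
```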
60 We extracted coordinations from the corpus using a simple CQP pattern search (Christ et al. [sent-98, score-0.161]
61 For our experiments, we looked only at coordinations with and. [sent-100, score-0.13]
62 For the English corpus, we used the pattern [pos="JJ"] ([pos=","] [pos="JJ"])* ([pos=","]? "and" [pos="JJ"])+, [sent-101, score-0.779]
63 and for the German corpus, the pattern [pos="ADJ.*"] ("," [pos="ADJ.*"])* ("und" [pos="ADJ"])+ was used. [sent-102, score-0.405] [sent-104, score-0.187]
65 This yielded 477,291 pairs of coordinated English adjectives and 44,245 German pairs. [sent-105, score-0.171]
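A rough regex-based analogue of the CQP extraction, run over a POS-tagged sentence written as word/TAG tokens; the Penn Treebank tags and the simple pair extraction (first and last adjective of a coordination) are illustrative assumptions and do not reproduce the exact CQP semantics.

```python
import re

# Rough analogue of the CQP search: extract pairs of adjectives coordinated
# with "and" from a POS-tagged sentence given as word/TAG tokens (PTB tags).
# Illustrative sketch only.
COORD = re.compile(
    r"(\w+)/JJ(?:\s+,/,\s+(\w+)/JJ)*\s+(?:,/,\s+)?and/CC\s+(\w+)/JJ"
)

def coordinated_adjective_pairs(tagged_sentence):
    pairs = []
    for match in COORD.finditer(tagged_sentence):
        pairs.append((match.group(1), match.group(3)))
    return pairs

print(coordinated_adjective_pairs("a/DT fresh/JJ and/CC delicious/JJ meal/NN"))
# -> [('fresh', 'delicious')]
```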
66 After building a graph out of this data as described in Section 4, we apply the SimRank algorithm using 7 iterations. [sent-109, score-0.105]
67 Data for the SO-PMI method had to be collected from queries to search engines since the information available in the Wikipedia corpus was too sparse. [sent-110, score-0.033]
68 Since Google does not provide a stable NEAR operator, we used coordinations instead. [sent-111, score-0.13]
69 For each of the test words w and the SO-PMI seed words s, we made two queries +"w und s" and +"s und w" to Google. [sent-112, score-0.336]
70 The quotes and + were added to ensure that no spelling correction or synonym replacements took place. [sent-113, score-0.047]
71 Since the original experiments were designed for an English corpus, a set of German seed words had to be constructed. [sent-114, score-0.141]
72 We chose gut, nett, richtig, schön, ordentlich, angenehm, aufrichtig, gewissenhaft, and hervorragend as positive seeds, and schlecht, teuer, falsch, böse, feindlich, verhasst, widerlich, fehlerhaft, and mangelhaft as negative seeds. [sent-115, score-0.075]
73 Table 1: Assigned values for positivity labels. [sent-118, score-0.185]
74 We constructed a test set by randomly selecting 200 German adjectives that occurred in a coordination in Wikipedia. [sent-119, score-0.365]
75 We then eliminated adjectives that we deemed uncommon or too difficult to understand or that were mislabeled as adjectives. [sent-120, score-0.209]
76 To determine the sentiment of these adjectives, we asked 9 human judges, all native German speakers, to annotate them given the classes neutral, slightly negative, very negative, slightly positive, and very positive, reflecting the categories from the training data. [sent-122, score-0.626]
77 Since human judges tend to interpret scales differently, we examine their agreement using Kendall’s coefficient of concordance (W) including correction for ties (Legendre, 2005) which takes ranks into account. [sent-124, score-0.216]
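A sketch of Kendall's coefficient of concordance with the standard correction for ties, as described by Legendre (2005); the raters-by-items layout of the ratings matrix is an assumption made for this sketch.

```python
import numpy as np
from scipy.stats import rankdata

# Kendall's coefficient of concordance (W) with correction for ties
# (Legendre, 2005). ratings: array of shape (n_raters, n_items).
def kendalls_w(ratings):
    ratings = np.asarray(ratings, dtype=float)
    m, n = ratings.shape
    ranks = np.array([rankdata(r) for r in ratings])  # ties receive average ranks
    rank_sums = ranks.sum(axis=0)                     # rank sum per item
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    # tie correction: for each rater, sum (t^3 - t) over groups of tied items
    t_correction = 0.0
    for r in ranks:
        _, counts = np.unique(r, return_counts=True)
        t_correction += float(((counts ** 3) - counts).sum())
    return 12.0 * s / (m ** 2 * (n ** 3 - n) - m * t_correction)
```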
78 Manual examination of the data showed that most disagreement between the annotators occurred with adjectives that are tied to political implications, for example nuklear (nuclear). [sent-128, score-0.206]
79 3 Sentiment Lexicon Induction For our experiments, we used the polarity lexicon of Wilson et al. [sent-130, score-0.076]
80 It includes annotations of positivity in the form of the categories neutral, weakly positive (weakpos), strongly positive (strongpos), weakly negative (weakneg), and strongly negative (strongneg). [sent-132, score-0.526]
81 In order to conduct arithmetic operations on these annotations, we mapped them to values from the interval [−1, 1] by using the assignments given in Table 1. [sent-133, score-0.071]
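The concrete numbers of Table 1 are not recoverable from this extract; the mapping below uses assumed placeholder values on [−1, 1] purely to illustrate the step.

```python
# Map the polarity-lexicon categories to numeric values in [-1, 1].
# The concrete values below are placeholders, not the assignments of Table 1.
POSITIVITY_VALUES = {
    "strongneg": -1.0,  # assumed value
    "weakneg":   -0.5,  # assumed value
    "neutral":    0.0,  # assumed value
    "weakpos":    0.5,  # assumed value
    "strongpos":  1.0,  # assumed value
}

def label_to_value(label):
    return POSITIVITY_VALUES[label]
```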
82 4 Results To compare the two methods to the human raters, we first reproduce the evaluation by Turney (2002) and examine the correlation coefficients. [sent-135, score-0.031]
83 Both methods will be compared to an average over the human rater values. [sent-136, score-0.066]
84 These averages are calculated from the values assigned according to Table 1. [sent-137, score-0.11]
85 Regarding the correlation coefficients between the automatic systems and the human ratings, SO-PMI yields r = 0. [sent-138, score-0.031]
86 Since many adjectives do not express sentiment at all, the correct categorization of neutral adjectives is as important as the scalar rating. [sent-142, score-1.043]
87 Thus, we divide the adjectives into three categories: positive, neutral, and negative. [sent-143, score-0.21]
88 Due to disagreements between the human judges there exists no clear threshold between these categories. [sent-144, score-0.13]
89 In order to try different thresholds, we assume that sentiment is symmetrically distributed with mean 0 on the human scores. [sent-145, score-0.066]
90 For x ∈ {i/20 | 0 ≤ i ≤ 19}, we then assign words w with human rating score(w) to negative if score(w) ≤ −x, to neutral if −x < score(w) < x, and to positive otherwise. [sent-146, score-0.318]
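A sketch of the threshold sweep described above; how the resulting assignments are scored against each other is not stated in this extract and is left out of the sketch.

```python
# Sweep symmetric thresholds x and assign three-way categories to words
# based on their scores in [-1, 1], following the rule stated above.
def categorize(score, x):
    if score <= -x:
        return "negative"
    if score < x:
        return "neutral"
    return "positive"

def sweep_thresholds(scores):
    # scores: dict word -> rating in [-1, 1]
    assignments = {}
    for i in range(20):
        x = i / 20.0
        assignments[x] = {w: categorize(s, x) for w, s in scores.items()}
    return assignments
```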
wordName wordTfidf (topN-words)
[('sentiment', 0.556), ('simrank', 0.415), ('po', 0.187), ('adjectives', 0.171), ('coordination', 0.159), ('seed', 0.141), ('grey', 0.138), ('coordinations', 0.13), ('orientation', 0.116), ('vertices', 0.116), ('neutral', 0.114), ('dorow', 0.113), ('graph', 0.105), ('german', 0.099), ('transfer', 0.096), ('induction', 0.089), ('intralingual', 0.086), ('positivity', 0.086), ('simnorm', 0.086), ('similarity', 0.083), ('bilingual', 0.082), ('graphs', 0.082), ('und', 0.081), ('lexicon', 0.076), ('positive', 0.075), ('turney', 0.07), ('judges', 0.07), ('jj', 0.07), ('txh', 0.069), ('lines', 0.067), ('solid', 0.067), ('sent', 0.067), ('sim', 0.064), ('negative', 0.06), ('similarities', 0.056), ('hatzivassiloglou', 0.054), ('adj', 0.051), ('neighbors', 0.048), ('ims', 0.047), ('thresholds', 0.047), ('correction', 0.047), ('dashed', 0.045), ('pointwise', 0.043), ('weakly', 0.042), ('black', 0.042), ('mckeown', 0.042), ('cooccurrence', 0.041), ('target', 0.041), ('categories', 0.039), ('translation', 0.039), ('values', 0.039), ('classifier', 0.038), ('irony', 0.038), ('arned', 0.038), ('kendall', 0.038), ('tero', 0.038), ('delicious', 0.038), ('wwoorrdd', 0.038), ('cqp', 0.038), ('gut', 0.038), ('jeh', 0.038), ('uncommon', 0.038), ('widom', 0.038), ('source', 0.037), ('ns', 0.036), ('near', 0.036), ('nt', 0.035), ('occurred', 0.035), ('gart', 0.035), ('symmetrically', 0.035), ('scheible', 0.035), ('raters', 0.035), ('concordance', 0.035), ('rater', 0.035), ('nuclear', 0.035), ('relations', 0.034), ('english', 0.033), ('interpret', 0.033), ('queries', 0.033), ('wellestablished', 0.032), ('etn', 0.032), ('che', 0.032), ('banea', 0.032), ('arithmetic', 0.032), ('asserted', 0.032), ('nodes', 0.032), ('score', 0.032), ('annotations', 0.032), ('wikipedia', 0.031), ('pattern', 0.031), ('human', 0.031), ('scalar', 0.031), ('identities', 0.031), ('gathering', 0.031), ('laws', 0.031), ('hyponymy', 0.031), ('induced', 0.03), ('disagreements', 0.029), ('advantageous', 0.029), ('lie', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 210 acl-2010-Sentiment Translation through Lexicon Induction
Author: Christian Scheible
Abstract: The translation of sentiment information is a task from which sentiment analysis systems can benefit. We present a novel, graph-based approach using SimRank, a well-established vertex similarity algorithm to transfer sentiment information between a source language and a target language graph. We evaluate this method in comparison with SO-PMI.
2 0.30379194 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree
Author: Wei Wei ; Jon Atle Gulla
Abstract: Existing works on sentiment analysis on product reviews suffer from the following limitations: (1) The knowledge of hierarchical relationships of products attributes is not fully utilized. (2) Reviews or sentences mentioning several attributes associated with complicated sentiments are not dealt with very well. In this paper, we propose a novel HL-SOT approach to labeling a product’s attributes and their associated sentiments in product reviews by a Hierarchical Learning (HL) process with a defined Sentiment Ontology Tree (SOT). The empirical analysis against a humanlabeled data set demonstrates promising and reasonable performance of the proposed HL-SOT approach. While this paper is mainly on sentiment analysis on reviews of one product, our proposed HLSOT approach is easily generalized to labeling a mix of reviews of more than one products.
3 0.24319355 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons
Author: Valentin Jijkoun ; Maarten de Rijke ; Wouter Weerkamp
Abstract: We present a method for automatically generating focused and accurate topicspecific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opin- ion retrieval system.
4 0.23095372 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization
Author: Hitoshi Nishikawa ; Takaaki Hasegawa ; Yoshihiro Matsuo ; Genichiro Kikui
Abstract: We propose a novel algorithm for sentiment summarization that takes account of informativeness and readability, simultaneously. Our algorithm generates a summary by selecting and ordering sentences taken from multiple review texts according to two scores that represent the informativeness and readability of the sentence order. The informativeness score is defined by the number of sentiment expressions and the readability score is learned from the target corpus. We evaluate our method by summarizing reviews on restaurants. Our method outperforms an existing algorithm as indicated by its ROUGE score and human readability experiments.
5 0.17855123 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis
Author: Georgios Paltoglou ; Mike Thelwall
Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.
6 0.1740704 141 acl-2010-Identifying Text Polarity Using Random Walks
7 0.16733301 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval
8 0.14527564 105 acl-2010-Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems
9 0.14183889 42 acl-2010-Automatically Generating Annotator Rationales to Improve Sentiment Classification
10 0.12326564 78 acl-2010-Cross-Language Text Classification Using Structural Correspondence Learning
11 0.12022372 80 acl-2010-Cross Lingual Adaptation: An Experiment on Sentiment Classifications
12 0.11268185 27 acl-2010-An Active Learning Approach to Finding Related Terms
13 0.10425096 51 acl-2010-Bilingual Sense Similarity for Statistical Machine Translation
14 0.080195799 2 acl-2010-"Was It Good? It Was Provocative." Learning the Meaning of Scalar Adjectives
15 0.079670697 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach
16 0.07275895 50 acl-2010-Bilingual Lexicon Generation Using Non-Aligned Signatures
17 0.070390336 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
18 0.066855788 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
19 0.066471681 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
20 0.066420719 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes
topicId topicWeight
[(0, -0.189), (1, 0.091), (2, -0.212), (3, 0.249), (4, -0.067), (5, 0.024), (6, 0.027), (7, 0.103), (8, -0.039), (9, 0.033), (10, -0.008), (11, 0.072), (12, -0.032), (13, -0.04), (14, -0.057), (15, 0.001), (16, 0.323), (17, -0.167), (18, -0.05), (19, -0.058), (20, -0.052), (21, 0.062), (22, -0.053), (23, -0.102), (24, 0.028), (25, 0.021), (26, -0.057), (27, -0.107), (28, -0.054), (29, -0.075), (30, 0.098), (31, 0.004), (32, -0.135), (33, 0.006), (34, -0.027), (35, -0.131), (36, 0.09), (37, 0.025), (38, -0.125), (39, -0.03), (40, 0.014), (41, 0.016), (42, -0.022), (43, 0.113), (44, -0.02), (45, 0.116), (46, 0.015), (47, 0.106), (48, -0.083), (49, -0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.96506906 210 acl-2010-Sentiment Translation through Lexicon Induction
Author: Christian Scheible
Abstract: The translation of sentiment information is a task from which sentiment analysis systems can benefit. We present a novel, graph-based approach using SimRank, a well-established vertex similarity algorithm to transfer sentiment information between a source language and a target language graph. We evaluate this method in comparison with SO-PMI.
2 0.82370001 42 acl-2010-Automatically Generating Annotator Rationales to Improve Sentiment Classification
Author: Ainur Yessenalina ; Yejin Choi ; Claire Cardie
Abstract: One ofthe central challenges in sentimentbased text categorization is that not every portion of a document is equally informative for inferring the overall sentiment of the document. Previous research has shown that enriching the sentiment labels with human annotators’ “rationales” can produce substantial improvements in categorization performance (Zaidan et al., 2007). We explore methods to automatically generate annotator rationales for document-level sentiment classification. Rather unexpectedly, we find the automatically generated rationales just as helpful as human rationales.
3 0.8188169 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree
Author: Wei Wei ; Jon Atle Gulla
Abstract: Existing works on sentiment analysis on product reviews suffer from the following limitations: (1) The knowledge of hierarchical relationships of products attributes is not fully utilized. (2) Reviews or sentences mentioning several attributes associated with complicated sentiments are not dealt with very well. In this paper, we propose a novel HL-SOT approach to labeling a product’s attributes and their associated sentiments in product reviews by a Hierarchical Learning (HL) process with a defined Sentiment Ontology Tree (SOT). The empirical analysis against a humanlabeled data set demonstrates promising and reasonable performance of the proposed HL-SOT approach. While this paper is mainly on sentiment analysis on reviews of one product, our proposed HLSOT approach is easily generalized to labeling a mix of reviews of more than one products.
4 0.66334814 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis
Author: Georgios Paltoglou ; Mike Thelwall
Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.
5 0.64920682 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization
Author: Hitoshi Nishikawa ; Takaaki Hasegawa ; Yoshihiro Matsuo ; Genichiro Kikui
Abstract: We propose a novel algorithm for sentiment summarization that takes account of informativeness and readability, simultaneously. Our algorithm generates a summary by selecting and ordering sentences taken from multiple review texts according to two scores that represent the informativeness and readability of the sentence order. The informativeness score is defined by the number of sentiment expressions and the readability score is learned from the target corpus. We evaluate our method by summarizing reviews on restaurants. Our method outperforms an existing algorithm as indicated by its ROUGE score and human readability experiments.
6 0.63709772 105 acl-2010-Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems
7 0.61704177 141 acl-2010-Identifying Text Polarity Using Random Walks
8 0.59050578 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons
9 0.58588177 176 acl-2010-Mood Patterns and Affective Lexicon Access in Weblogs
10 0.36707485 78 acl-2010-Cross-Language Text Classification Using Structural Correspondence Learning
11 0.35468745 27 acl-2010-An Active Learning Approach to Finding Related Terms
12 0.35200241 80 acl-2010-Cross Lingual Adaptation: An Experiment on Sentiment Classifications
13 0.33272791 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification
14 0.29345307 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition
15 0.29241517 92 acl-2010-Don't 'Have a Clue'? Unsupervised Co-Learning of Downward-Entailing Operators.
16 0.29144922 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
17 0.2805191 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval
18 0.28030393 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach
19 0.26753834 50 acl-2010-Bilingual Lexicon Generation Using Non-Aligned Signatures
20 0.25852686 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
topicId topicWeight
[(14, 0.013), (25, 0.035), (39, 0.017), (42, 0.021), (44, 0.426), (59, 0.087), (73, 0.059), (78, 0.036), (83, 0.057), (84, 0.025), (98, 0.143)]
simIndex simValue paperId paperTitle
1 0.86908805 243 acl-2010-Tree-Based and Forest-Based Translation
Author: Yang Liu ; Liang Huang
Abstract: unkown-abstract
2 0.81606126 165 acl-2010-Learning Script Knowledge with Web Experiments
Author: Michaela Regneri ; Alexander Koller ; Manfred Pinkal
Abstract: We describe a novel approach to unsupervised learning of the events that make up a script, along with constraints on their temporal ordering. We collect naturallanguage descriptions of script-specific event sequences from volunteers over the Internet. Then we compute a graph representation of the script’s temporal structure using a multiple sequence alignment algorithm. The evaluation of our system shows that we outperform two informed baselines.
same-paper 3 0.80446845 210 acl-2010-Sentiment Translation through Lexicon Induction
Author: Christian Scheible
Abstract: The translation of sentiment information is a task from which sentiment analysis systems can benefit. We present a novel, graph-based approach using SimRank, a well-established vertex similarity algorithm to transfer sentiment information between a source language and a target language graph. We evaluate this method in comparison with SO-PMI.
4 0.52026695 37 acl-2010-Automatic Evaluation Method for Machine Translation Using Noun-Phrase Chunking
Author: Hiroshi Echizen-ya ; Kenji Araki
Abstract: As described in this paper, we propose a new automatic evaluation method for machine translation using noun-phrase chunking. Our method correctly determines the matching words between two sentences using corresponding noun phrases. Moreover, our method determines the similarity between two sentences in terms of the noun-phrase order of appearance. Evaluation experiments were conducted to calculate the correlation among human judgments, along with the scores produced us- ing automatic evaluation methods for MT outputs obtained from the 12 machine translation systems in NTCIR7. Experimental results show that our method obtained the highest correlations among the methods in both sentence-level adequacy and fluency.
5 0.49207124 206 acl-2010-Semantic Parsing: The Task, the State of the Art and the Future
Author: Rohit J. Kate ; Yuk Wah Wong
Abstract: unkown-abstract
6 0.481287 86 acl-2010-Discourse Structure: Theory, Practice and Use
7 0.46718776 106 acl-2010-Event-Based Hyperspace Analogue to Language for Query Expansion
8 0.44968057 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
9 0.44956836 31 acl-2010-Annotation
10 0.44894004 260 acl-2010-Wide-Coverage NLP with Linguistically Expressive Grammars
11 0.44262397 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
12 0.43762079 15 acl-2010-A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network
13 0.43619677 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
14 0.43618923 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
15 0.43605787 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
16 0.43432471 158 acl-2010-Latent Variable Models of Selectional Preference
17 0.43323728 71 acl-2010-Convolution Kernel over Packed Parse Forest
18 0.43270335 3 acl-2010-A Bayesian Method for Robust Estimation of Distributional Similarities
19 0.43195137 116 acl-2010-Finding Cognate Groups Using Phylogenies
20 0.43156055 170 acl-2010-Letter-Phoneme Alignment: An Exploration