acl acl2013 acl2013-207 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bishan Yang ; Claire Cardie
Abstract: This paper addresses the task of fine-grained opinion extraction, i.e., the identification of opinion-related entities: the opinion expressions, the opinion holders, and the targets of the opinions, and the relations between opinion expressions and their targets and holders. Most existing approaches tackle the extraction of opinion entities and opinion relations in a pipelined manner, where the interdependencies among different extraction stages are not captured. We propose a joint inference model that leverages knowledge from predictors that optimize subtasks of opinion extraction, and seeks a globally optimal solution. Experimental results demonstrate that our joint inference approach significantly outperforms traditional pipeline methods and baselines that tackle subtasks in isolation for the problem of opinion extraction.
Reference: text
sentIndex sentText sentNum sentScore
1 This paper addresses the task of fine-grained opinion extraction, i.e., the identification of opinion-related entities: the opinion expressions, the opinion holders, and the targets of the opinions, and the relations between opinion expressions and their targets and holders. [sent-3, score-3.919]
2 Most existing approaches tackle the extraction of opinion entities and opinion relations in a pipelined manner, where the interdependencies among different extraction stages are not captured. [sent-4, score-1.989]
3 We propose a joint inference model that leverages knowledge from predictors that optimize subtasks of opinion extraction, and seeks a globally optimal solution. [sent-5, score-0.963]
4 Experimental results demonstrate that our joint inference approach significantly outperforms traditional pipeline methods and baselines that tackle subtasks in isolation for the problem of opinion extraction. [sent-6, score-0.987]
5 1 Introduction Fine-grained opinion analysis is concerned with identifying opinions in text at the expression level; this includes identifying the subjective (i.e., opinion) expression itself, the opinion holder and the target of the opinion (Wiebe et al. [sent-7, score-1.021] [sent-9, score-1.809]
7 In this paper, we address the task of identifying opinion-related entities and opinion relations. [sent-14, score-0.934]
8 We consider three types of opinion entities: opinion expressions, or direct subjective expressions as defined in Wiebe et al. [sent-17, score-1.975]
9 , 1985) or speech events expressing private states; opinion targets expressions that indicate what the opinion is about; and opinion holders mentions of whom or what the opinion is from. [sent-19, score-3.79]
10 Consider the following examples in which opinion expressions (O) are underlined and targets (T) and holders (H) of the opinion are bracketed. [sent-20, score-2.076]
11 We also allow an opinion entity to be involved in multiple relations (e. [sent-33, score-1.009]
12 Not surprisingly, fine-grained opinion extraction is a challenging task due to the complexity and variety of the language used to express opinions and their components (Pang and Lee, 2008). [sent-36, score-0.958]
13 Sequence labeling models have been successfully employed to identify opinion expressions (e. [sent-38, score-1.019]
14 (Association for Computational Linguistics 2013, pages 1640–1649.) 2007; Yang and Cardie, 2012)) and relation extraction techniques have been proposed to extract opinion holders and targets based on their linking relations to the opinion expressions (e. [sent-43, score-2.341]
15 However, most existing work treats the extraction of different opinion entities and opinion relations in a pipelined manner: the interaction between different extraction tasks is not modeled jointly and error propagation is not considered. [sent-47, score-2.021]
16 (2006), which proposed an ILP approach to jointly identify opinion holders, opinion expressions and their IS-FROM linking relations, and demonstrated the effectiveness of joint inference. [sent-49, score-1.944]
17 opinion expressions with no explicit opinion holder; nor does it consider IS-ABOUT relations. [sent-52, score-1.835]
18 In this paper, we present a model that jointly identifies opinion-related entities, including opinion expressions, opinion targets and opinion holders as well as the associated opinion linking relations, IS-ABOUT and IS-FROM. [sent-53, score-3.755]
19 empty) arguments for cases when the opinion holder or target is not explicitly expressed in text. [sent-56, score-0.943]
20 the spans of opinion entities can adversely affect the prediction of opinion relations; and evidence of opinion relations might provide clues to guide the accurate extraction of opinion entities. [sent-63, score-3.676]
21 , 2005)) and demonstrate that our model outperforms by a significant margin traditional baselines that do not employ joint inference for extracting opinion entities and different types of opinion relations. [sent-65, score-1.906]
22 Many techniques were proposed to identify the text spans for opinion expressions (e. [sent-69, score-1.057]
23 , 2007; Johansson and Moschitti, 2010b; Yang and Cardie, 2012)), opinion holders (e. [sent-72, score-0.997]
24 Some consider extracting opinion targets/holders along with their relation to the opinion expressions. [sent-76, score-1.854]
25 Kim and Hovy (2006) identify opinion holders and targets by using their semantic roles related to opinion words. [sent-77, score-1.955]
26 (2008) argued that semantic role labeling is not sufficient for identifying opinion holders and targets. [sent-79, score-1.037]
27 Johansson and Moschitti (2010a) extract opinion expressions and holders by applying reranking on top of sequence labeling methods. [sent-80, score-1.136]
28 (2007) considered extracting “aspect-evaluation” relations (relations between opinion expressions and targets) by identifying opinion expressions first and then searching for the most likely target for each opinion expression via a binary relation classifier. [sent-82, score-3.136]
29 All these methods extract opinion arguments and opinion relations in separate stages instead of extracting them jointly. [sent-83, score-1.883]
30 (2006), which jointly extracts opinion expressions, holders and their IS-FROM relations using an ILP approach. [sent-85, score-1.105]
31 , 2010) to identify opinion expressions and targets iteratively; however, they suffer from the problem of error propagation. [sent-94, score-1.102]
32 The problem is conceptually similar to identifying opinion arguments for opinion expressions; however, we do not assume prior knowledge of opinion expressions (unlike in SRL, where predicates are given). [sent-101, score-2.766]
33 3 Model As proposed in Section 1, we consider the task of jointly identifying opinion entities and opinion relations. [sent-102, score-1.823]
34 Specifically, given a sentence, our goal is to identify spans of opinion expressions, opinion arguments (targets and holders) and their associated linking relations. [sent-103, score-1.899]
35 Training data consists of text with manually annotated opinion expression and argument spans, each with a list of relation ids specifying the linking relation between opinion expressions and their arguments. [sent-104, score-2.218]
36 In this section, we will describe how we model opinion entity identification and opinion relation extraction, and how we combine them in a joint inference model. [sent-105, score-1.987]
37 1 Opinion Entity Identification We formulate the task of opinion entity identification as a sequence labeling problem and employ conditional random fields (CRFs) (Lafferty et al. [sent-107, score-0.982]
38 Through inference we can find the best sequence assignment for sentence x and recover the opinion entities according to the standard “IOB” encoding scheme. [sent-109, score-0.976]
39 We consider four entity labels: D, T, H, N, where D denotes opinion expressions, T denotes opinion targets, H denotes opinion holders and N denotes “NONE” entities. [sent-110, score-2.787]
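As a concrete illustration of the IOB scheme described here, the following is a minimal sketch (not the authors' code) of recovering D/T/H spans from a predicted tag sequence; the tag strings ("B-D", "I-D", ..., with the NONE label N rendered as "O") are an assumed encoding:

```python
def decode_iob(tags):
    """Recover (start, end, type) entity spans from an IOB tag sequence.

    Tags are strings like "B-D", "I-D", "B-T", "I-H", or "O" (the NONE
    label is rendered as "O" here); span ends are exclusive.
    """
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        # Close the current span on "B-", "O", or a mismatched "I-" tag.
        if tag.startswith("B-") or tag == "O" or (etype and tag != "I-" + etype):
            if etype is not None:
                spans.append((start, i, etype))
                start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    if etype is not None:
        spans.append((start, len(tags), etype))
    return spans

# Hypothetical sentence with a target, an opinion expression, and a holder:
tags = ["O", "B-T", "B-D", "B-H", "I-H"]
print(decode_iob(tags))  # [(1, 2, 'T'), (2, 3, 'D'), (3, 5, 'H')]
```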
40 In the following we will not distinguish these two relations, since they can both be characterized as relations between opinion expressions and opinion arguments, and the methods for relation extraction are the same. [sent-120, score-2.071]
41 We define a potential function r to capture the strength of association between an opinion candidate o and an argument candidate a: roa = p(y = 1|x) − p(y = 0|x), where p(y = 1|x) and p(y = 0|x) are the logistic regression estimates of the positive and negative relations. [sent-124, score-1.122]
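The potential roa follows directly from this definition. Below is a minimal, hypothetical sketch assuming a linear logistic-regression model with weights w over a feature vector x, in which case roa reduces to 2·sigmoid(w·x) − 1:

```python
import math

def relation_potential(features, weights, bias=0.0):
    """Potential r_oa = p(y=1|x) - p(y=0|x) for an (opinion, argument)
    candidate pair.  Since p(y=0|x) = 1 - p(y=1|x), this equals
    2*sigmoid(w.x + b) - 1, a value in (-1, 1): positive when the
    relation is more likely to hold than not.
    """
    score = sum(w * f for w, f in zip(weights, features)) + bias
    p1 = 1.0 / (1.0 + math.exp(-score))
    return 2.0 * p1 - 1.0

# With a zero score the classifier is indifferent and the potential is 0.
print(relation_potential([1.0, 2.0], [0.0, 0.0]))  # 0.0
```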
42 Similarly, we define potential ro∅ to denote the confidence of predicting opinion span o associated with an implicit argument. [sent-125, score-1.005]
43 1 Opinion-Arg Relations For opinion-arg classification, we construct candidates of opinion expressions and opinion arguments and consider each pair of an opinion candidate and an argument candidate as a potential opinion relation. [sent-128, score-3.932]
44 To filter out candidates that are less reasonable, we consider the opinion expressions and arguments obtained from the n-best predictions by CRFs. [sent-130, score-1.096]
45 Specifically, we selected the most common patterns of the shortest dependency paths between an opinion candidate o and an argument candidate a in our dataset, and include all pairs of candidates that satisfy at least one dependency pattern. [sent-134, score-1.224]
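A rough sketch of this filtering idea, assuming the dependency tree is given as labeled head-dependent edges and the pattern whitelist is a hypothetical set of relation-label sequences (not the paper's actual patterns):

```python
from collections import deque

def shortest_dep_path(edges, src, dst):
    """Shortest path in an (undirected) dependency tree, returned as a
    tuple of relation labels, e.g. ('nsubj', 'ccomp').  `edges` maps
    (head, dependent) pairs to relation labels.
    """
    adj = {}
    for (h, d), rel in edges.items():
        adj.setdefault(h, []).append((d, rel))
        adj.setdefault(d, []).append((h, rel))
    queue, seen = deque([(src, ())]), {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for nxt, rel in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (rel,)))
    return None  # no path: not a candidate pair

# Toy tree: loves -nsubj-> He, loves -dobj-> park
edges = {("loves", "He"): "nsubj", ("loves", "park"): "dobj"}
COMMON_PATTERNS = {("dobj",), ("nsubj",)}  # hypothetical whitelist
path = shortest_dep_path(edges, "loves", "park")
print(path, path in COMMON_PATTERNS)  # ('dobj',) True
```

A candidate pair would be kept only when its path matches at least one whitelisted pattern.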
46 , 2006), we filter pairs of opinion and argument candidates that do not overlap with any gold standard relation in our training data. [sent-138, score-1.164]
47 , 2008) due to the similarity of opinion relations to the predicate-argument relations in SRL (Ruppenhofer et al. [sent-140, score-1.009]
48 In general, the features aim to capture (a) local properties of the candidate opinion expressions and arguments and (b) syntactic and semantic attributes of their relation. [sent-143, score-1.105]
49 Distance: the relative distance (number of words) between the opinion and argument candidates. [sent-152, score-0.952]
50 Dependency Path: the shortest path in the dependency tree between the opinion candidate and the target candidate, e. [sent-153, score-0.951]
51 The dependency path has been shown to be very useful in extracting opinion expressions and opinion holders (Johansson and Moschitti, 2010a). [sent-169, score-2.035]
52 2 Opinion-Implicit-Arg Relations When the opinion-arg relation classifier predicts that there is no suitable argument for the opinion expression candidate, it does not capture the possibility that an opinion candidate may associate with an implicit argument. [sent-172, score-2.107]
53 To incorporate knowledge of implicit relations, we build an opinion-implicit-arg classifier to identify an opinion candidate with an implicit argument based on its own properties and context information. [sent-173, score-1.176]
54 For training, we consider all gold-standard opinion expressions as training examples including those with implicit arguments as positive examples and those associated with explicit arguments as negative examples. [sent-174, score-1.17]
55 Parent Constituent: The grammatical role of the parent constituent of the deepest constituent containing the opinion expression. [sent-178, score-0.927]
56 Dependency Argument: The word types and POS types of the arguments of the dependency patterns in which the opinion expression is involved. [sent-179, score-1.007]
57 3 Joint Inference The inference goal is to find the optimal prediction for both opinion entity identification and opinion relation extraction. [sent-182, score-1.962]
58 Given the binary variables xiy, uij , vik, it is easy to recover the entity and relation assignment by checking which spans are labeled as opinion entities, and which opinion span and argument span form an opinion relation. [sent-186, score-3.105]
59 The objective function is defined as a linear combination of the potentials from different predictors with a parameter λ to balance the contribution of two components: opinion entity identification and opinion relation extraction. [sent-187, score-1.995]
60 For an opinion candidate i, if it is predicted to have an implicit argument in relation k, vik = 1, then no argument candidate should form a relation with i. [sent-193, score-1.611]
61 We introduce two auxiliary binary variables aik and bik to limit the maximum number of relations associated with each opinion candidate to at most three. [sent-195, score-1.187]
62 Σ_{j ∈ Ak} uij = 1 − vik + aik + bik;  aik ≤ 1 − vik,  bik ≤ 1 − vik. Constraint 4: Consistency between the opinion-arg classifier and the opinion entity extractor. [sent-197, score-1.241]
63 Suppose an argument candidate j in relation k is assigned an argument label by the entity extractor, that is, xjz = 1 (z = T for the IS-ABOUT relation and z = H for the IS-FROM relation); then there exist some opinion candidates that associate with j. [sent-198, score-1.541]
64 Similar to constraint 3, we introduce auxiliary binary variables cjk and djk to enforce that an argument j links to at most three opinion expressions. [sent-199, score-1.041]
65 Σ_{i ∈ O} uij = xjz + cjk + djk;  cjk ≤ xjz,  djk ≤ xjz. Constraint 5: Consistency between the opinion-implicit-arg classifier and the opinion entity extractor. [sent-201, score-1.064]
66 When an opinion candidate i is predicted to associate with an implicit argument in relation k, that is, vik = 1, then we allow xiD to be either 1 or 0 depending on the confidence of labeling i as an opinion expression. [sent-202, score-2.217]
67 When vik = 0, there exists some opinion argument associated with the opinion candidate, and we enforce xiD = 1, which means the entity extractor agrees to label i as an opinion expression. [sent-203, score-2.986]
68 vik + xiD ≥ 1. Note that in our ILP formulation, the label assignment for a candidate span involves one multiple-choice decision among the different opinion entity labels and the “NONE” entity label. [sent-204, score-1.319]
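To make the interplay of constraints 3-5 concrete, here is a toy brute-force version of the joint inference on a tiny made-up instance (one opinion candidate, two argument candidates, one relation type, at most one linked argument; all potentials are invented numbers). A real system would hand the same objective and constraints to an ILP solver such as GLPK rather than enumerate assignments:

```python
import itertools

# Entity potentials (from the CRF) and relation potentials (from the
# classifiers) for opinion candidate i and argument candidates j1, j2.
f_iD, f_j = 0.4, {"j1": -0.2, "j2": 0.3}   # entity potentials
r = {"j1": 0.1, "j2": 0.6}                 # opinion-arg potentials
r_implicit, lam = -0.5, 0.5                # implicit-arg potential, balance

best = None
for x_iD, v_i, u_j1, u_j2, x_j1, x_j2 in itertools.product((0, 1), repeat=6):
    u, x = {"j1": u_j1, "j2": u_j2}, {"j1": x_j1, "j2": x_j2}
    # Constraint 3, simplified to at most one linked argument:
    # the number of linked arguments equals 1 - v_i.
    if u_j1 + u_j2 != 1 - v_i:
        continue
    # Constraint 4 (one direction): a linked argument must carry an
    # argument label from the entity extractor.
    if any(u[j] > x[j] for j in u):
        continue
    # Constraint 5: v_i + x_iD >= 1.
    if v_i + x_iD < 1:
        continue
    obj = lam * (f_iD * x_iD + sum(f_j[j] * x[j] for j in x)) \
        + (1 - lam) * (sum(r[j] * u[j] for j in u) + r_implicit * v_i)
    if best is None or obj > best[0]:
        best = (obj, {"x_iD": x_iD, "v_i": v_i, "u": u, "x": x})

print(best)  # the optimum links i to j2, with both spans labeled
```

On this instance the joint optimum labels i as an opinion expression, labels j2 as its argument, and links them (v_i = 0), because the relation evidence for j2 outweighs its weak entity potential.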
69 For the IS-FROM relation, we set aik = 0, bik = 0 since an opinion expression usually has only one holder. [sent-208, score-1.025]
70 This design choice also allows us to easily deal with multiple types of opinion arguments and opinion relations. [sent-210, score-1.766]
71 Our gold standard opinion expressions, opinion targets and opinion holders correspond to the direct subjective annotations, target annotations and agent annotations, respectively. [sent-218, score-2.831]
72 The IS-FROM relation is obtained from the agent attribute of each opinion expression. [sent-219, score-0.956]
73 The IS-ABOUT relation is obtained from the attitude annotations: each opinion expression is annotated with attitude frames and each attitude frame is associated with a list of targets. [sent-220, score-1.12]
74 In the example above, we identify (loves, being at Enderly Park) as an IS-ABOUT relation and happy as an opinion expression associated with an implicit target. [sent-225, score-1.067]
75 We will focus our discussion on results obtained using overlap matching, since the exact boundaries of opinion entities are hard to define even for human annotators (Wiebe et al. [sent-236, score-0.983]
76 We trained CRFs for opinion entity identification using the following features: indicators for words, POS tags, and lexicon features (the subjectivity strength of the word in the Subjectivity Lexicon). [sent-238, score-0.994]
77 Each extracts opinion entities first using the same CRF employed in our approach, and then predicts opinion relations on the opinion entity candidates obtained from the CRF prediction. [sent-247, score-2.844]
78 Three relation extraction techniques were used in the baselines: • Adj: Inspired by the adjacency rule used in Hu and Liu (2004), it links each argument candidate to its nearest opinion candidate. [sent-248, score-1.187]
79 Arguments that do not link to any opinion candidate are discarded. [sent-249, score-0.932]
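A minimal sketch of the Adj baseline as described, with made-up token spans and a simple gap-based notion of distance (the exact distance measure is an assumption):

```python
def adj_link(opinions, arguments):
    """The Adj baseline: link each argument candidate to its nearest
    opinion candidate by token distance.  Spans are (start, end) token
    offsets; with no opinion candidates, all arguments are discarded.
    """
    def dist(a, b):
        # Rough gap between two spans (an assumed, simplistic measure).
        return min(abs(a[0] - b[1]), abs(b[0] - a[1]), abs(a[0] - b[0]))
    if not opinions:
        return []
    return [(min(opinions, key=lambda o: dist(o, arg)), arg)
            for arg in arguments]

# Opinion spans at tokens 2-3 and 8-9; argument spans at 0-2 and 6-8.
print(adj_link([(2, 3), (8, 9)], [(0, 2), (6, 8)]))
# [((2, 3), (0, 2)), ((8, 9), (6, 8))]
```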
80 • Syn: Links pairs of opinion and argument candidates that present prominent syntactic patterns. [sent-252, score-0.952]
81 Table 2: Performance on opinion entity extraction using overlap and exact matching metrics (the top table uses overlap and the bottom table uses exact). [sent-277, score-1.112]
82 Table 3: Performance on opinion relation extraction using the overlap metric. [sent-287, score-1.064]
83 • RE: Predicts opinion relations by employing the opinion-arg relation classifier and the opinion-implicit-arg classifier. [sent-288, score-0.933]
84 First, the opinion-arg classifier identifies pairs of opinion and argument candidates that form valid opinion relations, and then the opinion-implicit-arg classifier is used on the remaining opinion candidates to further identify opinion expressions without explicit arguments. [sent-289, score-3.799]
85 We report results using opinion entity candidates from the best CRF output and from the merged 10-best CRF output. [sent-290, score-1.018]
86 5 Results Table 2 shows the results of opinion entity identification using both overlap and exact metrics. [sent-292, score-1.035]
87 We can see that our joint inference approach significantly outperforms all the baselines in F1 measure on extracting all types of opinion entities. [sent-294, score-0.994]
88 CRF+Syn and CRF+Adj provide the same performance as CRF, since the relation extraction step only affects the results of opinion arguments. [sent-300, score-1.017]
89 By using binary classifiers to predict relations, CRF+RE produces high precision on opinion and target extraction but also results in very low recall. [sent-303, score-0.942]
90 Table 3 shows the results of opinion relation extraction using the overlap metric. [sent-306, score-1.064]
91 The objective function of ILP-W/O-ENTITY can be represented as argmax_u Σ_k Σ_{i ∈ O} Σ_{j ∈ Ak} rij uij (2), which is subject to constraints on uij to enforce that relations do not overlap and to limit the maximum number of relations that can be extracted for each opinion expression and each argument. [sent-323, score-1.197]
92 For ILP-W-SINGLE-RE, we simply remove the variables associated with one opinion relation in the objective function (1) and constraints. [sent-324, score-1.004]
93 (2006) that includes opinion targets and uses a simpler ILP formulation with only one parameter and fewer binary variables and constraints to represent entity label assignments. [sent-327, score-1.153]
94 (Footnote 11: we compared the proposed ILP formulation with the ILP of Choi et al.) Table 4 shows the results of these methods on opinion relation extraction. [sent-328, score-1.008]
95 The results demonstrate that incorporating knowledge of implicit opinion relations is important. [sent-333, score-0.996]
96 Analyzing the errors, we found that the joint model extracts a comparable number of opinion entities to the gold standard, while the CRF-based baselines extract significantly fewer opinion entities (around 60% of the number of entities in the gold standard). [sent-335, score-1.933]
97 Recall that the joint model finds the globally optimal solution over a set of opinion entity and relation candidates, which are obtained from the n-best CRF predictions and constituents in the parse tree that satisfy certain syntactic patterns. [sent-338, score-1.057]
98 (2006) on extracting opinion holders, opinion expressions and IS-FROM relations, and showed that the proposed ILP formulation performs better on all three extraction tasks. [sent-343, score-1.989]
99 7 Conclusion In this paper we propose a joint inference approach for extracting opinion-related entities and opinion relations. [sent-355, score-1.02]
100 Joint extraction of entities and relations for opinion recognition. [sent-382, score-1.049]
wordName wordTfidf (topN-words)
[('opinion', 0.857), ('vik', 0.153), ('holders', 0.14), ('expressions', 0.121), ('ilp', 0.109), ('crf', 0.103), ('targets', 0.101), ('relation', 0.099), ('argument', 0.095), ('choi', 0.077), ('entity', 0.076), ('relations', 0.076), ('candidate', 0.075), ('candidates', 0.066), ('implicit', 0.063), ('extraction', 0.061), ('expression', 0.061), ('aik', 0.059), ('uij', 0.059), ('xjz', 0.059), ('spans', 0.056), ('entities', 0.055), ('arguments', 0.052), ('formulation', 0.052), ('bik', 0.048), ('overlap', 0.047), ('johansson', 0.045), ('wiebe', 0.043), ('inference', 0.042), ('ccomp', 0.042), ('extracting', 0.041), ('span', 0.04), ('opinions', 0.04), ('predictors', 0.039), ('potentials', 0.036), ('iz', 0.035), ('holder', 0.034), ('pipeline', 0.034), ('cardie', 0.033), ('breck', 0.033), ('jointly', 0.032), ('xid', 0.031), ('identification', 0.031), ('qiu', 0.031), ('subjectivity', 0.03), ('baselines', 0.029), ('linking', 0.029), ('mpqa', 0.029), ('roth', 0.028), ('yb', 0.027), ('kobayashi', 0.027), ('nsubj', 0.027), ('attitude', 0.026), ('punyakanok', 0.026), ('moschitti', 0.026), ('joint', 0.025), ('extractor', 0.025), ('associated', 0.025), ('exact', 0.024), ('deepest', 0.024), ('binary', 0.024), ('cjk', 0.024), ('djk', 0.024), ('enderly', 0.024), ('fiz', 0.024), ('irked', 0.024), ('isfrom', 0.024), ('jco', 0.024), ('xiz', 0.024), ('xxiz', 0.024), ('variables', 0.023), ('ruppenhofer', 0.023), ('identify', 0.023), ('constituent', 0.023), ('wilson', 0.023), ('sentiment', 0.023), ('pipelined', 0.022), ('syn', 0.022), ('yih', 0.022), ('jx', 0.022), ('assignment', 0.022), ('identifying', 0.022), ('ya', 0.022), ('enforce', 0.021), ('constraint', 0.021), ('consistency', 0.02), ('ej', 0.02), ('label', 0.02), ('potential', 0.02), ('adj', 0.02), ('subjective', 0.019), ('glpk', 0.019), ('merged', 0.019), ('ak', 0.019), ('liu', 0.019), ('dependency', 0.019), ('iand', 0.019), ('labeling', 0.018), ('patterns', 0.018), ('srl', 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction
2 0.76747078 244 acl-2013-Mining Opinion Words and Opinion Targets in a Two-Stage Framework
Author: Liheng Xu ; Kang Liu ; Siwei Lai ; Yubo Chen ; Jun Zhao
Abstract: This paper proposes a novel two-stage method for mining opinion words and opinion targets. In the first stage, we propose a Sentiment Graph Walking algorithm, which naturally incorporates syntactic patterns in a Sentiment Graph to extract opinion word/target candidates. Then random walking is employed to estimate confidence of candidates, which improves extraction accuracy by considering confidence of patterns. In the second stage, we adopt a self-learning strategy to refine the results from the first stage, especially for filtering out high-frequency noise terms and capturing the long-tail terms, which are not investigated by previous methods. The experimental results on three real world datasets demonstrate the effectiveness of our approach compared with state-of-the-art unsupervised methods.
3 0.66660267 336 acl-2013-Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
Author: Kang Liu ; Liheng Xu ; Jun Zhao
Abstract: Mining opinion targets is a fundamental and important task for opinion mining from online reviews. To this end, there are usually two kinds of methods: syntax based and alignment based methods. Syntax based methods usually exploited syntactic patterns to extract opinion targets, which were however prone to suffer from parsing errors when dealing with online informal texts. In contrast, alignment based methods used a word alignment model to fulfill this task, which could avoid parsing errors without using parsing. However, there is no research focusing on which kind of method is better when given a certain amount of reviews. To fill this gap, this paper empirically studies how the performance of these two kinds of methods varies when changing the size, domain and language of the corpus. We further combine syntactic patterns with an alignment model by using a partially supervised framework and investigate whether this combination is useful or not. In our experiments, we verify that our combination is effective on the corpus with small and medium size.
4 0.40225732 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
Author: Amjad Abu-Jbara ; Ben King ; Mona Diab ; Dragomir Radev
Abstract: In this paper, we use Arabic natural language processing techniques to analyze Arabic debates. The goal is to identify how the participants in a discussion split into subgroups with contrasting opinions. The members of each subgroup share the same opinion with respect to the discussion topic and an opposing opinion to the members of other subgroups. We use opinion mining techniques to identify opinion expressions and determine their polarities and their targets. We use opinion predictions to represent the discussion in one of two formal representations: a signed attitude network or a space of attitude vectors. We identify opinion subgroups by partitioning the signed network representation or by clustering the vector space representation. We evaluate the system using a data set of labeled discussions and show that it achieves good results.
5 0.19227009 67 acl-2013-Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
Author: Roman Klinger ; Philipp Cimiano
Abstract: Opinion mining is often regarded as a classification or segmentation task, involving the prediction of i) subjective expressions, ii) their target and iii) their polarity. Intuitively, these three variables are bidirectionally interdependent, but most work has either attempted to predict them in isolation or proposed pipeline-based approaches that cannot model the bidirectional interaction between these variables. Towards better understanding the interaction between these variables, we propose a model that allows for analyzing the relation of target and subjective phrases in both directions, thus providing an upper bound for the impact of a joint model in comparison to a pipeline model. We report results on two public datasets (cameras and cars), showing that our model outperforms state-of-the-art models, as well as on a new dataset consisting of Twitter posts.
6 0.18774684 49 acl-2013-An annotated corpus of quoted opinions in news articles
7 0.16024218 114 acl-2013-Detecting Chronic Critics Based on Sentiment Polarity and Userâ•Žs Behavior in Social Media
8 0.12747343 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
9 0.10479031 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
10 0.091702446 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
11 0.085110538 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction
12 0.084922664 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
13 0.080642365 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities
14 0.080042705 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
15 0.079491057 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
16 0.078751609 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
17 0.078552075 121 acl-2013-Discovering User Interactions in Ideological Discussions
18 0.077025093 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling
19 0.072422184 318 acl-2013-Sentiment Relevance
20 0.068757311 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization
topicId topicWeight
[(0, 0.183), (1, 0.274), (2, -0.097), (3, 0.273), (4, -0.189), (5, 0.326), (6, -0.464), (7, -0.205), (8, -0.245), (9, -0.148), (10, -0.262), (11, 0.1), (12, 0.027), (13, 0.009), (14, -0.005), (15, -0.032), (16, -0.019), (17, -0.007), (18, -0.012), (19, -0.033), (20, 0.046), (21, -0.009), (22, -0.004), (23, -0.026), (24, -0.029), (25, -0.006), (26, 0.047), (27, 0.023), (28, -0.006), (29, 0.038), (30, 0.037), (31, -0.019), (32, -0.003), (33, 0.036), (34, -0.041), (35, -0.021), (36, -0.016), (37, -0.011), (38, 0.04), (39, 0.001), (40, -0.027), (41, -0.018), (42, -0.017), (43, 0.027), (44, -0.049), (45, 0.012), (46, 0.037), (47, 0.04), (48, 0.014), (49, -0.001)]
simIndex simValue paperId paperTitle
same-paper 1 0.98812783 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction
2 0.98178619 244 acl-2013-Mining Opinion Words and Opinion Targets in a Two-Stage Framework
3 0.95469147 336 acl-2013-Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
4 0.75412869 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions
Author: Amjad Abu-Jbara ; Ben King ; Mona Diab ; Dragomir Radev
Abstract: In this paper, we use Arabic natural language processing techniques to analyze Arabic debates. The goal is to identify how the participants in a discussion split into subgroups with contrasting opinions. The members of each subgroup share the same opinion with respect to the discussion topic and an opposing opinion to the members of other subgroups. We use opinion mining techniques to identify opinion expressions and determine their polarities and their targets. We use these opinion predictions to represent the discussion in one of two formal representations: a signed attitude network or a space of attitude vectors. We identify opinion subgroups by partitioning the signed network representation or by clustering the vector space representation. We evaluate the system using a data set of labeled discussions and show that it achieves good results.
5 0.60532349 49 acl-2013-An annotated corpus of quoted opinions in news articles
Author: Tim O'Keefe ; James R. Curran ; Peter Ashwell ; Irena Koprinska
Abstract: Quotes are used in news articles as evidence of a person’s opinion, and thus are a useful target for opinion mining. However, labelling each quote with a polarity score directed at a textually-anchored target can ignore the broader issue that the speaker is commenting on. We address this by instead labelling quotes as supporting or opposing a clear expression of a point of view on a topic, called a position statement. Using this we construct a corpus covering 7 topics with 2,228 quotes.
6 0.58756346 67 acl-2013-Bi-directional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model
7 0.55234438 114 acl-2013-Detecting Chronic Critics Based on Sentiment Polarity and User's Behavior in Social Media
8 0.30729359 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting
9 0.25404748 151 acl-2013-Extra-Linguistic Constraints on Stance Recognition in Ideological Debates
10 0.25228819 178 acl-2013-HEADY: News headline abstraction through event pattern clustering
11 0.24955465 242 acl-2013-Mining Equivalent Relations from Linked Data
12 0.24900994 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
13 0.24081193 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis
14 0.22161178 365 acl-2013-Understanding Tables in Context Using Standard NLP Toolkits
15 0.21645507 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
16 0.21440066 350 acl-2013-TopicSpam: a Topic-Model based approach for spam detection
17 0.21323159 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities
18 0.2123002 33 acl-2013-A user-centric model of voting intention from Social Media
19 0.20631741 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset
20 0.2056344 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
topicId topicWeight
[(0, 0.048), (6, 0.05), (11, 0.077), (15, 0.022), (24, 0.075), (26, 0.05), (35, 0.084), (42, 0.069), (48, 0.046), (70, 0.027), (83, 0.117), (88, 0.036), (90, 0.028), (95, 0.147)]
simIndex simValue paperId paperTitle
1 0.93468601 90 acl-2013-Conditional Random Fields for Responsive Surface Realisation using Global Features
Author: Nina Dethlefs ; Helen Hastie ; Heriberto Cuayahuitl ; Oliver Lemon
Abstract: Surface realisers in spoken dialogue systems need to be more responsive than conventional surface realisers. They need to be sensitive to the utterance context as well as robust to partial or changing generator inputs. We formulate surface realisation as a sequence labelling task and combine the use of conditional random fields (CRFs) with semantic trees. Due to their extended notion of context, CRFs are able to take the global utterance context into account and are less constrained by local features than other realisers. This leads to more natural and less repetitive surface realisation. It also allows generation from partial and modified inputs and is therefore applicable to incremental surface realisation. Results from a human rating study confirm that users are sensitive to this extended notion of context and assign ratings that are significantly higher (up to 14%) than those for taking only local context into account.
2 0.92542142 202 acl-2013-Is a 204 cm Man Tall or Small ? Acquisition of Numerical Common Sense from the Web
Author: Katsuma Narisawa ; Yotaro Watanabe ; Junta Mizuno ; Naoaki Okazaki ; Kentaro Inui
Abstract: This paper presents novel methods for modeling numerical common sense: the ability to infer whether a given number (e.g., three billion) is large, small, or normal for a given context (e.g., number of people facing a water shortage). We first discuss the necessity of numerical common sense in solving textual entailment problems. We explore two approaches for acquiring numerical common sense. Both approaches start with extracting numerical expressions and their context from the Web. One approach estimates the distribution of numbers co-occurring within a context and examines whether a given value is large, small, or normal, based on the distribution. Another approach utilizes textual patterns with which speakers explicitly express their judgment about the value of a numerical expression. Experimental results demonstrate the effectiveness of both approaches.
same-paper 3 0.92424846 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction
Author: Bishan Yang ; Claire Cardie
Abstract: This paper addresses the task of fine-grained opinion extraction: the identification of opinion-related entities (the opinion expressions, the opinion holders, and the targets of the opinions) and the relations between opinion expressions and their targets and holders. Most existing approaches tackle the extraction of opinion entities and opinion relations in a pipelined manner, where the interdependencies among different extraction stages are not captured. We propose a joint inference model that leverages knowledge from predictors that optimize subtasks of opinion extraction, and seeks a globally optimal solution. Experimental results demonstrate that our joint inference approach significantly outperforms traditional pipeline methods and baselines that tackle subtasks in isolation for the problem of opinion extraction.
4 0.88887131 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
Author: Olivier Ferret
Abstract: Distributional thesauri are now widely used in a large number of Natural Language Processing tasks. However, they are far from containing only interesting semantic relations. As a consequence, improving such thesauri is an important issue that is mainly tackled indirectly through the improvement of semantic similarity measures. In this article, we propose a more direct approach focusing on the identification of the neighbors of a thesaurus entry that are not semantically linked to this entry. This identification relies on a discriminative classifier trained from examples selected in an unsupervised way to build a distributional model of the entry in texts. Its bad neighbors are found by applying this classifier to a representative set of occurrences of each of these neighbors. We evaluate the interest of this method for a large set of English nouns with various frequencies.
5 0.87918997 240 acl-2013-Microblogs as Parallel Corpora
Author: Wang Ling ; Guang Xiang ; Chris Dyer ; Alan Black ; Isabel Trancoso
Abstract: In the ever-expanding sea of microblog data, there is a surprising amount of naturally occurring parallel text: some users post multilingual messages targeting international audiences while others "retweet" translations. We present an efficient method for detecting these messages and extracting parallel segments from them. We have been able to extract over 1M Chinese-English parallel segments from Sina Weibo (the Chinese counterpart of Twitter) using only their public APIs. As a supplement to existing parallel training data, our automatically extracted parallel data yields substantial translation quality improvements in translating microblog text and modest improvements in translating edited news commentary. The resources described in this paper are available at http://www.cs.cmu.edu/~lingwang/utopia.
6 0.87005949 120 acl-2013-Dirt Cheap Web-Scale Parallel Text from the Common Crawl
7 0.86928427 97 acl-2013-Cross-lingual Projections between Languages from Different Families
8 0.86561918 267 acl-2013-PARMA: A Predicate Argument Aligner
9 0.86513835 25 acl-2013-A Tightly-coupled Unsupervised Clustering and Bilingual Alignment Model for Transliteration
10 0.86394501 154 acl-2013-Extracting bilingual terminologies from comparable corpora
11 0.86115217 255 acl-2013-Name-aware Machine Translation
12 0.85999686 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation
13 0.85776895 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
14 0.85616159 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
15 0.85559118 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
16 0.85544646 326 acl-2013-Social Text Normalization using Contextual Graph Random Walks
17 0.85392177 288 acl-2013-Punctuation Prediction with Transition-based Parsing
18 0.85286355 383 acl-2013-Vector Space Model for Adaptation in Statistical Machine Translation
19 0.8522349 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art
20 0.85169876 8 acl-2013-A Learner Corpus-based Approach to Verb Suggestion for ESL