emnlp emnlp2011 emnlp2011-104 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online news service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
Reference: text
sentIndex sentText sentNum sentScore
1 (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online news service. [sent-5, score-0.813]
2 In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. [sent-7, score-0.307]
3 For each user, our task is to rank the comments associated with a given article according to personalized user preference (i. [sent-8, score-0.624]
4 , whether the user is likely to like or dislike the comment). [sent-10, score-0.294]
5 Many of them are user reviews: a best-seller or a popular restaurant can get over 1000 reviews on top review sites like Amazon or Yelp. [sent-14, score-0.343]
6 A large quantity of them also comes in the form of user comments on blogs or news articles. [sent-15, score-0.528]
7 Most notably, during the short period of time for which a major event is active, news stories on one single event can easily attract over ten thousand comments on a popular online news site like Yahoo! [sent-16, score-0.373]
8 A related line of research looked into predicting helpfulness of reviews in the hope of promoting those with better quality, where helpfulness is usually defined as some function over the percentage of users who found the review to be helpful (Kim et al. [sent-21, score-0.48]
9 The other paradigm is recommendation: based on what users have liked or disliked in the past, the system will automatically recommend new items. [sent-31, score-0.35]
10 Can we provide similar recommendation mechanisms to help users consume large quantities of subjective information? [sent-34, score-0.377]
11 Many commenting environments allow users to mark “like” or “dislike” over existing comments (e. [sent-35, score-0.381]
12 Can we learn from users’ past preferences, so that when a user is reading a new article, we have a system that automatically ranks its comments according to their likelihood of being liked by the user? [sent-39, score-0.592]
13 This can be used directly to create personalized presentation of comments (e. [sent-40, score-0.339]
14 In our case, most comments for an article a user is reading are already of interest to that user topically. [sent-49, score-0.745]
15 Which ones the user ends up liking may depend on several non-topical aspects of the text: whether the user agrees with the viewpoint expressed in the comment, whether the comment is convincing and well-written, etc. [sent-50, score-0.837]
16 However, the difficulty in analyzing the textual information in comments can be alleviated by additional contextual information such as author identities. [sent-52, score-0.398]
17 If, between a pair of users, one consistently likes or dislikes the other, then at least for heavy users this authorship information alone could be highly informative. [sent-53, score-0.382]
18 In this paper, we present a principled way of utilizing multiple sources of information for the task of recommending user comments, which significantly outperforms strong baseline methods, as well as previous methods proposed for text recommendation. [sent-55, score-0.335]
19 While using authorship information alone tends to provide stronger signal than using textual information alone, to our surprise, even for heavy users, adding textual information to the authorship information yields additional improvements. [sent-56, score-0.463]
20 Classical approaches in collaborative filtering are based on item-item/user-user similarity; these are nearest-neighbor methods where the response for a user-item pair is predicted based on a local neighborhood mean (Sarwar et al. [sent-60, score-0.42]
21 Generalizations of matrix factorization to include both features and past ratings have been proposed (Agarwal and Chen, 2009; Stern et al. [sent-67, score-0.393]
22 The approach in this paper is an extension where in addition to interactions among users and items (comments in our case), we also consider the authorship information. [sent-69, score-0.331]
23 Three-way interactions were recently studied for personalized tag recommendation (Rendle and Schmidt-Thieme, 2010). [sent-70, score-0.313]
24 For instance, Billsus and Pazzani (2007) describe an approach to build user profile models for adaptive personalization in the context of mobile content access. [sent-75, score-0.287]
25 (2007) where text processing techniques are used to build content profiles for users to recommend personalized news. [sent-78, score-0.344]
26 (2007); cold-start for new items/users was not their focus, but is important for our task: candidate comments for recommendation are often not in the training data. [sent-81, score-0.326]
27 Related to that, people have looked into predicting review helpfulness given the textual information in reviews, where helpfulness is either defined as the percentage of users who have voted the review to be helpful (Kim et al. [sent-85, score-0.579]
28 Our goal differs in that we look for personalized ranking (what a specific user might like) rather than generic quality (what an average user might like). [sent-88, score-0.641]
29 For instance, whether the author has used his/her true name or where the user is from (Danescu-Niculescu-Mizil et al. [sent-90, score-0.366]
30 As discussed in Section 1, whether a rater likes a comment or not may depend on whether they agree with the viewpoint expressed in the text and on the quality of the text. [sent-94, score-0.677]
31 In our setting, our labels come in the form of whether users liked or disliked a previous comment. [sent-102, score-0.314]
32 , liked or disliked by the same rater), which would yield a different learning problem akin to the metric learning problem; note, however, the complication that two pieces of text receiving different labels from a given user might not necessarily contain contrasting viewpoints. [sent-105, score-0.393]
33 3 Method. In this section, we describe our model, which predicts rater affinity to comments. [sent-107, score-0.39]
34 1 Model. Notation: Let yij denote the rating that user i, called the rater, gives to comment j. [sent-112, score-0.681]
35 Since, throughout, we use suffix i to denote a rater and suffix j to denote a comment, we slightly abuse notation and let xi (of dimension pu) and xj (of dimension pc) denote the feature vectors of user i and comment j, respectively. [sent-113, score-1.124]
36 For example, xi can be the bag-of-words representation (a sparse vector) inferred through text analysis on comments voted positively by user i in the past, and xj can be the bag-of-words representation for comment j. [sent-114, score-0.854]
37 We use a(j) to denote the author of comment j, and use µij to denote the mean rating by rater i on comment j, i. [sent-115, score-1.124]
38 Of course, it is impossible to estimate µij empirically since each user i usually rates a comment j at most once. [sent-118, score-0.547]
39 In our case, we use the following factors: (a) user factor vi of dimension rv (≥ 1) to model rater-author affinity; (b) user factor ui and comment factor cj of dimension ru (≥ 1) to model rater-comment affinity. [sent-127, score-1.393]
40 Intuitively, each could represent viewpoints of users or comments along different dimensions. [sent-128, score-0.289]
41 Affinity of rater i to comment j by author a(j) is captured by (1) similarity between the viewpoints of users i and a(j), measured by vi′va(j); and (2) similarity between the preferences of user i and the perspectives reflected in comment j, measured by ui′cj. [sent-129, score-1.712]
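To make the scoring rule concrete, here is a minimal sketch of the prediction step, assuming the full model combines these two interaction terms with the bias terms αi and βj used by the matrix factorization variant below; the randomly initialized parameters stand in for fitted ones, and all names and dimensions are illustrative assumptions rather than the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_comments, r_u, r_v = 1000, 200, 5, 5

# Illustrative parameters; in the paper these are estimated from training ratings.
alpha = rng.normal(size=n_users)                       # rater bias alpha_i
beta = rng.normal(size=n_comments)                     # comment bias beta_j
u = rng.normal(size=(n_users, r_u))                    # rater factors u_i
c = rng.normal(size=(n_comments, r_u))                 # comment factors c_j
v = rng.normal(size=(n_users, r_v))                    # rater-author factors v_i
author_of = rng.integers(0, n_users, size=n_comments)  # a(j): author of comment j

def predict_affinity(i: int, j: int) -> float:
    """h(mu_ij) = alpha_i + beta_j + u_i' c_j + v_i' v_a(j)."""
    a = author_of[j]
    return alpha[i] + beta[j] + u[i] @ c[j] + v[i] @ v[a]
```

Ranking the comments of an article for rater i then amounts to sorting them by predict_affinity(i, j).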
42 For instance, if a comment is rated by only one user and ru > 1, the model is clearly overparametrized and the MLE of the comment factor would tend to learn idiosyncrasies in the training data. [sent-138, score-0.986]
43 For instance, to estimate latent factors of a user with little data, we provide a backoff estimate that is obtained by pooling data across users with the same user features. [sent-143, score-0.816]
44 αi ∼ N(g′xi, σα2), βj ∼ N(d′xj, σβ2), γa ∼ N(0, σγ2), ui ∼ N(Gxi, σu2), cj ∼ N(Dxj, σc2), vi ∼ N(0, σv2), where g (pu × 1) and d (pc × 1) are regression weight vectors, and G (ru × pu) and D (ru × pc) are regression weight matrices. [sent-145, score-0.406]
45 If user i has no rating in the training data, ui will be predicted as the prior mean (backoff) Gxi, a linear projection of the feature vector xi through the matrix G learnt from data. [sent-149, score-0.638]
46 However, if user i has many ratings in the training data, we will precisely estimate the per-user residual δi that is not captured by the regression Gxi. [sent-151, score-0.643]
47 For sample sizes in between these two extremes, the per-user residual estimate is “shrunk” toward zero; the amount of shrinkage depends on the sample size, past user ratings, variability in ratings on comments rated by the user, and the values of the variance components σ2. [sent-152, score-1.1]
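As a worked illustration of this shrinkage, consider the simplified scalar case with a known rating variance σ2 and ni ratings for user i; the posterior mean of the per-user residual δi then takes the standard Gaussian form below. This is a textbook identity consistent with the description above, not a formula quoted from the paper.

```latex
% Prior: \delta_i \sim N(0, \sigma_u^2); residuals r_{ij} = y_{ij} - (\text{prior-mean prediction}).
\mathbb{E}[\delta_i \mid \text{data}]
  = \frac{n_i}{\,n_i + \sigma^2/\sigma_u^2\,}\,\bar{r}_i,
\qquad
\bar{r}_i = \frac{1}{n_i}\sum_{j} r_{ij}.
% As n_i \to 0 the estimate backs off to the prior mean (zero residual, i.e., the Gx_i prediction);
% as n_i \to \infty it approaches the per-user empirical residual \bar{r}_i.
```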
48 The matrix factorization model: This model assumes the mean rating of user i on item j is given by h(µij) = αi + βj + ui′cj, and the means of the prior distributions on αi, βj, ui, cj are zero, i. [sent-155, score-0.527]
49 The uc model: This is also a matrix factorization model but with priors based on regressions (i. [sent-160, score-0.34]
50 This is a regression model purely based on features with no per-user or per-comment latent factors. [sent-168, score-0.346]
51 1 Data. We obtained comment rating data between March and May 2010 from Yahoo! [sent-220, score-0.371]
52 On this site, users can post comments on news article pages and rate the comments made by others through thumb-up (positive) or thumb-down (negative) votes. [sent-222, score-0.686]
53 Also, we do not expect deep personalized recommendations for users who have rated very few comments in the past. [sent-224, score-0.574]
54 For instance, a rater with more than 200 ratings in the raw dataset can have fewer than 200 in the experimental dataset due to the removal of certain authors or news articles. [sent-228, score-0.704]
55 The ratings and comments were split into training, tuning, and test sets according to the article they were associated with. [sent-232, score-0.491]
56 It also creates a completely cold-start situation for comments: no comment in the test set has any past rating in the training set. [sent-235, score-0.616]
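A sketch of such an article-level split; the field names and 80/10/10 fractions are assumptions for illustration, but the key property matches the description above: a whole article, and hence all of its comments and ratings, lands in exactly one fold, so every test comment is cold-start.

```python
import random

def split_by_article(ratings, seed=0, frac=(0.8, 0.1, 0.1)):
    """ratings: iterable of dicts with an 'article_id' key (assumed layout)."""
    articles = sorted({r["article_id"] for r in ratings})
    random.Random(seed).shuffle(articles)
    n = len(articles)
    cut1, cut2 = int(frac[0] * n), int((frac[0] + frac[1]) * n)
    fold_of = {a: ("train" if k < cut1 else "tune" if k < cut2 else "test")
               for k, a in enumerate(articles)}
    folds = {"train": [], "tune": [], "test": []}
    for r in ratings:
        folds[fold_of[r["article_id"]]].append(r)  # the article decides the fold
    return folds
```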
57 For a given comment j, xj is its bag-of-words representation, L2-normalized. [sent-242, score-0.363]
58 The rater feature vector xi is created by summing the feature vectors of all comments rated positively by rater i; the sum is then L2-normalized. [sent-245, score-0.645]
59 This is simply Cosine similarity, based on how similar a new comment j is to the comments rater i has liked in the past. [sent-253, score-0.98]
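A sketch of this baseline under the feature construction just described: each comment vector xj is an L2-normalized bag of words, the rater vector xi is the L2-normalized sum of the vectors of comments the rater voted up, and the score is their dot product (cosine similarity, since both are unit length). Whitespace tokenization is an assumed simplification.

```python
from collections import Counter
import math

def l2_normalize(vec):
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def bow(text):
    """x_j: L2-normalized bag-of-words vector of a comment."""
    return l2_normalize(Counter(text.lower().split()))

def rater_vector(liked_comment_texts):
    """x_i: L2-normalized sum of the vectors of comments rater i voted up."""
    total = Counter()
    for text in liked_comment_texts:
        total.update(bow(text))
    return l2_normalize(total)

def cosine_score(x_i, x_j):
    return sum(w * x_j.get(t, 0.0) for t, w in x_i.items())
```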
60 For svm and nb, we use the following backoff: for users whose training data contains only class ci, we predict ci; for users with no training data at all, we predict the majority class, in this case, the positive class. [sent-258, score-0.35]
61 can be more robust over shorter text spans common in user comments, given the high variance. [sent-259, score-0.46]
62 For fair comparisons, for the three baseline methods, we use a simple way of utilizing author information: the feature space is augmented with author IDs and each xj is augmented with a(j). [sent-260, score-0.294]
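A minimal sketch of this augmentation, assuming a one-hot encoding of the author ID appended to the comment's feature vector; the text only says the feature space is augmented with author IDs, so the encoding choice is an assumption.

```python
import numpy as np

def augment_with_author(x_j, author_id, n_words, n_authors):
    """Append a one-hot indicator for a(j) to the comment feature vector x_j."""
    out = np.zeros(n_words + n_authors)
    out[:n_words] = x_j                # original bag-of-words block
    out[n_words + author_id] = 1.0     # author-ID block
    return out
```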
63 The former measures the overall correlation of predicted scores for a method with the observed ratings in the test set, while the latter measures the performance of a hypothetical top-k recommendation scenario using the method. [sent-264, score-0.374]
64 The P@k of a method is computed as follows: (1) For each rater, rank comments that the rater rated in the test set according to the scores predicted by the method, and compute the precision at rank k for that rater; and then (2) average the per-rater precision numbers over all raters. [sent-268, score-0.962]
65 To report P@k, for k = 5, 10, 20, we only use raters who have at least 50 ratings in the test set. [sent-269, score-0.327]
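A sketch of this P@k computation; the data layout is assumed, and min_ratings=50 mirrors the filter just described.

```python
from collections import defaultdict

def precision_at_k(test_ratings, scores, k, min_ratings=50):
    """test_ratings: list of (rater, comment, label), label 1 for a positive vote;
    scores[(rater, comment)]: the method's predicted score (assumed layout)."""
    by_rater = defaultdict(list)
    for rater, comment, label in test_ratings:
        by_rater[rater].append((scores[(rater, comment)], label))
    per_rater = []
    for pairs in by_rater.values():
        if len(pairs) < min_ratings:
            continue                        # only raters with enough test ratings
        pairs.sort(key=lambda p: -p[0])     # rank the rater's test comments by score
        per_rater.append(sum(label for _, label in pairs[:k]) / k)
    return sum(per_rater) / len(per_rater)  # average the per-rater precisions
```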
66 Next, uc outperforms bilinear (significantly in AUC, P@10 and P@20), showing that per-user and per-comment latent factors help. [sent-278, score-0.47]
67 Note that vv outperforms uc in ROC, AUC and P@20, but is worse than uc in P@5 and P@10; we will take a closer look at this later. [sent-279, score-0.709]
68 2 Break-down by user activity level. Next, we investigate model performance in different subsets of the test set. [sent-284, score-0.348]
69 We also generated similar plots with the y-axis replaced by P@5, P@10 and P@20, and observed the same trend except that vv starts to outperform uc at different user activity thresholds for different metrics. [sent-288, score-0.817]
70 The relative order of uc and vv is not consistent across different metrics. [sent-290, score-0.469]
71 Not surprisingly, vv performs poorly for raters or authors with no ratings observed in the training data. [sent-291, score-0.59]
72 However, once we have a small amount of ratings, it starts to outperform uc, even though, intuitively, the textual information in the comment should be more informative than the authorship information alone. [sent-292, score-0.469]
73 This suggests that users’ viewpoints are quite consistent: a large portion of the ratings can be adequately explained by the pair of user identities. [sent-298, score-0.591]
74 One interesting observation is that the number of ratings required for vv to outperform uc in P@5 is quite high. [sent-299, score-0.723]
75 This suggests that to obtain high precision at the top of a recommended list, comment features are important. [sent-300, score-0.293]
76 The x-axis (bottom) has the form m-n, meaning the subset of the test data in which the number of ratings that each author received (as in (a)) or each rater gave (as in (b)) in the training set is between m and n. [sent-303, score-0.769]
77 At significance level 0.05, vv+uc significantly outperforms vv in all metrics if the author received < 500 ratings in the training set. [sent-306, score-0.65]
78 Except for the very heavy authors, even for cases where both raters and authors are heavy users (Figure 2(c)), adding the comment feature information still yields additional improvement over the already impressive performance of using vv alone. [sent-307, score-0.962]
79 In spite of the simple representation we adopted for the textual information, the full model is still capable of accounting for part of the residual errors of the vv model (which uses authorship information alone) by using comment features: what was actually written does matter. [sent-308, score-0.731]
80 Finally, if we break down the comparison between vv+uc and uc for different user activity levels, vv+uc significantly outperforms uc (with level 0.05) in all metrics if the author received at least 5 ratings in the training set. [sent-309, score-0.828] [sent-310, score-0.421]
82 Since the vv model does not utilize rater or comment features, we examine the AUC of the uc model. [sent-317, score-1.11]
83 Note that our full model does not require rater features and comment features to be in the same feature space. [sent-324, score-0.641]
84 For simplicity and easy comparison to other methods, we used all comments liked by a rater in the past to build the feature vector of the rater. [sent-326, score-0.686]
85 But since the full model already has information about the textual content of comments from the comment features, and about which comments were liked by the users from the ratings, rater features constructed this way do not provide any new information. [sent-327, score-1.401]
86 Indeed, if we model ui ∼ N(1, σu2) instead of ui ∼ N(Gxi, σu2), this omission of xi does not hurt the performance of the model. [sent-328, score-0.343]
87 In future work, other meta-information about the rater can easily be incorporated into xi to enrich the rater representation. (Footnote 3: we used the n most useful features in each case.) [sent-329, score-0.348] [sent-330, score-0.379]
89 Recall that comment features xj were projected to comment factors cj via D. [sent-331, score-0.813]
90 We envisioned that the comment factors could represent viewpoints. [sent-332, score-0.355]
91 If ui and cj are of the same sign, then the rater is likely to like the comment. [sent-336, score-0.599]
92 While we do not have direct labels for perspectives, our model seems to be capturing the underlying perspectives (as much as a unigram-based model could) by learning from user preference labels across different users. [sent-345, score-0.317]
93 5 Conclusions. In this paper, we promote personalized recommendation as a novel way of helping users to consume large quantities of subjective information. [sent-347, score-0.51]
94 In particular, learning weights over textual features across all users outperforms learning for each user individually, which holds true even for heavy raters. [sent-350, score-0.588]
95 Furthermore, while using authorship information alone provides stronger signal than using textual information alone, to our surprise, even for heavy users, adding textual information yields additional improvements. [sent-351, score-0.367]
96 It is difficult to comprehensively capture user affinity to comments using a finite number of ratings observed during a certain time interval. [sent-352, score-0.756]
97 News and comments on news articles are dynamic in nature; novel aspects may emerge over time. [sent-353, score-0.325]
98 To capture such dynamic behavior, comment factors have to be allowed to evolve over time and such an evolution would also necessitate the re-estimation of user factors. [sent-354, score-0.609]
99 Open user profiles for adaptive news systems: help or harm? [sent-366, score-0.322]
100 A joint model of text and aspect ratings for sentiment summarization. [sent-499, score-0.292]
wordName wordTfidf (topN-words)
[('rater', 0.348), ('comment', 0.293), ('ratings', 0.254), ('user', 0.254), ('uc', 0.24), ('vv', 0.229), ('comments', 0.206), ('users', 0.175), ('ij', 0.162), ('ui', 0.156), ('auc', 0.156), ('bilinear', 0.138), ('personalized', 0.133), ('recommendation', 0.12), ('collaborative', 0.117), ('author', 0.112), ('helpfulness', 0.108), ('authorship', 0.096), ('cj', 0.095), ('activity', 0.094), ('liked', 0.093), ('viewpoints', 0.083), ('textual', 0.08), ('heavy', 0.079), ('rating', 0.078), ('raters', 0.073), ('roc', 0.072), ('xj', 0.07), ('news', 0.068), ('agarwal', 0.066), ('perspectives', 0.063), ('factors', 0.062), ('regression', 0.062), ('gxi', 0.062), ('rated', 0.06), ('rv', 0.06), ('interactions', 0.06), ('political', 0.059), ('ru', 0.056), ('yij', 0.056), ('received', 0.055), ('review', 0.054), ('consume', 0.053), ('factorization', 0.052), ('articles', 0.051), ('dimension', 0.049), ('filtering', 0.049), ('matrix', 0.048), ('recommending', 0.048), ('disliked', 0.046), ('www', 0.045), ('em', 0.045), ('monte', 0.045), ('affinity', 0.042), ('backoff', 0.041), ('recommender', 0.04), ('mullen', 0.04), ('ihas', 0.04), ('dislike', 0.04), ('ppi', 0.04), ('past', 0.039), ('sentiment', 0.038), ('pang', 0.037), ('gaussian', 0.036), ('recommend', 0.036), ('salakhutdinov', 0.036), ('viewpoint', 0.036), ('reviews', 0.035), ('authors', 0.034), ('residual', 0.033), ('personalization', 0.033), ('principled', 0.033), ('carlo', 0.033), ('interaction', 0.033), ('yahoo', 0.032), ('alone', 0.032), ('pc', 0.031), ('bell', 0.031), ('nb', 0.031), ('ahn', 0.031), ('learnt', 0.031), ('somasundaran', 0.031), ('xi', 0.031), ('vi', 0.031), ('article', 0.031), ('preferences', 0.031), ('attract', 0.031), ('billsus', 0.031), ('congressional', 0.031), ('koren', 0.031), ('laver', 0.031), ('nobama', 0.031), ('sarwar', 0.031), ('stern', 0.031), ('latent', 0.03), ('factor', 0.03), ('iand', 0.03), ('opinion', 0.029), ('subjective', 0.029), ('log', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999887 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online news service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
Author: Rui Yan ; Jian-Yun Nie ; Xiaoming Li
Abstract: Most traditional summarization methods treat their outputs as static and plain texts, which fail to capture user interests during summarization because the generated summaries are the same for different users. However, users have individual preferences on a particular source document collection and obviously a universal summary for all users might not always be satisfactory. Hence we investigate an important and challenging problem in summary generation, i.e., Interactive Personalized Summarization (IPS), which generates summaries in an interactive and personalized manner. Given the source documents, IPS captures user interests by enabling interactive clicks and incorporates personalization by modeling captured reader preference. We develop experimental systems to compare 5 rival algorithms on 4 instinctively different datasets which amount to 5197 documents. Evaluation results in ROUGE metrics indicate the comparable performance between IPS and the best competing system but IPS produces summaries with much more user satisfaction according to evaluator ratings. Besides, low ROUGE consistency among these user preferred summaries indicates the existence of personalization.
3 0.1178434 41 emnlp-2011-Discriminating Gender on Twitter
Author: John D. Burger ; John Henderson ; George Kim ; Guido Zarrella
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.
4 0.11016963 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
Author: Burr Settles
Abstract: This paper describes DUALIST, an active learning annotation paradigm which solicits and learns from labels on both features (e.g., words) and instances (e.g., documents). We present a novel semi-supervised training algorithm developed for this setting, which is (1) fast enough to support real-time interactive speeds, and (2) at least as accurate as preexisting methods for learning with mixed feature and instance labels. Human annotators in user studies were able to produce near-state-of-the-art classifiers—on several corpora in a variety of application domains—with only a few minutes of effort.
5 0.10368679 24 emnlp-2011-Bootstrapping Semantic Parsers from Conversations
Author: Yoav Artzi ; Luke Zettlemoyer
Abstract: Conversations provide rich opportunities for interactive, continuous learning. When something goes wrong, a system can ask for clarification, rewording, or otherwise redirect the interaction to achieve its goals. In this paper, we present an approach for using conversational interactions of this type to induce semantic parsers. We demonstrate learning without any explicit annotation of the meanings of user utterances. Instead, we model meaning with latent variables, and introduce a loss function to measure how well potential meanings match the conversation. This loss drives the overall learning approach, which induces a weighted CCG grammar that could be used to automatically bootstrap the semantic analysis component in a complete dialog system. Experiments on DARPA Communicator conversational logs demonstrate effective learning, despite requiring no explicit meaning annotations.
6 0.092056051 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
7 0.073049642 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
8 0.071803428 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
9 0.071548074 88 emnlp-2011-Linear Text Segmentation Using Affinity Propagation
10 0.069232196 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
11 0.06721808 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
12 0.065250821 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
13 0.062125079 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
14 0.061111484 3 emnlp-2011-A Correction Model for Word Alignments
15 0.058300357 105 emnlp-2011-Predicting Thread Discourse Structure over Technical Web Forums
16 0.056763321 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking
17 0.056109618 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context
18 0.055178758 106 emnlp-2011-Predicting a Scientific Communitys Response to an Article
19 0.054396953 43 emnlp-2011-Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews
20 0.053079516 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
topicId topicWeight
[(0, 0.179), (1, -0.152), (2, 0.078), (3, -0.031), (4, 0.066), (5, 0.037), (6, -0.026), (7, -0.038), (8, -0.015), (9, -0.004), (10, -0.099), (11, -0.176), (12, -0.099), (13, -0.026), (14, 0.105), (15, 0.051), (16, -0.049), (17, 0.049), (18, 0.004), (19, 0.099), (20, 0.043), (21, 0.016), (22, -0.202), (23, 0.009), (24, 0.08), (25, 0.126), (26, 0.079), (27, -0.008), (28, 0.056), (29, 0.125), (30, 0.055), (31, -0.042), (32, 0.025), (33, -0.034), (34, -0.073), (35, 0.117), (36, -0.25), (37, -0.144), (38, 0.102), (39, -0.043), (40, 0.037), (41, 0.251), (42, -0.02), (43, -0.194), (44, 0.024), (45, -0.09), (46, -0.002), (47, -0.058), (48, 0.058), (49, -0.083)]
simIndex simValue paperId paperTitle
same-paper 1 0.95663363 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online news service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
2 0.50560749 24 emnlp-2011-Bootstrapping Semantic Parsers from Conversations
Author: Yoav Artzi ; Luke Zettlemoyer
Abstract: Conversations provide rich opportunities for interactive, continuous learning. When something goes wrong, a system can ask for clarification, rewording, or otherwise redirect the interaction to achieve its goals. In this paper, we present an approach for using conversational interactions of this type to induce semantic parsers. We demonstrate learning without any explicit annotation of the meanings of user utterances. Instead, we model meaning with latent variables, and introduce a loss function to measure how well potential meanings match the conversation. This loss drives the overall learning approach, which induces a weighted CCG grammar that could be used to automatically bootstrap the semantic analysis component in a complete dialog system. Experiments on DARPA Communicator conversational logs demonstrate effective learning, despite requiring no explicit meaning annotations.
Author: Rui Yan ; Jian-Yun Nie ; Xiaoming Li
Abstract: Most traditional summarization methods treat their outputs as static and plain texts, which fail to capture user interests during summarization because the generated summaries are the same for different users. However, users have individual preferences on a particular source document collection and obviously a universal summary for all users might not always be satisfactory. Hence we investigate an important and challenging problem in summary generation, i.e., Interactive Personalized Summarization (IPS), which generates summaries in an interactive and personalized manner. Given the source documents, IPS captures user interests by enabling interactive clicks and incorporates personalization by modeling captured reader preference. We develop experimental systems to compare 5 rival algorithms on 4 instinctively different datasets which amount to 5197 documents. Evaluation results in ROUGE metrics indicate the comparable performance between IPS and the best competing system but IPS produces summaries with much more user satisfaction according to evaluator ratings. Besides, low ROUGE consistency among these user preferred summaries indicates the existence of personalization.
4 0.44860664 41 emnlp-2011-Discriminating Gender on Twitter
Author: John D. Burger ; John Henderson ; George Kim ; Guido Zarrella
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.
5 0.44369447 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
Author: Peter Turney ; Yair Neuman ; Dan Assaf ; Yohai Cohen
Abstract: Metaphor is ubiquitous in text, even in highly technical text. Correct inference about textual entailment requires computers to distinguish the literal and metaphorical senses of a word. Past work has treated this problem as a classical word sense disambiguation task. In this paper, we take a new approach, based on research in cognitive linguistics that views metaphor as a method for transferring knowledge from a familiar, well-understood, or concrete domain to an unfamiliar, less understood, or more abstract domain. This view leads to the hypothesis that metaphorical word usage is correlated with the degree of abstractness of the word’s context. We introduce an algorithm that uses this hypothesis to classify a word sense in a given context as either literal (denotative) or metaphorical (connotative). We evaluate this algorithm with a set of adjective-noun phrases (e.g., in dark comedy, the adjective dark is used metaphorically; in dark hair, it is used literally) and with the TroFi (Trope Finder) Example Base of literal and nonliteral usage for fifty verbs. We achieve state-of-the-art performance on both datasets.
6 0.37647426 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
7 0.37362042 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
8 0.35853398 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
9 0.33143252 106 emnlp-2011-Predicting a Scientific Communitys Response to an Article
10 0.32781914 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
11 0.30911931 133 emnlp-2011-The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources
12 0.26964203 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge
13 0.26090237 88 emnlp-2011-Linear Text Segmentation Using Affinity Propagation
14 0.25990888 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
15 0.25549597 38 emnlp-2011-Data-Driven Response Generation in Social Media
16 0.24656886 135 emnlp-2011-Timeline Generation through Evolutionary Trans-Temporal Summarization
17 0.24324477 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
18 0.23802917 70 emnlp-2011-Identifying Relations for Open Information Extraction
19 0.23439078 105 emnlp-2011-Predicting Thread Discourse Structure over Technical Web Forums
20 0.23103952 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP
topicId topicWeight
[(15, 0.021), (23, 0.108), (36, 0.022), (37, 0.023), (45, 0.08), (53, 0.017), (54, 0.025), (57, 0.041), (62, 0.019), (64, 0.026), (66, 0.035), (79, 0.036), (82, 0.031), (87, 0.014), (96, 0.031), (97, 0.356), (98, 0.027)]
simIndex simValue paperId paperTitle
1 0.78244013 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
Author: Ainur Yessenalina ; Claire Cardie
Abstract: We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can model the compositional effects required for accurate assignment of phrase-level sentiment. For example, combining an adverb (e.g., “very”) with a positive polar adjective (e.g., “good”) produces a phrase (“very good”) with increased polarity over the adjective alone. Inspired by recent work on distributional approaches to compositionality, we model each word as a matrix and combine words using iterated matrix multiplication, which allows for the modeling of both additive and multiplicative semantic effects. Although the multiplication-based matrix-space framework has been shown to be a theoretically elegant way to model composition (Rudolph and Giesbrecht, 2010), training such models has to be done carefully: the optimization is nonconvex and requires a good initial starting point. This paper presents the first such algorithm for learning a matrix-space model for semantic composition. In the context of the phrase-level sentiment analysis task, our experimental results show statistically significant improvements in performance over a bag-of-words model.
same-paper 2 0.75537831 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online news service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
3 0.51155019 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
Author: Samuel Brody ; Nicholas Diakopoulos
Abstract: We present an automatic method which leverages word lengthening to adapt a sentiment lexicon specifically for Twitter and similar social messaging networks. The contributions of the paper are as follows. First, we call attention to lengthening as a widespread phenomenon in microblogs and social messaging, and demonstrate the importance of handling it correctly. We then show that lengthening is strongly associated with subjectivity and sentiment. Finally, we present an automatic method which leverages this association to detect domain-specific sentiment- and emotionbearing words. We evaluate our method by comparison to human judgments, and analyze its strengths and weaknesses. Our results are of interest to anyone analyzing sentiment in microblogs and social networks, whether for research or commercial purposes.
4 0.50990146 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms
Author: Song Feng ; Ritwik Bose ; Yejin Choi
Abstract: In this paper, we introduce a connotation lexicon, a new type of lexicon that lists words with connotative polarity, i.e., words with positive connotation (e.g., award, promotion) and words with negative connotation (e.g., cancer, war). Connotation lexicons differ from much studied sentiment lexicons: the latter concerns words that express sentiment, while the former concerns words that evoke or associate with a specific polarity of sentiment. Understanding the connotation of words would seem to require common sense and world knowledge. However, we demonstrate that much of the connotative polarity of words can be inferred from natural language text in a nearly unsupervised manner. The key linguistic insight behind our approach is selectional preference of connotative predicates. We present graphbased algorithms using PageRank and HITS that collectively learn connotation lexicon together with connotative predicates. Our empirical study demonstrates that the resulting connotation lexicon is of great value for sentiment analysis complementing existing sentiment lexicons.
5 0.49776563 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
Author: Richard Socher ; Jeffrey Pennington ; Eric H. Huang ; Andrew Y. Ng ; Christopher D. Manning
Abstract: We introduce a novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions. Our method learns vector space representations for multi-word phrases. In sentiment prediction tasks these representations outperform other state-of-the-art approaches on commonly used datasets, such as movie reviews, without using any pre-defined sentiment lexica or polarity shifting rules. We also evaluate the model’s ability to predict sentiment distributions on a new dataset based on confessions from the experience project. The dataset consists of personal user stories annotated with multiple labels which, when aggregated, form a multinomial distribution that captures emotional reactions. Our algorithm can more accurately predict distributions over such labels compared to several competitive baselines.
6 0.4402937 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
7 0.43744239 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
8 0.41559198 17 emnlp-2011-Active Learning with Amazon Mechanical Turk
9 0.4136363 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
10 0.4076848 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
11 0.40669498 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
12 0.3993344 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
13 0.39859986 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
14 0.39520991 107 emnlp-2011-Probabilistic models of similarity in syntactic context
15 0.39496604 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
16 0.39390984 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
17 0.39270383 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
18 0.39234412 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
19 0.3914085 128 emnlp-2011-Structured Relation Discovery using Generative Models
20 0.39069289 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases