emnlp emnlp2011 emnlp2011-89 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fabio Massimo Zanzotto ; Marco Pennacchiotti ; Kostas Tsioutsiouliklis
Abstract: In the last few years, the interest of the research community in micro-blogs and social media services, such as Twitter, has been growing exponentially. Yet, so far not much attention has been paid to a key characteristic of microblogs: the high level of information redundancy. The aim of this paper is to systematically approach this problem by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
Reference: text
sentIndex sentText sentNum sentScore
1 We cast redundancy in the framework of Textual Entailment Recognition. [sent-6, score-0.246]
2 We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. [sent-7, score-0.283]
3 An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance. [sent-9, score-0.311]
4 For example, the following two tweets are part of a large set of redundant tweets issued during the 2010 winter Olympics: (example 1) t1 : “Swiss ski jumper Simon Ammann takes first gold of Vancouver” t2 : “Swiss (Suisse) get the Gold on Normal Hill ski jump. [sent-28, score-1.277]
5 #Vancouver2010” By performing an editorial study (described later in the paper) we discovered that a large part of event-related tweets are indeed redundant. [sent-29, score-0.628]
6 First, most applications based on Twitter share the goal of providing tweets that are both informative and diverse, with respect to an initial user information need. [sent-31, score-0.566]
7 For example, Twitter search engines should ideally select the most informative and diverse set of tweets in return to a user query. [sent-32, score-0.597]
8 Similarly, a news web portal that attaches tweets to a given news article should attach those tweets that provide the broadest and most diverse set of information, opinions, and updates about the news item. [sent-33, score-1.221]
9 To keep a high level of diversity, redundant tweets should be removed from the set of tweets displayed to the user. [sent-34, score-1.187]
10 Figure 1 shows an example of a Twitter search engine where redundant tweets are [...] (Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 659–669, Edinburgh, Scotland, UK, July 27–31, 2011. © 2011 Association for Computational Linguistics) [sent-35, score-0.655]
11 Figure 1: Twitter search: actual Twitter results and desired results after redundancy reduction. [sent-37, score-0.246]
12 Also, from a computational linguistic point of view, the high redundancy in micro-blogs gives the unprecedented opportunity to study classical tasks such as text summarization (Haghighi and Vanderwende, 2009), textual entailment recognition (Dagan et al. [sent-39, score-0.418]
13 The aim of this paper is to formally define, for the first time, the problem of redundancy in micro-blogs and to systematically approach the task of automatic redundancy detection. [sent-42, score-0.492]
14 tweets that convey the same information with different wordings, and ignore the more trivial issue of detecting retweets, which can be considered the most basic expression of redundancy. [sent-45, score-0.564]
15 Next, we provide our operational definition of redundancy and introduce our editorial study and dataset in Section 3. [sent-49, score-0.406]
16 In Section 4 we describe our models for redundancy detection. [sent-50, score-0.246]
17 , 2010), ranking tweets by relevance for web search (Ramage et al. [sent-56, score-0.532]
18 The authors first cluster together tweets that refer to the same news. [sent-71, score-0.532]
19 Then, for each cluster, they identify the tweets that are well-formed (i. [sent-72, score-0.532]
20 copy-pasted from news), and induce role mappings between well-formed and noisy tweets in the same cluster by performing word alignment. [sent-74, score-0.532]
21 (2010) propose a probabilistic model to discover dialogue acts in Twitter conversations and to classify tweets in a conversation according to those acts. [sent-77, score-0.532]
22 (A conversation is defined as a set of tweets in the same reply thread. [sent-78, score-0.532]
23 In our paper, we also aim at classifying tweets, but our interest is in information redundancy instead of acts. [sent-80, score-0.246]
24 In the computational linguistic literature, redundancy detection is studied in multi-document summarization, where the overall document is used to select the most informative sentences or snippets (Haghighi and Vanderwende, 2009). [sent-81, score-0.345]
25 Since tweets are short and tweet sets cannot be considered documents, these methods are hard to apply. [sent-82, score-0.709]
26 Distance/similarity feature spaces are more suitable to the paraphrase detection task because they model the similarity between the two texts. [sent-95, score-0.223]
27 On the other hand, entailment trigger and content feature spaces model complex relations between the texts, taking into account first-order entailment rules, i. [sent-96, score-0.348]
28 3 Redundancy in Twitter We formally define two tweets as redundant if they either convey the same information (paraphrase) or if the information of one tweet subsumes the information of the other (textual entailment). [sent-100, score-0.832]
29 ‘textually entails’) the other; both tweets state that Switzerland won a Gold Medal at the Vancouver winter Olympics, but the first one also specifies the name of the athlete. [sent-104, score-0.568]
30 The following pair is, instead, non-redundant, because the two tweets convey different information, and they do not subsume each other: (example 2) t1 : “Goal! [sent-105, score-0.565]
31 Iniesta scores for #ESP and they have one hand on the #worldcup ” t2 : “this will be a hard final #Esp vs Ned #worldcup ” Our definition of redundancy is grounded on, and inspired by, the theory of Textual Entailment, to which we refer the reader for further details (Dagan et al. [sent-106, score-0.246]
32 In order to answer this question we performed an initial editorial study where human editors were asked to annotate pairs of tweets as being either redundant or non-redundant. [sent-110, score-0.751]
33 The editorial study also serves as a test bed for evaluating our redundancy detection models, as discussed in Section 5. [sent-111, score-0.407]
34 Indeed, these are the types of tweets for which redundancy is a critical issue, especially in view of real applications, e. [sent-115, score-0.778]
35 to present a diverse set of tweets for a given news article. [sent-117, score-0.605]
36 The most critical issue for extracting the dataset is to pre-process tweets and to discard those that are not informative. [sent-121, score-0.569]
37 This is not an easy task: a recent study (Pear-Analytics, 2009) estimates that only 4% of all tweets are factual news, and only 37% are conversations with content. [sent-122, score-0.532]
38 In order to retain only informative tweets we first extract buzzy snapshots (Popescu and Pennacchiotti, 2010). [sent-124, score-0.718]
39 A snapshot is defined as a set of tweets that explicitly mention a specific topic within a specified time period. [sent-125, score-0.65]
40 A buzzy snapshot is defined as a snapshot with a large number of tweets, compared to previous time periods. [sent-126, score-0.227]
41 For example, given the topic ‘Haiti earthquake’, the snapshot composed by the tweets mentioning ‘Haiti earthquake’ on January 12th, 2010, will constitute a buzzy snapshot, since in previous days the topic was not mentioned often. [sent-127, score-0.758]
42 We extract buzzy snapshots for the above two topic lists by following the method described in (Popescu and Pennacchiotti, 2010): we consider time periods of one day, and call buzzy the snapshots that mention a given topic α times more than the average over the previous 2 days. [sent-129, score-0.382]
43 We further exclude irrelevant and spam snapshots by removing those that have: fewer than 10 tweets; more than 50% of tweets non-English; and an average token overlap between tweets of more than 80%, usually corresponding to spam threads. [sent-131, score-1.147]
44 The extraction is performed on a Twitter corpus containing all tweets posted between July 2009 and August 2010. [sent-132, score-0.532]
45 In all, we extract 972 snapshots for the celebrity list, containing 205,885 tweets (i. [sent-133, score-0.67]
46 average of 212 tweets per snapshot); and 674 snapshots [...] (Footnote 1: Hashtags are keywords prefixed by ‘#’ that are used by the Twitter community to mark the topic of a tweet.) [sent-135, score-0.571]
47 662 redundant entailment paraphrase 367 195 172 (29. [sent-137, score-0.282]
48 9%) Table 1: Results of the redundancy editorial study. [sent-143, score-0.342]
49 [...] for the event list, containing 393,965 tweets (584 tweets per snapshot). [sent-144, score-1.064]
50 From these two corpora, we extract the final tweet-pair dataset by randomly sampling 1500 pairs of tweets contained in the same snapshot. [sent-148, score-0.569]
51 The main editorial task consisted of annotating tweet-pairs as either redundant or non-redundant. [sent-151, score-0.219]
52 We also asked editors to characterize the specific linguistic relation between the two tweets of a pair. [sent-152, score-0.532]
53 We consider four relations: entailment (the first tweet entails the second or vice versa), paraphrase, contradiction (the tweets contradict each other), and related (the tweets are about the same topic, e. [sent-153, score-1.386]
54 Annotators were asked to base their decisions on the parts of the tweets that contained information relevant to the selected topic, e. [sent-157, score-0.532]
55 Focusing on these parts is in line with potential applications of tweet redundancy detection as tweets are firstly grouped around a topic. [sent-161, score-1.02]
56 Note that pairs that fall under the entailment or paraphrase relation are redundant, while unrelated, related, and contradictory tweets are non-redundant. [sent-162, score-0.691]
57 The annotation was performed in a three stage process, since tweets are sometimes hard to understand and hence to annotate (misspellings, usage of slang and abbreviations, lack of discourse context). [sent-163, score-0.532]
58 This shows that redundancy is indeed a pervasive phenomenon in Twitter, and a critical issue that has to be solved in order to provide clean and diverse social content. [sent-173, score-0.382]
59 Most cases of redundancy correspond to tweets that report the same fact using different wording, occasionally adding irrelevant personal comments and sentiments (e. [sent-174, score-0.778]
60 4 Redundancy detection models The task of redundancy detection in Twitter is a tweet-pair classification problem. [sent-178, score-0.376]
61 Given two tweets t1 and t2, the goal is to classify the pair (t1,t2) as being either redundant or non-redundant. [sent-179, score-0.688]
62 In this section we describe different models for redundancy detection, inspired by existing work in RTE. [sent-180, score-0.246]
63 The simple intuition of the model is that if two tweets t1 and t2 have a high lexical [...] (Footnote 2: At this time, the Twitter™ Terms of Use do not allow publication of the annotated dataset.) [sent-186, score-0.532]
64 The bag-of-word model is of course a naive approach, since in many cases redundant tweets can have very different lexical content (e. [sent-196, score-0.709]
65 the following two tweets: “Farrah Fawcett left out of Oscar memorial”, “No Farrah Fawcett’s memory at the Academy Awards”), and non-redundant tweets can have similar lexical content (e. [sent-198, score-0.586]
66 For example, consider the tweet pair: “Oscars forgot Farrah Fawcett”, “Farrah Fawcett snubbed at Academy Awards”. [sent-204, score-0.287]
67 3 Lexical content model (LEX) This model and the next ones (SYNT and FOR) explicitly model the content of a tweet pair P = (t1, t2) as a whole. [sent-211, score-0.318]
68 each tweet with its own bag of words), and the SVM used as the single feature the similarity between the two tweets. [sent-214, score-0.248]
69 In the LEX model we represent the content of the tweet pair in a double bag-of-word vector space. [sent-215, score-0.264]
70 Given two pairs of tweets $P^{(a)}$ and $P^{(b)}$, the LEX kernel function is defined as follows: $K_{LEX}(P^{(a)}, P^{(b)}) = \cos(t_1^{(a)}, t_1^{(b)}) + \cos(t_2^{(a)}, t_2^{(b)})$, where $\cos(\cdot, \cdot)$ is the cosine similarity between the two vectors. [sent-218, score-0.615]
71 The LEX feature space is simple and can be extremely effective in modeling the content of tweet pairs. [sent-219, score-0.231]
72 4 Syntactic content model (SYNT) The SYNT model represents a tweet pair using pairs of syntactic tree fragments from t1 and t2. [sent-223, score-0.264]
73 Therefore, these features represent ground rules connecting the left-hand sides and the right-hand sides of the tweet pair: each feature is active for a pair (t1, t2) when the left-hand side fr1 is activated by the syntactic analysis of t1 and the right-hand side fr2 is activated by t2. [sent-226, score-0.33]
74 But it also introduces a new limitation: the above feature is in fact also active for the tweet pair (“GM bought Opel”,“Opel owns GM”). [sent-230, score-0.416]
75 Given two pairs of tweets $P^{(a)}$ and $P^{(b)}$, the SYNT kernel function is defined as follows: $K_{SYNT}(P^{(a)}, P^{(b)}) = K(t_1^{(a)}, t_1^{(b)}) + K(t_2^{(a)}, t_2^{(b)})$, where $K(\cdot, \cdot)$ is the tree kernel function described in (Collins and Duffy, 2002). [sent-233, score-0.61]
76 5 Syntactic first-order rule content model (FOR) The FOR model overcomes the limitations of SYNT, by enriching the space with features representing first-order relations between the two tweets of a pair. [sent-235, score-0.586]
77 a first order rule that is activated by the tweet pairs if the variables are unified. [sent-238, score-0.207]
78 The feature is active for a tweet pair (t1, t2) if the syntactic interpretations of t1 and t2 can be unified with < fr1, fr2 >. [sent-241, score-0.27]
79 For example, consider the following feature: ⟨S(NP_X VP(VBP(bought) NP_Y)), S(NP_X VP(VBP(owns) NP_Y))⟩. This feature is active for the pair (“GM bought Opel”,“GM owns Opel”), with the variable unification X = “GM” and Y = “Opel”. [sent-242, score-0.385]
80 On the contrary, this feature is not active for the pair (“GM bought Opel”,“Opel owns GM”) as there is no possibility of unifying the two variables. [sent-243, score-0.239]
81 5 Experimental Evaluation In this section we present an evaluation of the different redundancy detection models. [sent-245, score-0.311]
82 1 Experimental Setup We experiment with the redundancy detection dataset described in Section 3. [sent-251, score-0.348]
83 , 2004) for computing WBOW and for linking the two tweets in a pair, and SVMlight (Joachims, 1999), extended with the syntactic first-order rule kernels described in (Moschitti and Zanzotto, 2007) for creating the SYNT and the FOR feature spaces. [sent-261, score-0.559]
84 This seems to be intuitive as the language of the tweets can be far from proper English, i. [sent-279, score-0.532]
85 This may be explained by the fact that in the FOR+WBOW system, the WordNet similarity is also used to link words in the two tweets of a pair. [sent-285, score-0.576]
86 This seems to suggest that if the interpretation of the part-of-speech tags of the unknown words is correct, the syntax of tweets is reasonably similar to the syntax of the generic English language. [sent-292, score-0.532]
87 The best performing model is FOR+WBOW: first-order rules successfully emerge in tweets and are positively exploited by the learning system. [sent-293, score-0.532]
88 The first column represents the editorial gold standard (gs) for the tweet pairs we considered: either redundant (R) or non-redundant (N). [sent-299, score-0.396]
89 Since we feed the classifiers with ‘redundant’ as the positive class, a classifier is better than another if it ranks redundant tweet pairs (R) higher than non-redundant ones (N). [sent-300, score-0.3]
90 The last two columns are the two tweets in each pair. [sent-303, score-0.532]
91 Yet, it is clear why these pairs have high lexical similarity (and therefore are ranked high by BOW and WBOW): the two tweets in the pair oe387 share ‘volcanic’, ‘ash’, and the hashtag ‘#ashtag’. [sent-318, score-0.609]
92 This example shows that hashtags alone are not very indicative and useful for detecting redundancy in Twitter. [sent-321, score-0.322]
93 6 Conclusions In this paper we introduced the notion of linguistic redundancy in micro-blogs and the task of tweet redundancy detection. [sent-322, score-0.669]
94 We also presented an editorial study showing that redundancy is pervasive in Twitter, and that methods for its detection will be key in [...] (Footnote 5: In o130, the common topic is ‘farrah fawcett’: fawcett not recognized at the Oscars memorial?) [sent-323, score-0.613]
95 ” R 101 632 641 o21 “Oscars forgot farrah fawcett? [sent-337, score-0.192]
96 ” “i dont understand how they included michael jackson in the memorial tribute as an actor but snubbed farrah fawcett. [sent-348, score-0.275]
97 #oscars” “farrah fawcett snubbed at Oscars appeared in a movie with best actor Jeff Bridges. [sent-349, score-0.193]
98 In the second part of the paper we presented some promising models for redundancy detection that show encouraging results when compared to typical lexical baselines. [sent-357, score-0.311]
99 For example, the tweets that other users post about the same topic of the target-pair may be of some help. [sent-363, score-0.571]
100 Robust sentiment detection on twitter from biased and noisy data. [sent-367, score-0.293]
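A minimal sketch of the snapshot selection described in sentences 42 and 43 above: a one-day snapshot is called buzzy when its topic is mentioned α times more than the average over the previous days, and snapshots are discarded when they have fewer than 10 tweets, more than 50% non-English tweets, or an average token overlap above 80%. The function names, the Jaccard reading of "token overlap", the caller-supplied is_english predicate, and the strict/non-strict comparisons are all assumptions made for illustration; the value of α is not given in the extracted text and is left as a parameter.

```python
from itertools import combinations


def is_buzzy(today_count, previous_counts, alpha):
    """True if today's mentions exceed alpha times the mean of the previous days."""
    if not previous_counts:
        return False
    baseline = sum(previous_counts) / len(previous_counts)
    return today_count > alpha * baseline


def token_overlap(tweet_a, tweet_b):
    """Jaccard overlap of lowercased token sets (an assumed definition)."""
    tokens_a = set(tweet_a.lower().split())
    tokens_b = set(tweet_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def keep_snapshot(tweets, is_english, min_tweets=10,
                  max_non_english=0.5, max_overlap=0.8):
    """Apply the three exclusion rules; is_english is a hypothetical predicate."""
    if len(tweets) < min_tweets:
        return False
    non_english = sum(1 for t in tweets if not is_english(t)) / len(tweets)
    if non_english > max_non_english:
        return False
    pairs = list(combinations(tweets, 2))
    if not pairs:
        return True
    mean_overlap = sum(token_overlap(a, b) for a, b in pairs) / len(pairs)
    return mean_overlap <= max_overlap
```

The pairwise overlap is quadratic in the snapshot size, which is acceptable for a sketch but would likely need sampling for snapshots with hundreds of tweets.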
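The LEX kernel of sentence 70 above sums the cosine similarity of the two first tweets and the cosine similarity of the two second tweets. The sketch below illustrates that definition under simplifying assumptions: a whitespace tokenizer, raw term counts as vector weights, and hypothetical function names (bag_of_words, cosine, lex_kernel). It is not the authors' implementation.

```python
import math
from collections import Counter


def bag_of_words(tweet):
    """Lowercased whitespace tokens with counts (a deliberately naive tokenizer)."""
    return Counter(tweet.lower().split())


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * vec_b[token] for token, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def lex_kernel(pair_a, pair_b):
    """K_LEX(P_a, P_b) = cos(t1_a, t1_b) + cos(t2_a, t2_b)."""
    (t1_a, t2_a), (t1_b, t2_b) = pair_a, pair_b
    return (cosine(bag_of_words(t1_a), bag_of_words(t1_b))
            + cosine(bag_of_words(t2_a), bag_of_words(t2_b)))
```

Because bags of words ignore word order, this sketch gives the pairs (“GM bought Opel”, “GM owns Opel”) and (“GM bought Opel”, “Opel owns GM”) a kernel value of approximately 2, the same as for two identical pairs.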
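Sentence 89 above compares classifiers by how well they rank redundant pairs (R) above non-redundant ones (N). One standard way to quantify such a ranking is the area under the ROC curve, computed here as the fraction of (R, N) combinations in which the redundant pair receives the higher score, with ties counted as one half. The function name ranking_auc and the toy scores in the usage note are illustrative assumptions, not values from the paper.

```python
def ranking_auc(scores, labels):
    """Probability that a random 'R' pair is scored above a random 'N' pair."""
    positives = [s for s, label in zip(scores, labels) if label == "R"]
    negatives = [s for s, label in zip(scores, labels) if label == "N"]
    if not positives or not negatives:
        raise ValueError("need at least one 'R' and one 'N' example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # redundant pair ranked correctly above
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

A perfect ranking such as ranking_auc([0.9, 0.8, 0.2, 0.1], ["R", "R", "N", "N"]) returns 1.0, and a fully inverted ranking returns 0.0.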
wordName wordTfidf (topN-words)
[('tweets', 0.532), ('wbow', 0.357), ('redundancy', 0.246), ('twitter', 0.228), ('bow', 0.203), ('synt', 0.179), ('tweet', 0.177), ('farrah', 0.151), ('fawcett', 0.124), ('opel', 0.124), ('redundant', 0.123), ('entailment', 0.113), ('editorial', 0.096), ('oscars', 0.096), ('lex', 0.093), ('snapshots', 0.083), ('johnny', 0.082), ('owns', 0.082), ('zanzotto', 0.079), ('snapshot', 0.079), ('gm', 0.076), ('ashtag', 0.069), ('buzzy', 0.069), ('snubbed', 0.069), ('detection', 0.065), ('bought', 0.064), ('earthquake', 0.064), ('social', 0.062), ('textual', 0.059), ('aroc', 0.055), ('celebrity', 0.055), ('depp', 0.055), ('eruption', 0.055), ('memorial', 0.055), ('volcanic', 0.055), ('content', 0.054), ('pennacchiotti', 0.05), ('rte', 0.05), ('fabio', 0.05), ('dagan', 0.048), ('corley', 0.047), ('haiti', 0.047), ('paraphrase', 0.046), ('similarity', 0.044), ('hashtags', 0.044), ('pervasive', 0.043), ('massimo', 0.042), ('popescu', 0.042), ('news', 0.042), ('bpoil', 0.041), ('forgot', 0.041), ('icelandic', 0.041), ('krishnamurthy', 0.041), ('kwak', 0.041), ('oilspill', 0.041), ('spaces', 0.041), ('topic', 0.039), ('kernel', 0.039), ('svm', 0.038), ('dataset', 0.037), ('died', 0.036), ('petrovi', 0.036), ('bernardo', 0.036), ('dead', 0.036), ('olympics', 0.036), ('winter', 0.036), ('moschitti', 0.035), ('vp', 0.035), ('wordnet', 0.034), ('informative', 0.034), ('active', 0.033), ('pair', 0.033), ('entails', 0.032), ('ash', 0.032), ('awards', 0.032), ('detecting', 0.032), ('np', 0.031), ('diverse', 0.031), ('events', 0.031), ('activated', 0.03), ('haghighi', 0.03), ('ritter', 0.03), ('bill', 0.03), ('media', 0.029), ('cos', 0.028), ('carrey', 0.027), ('esp', 0.027), ('impairing', 0.027), ('mlcw', 0.027), ('operational', 0.027), ('oscar', 0.027), ('recognising', 0.027), ('ski', 0.027), ('snub', 0.027), ('venice', 0.027), ('worldcup', 0.027), ('feature', 0.027), ('vancouver', 0.027), ('bp', 0.026), ('ido', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999857 89 emnlp-2011-Linguistic Redundancy in Twitter
Author: Fabio Massimo Zanzotto ; Marco Pennacchiotti ; Kostas Tsioutsiouliklis
Abstract: In the last few years, the interest of the research community in micro-blogs and social media services, such as Twitter, has been growing exponentially. Yet, so far not much attention has been paid to a key characteristic of microblogs: the high level of information redundancy. The aim of this paper is to systematically approach this problem by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
2 0.37844768 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
Author: Roy Bar-Haim ; Elad Dinur ; Ronen Feldman ; Moshe Fresko ; Guy Goldstein
Abstract: Information published in online stock investment message boards, and more recently in stock microblogs, is considered highly valuable by many investors. Previous work focused on aggregation of sentiment from all users. However, in this work we show that it is beneficial to distinguish expert users from non-experts. We propose a general framework for identifying expert investors, and use it as a basis for several models that predict stock rise from stock microblogging messages (stock tweets). In particular, we present two methods that combine expert identification and per-user unsupervised learning. These methods were shown to achieve relatively high precision in predicting stock rise, and significantly outperform our baseline. In addition, our work provides an in-depth analysis of the content and potential usefulness of stock tweets.
3 0.3259593 41 emnlp-2011-Discriminating Gender on Twitter
Author: John D. Burger ; John Henderson ; George Kim ; Guido Zarrella
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.
4 0.30900258 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni
Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms cotraining, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
5 0.27358553 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
Author: Vahed Qazvinian ; Emily Rosengren ; Dragomir R. Radev ; Qiaozhu Mei
Abstract: A rumor is commonly defined as a statement whose true value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) on a network of people. Identifying rumors is crucial in online social media where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of 3 categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.
6 0.15724444 139 emnlp-2011-Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
7 0.12866031 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
8 0.10032535 38 emnlp-2011-Data-Driven Response Generation in Social Media
9 0.097861618 42 emnlp-2011-Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual Textual Entailment Corpora
10 0.091774851 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
11 0.077930741 133 emnlp-2011-The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources
12 0.074780427 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees
13 0.069249019 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
14 0.054225564 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!
15 0.054066945 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing
16 0.049707532 55 emnlp-2011-Exploiting Syntactic and Distributional Information for Spelling Correction with Web-Scale N-gram Models
17 0.045163944 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
18 0.044593725 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
19 0.04438043 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
20 0.043815386 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
topicId topicWeight
[(0, 0.19), (1, -0.358), (2, 0.426), (3, 0.028), (4, -0.307), (5, 0.002), (6, -0.057), (7, 0.035), (8, 0.123), (9, -0.027), (10, -0.002), (11, 0.043), (12, 0.139), (13, -0.03), (14, -0.014), (15, -0.061), (16, -0.107), (17, 0.054), (18, -0.014), (19, -0.029), (20, -0.007), (21, -0.04), (22, 0.097), (23, 0.058), (24, 0.029), (25, -0.022), (26, -0.021), (27, -0.03), (28, -0.0), (29, 0.005), (30, -0.026), (31, -0.002), (32, -0.032), (33, 0.006), (34, 0.04), (35, -0.07), (36, 0.048), (37, 0.027), (38, 0.002), (39, 0.013), (40, 0.024), (41, -0.008), (42, 0.031), (43, 0.025), (44, 0.008), (45, 0.026), (46, -0.02), (47, -0.02), (48, -0.049), (49, -0.04)]
simIndex simValue paperId paperTitle
same-paper 1 0.95688802 89 emnlp-2011-Linguistic Redundancy in Twitter
Author: Fabio Massimo Zanzotto ; Marco Pennacchiotti ; Kostas Tsioutsiouliklis
Abstract: In the last few years, the interest of the research community in micro-blogs and social media services, such as Twitter, has been growing exponentially. Yet, so far not much attention has been paid to a key characteristic of microblogs: the high level of information redundancy. The aim of this paper is to systematically approach this problem by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
2 0.93252617 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs
Author: Roy Bar-Haim ; Elad Dinur ; Ronen Feldman ; Moshe Fresko ; Guy Goldstein
Abstract: Information published in online stock investment message boards, and more recently in stock microblogs, is considered highly valuable by many investors. Previous work focused on aggregation of sentiment from all users. However, in this work we show that it is beneficial to distinguish expert users from non-experts. We propose a general framework for identifying expert investors, and use it as a basis for several models that predict stock rise from stock microblogging messages (stock tweets). In particular, we present two methods that combine expert identification and per-user unsupervised learning. These methods were shown to achieve relatively high precision in predicting stock rise, and significantly outperform our baseline. In addition, our work provides an in-depth analysis of the content and potential usefulness of stock tweets.
3 0.8498413 117 emnlp-2011-Rumor has it: Identifying Misinformation in Microblogs
Author: Vahed Qazvinian ; Emily Rosengren ; Dragomir R. Radev ; Qiaozhu Mei
Abstract: A rumor is commonly defined as a statement whose true value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) on a network of people. Identifying rumors is crucial in online social media where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of 3 categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.
4 0.81719512 139 emnlp-2011-Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
Author: Eiji ARAMAKI ; Sachiko MASKAWA ; Mizuki MORITA
Abstract: [...] posts more than 5.5 million messages (tweets) every day (reported by Twitter.com in March 2011). With the recent rise in popularity and scale of social media, a growing need exists for systems that can extract useful information from huge amounts of data. We address the issue of detecting influenza epidemics. First, the proposed system extracts influenza related tweets using Twitter API. Then, only tweets that mention actual influenza patients are extracted by the support vector machine (SVM) based classifier. The experiment results demonstrate the feasibility of the proposed approach (0.89 correlation to the gold standard). Especially at the outbreak and early spread (early epidemic stage), the proposed method shows high correlation (0.97 correlation), which outperforms the state-of-the-art methods. This paper describes that Twitter texts reflect the real world, and that NLP techniques can be applied to extract only tweets that contain useful information.
5 0.78468776 41 emnlp-2011-Discriminating Gender on Twitter
Author: John D. Burger ; John Henderson ; George Kim ; Guido Zarrella
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.
6 0.56755096 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
7 0.35194063 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
9 0.22956949 38 emnlp-2011-Data-Driven Response Generation in Social Media
10 0.22252296 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
11 0.20303452 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!
12 0.20132512 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees
13 0.18747303 42 emnlp-2011-Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual Textual Entailment Corpora
14 0.17938648 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing
15 0.16659674 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
16 0.16154632 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
17 0.14989641 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
18 0.13712032 12 emnlp-2011-A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
19 0.13525261 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
20 0.13219486 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
topicId topicWeight
[(2, 0.333), (14, 0.013), (23, 0.121), (36, 0.019), (37, 0.021), (45, 0.062), (52, 0.051), (53, 0.021), (54, 0.018), (57, 0.014), (62, 0.019), (64, 0.019), (66, 0.027), (79, 0.036), (80, 0.012), (82, 0.018), (87, 0.017), (90, 0.01), (96, 0.053), (98, 0.016)]
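The topic list above is a sparse vector: only topics with non-negligible weight are listed as (topicId, topicWeight) pairs. The page does not say how simValue is derived from such vectors, so the following is only a minimal sketch under the assumption that a cosine similarity over sparse topic weights is used. The function names and the second vector are hypothetical and serve only as illustration.

```python
import ast
import math

# A few (topicId, topicWeight) pairs copied from the list above (abbreviated).
RAW_TOPICS = "[(2, 0.333), (14, 0.013), (23, 0.121), (45, 0.062)]"


def parse_topics(text):
    """Parse a '[(id, weight), ...]' string into a {topicId: weight} dict."""
    return {int(topic): float(weight) for topic, weight in ast.literal_eval(text)}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts.

    Assumption: the listing does not name its similarity measure; cosine is
    a common choice and is used here purely for illustration.
    """
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


this_paper = parse_topics(RAW_TOPICS)
# Hypothetical topic vector for another paper (not taken from the listing).
other_paper = {2: 0.25, 23: 0.05, 60: 0.4}

similarity = cosine(this_paper, other_paper)
```

Storing each vector as a dict keeps the dot product proportional to the number of listed topics rather than to the full topic count, which matches the sparse way the weights are printed.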
simIndex simValue paperId paperTitle
same-paper 1 0.71315664 89 emnlp-2011-Linguistic Redundancy in Twitter
Author: Fabio Massimo Zanzotto ; Marco Pennacchiotti ; Kostas Tsioutsiouliklis
Abstract: In the last few years, the interest of the research community in micro-blogs and social media services, such as Twitter, has been growing exponentially. Yet, so far not much attention has been paid to a key characteristic of microblogs: the high level of information redundancy. The aim of this paper is to systematically approach this problem by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
2 0.43926892 41 emnlp-2011-Discriminating Gender on Twitter
Author: John D. Burger ; John Henderson ; George Kim ; Guido Zarrella
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.
3 0.43616462 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics
Author: Joseph Reisinger ; Raymond Mooney
Abstract: Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superficial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM finds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.
4 0.43454647 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
Author: Matthias Hartung ; Anette Frank
Abstract: This paper introduces an attribute selection task as a way to characterize the inherent meaning of property-denoting adjectives in adjective-noun phrases, e.g., hot in hot summer denoting the attribute TEMPERATURE rather than TASTE. We formulate this task in a vector space model that represents adjectives and nouns as vectors in a semantic space defined over possible attributes. The vectors incorporate latent semantic information obtained from two variants of LDA topic models. Our LDA models outperform previous approaches on a small set of 10 attributes with considerable gains on sparse representations, which highlights the strong smoothing power of LDA models. For the first time, we extend the attribute selection task to a new data set with more than 200 classes. We observe that large-scale attribute selection is a hard problem, but a subset of attributes performs robustly on the large scale as well. Again, the LDA models outperform the VSM baseline.
5 0.42889187 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni
Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms cotraining, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
6 0.4238582 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
7 0.42105058 136 emnlp-2011-Training a Parser for Machine Translation Reordering
8 0.42018965 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
9 0.41894087 38 emnlp-2011-Data-Driven Response Generation in Social Media
10 0.41730601 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
11 0.41567588 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
12 0.4144907 128 emnlp-2011-Structured Relation Discovery using Generative Models
13 0.41365626 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
14 0.41286632 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction
15 0.41090262 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
16 0.41032323 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
17 0.41020739 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
18 0.40926352 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction
19 0.40926322 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing