
198 brendan oconnor ai-2013-08-20-Some analysis of tweet shares and “predicting” election outcomes


meta info for this blog

Source: html

Introduction: Everyone recently seems to be talking about this newish paper by Digrazia, McKelvey, Bollen, and Rojas  ( pdf here ) that examines the correlation of Congressional candidate name mentions on Twitter against whether the candidate won the race.  One of the coauthors also wrote a Washington Post Op-Ed  about it.  I read the paper and I think it’s reasonable, but their op-ed overstates their results.  It claims: “In the 2010 data, our Twitter data predicted the winner in 404 out of 435 competitive races” But this analysis is nowhere in their paper.  Fabio Rojas has now posted errata/rebuttals  about the op-ed and described this analysis they did here.  There are several major issues off the bat: They didn’t ever predict 404/435 races; they only analyzed 406 races they call “competitive,” getting 92.5% (in-sample) accuracy, then extrapolated to all races to get the 435 number. They’re reporting about in-sample predictions, which is really misleading to a non-scientific audience.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Everyone recently seems to be talking about this newish paper by Digrazia, McKelvey, Bollen, and Rojas  ( pdf here ) that examines the correlation of Congressional candidate name mentions on Twitter against whether the candidate won the race. [sent-1, score-0.41]

2 There are several major issues off the bat: They didn’t ever predict 404/435 races; they only analyzed 406 races they call “competitive,” getting 92.5% (in-sample) accuracy. [sent-6, score-0.44]

3 These aren’t predictions from just Twitter data, but a linear model that includes incumbency status and a bunch of other variables. [sent-9, score-0.578]

4 If you look at their Figure 1, as Nagler reproduces, it’s obvious that tweet share alone gives much less than that much accuracy. [sent-13, score-0.672]

5 Thus, if you say “predict the winner to be whoever got more tweet mentions,” then the number of correct predictions would be the number of dots in the shaded yellow areas, and the accuracy rate is that number divided by the total number of dots. [sent-18, score-0.949]

6 [1] It’s also been pointed out that incumbency alone predicts most House races; are tweets really adding anything here? [sent-20, score-0.341]

7 The main contribution of the paper is to test tweets alongside many controlling variables, including incumbency status. [sent-21, score-0.509]

8 The most convincing analysis the authors could have done would be to add an ablation test: use the model with the tweet share variable, and a model without it, and see how different the accuracies are. [sent-22, score-0.972]

9 One additional percentage point of tweet share is worth 155 votes. [sent-27, score-0.821]

10 [2]  The predictive effect of tweet share is significant, but small. [sent-28, score-0.616]

11 In the paper they point out that a standard deviation worth of tweet share margin comes out to around 5000 votes — so roughly speaking, tweet shares are 10% as important as incumbency? [sent-29, score-1.539]

12 On the other hand, tweet share is telling something that those greyed-out, non-significant demographic variables aren’t, so something interesting might be happening. [sent-32, score-0.699]

13 The paper also has some analysis of the outliers where the model fails. [sent-33, score-0.373]

14 It’s scientifically irresponsible to take the in-sample predictions and say “we predicted N number of races correctly” in the popular press. [sent-43, score-0.668]

15 In-sample predictions are a pretty technical concept and I think it’s misleading to call them “predictions.” [sent-46, score-0.406]

16 I feel a little bad for the coauthors given how many hostile messages I’ve seen about their paper on Twitter and various blogs; presumably this motivates what Rojas says at the end of their errata/rebuttal : The original paper is a non-peer reviewed draft. [sent-48, score-0.391]

17 [1] Also weird: many of the races have a 100% tweet share to one candidate. [sent-55, score-0.935]

18 [2] These aren’t literal vote counts, but number of votes normalized by district size; I think it might be interpretable as, expected number of votes in an average-sized city. [sent-61, score-0.571]

19 Some blog posts have complained they don’t model vote share as a percentage, but I think their normalization preprocessing actually kind of handles that, albeit in a confusing/non-transparent way. [sent-62, score-0.512]

20 5; so I guess that’s more like, a standardized unit of tweet share is worth 20% of standardized impact of incumbency? [sent-65, score-0.921]
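Sentence 8 above proposes an ablation test: fit the model with the tweet-share variable and again without it, then compare accuracies. As a rough illustration, here is a minimal sketch of that idea on entirely synthetic data (the coefficients, feature names, and plain gradient-descent logistic regression are all assumptions for the sake of the example, not the paper's actual model):

```python
import numpy as np

def fit_logistic(X, y, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression (no regularization)."""
    X1 = np.column_stack([np.ones(len(X)), X])  # add intercept column
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X1 @ w))
        w -= lr * X1.T @ (p - y) / len(y)
    return w

def accuracy(X, y, w):
    X1 = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((X1 @ w > 0) == y))

# Synthetic races: incumbency dominates, tweet share adds a small signal.
rng = np.random.default_rng(0)
n = 1000
incumbency = rng.integers(0, 2, n).astype(float)
tweet_share = rng.uniform(0, 1, n)
logits = 3.0 * incumbency + 0.8 * tweet_share - 1.8
won = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logits))).astype(float)

X_full = np.column_stack([incumbency, tweet_share])  # with tweet share
X_ablated = incumbency.reshape(-1, 1)                # tweet share removed

acc_full = accuracy(X_full, won, fit_logistic(X_full, won))
acc_ablated = accuracy(X_ablated, won, fit_logistic(X_ablated, won))
print(f"accuracy with tweet share: {acc_full:.3f}, without: {acc_ablated:.3f}")
```

If the gap between the two accuracies is small, the tweet-share variable adds little predictive value beyond incumbency; this is exactly the comparison the blog post argues the paper should have reported.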


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('share', 0.327), ('tweet', 0.289), ('races', 0.263), ('predictions', 0.25), ('incumbency', 0.226), ('rojas', 0.226), ('twitter', 0.192), ('votes', 0.164), ('variable', 0.139), ('paper', 0.135), ('predict', 0.119), ('model', 0.102), ('misleading', 0.098), ('elections', 0.098), ('republican', 0.098), ('prediction', 0.098), ('candidate', 0.098), ('accuracy', 0.095), ('test', 0.092), ('election', 0.09), ('vote', 0.083), ('variables', 0.083), ('standardized', 0.083), ('number', 0.08), ('percentage', 0.079), ('media', 0.079), ('mentions', 0.079), ('analysis', 0.077), ('ablation', 0.075), ('bollen', 0.075), ('margin', 0.075), ('nagler', 0.075), ('predicted', 0.075), ('shares', 0.075), ('winner', 0.075), ('worth', 0.07), ('strong', 0.069), ('guess', 0.069), ('competitive', 0.065), ('coauthors', 0.065), ('press', 0.06), ('reporting', 0.06), ('important', 0.059), ('also', 0.059), ('call', 0.058), ('figure', 0.058), ('claims', 0.057), ('point', 0.056), ('many', 0.056), ('alone', 0.056)]
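Per-word weights like the list above come from a tf-idf model: a word scores highly when it is frequent in this post but rare across the blog as a whole. A toy sketch of that computation (the three "documents" and all word lists are invented for illustration; real pipelines also normalize and smooth differently):

```python
import math
from collections import Counter

# Invented mini-corpus: doc 0 stands in for this post,
# the others for the rest of the blog.
docs = [
    "tweet share predicts races tweet share incumbency",
    "twitter tagger release tweet data",
    "poll aggregation election model",
]
target = docs[0].split()

# Document frequency: in how many documents does each word appear?
df = Counter()
for doc in docs:
    df.update(set(doc.split()))

tf = Counter(target)       # term frequency within the target document
n_docs = len(docs)
tfidf = {
    w: (tf[w] / len(target)) * math.log(n_docs / df[w])
    for w in tf
}
top = sorted(tfidf.items(), key=lambda kv: -kv[1])
print(top)
```

Here "share" outranks "tweet" even though both occur twice in the target, because "tweet" also appears elsewhere in the corpus and so earns a lower idf, mirroring how the topN-words list above ranks post-specific vocabulary first.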

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999958 198 brendan oconnor ai-2013-08-20-Some analysis of tweet shares and “predicting” election outcomes


2 0.15337935 187 brendan oconnor ai-2012-09-21-CMU ARK Twitter Part-of-Speech Tagger – v0.3 released

Introduction: We’re pleased to announce a new release of the CMU ARK Twitter Part-of-Speech Tagger, version 0.3. The new version is much faster (40x) and more accurate (89.2 -> 92.8) than before. We also have released new POS-annotated data, including a dataset of one tweet for each of 547 days. We have made available large-scale word clusters from unlabeled Twitter data (217k words, 56m tweets, 847m tokens). Tools, data, and a new technical report describing the release are available at: www.ark.cs.cmu.edu/TweetNLP . 0100100 a 1111100101110 111100000011 , Brendan

3 0.13210846 171 brendan oconnor ai-2011-06-14-How much text versus metadata is in a tweet?

Introduction: This should have been a blog post, but I got lazy and wrote a plaintext document instead. Link For twitter, context matters: 90% of a tweet is metadata and 10% is text.  That’s measured by (an approximation of) information content; by raw data size, it’s 95/5.

4 0.12353884 121 brendan oconnor ai-2008-10-17-Twitter graphs of the debate

Introduction: Fascinating, from the Twitter blog :

5 0.1038409 203 brendan oconnor ai-2014-02-19-What the ACL-2014 review scores mean

Introduction: I’ve had several people ask me what the numbers in ACL reviews mean — and I can’t find anywhere online where they’re described. (Can anyone point this out if it is somewhere?) So here’s the review form, below. They all go from 1 to 5, with 5 the best. I think the review emails to authors only include a subset of the below — for example, “Overall Recommendation” is not included? The CFP said that they have different types of review forms for different types of papers. I think this one is for a standard full paper. I guess what people really want to know is what scores tend to correspond to acceptances. I really have no idea and I get the impression this can change year to year. I have no involvement with the ACL conference besides being one of many, many reviewers. APPROPRIATENESS (1-5) Does the paper fit in ACL 2014? (Please answer this question in light of the desire to broaden the scope of the research areas represented at ACL.) 5: Certainly. 4: Probabl

6 0.092820898 129 brendan oconnor ai-2008-12-03-Statistics vs. Machine Learning, fight!

7 0.091413699 131 brendan oconnor ai-2008-12-27-Facebook sentiment mining predicts presidential polls

8 0.0865805 111 brendan oconnor ai-2008-08-16-A better Obama vs McCain poll aggregation

9 0.084330224 125 brendan oconnor ai-2008-11-21-Netflix Prize

10 0.082950339 184 brendan oconnor ai-2012-07-04-The $60,000 cat: deep belief networks make less sense for language than vision

11 0.08160045 142 brendan oconnor ai-2009-05-27-Where tweets get sent from

12 0.080444604 185 brendan oconnor ai-2012-07-17-p-values, CDF’s, NLP etc.

13 0.078577757 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics

14 0.078504294 173 brendan oconnor ai-2011-08-27-CMU Twitter Part-of-Speech tagger 0.2

15 0.077275239 189 brendan oconnor ai-2012-11-24-Graphs for SANCL-2012 web parsing results

16 0.076258913 150 brendan oconnor ai-2009-08-08-Haghighi and Klein (2009): Simple Coreference Resolution with Rich Syntactic and Semantic Features

17 0.074697748 136 brendan oconnor ai-2009-04-01-Binary classification evaluation in R via ROCR

18 0.073781133 179 brendan oconnor ai-2012-02-02-Histograms — matplotlib vs. R

19 0.071925066 2 brendan oconnor ai-2004-11-24-addiction & 2 problems of economics

20 0.071283795 200 brendan oconnor ai-2013-09-13-Response on our movie personas paper


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, -0.296), (1, -0.125), (2, -0.007), (3, 0.165), (4, -0.006), (5, -0.044), (6, -0.182), (7, -0.075), (8, -0.094), (9, 0.081), (10, 0.008), (11, 0.038), (12, 0.011), (13, 0.073), (14, -0.102), (15, -0.047), (16, -0.03), (17, -0.037), (18, 0.069), (19, 0.049), (20, -0.006), (21, 0.008), (22, 0.064), (23, 0.032), (24, -0.012), (25, -0.064), (26, 0.012), (27, -0.028), (28, -0.062), (29, 0.024), (30, -0.01), (31, 0.044), (32, 0.014), (33, 0.035), (34, -0.058), (35, -0.048), (36, -0.053), (37, -0.074), (38, 0.033), (39, -0.04), (40, -0.019), (41, 0.008), (42, -0.033), (43, -0.037), (44, 0.071), (45, -0.111), (46, -0.001), (47, -0.086), (48, -0.044), (49, 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97472501 198 brendan oconnor ai-2013-08-20-Some analysis of tweet shares and “predicting” election outcomes


2 0.6399743 171 brendan oconnor ai-2011-06-14-How much text versus metadata is in a tweet?


3 0.57490563 187 brendan oconnor ai-2012-09-21-CMU ARK Twitter Part-of-Speech Tagger – v0.3 released


4 0.51325083 173 brendan oconnor ai-2011-08-27-CMU Twitter Part-of-Speech tagger 0.2

Introduction: Announcement: We recently released a new version (0.2) of our part-of-speech tagger for English Twitter messages , along with annotations and interface. See the link for more details.

5 0.51200169 204 brendan oconnor ai-2014-04-26-Replot: departure delays vs flight time speed-up

Introduction: Here’s a re-plotting of a graph in this 538 post . It’s looking at whether pilots speed up the flight when there’s a delay, and find that it looks like that’s the case. This is averaged data for flights on several major transcontinental routes. I’ve replotted the main graph as follows. The x-axis is departure delay. The y-axis is the total trip time — number of minutes since the scheduled departure time. For an on-time departure, the average flight is 5 hours, 44 minutes. The blue line shows what the total trip time would be if the delayed flight took that long. Gray lines are uncertainty (I think the CI due to averaging). What’s going on is, the pilots seem to be targeting a total trip time of 370-380 minutes or so. If the departure is only slightly delayed by 10 minutes, the flight time is still the same, but delays in the 30-50 minutes range see a faster flight time which makes up for some of the delay. The original post plotted the y-axis as the delta against t

6 0.5058766 111 brendan oconnor ai-2008-08-16-A better Obama vs McCain poll aggregation

7 0.50108933 142 brendan oconnor ai-2009-05-27-Where tweets get sent from

8 0.46500903 185 brendan oconnor ai-2012-07-17-p-values, CDF’s, NLP etc.

9 0.46363857 181 brendan oconnor ai-2012-03-09-I don’t get this web parsing shared task

10 0.45183292 121 brendan oconnor ai-2008-10-17-Twitter graphs of the debate

11 0.44608676 131 brendan oconnor ai-2008-12-27-Facebook sentiment mining predicts presidential polls

12 0.43974295 88 brendan oconnor ai-2008-01-05-Indicators of a crackpot paper

13 0.43864727 106 brendan oconnor ai-2008-06-17-Pairwise comparisons for relevance evaluation

14 0.42932767 125 brendan oconnor ai-2008-11-21-Netflix Prize

15 0.41972318 194 brendan oconnor ai-2013-04-16-Rise and fall of Dirichlet process clusters

16 0.41634017 203 brendan oconnor ai-2014-02-19-What the ACL-2014 review scores mean

17 0.40707392 202 brendan oconnor ai-2014-02-18-Scatterplot of KN-PYP language model results

18 0.40313876 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics

19 0.3996706 90 brendan oconnor ai-2008-01-20-Moral psychology on Amazon Mechanical Turk

20 0.39811811 157 brendan oconnor ai-2009-12-31-List of probabilistic model mini-language toolkits


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.016), (14, 0.016), (16, 0.019), (24, 0.03), (44, 0.555), (48, 0.015), (55, 0.021), (70, 0.025), (74, 0.136), (75, 0.025), (80, 0.012), (83, 0.012), (97, 0.019)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99367893 131 brendan oconnor ai-2008-12-27-Facebook sentiment mining predicts presidential polls

Introduction: I’m a bit late blogging this, but here’s a messy, exciting — and statistically validated! — new online data source. My friend Roddy at Facebook wrote a post describing their sentiment analysis system , which can evaluate positive or negative sentiment toward a particular topic by looking at a large number of wall messages. (I’d link to it, but I can’t find the URL anymore — here’s the Lexicon , but that version only gets term frequencies but no sentiment.) How they constructed their sentiment detector is interesting.  Starting with a list of positive and negative terms, they had a lexical acquisition step to gather many more candidate synonyms and misspellings — a necessity in this social media domain, where WordNet ain’t gonna come close!  After manually filtering these candidates, they assess the sentiment toward a mention of a topic by looking for instances of these positive and negative words nearby, along with “negation heuristics” and a few other features. He describ

same-blog 2 0.99109304 198 brendan oconnor ai-2013-08-20-Some analysis of tweet shares and “predicting” election outcomes


3 0.97897291 181 brendan oconnor ai-2012-03-09-I don’t get this web parsing shared task

Introduction: The idea for a shared task on web parsing is really cool. But I don’t get this one: Shared Task – SANCL 2012 (First Workshop on Syntactic Analysis of Non-Canonical Language) They’re explicitly banning Manually annotating in-domain (web) sentences Creating new word clusters, or anything, from as much text data as possible … instead restricting participants to the data sets they release. Isn’t a cycle of annotation, error analysis, and new annotations (a self-training + active-learning loop, with smarter decisions through error analysis) the hands-down best way to make an NLP tool for a new domain? Are people scared of this reality? Am I off-base? I am, of course, just advocating for our Twitter POS tagger approach, where we annotated some data, made a supervised tagger, and iterated on features. The biggest weakness in that paper is we didn’t have additional iterations of error analysis. Our lack of semi-supervised learning was not a weakness.

4 0.97252172 115 brendan oconnor ai-2008-10-08-Blog move has landed

Introduction: We’re now live at a new location: anyall.org/blog .  Good-bye, Blogger, it was sometimes nice knowing you. This blog is now on WordPress (perhaps behind the times ), which I’ve usually had good experiences with, e.g. for the Dolores Labs Blog .  I also made the blog’s name more boring — the old one, “Social Science++”, was just too long and difficult to remember relative to how descriptive it was, and my interests have changed a little bit in any case. All the old posts have been imported, and I set up redirects for all posts.  The RSS feed can’t be redirected though. (One small issue: comment authors’ urls and emails failed to get imported.  I can fix it if I am given the info; if you want your old comments fixed, drop me a line.)

5 0.96501541 31 brendan oconnor ai-2006-03-18-Mark Turner: Toward the Founding of Cognitive Social Science

Introduction: Where is social science? Where should it go? How should it get there? My answer, in a nutshell, is that social science is headed for an alliance with cognitive science. Mark Turner, 2001, Chronicle of Higher Education

6 0.96501541 79 brendan oconnor ai-2007-10-13-Verificationism dinosaur comics

7 0.78536361 184 brendan oconnor ai-2012-07-04-The $60,000 cat: deep belief networks make less sense for language than vision

8 0.78157628 187 brendan oconnor ai-2012-09-21-CMU ARK Twitter Part-of-Speech Tagger – v0.3 released

9 0.78033817 129 brendan oconnor ai-2008-12-03-Statistics vs. Machine Learning, fight!

10 0.76856446 150 brendan oconnor ai-2009-08-08-Haghighi and Klein (2009): Simple Coreference Resolution with Rich Syntactic and Semantic Features

11 0.76832485 154 brendan oconnor ai-2009-09-10-Don’t MAWK AWK – the fastest and most elegant big data munging language!

12 0.76221633 32 brendan oconnor ai-2006-03-26-new kind of science, for real

13 0.7507183 2 brendan oconnor ai-2004-11-24-addiction & 2 problems of economics

14 0.71398038 189 brendan oconnor ai-2012-11-24-Graphs for SANCL-2012 web parsing results

15 0.69024253 185 brendan oconnor ai-2012-07-17-p-values, CDF’s, NLP etc.

16 0.68391925 136 brendan oconnor ai-2009-04-01-Binary classification evaluation in R via ROCR

17 0.67463946 179 brendan oconnor ai-2012-02-02-Histograms — matplotlib vs. R

18 0.67350417 107 brendan oconnor ai-2008-06-18-Turker classifiers and binary classification threshold calibration

19 0.66997147 111 brendan oconnor ai-2008-08-16-A better Obama vs McCain poll aggregation

20 0.66928101 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics