andrew_gelman_stats andrew_gelman_stats-2014 andrew_gelman_stats-2014-2346 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: This post is by David K. Park and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. And they failed miserably. An article from opennews.org outlines six major fallacies Buzzfeed committed, the best of which resulted in the Kansas effect: “Pornhub’s writeup omitted any explicit description of their methodology (this is never a good sign) but it seems to have involved mapping the IP addresses from which users visited the site to physical addresses and reverse geocoding those to get states…. a large percentage of IP addresses could not be resolved to an address any more specific than ‘USA.’ When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of (you guessed it) Kansas!” As a result, Kansas was 2.95 standard deviations above the mean. Those pervs! From: https://source.opennews.org/en-US/learning/distrust-your-data/
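The Kansas effect described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of Pornhub's or Buzzfeed's pipeline: the visit counts are made up, the `US_CENTROID` coordinates are an approximate center of the contiguous United States, and the helper names `count_by_state` and `z_scores` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumption: approximate geographic center of the contiguous United
# States (it lies in Kansas). A real geocoder may return a different point.
US_CENTROID = (39.83, -98.58)


def count_by_state(records, drop_fallback=False):
    """Count visits per state from (lat, lon, state) records.

    With drop_fallback=True, records sitting exactly on the country-level
    fallback point are treated as unresolved and skipped. Exact matching
    is a simplification; real data would likely need a tolerance.
    """
    counts = Counter()
    for lat, lon, state in records:
        if drop_fallback and (lat, lon) == US_CENTROID:
            continue
        counts[state] += 1
    return counts


def z_scores(counts):
    """Standard score of each state's count relative to all states."""
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All states equal: no state deviates from the mean.
        return {state: 0.0 for state in counts}
    return {state: (n - mu) / sigma for state, n in counts.items()}


# Made-up data: 100 properly resolved visits in each of five states.
state_points = {
    "Kansas": (38.5, -98.0),
    "Ohio": (40.4, -82.9),
    "Texas": (31.0, -99.0),
    "Oregon": (44.0, -120.5),
    "Maine": (45.3, -69.0),
}
resolved = []
for state, (lat, lon) in state_points.items():
    resolved += [(lat, lon, state)] * 100

# Made-up data: 300 visits known only as "USA". Geocoding that address
# yields the centroid, which a reverse geocoder then labels as Kansas.
unresolved = [(US_CENTROID[0], US_CENTROID[1], "Kansas")] * 300

naive = count_by_state(resolved + unresolved)
cleaned = count_by_state(resolved + unresolved, drop_fallback=True)

print("Kansas visits, naive vs cleaned:", naive["Kansas"], cleaned["Kansas"])
print("Kansas z-score, naive:", round(z_scores(naive)["Kansas"], 2))
print("Kansas z-score, cleaned:", z_scores(cleaned)["Kansas"])
```

In this toy setup the unresolved addresses alone push Kansas well above the other states, and the gap disappears once records sitting on the fallback point are dropped. In practice it is probably safer to carry an explicit "unresolved" flag or accuracy estimate from the geolocation step than to match coordinates after the fact.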
sentIndex sentText sentNum sentScore
1 Park and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. [sent-2, score-0.464]
2 a large percentage of IP addresses could not be resolved to an address any more specific than “USA. [sent-6, score-0.799]
3 ” When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of—you guessed it—Kansas! [sent-7, score-0.552]
wordName wordTfidf (topN-words)
[('kansas', 0.359), ('addresses', 0.347), ('buzzfeed', 0.34), ('ip', 0.292), ('address', 0.16), ('ellis', 0.155), ('outlines', 0.155), ('geocoded', 0.155), ('dev', 0.155), ('visited', 0.155), ('porn', 0.155), ('writeup', 0.14), ('std', 0.14), ('fallacies', 0.135), ('guessed', 0.131), ('omitted', 0.127), ('courtesy', 0.124), ('states', 0.122), ('https', 0.12), ('resulted', 0.118), ('resolved', 0.108), ('returned', 0.108), ('committed', 0.107), ('explicit', 0.106), ('mapping', 0.103), ('park', 0.101), ('placed', 0.101), ('alex', 0.099), ('consumption', 0.095), ('reverse', 0.092), ('versus', 0.09), ('methodology', 0.089), ('failed', 0.089), ('users', 0.084), ('sign', 0.081), ('six', 0.081), ('site', 0.081), ('physical', 0.08), ('funny', 0.077), ('united', 0.076), ('description', 0.075), ('percentage', 0.073), ('involved', 0.067), ('major', 0.066), ('specific', 0.065), ('form', 0.058), ('david', 0.057), ('result', 0.054), ('state', 0.052), ('large', 0.046)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good
2 0.16438547 1143 andrew gelman stats-2012-01-29-G+ > Skype
Introduction: I spoke at the University of Kansas the other day. Kansas is far away so I gave the talk by video. We did it using a G+ hangout, and it worked really well, much, much better than when I gave a talk via Skype. With G+, I could see and hear the audience clearly, and they could hear me just fine while seeing my slides (or my face; I went back and forth). Not as good as a live presentation but pretty good, considering. P.S. And here’s how to do it! Conflict of interest disclaimer: I was paid by Google last year to give a short course.
3 0.14431222 1242 andrew gelman stats-2012-04-03-Best lottery story ever
Introduction: Kansas Man Does Not Win Lottery, Is Struck By Lightning. Finally, a story that gets the probabilities right.
4 0.084452204 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?
Introduction: Answer here (courtesy of Kaiser Fung).
5 0.082603127 599 andrew gelman stats-2011-03-03-Two interesting posts elsewhere on graphics
Introduction: Have data graphics progressed in the last century? The first addresses familiar subjects to readers of the blog, with some nice examples of where infographics emphasize the obvious, or increase the probability of an incorrect insight. Your Help Needed: the Effect of Aesthetics on Visualization I borrow the term ‘insight’ from the second link, a study by a group of design & software researchers based around a single interactive graphic. This is similar in spirit to Unwin’s ‘caption this graphic’ assignment.
6 0.078164041 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission
7 0.068427406 140 andrew gelman stats-2010-07-10-SeeThroughNY
9 0.05443567 159 andrew gelman stats-2010-07-23-Popular governor, small state
10 0.050174929 919 andrew gelman stats-2011-09-21-Least surprising headline of the year
11 0.049995717 636 andrew gelman stats-2011-03-29-The Conservative States of America
12 0.04938633 1236 andrew gelman stats-2012-03-29-Resolution of Diederik Stapel case
13 0.047878023 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way
14 0.046928093 125 andrew gelman stats-2010-07-02-The moral of the story is, Don’t look yourself up on Google
15 0.046482004 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update
16 0.045309067 1659 andrew gelman stats-2013-01-07-Some silly things you (didn’t) miss by not reading the sister blog
17 0.045055885 994 andrew gelman stats-2011-11-06-Josh Tenenbaum presents . . . a model of folk physics!
18 0.044527259 2347 andrew gelman stats-2014-05-25-Why I decided not to be a physicist
19 0.044224791 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?
20 0.044142656 1385 andrew gelman stats-2012-06-20-Reconciling different claims about working-class voters
topicId topicWeight
[(0, 0.065), (1, -0.022), (2, 0.028), (3, -0.009), (4, 0.003), (5, -0.006), (6, -0.009), (7, -0.011), (8, -0.016), (9, -0.003), (10, -0.006), (11, -0.004), (12, 0.02), (13, 0.003), (14, 0.015), (15, 0.024), (16, 0.006), (17, -0.006), (18, 0.001), (19, 0.012), (20, -0.017), (21, -0.008), (22, 0.007), (23, -0.027), (24, 0.014), (25, 0.011), (26, -0.022), (27, -0.001), (28, 0.008), (29, 0.001), (30, 0.008), (31, -0.048), (32, 0.026), (33, 0.003), (34, -0.027), (35, -0.014), (36, 0.007), (37, -0.02), (38, 0.003), (39, -0.009), (40, -0.005), (41, -0.02), (42, -0.005), (43, -0.033), (44, 0.029), (45, 0.006), (46, 0.016), (47, 0.036), (48, -0.023), (49, -0.013)]
simIndex simValue blogId blogTitle
same-blog 1 0.95011359 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good
2 0.67204565 1239 andrew gelman stats-2012-04-01-A randomized trial of the set-point diet
Introduction: Someone pointed me to this forthcoming article in the journal Nutrition by J. F. Lee et al. It looks pretty cool. I’m glad that someone went to the effort of performing this careful study. Regular readers will know that I’ve been waiting for this one for awhile. In case you can’t read the article through the paywall, here’s the abstract: Background: Under a widely-accepted theory of caloric balance, any individual has a set-point weight and will find it uncomfortable and typically unsustainable to keep his or her weight below that point. Set-points have evidently been increasing over the past few decades in the United States and other countries, leading to a public-health crisis of obesity. In an n=1 study, Roberts (2004, 2006) proposed an intervention to lower the set-point via daily consumption of unflavored sugar water or vegetable oil. Objective: To evaluate weight-loss outcomes under the diet proposed by Roberts (2004, 2006). Design: Randomized clinica
Introduction: Dan Kahan writes: We all know it’s ridiculous to be able to go on an fMRI fishing trip & resort to post hoc story-telling to explain the “significant” correlations one (inevitably) observes (good fMRI studies *don’t* do this; only bad ones do, to the injury of the reputation of all the scholars doing good studies of this kind). But now one doesn’t even need correlations that support the post-hoc inferences one is drawing. This one’s good. Kahan continues: Headline: Religious Experiences Shrink Part of the Brain text: ” … The study, published March 30 [2011] in PLoS One, showed greater atrophy in the hippocampus in individuals who identify with specific religious groups as well as those with no religious affiliation … The results showed significantly greater hippocampal atrophy in individuals reporting a life-changing religious experience. In addition, they found significantly greater hippocampal atrophy among born-again Protestants, Catholics, and those with no religiou
Introduction: One of my left-wing colleagues pointed me to this Fox TV interview in which UCLA political scientist Tim Groseclose expresses displeasure with having his research criticized by liberal advocacy group Media Matters for America. My colleague thought it was irresponsible and unprofessional for Groseclose to get all indignant about the criticism. But I understood. I remember how after the state Attorney General’s office released the study Jeff Fagan and I did on police stops (see here for the research-paper version), we were viciously attacked. Some creep from the NYC Law Department sent a nasty letter full of accusations that were . . . I’d say “bullshit” but I don’t want to say that because “bullshit” contains the word “shit” and I don’t want to use profanity on this blog . . . anyway, this lawyer creep sent us an aggressive letter with bogus claims about our research competence. He could’ve just said: Yes, the NYPD stops ethnic minorities at a rate disproportionate to their c
5 0.60576588 1397 andrew gelman stats-2012-06-27-Stand Your Ground laws and homicides
Introduction: Jeff points me to a paper by Chandler McClellan and Erdal Tekin which begins as follows: The controversies surrounding Stand Your Ground laws have recently captured the nation’s attention. Since 2005, eighteen states have passed laws extending the right to self-defense with no duty to retreat to any place a person has a legal right to be, and several additional states are debating the adoption of similar legislation. Despite the implications that these laws may have for public safety, there has been little empirical investigation of their impact on crime and victimization. In this paper, we use monthly data from the U.S. Vital Statistics to examine how Stand Your Ground laws affect homicides. We identify the impact of these laws by exploiting variation in the effective date of these laws across states. Our results indicate that Stand Your Ground laws are associated with a significant increase in the number of homicides among whites, especially white males. According to our estimat
6 0.59729224 683 andrew gelman stats-2011-04-28-Asymmetry in Political Bias
8 0.59191179 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way
9 0.59167063 1504 andrew gelman stats-2012-09-20-Could someone please lock this guy and Niall Ferguson in a room together?
10 0.58531547 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers
11 0.57716244 827 andrew gelman stats-2011-07-28-Amusing case of self-defeating science writing
14 0.55287141 159 andrew gelman stats-2010-07-23-Popular governor, small state
15 0.55225474 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?
16 0.54932976 828 andrew gelman stats-2011-07-28-Thoughts on Groseclose book on media bias
17 0.54896295 382 andrew gelman stats-2010-10-30-“Presidential Election Outcomes Directly Influence Suicide Rates”
18 0.5486396 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population
20 0.54714751 731 andrew gelman stats-2011-05-26-Lottery probability update
topicId topicWeight
[(9, 0.02), (14, 0.038), (15, 0.018), (16, 0.081), (24, 0.178), (30, 0.031), (31, 0.059), (35, 0.022), (41, 0.02), (56, 0.039), (71, 0.016), (72, 0.046), (73, 0.13), (82, 0.022), (89, 0.085), (99, 0.085)]
simIndex simValue blogId blogTitle
same-blog 1 0.91538978 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good
2 0.76175547 1748 andrew gelman stats-2013-03-04-PyStan!
Introduction: Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely free and open-source. P.S. Because Stan is open-source, it of course would also be possible for people to translate Stan into Python, or to take whatever features they like from Stan and incorporate them into a Python package. That’s fine too. But we think it would make sense in addition for users to be able to run Stan directly from Python, in the same way that it can be run from R.
3 0.75043929 917 andrew gelman stats-2011-09-20-Last post on Hipmunk
Introduction: There was some confusion on my last try, so let me explain one more time . . . The flights where Hipmunk failed (see here for background) were not obscure itineraries. One of them was a nonstop from New York to Cincinnati; another was from NY to Durham, North Carolina; and yet another was a trip to Midway in Chicago. In that last case, Hipmunk showed no nonstops at all, which will come as a surprise to the passengers on the Southwest Airlines flight I was on a couple days ago! In these cases, Hipmunk didn’t even do the courtesy of flashing a message telling me to try elsewhere. I don’t understand. How hard would it be for the program to automatically do a Kayak search and find all the flights? Hipmunk’s graphics are great, though. Lee Wilkinson reports: Check out the figure below from The Grammar of Graphics. Dan Rope invented this graphic and programmed it in Java in the late 1990s. We shopped this graph around to Orbitz and Expedia but they weren’t interested. So I
4 0.74582273 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter
Introduction: Nick Polson and James Scott write: We generalize the half-Cauchy prior for a global scale parameter to the wider class of hypergeometric inverted-beta priors. We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. Finally, we prove a result that characterizes the frequentist risk of the Bayes estimators under all priors in the class. These arguments provide an alternative, classical justification for the use of the half-Cauchy prior in Bayesian hierarchical models, complementing the arguments in Gelman (2006). This makes me happy, of course. It’s great to be validated. The only thing I didn’t catch is how they set the scale parameter for the half-Cauchy prior. In my 2006 paper I frame it as a weakly informative prior and recommend that the scale be set based on actual prior knowledge. But Polson and Scott are talking about a default choice. I used to think that such a
5 0.71990681 593 andrew gelman stats-2011-02-27-Heat map
Introduction: Jarad Niemi sends along this plot: and writes: 2010-2011 Miami Heat offensive (red), defensive (blue), and combined (black) player contribution means (dots) and 95% credible intervals (lines) where zero indicates an average NBA player. Larger positive numbers for offensive and combined are better while larger negative numbers for defense are better. In retrospect, I [Niemi] should have plotted -1*defensive_contribution so that larger was always better. The main point with this figure is that this awesome combination of James-Wade-Bosh that was discussed immediately after the LeBron trade to the Heat has a one-of-these-things-is-not-like-the-other aspect. At least according to my analysis, Bosh is hurting his team compared to the average player (although not statistically significant) due to his terrible defensive contribution (which is statistically significant). All fine so far. But the punchline comes at the end, when he writes: Anyway, a reviewer said he hated the
7 0.71365321 1787 andrew gelman stats-2013-04-04-Wanna be the next Tyler Cowen? It’s not as easy as you might think!
8 0.71176487 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?
9 0.71160775 2229 andrew gelman stats-2014-02-28-God-leaf-tree
10 0.71135384 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism
11 0.70556492 938 andrew gelman stats-2011-10-03-Comparing prediction errors
12 0.70520294 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors
13 0.70458221 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies
14 0.70313972 1479 andrew gelman stats-2012-09-01-Mothers and Moms
15 0.70264888 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census
16 0.70233405 846 andrew gelman stats-2011-08-09-Default priors update?
18 0.70202756 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
19 0.70151103 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense