
2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good


meta infos for this blog

Source: html

Introduction: This post is by David K. Park and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. And they failed miserably. An article from opennews.org outlines six major fallacies Buzzfeed committed, the best of which resulted in the Kansas effect: “Pornhub’s writeup omitted any explicit description of their methodology—this is never a good sign—but it seems to have involved mapping the IP addresses from which users visited the site to physical addresses and reverse geocoding those to get states…. a large percentage of IP addresses could not be resolved to an address any more specific than “USA.” When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of—you guessed it—Kansas!” As a result, Kansas was 2.95 standard deviations above the mean. Those pervs! from: https://source.opennews.org/en-US/learning/distrust-your-data/
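
The Kansas effect described in the quote can be reproduced with a toy example. The sketch below is only an illustration under stated assumptions: the visit counts are made up, the `precision` field and the `count_by_state` and `z_scores` helpers are hypothetical, and nothing here reflects the actual Pornhub or Buzzfeed pipeline (which, as the quote notes, was never documented). It shows how lookups that resolve no further than "USA" inflate one state when they are mapped to the country centroid, and how dropping them removes the outlier.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumption: a country-level fallback point lands in this state.
US_CENTROID_STATE = "Kansas"

# Made-up geolocation results: (state, precision) per visit.
# precision == "country" means the lookup resolved no further than "USA".
lookups = (
    [("Kansas", "city")] * 10
    + [("Ohio", "city")] * 12
    + [("Texas", "city")] * 11
    + [("Iowa", "city")] * 9
    + [(None, "country")] * 30
)

def count_by_state(lookups, drop_country_level):
    """Count visits per state, optionally discarding country-level lookups."""
    counts = Counter()
    for state, precision in lookups:
        if precision == "country":
            if drop_country_level:
                continue
            # The naive pipeline geocodes "USA" to the centroid state.
            state = US_CENTROID_STATE
        counts[state] += 1
    return counts

def z_scores(counts):
    """Standardize counts: (count - mean) / population standard deviation."""
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    return {state: (c - mu) / sigma for state, c in counts.items()}

naive = count_by_state(lookups, drop_country_level=False)
filtered = count_by_state(lookups, drop_country_level=True)
```

Comparing `z_scores(naive)` with `z_scores(filtered)` shows Kansas as the clear outlier only in the naive version. A real analysis would also normalize by population, which this sketch omits.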


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Park  and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. [sent-2, score-0.464]

2 a large percentage of IP addresses could not be resolved to an address any more specific than “USA.” [sent-6, score-0.799]

3 When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of—you guessed it—Kansas! [sent-7, score-0.552]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('kansas', 0.359), ('addresses', 0.347), ('buzzfeed', 0.34), ('ip', 0.292), ('address', 0.16), ('ellis', 0.155), ('outlines', 0.155), ('geocoded', 0.155), ('dev', 0.155), ('visited', 0.155), ('porn', 0.155), ('writeup', 0.14), ('std', 0.14), ('fallacies', 0.135), ('guessed', 0.131), ('omitted', 0.127), ('courtesy', 0.124), ('states', 0.122), ('https', 0.12), ('resulted', 0.118), ('resolved', 0.108), ('returned', 0.108), ('committed', 0.107), ('explicit', 0.106), ('mapping', 0.103), ('park', 0.101), ('placed', 0.101), ('alex', 0.099), ('consumption', 0.095), ('reverse', 0.092), ('versus', 0.09), ('methodology', 0.089), ('failed', 0.089), ('users', 0.084), ('sign', 0.081), ('six', 0.081), ('site', 0.081), ('physical', 0.08), ('funny', 0.077), ('united', 0.076), ('description', 0.075), ('percentage', 0.073), ('involved', 0.067), ('major', 0.066), ('specific', 0.065), ('form', 0.058), ('david', 0.057), ('result', 0.054), ('state', 0.052), ('large', 0.046)]
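
The page does not say which tf-idf variant produced the weights above. As a minimal sketch, assuming the common form (term frequency times log of inverse document frequency), the following shows how such a top-N word list could be computed. The `tfidf` and `top_n` names are hypothetical and the three token lists are a toy corpus, not the blog data.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of non-empty token lists. Returns one {word: weight} dict per doc."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return weights

def top_n(doc_weights, n):
    """Return the n highest-weighted (word, weight) pairs for one document."""
    return sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Toy corpus (assumption, for illustration only).
docs = [
    ["kansas", "ip", "addresses", "the"],
    ["the", "lottery", "story"],
    ["the", "talk", "video"],
]
weights = tfidf(docs)
```

A word that appears in every document (here "the") gets weight zero, which is why rare, post-specific words such as "kansas" and "buzzfeed" dominate the list above.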

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good

2 0.16438547 1143 andrew gelman stats-2012-01-29-G+ > Skype

Introduction: I spoke at the University of Kansas the other day. Kansas is far away so I gave the talk by video. We did it using a G+ hangout, and it worked really well, much much better than when I gave a talk via Skype. With G+, I could see and hear the audience clearly, and they could hear me just fine while seeing my slides (or my face, I went back and forth). Not as good as a live presentation but pretty good, considering. P.S. And here’s how to do it! Conflict of interest disclaimer: I was paid by Google last year to give a short course.

3 0.14431222 1242 andrew gelman stats-2012-04-03-Best lottery story ever

Introduction: Kansas Man Does Not Win Lottery, Is Struck By Lightning. Finally, a story that gets the probabilities right.

4 0.084452204 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?

Introduction: Answer here (courtesy of Kaiser Fung).

5 0.082603127 599 andrew gelman stats-2011-03-03-Two interesting posts elsewhere on graphics

Introduction: Have data graphics progressed in the last century? The first addresses familiar subjects to readers of the blog, with some nice examples of where infographics emphasize the obvious, or increase the probability of an incorrect insight. Your Help Needed: the Effect of Aesthetics on Visualization I borrow the term ‘insight’ from the second link, a study by a group of design & software researchers based around a single interactive graphic. This is similar in spirit to Unwin’s ‘caption this graphic’ assignment.

6 0.078164041 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

7 0.068427406 140 andrew gelman stats-2010-07-10-SeeThroughNY

8 0.063815966 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

9 0.05443567 159 andrew gelman stats-2010-07-23-Popular governor, small state

10 0.050174929 919 andrew gelman stats-2011-09-21-Least surprising headline of the year

11 0.049995717 636 andrew gelman stats-2011-03-29-The Conservative States of America

12 0.04938633 1236 andrew gelman stats-2012-03-29-Resolution of Diederik Stapel case

13 0.047878023 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way

14 0.046928093 125 andrew gelman stats-2010-07-02-The moral of the story is, Don’t look yourself up on Google

15 0.046482004 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

16 0.045309067 1659 andrew gelman stats-2013-01-07-Some silly things you (didn’t) miss by not reading the sister blog

17 0.045055885 994 andrew gelman stats-2011-11-06-Josh Tenenbaum presents . . . a model of folk physics!

18 0.044527259 2347 andrew gelman stats-2014-05-25-Why I decided not to be a physicist

19 0.044224791 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

20 0.044142656 1385 andrew gelman stats-2012-06-20-Reconciling different claims about working-class voters
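
The similarity measure behind the simValue column is not stated on the page. A value of 1.0 for the post against itself is consistent with cosine similarity between tf-idf vectors, so the sketch below assumes that. The `cosine` and `rank_similar` helpers are hypothetical, the query uses two weights from the list above, and the other two vectors are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {word: weight}."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_similar(query, corpus):
    """Return (similarity, blog_id) pairs sorted from most to least similar."""
    return sorted(
        ((cosine(query, vec), blog_id) for blog_id, vec in corpus.items()),
        reverse=True,
    )

# Two weights taken from the tf-idf list above; the rest is made up.
query = {"kansas": 0.359, "ip": 0.292}
corpus = {
    "2346": query,                             # the post itself
    "1242": {"kansas": 0.5, "lottery": 0.5},   # shares only "kansas"
    "801": {"prior": 1.0},                     # no shared words
}
ranked = rank_similar(query, corpus)
```

Under this assumption, posts that share only the word "kansas" with the query still rank near the top, which matches the Kansas-themed entries leading the list above.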


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.065), (1, -0.022), (2, 0.028), (3, -0.009), (4, 0.003), (5, -0.006), (6, -0.009), (7, -0.011), (8, -0.016), (9, -0.003), (10, -0.006), (11, -0.004), (12, 0.02), (13, 0.003), (14, 0.015), (15, 0.024), (16, 0.006), (17, -0.006), (18, 0.001), (19, 0.012), (20, -0.017), (21, -0.008), (22, 0.007), (23, -0.027), (24, 0.014), (25, 0.011), (26, -0.022), (27, -0.001), (28, 0.008), (29, 0.001), (30, 0.008), (31, -0.048), (32, 0.026), (33, 0.003), (34, -0.027), (35, -0.014), (36, 0.007), (37, -0.02), (38, 0.003), (39, -0.009), (40, -0.005), (41, -0.02), (42, -0.005), (43, -0.033), (44, 0.029), (45, 0.006), (46, 0.016), (47, 0.036), (48, -0.023), (49, -0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95011359 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good

2 0.67204565 1239 andrew gelman stats-2012-04-01-A randomized trial of the set-point diet

Introduction: Someone pointed me to this forthcoming article in the journal Nutrition by J. F. Lee et al. It looks pretty cool. I’m glad that someone went to the effort of performing this careful study. Regular readers will know that I’ve been waiting for this one for awhile. In case you can’t read the article through the paywall, here’s the abstract: Background: Under a widely-accepted theory of caloric balance, any individual has a set-point weight and will find it uncomfortable and typically unsustainable to keep his or her weight below that point. Set-points have evidently been increasing over the past few decades in the United States and other countries, leading to a public-health crisis of obesity. In an n=1 study, Roberts (2004, 2006) proposed an intervention to lower the set-point via daily consumption of unflavored sugar water or vegetable oil. Objective: To evaluate weight-loss outcomes under the diet proposed by Roberts (2004, 2006). Design: Randomized clinica

3 0.65591025 1375 andrew gelman stats-2012-06-11-The unitary nature of consciousness: “It’s impossible to be insanely frustrated about 2 things at once”

Introduction: Dan Kahan writes: We all know it’s ridiculous to be able to go on an fMRI fishing trip & resort to post hoc story-telling to explain the “significant” correlations one (inevitably) observes (good fMRI studies *don’t* do this; only bad ones do– to the injury of the reputation of all the scholars doing good studies of this kind). But now one doesn’t even need correlations that support the post-hoc inferences one is drawing. This one’s good. Kahan continues: Headline: Religious Experiences Shrink Part of the Brain text: ” … The study, published March 30 [2011] in PLoS One, showed greater atrophy in the hippocampus in individuals who identify with specific religious groups as well as those with no religious affiliation … The results showed significantly greater hippocampal atrophy in individuals reporting a life-changing religious experience. In addition, they found significantly greater hippocampal atrophy among born-again Protestants, Catholics, and those with no religiou

4 0.62281454 812 andrew gelman stats-2011-07-21-Confusion about “rigging the numbers,” the support of ideological opposites, who’s a 501(c)(3), and the asymmetry of media bias

Introduction: One of my left-wing colleagues pointed me to this Fox TV interview in which UCLA political scientist Tim Groseclose expresses displeasure with having his research criticized by liberal advocacy group Media Matters for America. My colleague thought it was irresponsible and unprofessional for Groseclose to get all indignant about the criticism. But I understood. I remember how after the state Attorney General’s office released the study Jeff Fagan and I did on police stops ( see here for the research-paper version), we were viciously attacked. Some creep from the NYC Law Department sent a nasty letter full of accusations that were . . . I’d say “bullshit” but I don’t want to say that because “bullshit” contains the word “shit” and I don’t want to use profanity on this blog . . . anyway, this lawyer creep sent us an aggressive letter with bogus claims about our research competence. He could’ve just said: Yes, the NYPD stops ethnic minorities at a rate disproportionate to their c

5 0.60576588 1397 andrew gelman stats-2012-06-27-Stand Your Ground laws and homicides

Introduction: Jeff points me to a paper by Chandler McClellan and Erdal Tekin which begins as follows: The controversies surrounding Stand Your Ground laws have recently captured the nation’s attention. Since 2005, eighteen states have passed laws extending the right to self-defense with no duty to retreat to any place a person has a legal right to be, and several additional states are debating the adoption of similar legislation. Despite the implications that these laws may have for public safety, there has been little empirical investigation of their impact on crime and victimization. In this paper, we use monthly data from the U.S. Vital Statistics to examine how Stand Your Ground laws affect homicides. We identify the impact of these laws by exploiting variation in the effective date of these laws across states. Our results indicate that Stand Your Ground laws are associated with a significant increase in the number of homicides among whites, especially white males. According to our estimat

6 0.59729224 683 andrew gelman stats-2011-04-28-Asymmetry in Political Bias

7 0.5960927 2087 andrew gelman stats-2013-11-03-The Employment Nondiscrimination Act is overwhelmingly popular in nearly every one of the 50 states

8 0.59191179 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way

9 0.59167063 1504 andrew gelman stats-2012-09-20-Could someone please lock this guy and Niall Ferguson in a room together?

10 0.58531547 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

11 0.57716244 827 andrew gelman stats-2011-07-28-Amusing case of self-defeating science writing

12 0.57588667 1263 andrew gelman stats-2012-04-13-Question of the week: Will the authors of a controversial new study apologize to busy statistician Don Berry for wasting his time reading and responding to their flawed article?

13 0.55464947 287 andrew gelman stats-2010-09-20-Paul Rosenbaum on those annoying pre-treatment variables that are sort-of instruments and sort-of covariates

14 0.55287141 159 andrew gelman stats-2010-07-23-Popular governor, small state

15 0.55225474 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?

16 0.54932976 828 andrew gelman stats-2011-07-28-Thoughts on Groseclose book on media bias

17 0.54896295 382 andrew gelman stats-2010-10-30-“Presidential Election Outcomes Directly Influence Suicide Rates”

18 0.5486396 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

19 0.54850471 1364 andrew gelman stats-2012-06-04-Massive confusion about a study that purports to show that exercise may increase heart risk

20 0.54714751 731 andrew gelman stats-2011-05-26-Lottery probability update


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.02), (14, 0.038), (15, 0.018), (16, 0.081), (24, 0.178), (30, 0.031), (31, 0.059), (35, 0.022), (41, 0.02), (56, 0.039), (71, 0.016), (72, 0.046), (73, 0.13), (82, 0.022), (89, 0.085), (99, 0.085)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91538978 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good

2 0.76175547 1748 andrew gelman stats-2013-03-04-PyStan!

Introduction: Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely free and open-source. P.S. Because Stan is open-source, it of course would also be possible for people to translate Stan into Python, or to take whatever features they like from Stan and incorporate them into a Python package. That’s fine too. But we think it would make sense in addition for users to be able to run Stan directly from Python, in the same way that it can be run from R.

3 0.75043929 917 andrew gelman stats-2011-09-20-Last post on Hipmunk

Introduction: There was some confusion on my last try, so let me explain one more time . . . The flights where Hipmunk failed (see here for background) were not obscure itineraries. One of them was a nonstop from New York to Cincinnati; another was from NY to Durham, North Carolina; and yet another was a trip to Midway in Chicago. In that last case, Hipmunk showed no nonstops at all—which will come as a surprise to the passengers on the Southwest Airlines flight I was on a couple days ago! In these cases, Hipmunk didn’t even do the courtesy of flashing a message telling me to try elsewhere. I don’t understand. How hard would it be for the program to automatically do a Kayak search and find all the flights? Hipmunk’s graphics are great, though. Lee Wilkinson reports: Check out the figure below from The Grammar of Graphics. Dan Rope invented this graphic and programmed it in Java in the late 1990s. We shopped this graph around to Orbitz and Expedia but they weren’t interested. So I

4 0.74582273 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

Introduction: Nick Polson and James Scott write: We generalize the half-Cauchy prior for a global scale parameter to the wider class of hypergeometric inverted-beta priors. We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. Finally, we prove a result that characterizes the frequentist risk of the Bayes estimators under all priors in the class. These arguments provide an alternative, classical justification for the use of the half-Cauchy prior in Bayesian hierarchical models, complementing the arguments in Gelman (2006). This makes me happy, of course. It’s great to be validated. The only thing I didn’t catch is how they set the scale parameter for the half-Cauchy prior. In my 2006 paper I frame it as a weakly informative prior and recommend that the scale be set based on actual prior knowledge. But Polson and Scott are talking about a default choice. I used to think that such a

5 0.71990681 593 andrew gelman stats-2011-02-27-Heat map

Introduction: Jarad Niemi sends along this plot: and writes: 2010-2011 Miami Heat offensive (red), defensive (blue), and combined (black) player contribution means (dots) and 95% credible intervals (lines) where zero indicates an average NBA player. Larger positive numbers for offensive and combined are better while larger negative numbers for defense are better. In retrospect, I [Niemi] should have plotted -1*defensive_contribution so that larger was always better. The main point with this figure is that this awesome combination of James-Wade-Bosh that was discussed immediately after the LeBron trade to the Heat has a one-of-these-things-is-not-like-the-other aspect. At least according to my analysis, Bosh is hurting his team compared to the average player (although not statistically significant) due to his terrible defensive contribution (which is statistically significant). All fine so far. But the punchline comes at the end, when he writes: Anyway, a reviewer said he hated the

6 0.71759909 1708 andrew gelman stats-2013-02-05-Wouldn’t it be cool if Glenn Hubbard were consulting for Herbalife and I were on the other side?

7 0.71365321 1787 andrew gelman stats-2013-04-04-Wanna be the next Tyler Cowen? It’s not as easy as you might think!

8 0.71176487 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

9 0.71160775 2229 andrew gelman stats-2014-02-28-God-leaf-tree

10 0.71135384 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

11 0.70556492 938 andrew gelman stats-2011-10-03-Comparing prediction errors

12 0.70520294 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

13 0.70458221 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

14 0.70313972 1479 andrew gelman stats-2012-09-01-Mothers and Moms

15 0.70264888 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census

16 0.70233405 846 andrew gelman stats-2011-08-09-Default priors update?

17 0.70230496 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

18 0.70202756 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

19 0.70151103 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

20 0.70121688 1875 andrew gelman stats-2013-05-28-Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from