andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1971 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Following up on my regression-discontinuity post from the other day, Brad DeLong writes: The feel (and I could well be wrong) is that at some point somebody said: “This is very important, but it won’t get published without a statistically significant headline finding. Torture the data via specification search until we find a statistically significant effect so that this can get published!” I think DeLong is mistaken here. But, before getting to this, here’s the graph: and here are the regression results: So, indeed it is that cubic term that takes the result into statistical significance. The reason I disagree with DeLong is that it’s my impression that, in econometrics and applied economics, it’s considered the safe, conservative choice in regression discontinuity to control for a high-degree polynomial. See the paper discussed a few years ago here, for example, where I criticized a pair of economists for using a fifth-degree specification and they replie
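To make the role of the polynomial degree concrete, here is a minimal sketch on synthetic data (not the data from the paper under discussion). It assumes one global polynomial in the running variable plus a jump indicator at a cutoff of 0, fitted by ordinary least squares. The names `solve_linear_system`, `rd_estimate`, and `simulate`, and every parameter value, are hypothetical choices made for illustration only.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rd_estimate(x, y, degree):
    """Estimated jump at cutoff 0, adjusting for a global polynomial in x."""
    # Design matrix columns: intercept, jump indicator, x, x^2, ..., x^degree.
    rows = [[1.0, 1.0 if xi >= 0 else 0.0] + [xi ** k for k in range(1, degree + 1)]
            for xi in x]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return beta[1]  # coefficient on the jump indicator


def simulate(n=400, true_jump=0.0, noise_sd=0.3, seed=1):
    """Synthetic data: a smooth cubic trend, an optional jump, Gaussian noise."""
    rng = random.Random(seed)
    x = [-1 + (2 * i + 1) / n for i in range(n)]  # even grid on (-1, 1), skips 0
    y = [true_jump * (xi >= 0) + 1 + 0.5 * xi - xi ** 3 + rng.gauss(0.0, noise_sd)
         for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    for degree in (1, 3, 5):
        print("degree", degree, "estimated jump", round(rd_estimate(x, y, degree), 3))
```

In this toy setup the true jump is zero and the underlying curve is cubic, so a linear adjustment attributes some of the curvature to the discontinuity while a cubic adjustment does not. That only illustrates why the choice of degree can move the headline estimate; it says nothing about which degree is appropriate for real data.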
sentIndex sentText sentNum sentScore
1 Following up on my regression-discontinuity post from the other day, Brad DeLong writes: The feel (and I could well be wrong) is that at some point somebody said: “This is very important, but it won’t get published without a statistically significant headline finding. [sent-1, score-0.349]
2 Torture the data via specification search until we find a statistically significant effect so that this can get published! [sent-2, score-0.77]
3 But, before getting to this, here’s the graph: and here are the regression results: So, indeed it is that cubic term that takes the result into statistical significance. [sent-4, score-0.36]
4 The reason I disagree with DeLong is that it’s my impression that, in econometrics and applied economics, it’s considered the safe, conservative choice in regression discontinuity to control for a high-degree polynomial. [sent-5, score-0.48]
5 In which case the four additional degrees of freedom required to ramp up from a linear to a 5th-degree adjustment are a small price to pay if you have a large or even moderate sample size. [sent-8, score-0.355]
6 And in this case, sure, the cubic polynomial looks ridiculous, but a linear fit would be even worse (as the authors found using their model-fit statistics). [sent-9, score-0.89]
7 I’m guessing that the authors were doing what they thought was right and proper by choosing the best-fitting of these polynomials. [sent-10, score-0.229]
8 What if the result had been statistically significant with linear adjustment but not with a higher-degree polynomial? [sent-11, score-0.64]
9 Would they have presented the statistically significant linear result and stopped there? [sent-13, score-0.531]
10 But, given my impression of how economists think about regression discontinuity analysis, my guess is that, given the data the authors did see, they did not do a specification search; they just did what they thought was the most kosher analysis possible. [sent-15, score-1.228]
11 had violated the rules of the game (in this case, not by faking or improperly discarding data but by trying analysis after analysis in a search for statistical significance), this would be a problem, but it’s a containable problem. [sent-17, score-0.766]
12 The rules are (relatively) clear, and you’re not supposed to break them. [sent-18, score-0.171]
13 did what, under current doctrine, they were supposed to do: find a discontinuity and adjust using a high-degree polynomial. [sent-21, score-0.451]
14 When the recommended analysis has such problems of face validity, that’s a different problem entirely. [sent-22, score-0.114]
15 As the (sometimes) great Michael Kinsley once said, in a different context, “the scandal isn’t what’s illegal, the scandal is what’s legal. [sent-23, score-0.274]
16 Just to clarify: Not only do I not think that Chen et al. [sent-26, score-0.112]
17 “cheated” (in the sense of trying out many specifications in a search for statistical significance), I never thought so. [sent-27, score-0.301]
18 As I wrote in my original post, I applaud the authors’ directness in graphing their model which reveals its problems. [sent-28, score-0.209]
19 My post title, “I doubt they cheated,” is specifically in response to Brad DeLong’s feeling that they “tortured the data via specification search. [sent-29, score-0.399]
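The concern about "trying analysis after analysis in a search for statistical significance" can be sketched with a small simulation. This is an illustration under stated assumptions, not a reconstruction of anyone's actual workflow: the data are pure noise with no jump, the candidate specifications are global polynomials of degree 1 to 5, and a result counts as "significant" when the absolute t-statistic exceeds 1.96. The function names `jump_t_statistic` and `false_positive_rates` and all defaults are hypothetical.

```python
import math
import random


def jump_t_statistic(x, y, degree):
    """t-statistic for the jump at cutoff 0 under a global polynomial in x."""
    rows = [[1.0, 1.0 if xi >= 0 else 0.0] + [xi ** k for k in range(1, degree + 1)]
            for xi in x]
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    # Gauss-Jordan elimination on [X'X | X'y | I] yields beta and (X'X)^-1.
    aug = [xtx[i] + [xty[i]] + [1.0 if i == j else 0.0 for j in range(p)]
           for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    beta = [aug[i][p] for i in range(p)]
    inv_jump = aug[1][p + 2]  # diagonal entry of (X'X)^-1 for the jump term
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    sigma2 = rss / (n - p)  # residual variance estimate
    return beta[1] / math.sqrt(sigma2 * inv_jump)


def false_positive_rates(n_sims=300, n=100, degrees=(1, 2, 3, 4, 5),
                         fixed_degree=1, critical=1.96, seed=0):
    """Compare a prespecified degree with reporting any significant degree."""
    rng = random.Random(seed)
    x = [-1 + (2 * i + 1) / n for i in range(n)]
    fixed_hits = any_hits = 0
    for _ in range(n_sims):
        y = [rng.gauss(0.0, 1.0) for _ in x]  # null world: no jump at all
        significant = {d: abs(jump_t_statistic(x, y, d)) > critical for d in degrees}
        fixed_hits += significant[fixed_degree]
        any_hits += any(significant.values())
    return fixed_hits / n_sims, any_hits / n_sims


if __name__ == "__main__":
    fixed_rate, any_rate = false_positive_rates()
    print("prespecified degree:", fixed_rate)
    print("best of all degrees:", any_rate)
```

By construction the second rate can never be lower than the first, since the prespecified degree is one of the candidates. How much larger it is depends on how correlated the specifications are. The contrast with the post's argument is that choosing a high-degree polynomial because doctrine recommends it is a single prespecified analysis, not a search.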
wordName wordTfidf (topN-words)
[('delong', 0.299), ('discontinuity', 0.291), ('polynomial', 0.265), ('specification', 0.253), ('chen', 0.205), ('authors', 0.163), ('search', 0.158), ('linear', 0.157), ('cubic', 0.154), ('cheated', 0.15), ('statistically', 0.147), ('scandal', 0.137), ('significant', 0.134), ('brad', 0.125), ('analysis', 0.114), ('regression', 0.113), ('et', 0.112), ('adjustment', 0.109), ('economics', 0.106), ('degree', 0.095), ('result', 0.093), ('methodologists', 0.089), ('ramp', 0.089), ('supposed', 0.086), ('rules', 0.085), ('improperly', 0.083), ('kinsley', 0.08), ('directness', 0.08), ('significance', 0.078), ('via', 0.078), ('worse', 0.077), ('specifications', 0.077), ('kosher', 0.077), ('impression', 0.076), ('economists', 0.075), ('unbiasedness', 0.075), ('using', 0.074), ('torture', 0.073), ('doctrine', 0.071), ('faking', 0.071), ('discarding', 0.071), ('violated', 0.07), ('post', 0.068), ('expense', 0.067), ('applaud', 0.066), ('thought', 0.066), ('discussed', 0.066), ('mistaken', 0.063), ('graphing', 0.063), ('illegal', 0.063)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999976 1971 andrew gelman stats-2013-08-07-I doubt they cheated
Introduction: Yu Xie thought I’d have something to say about this recent paper , “Evidence on the impact of sustained exposure to air pollution on life expectancy from China’s Huai River policy,” by Yuyu Chen, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, which begins: This paper’s findings suggest that an arbitrary Chinese policy that greatly increases total suspended particulates (TSPs) air pollution is causing the 500 million residents of Northern China to lose more than 2.5 billion life years of life expectancy. The quasi-experimental empirical approach is based on China’s Huai River policy, which provided free winter heating via the provision of coal for boilers in cities north of the Huai River but denied heat to the south. Using a regression discontinuity design based on distance from the Huai River, we find that ambient concentrations of TSPs are about 184 μg/m3 [95% confidence interval (CI): 61, 307] or 55% higher in the north. Further, the results indicate that life expectanci
Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou
4 0.14076965 2365 andrew gelman stats-2014-06-09-I hate polynomials
Introduction: A recent discussion with Mark Palko [scroll down to the comments at this link ] reminds me that I think that polynomials are way way overrated, and I think a lot of damage has arisen from the old-time approach of introducing polynomial functions as a canonical example of linear regressions ( for example ). There are very few settings I can think of where it makes sense to fit a general polynomial of degree higher than 2. I think that millions of students have been brainwashed into thinking of these as the canonical functions and that this has caused endless trouble later on. I’m not sure how I’d change the high school math curriculum to deal with this, but I do think it’s an issue.
5 0.1404262 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
Introduction: Robert Bell pointed me to this post by Brad De Long on Bayesian statistics, and then I also noticed this from Noah Smith, who wrote: My impression is that although the Bayesian/Frequentist debate is interesting and intellectually fun, there’s really not much “there” there… despite being so-hip-right-now, Bayesian is not the Statistical Jesus. I’m happy to see the discussion going in this direction. Twenty-five years ago or so, when I got into this biz, there were some serious anti-Bayesian attitudes floating around in mainstream statistics. Discussions in the journals sometimes devolved into debates of the form, “Bayesians: knaves or fools?”. You’d get all sorts of free-floating skepticism about any prior distribution at all, even while people were accepting without question (and doing theory on) logistic regressions, proportional hazards models, and all sorts of strong strong models. (In the subfield of survey sampling, various prominent researchers would refuse to mode
6 0.13917437 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06
7 0.13015206 899 andrew gelman stats-2011-09-10-The statistical significance filter
9 0.12289365 146 andrew gelman stats-2010-07-14-The statistics and the science
10 0.11798137 451 andrew gelman stats-2010-12-05-What do practitioners need to know about regression?
11 0.11309274 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?
12 0.10883762 431 andrew gelman stats-2010-11-26-One fun thing about physicists . . .
14 0.10588156 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
15 0.10384423 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research
16 0.10267921 1435 andrew gelman stats-2012-07-30-Retracted articles and unethical behavior in economics journals?
19 0.09677735 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning
20 0.096191742 2004 andrew gelman stats-2013-09-01-Post-publication peer review: How it (sometimes) really works
topicId topicWeight
[(0, 0.216), (1, 0.022), (2, 0.016), (3, -0.095), (4, 0.012), (5, -0.061), (6, -0.02), (7, -0.021), (8, 0.035), (9, 0.029), (10, 0.017), (11, 0.0), (12, -0.002), (13, -0.025), (14, 0.034), (15, 0.025), (16, -0.007), (17, 0.007), (18, 0.003), (19, -0.004), (20, 0.021), (21, 0.038), (22, 0.009), (23, -0.002), (24, 0.008), (25, 0.028), (26, 0.018), (27, -0.077), (28, -0.004), (29, -0.031), (30, 0.071), (31, 0.053), (32, 0.019), (33, -0.023), (34, 0.019), (35, 0.032), (36, -0.069), (37, -0.013), (38, -0.023), (39, 0.002), (40, 0.007), (41, 0.013), (42, -0.05), (43, 0.031), (44, 0.047), (45, -0.038), (46, 0.006), (47, -0.034), (48, 0.015), (49, -0.041)]
simIndex simValue blogId blogTitle
same-blog 1 0.97420698 1971 andrew gelman stats-2013-08-07-I doubt they cheated
2 0.87347203 146 andrew gelman stats-2010-07-14-The statistics and the science
Introduction: Yesterday I posted a review of a submitted manuscript where I first wrote that I read the paper only shallowly and then followed up with some suggestions on the statistical analysis, recommending that overdispersion be added to a fitted Poisson regression and that the table of regression results be supplemented with a graph showing data and fitted lines. A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. So let me clarify. There is a connection between my general reaction and my statistical comments. My statistical advice here is relevant for (at least) two reasons. First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very dif
Introduction: Maggie Fox writes : Brain scans may be able to predict what you will do better than you can yourself . . . They found a way to interpret “real time” brain images to show whether people who viewed messages about using sunscreen would actually use sunscreen during the following week. The scans were more accurate than the volunteers were, Emily Falk and colleagues at the University of California Los Angeles reported in the Journal of Neuroscience. . . . About half the volunteers had correctly predicted whether they would use sunscreen. The research team analyzed and re-analyzed the MRI scans to see if they could find any brain activity that would do better. Activity in one area of the brain, a particular part of the medial prefrontal cortex, provided the best information. “From this region of the brain, we can predict for about three-quarters of the people whether they will increase their use of sunscreen beyond what they say they will do,” Lieberman said. “It is the one re
Introduction: Yu Xie thought I’d have something to say about this recent paper , “Evidence on the impact of sustained exposure to air pollution on life expectancy from China’s Huai River policy,” by Yuyu Chen, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, which begins: This paper’s findings suggest that an arbitrary Chinese policy that greatly increases total suspended particulates (TSPs) air pollution is causing the 500 million residents of Northern China to lose more than 2.5 billion life years of life expectancy. The quasi-experimental empirical approach is based on China’s Huai River policy, which provided free winter heating via the provision of coal for boilers in cities north of the Huai River but denied heat to the south. Using a regression discontinuity design based on distance from the Huai River, we find that ambient concentrations of TSPs are about 184 μg/m3 [95% confidence interval (CI): 61, 307] or 55% higher in the north. Further, the results indicate that life expectanci
5 0.80999929 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)
Introduction: I had a couple of email exchanges with Jan-Emmanuel De Neve and James Fowler, two of the authors of the article on the gene that is associated with life satisfaction which we blogged the other day. (Bruno Frey, the third author of the article in question, is out of town according to his email.) Fowler also commented directly on the blog. I won’t go through all the details, but now I have a better sense of what’s going on. (Thanks, Jan and James!) Here’s my current understanding: 1. The original manuscript was divided into two parts: an article by De Neve alone published in the Journal of Human Genetics, and an article by De Neve, Fowler, Frey, and Nicholas Christakis submitted to Econometrica. The latter paper repeats the analysis from the Adolescent Health survey and also replicates with data from the Framingham heart study (hence Christakis’s involvement). The Framingham study measures a slightly different gene and uses a slightly life-satisfaction question com
6 0.80609035 1171 andrew gelman stats-2012-02-16-“False-positive psychology”
7 0.80114967 156 andrew gelman stats-2010-07-20-Burglars are local
8 0.79617816 2159 andrew gelman stats-2014-01-04-“Dogs are sensitive to small variations of the Earth’s magnetic field”
9 0.79565841 490 andrew gelman stats-2010-12-29-Brain Structure and the Big Five
12 0.79182661 1663 andrew gelman stats-2013-01-09-The effects of fiscal consolidation
15 0.77683532 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’
16 0.77502447 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06
17 0.77429199 1893 andrew gelman stats-2013-06-11-Folic acid and autism
18 0.76209199 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals
topicId topicWeight
[(2, 0.011), (5, 0.013), (15, 0.023), (16, 0.078), (24, 0.177), (34, 0.01), (72, 0.014), (86, 0.251), (89, 0.012), (99, 0.279)]
simIndex simValue blogId blogTitle
1 0.98495936 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker
Introduction: I just happened to notice this from last year. Eric Loken writes: Steven Pinker reviewed Malcolm Gladwell’s latest book and criticized him rather harshly for several shortcomings. Gladwell appears to have made things worse for himself in a letter to the editor of the NYT by defending a manifestly weak claim from one of his essays – the claim that NFL quarterback performance is unrelated to the order they were drafted out of college. The reason we [Loken and his colleagues] are implicated is that Pinker identified an earlier blog post of ours as one of three sources he used to challenge Gladwell (yay us!). But Gladwell either misrepresented or misunderstood our post in his response, and admonishes Pinker by saying “we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google.” Well, here’s what you can find on Google. Follow this link to request the data for NFL quarterbacks drafted between 1980 and
2 0.97697163 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building
Introduction: Patrick Caldon writes: I saw your recent blog post where you discussed in passing an iterative-chain-of models approach to AI. I essentially built such a thing for my PhD thesis – not in a Bayesian context, but in a logic programming context – and proved it had a few properties and showed how you could solve some toy problems. The important bit of my framework was that at various points you also go and get more data in the process – in a statistical context this might be seen as building a little univariate model on a subset of the data, then iteratively extending into a better model with more data and more independent variables – a generalized forward stepwise regression if you like. It wrapped a proper computational framework around E.M. Gold’s identification/learning in the limit based on a logic my advisor (Eric Martin) had invented. What’s not written up in the thesis is a few months of failed struggle trying to shoehorn some simple statistical inference into this
3 0.96816242 873 andrew gelman stats-2011-08-26-Luck or knowledge?
Introduction: Joan Ginther has won the Texas lottery four times. First, she won $5.4 million, then a decade later, she won $2million, then two years later $3million and in the summer of 2010, she hit a $10million jackpot. The odds of this has been calculated at one in eighteen septillion and luck like this could only come once every quadrillion years. According to Forbes, the residents of Bishop, Texas, seem to believe God was behind it all. The Texas Lottery Commission told Mr Rich that Ms Ginther must have been ‘born under a lucky star’, and that they don’t suspect foul play. Harper’s reporter Nathaniel Rich recently wrote an article about Ms Ginther, which calls the validity of her ‘luck’ into question. First, he points out, Ms Ginther is a former math professor with a PhD from Stanford University specialising in statistics. More at Daily Mail. [Edited Saturday] In comments, C Ryan King points to the original article at Harper’s and Bill Jefferys to Wired.
Introduction: The Journal of the Royal Statistical Society publishes papers followed by discussions. Lots of discussions, each can be no more than 400 words. Here’s my most recent discussion: The authors are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the text. So, very quickly: I would prefer Figure 1 to be displayed in terms of standard deviations, not variances. I
5 0.96575648 1530 andrew gelman stats-2012-10-11-Migrating your blog from Movable Type to WordPress
Introduction: Cord Blomquist, who did a great job moving us from horrible Movable Type to nice nice WordPress, writes: I [Cord] wanted to share a little news with you related to the original work we did for you last year. When ReadyMadeWeb converted your Movable Type blog to WordPress, we got a lot of other requests for the same service, so we started thinking about a bigger market for such a product. After a bit of research, we started work on automating the data conversion, writing rules, and exceptions to the rules, on how Movable Type and TypePad data could be translated to WordPress. After many months of work, we’re getting ready to announce TP2WP.com , a service that converts Movable Type and TypePad export files to WordPress import files, so anyone who wants to migrate to WordPress can do so easily and without losing permalinks, comments, images, or other files. By automating our service, we’ve been able to drop the price to just $99. I recommend it (and, no, Cord is not paying m
6 0.96371055 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers
7 0.96339595 436 andrew gelman stats-2010-11-29-Quality control problems at the New York Times
9 0.96061915 904 andrew gelman stats-2011-09-13-My wikipedia edit
10 0.95653582 76 andrew gelman stats-2010-06-09-Both R and Stata
11 0.95392704 1427 andrew gelman stats-2012-07-24-More from the sister blog
12 0.94735312 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology
13 0.94522011 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update
14 0.94498229 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”
same-blog 15 0.93423975 1971 andrew gelman stats-2013-08-07-I doubt they cheated
16 0.93306112 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!
17 0.92958266 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”
18 0.92732853 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc
19 0.916291 866 andrew gelman stats-2011-08-23-Participate in a research project on combining information for prediction
20 0.90601516 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis