andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-146 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Yesterday I posted a review of a submitted manuscript where I first wrote that I read the paper only shallowly and then followed up with some suggestions on the statistical analysis, recommending that overdispersion be added to a fitted Poisson regression and that the table of regression results be supplemented with a graph showing data and fitted lines. A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. So let me clarify. There is a connection between my general reaction and my statistical comments. My statistical advice here is relevant for (at least) two reasons. First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very dif
sentIndex sentText sentNum sentScore
1 A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. [sent-2, score-0.848]
2 There is a connection between my general reaction and my statistical comments. [sent-4, score-0.341]
3 My statistical advice here is relevant for (at least) two reasons. [sent-5, score-0.241]
4 First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. [sent-6, score-1.104]
5 Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very difficult for me to visualize the magnitude of the estimated effect. [sent-7, score-1.193]
6 Both of these serious statistical problems lead to the problem noted at the beginning of my review, that I “didn’t try to judge whether the conclusions are correct.” [sent-8, score-0.801]
7 It is the authors’ job to correctly determine statistical significance (or use some other measure of uncertainty) and to put their estimates into context. [sent-9, score-0.431]
8 How can I possibly judge correctness if I don’t know whether the results are statistically significant and if I don’t have a sense of how large they are compared to variation in the data? [sent-10, score-1.106]
9 I liked the paper, and that’s why I made my suggestions. [sent-11, score-0.096]
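The first point above (standard errors from a Poisson regression without overdispersion) can be illustrated with a minimal sketch. The counts below are made up for illustration and are not from the reviewed manuscript; the function name `poisson_two_group` is hypothetical. The sketch assumes the simplest case, a Poisson regression on a single 0/1 treatment indicator, and uses the Pearson (quasi-Poisson) dispersion estimate as one common way to add overdispersion; the post itself does not say which method was intended.

```python
import math


def poisson_two_group(control, treated):
    """Poisson regression of counts on a 0/1 treatment indicator.

    With one binary predictor, the maximum-likelihood fitted means are the
    two group averages, so no iterative fitting is needed.
    """
    mu0 = sum(control) / len(control)
    mu1 = sum(treated) / len(treated)
    log_rate_ratio = math.log(mu1 / mu0)

    # Model-based standard error of the log rate ratio.
    # It is only trustworthy if the variance really equals the mean.
    se_poisson = math.sqrt(1 / sum(control) + 1 / sum(treated))

    # Pearson dispersion estimate: about 1 for Poisson data,
    # larger than 1 when the data are overdispersed.
    n_obs, n_params = len(control) + len(treated), 2
    pearson = sum((y - mu0) ** 2 / mu0 for y in control)
    pearson += sum((y - mu1) ** 2 / mu1 for y in treated)
    dispersion = pearson / (n_obs - n_params)

    # Quasi-Poisson correction: inflate the standard error by sqrt(dispersion).
    se_quasi = se_poisson * math.sqrt(dispersion)
    return log_rate_ratio, se_poisson, se_quasi, dispersion


# Hypothetical counts, chosen to vary more than a Poisson model would predict.
control = [2, 9, 0, 14, 5, 1, 11, 6]
treated = [4, 18, 1, 25, 9, 0, 16, 7]

log_rr, se_poisson, se_quasi, dispersion = poisson_two_group(control, treated)
z_poisson = log_rr / se_poisson
z_quasi = log_rr / se_quasi
print(f"dispersion={dispersion:.2f}  z(Poisson)={z_poisson:.2f}  z(quasi)={z_quasi:.2f}")
```

With these invented counts the plain Poisson fit puts the estimate more than two standard errors from zero, while the overdispersed fit does not, which is the sense in which the uncorrected standard errors are hard to interpret.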
wordName wordTfidf (topN-words)
[('overdispersion', 0.381), ('review', 0.257), ('regression', 0.222), ('judge', 0.201), ('fitted', 0.194), ('table', 0.173), ('suggestions', 0.173), ('statistical', 0.167), ('supplemented', 0.163), ('showing', 0.158), ('estimated', 0.152), ('results', 0.144), ('statistically', 0.144), ('correctness', 0.143), ('shallow', 0.139), ('visualize', 0.137), ('recommending', 0.132), ('significant', 0.131), ('graph', 0.12), ('poisson', 0.118), ('manuscript', 0.118), ('submitted', 0.101), ('whether', 0.099), ('liked', 0.096), ('magnitude', 0.096), ('beginning', 0.096), ('realized', 0.096), ('correctly', 0.095), ('commenter', 0.094), ('determine', 0.092), ('implications', 0.091), ('connection', 0.088), ('wrote', 0.087), ('reaction', 0.086), ('yesterday', 0.084), ('possibly', 0.084), ('apparently', 0.084), ('plot', 0.083), ('conclusions', 0.083), ('sense', 0.082), ('added', 0.079), ('lead', 0.079), ('variation', 0.078), ('followed', 0.077), ('significance', 0.077), ('noted', 0.076), ('uncertainty', 0.076), ('posted', 0.076), ('advice', 0.074), ('treatment', 0.073)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 146 andrew gelman stats-2010-07-14-The statistics and the science
Introduction: Yesterday I posted a review of a submitted manuscript where I first wrote that I read the paper only shallowly and then followed up with some suggestions on the statistical analysis, recommending that overdispersion be added to a fitted Poisson regression and that the table of regression results be supplemented with a graph showing data and fitted lines. A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. So let me clarify. There is a connection between my general reaction and my statistical comments. My statistical advice here is relevant for (at least) two reasons. First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very dif
2 0.28040299 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!
Introduction: I just wrote this, and I realized it might be useful more generally: The article looks reasonable to me–but I just did a shallow read and didn’t try to judge whether the conclusions are correct. My main comment is that if they’re doing a Poisson regression, they should really be doing an overdispersed Poisson regression. I don’t know if I’ve ever seen data in my life where the non-overdispersed Poisson is appropriate. Also, I’d like to see a before-after plot with dots for control cases and open circles for treatment cases and fitted regression lines drawn in. Whenever there’s a regression I like to see this scatterplot. The scatterplot isn’t a replacement for the regression, but at the very least it gives me intuition as to the scale of the estimated effect. Finally, all their numbers should be rounded appropriately. Feel free to cut-and-paste this into your own referee reports (and to apply these recommendations in your own applied research).
3 0.16702154 899 andrew gelman stats-2011-09-10-The statistical significance filter
Introduction: I’ve talked about this a bit but it’s never had its own blog entry (until now). Statistically significant findings tend to overestimate the magnitude of effects. This holds in general (because E(|x|) > |E(x)|) but even more so if you restrict to statistically significant results. Here’s an example. Suppose a true effect of theta is unbiasedly estimated by y ~ N(theta, 1). Further suppose that we will only consider statistically significant results, that is, cases in which |y| > 2. The estimate “|y| conditional on |y| > 2” is clearly an overestimate of |theta|. First off, if |theta| < 2, the estimate |y| conditional on statistical significance is not only too high in expectation, it’s always too high. This is a problem, given that |theta| is in reality probably less than 2. (The low-hanging fruit have already been picked, remember?) But even if |theta| > 2, the estimate |y| conditional on statistical significance will still be too high in expectation. For a discussion o
4 0.14659655 282 andrew gelman stats-2010-09-17-I can’t escape it
Introduction: I received the following email: Ms. No.: *** Title: *** Corresponding Author: *** All Authors: *** Dear Dr. Gelman, Because of your expertise, I would like to ask your assistance in determining whether the above-mentioned manuscript is appropriate for publication in ***. The abstract is pasted below. . . . My reply: I would rather not review this article. I suggest ***, ***, and *** as reviewers. I think it would be difficult for me to review the manuscript fairly.
Introduction: Devrup Ghatak writes: I am a student of economics and recently read your review of Mostly Harmless Econometrics. In the review you mention that the book contains no time series. Given that your book on data analysis (Data Analysis using Regression) does not contain any time series material either, I wonder if you happen to have any favourite time series reference similar in style/level to the data analysis book. I don’t know. The closest thing might be Hierarchical Modeling and Analysis for Spatial Data by Banerjee, Carlin, and Gelfand, but I don’t know of anything focused on time series that’s quite in the format that I’d prefer. This is not my area, though. Maybe you, the readers, have some suggestions?
6 0.1392266 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06
9 0.12289365 1971 andrew gelman stats-2013-08-07-I doubt they cheated
10 0.1200968 1452 andrew gelman stats-2012-08-09-Visually weighting regression displays
11 0.11825376 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims
12 0.11753675 836 andrew gelman stats-2011-08-03-Another plagiarism mystery
13 0.11733593 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics
14 0.11550133 783 andrew gelman stats-2011-06-30-Don’t stop being a statistician once the analysis is done
15 0.11068225 451 andrew gelman stats-2010-12-05-What do practitioners need to know about regression?
18 0.10461496 852 andrew gelman stats-2011-08-13-Checking your model using fake data
20 0.10236467 796 andrew gelman stats-2011-07-10-Matching and regression: two great tastes etc etc
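The example in the "statistical significance filter" entry of the list above (blog 899) can be checked by simulation. The following is a minimal sketch, not code from the blog: the function name `significance_filter`, the number of simulations, and the seed are all illustrative assumptions. It draws y ~ N(theta, 1), keeps only draws with |y| > 2, and looks at the retained values of |y|.

```python
import random


def significance_filter(theta, n_sims=100_000, threshold=2.0, seed=0):
    """Return |y| for simulated estimates y ~ N(theta, 1) that pass |y| > threshold."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = (rng.gauss(theta, 1.0) for _ in range(n_sims))
    return [abs(y) for y in draws if abs(y) > threshold]


# Case 1: |theta| < 2. Every retained estimate exceeds 2, so it exceeds |theta|.
kept_small = significance_filter(theta=1.0)
mean_small = sum(kept_small) / len(kept_small)

# Case 2: |theta| > 2. Retained estimates are still too high on average.
kept_large = significance_filter(theta=3.0)
mean_large = sum(kept_large) / len(kept_large)

print(f"theta=1: kept {len(kept_small)} draws, mean |y| = {mean_small:.2f}")
print(f"theta=3: kept {len(kept_large)} draws, mean |y| = {mean_large:.2f}")
```

In the first case the overestimate holds for every retained draw by construction; in the second it holds only in expectation, which matches the two statements in the entry.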
topicId topicWeight
[(0, 0.194), (1, 0.027), (2, 0.028), (3, -0.11), (4, 0.094), (5, -0.097), (6, -0.053), (7, -0.017), (8, 0.028), (9, -0.012), (10, 0.049), (11, 0.008), (12, 0.033), (13, -0.03), (14, 0.074), (15, -0.006), (16, -0.023), (17, 0.02), (18, 0.001), (19, 0.003), (20, 0.027), (21, 0.071), (22, 0.032), (23, -0.023), (24, 0.049), (25, 0.03), (26, 0.056), (27, -0.111), (28, -0.043), (29, -0.063), (30, 0.039), (31, 0.074), (32, -0.023), (33, -0.006), (34, 0.013), (35, -0.015), (36, -0.067), (37, -0.053), (38, 0.006), (39, -0.057), (40, 0.034), (41, 0.072), (42, -0.092), (43, 0.045), (44, 0.103), (45, -0.03), (46, -0.062), (47, -0.059), (48, 0.022), (49, -0.029)]
simIndex simValue blogId blogTitle
same-blog 1 0.9839052 146 andrew gelman stats-2010-07-14-The statistics and the science
Introduction: Yesterday I posted a review of a submitted manuscript where I first wrote that I read the paper only shallowly and then followed up with some suggestions on the statistical analysis, recommending that overdispersion be added to a fitted Poisson regression and that the table of regression results be supplemented with a graph showing data and fitted lines. A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. So let me clarify. There is a connection between my general reaction and my statistical comments. My statistical advice here is relevant for (at least) two reasons. First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very dif
2 0.78273857 1971 andrew gelman stats-2013-08-07-I doubt they cheated
Introduction: Following up on my regression-discontinuity post from the other day, Brad DeLong writes: The feel (and I could well be wrong) is that at some point somebody said: “This is very important, but it won’t get published without a statistically significant headline finding. Torture the data via specification search until we find a statistically significant effect so that this can get published!” I think DeLong is mistaken here. But, before getting to this, here’s the graph: and here are the regression results: So, indeed it is that cubic term that takes the result into statistical significance. The reason I disagree with DeLong is that it’s my impression that, in econometrics and applied economics, it’s considered the safe, conservative choice in regression discontinuity to control for a high-degree polynomial. See the paper discussed a few years ago here, for example, where I criticized a pair of economists for using a fifth-degree specification and they replie
3 0.75012839 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!
Introduction: I just wrote this, and I realized it might be useful more generally: The article looks reasonable to me–but I just did a shallow read and didn’t try to judge whether the conclusions are correct. My main comment is that if they’re doing a Poisson regression, they should really be doing an overdispersed Poisson regression. I don’t know if I’ve ever seen data in my life where the non-overdispersed Poisson is appropriate. Also, I’d like to see a before-after plot with dots for control cases and open circles for treatment cases and fitted regression lines drawn in. Whenever there’s a regression I like to see this scatterplot. The scatterplot isn’t a replacement for the regression, but at the very least it gives me intuition as to the scale of the estimated effect. Finally, all their numbers should be rounded appropriately. Feel free to cut-and-paste this into your own referee reports (and to apply these recommendations in your own applied research).
4 0.73587626 310 andrew gelman stats-2010-10-02-The winner’s curse
Introduction: If an estimate is statistically significant, it’s probably an overestimate of the magnitude of your effect. P.S. I think youall know what I mean here. But could someone rephrase it in a more pithy manner? I’d like to include it in our statistical lexicon.
5 0.68321323 1171 andrew gelman stats-2012-02-16-“False-positive psychology”
Introduction: Everybody’s talkin bout this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who write : Despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We [Simmons, Nelson, and Simonsohn] present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process. Whatever you think about these recommend
7 0.67133892 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06
8 0.67013335 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression
11 0.66040695 293 andrew gelman stats-2010-09-23-Lowess is great
12 0.65947157 593 andrew gelman stats-2011-02-27-Heat map
13 0.65781671 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics
15 0.65434027 899 andrew gelman stats-2011-09-10-The statistical significance filter
16 0.65387928 1663 andrew gelman stats-2013-01-09-The effects of fiscal consolidation
18 0.65292567 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals
20 0.64208585 156 andrew gelman stats-2010-07-20-Burglars are local
topicId topicWeight
[(10, 0.016), (15, 0.023), (16, 0.027), (21, 0.039), (24, 0.18), (32, 0.122), (46, 0.014), (57, 0.016), (76, 0.029), (77, 0.034), (99, 0.401)]
simIndex simValue blogId blogTitle
1 0.98364842 701 andrew gelman stats-2011-05-07-Bechdel wasn’t kidding
Introduction: Regular readers of this blog know about the Bechdel test for movies: 1. It has to have at least two women in it 2. Who talk to each other 3. About something besides a man Amusing, huh? But I only really got the point the other day, when I was on a plane and passively watched parts of the in-flight movie. It was something I’d never heard of (of course) and it happened to be a chick flick–even without the soundtrack, it was clear that the main character was a woman and much of it was about her love life. But even this movie failed the Bechdel test miserably! I don’t even think it passed item #1 above, but if it did, it certainly failed #2. If even the chick flicks are failing the Bechdel test, then, yeah, we’re really in trouble. And don’t get me started on those old Warner Brothers cartoons. They’re great but they feature about as many female characters as the average WWII submarine. Sure, everybody knows this, but it’s still striking to think about just how unbalanced
2 0.97920936 1571 andrew gelman stats-2012-11-09-The anti-Bayesian moment and its passing
Introduction: Xian and I respond to the four discussants of our paper, “Not only defended but also applied”: The perceived absurdity of Bayesian inference.” Here’s the abstract of our rejoinder : Over the years we have often felt frustration, both at smug Bayesians—in particular, those who object to checking of the fit of model to data, either because all Bayesian models are held to be subjective and thus unquestioned (an odd combination indeed, but that is the subject of another article)—and angry anti-Bayesians who, as we wrote in our article, strain on the gnat of the prior distribution while swallowing the camel that is the likelihood. The present article arose from our memory of a particularly intemperate anti-Bayesian statement that appeared in Feller’s beautiful and classic book on probability theory. We felt that it was worth exploring the very extremeness of Feller’s words, along with similar anti-Bayesian remarks by others, in order to better understand the background underlying contr
same-blog 3 0.97907484 146 andrew gelman stats-2010-07-14-The statistics and the science
Introduction: Yesterday I posted a review of a submitted manuscript where I first wrote that I read the paper only shallowly and then followed up with some suggestions on the statistical analysis, recommending that overdispersion be added to a fitted Poisson regression and that the table of regression results be supplemented with a graph showing data and fitted lines. A commenter asked why I wrote such an apparently shallow review, and I realized that some of the implications of my review were not as clear as I’d thought. So let me clarify. There is a connection between my general reaction and my statistical comments. My statistical advice here is relevant for (at least) two reasons. First, a Poisson regression without overdispersion will give nearly-uninterpretable standard errors, which means that I have no sense if the results are statistically significant as claimed. Second, with a time series plot and regression table, but no graph showing the estimated treatment effect, it is very dif
4 0.9655847 1269 andrew gelman stats-2012-04-19-Believe your models (up to the point that you abandon them)
Introduction: In a discussion of his variant of the write-a-thousand-words-a-day strategy (as he puts it, “a system for the production of academic results in writing”), Thomas Basbøll writes : Believe the claims you are making. That is, confine yourself to making claims you believe. I always emphasize this when I [Basbøll] define knowledge as “justified, true belief”. . . . I think if there is one sure way to undermine your sense of your own genius it is to begin to say things you know to be publishable without being sure they are true. Or even things you know to be “true” but don’t understand well enough to believe. He points out that this is not so easy: In times when there are strong orthodoxies it can sometimes be difficult to know what to believe. Or, rather, it is all too easy to know what to believe (what the “right belief” is). It is therefore difficult to stick to statements of one’s own belief. I sometimes worry that our universities, which are systems of formal education and for
5 0.96042937 1191 andrew gelman stats-2012-03-01-Hoe noem je?
Introduction: Gerrit Storms reports on an interesting linguistic research project in which you can participate! Here’s the description: Over the past few weeks, we have been trying to set up a scientific study that is important for many researchers interested in words, word meaning, semantics, and cognitive science in general. It is a huge word association project, in which people are asked to participate in a small task that doesn’t last longer than 5 minutes. Our goal is to build a global word association network that contains connections between about 40,000 words, the size of the lexicon of an average adult. Setting up such a network might learn us a lot about semantic memory, how it develops, and maybe also about how it can deteriorate (like in Alzheimer’s disease). Most people enjoy doing the task, but we need thousands of participants to succeed. Up till today, we found about 53,000 participants willing to do the little task, but we need more subjects. That is why we address you. Would
6 0.95920134 455 andrew gelman stats-2010-12-07-Some ideas on communicating risks to the general public
8 0.95565414 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools
9 0.9556005 1209 andrew gelman stats-2012-03-12-As a Bayesian I want scientists to report their data non-Bayesianly
11 0.95416629 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations
12 0.95389712 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects
13 0.95373863 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
14 0.95288914 1974 andrew gelman stats-2013-08-08-Statistical significance and the dangerous lure of certainty
15 0.95275378 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things
16 0.95255893 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data
17 0.95243931 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks
18 0.9523297 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?
19 0.95221651 2251 andrew gelman stats-2014-03-17-In the best alternative histories, the real world is what’s ultimately real
20 0.95215064 1149 andrew gelman stats-2012-02-01-Philosophy of Bayesian statistics: my reactions to Cox and Mayo