andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-1072 knowledge-graph by maker-knowledge-mining

1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06


meta info for this blog

Source: html

Introduction: The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, The difference between “significant” and “not significant” is not itself statistically significant. When people bring this up, they keep referring to the difference between p=0.05 and p=0.06, making the familiar (and correct) point about the arbitrariness of the conventional p-value threshold of 0.05. And, sure, I agree with this, but everybody knows that already. The point Hal and I were making was that even apparently large differences in p-values are not statistically significant. For example, if you have one study with z=2.5 (almost significant at the 1% level!) and another with z=1 (not statistically significant at all, only 1 se from zero!), then their difference has a z of about 1 (again, not statistically significant at all). So it’s not just a comparison of 0.05 vs. 0.06: even a difference between clearly significant and clearly not significant can be clearly not statistically significant.
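The arithmetic in the z=2.5 vs. z=1 example is easy to verify: if two independent estimates are reported as z-scores and have equal standard errors, their difference has a standard error sqrt(2) times as large, so the z-score of the difference is (z1 − z2)/sqrt(2). A minimal sketch (plain Python; the helper names are mine):

```python
from math import sqrt, erfc

def z_of_difference(z1, z2):
    """z-score of the difference of two independent estimates,
    assuming both estimates have the same standard error."""
    return (z1 - z2) / sqrt(2)

def two_sided_p(z):
    """Two-sided normal p-value via the complementary error function."""
    return erfc(abs(z) / sqrt(2))

z_diff = z_of_difference(2.5, 1.0)
p_diff = two_sided_p(z_diff)
print(round(z_diff, 2), round(p_diff, 2))  # 1.06 0.29
```

With z1 = 2.5 and z2 = 1, the difference has z ≈ 1.06 and a two-sided p-value of about 0.29: nowhere near significant, exactly as the post says.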


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, he difference between “significant” and “not significant” is not itself statistically significant. [sent-1, score-1.464]

2 When people bring this up, they keep referring to the difference between p=0.05 and p=0.06. [sent-2, score-0.491]

3 making the familiar (and correct) point about the arbitrariness of the conventional p-value threshold of 0.05. [sent-4, score-0.488]

4 And, sure, I agree with this, but everybody knows that already. [sent-6, score-0.216]

5 The point Hal and I were making was that even apparently large differences in p-values are not statistically significant. [sent-7, score-1.031]

6 For example, if you have one study with z=2.5 (almost significant at the 1% level!) and another with z=1 (not statistically significant at all, only 1 se from zero!). [sent-10, score-0.955]

7 then their difference has a z of about 1 (again, not statistically significant at all). [sent-11, score-1.068]

8 So it’s not just a comparison of 0.05 vs. 0.06: even a difference between clearly significant and clearly not significant can be clearly not statistically significant. [sent-15, score-2.217]

9 The 0.05 vs. 0.06 thing is fine, but I fear that it obscures our other point, and it could mislead researchers who might think they are safe if they think they can draw firm conclusions from apparently large p-value differences (for example, …). [sent-18, score-1.095]


similar blogs computed by the tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('significant', 0.463), ('statistically', 0.363), ('hal', 0.254), ('difference', 0.242), ('clearly', 0.211), ('apparently', 0.169), ('misconception', 0.165), ('crept', 0.165), ('delightful', 0.152), ('srivastava', 0.141), ('differences', 0.139), ('stern', 0.138), ('sanjay', 0.138), ('mislead', 0.133), ('se', 0.129), ('publicity', 0.125), ('vs', 0.124), ('fear', 0.11), ('firm', 0.109), ('illustrates', 0.109), ('threshold', 0.107), ('annoying', 0.107), ('making', 0.106), ('large', 0.105), ('safe', 0.104), ('referring', 0.103), ('point', 0.096), ('conventional', 0.093), ('draw', 0.089), ('familiar', 0.086), ('everybody', 0.084), ('bring', 0.084), ('conclusions', 0.084), ('otherwise', 0.082), ('knows', 0.08), ('comparison', 0.077), ('title', 0.077), ('zero', 0.07), ('correct', 0.069), ('related', 0.066), ('keep', 0.062), ('almost', 0.061), ('fine', 0.06), ('example', 0.058), ('level', 0.057), ('researchers', 0.053), ('even', 0.053), ('agree', 0.052), ('study', 0.049), ('recent', 0.045)]
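The topN-words listing above is the kind of output a tf-idf ranking produces: words that are frequent in this document but rare in the rest of the corpus score highest. As a rough illustration of how such scores arise (a from-scratch sketch with a made-up toy corpus; the exact weighting and corpus used by this miner are not shown here):

```python
from collections import Counter
from math import log

def tfidf_top_words(docs, doc_index, top_n=5):
    """Rank the words of one document by tf-idf against a small corpus.
    Uses raw term counts and a smoothed idf; real pipelines differ in
    normalization details."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    df = Counter(word for doc in tokenized for word in set(doc))
    tf = Counter(tokenized[doc_index])
    scores = {w: c * log((1 + n_docs) / (1 + df[w])) for w, c in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

docs = [
    "significant difference not statistically significant",
    "hierarchical model partial pooling",
    "difference between significant and nonsignificant",
]
print(tfidf_top_words(docs, 0))  # words rare in the corpus rank first
```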

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

2 0.23366642 899 andrew gelman stats-2011-09-10-The statistical significance filter

Introduction: I’ve talked about this a bit but it’s never had its own blog entry (until now). Statistically significant findings tend to overestimate the magnitude of effects. This holds in general (because E(|x|) > |E(x)|) but even more so if you restrict to statistically significant results. Here’s an example. Suppose a true effect of theta is unbiasedly estimated by y ~ N (theta, 1). Further suppose that we will only consider statistically significant results, that is, cases in which |y| > 2. The estimate “|y| conditional on |y|>2” is clearly an overestimate of |theta|. First off, if |theta|<2, the estimate |y| conditional on statistical significance is not only too high in expectation, it's always too high. This is a problem, given that |theta| in reality is probably less than 2. (The low-hanging fruit have already been picked, remember?) But even if |theta|>2, the estimate |y| conditional on statistical significance will still be too high in expectation. For a discussion o
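The overestimation described in this excerpt is easy to check by simulation. A sketch, assuming a true effect of theta = 1 (my choice) and the y ~ N(theta, 1) model from the post:

```python
import random

def mean_abs_significant(theta, n_sims=200_000, threshold=2.0, seed=1):
    """Average |y| among the 'statistically significant' draws
    (|y| > threshold), with y ~ N(theta, 1)."""
    rng = random.Random(seed)
    kept = [abs(y)
            for y in (rng.gauss(theta, 1.0) for _ in range(n_sims))
            if abs(y) > threshold]
    return sum(kept) / len(kept)

print(mean_abs_significant(1.0))  # ~2.5: the filter overshoots |theta| = 1
```

The conditional mean of |y| comes out near 2.5, well above the true |theta| = 1, matching the post's claim that the filtered estimate is too high.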

3 0.2296766 897 andrew gelman stats-2011-09-09-The difference between significant and not significant…

Introduction: E. J. Wagenmakers writes: You may be interested in a recent article [by Nieuwenhuis, Forstmann, and Wagenmakers] showing how often researchers draw conclusions by comparing p-values. As you and Hal Stern have pointed out, this is potentially misleading because the difference between significant and not significant is not necessarily significant. We were really surprised to see how often researchers in the neurosciences make this mistake. In the paper we speculate a little bit on the cause of the error. From their paper: In theory, a comparison of two experimental effects requires a statistical test on their difference. In practice, this comparison is often based on an incorrect procedure involving two separate tests in which researchers conclude that effects differ when one effect is significant (P < 0.05) but the other is not (P > 0.05). We reviewed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscien
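The error rate of the "one significant, one not" procedure can be quantified by simulation. A sketch in which two studies measure the same true effect (theta = 1.5 in z units is my choice for illustration), so any declared "difference" is a false positive:

```python
import random
from math import sqrt

def incorrect_vs_correct(theta=1.5, n_sims=100_000, seed=7):
    """Two studies of the SAME true effect, each estimate ~ N(theta, 1).
    Returns the rate at which (a) the incorrect procedure (one significant,
    one not => 'they differ') and (b) the correct z-test on the difference
    each declare a difference at the 5% level."""
    rng = random.Random(seed)
    incorrect = correct = 0
    for _ in range(n_sims):
        y1, y2 = rng.gauss(theta, 1.0), rng.gauss(theta, 1.0)
        if (abs(y1) > 1.96) != (abs(y2) > 1.96):
            incorrect += 1  # one study "works", the other doesn't
        if abs(y1 - y2) / sqrt(2) > 1.96:
            correct += 1    # proper test on the difference
    return incorrect / n_sims, correct / n_sims

print(incorrect_vs_correct())  # roughly (0.44, 0.05)
```

With this effect size the incorrect procedure declares a difference roughly 44% of the time, while the proper test on the difference holds its nominal 5% rate.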

4 0.19230893 920 andrew gelman stats-2011-09-22-Top 10 blog obsessions

Introduction: I was just thinking about this because we seem to be circling around the same few topics over and over (while occasionally slipping in some new statistical ideas): 10. Wegman 9. Hipmunk 8. Dennis the dentist 7. Freakonomics 6. The difference between significant and non-significant is not itself statistically significant 5. Just use a hierarchical model already! 4. Innumerate journalists who think that presidential elections are just like high school 3. A graph can be pretty but convey essentially no information 2. Stan is coming 1. Clippy! Did I miss anything important?

5 0.19080704 758 andrew gelman stats-2011-06-11-Hey, good news! Your p-value just passed the 0.05 threshold!

Introduction: E. J. Wagenmakers writes: Here’s a link for you. The first sentences tell it all: Climate warming since 1995 is now statistically significant, according to Phil Jones, the UK scientist targeted in the “ClimateGate” affair. Last year, he told BBC News that post-1995 warming was not significant–a statement still seen on blogs critical of the idea of man-made climate change. But another year of data has pushed the trend past the threshold usually used to assess whether trends are “real.” Now I [Wagenmakers] don’t like p-values one bit, but even people who do like them must cringe when they read this. First, this apparently is a sequential design, so I’m not sure what sampling plan leads to these p-values. Secondly, comparing significance values suggests that the data have suddenly crossed some invisible line that divided nonsignificant from significant effects (as you pointed out in your paper with Hal Stern). Ugh! I share Wagenmakers’s reaction. There seems to be some con
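The "crossed some invisible line" phenomenon is easy to reproduce: with a fixed small true effect and steadily accumulating data, the running p-value can wander back and forth across 0.05 without anything in the world changing. A sketch (the effect size and sample sizes are arbitrary choices):

```python
import random
from math import sqrt, erfc

def p_value_path(true_effect=0.1, max_n=400, seed=3):
    """Running two-sided p-value for the mean of the first n draws from
    N(true_effect, 1), evaluated at n = 10, 20, ..., max_n."""
    rng = random.Random(seed)
    xs = [rng.gauss(true_effect, 1.0) for _ in range(max_n)]
    path = []
    for n in range(10, max_n + 1, 10):
        z = (sum(xs[:n]) / n) * sqrt(n)      # mean divided by its se 1/sqrt(n)
        path.append(erfc(abs(z) / sqrt(2)))  # two-sided normal p-value
    return path

path = p_value_path()
crossings = sum((a < 0.05) != (b < 0.05) for a, b in zip(path, path[1:]))
print(crossings)  # the running p-value can cross 0.05 repeatedly as data accrue
```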

6 0.16716714 156 andrew gelman stats-2010-07-20-Burglars are local

7 0.16159187 310 andrew gelman stats-2010-10-02-The winner’s curse

8 0.15897943 1059 andrew gelman stats-2011-12-14-Looking at many comparisons may increase the risk of finding something statistically significant by epidemiologists, a population with relatively low multilevel modeling consumption

9 0.15839621 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

10 0.15782425 106 andrew gelman stats-2010-06-23-Scientists can read your mind . . . as long as they’re allowed to look at more than one place in your brain and then make a prediction after seeing what you actually did

11 0.15197784 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

12 0.1392266 146 andrew gelman stats-2010-07-14-The statistics and the science

13 0.13917437 1971 andrew gelman stats-2013-08-07-I doubt they cheated

14 0.12059437 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

15 0.1200583 401 andrew gelman stats-2010-11-08-Silly old chi-square!

16 0.11878987 593 andrew gelman stats-2011-02-27-Heat map

17 0.11650053 1171 andrew gelman stats-2012-02-16-“False-positive psychology”

18 0.11507669 1337 andrew gelman stats-2012-05-22-Question 12 of my final exam for Design and Analysis of Sample Surveys

19 0.11463376 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

20 0.11431676 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?


similar blogs computed by the lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.122), (1, -0.011), (2, 0.041), (3, -0.131), (4, -0.013), (5, -0.076), (6, -0.012), (7, 0.024), (8, -0.016), (9, -0.019), (10, -0.05), (11, 0.005), (12, 0.054), (13, -0.077), (14, 0.049), (15, 0.022), (16, -0.008), (17, 0.004), (18, 0.016), (19, -0.006), (20, 0.022), (21, 0.029), (22, -0.005), (23, -0.018), (24, -0.004), (25, -0.002), (26, 0.064), (27, -0.067), (28, -0.021), (29, -0.088), (30, 0.038), (31, 0.034), (32, 0.004), (33, -0.006), (34, 0.063), (35, 0.089), (36, -0.09), (37, -0.003), (38, -0.051), (39, -0.015), (40, -0.023), (41, -0.027), (42, -0.057), (43, 0.102), (44, 0.034), (45, -0.084), (46, -0.06), (47, 0.001), (48, 0.013), (49, -0.023)]
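The simValue numbers in these lists are typically cosine similarities between documents' topic-weight vectors such as the one above (which is why the same-blog entry scores near 1). A minimal sketch of that final step, using a truncated slice of the weights above and a made-up second document:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

doc_a = [0.122, -0.011, 0.041, -0.131]  # leading topic weights from above (truncated)
doc_b = [0.090, -0.020, 0.035, -0.110]  # a made-up second document

print(round(cosine(doc_a, doc_a), 4))  # 1.0: a document matches itself
print(round(cosine(doc_a, doc_b), 3))  # close to 1: similar topic profiles
```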

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9828614 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

2 0.90942544 156 andrew gelman stats-2010-07-20-Burglars are local

Introduction: This makes sense: In the land of fiction, it’s the criminal’s modus operandi – his method of entry, his taste for certain jewellery and so forth – that can be used by detectives to identify his handiwork. The reality according to a new analysis of solved burglaries in the Northamptonshire region of England is that these aspects of criminal behaviour are on their own unreliable as identifying markers, most likely because they are dictated by circumstances rather than the criminal’s taste and style. However, the geographical spread and timing of a burglar’s crimes are distinctive, and could help with police investigations. And, as a bonus, more Tourette’s pride! P.S. On yet another unrelated topic from the same blog, I wonder if the researchers in this study are aware that the difference between “significant” and “not significant” is not itself statistically significant .

3 0.76511705 899 andrew gelman stats-2011-09-10-The statistical significance filter

4 0.75911522 310 andrew gelman stats-2010-10-02-The winner’s curse

Introduction: If an estimate is statistically significant, it’s probably an overestimate of the magnitude of your effect. P.S. I think youall know what I mean here. But could someone rephrase it in a more pithy manner? I’d like to include it in our statistical lexicon.

5 0.74875259 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

Introduction: False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both? It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.” The problem, of course, is that the likelihood of at leas
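Even a single degree of freedom ("measure two outcomes and report whichever works") roughly doubles the false-positive rate, which a quick simulation under the null shows (independent outcomes here, a simplification of the paper's scenarios):

```python
import random

def false_positive_rate(n_sims=50_000, seed=11):
    """True effect is zero for BOTH outcomes; the researcher reports a
    finding if either outcome is significant at the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        z1 = rng.gauss(0.0, 1.0)  # outcome 1, null true
        z2 = rng.gauss(0.0, 1.0)  # outcome 2, null true
        if abs(z1) > 1.96 or abs(z2) > 1.96:
            hits += 1
    return hits / n_sims

print(false_positive_rate())  # ~0.0975 (= 1 - 0.95**2), not the nominal 0.05
```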

6 0.70821154 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

7 0.7018699 593 andrew gelman stats-2011-02-27-Heat map

8 0.70124739 1171 andrew gelman stats-2012-02-16-“False-positive psychology”

9 0.68460602 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

10 0.68276447 106 andrew gelman stats-2010-06-23-Scientists can read your mind . . . as long as they’re allowed to look at more than one place in your brain and then make a prediction after seeing what you actually did

11 0.67102349 146 andrew gelman stats-2010-07-14-The statistics and the science

12 0.66360152 1971 andrew gelman stats-2013-08-07-I doubt they cheated

13 0.64501572 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

14 0.63760191 1776 andrew gelman stats-2013-03-25-The harm done by tests of significance

15 0.63661802 2030 andrew gelman stats-2013-09-19-Is coffee a killer? I don’t think the effect is as high as was estimated from the highest number that came out of a noisy study

16 0.62671512 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals

17 0.61843902 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

18 0.61422294 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking

19 0.61324102 918 andrew gelman stats-2011-09-21-Avoiding boundary estimates in linear mixed models

20 0.61184758 695 andrew gelman stats-2011-05-04-Statistics ethics question


similar blogs computed by the lda model

lda for this blog:

topicId topicWeight

[(16, 0.045), (22, 0.02), (24, 0.251), (33, 0.047), (41, 0.02), (53, 0.016), (63, 0.032), (65, 0.044), (77, 0.021), (79, 0.018), (82, 0.038), (86, 0.013), (96, 0.067), (99, 0.249)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98105824 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

2 0.95298958 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004–but I am concerned that our dialogue with the graphic

3 0.95247996 197 andrew gelman stats-2010-08-10-The last great essayist?

Introduction: I recently read a bizarre article by Janet Malcolm on a murder trial in NYC. What threw me about the article was that the story was utterly commonplace (by the standards of today’s headlines): divorced mom kills ex-husband in a custody dispute over their four-year-old daughter. The only interesting features were (a) the wife was a doctor and the husband was a dentist, the sort of people you’d expect to sue rather than slay, and (b) the wife hired a hitman from within the insular immigrant community that she (and her husband) belonged to. But, really, neither of these was much of a twist. To add to the non-storyness of it all, there were no other suspects, the evidence against the wife and the hitman was overwhelming, and even the high-paid defense lawyers didn’t seem to be making much of an effort to convince anyone of their client’s innocence. (One of the closing arguments was that one aspect of the wife’s story was so ridiculous that it had to be true. In the lawyer’s wo

4 0.95161104 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

Introduction: Pointing to this news article by Megan McArdle discussing a recent study of Medicaid recipients, Jonathan Falk writes: Forget the interpretation for a moment, and the political spin, but haven’t we reached an interesting point when a journalist says things like: When you do an RCT with more than 12,000 people in it, and your defense of your hypothesis is that maybe the study just didn’t have enough power, what you’re actually saying is “the beneficial effects are probably pretty small”. and A good Bayesian—and aren’t most of us supposed to be good Bayesians these days?—should be updating in light of this new information. Given this result, what is the likelihood that Obamacare will have a positive impact on the average health of Americans? Every one of us, for or against, should be revising that probability downwards. I’m not saying that you have to revise it to zero; I certainly haven’t. But however high it was yesterday, it should be somewhat lower today. This

5 0.9509747 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine

Introduction: Interesting discussion from David Gorski (which I found via this link from Joseph Delaney). I don’t have anything really to add to this discussion except to note the value of this sort of anecdote in a statistics discussion. It’s only n=1 and adds almost nothing to the literature on the effectiveness of various treatments, but a story like this can help focus one’s thoughts on the decision problems.

6 0.95072019 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

7 0.95059574 2023 andrew gelman stats-2013-09-14-On blogging

8 0.94954658 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

9 0.94908726 2247 andrew gelman stats-2014-03-14-The maximal information coefficient

10 0.94905049 1062 andrew gelman stats-2011-12-16-Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

11 0.94780052 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

12 0.94726431 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing

13 0.94693345 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox

14 0.94657356 1240 andrew gelman stats-2012-04-02-Blogads update

15 0.94560909 1224 andrew gelman stats-2012-03-21-Teaching velocity and acceleration

16 0.94508827 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

17 0.9449091 1155 andrew gelman stats-2012-02-05-What is a prior distribution?

18 0.94486511 2143 andrew gelman stats-2013-12-22-The kluges of today are the textbook solutions of tomorrow.

19 0.94470143 414 andrew gelman stats-2010-11-14-“Like a group of teenagers on a bus, they behave in public as if they were in private”

20 0.94467992 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies