
2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”


meta info for this blog

Source: html

Introduction: The answer is no, as explained in this classic article by Warren Browner and Thomas Newman from 1987. If I were to rewrite this article today, I would frame things slightly differently—referring to Type S and Type M errors rather than speaking of “the probability that the research hypothesis is true”—but overall they make good points, and I like their analogy to medical diagnostic testing.
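For readers new to the terminology: a Type S (sign) error is an estimate with the wrong sign relative to the true effect, and a Type M (magnitude) error is an estimate whose size is far from the truth. A minimal simulation sketch, with purely illustrative numbers not taken from the article, of why significant p-values are not all equal:

```python
import numpy as np

# A minimal sketch with illustrative numbers (not from the article): a
# small true effect measured with a noisy design. Among the estimates
# that reach p < 0.05, some have the wrong sign (Type S error) and on
# average they exaggerate the true magnitude (Type M error).
rng = np.random.default_rng(0)
true_effect, se = 0.1, 0.5                      # assumed effect and std. error
est = rng.normal(true_effect, se, size=100_000) # replicated noisy estimates

significant = np.abs(est / se) > 1.96           # two-sided test at the 5% level
type_s = np.mean(est[significant] < 0)          # wrong sign, given significance
type_m = np.mean(np.abs(est[significant])) / true_effect  # exaggeration factor

print(f"P(significant)      = {significant.mean():.3f}")
print(f"Type S error rate   = {type_s:.2f}")
print(f"Type M exaggeration = {type_m:.1f}x")
```

With these assumed numbers, roughly a quarter of the significant estimates point in the wrong direction, and on average the significant estimates overstate the true effect by an order of magnitude, which is why not all significant p-values carry equal evidential weight.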


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The answer is no, as explained in this classic article by Warren Browner and Thomas Newman from 1987. [sent-1, score-0.594]

2 If I were to rewrite this article today, I would frame things slightly differently—referring to Type S and Type M errors rather than speaking of “the probability that the research hypothesis is true”—but overall they make good points, and I like their analogy to medical diagnostic testing. [sent-2, score-2.577]
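The sentScore column appears to be a summed tfidf weight per sentence. A minimal sketch of extractive summarization in that spirit, using scikit-learn as an assumed stand-in for whatever the mining pipeline actually runs:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# A minimal sketch of tfidf-based extractive summarization: score each
# sentence by the total tfidf weight of its terms, as in the sentScore
# column above. scikit-learn is an assumption; the page's actual
# pipeline is not documented.
sentences = [
    "The answer is no, as explained in this classic article.",
    "I would frame things differently, referring to Type S and Type M errors.",
]
X = TfidfVectorizer().fit_transform(sentences)
scores = np.asarray(X.sum(axis=1)).ravel()

for i in np.argsort(-scores):          # highest-scoring sentences first
    print(f"{scores[i]:.3f}  {sentences[i]}")
```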


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('newman', 0.349), ('type', 0.298), ('rewrite', 0.287), ('diagnostic', 0.275), ('warren', 0.275), ('differently', 0.229), ('frame', 0.227), ('referring', 0.205), ('analogy', 0.181), ('thomas', 0.181), ('classic', 0.175), ('explained', 0.173), ('speaking', 0.172), ('slightly', 0.168), ('overall', 0.167), ('medical', 0.161), ('testing', 0.159), ('hypothesis', 0.141), ('errors', 0.14), ('today', 0.137), ('article', 0.136), ('probability', 0.111), ('answer', 0.11), ('true', 0.108), ('points', 0.091), ('things', 0.078), ('rather', 0.072), ('research', 0.069), ('good', 0.058), ('make', 0.057), ('would', 0.039), ('like', 0.038)]
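The wordTfidf weights above and the simValue scores in the lists below can both be reproduced in spirit with standard tools. A rough sketch, again with scikit-learn as an assumption and a toy corpus standing in for the full blog archive:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A rough sketch of producing per-word tfidf weights (the wordName /
# wordTfidf pairs above) and document similarities (the simValue
# columns below). The actual mining pipeline is undocumented.
docs = [
    "type s and type m errors medical diagnostic testing",   # this post
    "misunderstood p-value statistical significance testing",
    "type s error rates for classical and bayesian procedures",
]
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)

# Top-weighted words for the first document.
weights = X[0].toarray().ravel()
top = sorted(zip(vec.get_feature_names_out(), weights), key=lambda t: -t[1])
print(top[:5])

# Cosine similarity of the first document to all documents (self = 1.0).
print(cosine_similarity(X[0], X).ravel())
```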

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”


2 0.16342011 256 andrew gelman stats-2010-09-04-Noooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Introduction: Masanao sends this one in, under the heading, “another incident of misunderstood p-value”: Warren Davies, a positive psychology MSc student at UEL, provides the latest in our ongoing series of guest features for students. Warren has just released a Psychology Study Guide, which covers information on statistics, research methods and study skills for psychology students. Despite the myriad rules and procedures of science, some research findings are pure flukes. Perhaps you’re testing a new drug, and by chance alone, a large number of people spontaneously get better. The better your study is conducted, the lower the chance that your result was a fluke – but still, there is always a certain probability that it was. Statistical significance testing gives you an idea of what this probability is. In science we’re always testing hypotheses. We never conduct a study to ‘see what happens’, because there’s always at least one way to make any useless set of data look important. We take

3 0.14837998 494 andrew gelman stats-2010-12-31-Type S error rates for classical and Bayesian single and multiple comparison procedures

Introduction: Type S error: when your estimate is the wrong sign, compared to the true value of the parameter. Type M error: when the magnitude of your estimate is far off, compared to the true value of the parameter. More here.

4 0.12495616 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”

Introduction: After seeing a document sent to me and others regarding the crisis of spurious, statistically significant research findings in psychology research, I had the following reaction: I am unhappy with the use in the document of the phrase “false positives.” I feel that this expression is unhelpful as it frames science in terms of “true” and “false” claims, which I don’t think is particularly accurate. In particular, in most of the recent disputed Psych Science type studies (the ESP study excepted, perhaps), there is little doubt that there is _some_ underlying effect. The issue, as I see it, is that the underlying effects are much smaller, and much more variable, than mainstream researchers imagine. So what happens is that Psych Science or Nature or whatever will publish a result that is purported to be some sort of universal truth, but it is actually a pattern specific to one data set, one population, and one experimental condition. In a sense, yes, these journals are publishing

5 0.10781135 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

Introduction: John Haubrick writes: Next semester I want to center my statistics class around independent projects that they will present at the end of the semester. My question is, by centering around a project and teaching for the different parts that they need at the time, should topics such as hypothesis testing be moved toward the beginning of the course? Or should I only discuss setting up a research hypothesis and discuss the actual testing later after they have the data? My reply: I’m not sure. There always is a difficulty of what can be covered in a project. My quick thought is that a project will perhaps work better if it is focused on data collection or exploratory data analysis rather than on estimation and hypothesis testing, which are topics that get covered pretty well in the course as a whole.

6 0.10622833 967 andrew gelman stats-2011-10-20-Picking on Gregg Easterbrook

7 0.10136917 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one

8 0.10058557 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

9 0.096509308 1605 andrew gelman stats-2012-12-04-Write This Book

10 0.092577122 109 andrew gelman stats-2010-06-25-Classics of statistics

11 0.092020839 1883 andrew gelman stats-2013-06-04-Interrogating p-values

12 0.084677741 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

13 0.083754301 1538 andrew gelman stats-2012-10-17-Rust

14 0.081682213 463 andrew gelman stats-2010-12-11-Compare p-values from privately funded medical trials to those in publicly funded research?

15 0.080976367 1024 andrew gelman stats-2011-11-23-Of hypothesis tests and Unitarians

16 0.080668226 904 andrew gelman stats-2011-09-13-My wikipedia edit

17 0.079444543 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

18 0.078064919 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

19 0.075396672 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

20 0.074973285 167 andrew gelman stats-2010-07-27-Why don’t more medical discoveries become cures?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.098), (1, 0.003), (2, -0.007), (3, -0.048), (4, -0.028), (5, -0.04), (6, 0.003), (7, 0.029), (8, 0.002), (9, -0.069), (10, -0.047), (11, 0.007), (12, 0.0), (13, -0.051), (14, -0.04), (15, -0.011), (16, -0.018), (17, -0.009), (18, -0.009), (19, -0.026), (20, 0.007), (21, 0.013), (22, 0.018), (23, 0.002), (24, -0.044), (25, -0.04), (26, 0.0), (27, 0.006), (28, -0.008), (29, -0.016), (30, 0.037), (31, -0.01), (32, 0.011), (33, 0.039), (34, -0.056), (35, -0.064), (36, 0.065), (37, -0.043), (38, 0.019), (39, -0.007), (40, -0.017), (41, -0.038), (42, -0.002), (43, -0.008), (44, -0.018), (45, 0.047), (46, 0.002), (47, -0.017), (48, -0.02), (49, -0.026)]
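The (topicId, topicWeight) pairs above are this post's coordinates in a latent semantic space obtained from a truncated SVD of the term-document matrix. A minimal sketch, assuming a gensim-style LSI workflow (the page's actual tooling is not stated):

```python
from gensim import corpora, models

# A minimal LSI sketch: build a bag-of-words corpus, fit a latent
# semantic model, and read off each post's (topicId, topicWeight)
# coordinates as in the list above. gensim is an assumption here.
texts = [doc.lower().split() for doc in [
    "type s and type m errors instead of the probability the hypothesis is true",
    "misunderstood p-value and statistical significance testing",
    "bayesian and frequentist approaches to hypothesis testing",
]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lsi = models.LsiModel(corpus, id2word=dictionary, num_topics=2)
print(lsi[corpus[0]])   # [(topicId, topicWeight), ...]
```

Similar posts are then the nearest neighbors of this coordinate vector, typically under cosine similarity.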

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96560204 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”


2 0.80209106 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

Introduction: Sam Seaver writes: I [Seaver] happened to be reading an ironic article by Karl Friston when I learned something new about frequentist vs bayesian, namely Lindley’s paradox, on page 12 (a numeric sketch of the paradox appears after this list). The text is as follows: So why are we worried about trivial effects? They are important because the probability that the true effect size is exactly zero is itself zero and could cause us to reject the null hypothesis inappropriately. This is a fallacy of classical inference and is not unrelated to Lindley’s paradox (Lindley 1957). Lindley’s paradox describes a counterintuitive situation in which Bayesian and frequentist approaches to hypothesis testing give opposite results. It occurs when: (i) a result is significant by a frequentist test, indicating sufficient evidence to reject the null hypothesis d=0 and (ii) priors render the posterior probability of d=0 high, indicating strong evidence that the null hypothesis is true. In his original treatment, Lindley (1957) showed that – under a parti

3 0.78573233 1024 andrew gelman stats-2011-11-23-Of hypothesis tests and Unitarians

Introduction: Xian, Judith, and I read this line in a book by statistician Murray Aitkin in which he considered the following hypothetical example: A survey of 100 individuals expressing support (Yes/No) for the president, before and after a presidential address . . . The question of interest is whether there has been a change in support between the surveys . . . We want to assess the evidence for the hypothesis of equality H1 against the alternative hypothesis H2 of a change. Here is our response: Based on our experience in public opinion research, this is not a real question. Support for any political position is always changing. The real question is how much the support has changed, or perhaps how this change is distributed across the population. A defender of Aitkin (and of classical hypothesis testing) might respond at this point that, yes, everybody knows that changes are never exactly zero and that we should take a more “grown-up” view of the null hypothesis, not that the change

4 0.772686 256 andrew gelman stats-2010-09-04-Noooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Introduction: Masanao sends this one in, under the heading, “another incident of misunderstood p-value”: Warren Davies, a positive psychology MSc student at UEL, provides the latest in our ongoing series of guest features for students. Warren has just released a Psychology Study Guide, which covers information on statistics, research methods and study skills for psychology students. Despite the myriad rules and procedures of science, some research findings are pure flukes. Perhaps you’re testing a new drug, and by chance alone, a large number of people spontaneously get better. The better your study is conducted, the lower the chance that your result was a fluke – but still, there is always a certain probability that it was. Statistical significance testing gives you an idea of what this probability is. In science we’re always testing hypotheses. We never conduct a study to ‘see what happens’, because there’s always at least one way to make any useless set of data look important. We take

5 0.74065238 2149 andrew gelman stats-2013-12-26-Statistical evidence for revised standards

Introduction: In response to the discussion of X and me of his recent paper, Val Johnson writes: I would like to thank Andrew for forwarding his comments on uniformly most powerful Bayesian tests (UMPBTs) to me and his invitation to respond to them. I think he (and also Christian Robert) raise a number of interesting points concerning this new class of Bayesian tests, but I think that they may have confounded several issues that might more usefully be examined separately. The first issue involves the choice of the Bayesian evidence threshold, gamma, used in rejecting a null hypothesis in favor of an alternative hypothesis. Andrew objects to the higher values of gamma proposed in my recent PNAS article on grounds that too many important scientific effects would be missed if thresholds of 25-50 were routinely used. These evidence thresholds correspond roughly to p-values of 0.005; Andrew suggests that evidence thresholds around 5 should continue to be used (gamma=5 corresponds approximate

6 0.73309302 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems

7 0.72664005 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

8 0.71799612 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value

9 0.69104117 1869 andrew gelman stats-2013-05-24-In which I side with Neyman over Fisher

10 0.68882859 331 andrew gelman stats-2010-10-10-Bayes jumps the shark

11 0.68587089 2272 andrew gelman stats-2014-03-29-I agree with this comment

12 0.67655516 1883 andrew gelman stats-2013-06-04-Interrogating p-values

13 0.65452582 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing

14 0.65406746 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

15 0.64218551 1826 andrew gelman stats-2013-04-26-“A Vast Graveyard of Undead Theories: Publication Bias and Psychological Science’s Aversion to the Null”

16 0.62697119 2305 andrew gelman stats-2014-04-25-Revised statistical standards for evidence (comments to Val Johnson’s comments on our comments on Val’s comments on p-values)

17 0.61908388 1612 andrew gelman stats-2012-12-08-The Case for More False Positives in Anti-doping Testing

18 0.61265969 1095 andrew gelman stats-2012-01-01-Martin and Liu: Probabilistic inference based on consistency of model with data

19 0.61222893 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

20 0.61031097 2078 andrew gelman stats-2013-10-26-“The Bayesian approach to forensic evidence”
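The Lindley's paradox entry above (item 2) is easy to make concrete. A minimal numeric sketch with made-up numbers: a large-sample result that a frequentist z-test rejects at the 5% level while a Bayes factor against a diffuse alternative favors the point null:

```python
import numpy as np
from scipy import stats

# Lindley's paradox in one calculation (made-up numbers): with n huge and
# the observed mean just past the 5% cutoff, the z-test rejects H0: mu = 0,
# yet the marginal likelihood of the data is far higher under the point
# null than under a diffuse alternative mu ~ N(0, tau^2).
n, sigma, tau = 100_000, 1.0, 1.0
xbar = 0.0065                          # observed sample mean (assumed)
se = sigma / np.sqrt(n)

z = xbar / se
p_value = 2 * (1 - stats.norm.cdf(abs(z)))             # frequentist verdict

m0 = stats.norm.pdf(xbar, 0, se)                       # marginal lik. under H0
m1 = stats.norm.pdf(xbar, 0, np.sqrt(tau**2 + se**2))  # marginal lik. under H1
bf01 = m0 / m1                                         # Bayes factor for H0

print(f"z = {z:.2f}, p = {p_value:.3f} (reject at 5%)")
print(f"BF01 = {bf01:.0f} (data favor the null)")
```

The same data thus count as significant evidence against the null for the frequentist test and as strong evidence for the null in the Bayesian comparison, which is exactly the conflict the excerpt describes.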


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(24, 0.203), (28, 0.067), (55, 0.031), (63, 0.059), (75, 0.047), (77, 0.03), (86, 0.173), (87, 0.039), (99, 0.2)]
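Unlike the signed LSI weights above, LDA topic weights are mixture proportions: the values in this list sum to roughly 0.85, with the remainder spread across topics below the reporting threshold. A minimal sketch, again assuming a gensim-style workflow:

```python
from gensim import corpora, models

# A minimal LDA sketch: each document is modeled as a mixture of topics,
# and get_document_topics returns the sparse (topicId, topicWeight)
# pairs shown above. gensim is an assumption; the actual pipeline and
# its number of topics are not documented.
texts = [doc.lower().split() for doc in [
    "significant p-values type s and type m errors diagnostic testing",
    "quality control problems at the newspaper opinion section",
    "communication is a central task of statistics and data analysis",
]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = models.LdaModel(corpus, id2word=dictionary, num_topics=3,
                      random_state=0, passes=10)
print(lda.get_document_topics(corpus[0]))   # sparse (topicId, weight) pairs
```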

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9512307 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”


2 0.9416278 436 andrew gelman stats-2010-11-29-Quality control problems at the New York Times

Introduction: I guess there’s a reason they put this stuff in the Opinion section and not in the Science section, huh? P.S. More here.

3 0.90463221 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”

Introduction: The Journal of the Royal Statistical Society publishes papers followed by discussions. Lots of discussions, each can be no more than 400 words. Here’s my most recent discussion: The authors are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the text. So, very quickly: I would prefer Figure 1 to be displayed in terms of standard deviations, not variances. I

4 0.89616096 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

Introduction: Dan Goldstein sends along this bit of research, distinguishing terms used in two different subfields of psychology. Dan writes: Intuitive calls included not listing words that don’t occur 3 or more times in both programs. I [Dan] did this because when I looked at the results, those cases tended to be proper names or arbitrary things like header or footer text. It also narrowed down the space of words to inspect, which means I could actually get the thing done in my copious free time. I think the bar graphs are kinda ugly, maybe there’s a better way to do it based on classifying the words according to content? Also the whole exercise would gain a new dimension by comparing several areas instead of just two. Maybe that’s coming next.

5 0.89442855 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

Introduction: I was given the opportunity to briefly comment on the paper, A Bayesian approach to complex clinical diagnoses: a case-study in child abuse, by Nicky Best, Deborah Ashby, Frank Dunstan, David Foreman, and Neil McIntosh, for the Journal of the Royal Statistical Society. Here is what I wrote: Best et al. are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the t

6 0.87967861 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker

7 0.87421918 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

8 0.87356079 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

9 0.87337208 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

10 0.87070751 494 andrew gelman stats-2010-12-31-Type S error rates for classical and Bayesian single and multiple comparison procedures

11 0.86263835 1971 andrew gelman stats-2013-08-07-I doubt they cheated

12 0.86239487 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

13 0.8609671 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

14 0.85988379 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”

15 0.85930371 1530 andrew gelman stats-2012-10-11-Migrating your blog from Movable Type to WordPress

16 0.85923421 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

17 0.85639298 2224 andrew gelman stats-2014-02-25-Basketball Stats: Don’t model the probability of win, model the expected score differential.

18 0.85370225 846 andrew gelman stats-2011-08-09-Default priors update?

19 0.85186678 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”

20 0.8508957 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys