andrew_gelman_stats-2012-1171 knowledge-graph by maker-knowledge-mining

1171 andrew gelman stats-2012-02-16-“False-positive psychology”


meta info for this blog

Source: html

Introduction: Everybody’s talkin bout this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who write : Despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We [Simmons, Nelson, and Simonsohn] present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process. Whatever you think about these recommend


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Everybody’s talkin bout this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who write : Despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ . [sent-1, score-0.41]

2 05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. [sent-2, score-0.273]

3 In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. [sent-3, score-0.524]

4 We [Simmons, Nelson, and Simonsohn] present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. [sent-4, score-1.239]

5 Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. [sent-5, score-0.231]

6 The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process. [sent-6, score-0.668]

7 I love its central example: To help illustrate the problem, we [Simmons et al. [sent-8, score-0.105]

8 ] conducted two experiments designed to demonstrate something false: that certain songs can change listeners’ age. [sent-9, score-0.358]

9 They go on to present some impressive-looking statistical results, then they go behind the curtain to show the fairly innocuous manipulations they performed to attain statistical significance. [sent-11, score-0.654]

10 A key part of the story is that, although such manipulations could be performed by a cheater, they could also seem like reasonable steps to a sincere researcher who thinks there’s an effect and wants to analyze the data a bit to understand it further. [sent-12, score-0.537]

11 We’ve all known for a long time that a p-value of 0. [sent-13, score-0.101]

12 This can be a big problem for studies in psychology and other fields where various data stories are vaguely consistent with theory. [sent-21, score-0.17]

13 We’ve all known about these problems but it’s only recently that we’ve been aware of how serious they are and how little we should trust a bunch of statistically significant results. [sent-22, score-0.298]

14 is that I’m not so happy with the framing in terms of “false positives”; to me, the problem is not so much with null effects but with uncertainty and variation. [sent-25, score-0.168]
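The extractive summary above scores sentences with a tf-idf model. As a minimal, hypothetical sketch of the idea (the pipeline's actual tokenizer and weighting scheme are not stated), each sentence can be treated as its own document and scored by the average tf-idf weight of its tokens:

```python
import math
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the mean tf-idf weight of its tokens.

    A sketch only: real pipelines also normalize, stem, and drop stopwords.
    """
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    # document frequency: in how many sentences each term appears
    df = Counter(term for doc in docs for term in set(doc))
    return [sum(count * math.log(n / df[t]) for t, count in Counter(doc).items())
            / len(doc)
            for doc in docs]

sents = ["the cat sat", "the dog sat", "a rare pangolin appeared"]
scores = tfidf_sentence_scores(sents)
best = max(range(len(sents)), key=scores.__getitem__)  # sentence with the rarest terms
```

Sentences built from corpus-rare terms (here the third one) float to the top, which is why the ranking above favors the quoted paper's distinctive vocabulary ("unacceptably", "straightforwardly").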


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('simmons', 0.373), ('manipulations', 0.197), ('nelson', 0.184), ('false', 0.172), ('simonsohn', 0.165), ('performed', 0.141), ('songs', 0.125), ('demonstrate', 0.122), ('evidence', 0.119), ('accumulate', 0.118), ('attain', 0.118), ('straightforwardly', 0.118), ('unacceptably', 0.118), ('talkin', 0.118), ('leif', 0.118), ('solution', 0.113), ('experiments', 0.111), ('researcher', 0.107), ('et', 0.105), ('statistically', 0.103), ('concrete', 0.103), ('impose', 0.103), ('present', 0.102), ('known', 0.101), ('positives', 0.1), ('srivastava', 0.1), ('actual', 0.1), ('bout', 0.098), ('endorsement', 0.098), ('sanjay', 0.098), ('falsely', 0.096), ('innocuous', 0.096), ('nominal', 0.096), ('burden', 0.095), ('significant', 0.094), ('flexibility', 0.092), ('sincere', 0.092), ('requirements', 0.089), ('vaguely', 0.088), ('guidelines', 0.086), ('demonstrates', 0.086), ('framing', 0.086), ('exists', 0.083), ('uri', 0.083), ('problem', 0.082), ('dramatically', 0.081), ('pair', 0.08), ('minimal', 0.079), ('joseph', 0.076), ('reviewers', 0.074)]
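The simValue scores in the lists below are consistent with cosine similarity between sparse tf-idf vectors like the word weights above: a document compared with itself scores 1, and the 1.0000002 reported for the same-blog entry is just floating-point noise around that. A minimal sketch, assuming documents are stored as term -> weight dicts:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term -> weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# a few of the actual top weights listed above
this_blog = {'simmons': 0.373, 'manipulations': 0.197, 'nelson': 0.184}
cosine(this_blog, this_blog)  # self-similarity is 1.0 up to rounding
```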

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 1171 andrew gelman stats-2012-02-16-“False-positive psychology”


2 0.16643742 1883 andrew gelman stats-2013-06-04-Interrogating p-values

Introduction: This article is a discussion of a paper by Greg Francis for a special issue, edited by E. J. Wagenmakers, of the Journal of Mathematical Psychology. Here’s what I wrote: Much of statistical practice is an effort to reduce or deny variation and uncertainty. The reduction is done through standardization, replication, and other practices of experimental design, with the idea being to isolate and stabilize the quantity being estimated and then average over many cases. Even so, however, uncertainty persists, and statistical hypothesis testing is in many ways an endeavor to deny this, by reporting binary accept/reject decisions. Classical statistical methods produce binary statements, but there is no reason to assume that the world works that way. Expressions such as Type 1 error, Type 2 error, false positive, and so on, are based on a model in which the world is divided into real and non-real effects. To put it another way, I understand the general scientific distinction of real vs

3 0.15326211 1069 andrew gelman stats-2011-12-19-I got one of these letters once and was so irritated that I wrote back to the journal withdrawing my paper

Introduction: I’m talkin bout this.

4 0.13604519 2091 andrew gelman stats-2013-11-06-“Marginally significant”

Introduction: Jeremy Fox writes: You’ve probably seen this [by Matthew Hankins]. . . . Everyone else on Twitter already has. It’s a graph of the frequency with which the phrase “marginally significant” occurs in association with different P values. Apparently it’s real data, from a Google Scholar search, though I haven’t tried to replicate the search myself. My reply: I admire the effort that went into the data collection and the excellent display (following Bill Cleveland etc., I’d prefer a landscape rather than portrait orientation of the graph, also I’d prefer a gritty histogram rather than a smooth density, and I don’t like the y-axis going below zero, nor do I like the box around the graph, also there’s that weird R default where the axis labels are so far from the actual axes, I don’t know whassup with that . . . but these are all minor, minor issues, certainly I’ve done much worse myself many times even in published articles; see the presentation here for lots of examples), an

5 0.13455766 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

Introduction: False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both? It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.” The problem, of course, is that the likelihood of at leas

6 0.13030142 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”

7 0.12791255 1826 andrew gelman stats-2013-04-26-“A Vast Graveyard of Undead Theories: Publication Bias and Psychological Science’s Aversion to the Null”

8 0.12109074 1963 andrew gelman stats-2013-07-31-Response by Jessica Tracy and Alec Beall to my critique of the methods in their paper, “Women Are More Likely to Wear Red or Pink at Peak Fertility”

9 0.11842717 445 andrew gelman stats-2010-12-03-Getting a job in pro sports… as a statistician

10 0.11739373 1844 andrew gelman stats-2013-05-06-Against optimism about social science

11 0.11650053 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

12 0.11322077 565 andrew gelman stats-2011-02-09-Dennis the dentist, debunked?

13 0.10811518 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

14 0.10620985 256 andrew gelman stats-2010-09-04-Noooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

15 0.10594179 2166 andrew gelman stats-2014-01-10-3 years out of date on the whole Dennis the dentist thing!

16 0.10155889 1959 andrew gelman stats-2013-07-28-50 shades of gray: A research story

17 0.10051773 2065 andrew gelman stats-2013-10-17-Cool dynamic demographic maps provide beautiful illustration of Chris Rock effect

18 0.10002112 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

19 0.09772253 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

20 0.094345391 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.19), (1, -0.013), (2, 0.012), (3, -0.146), (4, -0.006), (5, -0.076), (6, -0.004), (7, -0.003), (8, 0.006), (9, -0.044), (10, -0.034), (11, 0.0), (12, 0.016), (13, -0.055), (14, 0.015), (15, -0.003), (16, -0.019), (17, -0.026), (18, 0.002), (19, -0.013), (20, -0.004), (21, 0.018), (22, -0.018), (23, 0.001), (24, -0.038), (25, 0.0), (26, 0.04), (27, -0.018), (28, 0.017), (29, -0.044), (30, 0.019), (31, 0.046), (32, 0.023), (33, 0.002), (34, 0.016), (35, 0.028), (36, -0.028), (37, -0.048), (38, 0.006), (39, -0.05), (40, -0.025), (41, 0.027), (42, 0.005), (43, 0.018), (44, 0.022), (45, -0.008), (46, -0.028), (47, 0.033), (48, 0.042), (49, 0.012)]
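An LSI model represents each blog as a dense vector of topic weights like the 50 numbers above, obtained by projecting its tf-idf vector onto learned topic directions; similarity is then cosine in that reduced space, which is why the lsi scores below run much higher (0.77 to 0.97) than the sparse tfidf ones. A sketch of the projection step only; the topic loadings here are invented for illustration (a real LSI model learns them via truncated SVD):

```python
def project(tfidf_vec, topic_matrix):
    """Project a sparse tf-idf vector into a dense topic space.

    topic_matrix: one dict of term -> loading per topic
    (hypothetical values in this example).
    """
    return [sum(w * topic.get(t, 0.0) for t, w in tfidf_vec.items())
            for topic in topic_matrix]

topics = [{'simmons': 0.9, 'false': 0.3},   # a "replication crisis" direction
          {'false': -0.5, 'spam': 0.8}]     # loadings may be negative in LSI
vec = {'simmons': 0.373, 'false': 0.172}
weights = project(vec, topics)  # ≈ [0.3873, -0.086]
```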

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96702766 1171 andrew gelman stats-2012-02-16-“False-positive psychology”


2 0.83685863 1826 andrew gelman stats-2013-04-26-“A Vast Graveyard of Undead Theories: Publication Bias and Psychological Science’s Aversion to the Null”

Introduction: Erin Jonaitis points us to this article by Christopher Ferguson and Moritz Heene, who write: Publication bias remains a controversial issue in psychological science. . . . that the field often constructs arguments to block the publication and interpretation of null results and that null results may be further extinguished through questionable researcher practices. Given that science is dependent on the process of falsification, we argue that these problems reduce psychological science’s capability to have a proper mechanism for theory falsification, thus resulting in the promulgation of numerous “undead” theories that are ideologically popular but have little basis in fact. They mention the infamous Daryl Bem article. It is pretty much only because Bem’s claims are (presumably) false that they got published in a major research journal. Had the claims been true—that is, had Bem run identical experiments, analyzed his data more carefully and objectively, and reported that the r

3 0.82633132 1883 andrew gelman stats-2013-06-04-Interrogating p-values


4 0.81283289 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

Introduction: In the discussion of the fourteen magic words that can increase voter turnout by over 10 percentage points , questions were raised about the methods used to estimate the experimental effects. I sent these on to Chris Bryan, the author of the study, and he gave the following response: We’re happy to address the questions that have come up. It’s always noteworthy when a precise psychological manipulation like this one generates a large effect on a meaningful outcome. Such findings illustrate the power of the underlying psychological process. I’ve provided the contingency tables for the two turnout experiments below. As indicated in the paper, the data are analyzed using logistic regressions. The change in chi-squared statistic represents the significance of the noun vs. verb condition variable in predicting turnout; that is, the change in the model’s significance when the condition variable is added. This is a standard way to analyze dichotomous outcomes. Four outliers were excl

5 0.80199486 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

Introduction: Jeff Leek just posted the discussions of his paper (with Leah Jager), “An estimate of the science-wise false discovery rate and application to the top medical literature,” along with some further comments of his own. Here are my original thoughts on an earlier version of their article. Keith O’Rourke and I expanded these thoughts into a formal comment for the journal. We’re pretty much in agreement with John Ioannidis (you can find his discussion in the top link above). In quick summary, I agree with Jager and Leek that this is an important topic. I think there are two key places where Keith and I disagree with them: 1. They take published p-values at face value whereas we consider them as the result of a complicated process of selection. This is something I didn’t used to think much about, but now I’ve become increasingly convinced that the problems with published p-values is not a simple file-drawer effect or the case of a few p=0.051 values nudged toward p=0.049, bu

6 0.79882264 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

7 0.79747212 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

8 0.79077947 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”

9 0.78993201 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

10 0.78892332 576 andrew gelman stats-2011-02-15-With a bit of precognition, you’d have known I was going to post again on this topic, and with a lot of precognition, you’d have known I was going to post today

11 0.7858485 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

12 0.78300482 2156 andrew gelman stats-2014-01-01-“Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying”

13 0.7810114 1963 andrew gelman stats-2013-07-31-Response by Jessica Tracy and Alec Beall to my critique of the methods in their paper, “Women Are More Likely to Wear Red or Pink at Peak Fertility”

14 0.7804237 2326 andrew gelman stats-2014-05-08-Discussion with Steven Pinker on research that is attached to data that are so noisy as to be essentially uninformative

15 0.77699256 2223 andrew gelman stats-2014-02-24-“Edlin’s rule” for routinely scaling down published estimates

16 0.77639997 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

17 0.77459067 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

18 0.77418131 897 andrew gelman stats-2011-09-09-The difference between significant and not significant…

19 0.77378559 1626 andrew gelman stats-2012-12-16-The lamest, grudgingest, non-retraction retraction ever

20 0.77121443 1974 andrew gelman stats-2013-08-08-Statistical significance and the dangerous lure of certainty


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.166), (6, 0.019), (15, 0.051), (16, 0.113), (17, 0.011), (21, 0.03), (24, 0.175), (86, 0.028), (95, 0.012), (99, 0.259)]
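An LDA model describes each blog as a probability distribution over topics, which is what the sparse (topicId, topicWeight) list above shows. One standard way to compare such distributions is the Hellinger distance; the pipeline's actual similarity measure is not stated, so this is a plausible sketch rather than the implementation:

```python
import math

def hellinger(p, q):
    """Hellinger distance between two sparse topicId -> weight maps.

    0.0 for identical distributions, 1.0 for distributions with
    disjoint support; lower means more similar.
    """
    support = set(p) | set(q)
    s = sum((math.sqrt(p.get(t, 0.0)) - math.sqrt(q.get(t, 0.0))) ** 2
            for t in support)
    return math.sqrt(s / 2)

this_blog = {2: 0.166, 16: 0.113, 24: 0.175, 99: 0.259}  # truncated weights from above
hellinger(this_blog, this_blog)  # identical distributions give 0.0
```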

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98032641 549 andrew gelman stats-2011-02-01-“Roughly 90% of the increase in . . .” Hey, wait a minute!

Introduction: Matthew Yglesias links approvingly to the following statement by Michael Mandel: Homeland Security accounts for roughly 90% of the increase in federal regulatory employment over the past ten years. Roughly 90%, huh? That sounds pretty impressive. But wait a minute . . . what if total federal regulatory employment had increased a bit less. Then Homeland Security could’ve accounted for 105% of the increase, or 500% of the increase, or whatever. The point is the change in total employment is the sum of a bunch of pluses and minuses. It happens that, if you don’t count Homeland Security, the total hasn’t changed much–I’m assuming Mandel’s numbers are correct here–and that could be interesting. The “roughly 90%” figure is misleading because, when written as a percent of the total increase, it’s natural to quickly envision it as a percentage that is bounded by 100%. There is a total increase in regulatory employment that the individual agencies sum to, but some margins are p

2 0.97216415 1017 andrew gelman stats-2011-11-18-Lack of complete overlap

Introduction: Evens Salies writes: I have a question regarding a randomizing constraint in my current funded electricity experiment. After elimination of missing data we have 110 voluntary households from a larger population (resource constraints do not allow us to have more households!). I randomly assign them to treated and non-treated, where the treatment variable is some ICT that allows the treated to track their electricity consumption in real time. The ICT is made of two devices, one that is plugged on the household’s modem and the other on the electric meter. A necessary condition for being treated is that the distance between the box and the meter be below some threshold (d), the value of which is 20 meters approximately. 50 ICTs can be installed. 60 households will be in the control group. But, I can only assign 6 households in the control group for whom d is less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the group of treated.

3 0.96149719 1663 andrew gelman stats-2013-01-09-The effects of fiscal consolidation

Introduction: José Iparraguirre writes: I’ve read a recent paper by the International Monetary Fund on the effects of fiscal consolidation measures on income inequality (Fiscal Monitor October 2012, Appendix 1). They run a panel regression with 48 countries and 30 years (annual data) of a measure of income inequality (Gini coefficient) on a number of covariates, including a measure of fiscal consolidation. Footnote 39 (page 51) informs that they’ve employed seemingly unrelated regression and panel-corrected standard errors, and that to double-check they’ve also run ordinary least squares and fixed-effects panel regressions—all with similar results. So far, so good. However, the footnote goes on to explain that “Some of the results (e.g. the causal relationship between consolidation and inequality) may be subject to endogeneity and should be interpreted with caution”. (Italics are mine). Therefore, it seems that the crux of the exercise—i.e. estimating the relationship between fiscal con

4 0.96027607 1872 andrew gelman stats-2013-05-27-More spam!

Introduction: I just got this one today: Dear Dr. Gelman, I am pleased to inform you that the ** team has identified your recent publication, “Philosophy and the practice of Bayesian statistics.” as being of special interest to the progress in the Psychology field. We would like to list your publication on our next edition of the ** series. ** alerts the scientific community to breaking journal articles considered to represent the best in Psychology research. For today’s edition, click here. ** is viewed almost 40,000 times each month and has an audience of academic and clinical personnel from a growing number of the top 20 major academic institutions. Publications featured by ** gain extensive exposure. This exposure may benefit you and your organization since this provides a showcase for key research studies such as yours. This exposure has the added benefit of encouraging additional funding. There is a small processing charge for listing publications on ** ($35). Please let us know

5 0.95931053 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder

Introduction: In the inbox today, under the header, “Hidden Costs behind Milk & Dairy Consumption (video)”: Hey Professor Gelman, Our site’s production team recently released a short video uncovering the local and global impact that milk has on our lives. After spending some time on your posts, I noticed you talked about dairy products and milk so I thought I’d email you. Are you the correct person to contact in regards to the content on the site? If so, let me know if you’re interested in checking out the video. Thanks, Emily S. Hmmm . . . I guess I do talk a lot about dairy products and milk on this site!

same-blog 6 0.95790124 1171 andrew gelman stats-2012-02-16-“False-positive psychology”

7 0.95696032 1189 andrew gelman stats-2012-02-28-Those darn physicists

8 0.95201135 1954 andrew gelman stats-2013-07-24-Too Good To Be True: The Scientific Mass Production of Spurious Statistical Significance

9 0.95020092 1254 andrew gelman stats-2012-04-09-In the future, everyone will publish everything.

10 0.94973582 17 andrew gelman stats-2010-05-05-Taking philosophical arguments literally

11 0.94312692 97 andrew gelman stats-2010-06-18-Economic Disparities and Life Satisfaction in European Regions

12 0.93676674 885 andrew gelman stats-2011-09-01-Needed: A Billionaire Candidate for President Who Shares the Views of a Washington Post Columnist

13 0.93363547 489 andrew gelman stats-2010-12-28-Brow inflation

14 0.92777646 1893 andrew gelman stats-2013-06-11-Folic acid and autism

15 0.92682159 1102 andrew gelman stats-2012-01-06-Bayesian Anova found useful in ecology

16 0.92570072 1508 andrew gelman stats-2012-09-23-Speaking frankly

17 0.9232496 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

18 0.91152775 681 andrew gelman stats-2011-04-26-Worst statistical graphic I have seen this year

19 0.90873551 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?

20 0.90140873 27 andrew gelman stats-2010-05-11-Update on the spam email study