andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1744 knowledge-graph by maker-knowledge-mining

1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects


meta info for this blog

Source: html

Introduction: The title of this post is silly but I have an important point to make, regarding an implicit model which I think many people assume even though it does not really make sense. Following a link from Sanjay Srivastava, I came across a post from David Funder saying that it’s useful to talk about the sizes of effects (I actually prefer the term “comparisons” so as to avoid the causal baggage) rather than just their signs. I agree , and I wanted to elaborate a bit on a point that comes up in Funder’s discussion. He quotes an (unnamed) prominent social psychologist as writing: The key to our research . . . [is not] to accurately estimate effect size. If I were testing an advertisement for a marketing research firm and wanted to be sure that the cost of the ad would produce enough sales to make it worthwhile, effect size would be crucial. But when I am testing a theory about whether, say, positive mood reduces information processing in comparison with negative mood, I am worried abou


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The title of this post is silly but I have an important point to make, regarding an implicit model which I think many people assume even though it does not really make sense. [sent-1, score-0.227]

2 Following a link from Sanjay Srivastava, I came across a post from David Funder saying that it’s useful to talk about the sizes of effects (I actually prefer the term “comparisons” so as to avoid the causal baggage) rather than just their signs. [sent-2, score-0.47]

3 I agree, and I wanted to elaborate a bit on a point that comes up in Funder’s discussion. [sent-3, score-0.132]

4 He quotes an (unnamed) prominent social psychologist as writing: The key to our research . . . [sent-4, score-0.135]

5 If I were testing an advertisement for a marketing research firm and wanted to be sure that the cost of the ad would produce enough sales to make it worthwhile, effect size would be crucial. [sent-8, score-0.775]

6 I’ve added the emphasis in the quote above to point to what I see as its key mistake, which is an implicit model in which effects are additive and interactions are multiplicative. [sent-14, score-0.735]

7 My impression is that people think this way all the time: an effect is positive, negative, or zero, and if it’s positive, it will have different degrees of positivity depending on conditions (with a “pure” measurement having larger effects than an “attenuated” measurement). [sent-15, score-0.768]

8 There seems to be an idea, when considering true effects or population comparisons (that is, forgetting for a moment about sampling or estimation uncertainty), that there is a high fence at zero, stopping positive effects from becoming negative or vice versa. [sent-17, score-1.214]

9 If main effects can be additive, so can interactions. [sent-19, score-0.219]

10 If “a different manipulation of mood, a different set of informational stimuli, a different contextual setting for the research” can change the magnitude of an effect, I think it can shift the sign as well. [sent-20, score-1.003]

11 . . . 0.001 is that they can be fragile; there’s no guarantee the effect might not be -0. . . . [sent-22, score-0.235]

12 And I’m not talking about sampling variability here, I’m talking about interactions, that is, real variability in the underlying effect or comparison. [sent-24, score-0.679]

13 This idea is familiar to those of us who use multilevel models but it can be missing in some standard presentations of statistics in which parameters are estimated one at a time without interest in their variation. [sent-25, score-0.069]

14 Funder’s post is fine too; he focuses on a different point, which is how to assess the relevance of correlations such as 0. . . . [sent-28, score-0.207]
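Sentences 7–13 above argue that true effects vary across settings (interactions), so a positive-on-average effect has no "fence at zero" protecting its sign, and that multilevel models are the natural way to estimate that variation. Below is a minimal simulation sketch of both points; the numbers (average effect 0.1, between-setting sd 0.2, measurement noise 0.15) are invented for illustration, and the partial pooling uses a simple method-of-moments shrinkage rather than a full multilevel fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented numbers: the effect averages +0.1 across settings but varies
# between settings with sd 0.2 because of interactions with context,
# stimuli, population, and so on.
mean_effect, between_sd, noise_sd = 0.1, 0.2, 0.15
n_settings = 50

true_effects = rng.normal(mean_effect, between_sd, size=n_settings)
estimates = true_effects + rng.normal(0, noise_sd, size=n_settings)

# No "fence at zero": a positive-on-average effect is negative in a
# substantial share of settings (roughly Phi(-0.1/0.2), about 0.31, here).
print(f"share of settings with a negative true effect: "
      f"{(true_effects < 0).mean():.2f}")

# A minimal partial-pooling (multilevel-style) estimate of the variation:
# Var(estimates) = tau^2 + noise_sd^2, so estimate tau^2 by moments and
# shrink each setting's estimate toward the grand mean accordingly.
tau2_hat = max(estimates.var(ddof=1) - noise_sd**2, 0.0)
shrinkage = tau2_hat / (tau2_hat + noise_sd**2)
pooled = estimates.mean() + shrinkage * (estimates - estimates.mean())

print(f"estimated between-setting sd: {np.sqrt(tau2_hat):.2f}")
print(f"share of partially pooled estimates below zero: "
      f"{(pooled < 0).mean():.2f}")
```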


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('mood', 0.408), ('funder', 0.29), ('effect', 0.235), ('effects', 0.219), ('stimuli', 0.176), ('positive', 0.173), ('fence', 0.166), ('negative', 0.153), ('informational', 0.149), ('size', 0.143), ('different', 0.139), ('contextual', 0.139), ('additive', 0.126), ('manipulation', 0.123), ('reduces', 0.123), ('variability', 0.119), ('processing', 0.111), ('magnitude', 0.098), ('produce', 0.097), ('implicit', 0.095), ('across', 0.093), ('interactions', 0.091), ('sizes', 0.09), ('positivity', 0.088), ('measurement', 0.087), ('worry', 0.086), ('advertisement', 0.083), ('testing', 0.08), ('comparison', 0.077), ('setting', 0.077), ('direction', 0.077), ('sampling', 0.076), ('comparisons', 0.074), ('sheer', 0.074), ('multiplicative', 0.074), ('unnamed', 0.074), ('srivastava', 0.071), ('zero', 0.07), ('research', 0.069), ('forgetting', 0.069), ('presentations', 0.069), ('worthwhile', 0.069), ('sanjay', 0.069), ('fragile', 0.069), ('wanted', 0.068), ('post', 0.068), ('key', 0.066), ('talking', 0.065), ('vice', 0.065), ('point', 0.064)]
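The word-weight list above is the post’s tf-idf signature. As a rough illustration of how such lists and the "similar blogs" rankings below can be produced, here is a generic sketch; it is not the actual maker-knowledge-mining pipeline, and the post texts are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder texts standing in for the full blog posts.
posts = {
    "1744": "effect size mood positive negative interactions fence zero",
    "803":  "effect size measurement error wacky claims larger estimate",
    "963":  "type m error small sample magnitude overestimated",
}

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(posts.values())        # one tf-idf row per post

# Top-weighted words for the first post (the analogue of the list above):
words = vectorizer.get_feature_names_out()
weights = X[0].toarray().ravel()
top = sorted(zip(words, weights), key=lambda wt: -wt[1])[:5]
print(top)

# Pairwise cosine similarity, the basis of a "similar blogs" ranking:
print(cosine_similarity(X).round(3))
```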

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects


2 0.20431682 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

Introduction: A few days ago I discussed the evaluation of somewhat-plausible claims that are somewhat supported by theory and somewhat supported by statistical evidence. One point I raised was that an implausibly large estimate of effect size can be cause for concern: Uri Simonsohn (the author of the recent rebuttal of the name-choice article by Pelham et al.) argued that the implied effects were too large to be believed (just as I was arguing above regarding the July 4th study), which makes more plausible his claims that the results arise from methodological artifacts. That calculation is straight Bayes: the distribution of systematic errors has much longer tails than the distribution of random errors, so the larger the estimated effect, the more likely it is to be a mistake. This little theoretical result is a bit annoying, because it is the larger effects that are the most interesting!” Larry Bartels notes that my reasoning above is a bit incoherent: I [Bartels] strongly agree with
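The "straight Bayes" argument in this excerpt can be made concrete with a toy calculation. The error model below is invented for illustration (normal random error with sd 1, a long-tailed t distribution for systematic error, and a 10% prior probability of an artifact); it is not a calculation from the post.

```python
from scipy import stats

P_ARTIFACT = 0.10   # invented prior probability that a result is an artifact

def prob_artifact(estimate):
    """Posterior probability the estimate is driven by long-tailed
    systematic error rather than ordinary (normal) random error."""
    like_random = stats.norm.pdf(estimate, scale=1.0)
    like_artifact = stats.t.pdf(estimate, df=2, scale=1.0)   # longer tails
    num = P_ARTIFACT * like_artifact
    return num / (num + (1 - P_ARTIFACT) * like_random)

for est in [1, 2, 3, 5]:
    print(f"estimated effect {est}: P(artifact | data) = {prob_artifact(est):.2f}")
# The probability rises with the size of the estimate: the larger the
# estimated effect, the more likely it is to be a mistake.
```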

3 0.1518067 963 andrew gelman stats-2011-10-18-Question on Type M errors

Introduction: Inti Pedroso writes: Today during the group meeting at my new job we were revising a paper whose main conclusions were sustained by an ANOVA. One of the first observations is that the experiment had a small sample size. Interestingly (or maybe not so), some of the reported effects (most of them interactions) were quite large. One of the experienced group members said that “there is a common wisdom that one should not believe effects from small sample sizes but [he thinks] if they [the effects] are large enough to be picked on a small study they must be real large effects”. I argued that if the sample size is small one could incur an M-type error in which the magnitude of the effect is being over-estimated and that if larger samples are evaluated the magnitude may become smaller, and so may the confidence intervals. The concept of M-type error is completely new to all other members of the group (in which I am in my second week) and I was given the job of finding a suitable ref to explain
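A minimal simulation of the Type M (magnitude) error being described, with invented numbers: a small true effect (0.1 sd), small groups (n = 20), and many replicated studies. Conditioning on statistical significance inflates the average estimate to several times the true effect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

true_effect, n, n_sims = 0.1, 20, 20_000   # invented numbers

control = rng.normal(0.0, 1.0, size=(n_sims, n))
treated = rng.normal(true_effect, 1.0, size=(n_sims, n))

_, p_values = stats.ttest_ind(treated, control, axis=1)
estimates = treated.mean(axis=1) - control.mean(axis=1)
significant = p_values < 0.05

print(f"power: {significant.mean():.2f}")
print(f"mean estimate, all studies: {estimates.mean():.2f}")   # near the true 0.1
print(f"mean estimate, significant studies only: "
      f"{estimates[significant].mean():.2f}")                  # several times 0.1
```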

4 0.14912091 2155 andrew gelman stats-2013-12-31-No on Yes-No decisions

Introduction: Just to elaborate on our post from last month (“I’m negative on the expression ‘false positives’”), here’s a recent exchange we had regarding the relevance of yes/no decisions in summarizing statistical inferences about scientific questions. Shravan wrote: Isn’t it true that I am already done if P(theta>0) is much larger than P(theta<0)? I don’t need to compute any loss function if the former is 0.99 and the latter 0.01. In most studies of the type that people like me do [Shravan is a linguist], we set up experiments where we have a decisive test like this for theory A and against theory B. To which I replied: In some way the problem is with the focus on “theta.” Effects (and, more generally, comparisons) vary; they can be positive for some people in some settings and negative for other people in other settings. If you’re talking about a single “theta,” you have to define what population and what scenario you are thinking about. And it’s probably not the popul
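Gelman’s reply, that "theta" varies across people and settings, can be illustrated with simulated posterior draws (all numbers invented): a decisive-looking P(theta > 0) for the average effect can coexist with much less certainty about the sign of the effect in any particular setting.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented posterior for the average effect: mean 0.05, sd 0.03.
theta_draws = rng.normal(0.05, 0.03, size=10_000)
print(f"P(average effect > 0) = {(theta_draws > 0).mean():.2f}")   # roughly 0.95

# If the effect varies across settings with sd 0.1, the sign in a new
# setting is far less decisive.
setting_draws = theta_draws + rng.normal(0.0, 0.1, size=theta_draws.size)
print(f"P(effect > 0 in a new setting) = {(setting_draws > 0).mean():.2f}")  # roughly 0.68
```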

5 0.14815052 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

Introduction: From a recent email exchange: I agree that you should never compare p-values directly. The p-value is a strange nonlinear transformation of data that is only interpretable under the null hypothesis. Once you abandon the null (as we do when we observe something with a very low p-value), the p-value itself becomes irrelevant. To put it another way, the p-value is a measure of evidence; it is not an estimate of effect size (as it is often treated, with the idea that a p=.001 effect is larger than a p=.01 effect, etc.). Even conditional on sample size, the p-value is not a measure of effect size.
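A small sketch of the point that the p-value is not an estimate of effect size (invented data): a smaller effect measured with a large sample produces a much smaller p-value than a larger effect measured with a small sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

small_effect_big_n = rng.normal(0.1, 1.0, size=2_000)   # true effect 0.1
big_effect_small_n = rng.normal(0.5, 1.0, size=30)      # true effect 0.5

for label, x in [("effect 0.1, n = 2000", small_effect_big_n),
                 ("effect 0.5, n = 30  ", big_effect_small_n)]:
    _, p = stats.ttest_1samp(x, 0.0)
    print(f"{label}: estimate = {x.mean():.2f}, p = {p:.2g}")
```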

6 0.14428587 1400 andrew gelman stats-2012-06-29-Decline Effect in Linguistics?

7 0.14341332 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

8 0.13669848 950 andrew gelman stats-2011-10-10-“Causality is almost always in doubt”

9 0.13643044 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

10 0.13145776 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

11 0.13103509 797 andrew gelman stats-2011-07-11-How do we evaluate a new and wacky claim?

12 0.1254777 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

13 0.11940635 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

14 0.11936642 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum

15 0.11715562 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

16 0.11705479 1746 andrew gelman stats-2013-03-02-Fishing for cherries

17 0.11644823 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

18 0.11569475 1883 andrew gelman stats-2013-06-04-Interrogating p-values

19 0.11536593 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

20 0.11466945 2090 andrew gelman stats-2013-11-05-How much do we trust a new claim that early childhood stimulation raised earnings by 42%?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.207), (1, 0.042), (2, 0.074), (3, -0.153), (4, 0.023), (5, -0.039), (6, -0.013), (7, 0.015), (8, 0.025), (9, -0.007), (10, -0.101), (11, 0.022), (12, 0.089), (13, -0.084), (14, 0.05), (15, -0.002), (16, -0.084), (17, 0.013), (18, -0.045), (19, 0.061), (20, -0.035), (21, -0.048), (22, 0.012), (23, 0.023), (24, -0.052), (25, -0.001), (26, -0.057), (27, 0.067), (28, -0.008), (29, -0.004), (30, -0.036), (31, -0.022), (32, -0.058), (33, -0.027), (34, 0.002), (35, -0.016), (36, -0.048), (37, -0.079), (38, -0.005), (39, 0.005), (40, 0.036), (41, 0.011), (42, -0.017), (43, -0.038), (44, -0.002), (45, -0.014), (46, -0.029), (47, 0.01), (48, 0.007), (49, 0.049)]
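The topicId/topicWeight pairs above are the post’s coordinates in an LSI (latent semantic indexing) space. A generic sketch of how such weights can be computed follows; it is not the pipeline used here, and the post texts are placeholders. LSI is a truncated SVD of the tf-idf matrix; the LDA weights in the last section come from a similar per-post decomposition, fit on raw word counts with a topic model instead of an SVD.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder texts standing in for the full blog posts.
posts = [
    "effect size mood positive negative interactions fence zero",
    "effect size measurement error wacky claims larger estimate",
    "type m error small sample magnitude overestimated",
]

X = TfidfVectorizer(stop_words="english").fit_transform(posts)

lsi = TruncatedSVD(n_components=2, random_state=0)   # topicId 0 .. n_components-1
topic_weights = lsi.fit_transform(X)                 # one row of weights per post
print(topic_weights[0].round(3))
```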

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98488265 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects


2 0.8918069 1400 andrew gelman stats-2012-06-29-Decline Effect in Linguistics?

Introduction: Josef Fruehwald writes : In the past few years, the empirical foundations of the social sciences, especially Psychology, have been coming under increased scrutiny and criticism. For example, there was the New Yorker piece from 2010 called “The Truth Wears Off” about the “decline effect,” or how the effect size of a phenomenon appears to decrease over time. . . . I [Fruehwald] am a linguist. Do the problems facing psychology face me? To really answer that, I first have to decide which explanation for the decline effect I think is most likely, and I think Andrew Gelman’s proposal is a good candidate: The short story is that if you screen for statistical significance when estimating small effects, you will necessarily overestimate the magnitudes of effects, sometimes by a huge amount. I’ve put together some R code to demonstrate this point. Let’s say I’m looking at two populations, and unknown to me as a researcher, there is a small difference between the two, even though they

3 0.87096614 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims


4 0.86952037 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

Introduction: Let’s say you are repeatedly going to receive unselected sets of well-done RCTs on various, say, medical treatments. One reasonable assumption with all of these treatments is that they are monotonic – either helpful or harmful for all. The treatment effect will (as always) vary for subgroups in the population – these will not be explicitly identified in the studies – but each study very likely will enroll different percentages of the various patient subgroups. Being all randomized studies these subgroups will be balanced in the treatment versus control arms – but each study will (as always) be estimating a different – but exchangeable – treatment effect (Exchangeable due to the ignorance about the subgroup memberships of the enrolled patients.) That reasonable assumption – monotonicity – will be to some extent (as always) wrong, but given that it is a risk believed well worth taking – if the average effect in any population is positive (versus negative) the average effect in any other

5 0.84672344 963 andrew gelman stats-2011-10-18-Question on Type M errors


6 0.82776797 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

7 0.81830317 1310 andrew gelman stats-2012-05-09-Varying treatment effects, again

8 0.81826854 2165 andrew gelman stats-2014-01-09-San Fernando Valley cityscapes: An example of the benefits of fractal devastation?

9 0.81784421 797 andrew gelman stats-2011-07-11-How do we evaluate a new and wacky claim?

10 0.78214228 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

11 0.78037667 629 andrew gelman stats-2011-03-26-Is it plausible that 1% of people pick a career based on their first name?

12 0.77826262 2155 andrew gelman stats-2013-12-31-No on Yes-No decisions

13 0.7691195 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

14 0.76875448 1215 andrew gelman stats-2012-03-16-The “hot hand” and problems with hypothesis testing

15 0.76838446 2227 andrew gelman stats-2014-02-27-“What Can we Learn from the Many Labs Replication Project?”

16 0.76401782 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

17 0.76174653 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

18 0.75920659 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

19 0.75858873 1186 andrew gelman stats-2012-02-27-Confusion from illusory precision

20 0.74947363 2090 andrew gelman stats-2013-11-05-How much do we trust a new claim that early childhood stimulation raised earnings by 42%?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.012), (9, 0.012), (10, 0.13), (11, 0.012), (13, 0.023), (15, 0.043), (16, 0.046), (21, 0.031), (24, 0.169), (53, 0.015), (73, 0.011), (77, 0.022), (86, 0.018), (95, 0.013), (99, 0.327)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97474778 2215 andrew gelman stats-2014-02-17-The Washington Post reprints university press releases without editing them

Introduction: Somebody points me to this horrifying exposé by Paul Raeburn on a new series by the Washington Post where they reprint press releases as if they are actual news. And the gimmick is, the reason why it’s appearing on this blog, is that these are university press releases on science stories. What could possibly go wrong there? After all, Steve Chaplin, a self-identified “science-writing PIO from an R1,” writes in a comment to Raeburn’s post: We write about peer-reviewed research accepted for publication or published by the world’s leading scientific journals after that research has been determined to be legitimate. Repeatability of new research is a publication requisite. I emphasized that last sentence myself because it was such a stunner. Do people really think that??? So I guess what he’s saying is, they don’t do press releases for articles from Psychological Science or the Journal of Personality and Social Psychology. But I wonder how the profs in the psych d

2 0.9705264 1059 andrew gelman stats-2011-12-14-Looking at many comparisons may increase the risk of finding something statistically significant by epidemiologists, a population with relatively low multilevel modeling consumption

Introduction: To understand the above title, see here. Masanao writes: This report claims that eating meat increases the risk of cancer. I’m sure you can’t read the page but you probably can understand the graphs. Different bars represent subdivision in the amount of the particular type of meat one consumes. And each chunk is different types of meat. Left is for male, right is for female. They claim that the difference is significant, but they are clearly not!! I’m for not eating much meat but this is just way too much… Here’s the graph: I don’t know what to think. If you look carefully you can find one or two statistically significant differences but overall the pattern doesn’t look so compelling. I don’t know what the top and bottom rows are, though. Overall, the pattern in the top row looks like it could represent a real trend, while the graphs on the bottom row look like noise. This could be a good example for our multiple comparisons paper. If the researchers won’t
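The multiple-comparisons worry in this example has a simple back-of-the-envelope version: with many independent comparisons, the chance of at least one nominally significant difference grows quickly even when every true difference is zero. This is a rough calculation; real comparisons are usually correlated, which changes the numbers but not the qualitative point.

```python
# Probability of at least one p < 0.05 among k independent null comparisons.
alpha = 0.05
for k in [1, 5, 20, 60]:
    p_any = 1 - (1 - alpha) ** k
    print(f"{k:3d} comparisons: P(at least one 'significant' result) = {p_any:.2f}")
```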

3 0.96941328 78 andrew gelman stats-2010-06-10-Hey, where’s my kickback?

Introduction: I keep hearing about textbook publishers who practically bribe instructors to assign their textbooks to students. And then I received this (unsolicited) email: You have recently been sent Pearson (Allyn & Bacon, Longman, Prentice Hall) texts to review for your summer and fall courses. As a thank you for reviewing our texts, I would like to invite you to participate in a brief survey (attached). If you have any questions about the survey, are not sure which books you have been sent, or if you would like to receive instructor’s materials, desk copies, etc. please let me know! If you have recently received your course assignments – let me know as well. Additionally, if you have decided to use a Pearson book in your summer or fall courses, I will provide you with an ISBN that will include discounts and resources for your students at no extra cost! All you have to do is answer the 3 simple questions on the attached survey and you will receive a $10.00 Dunkin Donuts gift card.

4 0.96919 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.

Introduction: Helen DeWitt links to this blog that reports on a study by Scott Bateman, Carl Gutwin, David McDine, Regan Mandryk, Aaron Genest, and Christopher Brooks that claims the following: Guidelines for designing information charts often state that the presentation should reduce ‘chart junk’–visual embellishments that are not essential to understanding the data. . . . we conducted an experiment that compared embellished charts with plain ones, and measured both interpretation accuracy and long-term recall. We found that people’s accuracy in describing the embellished charts was no worse than for plain charts, and that their recall after a two-to-three-week gap was significantly better. As the above-linked blogger puts it, “chartjunk is more useful than plain graphs. . . . Tufte is not going to like this.” I can’t speak for Ed Tufte, but I’m not gonna take this claim about chartjunk lying down. I have two points to make which I hope can stop the above-linked study from being sla

same-blog 5 0.96869171 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects


6 0.96760952 1974 andrew gelman stats-2013-08-08-Statistical significance and the dangerous lure of certainty

7 0.96723288 487 andrew gelman stats-2010-12-27-Alfred Kahn

8 0.96436942 2257 andrew gelman stats-2014-03-20-The candy weighing demonstration, or, the unwisdom of crowds

9 0.96062565 1122 andrew gelman stats-2012-01-16-“Groundbreaking or Definitive? Journals Need to Pick One”

10 0.9597075 1402 andrew gelman stats-2012-07-01-Ice cream! and temperature

11 0.95044255 1363 andrew gelman stats-2012-06-03-Question about predictive checks

12 0.94514221 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

13 0.94413424 1209 andrew gelman stats-2012-03-12-As a Bayesian I want scientists to report their data non-Bayesianly

14 0.93766725 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

15 0.93698674 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

16 0.93646538 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

17 0.93560016 2220 andrew gelman stats-2014-02-22-Quickies

18 0.9353528 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

19 0.93479288 2080 andrew gelman stats-2013-10-28-Writing for free

20 0.93452621 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants