
2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions


meta information for this blog

Source: html

Introduction: Guido Imbens and I write: The statistical and econometrics literature on causality is more focused on “effects of causes” than on “causes of effects.” That is, in the standard approach it is natural to study the effect of a treatment, but it is not in general possible to define the causes of any particular outcome. This has led some researchers to dismiss the search for causes as “cocktail party chatter” that is outside the realm of science. We argue here that the search for causes can be understood within traditional statistical frameworks as a part of model checking and hypothesis generation. We argue that it can make sense to ask questions about the causes of effects, but the answers to these questions will be in terms of effects of causes. We also posted the paper on NBER so I’m hoping it will get some attention from economists. [Again, here's the open link to the paper.] I think what we have here is an important idea linking statistical and econometric models of caus


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Guido Imbens and I write: The statistical and econometrics literature on causality is more focused on “effects of causes” than on “causes of effects.” [sent-1, score-0.661]

2 That is, in the standard approach it is natural to study the effect of a treatment, but it is not in general possible to define the causes of any particular outcome. [sent-2, score-1.074]

3 This has led some researchers to dismiss the search for causes as “cocktail party chatter” that is outside the realm of science. [sent-3, score-1.384]

4 We argue here that the search for causes can be understood within traditional statistical frameworks as a part of model checking and hypothesis generation. [sent-4, score-1.673]

5 We argue that it can make sense to ask questions about the causes of effects, but the answers to these questions will be in terms of effects of causes. [sent-5, score-1.441]

6 We also posted the paper on NBER so I’m hoping it will get some attention from economists. [sent-6, score-0.265]

7 I think what we have here is an important idea linking statistical and econometric models of causal inference to how we think about causality more generally. [sent-8, score-0.72]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('causes', 0.63), ('causality', 0.232), ('chatter', 0.182), ('nber', 0.172), ('cocktail', 0.172), ('effects', 0.17), ('search', 0.163), ('argue', 0.16), ('guido', 0.159), ('realm', 0.154), ('imbens', 0.146), ('frameworks', 0.143), ('econometric', 0.136), ('questions', 0.122), ('dismiss', 0.122), ('linking', 0.116), ('econometrics', 0.112), ('understood', 0.108), ('statistical', 0.105), ('hoping', 0.102), ('answers', 0.096), ('focused', 0.095), ('define', 0.093), ('traditional', 0.093), ('led', 0.09), ('outside', 0.085), ('party', 0.085), ('checking', 0.084), ('attention', 0.084), ('posted', 0.079), ('treatment', 0.077), ('natural', 0.075), ('hypothesis', 0.074), ('causal', 0.073), ('terms', 0.072), ('open', 0.072), ('ask', 0.069), ('literature', 0.067), ('within', 0.064), ('generally', 0.064), ('link', 0.058), ('approach', 0.058), ('inference', 0.058), ('standard', 0.057), ('possible', 0.056), ('researchers', 0.055), ('effect', 0.054), ('study', 0.051), ('write', 0.05), ('part', 0.049)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

2 0.33517191 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide. Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

3 0.15016535 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions: 1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

4 0.12973085 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post, “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa

5 0.12478894 217 andrew gelman stats-2010-08-19-The “either-or” fallacy of believing in discrete models: an example of folk statistics

Introduction: Psychologists talk about “folk psychology”: ideas that make sense to us about how people think and behave, even if these ideas are not accurate descriptions of reality. And physicists talk about “folk physics” (for example, the idea that a thrown ball falls in a straight line and then suddenly drops, rather than following an approximate parabola). There’s also “folk statistics.” Some of the ideas of folk statistics are so strong that even educated people–even well-known researchers–can make these mistakes. One of the ideas of folk statistics that bothers me a lot is what might be called the “either/or fallacy”: the idea that if there are two possible stories, the truth has to be one or the other. I have often encountered the either/or fallacy in Bayesian statistics, for example the vast literature on “model selection” or “variable selection” or “model averaging” in which it is assumed that one of some pre-specified discrete set of models is the truth, and that this true model

6 0.12266607 64 andrew gelman stats-2010-06-03-Estimates of war deaths: Darfur edition

7 0.11876331 950 andrew gelman stats-2011-10-10-“Causality is almost always in doubt”

8 0.11311572 2245 andrew gelman stats-2014-03-12-More on publishing in journals

9 0.11109125 870 andrew gelman stats-2011-08-25-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

10 0.10745077 560 andrew gelman stats-2011-02-06-Education and Poverty

11 0.1071562 1801 andrew gelman stats-2013-04-13-Can you write a program to determine the causal order?

12 0.10574423 1913 andrew gelman stats-2013-06-24-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

13 0.10456659 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

14 0.10420804 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems

15 0.098080449 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

16 0.098069482 2007 andrew gelman stats-2013-09-03-Popper and Jaynes

17 0.097245544 1778 andrew gelman stats-2013-03-27-My talk at the University of Michigan today 4pm

18 0.09703812 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

19 0.093506709 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

20 0.09210214 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.128), (1, 0.042), (2, 0.011), (3, -0.082), (4, -0.023), (5, -0.001), (6, -0.066), (7, -0.005), (8, 0.076), (9, 0.01), (10, -0.053), (11, 0.046), (12, 0.046), (13, -0.031), (14, 0.028), (15, -0.006), (16, -0.043), (17, -0.002), (18, -0.056), (19, 0.078), (20, -0.031), (21, -0.09), (22, 0.068), (23, -0.004), (24, 0.052), (25, 0.093), (26, -0.025), (27, -0.008), (28, -0.052), (29, 0.035), (30, -0.012), (31, -0.054), (32, -0.01), (33, -0.004), (34, -0.051), (35, 0.017), (36, 0.017), (37, -0.028), (38, 0.012), (39, 0.03), (40, 0.016), (41, 0.027), (42, 0.011), (43, 0.028), (44, -0.028), (45, 0.025), (46, -0.03), (47, -0.043), (48, -0.02), (49, -0.033)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96948808 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

2 0.86007601 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide. Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

3 0.83373177 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions: 1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

4 0.79419225 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

Introduction: Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [C

5 0.78369063 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A

Introduction: Lei Liu writes: I am working with clinicians in infectious disease and international health to study the (possible causal) relation between malnutrition and virus infection episodes (e.g., diarrhea) in babies in developing countries. Basically the clinicians are interested in two questions: does malnutrition cause more diarrhea episodes? does diarrhea lead to malnutrition? The malnutrition status is indicated by height and weight (adjusted, HAZ and WAZ measures) observed every 3 months from birth to 1 year. They also recorded the time of each diarrhea episode during the 1 year follow-up period. They have very solid datasets for analysis. As you can see, this is almost like a chicken and egg problem. I am a layman to causal inference. The method I use is just to do some simple regression. For example, to study the causal relation from malnutrition to diarrhea episodes, I use binary variable (diarrhea yes/no during months 0-3) as response, and use the HAZ at month 0 as covariate

6 0.7660746 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

7 0.760234 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

8 0.73780966 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

9 0.72858322 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

10 0.72470969 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

11 0.72243029 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems

12 0.71668017 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference

13 0.70659631 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

14 0.70312256 879 andrew gelman stats-2011-08-29-New journal on causal inference

15 0.69725639 807 andrew gelman stats-2011-07-17-Macro causality

16 0.66628915 1778 andrew gelman stats-2013-03-27-My talk at the University of Michigan today 4pm

17 0.65296173 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

18 0.64912188 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

19 0.64788139 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

20 0.64727509 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.026), (15, 0.019), (17, 0.069), (21, 0.099), (24, 0.164), (31, 0.091), (46, 0.052), (64, 0.06), (99, 0.291)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97412932 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

2 0.91853333 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

Introduction: Andy McKenzie writes: In their March 9 “counterpoint” in nature biotech to the prospect that we should try to integrate more sources of data in clinical practice (see “point” arguing for this), Isaac Kohane and David Margulies claim that, “Finally, how much better is our new knowledge than older knowledge? When is the incremental benefit of a genomic variant(s) or gene expression profile relative to a family history or classic histopathology insufficient and when does it add rather than subtract variance?” Perhaps I am mistaken (thus this email), but it seems that this claim runs contra to the definition of conditional probability. That is, if you have a hierarchical model, and the family history / classical histopathology already suggests a parameter estimate with some variance, how could the new genomic info possibly increase the variance of that parameter estimate? Surely the question is how much variance the new genomic info reduces and whether it therefore justifies t

3 0.91714549 2264 andrew gelman stats-2014-03-24-On deck this month

Introduction: Actually, more like the next month and a half . . . I just have this long backlog so I thought I might as well share it with you: Empirical implications of Empirical Implications of Theoretical Models A statistical graphics course and statistical graphics advice What property is important in a risk prediction model? Discrimination or calibration? Beyond the Valley of the Trolls Science tells us that fast food lovers are more likely to marry other fast food lovers References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling Adjudicating between alternative interpretations of a statistical interaction? The most-cited statistics papers ever American Psychological Society announces a new journal Am I too negative? As the boldest experiment in journalism history, you admit you made a mistake Personally, I’d rather go with Teragram Bizarre academic spam An old discussion of food deserts Skepticism about a published cl

4 0.91645366 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

Introduction: Rob Tibshirani writes: Hastie et al. (2001) coined the informal “Bet on Sparsity” principle. The l1 methods assume that the truth is sparse, in some basis. If the assumption holds true, then the parameters can be efficiently estimated using l1 penalties. If the assumption does not hold—so that the truth is dense—then no method will be able to recover the underlying model without a large amount of data per parameter. I’ve earlier expressed my full and sincere appreciation for Hastie and Tibshirani’s work in this area. Now I’d like to briefly comment on the above snippet. The question is, how do we think about the “bet on sparsity” principle in a world where the truth is dense? I’m thinking here of social science, where no effects are clean and no coefficient is zero (see page 960 of this article or various blog discussions in the past few years), where every contrast is meaningful—but some of these contrasts might be lost in the noise with any realistic size of data.

5 0.91599178 1459 andrew gelman stats-2012-08-15-How I think about mixture models

Introduction: Larry Wasserman refers to finite mixture models as “beasts” and jokes that they “should be avoided at all costs.” I’ve thought a lot about mixture models, ever since using them in an analysis of voting patterns that was published in 1990. First off, I’d like to say that our model was useful so I’d prefer not to pay the cost of avoiding it. (For a quick description of our mixture model and its context, see pp. 379-380 of my article in the Jim Berger volume.) Actually, our case was particularly difficult because we were not even fitting a mixture model to data, we were fitting it to latent data and using the model to perform partial pooling. My difficulties in trying to fit this model inspired our discussion of mixture models in Bayesian Data Analysis (page 109 in the second edition, in the section on “Counterexamples to the theorems”). I agree with Larry that if you’re fitting a mixture model, it’s good to be aware of the problems that arise if you try to estimate

6 0.91284192 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?

7 0.91205192 1989 andrew gelman stats-2013-08-20-Correcting for multiple comparisons in a Bayesian regression model

8 0.91204393 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

9 0.91193062 486 andrew gelman stats-2010-12-26-Age and happiness: The pattern isn’t as clear as you might think

10 0.91099608 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults

11 0.90902734 1401 andrew gelman stats-2012-06-30-David Hogg on statistics

12 0.90848827 2207 andrew gelman stats-2014-02-11-My talks in Bristol this Wed and London this Thurs

13 0.90791774 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

14 0.90767944 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

15 0.90704352 242 andrew gelman stats-2010-08-29-The Subtle Micro-Effects of Peacekeeping

16 0.90671062 1502 andrew gelman stats-2012-09-19-Scalability in education

17 0.90631175 992 andrew gelman stats-2011-11-05-Deadwood in the math curriculum

18 0.90436971 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust

19 0.90297246 2111 andrew gelman stats-2013-11-23-Tables > figures yet again

20 0.90236866 2159 andrew gelman stats-2014-01-04-“Dogs are sensitive to small variations of the Earth’s magnetic field”