1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking


Introduction: As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal of Sociology). In short, I have a taste for continuous rather than discrete models. As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. In many cases, I have found that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. Here is an example that came up in a recent blog discussion. Computer science student Elias Bareinboim gave the following example: “suppose we know nothing a


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. [sent-3, score-0.222]

2 In many cases, I have found that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. [sent-4, score-0.731]

3 Here is an example that came up in a recent blog discussion. [sent-5, score-0.096]

4 Computer science student Elias Bareinboim gave the following example: “suppose we know nothing about the world, except that one causal link is missing (e. [sent-6, score-0.132]

5 ” Bareinboim describes this as a “transparent set of assumptions” but to me it’s not transparent at all. [sent-9, score-0.13]

6 But to resolve my problem, I’ll bring out my tools to understand it. [sent-11, score-0.054]

7 What does it mean to say that “skin color does not affect . [sent-12, score-0.449]

8 For example, I could go to the beach and get a tan. [sent-18, score-0.13]

9 This could well negatively affect my cognitive skills (we can call this the Jersey Shore theory). [sent-19, score-0.369]

10 Or maybe at conception you could switch some of my genes around. [sent-20, score-0.126]

11 Assuming this sort of manipulation were technically possible, it would change things about me other than skin color. [sent-21, score-0.701]

12 Similarly, tanning has effects other than changing my skin; it also puts me at the beach (or the tanning salon) rather than in the library, where I might be improving my intelligence. [sent-23, score-0.595]

13 If you want to understand the effect of some observed condition X on an outcome Y, you manipulate some instrument I that affects X, then you look at the effects of I on X and on Y. [sent-25, score-0.535]

14 The example X = skin color is typical in that there are different possible instruments that can be imagined, and these will have different effects on Y. [sent-26, score-1.343]

15 This is the “potential outcome” approach: we consider possible outcomes under different potential treatments (that is, different assignments of the instrument). [sent-28, score-0.607]

16 For an applied example, you can see our 1990 article on incumbency advantage, where we were explicit in defining conditions and potential outcomes. [sent-29, score-0.279]

17 The point is that in studying such causal relations, it can be helpful to define the manipulation or instrument explicitly, even if it only has a theoretical existence. [sent-30, score-0.717]

18 In that sense, instrumental variables and potential outcomes are a sort of accounting principle, giving us a tool to define as precisely as possible what we are studying. [sent-31, score-0.777]
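
The recipe in sentence 13 (manipulate an instrument I that affects X, then look at the effects of I on X and on Y) can be sketched with a small simulation. This is only an illustration and not from the original post: the data-generating model, the coefficients, and the function names `simulate`, `wald_estimate`, and `naive_slope` are all assumptions made for the example. It also assumes that I affects Y only through X, which is exactly the condition the post warns can fail (tanning changes skin color but also puts you at the beach).

```python
import random


def simulate(n=50_000, beta=0.5, seed=0):
    """Generate (I, X, Y) rows from a made-up model with a hidden confounder U."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        i = rng.randint(0, 1)                      # randomized instrument I
        u = rng.gauss(0, 1)                        # unobserved confounder of X and Y
        x = 1.0 * i + u + rng.gauss(0, 1)          # I shifts X by 1 on average
        y = beta * x + 2.0 * u + rng.gauss(0, 1)   # assumed true effect of X on Y is beta
        rows.append((i, x, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wald_estimate(rows):
    """Effect of I on Y divided by effect of I on X (the simple IV ratio)."""
    x1 = mean(x for i, x, _ in rows if i == 1)
    x0 = mean(x for i, x, _ in rows if i == 0)
    y1 = mean(y for i, _, y in rows if i == 1)
    y0 = mean(y for i, _, y in rows if i == 0)
    return (y1 - y0) / (x1 - x0)


def naive_slope(rows):
    """Least-squares slope of Y on X, ignoring the instrument and the confounder."""
    mx = mean(x for _, x, _ in rows)
    my = mean(y for _, _, y in rows)
    cov = mean((x - mx) * (y - my) for _, x, y in rows)
    var = mean((x - mx) ** 2 for _, x, _ in rows)
    return cov / var


if __name__ == "__main__":
    data = simulate()
    print("naive slope of Y on X:", round(naive_slope(data), 3))
    print("IV (Wald) ratio:      ", round(wald_estimate(data), 3))
```

In this made-up setup the naive slope absorbs the confounder, while the ratio of the two instrument effects should land near the assumed `beta`. A different instrument, with its own side effects on Y, would define a different quantity, which is the post's point about stating the manipulation explicitly.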


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('skin', 0.521), ('instrument', 0.236), ('color', 0.235), ('affect', 0.214), ('tanning', 0.181), ('potential', 0.165), ('instrumental', 0.162), ('bareinboim', 0.149), ('intellectual', 0.146), ('causal', 0.132), ('beach', 0.13), ('transparent', 0.13), ('instruments', 0.123), ('capacity', 0.12), ('outcomes', 0.118), ('manipulation', 0.115), ('effects', 0.103), ('possible', 0.101), ('example', 0.096), ('variables', 0.09), ('cognitive', 0.09), ('define', 0.084), ('alteration', 0.083), ('shore', 0.083), ('different', 0.082), ('studying', 0.08), ('salon', 0.078), ('outcome', 0.077), ('imagined', 0.075), ('sloman', 0.075), ('undefined', 0.075), ('hypothesize', 0.072), ('elias', 0.072), ('theoretical', 0.07), ('jersey', 0.066), ('conception', 0.066), ('frameworks', 0.065), ('negatively', 0.065), ('technically', 0.065), ('manipulate', 0.063), ('review', 0.061), ('genes', 0.06), ('assignments', 0.059), ('incumbency', 0.058), ('accounting', 0.057), ('affects', 0.056), ('explicit', 0.056), ('detect', 0.055), ('implausible', 0.055), ('resolve', 0.054)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

2 0.20543863 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post , “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa

3 0.1472301 2309 andrew gelman stats-2014-04-28-Crowdstorming a dataset

Introduction: Raphael Silberzahn writes: Brian Nosek, Eric Luis Uhlmann, Dan Martin, and I just launched a project through the Open Science Center we think you’ll find interesting. The basic idea is to “Crowdstorm a Dataset”. Multiple independent analysts are recruited to test the same hypothesis on the same data set in whatever manner they see as best. If everyone comes up with the same results, then scientists can speak with one voice. If not, the subjectivity and conditionality of results on analysis strategy is made transparent. For this first project, we are crowdstorming the question of whether soccer referees are more likely to give red cards to dark skin toned players than light skin toned players. The full project description is here . If you’re interested in being one of the crowdstormer analysts, you can register here . All analysts will receive an author credit on the final paper. We would love to have Bayesian analysts represented in the group. Also, please feel free to let

4 0.14223346 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?

Introduction: Jacob Felson writes: Say we have a statistically significant interaction in non-experimental data between two continuous predictors, X and Z and it is unclear which variable is primarily a cause and which variable is primarily a moderator. One person might find it more plausible to think of X as a cause and Z as a moderator and another person may think the reverse more plausible. My question then is whether there is are any set of rules or heuristics you could recommend to help adjudicate between alternate perspectives on such an interaction term. My reply: I think in this setting, it would make sense to think about different interventions, some of which affect X, others of which affect Z, others of which affect both, and go from there. Rather than trying to isolate a single causal path, consider different cases of forward casual inference. My guess is that the different stories regarding moderators etc. could motivate different thought experiments (and, ultimately, differe

5 0.12681063 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide . Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

6 0.12674703 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

7 0.118553 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos

8 0.11391343 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

9 0.11390752 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

10 0.1061545 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

11 0.10156647 1962 andrew gelman stats-2013-07-30-The Roy causal model?

12 0.10126329 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox

13 0.10038853 807 andrew gelman stats-2011-07-17-Macro causality

14 0.096900135 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?

15 0.090612702 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

16 0.089439794 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

17 0.08856979 103 andrew gelman stats-2010-06-22-Beach reads, Proust, and income tax

18 0.088526599 879 andrew gelman stats-2011-08-29-New journal on causal inference

19 0.087927908 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions

20 0.084741198 785 andrew gelman stats-2011-07-02-Experimental reasoning in social science


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.169), (1, 0.006), (2, -0.0), (3, -0.065), (4, -0.005), (5, -0.008), (6, -0.04), (7, 0.001), (8, 0.062), (9, 0.062), (10, -0.04), (11, 0.016), (12, 0.035), (13, -0.027), (14, 0.05), (15, 0.036), (16, -0.014), (17, 0.012), (18, -0.045), (19, 0.063), (20, -0.035), (21, -0.055), (22, 0.065), (23, 0.022), (24, 0.072), (25, 0.086), (26, 0.033), (27, -0.024), (28, -0.026), (29, 0.017), (30, 0.01), (31, -0.005), (32, -0.06), (33, -0.01), (34, -0.035), (35, -0.016), (36, 0.006), (37, -0.026), (38, -0.017), (39, 0.036), (40, -0.013), (41, -0.005), (42, -0.001), (43, -0.018), (44, -0.046), (45, 0.039), (46, 0.012), (47, 0.034), (48, -0.028), (49, 0.037)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96881616 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

2 0.89094478 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide . Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

3 0.84450591 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

Introduction: Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [C

4 0.8374387 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post , “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa

5 0.81373167 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A

Introduction: Lei Liu writes: I am working with clinicians in infectious disease and international health to study the (possible causal) relation between malnutrition and virus infection episodes (e.g., diarrhea) in babies in developing countries. Basically the clinicians are interested in two questions: does malnutrition cause more diarrhea episodes? does diarrhea lead to malnutrition? The malnutrition status is indicated by height and weight (adjusted, HAZ and WAZ measures) observed every 3 months from birth to 1 year. They also recorded the time of each diarrhea episode during the 1 year follow-up period. They have very solid datasets for analysis. As you can see, this is almost like a chicken and egg problem. I am a layman to causal inference. The method I use is just to do some simple regression. For example, to study the causal relation from malnutrition to diarrhea episodes, I use binary variable (diarrhea yes/no during months 0-3) as response, and use the HAZ at month 0 as covariate

6 0.8136605 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

7 0.809385 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

8 0.80771178 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

9 0.80593956 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

10 0.79655063 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

11 0.77220666 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?

12 0.76406121 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

13 0.75814956 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

14 0.75461859 1336 andrew gelman stats-2012-05-22-Battle of the Repo Man quotes: Reid Hastie’s turn

15 0.75300759 1133 andrew gelman stats-2012-01-21-Judea Pearl on why he is “only a half-Bayesian”

16 0.75298989 287 andrew gelman stats-2010-09-20-Paul Rosenbaum on those annoying pre-treatment variables that are sort-of instruments and sort-of covariates

17 0.75187749 1645 andrew gelman stats-2012-12-31-Statistical modeling, causal inference, and social science

18 0.75046861 807 andrew gelman stats-2011-07-17-Macro causality

19 0.74047571 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

20 0.73553443 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.012), (9, 0.04), (11, 0.012), (16, 0.065), (21, 0.044), (24, 0.145), (43, 0.012), (45, 0.017), (55, 0.012), (58, 0.013), (72, 0.03), (76, 0.018), (78, 0.119), (86, 0.022), (95, 0.044), (99, 0.26)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95454168 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

2 0.93469357 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?

Introduction: A few months ago I questioned Dan Ariely’s belief that Google is the voice of the people by reporting the following bizarre options that Google gave to complete the simplest search I could think of: Several commenters gave informed discussions about what was going on in Google’s program. Maybe things are better now, though? The latest version seems much more reasonable: (Aleks sent this to me, then I checked on my own computer and got the same thing.)

3 0.92285621 1580 andrew gelman stats-2012-11-16-Stantastic!

Introduction: Richard McElreath writes: I’ve been translating a few ongoing data analysis projects into Stan code, mostly with success. The most important for me right now has been a hierarchical zero-inflated gamma problem. This a “hurdle” model, in which a bernoulli GLM produces zeros/nonzeros, and then a gamma GLM produces the nonzero values, using varying effects correlated with those in the bernoulli process. The data are 20 years of human foraging returns from a subsistence hunting population in Paraguay (the Ache), comprising about 15k hunts in total (Hill & Kintigh. 2009. Current Anthropology 50:369-377). Observed values are kilograms of meat returned to camp. The more complex models contain a 147-by-9 matrix of varying effects (147 unique hunters), as well as imputation of missing values. Originally, I had written the sampler myself in raw R code. It was very slow, but I knew what it was doing at least. Just before Stan version 1.0 was released, I had managed to get JAGS to do it a

4 0.92252874 2025 andrew gelman stats-2013-09-15-The it-gets-me-so-angry-I-can’t-deal-with-it threshold

Introduction: I happened to be looking at Slate (I know, I know, but I’d already browsed Gawker and I was desperately avoiding doing real work) and came across this article by Alice Gregory entitled, “I Read Everything Janet Malcolm Ever Published. I’m in awe of her.” I too think Malcolm is an excellent writer, but (a) I’m not happy that she gets off the hook for faking quotes , and (b) I’m really really not happy with her apparent attempt to try to force a mistrial for a convicted killer. I just can’t get over that, for some reason. I can appreciate Picasso’s genius even though he beat his wives or whatever it was he did, I can enjoy the music of Jackson Browne, etc. But for some reason this Malcolm stuff sticks in my craw. There’s no deep meaning to this—I recognize it is a somewhat irrational attitude on my part, I just wanted to share it with you.

5 0.920708 775 andrew gelman stats-2011-06-21-Fundamental difficulty of inference for a ratio when the denominator could be positive or negative

Introduction: Ratio estimates are common in statistics. In survey sampling, the ratio estimate is when you use y/x to estimate Y/X (using the notation in which x,y are totals of sample measurements and X,Y are population totals). In textbook sampling examples, the denominator X will be an all-positive variable, something that is easy to measure and is, ideally, close to proportional to Y. For example, X is last year’s sales and Y is this year’s sales, or X is the number of people in a cluster and Y is some count. Ratio estimation doesn’t work so well if X can be either positive or negative. More generally we can consider any estimate of a ratio, with no need for a survey sampling context. The problem with estimating Y/X is that the very interpretation of Y/X can change completely if the sign of X changes. Everything is ok for a point estimate: you get X.hat and Y.hat, you can take the ratio Y.hat/X.hat, no problem. But the inference falls apart if you have enough uncertainty in X.hat th

6 0.91962785 639 andrew gelman stats-2011-03-31-Bayes: radical, liberal, or conservative?

7 0.91737753 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

8 0.91638196 1881 andrew gelman stats-2013-06-03-Boot

9 0.91495568 431 andrew gelman stats-2010-11-26-One fun thing about physicists . . .

10 0.9113676 2174 andrew gelman stats-2014-01-17-How to think about the statistical evidence when the statistical evidence can’t be conclusive?

11 0.91095603 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again

12 0.91000909 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

13 0.90841258 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems

14 0.90783197 2080 andrew gelman stats-2013-10-28-Writing for free

15 0.90770113 2112 andrew gelman stats-2013-11-25-An interesting but flawed attempt to apply general forecasting principles to contextualize attitudes toward risks of global warming

16 0.90756911 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

17 0.90748537 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions

18 0.90732187 1117 andrew gelman stats-2012-01-13-What are the important issues in ethics and statistics? I’m looking for your input!

19 0.90696061 400 andrew gelman stats-2010-11-08-Poli sci plagiarism update, and a note about the benefits of not caring

20 0.90683341 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time