andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-550 knowledge-graph by maker-knowledge-mining

550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled


meta info for this blog

Source: html

Introduction: Alex Tabarrok quotes Randall Morck and Bernard Yeung on difficulties with instrumental variables. This reminded me of some related things I’ve written. In the official story the causal question comes first and then the clever researcher comes up with an IV. I suspect that often it’s the other way around: you find a natural experiment and look at the consequences that flow from it. And maybe that’s not such a bad thing. See section 4 of this article. More generally, I think economists and political scientists are currently a bit overinvested in identification strategies. I agree with Heckman’s point (as I understand it) that ultimately we should be building models that work for us rather than always thinking we can get causal inference on the cheap, as it were, by some trick or another. (This is a point I briefly discuss in a couple places here and also in my recent paper for the causality volume that Don Green etc are involved with.) I recently had this discussion wi


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Alex Tabarrok quotes Randall Morck and Bernard Yeung on difficulties with instrumental variables. [sent-1, score-0.439]

2 This reminded me of some related things I’ve written. [sent-2, score-0.092]

3 In the official story the causal question comes first and then the clever researcher comes up with an IV. [sent-3, score-0.61]

4 I suspect that often it’s the other way around: you find a natural experiment and look at the consequences that flow from it. [sent-4, score-0.39]

5 More generally, I think economists and political scientists are currently a bit overinvested in identification strategies. [sent-7, score-0.392]

6 I agree with Heckman’s point (as I understand it) that ultimately we should be building models that work for us rather than always thinking we can get causal inference on the cheap, as it were, by some trick or another. [sent-8, score-0.56]

7 (This is a point I briefly discuss in a couple places here and also in my recent paper for the causality volume that Don Green etc are involved with.) [sent-9, score-0.522]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('instrumental', 0.241), ('identification', 0.216), ('aspire', 0.184), ('randall', 0.184), ('experiments', 0.164), ('stretch', 0.16), ('soooo', 0.155), ('natural', 0.152), ('discontinuity', 0.151), ('bernard', 0.148), ('causal', 0.147), ('heckman', 0.145), ('tabarrok', 0.14), ('iv', 0.135), ('flavor', 0.135), ('cheap', 0.13), ('generally', 0.129), ('flow', 0.128), ('mapping', 0.123), ('comes', 0.121), ('sometimes', 0.119), ('quantities', 0.119), ('green', 0.118), ('causality', 0.117), ('alex', 0.117), ('clever', 0.114), ('holds', 0.112), ('trick', 0.111), ('appeal', 0.111), ('thinking', 0.111), ('consequences', 0.11), ('volume', 0.11), ('briefly', 0.107), ('official', 0.107), ('great', 0.105), ('quotes', 0.102), ('point', 0.101), ('identified', 0.1), ('clean', 0.1), ('difficulties', 0.096), ('reminded', 0.092), ('strategy', 0.091), ('building', 0.09), ('bit', 0.09), ('inferences', 0.089), ('apparently', 0.089), ('month', 0.088), ('difficulty', 0.088), ('places', 0.087), ('currently', 0.086)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

2 0.18603991 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

Introduction: We had some questions on the Stan list regarding identification. The topic arose because people were fitting models with improper posterior distributions, the kind of model where there’s a ridge in the likelihood and the parameters are not otherwise constrained. I tried to help by writing something on Bayesian identifiability for the Stan list. Then Ben Goodrich came along and cleaned up what I wrote. I think this might be of interest to many of you so I’ll repeat the discussion here. Here’s what I wrote: Identification is actually a tricky concept and is not so clearly defined. In the broadest sense, a Bayesian model is identified if the posterior distribution is proper. Then one can do Bayesian inference and that’s that. No need to require a finite variance or even a finite mean, all that’s needed is a finite integral of the probability distribution. That said, there are some reasons why a stronger definition can be useful: 1. Weak identification. Suppose that, wit

3 0.18210606 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?

Introduction: Hendrik Juerges writes: I am an applied econometrician. The reason I am writing is that I am pondering a question for some time now and I am curious whether you have any views on it. One problem the practitioner of instrumental variables estimation faces is large standard errors even with very large samples. Part of the problem is of course that one estimates a ratio. Anyhow, more often than not, I and many other researchers I know end up with large point estimates and standard errors when trying IV on a problem. Sometimes some of us are lucky and get a statistically significant result. Those estimates that make it beyond the 2 standard error threshold are often ridiculously large (one famous example in my line of research being Lleras-Muney’s estimates of the 10% effect of one year of schooling on mortality). The standard defense here is that IV estimates the complier-specific causal effect (which is mathematically correct). But still, I find many of the IV results (including my

4 0.1232539 1778 andrew gelman stats-2013-03-27-My talk at the University of Michigan today 4pm

Introduction: Causality and Statistical Learning Andrew Gelman, Statistics and Political Science, Columbia University Wed 27 Mar, 4pm, Betty Ford Auditorium, Ford School of Public Policy Causal inference is central to the social and biomedical sciences. There are unresolved debates about the meaning of causality and the methods that should be used to measure it. As a statistician, I am trained to say that randomized experiments are a gold standard, yet I have spent almost all my applied career analyzing observational data. In this talk we shall consider various approaches to causal reasoning from the perspective of an applied statistician who recognizes the importance of causal identification yet must learn from available information. Two relevant papers are here and here.

5 0.12130314 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions: 1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

6 0.11921454 1962 andrew gelman stats-2013-07-30-The Roy causal model?

7 0.11710372 32 andrew gelman stats-2010-05-14-Causal inference in economics

8 0.11304925 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

9 0.1065795 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

10 0.10024637 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

11 0.099820852 1666 andrew gelman stats-2013-01-10-They’d rather be rigorous than right

12 0.098599881 1645 andrew gelman stats-2012-12-31-Statistical modeling, causal inference, and social science

13 0.092124999 2207 andrew gelman stats-2014-02-11-My talks in Bristol this Wed and London this Thurs

14 0.091071799 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

15 0.090612702 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

16 0.090142526 222 andrew gelman stats-2010-08-21-Estimating and reporting teacher effectivenss: Newspaper researchers do things that academic researchers never could

17 0.089225277 879 andrew gelman stats-2011-08-29-New journal on causal inference

18 0.088852704 2006 andrew gelman stats-2013-09-03-Evaluating evidence from published research

19 0.088581927 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

20 0.088482395 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.182), (1, -0.004), (2, -0.026), (3, -0.025), (4, -0.024), (5, -0.009), (6, -0.004), (7, -0.017), (8, 0.067), (9, 0.052), (10, -0.019), (11, 0.035), (12, -0.0), (13, -0.017), (14, 0.021), (15, -0.012), (16, -0.025), (17, 0.015), (18, -0.032), (19, 0.041), (20, -0.033), (21, -0.086), (22, 0.053), (23, 0.02), (24, 0.08), (25, 0.065), (26, 0.047), (27, -0.034), (28, -0.022), (29, 0.041), (30, 0.05), (31, -0.02), (32, -0.01), (33, -0.034), (34, -0.045), (35, -0.006), (36, 0.018), (37, 0.008), (38, 0.01), (39, 0.032), (40, -0.031), (41, -0.011), (42, -0.019), (43, 0.003), (44, -0.001), (45, 0.013), (46, -0.017), (47, 0.063), (48, -0.001), (49, 0.0)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96949059 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

2 0.86353815 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

Introduction: Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [C

3 0.85957271 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

Introduction: Joshua Vogelstein pointed me to this post by Michael Nielsen on how to teach Simpson’s paradox. I don’t know if Nielsen (and others) are aware that people have developed some snappy graphical methods for displaying Simpson’s paradox (and, more generally, aggregation issues). We do some of this in our Red State Blue State book, but before that was the BK plot, named by Howard Wainer after a 2001 paper by Stuart Baker and Barnett Kramer, although it apparently appeared earlier in a 1987 paper by Jeon, Chung, and Bae, and doubtless was made by various other people before then. Here’s Wainer’s graphical explication from 2002 (adapted from Baker and Kramer’s 2001 paper): Here’s the version from our 2007 article (with Boris Shor, Joe Bafumi, and David Park): But I recommend Wainer’s article (linked to above) as the first thing to read on the topic of presenting aggregation paradoxes in a clear and grabby way. P.S. Robert Long writes in: I noticed your post ab

4 0.85545838 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

Introduction: As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal of Sociology). In short, I have a taste for continuous rather than discrete models. As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. In many cases, I have found that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. Here is an example that came up in a recent blog discussion. Computer science student Elias Bareinboim gave the following example: “suppose we know nothing a

5 0.85341054 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide. Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

6 0.84347218 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

7 0.82222056 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

8 0.81338924 807 andrew gelman stats-2011-07-17-Macro causality

9 0.80963308 1336 andrew gelman stats-2012-05-22-Battle of the Repo Man quotes: Reid Hastie’s turn

10 0.80429798 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

11 0.79552066 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems

12 0.77957177 785 andrew gelman stats-2011-07-02-Experimental reasoning in social science

13 0.77869827 879 andrew gelman stats-2011-08-29-New journal on causal inference

14 0.77536505 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

15 0.7738713 1645 andrew gelman stats-2012-12-31-Statistical modeling, causal inference, and social science

16 0.75577486 307 andrew gelman stats-2010-09-29-“Texting bans don’t reduce crashes; effects are slight crash increases”

17 0.75447214 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference

18 0.74076736 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

19 0.73495018 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

20 0.73045909 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.137), (24, 0.12), (31, 0.012), (72, 0.228), (76, 0.022), (77, 0.012), (84, 0.012), (85, 0.012), (86, 0.034), (99, 0.322)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97489178 1179 andrew gelman stats-2012-02-21-“Readability” as freedom from the actual sensation of reading

Introduction: In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpoint writes: The much remarked “readability” of the book must have played a part in this smooth passage from the page to the screen, since “readability” has to do not only with freedom from obscurity but, paradoxically, with freedom from the actual sensation of reading [emphasis added]—of the tug and traction of words as they move thoughts into place in the mind. Requiring, in fact, the least reading, the most “readable” book allows its characters to slip easily through nets of words and into other forms. Popular art has been well defined by just this effortless movement from medium to medium, which is carried out, as Leslie Fiedler observed in relation to Uncle Tom’s Cabin, “without loss of intensity or alteration of meaning.” Isabel Archer rises from the page only in the hanging garments of Henry James’s prose, but Scarlett O’Hara is a free woman. Well put. I wish Pierpoint would come out with ano

2 0.96178931 1375 andrew gelman stats-2012-06-11-The unitary nature of consciousness: “It’s impossible to be insanely frustrated about 2 things at once”

Introduction: Dan Kahan writes: We all know it’s ridiculous to be able to go on an fMRI fishing trip & resort to post hoc story-telling to explain the “significant” correlations one (inevitably) observes (good fMRI studies *don’t* do this; only bad ones do– to the injury of the reputation of all the scholars doing good studies of this kind). But now one doesn’t even need correlations that support the post-hoc inferences one is drawing. This one’s good. Kahan continues: Headline: Religious Experiences Shrink Part of the Brain text: ” … The study, published March 30 [2011] in PLoS One, showed greater atrophy in the hippocampus in individuals who identify with specific religious groups as well as those with no religious affiliation … The results showed significantly greater hippocampal atrophy in individuals reporting a life-changing religious experience. In addition, they found significantly greater hippocampal atrophy among born-again Protestants, Catholics, and those with no religiou

3 0.95828837 1935 andrew gelman stats-2013-07-12-“A tangle of unexamined emotional impulses and illogical responses”

Introduction: Tyler Cowen posts the following note from a taxi driver: I learned very early on to never drive someone to their destination if it was a route they drove themselves, say to their home from the airport . . . Everyone prides themselves on driving the shortest route but they rarely do. . . . When I first started driving a cab, I drove the shortest route—always, I’m ethical—but people would accuse me of taking the long way because it wasn’t the way they drove . . . In the end, experts they consider themselves to be, people are a tangle of unexamined emotional impulses and illogical responses. I take a lot of rides to and from the airport, and I can assure you that a lot of taxi drivers don’t know the good routes. Once I had to start screaming from the back seat to stop the guy from getting on the BQE. I don’t “pride myself” on knowing a good route home from the airport, but I prefer the good route. I’m guessing that the taxi driver quoted above is subject to the same illusions

4 0.94773889 1381 andrew gelman stats-2012-06-16-The Art of Fielding

Introduction: I liked it; the reviews were well-deserved. It indeed is a cross between The Mysteries of Pittsburgh and The Universal Baseball Association, J. Henry Waugh, Prop. What struck me most, though, was the contrast with Indecision, the novel by Harbach’s associate, Benjamin Kunkel. As I noted a few years ago, Indecision was notable in that all the characters had agency. That is, each character had his or her own ideas and seemed to act on his or her own ideas, rather than merely carrying the plot along or providing scenery. In contrast, the most gripping drama in The Art of Fielding seems to be characters’ struggling with their plot-determined roles (hence the connection with Coover’s God-soaked baseball classic). Also notable to me was that the college-aged characters are not particularly obsessed with sex—I guess this is that easy-going hook-up culture I keep reading about—while at the same time, just about all the characters seem to be involved in serious drug addiction. I’ve re

5 0.94653755 737 andrew gelman stats-2011-05-30-Memorial Day question

Introduction: When I was a kid they shifted a bunch of holidays to Monday. (Not all the holidays: they kept New Year’s, Christmas, and July 4th on fixed dates, they kept Thanksgiving on a Thursday, and for some reason the shifted Veterans Day didn’t stick. But they successfully moved Washington’s Birthday, Memorial Day, and Columbus Day.) It makes sense to give people a 3-day weekend. I have no idea why they picked Monday rather than Friday, but either one would do, I suppose. My question is: if this Monday holiday thing was such a good idea, why did it take them so long to do it?

6 0.94357157 84 andrew gelman stats-2010-06-14-Is it 1930?

same-blog 7 0.9384436 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

8 0.93178326 68 andrew gelman stats-2010-06-03-…pretty soon you’re talking real money.

9 0.92946589 741 andrew gelman stats-2011-06-02-At least he didn’t prove a false theorem

10 0.92842233 500 andrew gelman stats-2011-01-03-Bribing statistics

11 0.92455918 2331 andrew gelman stats-2014-05-12-On deck this week

12 0.92303419 1244 andrew gelman stats-2012-04-03-Meta-analyses of impact evaluations of aid programs

13 0.91681361 83 andrew gelman stats-2010-06-13-Silly Sas lays out old-fashioned statistical thinking

14 0.91653883 1079 andrew gelman stats-2011-12-23-Surveys show Americans are populist class warriors, except when they aren’t

15 0.91434824 268 andrew gelman stats-2010-09-10-Fighting Migraine with Multilevel Modeling

16 0.90916854 2335 andrew gelman stats-2014-05-15-Bill Easterly vs. Jeff Sachs: What percentage of the recipients didn’t use the free malaria bed nets in Zambia?

17 0.90862507 624 andrew gelman stats-2011-03-22-A question about the economic benefits of universities

18 0.90534616 1524 andrew gelman stats-2012-10-07-An (impressive) increase in survival rate from 50% to 60% corresponds to an R-squared of (only) 1%. Counterintuitive, huh?

19 0.89748585 190 andrew gelman stats-2010-08-07-Mister P makes the big jump from the New York Times to the Washington Post

20 0.88810521 2045 andrew gelman stats-2013-09-30-Using the aggregate of the outcome variable as a group-level predictor in a hierarchical model