andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-807 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: David Backus writes: This is from my area of work, macroeconomics. The suggestion here is that the economy is growing slowly because consumers aren’t spending money. But how do we know it’s not the reverse: that consumers are spending less because the economy isn’t doing well. As a teacher, I can tell you that it’s almost impossible to get students to understand that the first statement isn’t obviously true. What I’d call the demand-side story (more spending leads to more output) is everywhere, including this piece, from the usually reliable David Leonhardt. This whole situation reminds me of the story of the village whose inhabitants support themselves by taking in each others’ laundry. I guess we’re rich enough in the U.S. that we can stay afloat for a few decades just buying things from each other? Regarding the causal question, I’d like to move away from the idea of “Does A cause B or does B cause A” and toward a more intervention-based framework (Rubin’s model for
sentIndex sentText sentNum sentScore
1 The suggestion here is that the economy is growing slowly because consumers aren’t spending money. [sent-2, score-1.241]
2 But how do we know it’s not the reverse: that consumers are spending less because the economy isn’t doing well. [sent-3, score-1.012]
3 As a teacher, I can tell you that it’s almost impossible to get students to understand that the first statement isn’t obviously true. [sent-4, score-0.137]
4 What I’d call the demand-side story (more spending leads to more output) is everywhere, including this piece, from the usually reliable David Leonhardt. [sent-5, score-0.544]
5 This whole situation reminds me of the story of the village whose inhabitants support themselves by taking in each others’ laundry. [sent-6, score-0.295]
6 that we can stay afloat for a few decades just buying things from each other? [sent-9, score-0.289]
7 Regarding the causal question, I’d like to move away from the idea of “Does A cause B or does B cause A” and toward a more intervention-based framework (Rubin’s model for causal inference) in which we consider effects of potential actions. [sent-10, score-0.762]
8 Considering the example above, a focus on interventions clarifies some of the causal questions. [sent-12, score-0.576]
9 For example, if you want to talk about the effect of consumers spending less, you have to consider what interventions you have in mind that would cause consumers to spend more. [sent-13, score-1.882]
10 One such intervention is the famous helicopter drop but there are others, I assume. [sent-14, score-0.157]
11 Conversely, if you want to talk about the poor economy affecting spending, you have to consider what interventions you have in mind to make the economy go better. [sent-15, score-1.224]
12 In that sense, instrumental variables are a fundamental way to think of just about all causal questions of this sort. [sent-16, score-0.377]
13 You start with variables A and B (for example, consumer spending and economic growth). [sent-17, score-0.564]
14 Instead of picturing A causing B or B causing A, you consider various treatments that can affect both A and B. [sent-18, score-0.684]
15 As I never tire of saying, my knowledge of macroeconomics hasn’t developed since I took econ class in 11th grade. [sent-20, score-0.357]
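The intervention-based view in sentences 7 to 14 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not anything from the original post: it assumes a linear system in which spending (A) and growth (B) each affect the other, plus a randomly assigned binary treatment Z (a stand-in for something like a helicopter drop) that acts directly on A only. The coefficient values, function names, and the simple ratio-of-covariances (instrumental variables) estimator are all hypothetical choices made for illustration.

```python
import random


def simulate(n=20000, a_on_b=0.5, b_on_a=0.8, z_on_a=1.0, seed=0):
    """Draw (z, a, b) triples from an assumed two-way linear system.

    Structural equations (hypothetical, chosen for illustration):
        a = b_on_a * b + z_on_a * z + u
        b = a_on_b * a + v
    Solving the pair gives the reduced form used below.
    """
    rng = random.Random(seed)
    denom = 1.0 - a_on_b * b_on_a  # must be nonzero for a solution to exist
    rows = []
    for _ in range(n):
        z = 1.0 if rng.random() < 0.5 else 0.0  # randomized intervention
        u = rng.gauss(0.0, 1.0)  # unobserved shock to spending
        v = rng.gauss(0.0, 1.0)  # unobserved shock to growth
        a = (z_on_a * z + u + b_on_a * v) / denom
        b = a_on_b * a + v
        rows.append((z, a, b))
    return rows


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def naive_slope(rows):
    """Regress B on A directly; feedback from B to A contaminates this."""
    a = [r[1] for r in rows]
    b = [r[2] for r in rows]
    return covariance(a, b) / covariance(a, a)


def iv_slope(rows):
    """Effect of A on B estimated only through variation induced by Z."""
    z = [r[0] for r in rows]
    a = [r[1] for r in rows]
    b = [r[2] for r in rows]
    return covariance(z, b) / covariance(z, a)


if __name__ == "__main__":
    data = simulate()
    print("assumed effect of A on B:", 0.5)
    print("naive regression slope:  ", round(naive_slope(data), 3))
    print("intervention-based slope:", round(iv_slope(data), 3))
```

In this toy setup the naive slope mixes the two directions of causation, while the estimate that uses only the variation created by the intervention should land near the assumed effect of A on B. Whether any real policy satisfies the assumption that it touches spending alone is, of course, the hard part.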
wordName wordTfidf (topN-words)
[('spending', 0.389), ('consumers', 0.371), ('interventions', 0.267), ('economy', 0.252), ('causal', 0.202), ('causing', 0.168), ('consider', 0.155), ('cause', 0.13), ('afloat', 0.126), ('inhabitants', 0.126), ('tire', 0.126), ('mind', 0.115), ('picturing', 0.114), ('backus', 0.11), ('clarifies', 0.107), ('village', 0.099), ('affecting', 0.099), ('david', 0.092), ('variables', 0.092), ('macroeconomics', 0.089), ('buying', 0.088), ('everywhere', 0.087), ('reliable', 0.085), ('isn', 0.085), ('talk', 0.084), ('intervention', 0.083), ('instrumental', 0.083), ('consumer', 0.083), ('conceptual', 0.082), ('slowly', 0.082), ('treatments', 0.079), ('conversely', 0.078), ('output', 0.078), ('hasn', 0.077), ('growing', 0.077), ('reverse', 0.075), ('econ', 0.075), ('stay', 0.075), ('growth', 0.075), ('drop', 0.074), ('grade', 0.074), ('teacher', 0.073), ('causes', 0.073), ('others', 0.072), ('impossible', 0.071), ('suggestion', 0.07), ('story', 0.07), ('developed', 0.067), ('piece', 0.067), ('obviously', 0.066)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 807 andrew gelman stats-2011-07-17-Macro causality
2 0.39158714 814 andrew gelman stats-2011-07-21-The powerful consumer?
Introduction: Economist David Backus writes: A casual reader of economic news can’t help but get the impression that the way to get the economy moving is to have people spend more — consume more, in the language of macroeconomics. Seems obvious, doesn’t it? At the risk of making the obvious complicated, I’d say it’s not so obvious. It’s also not obvious that consumption has gone down since the crisis, or that saving has gone up. So what’s going on with the labor market? I’ll get to the rest of the explanation, but first some background. The other day, I posted this remark from Backus: This is from my area of work, macroeconomics. The suggestion here is that the economy is growing slowly because consumers aren’t spending money. But how do we know it’s not the reverse: that consumers are spending less because the economy isn’t doing well. As a teacher, I can tell you that it’s almost impossible to get students to understand that the first statement isn’t obviously true
3 0.20807692 67 andrew gelman stats-2010-06-03-More on that Dartmouth health care study
Introduction: Hank Aaron at the Brookings Institution, who knows a lot more about policy than I do, had some interesting comments on the recent New York Times article about problems with the Dartmouth health care atlas, which I discussed a few hours ago. Aaron writes that much of the criticism in that newspaper article was off-base, but that there are real difficulties in translating the Dartmouth results (finding little relation between spending and quality of care) to cost savings in the real world. Aaron writes: The Dartmouth research, showing huge variation in the use of various medical procedures and large variations in per patient spending under Medicare, has been a revelation and a useful one. There is no way to explain such variation on medical grounds and it is problematic. But readers, including my former colleague Orszag, have taken an oversimplistic view of what the numbers mean and what to do about them. There are three really big problems with the common interpreta
Introduction: Consider two broad classes of inferential questions: 1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.
5 0.1917893 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
Introduction: Macartan Humphreys pointed me to this excellent guide. Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s
6 0.14132255 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?
9 0.12073383 1854 andrew gelman stats-2013-05-13-A Structural Comparison of Conspicuous Consumption in China and the United States
10 0.11086779 394 andrew gelman stats-2010-11-05-2010: What happened?
11 0.1040019 879 andrew gelman stats-2011-08-29-New journal on causal inference
12 0.10390973 1396 andrew gelman stats-2012-06-27-Recently in the sister blog
13 0.10048379 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust
15 0.099704385 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population
16 0.098202825 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll
17 0.093382865 1936 andrew gelman stats-2013-07-13-Economic policy does not occur in a political vacuum
18 0.091870219 560 andrew gelman stats-2011-02-06-Education and Poverty
19 0.091305785 84 andrew gelman stats-2010-06-14-Is it 1930?
20 0.089900613 1388 andrew gelman stats-2012-06-22-Americans think economy isn’t so bad in their city but is crappy nationally and globally
topicId topicWeight
[(0, 0.155), (1, -0.028), (2, 0.052), (3, -0.006), (4, -0.019), (5, 0.038), (6, 0.005), (7, 0.033), (8, 0.044), (9, 0.057), (10, -0.081), (11, 0.051), (12, -0.001), (13, -0.014), (14, 0.037), (15, -0.05), (16, 0.03), (17, 0.001), (18, -0.073), (19, 0.107), (20, -0.025), (21, -0.07), (22, 0.09), (23, 0.025), (24, 0.047), (25, 0.111), (26, 0.016), (27, -0.064), (28, -0.004), (29, 0.101), (30, 0.056), (31, -0.019), (32, -0.025), (33, 0.004), (34, -0.091), (35, -0.02), (36, -0.014), (37, 0.012), (38, 0.071), (39, 0.036), (40, -0.024), (41, 0.024), (42, -0.031), (43, 0.026), (44, 0.039), (45, 0.033), (46, 0.038), (47, 0.016), (48, -0.023), (49, -0.016)]
simIndex simValue blogId blogTitle
same-blog 1 0.97547358 807 andrew gelman stats-2011-07-17-Macro causality
2 0.79318839 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
4 0.72080386 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population
Introduction: Jeff Walker writes: Your blog has skirted around the value of observational studies and chided folks for using causal language when they only have associations but I sense that you ultimately find value in these associations. I would love for you to expand this thought in a blog. Specifically: Does a measured association “suggest” a causal relationship? Are measured associations a good and efficient way to narrow the field of things that should be studied? Of all the things we should pursue, should we start with the stuff that has some largish measured association? Certainly many associations are not directly causal but due to joint association. Similarly, there must be many variables that are directly causally associated (A -> B) but the effect, measured as an association, is masked by confounders. So if we took the “measured associations are worthwhile” approach, we’d never or rarely find the masked effects. But I’d also like to know if one is more likely to find a large causal
5 0.71071166 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled
Introduction: Alex Tabarrok quotes Randall Morck and Bernard Yeung on difficulties with instrumental variables. This reminded me of some related things I’ve written. In the official story the causal question comes first and then the clever researcher comes up with an IV. I suspect that often it’s the other way around: you find a natural experiment and look at the consequences that flow from it. And maybe that’s not such a bad thing. See section 4 of this article. More generally, I think economists and political scientists are currently a bit overinvested in identification strategies. I agree with Heckman’s point (as I understand it) that ultimately we should be building models that work for us rather than always thinking we can get causal inference on the cheap, as it were, by some trick or another. (This is a point I briefly discuss in a couple places here and also in my recent paper for the causality volume that Don Green etc are involved with.) I recently had this discussion wi
6 0.70922047 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)
7 0.67808676 814 andrew gelman stats-2011-07-21-The powerful consumer?
8 0.67396849 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference
9 0.67073321 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies
11 0.65974623 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph
13 0.64690828 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?
15 0.64147496 879 andrew gelman stats-2011-08-29-New journal on causal inference
16 0.62892878 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A
17 0.61146146 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems
18 0.61115843 1801 andrew gelman stats-2013-04-13-Can you write a program to determine the causal order?
19 0.60343796 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions
20 0.59867662 13 andrew gelman stats-2010-04-30-Things I learned from the Mickey Kaus for Senate campaign
topicId topicWeight
[(5, 0.011), (16, 0.121), (21, 0.033), (24, 0.187), (30, 0.014), (35, 0.068), (42, 0.016), (45, 0.046), (78, 0.013), (80, 0.014), (85, 0.012), (86, 0.021), (89, 0.023), (93, 0.012), (96, 0.019), (99, 0.258)]
simIndex simValue blogId blogTitle
same-blog 1 0.97155178 807 andrew gelman stats-2011-07-17-Macro causality
2 0.96685624 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking
Introduction: People keep pointing me to this excellent news article by David Brown, about a scientist who was convicted of data manipulation: In all, 330 patients were randomly assigned to get either interferon gamma-1b or placebo injections. Disease progression or death occurred in 46 percent of those on the drug and 52 percent of those on placebo. That was not a significant difference, statistically speaking. When only survival was considered, however, the drug looked better: 10 percent of people getting the drug died, compared with 17 percent of those on placebo. However, that difference wasn’t “statistically significant,” either. Specifically, the so-called P value — a mathematical measure of the strength of the evidence that there’s a true difference between a treatment and placebo — was 0.08. . . . Technically, the study was a bust, although the results leaned toward a benefit from interferon gamma-1b. Was there a group of patients in which the results tipped? Harkonen asked the statis
3 0.96140611 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles
Introduction: I love this stuff: This article presents a simulation-based method designed to establish the computational correctness of software developed to fit a specific Bayesian model, capitalizing on properties of Bayesian posterior distributions. We illustrate the validation technique with two examples. The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors. We also compare our method with that of an earlier approach. I hope we can put it into Stan.
4 0.95548868 1881 andrew gelman stats-2013-06-03-Boot
Introduction: Joshua Hartshorne writes: I ran several large-N experiments (separate participants) and looked at performance against age. What we want to do is compare age-of-peak-performance across the different tasks (again, different participants). We bootstrapped age-of-peak-performance. On each iteration, we sampled (with replacement) the X scores at each age, where X=num of participants at that age, and recorded the age at which performance peaked on that task. We then recorded the age at which performance was at peak and repeated. Once we had distributions of age-of-peak-performance, we used the means and SDs to calculate t-statistics to compare the results across different tasks. For graphical presentation, we used medians, interquartile ranges, and 95% confidence intervals (based on the distributions: the range within which 75% and 95% of the bootstrapped peaks appeared). While a number of people we consulted with thought this made a lot of sense, one reviewer of the paper insist
5 0.95499003 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations
Introduction: Vincent Yip writes: I have read your paper [with Kobi Abayomi and Marc Levy] regarding multiple imputation application. In order to diagnose my imputed data, I used Kolmogorov-Smirnov (K-S) tests to compare the distribution differences between the imputed and observed values of a single attribute as mentioned in your paper. My question is: For example I have this attribute X with the following data: (NA = missing) Original dataset: 1, NA, 3, 4, 1, 5, NA Imputed dataset: 1, 2, 3, 4, 1, 5, 6 a) in order to run the KS test, will I treat the observed data as 1, 3, 4, 1, 5? b) and for the observed data, will I treat 1, 2, 3, 4, 1, 5, 6 as the imputed dataset for the K-S test? or just 2, 6? c) if I used m=5, I will have 5 imputed data sets. How would I apply K-S test to 5 of them and compare to the single observed distribution? Do I combine the 5 imputed data set into one by averaging each imputed values so I get one single imputed data and compare with the ob
6 0.94977641 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox
7 0.9489103 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update
8 0.94854969 503 andrew gelman stats-2011-01-04-Clarity on my email policy
9 0.94829953 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll
10 0.94695842 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe
11 0.945916 488 andrew gelman stats-2010-12-27-Graph of the year
12 0.94550264 1080 andrew gelman stats-2011-12-24-Latest in blog advertising
13 0.9448992 1121 andrew gelman stats-2012-01-15-R-squared for multilevel models
14 0.94463909 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?
15 0.9446373 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science
16 0.94412744 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling
17 0.94291919 639 andrew gelman stats-2011-03-31-Bayes: radical, liberal, or conservative?
18 0.94212961 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon
19 0.94206941 1926 andrew gelman stats-2013-07-05-More plain old everyday Bayesianism
20 0.9416157 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?