
1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)


meta info for this blog

Source: html

Introduction: Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [C


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. [sent-1, score-0.406]

2 Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. [sent-3, score-1.034]

3 However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [sent-9, score-0.361]

4 Glymour is also more optimistic than us about the potential of using directed graphical models (DGMs) to discover causal relations in neuroimaging research. [sent-11, score-0.524]

5 They consider a causal setting z -> x -> y, where z is the treatment variable, x is the intermediate outcome, and y is the ultimate outcome, and much of their discussion centers on estimating the causal effect of x on y. [sent-15, score-0.515]

6 If x is an observed variable that is not directly manipulated, I don’t know if it makes sense to talk about the effect of x on y, unconditional on the intervention that was used to change x. [sent-17, score-0.197]

7 In their example, I’d talk about “the effect of x on y, if x is changed through z.” [sent-18, score-0.15]

8 Lindquist and Sobel talk about the effect of z on x. [sent-21, score-0.15]

9 If z=0 or 1, they write x(z), so that the causal effect of z on x is x(1) – x(0) (or, more generally, x(1) compared to x(0), but we lose nothing by considering simple differences here). [sent-22, score-0.39]

10 If x can equal 0 or 1, they write y(z,x), so that the causal effect of x on y, conditional on z, is y(z,1) – y(z,0). [sent-25, score-0.329]

11 I don’t find Pearl’s response to be so convincing—I agree with Lindquist and Sobel’s statement that the graphical or structural equation modeling expression looks simple and appealing but the underlying assumptions in those expressions are not so clear. [sent-38, score-0.6]

12 To be specific, Pearl contrasts three expressions of a single model, the causal chain Z—>X—>Y. [sent-40, score-0.286]

13 Here’s Pearl: Pearl characterizes the third expression as a more meaningful and clear display. [sent-41, score-0.149]

14 In contrast, Lindquist and Sobel argue that the above graphical expression appears clear only because it sweeps the model’s assumptions under the rug. [sent-42, score-0.404]

15 Speaking of clear and simple, I’m reminded of a scene, several decades ago, when a bunch of us on the county math team won some competition, and the prize was that we each got to choose one of several math books. [sent-44, score-0.154]

16 Which brings back another memory: our coach for the Mathematical Olympiad program was an unbelievably grumpy old man. [sent-48, score-0.162]

17 At one point he interrupted one of his lectures to rant about how all the calculus books now are wasting their space with applications. [sent-49, score-0.199]

18 That all seemed natural to me at the time but in retrospect I’m amazed by how brainwashed we all were. [sent-51, score-0.139]

19 The other thing I remember about the grumpy coach dude, besides his personality (which, in retrospect, was perhaps necessary to keep a bunch of 15-year-old boys in line; even nerds can make trouble), was that he thought it was cheating to use calculus or analytic geometry. [sent-55, score-0.318]

20 His favorite sorts of problems used elaborate arguments from classical geometry and he always felt we should be able to solve these without resorting to technical means. [sent-56, score-0.142]
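
Sentences 9 and 10 above define the two causal effects in potential-outcomes notation: x(1) – x(0) for the effect of z on x, and y(z,1) – y(z,0) for the effect of x on y conditional on z. A minimal Python sketch of that bookkeeping for a single unit follows. The function names and every number in the tables are hypothetical, chosen only for illustration; nothing here comes from Lindquist and Sobel's paper.

```python
# Potential outcomes for one unit in the chain z -> x -> y.
# All values below are made up for illustration.

def x_of(z):
    """Intermediate outcome x(z) under binary treatment z."""
    return {0: 0, 1: 1}[z]

def y_of(z, x):
    """Ultimate outcome y(z, x) under binary z and binary x."""
    table = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 1.5, (1, 1): 4.0}
    return table[(z, x)]

def effect_z_on_x():
    """Causal effect of z on x: x(1) - x(0)."""
    return x_of(1) - x_of(0)

def effect_x_on_y(z):
    """Causal effect of x on y, conditional on z: y(z,1) - y(z,0)."""
    return y_of(z, 1) - y_of(z, 0)

if __name__ == "__main__":
    print("effect of z on x:", effect_z_on_x())
    for z in (0, 1):
        print(f"effect of x on y given z={z}:", effect_x_on_y(z))
```

In this made-up table the effect of x on y differs between z=0 and z=1, which is one way to see why the post prefers to speak of the effect of x on y "if x is changed through z" rather than unconditionally.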


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('sobel', 0.534), ('lindquist', 0.487), ('pearl', 0.235), ('causal', 0.175), ('glymour', 0.16), ('expressions', 0.111), ('calculus', 0.107), ('dgms', 0.107), ('neuroimaging', 0.107), ('assumptions', 0.106), ('graphical', 0.103), ('effect', 0.101), ('grumpy', 0.092), ('expression', 0.091), ('retrospect', 0.09), ('olympiad', 0.08), ('directed', 0.078), ('notation', 0.073), ('coach', 0.07), ('algebra', 0.068), ('structural', 0.066), ('comment', 0.066), ('intermediate', 0.064), ('equation', 0.062), ('models', 0.061), ('simple', 0.061), ('engineering', 0.058), ('clear', 0.058), ('write', 0.053), ('didn', 0.052), ('types', 0.051), ('friend', 0.051), ('unprepared', 0.049), ('winship', 0.049), ('brainwashed', 0.049), ('nerds', 0.049), ('resorting', 0.049), ('sems', 0.049), ('talk', 0.049), ('math', 0.048), ('used', 0.047), ('trouble', 0.046), ('arguments', 0.046), ('interrupted', 0.046), ('sweeps', 0.046), ('space', 0.046), ('outcome', 0.045), ('subtitle', 0.044), ('linear', 0.043), ('inappropriately', 0.042)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)


2 0.25031671 332 andrew gelman stats-2010-10-10-Proposed new section of the American Statistical Association on Imaging Sciences

Introduction: Martin Lindquist writes that he and others are trying to start a new ASA section on statistics in imaging. If you’re interested in being a signatory to its formation, please send him an email.

3 0.21382543 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post , “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa

4 0.20653617 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference

Introduction: Pearl reports that his Journal of Causal Inference has just posted its first issue , which contains a mix of theoretical and applied papers. Pearl writes that they welcome submissions on all aspects of causal inference.

5 0.18753982 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions

Introduction: This material should be familiar to many of you but could be helpful to newcomers. Pearl writes: ALL causal conclusions in nonexperimental settings must be based on untested, judgmental assumptions that investigators are prepared to defend on scientific grounds. . . . To understand what the world should be like for a given procedure to work is of no lesser scientific value than seeking evidence for how the world works . . . Assumptions are self-destructive in their honesty. The more explicit the assumption, the more criticism it invites . . . causal diagrams invite the harshest criticism because they make assumptions more explicit and more transparent than other representation schemes. As regular readers know (for example, search this blog for “Pearl”), I have not got much out of the causal-diagrams approach myself, but in general I think that when there are multiple, mathematically equivalent methods of getting the same answer, we tend to go with the framework we are used

6 0.15010597 1133 andrew gelman stats-2012-01-21-Judea Pearl on why he is “only a half-Bayesian”

7 0.14327885 879 andrew gelman stats-2011-08-29-New journal on causal inference

8 0.13631833 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

9 0.13557048 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

10 0.11549534 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

11 0.096351281 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

12 0.095507123 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

13 0.09436246 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

14 0.091460705 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

15 0.089887373 1336 andrew gelman stats-2012-05-22-Battle of the Repo Man quotes: Reid Hastie’s turn

16 0.080562703 390 andrew gelman stats-2010-11-02-Fragment of statistical autobiography

17 0.08053679 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

18 0.079893515 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

19 0.079257332 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

20 0.075755864 2245 andrew gelman stats-2014-03-12-More on publishing in journals


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.176), (1, 0.024), (2, -0.017), (3, -0.02), (4, 0.007), (5, -0.004), (6, -0.007), (7, 0.0), (8, 0.088), (9, 0.022), (10, -0.038), (11, 0.023), (12, 0.008), (13, -0.014), (14, 0.027), (15, 0.015), (16, -0.013), (17, -0.004), (18, -0.03), (19, 0.062), (20, -0.04), (21, -0.084), (22, 0.083), (23, 0.017), (24, 0.073), (25, 0.129), (26, -0.007), (27, -0.019), (28, -0.047), (29, 0.029), (30, -0.014), (31, -0.035), (32, -0.049), (33, 0.004), (34, -0.069), (35, -0.032), (36, 0.016), (37, -0.016), (38, 0.009), (39, 0.023), (40, -0.048), (41, 0.02), (42, -0.013), (43, -0.005), (44, -0.03), (45, -0.012), (46, -0.023), (47, 0.032), (48, -0.013), (49, -0.008)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95745057 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)


2 0.92675257 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide . Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

3 0.87523091 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions : 1. Forward causal inference . What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference . What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

4 0.85079885 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

Introduction: As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal of Sociology). In short, I have a taste for continuous rather than discrete models. As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. In many cases, I have found that that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. Here is an example that came up in a recent blog discussion. Computer science student Elias Bareinboim gave the following example: “suppose we know nothing a

5 0.82801718 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

Introduction: Joshua Vogelstein pointed me to this post by Michael Nielsen on how to teach Simpson’s paradox. I don’t know if Nielsen (and others) are aware that people have developed some snappy graphical methods for displaying Simpson’s paradox (and, more generally, aggregation issues). We do some this in our Red State Blue State book, but before that was the BK plot, named by Howard Wainer after a 2001 paper by Stuart Baker and Barnett Kramer, although in apparently appeared earlier in a 1987 paper by Jeon, Chung, and Bae, and doubtless was made by various other people before then. Here’s Wainer’s graphical explication from 2002 (adapted from Baker and Kramer’s 2001 paper): Here’s the version from our 2007 article (with Boris Shor, Joe Bafumi, and David Park): But I recommend Wainer’s article (linked to above) as the first thing to read on the topic of presenting aggregation paradoxes in a clear and grabby way. P.S. Robert Long writes in: I noticed your post ab

6 0.82368755 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

7 0.82363099 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

8 0.81742913 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

9 0.8081066 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A

10 0.80526549 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

11 0.80464768 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference

12 0.79923403 1133 andrew gelman stats-2012-01-21-Judea Pearl on why he is “only a half-Bayesian”

13 0.79790056 807 andrew gelman stats-2011-07-17-Macro causality

14 0.79336965 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

15 0.7691257 879 andrew gelman stats-2011-08-29-New journal on causal inference

16 0.76777893 1336 andrew gelman stats-2012-05-22-Battle of the Repo Man quotes: Reid Hastie’s turn

17 0.76388431 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions

18 0.7509591 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

19 0.74267364 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

20 0.72281098 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.032), (16, 0.073), (17, 0.18), (21, 0.042), (22, 0.012), (24, 0.126), (40, 0.015), (53, 0.014), (54, 0.011), (84, 0.02), (86, 0.029), (97, 0.012), (99, 0.271)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94886982 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

Introduction: Malka Gorfine writes: We noticed that the important topic of association measures and tests came up again in your blog, and we have few comments in this regard. It is useful to distinguish between the univariate and multivariate methods. A consistent multivariate method can recognise dependence between two vectors of random variables, while a univariate method can only loop over pairs of components and check for dependency between them. There are very few consistent multivariate methods. To the best of our knowledge there are three practical methods: 1) HSIC by Gretton et al. (http://www.gatsby.ucl.ac.uk/~gretton/papers/GreBouSmoSch05.pdf) 2) dcov by Szekely et al. (http://projecteuclid.org/euclid.aoas/1267453933) 3) A method we introduced in Heller et al (Biometrika, 2013, 503—510, http://biomet.oxfordjournals.org/content/early/2012/12/04/biomet.ass070.full.pdf+html, and an R package, HHG, is available as well http://cran.r-project.org/web/packages/HHG/index.html). A

2 0.94491851 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

Introduction: Robert Neumann writes: in the JEP 24(3), page18, Daron Acemoglu states: Why Development Economics Needs Theory There is no general agreement on how much we should rely on economic theory in motivating empirical work and whether we should try to formulate and estimate “structural parameters.” I (Acemoglu) argue that the answer is largely “yes” because otherwise econometric estimates would lack external validity, in which case they can neither inform us about whether a particular model or theory is a useful approximation to reality, nor would they be useful in providing us guidance on what the effects of similar shocks and policies would be in different circumstances or if implemented in different scales. I therefore define “structural parameters” as those that provide external validity and would thus be useful in testing theories or in policy analysis beyond the specific environment and sample from which they are derived. External validity becomes a particularly challenging t

3 0.94434029 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

Introduction: Malka Gorfine, Ruth Heller, and Yair Heller write a comment on the paper of Reshef et al. that we discussed a few months ago. Just to remind you what’s going on here, here’s my quick summary from December: Reshef et al. propose a new nonlinear R-squared-like measure. Unlike R-squared, this new method depends on a tuning parameter that controls the level of discretization, in a “How long is the coast of Britain” sort of way. The dependence on scale is inevitable for such a general method. Just consider: if you sample 1000 points from the unit bivariate normal distribution, (x,y) ~ N(0,I), you’ll be able to fit them perfectly by a 999-degree polynomial fit to the data. So the scale of the fit matters. The clever idea of the paper is that, instead of going for an absolute measure (which, as we’ve seen, will be scale-dependent), they focus on the problem of summarizing the grid of pairwise dependences in a large set of variables. As they put it: “Imagine a data set with hundreds

4 0.94121635 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

Introduction: False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both? It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.” The problem, of course, is that the likelihood of at leas

same-blog 5 0.93452197 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)


6 0.9265511 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

7 0.92286336 1616 andrew gelman stats-2012-12-10-John McAfee is a Heinlein hero

8 0.9180541 2324 andrew gelman stats-2014-05-07-Once more on nonparametric measures of mutual information

9 0.90710896 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

10 0.90566367 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

11 0.90318346 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models

12 0.89601481 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions

13 0.89072299 1467 andrew gelman stats-2012-08-23-The pinch-hitter syndrome again

14 0.88633823 1591 andrew gelman stats-2012-11-26-Politics as an escape hatch

15 0.8835218 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

16 0.88120991 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

17 0.87420458 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

18 0.87250733 1272 andrew gelman stats-2012-04-20-More proposals to reform the peer-review system

19 0.86931157 2315 andrew gelman stats-2014-05-02-Discovering general multidimensional associations

20 0.8675167 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks