andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1732 knowledge-graph by maker-knowledge-mining

1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?


meta info for this blog

Source: html

Introduction: John Pugliese writes: I was recently in a conversation with some colleagues regarding the evaluation of recent welfare reform in California. The discussion centered around what types of design might allow us to understand the impact of the changes. Experimental designs were out, as random assignment is not feasible. Our data is pre/post, and some of my colleagues believed that the best we can do under these circumstances was a descriptive study; i.e. no causal inference. All of us were concerned with changes in economic and population conditions over the pre-to-post period; i.e. over-estimating the effects in an improving economy. I thought a quasi-experimental design was possible using MLM. Briefly, my suggestion was the following: Match our post-participants to a set of pre-participants on relevant person-level factors, and treat the pre/post differences as a random effect at the county level. Next, we would adjust the pre/post differences by changes in economic and populati
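The matching step the letter proposes (pairing each post-period participant with a similar pre-period participant on person-level factors) can be sketched as a greedy nearest-neighbor match. This is an illustrative stdlib-only sketch on hypothetical covariates (age and household size are my assumptions, not variables named in the letter), not the actual procedure used in the evaluation.

```python
import math

def nearest_neighbor_match(post_group, pre_group, covariates):
    """Match each post-period participant to the closest unused
    pre-period participant (greedy 1:1, Euclidean distance on
    standardized covariates)."""
    # Standardize over the pooled sample so no single covariate
    # dominates the distance just because of its scale.
    pooled = post_group + pre_group
    scale = {}
    for c in covariates:
        vals = [p[c] for p in pooled]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
        scale[c] = sd

    def dist(a, b):
        return math.sqrt(sum(((a[c] - b[c]) / scale[c]) ** 2 for c in covariates))

    unused = list(range(len(pre_group)))
    pairs = []
    for post in post_group:
        j = min(unused, key=lambda k: dist(post, pre_group[k]))
        unused.remove(j)
        pairs.append((post, pre_group[j]))
    return pairs

# Hypothetical example: match on age and household size.
post = [{"age": 34, "hh_size": 3}, {"age": 52, "hh_size": 1}]
pre = [{"age": 50, "hh_size": 1}, {"age": 33, "hh_size": 4}, {"age": 70, "hh_size": 2}]
matched = nearest_neighbor_match(post, pre, ["age", "hh_size"])
```

In practice one would more likely match on a propensity score or use optimal rather than greedy matching, but the idea is the same: construct comparable pre and post groups before modeling the differences.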


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 John Pugliese writes: I was recently in a conversation with some colleagues regarding the evaluation of recent welfare reform in California. [sent-1, score-0.753]

2 The discussion centered around what types of design might allow us to understand the impact of the changes. [sent-2, score-0.584]

3 Experimental designs were out, as random assignment is not feasible. [sent-3, score-0.351]

4 Our data is pre/post, and some of my colleagues believed that the best we can do under these circumstances was a descriptive study; i. [sent-4, score-0.675]

5 All of us were concerned with changes in economic and population conditions over the pre-to-post period; i. [sent-7, score-0.876]

6 I thought a quasi-experimental design was possible using MLM. [sent-10, score-0.137]

7 Briefly, my suggestion was the following: Match our post-participants to a set of pre-participants on relevant person-level factors, and treat the pre/post differences as a random effect at the county level. [sent-11, score-0.956]

8 Next, we would adjust the pre/post differences by changes in economic and population factors at level 2 (county level) in order to produce an estimate of the change. [sent-12, score-1.117]

9 I was wondering if you might share some thoughts on this approach? [sent-13, score-0.079]

10 In reply to your colleagues who believe all that can be done is description, recall Jennifer’s dictum that the goal of inference is always causal. [sent-14, score-0.47]

11 A good description is fine—I spend much of my time as an applied researcher doing descriptive inference—but, implicitly or explicitly, it will usually be used for causal purposes, so it’s worth making that link and understanding the assumptions required for any given causal interpretation. [sent-15, score-0.949]

12 Regarding your idea, it seems to me that you’re considering variation of the treatment at the county level. [sent-16, score-0.428]

13 From the perspective of data analysis, multilevel modeling is definitely the way to go. [sent-17, score-0.093]

14 But I don’t know enough about the substantive context to say more than this. [sent-18, score-0.095]
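As a toy illustration of treating county pre/post differences as a random effect, here is an empirical-Bayes partial-pooling sketch: each county's raw difference is shrunk toward the grand mean in proportion to how noisy its own estimate is. This is the classic normal-normal shrinkage with a method-of-moments variance estimate, assumed here for illustration; the full MLM the letter proposes would additionally adjust for county-level economic and population covariates at level 2.

```python
import statistics

def partial_pool(county_diffs):
    """Shrink each county's mean pre/post difference toward the grand
    mean. Shrinkage weight = between-county variance relative to
    (between-county variance + that county's sampling variance), so
    small, noisy counties are pulled in more."""
    means = {c: statistics.mean(d) for c, d in county_diffs.items()}
    ses2 = {c: statistics.variance(d) / len(d) for c, d in county_diffs.items()}
    grand = statistics.mean(means.values())
    # Method-of-moments between-county variance, floored at zero.
    between = max(statistics.variance(means.values())
                  - statistics.mean(ses2.values()), 0.0)
    pooled = {}
    for c in means:
        total = between + ses2[c]
        w = between / total if total > 0 else 0.0
        pooled[c] = w * means[c] + (1 - w) * grand
    return pooled

# Hypothetical pre/post differences for three counties.
data = {"A": [1.0, 2.0, 3.0], "B": [5.0, 6.0, 7.0], "C": [0.0, 1.0, 2.0]}
pooled = partial_pool(data)
```

Each pooled estimate lands between the county's raw mean and the grand mean, which is the qualitative behavior a multilevel model with a county-level random effect delivers.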


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('county', 0.344), ('changes', 0.223), ('colleagues', 0.211), ('descriptive', 0.205), ('causal', 0.202), ('level', 0.164), ('description', 0.163), ('circumstance', 0.159), ('factors', 0.153), ('dictum', 0.152), ('design', 0.137), ('differences', 0.134), ('regarding', 0.13), ('random', 0.128), ('population', 0.127), ('economic', 0.124), ('assignment', 0.117), ('centered', 0.116), ('welfare', 0.114), ('reform', 0.111), ('inference', 0.107), ('designs', 0.106), ('purposes', 0.106), ('improving', 0.1), ('adjust', 0.1), ('believed', 0.1), ('briefly', 0.098), ('implicitly', 0.096), ('match', 0.095), ('concerned', 0.095), ('substantive', 0.095), ('conversation', 0.094), ('suggestion', 0.093), ('definitely', 0.093), ('evaluation', 0.093), ('treat', 0.093), ('produce', 0.092), ('explicitly', 0.091), ('jennifer', 0.09), ('types', 0.089), ('period', 0.085), ('us', 0.084), ('considering', 0.084), ('required', 0.081), ('impact', 0.08), ('wondering', 0.079), ('commenters', 0.079), ('experimental', 0.078), ('allow', 0.078), ('offer', 0.077)]
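The word scores above come from tf-idf weighting: a term scores high when it is frequent in this post but rare across the corpus. The mining pipeline's exact tokenization and normalization are not shown in the dump, so this stdlib-only sketch is an assumption about the general idea, not a reproduction of those numbers.

```python
import math
from collections import Counter

def tfidf_scores(doc_tokens, corpus):
    """Score each term in one document by term frequency times
    inverse document frequency over the corpus."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in corpus if term in d)
        idf = math.log(n_docs / df)  # df >= 1: the doc itself contains the term
        scores[term] = (count / len(doc_tokens)) * idf
    return scores

# Tiny hypothetical corpus of tokenized posts.
corpus = [
    ["county", "level", "causal", "inference"],
    ["causal", "inference", "model"],
    ["county", "welfare", "reform"],
]
scores = tfidf_scores(corpus[0], corpus)
```

Here "level" appears in only one document while "county" appears in two, so "level" gets the larger idf and the larger score.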

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?


2 0.18436339 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post , “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa

3 0.1679987 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

Introduction: When it rains it pours . . . John Transue writes: I saw a post on Andrew Sullivan’s blog today about life expectancy in different US counties. With a bunch of the worst counties being in Mississippi, I thought that it might be another case of analysts getting extreme values from small counties. However, the paper (see here ) includes a pretty interesting methods section. This is from page 5, “Specifically, we used a mixed-effects Poisson regression with time, geospatial, and covariate components. Poisson regression fits count outcome variables, e.g., death counts, and is preferable to a logistic model because the latter is biased when an outcome is rare (occurring in less than 1% of observations).” They have downloadable data. I believe that the data are predicted values from the model. A web appendix also gives 90% CIs for their estimates. Do you think they solved the small county problem and that the worst counties really are where their spreadsheet suggests? My re

4 0.14818022 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions : 1. Forward causal inference . What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference . What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

5 0.13764775 2180 andrew gelman stats-2014-01-21-Everything I need to know about Bayesian statistics, I learned in eight schools.

Introduction: This post is by Phil. I’m aware that there  are  some people who use a Bayesian approach largely because it allows them to provide a highly informative prior distribution based subjective judgment, but that is not the appeal of Bayesian methods for a lot of us practitioners. It’s disappointing and surprising, twenty years after my initial experiences, to still hear highly informed professional statisticians who think that what distinguishes Bayesian statistics from Frequentist statistics is “subjectivity” ( as seen in  a recent blog post and its comments ). My first encounter with Bayesian statistics was just over 20 years ago. I was a postdoc at Lawrence Berkeley National Laboratory, with a new PhD in theoretical atomic physics but working on various problems related to the geographical and statistical distribution of indoor radon (a naturally occurring radioactive gas that can be dangerous if present at high concentrations). One of the issues I ran into right at the start was th

6 0.13478413 182 andrew gelman stats-2010-08-03-Nebraska never looked so appealing: anatomy of a zombie attack. Oops, I mean a recession.

7 0.13213503 454 andrew gelman stats-2010-12-07-Diabetes stops at the state line?

8 0.13188885 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

9 0.12977685 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

10 0.12894456 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

11 0.12681985 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

12 0.11963458 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

13 0.11458083 879 andrew gelman stats-2011-08-29-New journal on causal inference

14 0.11287396 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

15 0.10864471 1740 andrew gelman stats-2013-02-26-“Is machine learning a subset of statistics?”

16 0.10691559 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

17 0.106725 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference

18 0.10612824 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

19 0.10605227 86 andrew gelman stats-2010-06-14-“Too much data”?

20 0.10480647 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.206), (1, 0.037), (2, 0.071), (3, -0.091), (4, 0.028), (5, 0.04), (6, -0.062), (7, 0.004), (8, 0.074), (9, 0.061), (10, -0.049), (11, -0.017), (12, 0.06), (13, 0.004), (14, 0.05), (15, 0.031), (16, -0.036), (17, 0.005), (18, -0.039), (19, 0.068), (20, -0.05), (21, -0.04), (22, 0.061), (23, 0.072), (24, 0.047), (25, 0.06), (26, 0.005), (27, 0.012), (28, -0.04), (29, 0.073), (30, 0.009), (31, -0.064), (32, -0.039), (33, -0.004), (34, -0.069), (35, 0.031), (36, -0.024), (37, -0.003), (38, 0.003), (39, 0.099), (40, -0.018), (41, -0.039), (42, -0.034), (43, -0.058), (44, -0.024), (45, 0.024), (46, 0.007), (47, 0.07), (48, 0.017), (49, -0.03)]
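The simValue column below is presumably a cosine-style similarity between topic-weight vectors like the one above (the dump does not say which metric the pipeline uses, so that is an assumption). A minimal sketch:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# A vector is maximally similar to itself, which matches the
# same-blog rows scoring ~1.0 in the similarity lists.
v = [0.206, 0.037, 0.071, -0.091]
self_sim = cosine(v, v)
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])
```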

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98591787 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?


2 0.80764574 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings


3 0.80087811 1996 andrew gelman stats-2013-08-24-All inference is about generalizing from sample to population

Introduction: Jeff Walker writes: Your blog has skirted around the value of observational studies and chided folks for using causal language when they only have associations but I sense that you ultimately find value in these associations. I would love for you to expand this thought in a blog. Specifically: Does a measured association “suggest” a causal relationship? Are measured associations a good and efficient way to narrow the field of things that should be studied? Of all the things we should pursue, should we start with the stuff that has some largish measured association? Certainly many associations are not directly causal but due to joint association. Similarly, there must be many variables that are directly causally associated ( A -> B) but the effect, measured as an association, is masked by confounders. So if we took the “measured associations are worthwhile” approach, we’d never or rarely find the masked effects. But I’d also like to know if one is more likely to find a large causal

4 0.79564679 1492 andrew gelman stats-2012-09-11-Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking

Introduction: As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal of Sociology). In short, I have a taste for continuous rather than discrete models. As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. In many cases, I have found that that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. Here is an example that came up in a recent blog discussion. Computer science student Elias Bareinboim gave the following example: “suppose we know nothing a

5 0.79040527 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

Introduction: Macartan Humphreys pointed me to this excellent guide . Here are the 10 items: 1. A causal claim is a statement about what didn’t happen. 2. There is a fundamental problem of causal inference. 3. You can estimate average causal effects even if you cannot observe any individual causal effects. 4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C. 5. The counterfactual model is all about contribution, not attribution. 6. X can cause Y even if there is no “causal path” connecting X and Y. 7. Correlation is not causation. 8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y. 9. Estimating average causal effects does not require that treatment and control groups are identical. 10. There is no causation without manipulation. The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s

6 0.74149966 2286 andrew gelman stats-2014-04-08-Understanding Simpson’s paradox using a graph

7 0.737957 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A

8 0.73248714 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

9 0.72132409 287 andrew gelman stats-2010-09-20-Paul Rosenbaum on those annoying pre-treatment variables that are sort-of instruments and sort-of covariates

10 0.72103101 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

11 0.71842778 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?

12 0.71451628 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

13 0.69739294 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

14 0.69040704 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

15 0.6881879 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

16 0.68685514 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

17 0.68578482 807 andrew gelman stats-2011-07-17-Macro causality

18 0.67434007 86 andrew gelman stats-2010-06-14-“Too much data”?

19 0.66688049 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?

20 0.65332019 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.01), (16, 0.087), (21, 0.018), (24, 0.206), (84, 0.123), (89, 0.016), (90, 0.027), (98, 0.024), (99, 0.399)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99065667 360 andrew gelman stats-2010-10-21-Forensic bioinformatics, or, Don’t believe everything you read in the (scientific) papers

Introduction: Hadley Wickham sent me this , by Keith Baggerly and Kevin Coombes: In this report we [Baggerly and Coombes] examine several related papers purporting to use microarray-based signatures of drug sensitivity derived from cell lines to predict patient response. Patients in clinical trials are currently being allocated to treatment arms on the basis of these results. However, we show in five case studies that the results incorporate several simple errors that may be putting patients at risk. One theme that emerges is that the most common errors are simple (e.g., row or column offsets); conversely, it is our experience that the most simple errors are common. This is horrible! But, in a way, it’s not surprising. I make big mistakes in my applied work all the time. I mean, all the time. Sometimes I scramble the order of the 50 states, or I’m plotting a pure noise variable, or whatever. But usually I don’t drift too far from reality because I have a lot of cross-checks and I (or my

same-blog 2 0.98870391 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?


3 0.98709273 2053 andrew gelman stats-2013-10-06-Ideas that spread fast and slow

Introduction: Atul Gawande (the thinking man’s Malcolm Gladwell) asks : Why do some innovations spread so swiftly and others so slowly? Consider the very different trajectories of surgical anesthesia and antiseptics, both of which were discovered in the nineteenth century. The first public demonstration of anesthesia was in 1846. The Boston surgeon Henry Jacob Bigelow was approached by a local dentist named William Morton, who insisted that he had found a gas that could render patients insensible to the pain of surgery. That was a dramatic claim. In those days, even a minor tooth extraction was excruciating. Without effective pain control, surgeons learned to work with slashing speed. Attendants pinned patients down as they screamed and thrashed, until they fainted from the agony. Nothing ever tried had made much difference. Nonetheless, Bigelow agreed to let Morton demonstrate his claim. On October 16, 1846, at Massachusetts General Hospital, Morton administered his gas through an inhaler in

4 0.98691094 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

Introduction: Anirban Bhattacharya, Debdeep Pati, Natesh Pillai, and David Dunson write : Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the corresponding frequentist literature, very little is known about the properties of such priors. Focusing on a broad class of shrinkage priors, we provide precise results on prior and posterior concentration. Interestingly, we demonstrate that most commonly used shrinkage priors, including the Bayesian Lasso, are suboptimal in hig

5 0.98304933 235 andrew gelman stats-2010-08-25-Term Limits for the Supreme Court?

Introduction: In the wake of the confirmation of Elena Kagan to the Supreme Court, political commentators have been expressing a bit of frustration about polarization within the court and polarization in the nomination process. One proposal that’s been floating around is to replace lifetime appointments by fixed terms, perhaps twelve or eighteen years. This would enforce a regular schedule of replacements, instead of the current system in which eighty-something judges have an incentive to hang on as long as possible so as to time their retirements to be during the administration of a politically-compatible president. A couple weeks ago at the sister blog, John Sides discussed some recent research that was relevant to the judicial term limits proposal. Political scientists Justin Crowe and Chris Karpowitz analyzed the historical record or Supreme Court terms and found that long terms of twenty years or more have been happening since the early years of the court. Yes, there is less turnover th

6 0.98304439 1776 andrew gelman stats-2013-03-25-The harm done by tests of significance

7 0.98253584 98 andrew gelman stats-2010-06-19-Further thoughts on happiness and life satisfaction research

8 0.98197961 184 andrew gelman stats-2010-08-04-That half-Cauchy prior

9 0.98124629 1883 andrew gelman stats-2013-06-04-Interrogating p-values

10 0.98021567 1165 andrew gelman stats-2012-02-13-Philosophy of Bayesian statistics: my reactions to Wasserman

11 0.97876501 186 andrew gelman stats-2010-08-04-“To find out what happens when you change something, it is necessary to change it.”

12 0.97735465 2004 andrew gelman stats-2013-09-01-Post-publication peer review: How it (sometimes) really works

13 0.97281271 42 andrew gelman stats-2010-05-19-Updated solutions to Bayesian Data Analysis homeworks

14 0.97273248 1817 andrew gelman stats-2013-04-21-More on Bayesian model selection in high-dimensional settings

15 0.97039825 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

16 0.96788907 247 andrew gelman stats-2010-09-01-How does Bayes do it?

17 0.96655738 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

18 0.9664042 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity

19 0.96459448 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

20 0.96385163 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses