andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-7 knowledge-graph by maker-knowledge-mining

7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?


meta info for this blog

Source: html

Introduction: Let's say you are repeatedly going to receive unselected sets of well-done RCTs on various, say, medical treatments. One reasonable assumption with all of these treatments is that they are monotonic – either helpful or harmful for all. The treatment effect will (as always) vary for subgroups in the population – these will not be explicitly identified in the studies – but each study very likely will enroll different percentages of the various patient subgroups. Being all randomized studies, these subgroups will be balanced in the treatment versus control arms – but each study will (as always) be estimating a different – but exchangeable – treatment effect (exchangeable due to the ignorance about the subgroup memberships of the enrolled patients). That reasonable assumption – monotonicity – will be to some extent (as always) wrong, but it is a risk believed well worth taking – if the average effect in any population is positive (versus negative), the average effect in any other population will be positive (versus negative).
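The setup described – several randomized studies, each estimating a different but exchangeable treatment effect – is often summarized by precision (inverse-variance) weighting. A minimal sketch, not taken from the post, with made-up estimates and variances:

```python
# Illustrative sketch only: pooling per-study treatment-effect estimates
# by inverse-variance weighting. Estimates and variances are made up.
estimates = [0.25, 0.18, 0.30]   # per-study effect estimates
variances = [0.04, 0.02, 0.05]   # per-study sampling variances

weights = [1.0 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
pooled_var = 1.0 / sum(weights)  # variance of the pooled estimate

print(pooled, pooled_var)
```

Under exchangeability one would typically also add a between-study variance component (a random-effects model), which shrinks the weights toward equality.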


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Let's say you are repeatedly going to receive unselected sets of well-done RCTs on various, say, medical treatments. [sent-1, score-0.161]

2 One reasonable assumption with all of these treatments is that they are monotonic – either helpful or harmful for all. [sent-2, score-0.588]

3 The treatment effect will (as always) vary for subgroups in the population – these will not be explicitly identified in the studies – but each study very likely will enroll different percentages of the various patient subgroups. [sent-3, score-1.909]

4 Being all randomized studies, these subgroups will be balanced in the treatment versus control arms – but each study will (as always) be estimating a different – but exchangeable – treatment effect (exchangeable due to the ignorance about the subgroup memberships of the enrolled patients). [sent-4, score-2.498]

5 That reasonable assumption – monotonicity – will be to some extent (as always) wrong, but it is a risk believed well worth taking – if the average effect in any population is positive (versus negative), the average effect in any other population will be positive (versus negative). [sent-5, score-2.061]

6 Should we encourage (or discourage) such Mr. P based estimates – just because they are for counter-factual rather than real populations? [sent-7, score-0.258]
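The question about Mr. P (multilevel regression and poststratification) in counterfactual populations comes down to reweighting: once subgroup effects are estimated, the population-average effect for any population, real or counterfactual, is a share-weighted mean. A hedged sketch with invented subgroup names, effects, and shares:

```python
# Hedged sketch (not the post's method): poststratifying subgroup
# treatment effects to real and counterfactual population mixes.
# Subgroup names, effects, and shares are all invented.
subgroup_effects = {"young": 0.30, "middle": 0.20, "old": 0.10}

def poststratify(effects, shares):
    """Population-average effect: share-weighted mean of subgroup effects."""
    assert abs(sum(shares.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(effects[g] * shares[g] for g in effects)

# A mix some study actually enrolled ...
real = poststratify(subgroup_effects, {"young": 0.5, "middle": 0.3, "old": 0.2})
# ... and a counterfactual mix no study enrolled.
counterfactual = poststratify(subgroup_effects, {"young": 0.1, "middle": 0.2, "old": 0.7})

# Under monotonicity (all subgroup effects share a sign), every mix,
# real or counterfactual, keeps that sign.
print(real, counterfactual)
```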


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('subgroups', 0.352), ('population', 0.275), ('effect', 0.259), ('versus', 0.254), ('treatment', 0.186), ('study', 0.163), ('average', 0.149), ('assumption', 0.146), ('memberships', 0.146), ('enroll', 0.138), ('monotonic', 0.138), ('rcts', 0.138), ('variance', 0.132), ('mr', 0.132), ('mixtures', 0.132), ('monotonicity', 0.132), ('discourage', 0.127), ('exchangeable', 0.127), ('negative', 0.127), ('enrolled', 0.123), ('arms', 0.12), ('subgroup', 0.12), ('positive', 0.115), ('always', 0.113), ('harmful', 0.111), ('lets', 0.109), ('ignorance', 0.107), ('balanced', 0.104), ('inverse', 0.104), ('percentages', 0.103), ('reasonable', 0.101), ('estimates', 0.1), ('studies', 0.1), ('unknown', 0.098), ('patient', 0.097), ('minimum', 0.096), ('treatments', 0.092), ('weighting', 0.09), ('repeatedly', 0.089), ('believed', 0.086), ('encourage', 0.086), ('mixture', 0.084), ('randomized', 0.083), ('identified', 0.079), ('explicitly', 0.079), ('vary', 0.078), ('define', 0.074), ('based', 0.072), ('sets', 0.072), ('due', 0.068)]
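The (word, weight) pairs above come from tf-idf weighting, and the similarity scores in the lists below are cosine similarities between such weight vectors. A minimal sketch (whitespace tokenization, raw counts, log idf); the actual pipeline that produced the numbers above is not specified here:

```python
import math
from collections import Counter

# Minimal tf-idf sketch (illustrative only). tf = raw count in the
# document, idf = log(N / df). Example documents are invented.
docs = [
    "treatment effect varies across subgroups in the population",
    "randomized studies balance subgroups across treatment arms",
    "the average effect in any population is positive",
]

def tfidf_vectors(docs):
    tokenized = [d.split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    n = len(docs)
    return [
        {w: c * math.log(n / df[w]) for w, c in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = tfidf_vectors(docs)
print(round(cosine(vecs[0], vecs[0]), 4))  # a document matches itself: 1.0
print(round(cosine(vecs[0], vecs[1]), 4))  # partial overlap: between 0 and 1
```

This mirrors why the "same-blog" rows below score essentially 1.0 against themselves.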

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?


2 0.18043518 86 andrew gelman stats-2010-06-14-“Too much data”?

Introduction: Chris Hane writes: I am a scientist needing to model a treatment effect on a population of ~500 people. The dependent variable in the model is the difference in a person’s pre-treatment 12 month total medical cost versus post-treatment cost. So there is large variation in costs, but not so much by using the difference between the pre and post treatment costs. The issue I’d like some advice on is that the treatment has already occurred so there is no possibility of creating a fully randomized control now. I do have a very large population of people to use as possible controls via propensity scoring or exact matching. If I had a few thousand people to possibly match, then I would use standard techniques. However, I have a potential population of over a hundred thousand people. An exact match of the possible controls to age, gender and region of the country still leaves a population of 10,000 controls. Even if I use propensity scores to weight the 10,000 observations (understan

3 0.16302589 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

Introduction: David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between freshman and veteran members of Congress. Later I learned that such tests apply only to samples because their purpose is to tell you whether the difference in the observed sample is likely to exist in the population. If you have data for the whole population, like all members of the 103rd House of Representatives, you do not need a test to discern the true difference in the population. Sometimes researchers assume some sort of superpopulation like “all possible Congresses” or “Congresses across all time” and that the members of any given Congress constitute a sample. In my current work in education research, it is sometimes asserted t

4 0.159237 2155 andrew gelman stats-2013-12-31-No on Yes-No decisions

Introduction: Just to elaborate on our post from last month (“I’m negative on the expression ‘false positives’”), here’s a recent exchange we had regarding the relevance of yes/no decisions in summarizing statistical inferences about scientific questions. Shravan wrote : Isn’t it true that I am already done if P(theta>0) is much larger than P(theta<0)? I don't need to compute any loss function if the former is 0.99 and the latter 0.01. In most studies of the type that people like me do [Shravan is a linguist], we set up experiments where we have a decisive test like this for theory A and against theory B. To which I replied : In some way the problem is with the focus on “theta.” Effects (and, more generally, comparisons) vary, they can be positive for some people in some settings and negative for other people in other settings. If you’re talking about a single “theta,” you have to define what population and what scenario you are thinking about. And it’s probably not the popul

5 0.14920048 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

Introduction: Joshua Vogelstein asks for my thoughts as a Bayesian on the above topic. So here they are (briefly): The concept of the bias-variance tradeoff can be useful if you don’t take it too seriously. The basic idea is as follows: if you’re estimating something, you can slice your data finer and finer, or perform more and more adjustments, each time getting a purer—and less biased—estimate. But each subdivision or each adjustment reduces your sample size or increases potential estimation error, hence the variance of your estimate goes up. That story is real. In lots and lots of examples, there’s a continuum between a completely unadjusted general estimate (high bias, low variance) and a specific, focused, adjusted estimate (low bias, high variance). Suppose, for example, you’re using data from a large experiment to estimate the effect of a treatment on a fairly narrow group, say, white men between the ages of 45 and 50. At one extreme, you could just take the estimated treatment e

6 0.14505346 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

7 0.14341332 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects

8 0.14321111 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

9 0.13780282 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

10 0.12930524 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

11 0.12818775 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

12 0.12382997 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

13 0.12363604 1910 andrew gelman stats-2013-06-22-Struggles over the criticism of the “cannabis users and IQ change” paper

14 0.11617971 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

15 0.11129026 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

16 0.10810692 797 andrew gelman stats-2011-07-11-How do we evaluate a new and wacky claim?

17 0.10612824 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

18 0.10210955 2008 andrew gelman stats-2013-09-04-Does it matter that a sample is unrepresentative? It depends on the size of the treatment interactions

19 0.10123687 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model

20 0.10062233 1310 andrew gelman stats-2012-05-09-Varying treatment effects, again


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.139), (1, 0.046), (2, 0.125), (3, -0.178), (4, 0.02), (5, -0.009), (6, 0.024), (7, 0.028), (8, 0.019), (9, -0.014), (10, -0.064), (11, 0.004), (12, 0.078), (13, -0.018), (14, 0.045), (15, 0.013), (16, -0.033), (17, 0.017), (18, -0.015), (19, 0.053), (20, -0.07), (21, -0.013), (22, 0.005), (23, 0.017), (24, -0.003), (25, 0.064), (26, -0.062), (27, 0.05), (28, 0.005), (29, 0.032), (30, -0.064), (31, -0.033), (32, -0.042), (33, 0.014), (34, 0.003), (35, 0.027), (36, -0.03), (37, -0.067), (38, -0.001), (39, -0.008), (40, 0.035), (41, -0.033), (42, -0.007), (43, -0.05), (44, 0.045), (45, -0.009), (46, 0.022), (47, 0.017), (48, 0.016), (49, 0.034)]
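The (topicId, topicWeight) pairs above are LSI coordinates: documents projected onto the top singular vectors of a term-document matrix. A toy sketch with an invented 4-term, 3-document count matrix:

```python
import numpy as np

# Toy LSI sketch: latent "topics" are the top singular vectors of a
# term-document matrix. The 4-term x 3-document counts are invented;
# a real LSI run would start from tf-idf-weighted counts.
X = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 2.0],
])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                      # keep the top-k topics
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T   # one row of topic weights per document

# Documents are then compared by cosine similarity in this k-dimensional
# topic space, which is how (topicId, topicWeight) pairs get used.
print(doc_topics.shape)  # (3, 2)
```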

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99330765 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?


2 0.83352447 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

Introduction: A few days ago I discussed the evaluation of somewhat-plausible claims that are somewhat supported by theory and somewhat supported by statistical evidence. One point I raised was that an implausibly large estimate of effect size can be cause for concern: Uri Simonsohn (the author of the recent rebuttal of the name-choice article by Pelham et al.) argued that the implied effects were too large to be believed (just as I was arguing above regarding the July 4th study), which makes more plausible his claims that the results arise from methodological artifacts. That calculation is straight Bayes: the distribution of systematic errors has much longer tails than the distribution of random errors, so the larger the estimated effect, the more likely it is to be a mistake. This little theoretical result is a bit annoying, because it is the larger effects that are the most interesting!” Larry Bartels notes that my reasoning above is a bit incoherent: I [Bartels] strongly agree with

3 0.81552857 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects

Introduction: The title of this post is silly but I have an important point to make, regarding an implicit model which I think many people assume even though it does not really make sense. Following a link from Sanjay Srivastava, I came across a post from David Funder saying that it’s useful to talk about the sizes of effects (I actually prefer the term “comparisons” so as to avoid the causal baggage) rather than just their signs. I agree , and I wanted to elaborate a bit on a point that comes up in Funder’s discussion. He quotes an (unnamed) prominent social psychologist as writing: The key to our research . . . [is not] to accurately estimate effect size. If I were testing an advertisement for a marketing research firm and wanted to be sure that the cost of the ad would produce enough sales to make it worthwhile, effect size would be crucial. But when I am testing a theory about whether, say, positive mood reduces information processing in comparison with negative mood, I am worried abou

4 0.79490852 963 andrew gelman stats-2011-10-18-Question on Type M errors

Introduction: Inti Pedroso writes: Today during the group meeting at my new job we were revising a paper whose main conclusions were sustained by an ANOVA. One of the first observations is that the experiment had a small sample size. Interestingly (or maybe not so), some of the reported effects (most of them interactions) were quite large. One of the experienced group members said that “there is a common wisdom that one should not believe effects from small sample sizes but [he thinks] if they [the effects] are large enough to be picked up on a small study they must be real large effects”. I argued that if the sample size is small one could incur an M-type error in which the magnitude of the effect is being over-estimated, and that if larger samples are evaluated the magnitude may become smaller, and also the confidence intervals. The concept of M-type error is completely new to all other members of the group (in which I am in my second week) and I was given the job of finding a suitable ref to explain

5 0.75353843 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

Introduction: I’m thinking more and more that we have to get rid of statistical significance, 95% intervals, and all the rest, and just come to a more fundamental acceptance of uncertainty. In practice, I think we use confidence intervals and hypothesis tests as a way to avoid acknowledging uncertainty. We set up some rules and then act as if we know what is real and what is not. Even in my own applied work, I’ve often enough presented 95% intervals and gone on from there. But maybe that’s just not right. I was thinking about this after receiving the following email from a psychology student: I [the student] am trying to conceptualize the lessons in your paper with Stern with comparing treatment effects across studies. When trying to understand if a certain intervention works, we must look at what the literature says. However this can be complicated if the literature has divergent results. There are four situations I am thinking of. For each of these situations, assume the studies are r

6 0.73548138 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

7 0.7315802 1910 andrew gelman stats-2013-06-22-Struggles over the criticism of the “cannabis users and IQ change” paper

8 0.71982652 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

9 0.71586984 2165 andrew gelman stats-2014-01-09-San Fernando Valley cityscapes: An example of the benefits of fractal devastation?

10 0.71543753 1400 andrew gelman stats-2012-06-29-Decline Effect in Linguistics?

11 0.71472377 797 andrew gelman stats-2011-07-11-How do we evaluate a new and wacky claim?

12 0.71044809 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

13 0.70718008 1186 andrew gelman stats-2012-02-27-Confusion from illusory precision

14 0.70243508 2008 andrew gelman stats-2013-09-04-Does it matter that a sample is unrepresentative? It depends on the size of the treatment interactions

15 0.70116961 716 andrew gelman stats-2011-05-17-Is the internet causing half the rapes in Norway? I wanna see the scatterplot.

16 0.69968718 2193 andrew gelman stats-2014-01-31-Into the thicket of variation: More on the political orientations of parents of sons and daughters, and a return to the tradeoff between internal and external validity in design and interpretation of research studies

17 0.69565976 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

18 0.69140589 86 andrew gelman stats-2010-06-14-“Too much data”?

19 0.68639582 2227 andrew gelman stats-2014-02-27-“What Can we Learn from the Many Labs Replication Project?”

20 0.67512321 1929 andrew gelman stats-2013-07-07-Stereotype threat!


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.017), (16, 0.057), (21, 0.023), (23, 0.014), (24, 0.149), (50, 0.014), (56, 0.016), (60, 0.013), (61, 0.012), (73, 0.229), (84, 0.016), (98, 0.015), (99, 0.317)]
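An LDA weight vector like the one above reads as a mixture over topics: a document's word distribution is the topic-weighted average of per-topic word distributions. A sketch using a subset of the weights listed above and invented per-topic word probabilities:

```python
# How to read an LDA weight vector: the document is a mixture over
# topics. The weights below are a subset of those listed above; the
# per-topic word probabilities are invented for illustration.
topic_weights = {24: 0.149, 73: 0.229, 99: 0.317}

# Hypothetical probability of one word, say "treatment", under each topic.
p_word_given_topic = {24: 0.002, 73: 0.04, 99: 0.01}

# Marginal probability of the word under this (partial) mixture;
# with the full topic vector, the weights would sum to ~1.
p_word = sum(w * p_word_given_topic[t] for t, w in topic_weights.items())
print(p_word)
```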

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98842758 655 andrew gelman stats-2011-04-10-“Versatile, affordable chicken has grown in popularity”

Introduction: Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before , lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative experiment whose results stu

2 0.98674893 1925 andrew gelman stats-2013-07-04-“Versatile, affordable chicken has grown in popularity”

Introduction: From two years ago : Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before , lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative exp

3 0.95695245 917 andrew gelman stats-2011-09-20-Last post on Hipmunk

Introduction: There was some confusion on my last try , so let me explain one more time . . . The flights where Hipmunk failed (see here for background) were not obscure itineraries. One of them was a nonstop from New York to Cincinnati; another was from NY to Durham, North Carolina; and yet another was a trip to Midway in Chicago. In that last case, Hipmunk showed no nonstops at all—which will come as a surprise to the passengers on the Southwest Airlines flight I was on a couple days ago! In these cases, Hipmunk didn’t even do the courtesy of flashing a message telling me to try elsewhere. I don’t understand. How hard would it be for the program to automatically do a Kayak search and find all the flights? Hipmunk’s graphics are great, though. Lee Wilkinson reports: Check out the figure below from The Grammar of Graphics. Dan Rope invented this graphic and programmed it in Java in the late 1990′s. We shopped this graph around to Orbitz and Expedia but they weren’t interested. So I

4 0.95391738 1748 andrew gelman stats-2013-03-04-PyStan!

Introduction: Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely free and open-source. P.S. Because Stan is open-source, it of course would also be possible for people to translate Stan into Python, or to take whatever features they like from Stan and incorporate them into a Python package. That’s fine too. But we think it would make sense in addition for users to be able to run Stan directly from Python, in the same way that it can be run from R.

5 0.94899338 1099 andrew gelman stats-2012-01-05-Approaching harmonic convergence

Introduction: Check out comment #9 here . All we need is for Steven Levitt, David Runciman, and some Reader in Management somewhere to weigh in and we’ll be all set.

6 0.94854259 794 andrew gelman stats-2011-07-09-The quest for the holy graph

7 0.94843274 2238 andrew gelman stats-2014-03-09-Hipmunk worked

same-blog 8 0.94727051 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

9 0.94457781 497 andrew gelman stats-2011-01-02-Hipmunk update

10 0.93210548 280 andrew gelman stats-2010-09-16-Meet Hipmunk, a really cool flight-finder that doesn’t actually work

11 0.90954643 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

12 0.9008466 161 andrew gelman stats-2010-07-24-Differences in color perception by sex, also the Bechdel test for women in movies

13 0.89371711 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly

14 0.88962156 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?

15 0.88644296 2325 andrew gelman stats-2014-05-07-Stan users meetup next week

16 0.8830837 1846 andrew gelman stats-2013-05-07-Like Casper the ghost, Niall Ferguson is not only white. He is also very, very adorable.

17 0.87959969 573 andrew gelman stats-2011-02-14-Hipmunk < Expedia, again

18 0.87597305 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?

19 0.87533152 931 andrew gelman stats-2011-09-29-Hamiltonian Monte Carlo stories

20 0.87141746 496 andrew gelman stats-2011-01-01-Tukey’s philosophy