andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-972 knowledge-graph by maker-knowledge-mining

972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


meta info for this blog

Source: html

Introduction: David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between freshman and veteran members of Congress. Later I learned that such tests apply only to samples because their purpose is to tell you whether the difference in the observed sample is likely to exist in the population. If you have data for the whole population, like all members of the 103rd House of Representatives, you do not need a test to discern the true difference in the population. Sometimes researchers assume some sort of superpopulation like “all possible Congresses” or “Congresses across all time” and that the members of any given Congress constitute a sample. In my current work in education research, it is sometimes asserted that students at a particular school or set of schools are a sample of the population of all students at similar schools nationwide. …


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? [sent-1, score-0.765]

2 I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between freshman and veteran members of Congress. [sent-2, score-0.593]

3 Later I learned that such tests apply only to samples because their purpose is to tell you whether the difference in the observed sample is likely to exist in the population. [sent-3, score-0.323]

4 If you have data for the whole population, like all members of the 103rd House of Representatives, you do not need a test to discern the true difference in the population. [sent-4, score-0.471]

5 Sometimes researchers assume some sort of superpopulation like “all possible Congresses” or “Congresses across all time” and that the members of any given Congress constitute a sample. [sent-5, score-0.347]

6 In my current work in education research, it is sometimes asserted that students at a particular school or set of schools are a sample of the population of all students at similar schools nationwide. [sent-6, score-0.981]

7 But even if such a population existed, it is not credible that the observed population is a representative sample of the larger superpopulation. [sent-7, score-1.222]

8 Can you suggest resources that might convincingly explain why hypothesis tests are inappropriate for population data? [sent-8, score-0.769]

9 To keep things simple, I will consider estimates and standard errors. [sent-11, score-0.279]

10 Sometimes we can all agree that if you have a whole population, your standard error is zero. [sent-12, score-0.426]

11 This is basic finite population inference from survey sampling theory, if your goal is to estimate the population average or total. [sent-13, score-1.102]
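The finite-population point in sentence 11 can be made concrete with a toy calculation (a sketch with invented data, not from the post): the textbook standard error of a mean picks up a finite-population correction factor sqrt(1 - n/N), which is exactly zero once the sample is the whole population.

```python
import math

def mean_se(values, pop_size=None):
    """Standard error of a sample mean, with an optional finite-population
    correction when the population size N is known (toy sketch)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance
    se = math.sqrt(var / n)
    if pop_size is not None:
        # fpc shrinks the SE; it hits exactly zero when n == N
        se *= math.sqrt(1 - n / pop_size)
    return se

data = [3.1, 2.8, 3.5, 3.0, 2.9]
print(mean_se(data))              # superpopulation view: SE > 0
print(mean_se(data, pop_size=5))  # whole population observed: SE = 0.0
```

This is the survey-sampling sense in which "your standard error is zero" when you observe everyone, if the estimand is the population average or total.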

12 (And the comparison between freshman and veteran members of Congress, at the very beginning of the above question, is a special case of a regression on an indicator variable.) [sent-15, score-0.612]

13 You have the whole population (all the congressmembers, all 50 states, whatever); you run a regression and you get a standard error. [sent-16, score-0.386]

14 Maybe the estimated coefficient is only 1 standard error from 0, so it’s not “statistically significant.” [sent-17, score-0.295]
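Sentences 12–14 describe a regression on an indicator variable. A minimal hand-rolled sketch (toy numbers, not the actual roll-call data) shows that the coefficient is just the difference in group means, and that OLS reports a standard error for it whether or not the rows are a sample:

```python
import math

def indicator_ols(group0, group1):
    """OLS of y on a 0/1 indicator: the slope is the difference in group
    means; the SE is the conventional pooled-variance formula."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    slope = m1 - m0  # coefficient on the indicator
    rss = sum((y - m0) ** 2 for y in group0) + sum((y - m1) ** 2 for y in group1)
    s2 = rss / (n0 + n1 - 2)  # residual variance, df = n0 + n1 - 2
    se = math.sqrt(s2 * (1 / n0 + 1 / n1))
    return slope, se

# invented vote shares, standing in for veteran vs. freshman members
veteran = [0.62, 0.55, 0.71, 0.60]
freshman = [0.50, 0.58, 0.47]
slope, se = indicator_ols(veteran, freshman)
print(slope, se, slope / se)  # the "t-ratio" software would report
```

The software computes this SE mechanically; the interpretive question in the post is what replication that SE is supposed to describe.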

15 You can still consider the cases in which the regression will be used for prediction. [sent-19, score-0.196]

16 We had data from the entire population of congressional elections in each year, but we got our standard error not from the variation between districts but rather from the unexplained year-to-year variation of elections within districts. [sent-22, score-1.442]

17 To put it another way, we would’ve got the wrong answer if we had tried to get uncertainties for our estimates by “bootstrapping” the 435 congressional elections. [sent-23, score-0.175]

18 We wanted inferences for these 435 under hypothetical alternative conditions, not inference for the entire population or for another sample of 435. [sent-24, score-0.997]
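The contrast in sentences 16–18 can be simulated (all numbers invented; this is a sketch of the two replication schemes, not the authors' actual model): resampling the 435 districts answers "what if we had drawn a different 435?", while perturbing the same 435 by unexplained year-to-year variation answers "what might these districts do under alternative conditions?"

```python
import random

random.seed(1)

# 435 district-level outcomes standing in for the whole population of
# congressional elections in one year (all numbers invented)
districts = [random.gauss(0.55, 0.08) for _ in range(435)]

def sd(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

def bootstrap_se(data, reps=1000):
    """SE from resampling the units: 'what if we had a different 435?'"""
    n = len(data)
    return sd([sum(random.choice(data) for _ in range(n)) / n for _ in range(reps)])

def year_to_year_se(data, sigma_year=0.06, reps=1000):
    """SE from perturbing the same 435 by unexplained yearly variation
    (sigma_year is an assumed residual sd, purely illustrative)."""
    n = len(data)
    return sd([sum(d + random.gauss(0, sigma_year) for d in data) / n
               for _ in range(reps)])

print(bootstrap_se(districts))     # between-district uncertainty
print(year_to_year_se(districts))  # within-district, year-to-year uncertainty
```

The two numbers generally differ because they describe different hypothetical replications, which is the point of the excerpt.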

19 (We did make population inferences, but that was to estimate the hyperparameters that governed our inferences over individual district outcomes under hypothetical national swings.) [sent-25, score-0.87]

20 It’s sort of like the WWJD principle in causal inference: if you think seriously about your replications (for the goal of getting the right standard error), you might well get a better understanding of what you’re trying to do with your model. [sent-27, score-0.363]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('population', 0.473), ('congresses', 0.197), ('veteran', 0.197), ('members', 0.188), ('radwin', 0.18), ('standard', 0.169), ('freshman', 0.141), ('whole', 0.131), ('inferences', 0.131), ('error', 0.126), ('replications', 0.124), ('sample', 0.121), ('tests', 0.118), ('sometimes', 0.117), ('consider', 0.11), ('hypothesis', 0.109), ('congressional', 0.107), ('states', 0.106), ('hypothetical', 0.105), ('congress', 0.102), ('elections', 0.091), ('schools', 0.09), ('asserted', 0.09), ('foxhole', 0.09), ('regression', 0.086), ('inference', 0.086), ('governed', 0.085), ('discern', 0.085), ('reread', 0.085), ('superpopulation', 0.085), ('observed', 0.084), ('entire', 0.081), ('wwjd', 0.081), ('formalizing', 0.081), ('variation', 0.081), ('congressmembers', 0.078), ('hyperparameters', 0.076), ('unexplained', 0.076), ('intuitions', 0.076), ('constitute', 0.074), ('representatives', 0.072), ('credible', 0.071), ('bootstrapping', 0.071), ('goal', 0.07), ('convincingly', 0.069), ('universe', 0.068), ('uncertainties', 0.068), ('test', 0.067), ('districts', 0.066), ('requests', 0.065)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


2 0.18263589 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

Introduction: Lee Mobley writes: I recently read what you posted on your blog How does statistical analysis differ when analyzing the entire population rather than a sample? What you said in the blog accords with my training in econometrics. However I am concerned about a new wrinkle on this problem that derives from multilevel modeling. We are analyzing multilevel models of the probability of using cancer screening for the entire Medicare population. I argue that every state has different systems in place (politics, cancer control efforts, culture, insurance regulations, etc) so that essentially a different probability generating mechanism is in place for each state. Thus I estimate 50 separate regressions for the populations in each state, and then note and map the variability in the effect estimates (slope parameters) for each covariate. Reviewers argue that I should be using random slopes modeling, pooling all individuals in all states together. I am familiar with this approach

3 0.16302589 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

Introduction: Let’s say you are repeatedly going to receive unselected sets of well done RCTs on, say, various medical treatments. One reasonable assumption with all of these treatments is that they are monotonic – either helpful or harmful for all. The treatment effect will (as always) vary for subgroups in the population – these will not be explicitly identified in the studies – but each study very likely will enroll different percentages of the various patient subgroups. Being all randomized studies, these subgroups will be balanced in the treatment versus control arms – but each study will (as always) be estimating a different – but exchangeable – treatment effect (exchangeable due to the ignorance about the subgroup memberships of the enrolled patients). That reasonable assumption – monotonicity – will be to some extent (as always) wrong, but given that it is a risk believed well worth taking – if the average effect in any population is positive (versus negative) the average effect in any other

4 0.16024143 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?

Introduction: Felipe Nunes writes: I have many friends working with data that they claim to be considered as a ‘population’. For example, the universe of bills presented in a Congress, the roll call votes of all deputies in a legislature, a survey of all deputies in a country, the outcomes of an election, or the set of electoral institutions around the world. Because of the nature of these data, we do not know how to interpret the p-value. I have seen many arguments being made, but I have never seen a formal response to the question. So I don’t know what to say. The most common arguments among the community of young researchers in Brazil are: (1) don’t interpret the p-value when you have a population, but don’t infer anything either; (2) interpret the p-value because of measurement error, which is also present; (3) there is no such thing as a population, so always look at p-values; (4) don’t worry about the p-value, interpret the coefficients substantively; and (5) if you are frequentist you interpret p-

5 0.15887344 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

Introduction: Suguru Mizunoya writes: When we estimate the number of people from a national sampling survey (such as a labor force survey) using sampling weights, don’t we obtain an underestimated number of people if the country’s population is growing and the sampling frame is based on old census data? In countries with increasing populations, the probability of inclusion changes over time, but the weights can’t be adjusted frequently because a census takes place only once every five or ten years. I am currently working for UNICEF on a project on estimating the number of out-of-school children in developing countries. The project leader is comfortable using estimates of the number of people from DHS and other surveys. But I am concerned that we may need to adjust the estimated number of people by the population projection; otherwise the estimates will be too low. I googled around on this issue, but I could not find the right article or paper on this. My reply: I don’t know if there’s a pa
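The concern in the excerpt above can be sketched as a crude ratio adjustment (all numbers invented for illustration): if the weights are calibrated to an out-of-date census total, rescale the weighted estimate by a current population projection.

```python
# Hedged sketch: survey weights calibrated to an old census undercount a
# grown population; a crude fix rescales the weighted total by a
# population projection.  All numbers are invented.
census_total = 10_000_000        # frame total at the last census
projected_total = 11_200_000     # current projection for the same country
weighted_estimate = 1_500_000    # e.g. out-of-school children from survey weights

adjusted = weighted_estimate * projected_total / census_total
print(adjusted)  # inflates the estimate in proportion to population growth
```

This ratio fix assumes growth is uniform across the strata of interest; unequal growth would call for stratum-level projections instead.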

6 0.15683073 2155 andrew gelman stats-2013-12-31-No on Yes-No decisions

7 0.15517563 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

8 0.149884 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

9 0.14872697 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

10 0.14659004 86 andrew gelman stats-2010-06-14-“Too much data”?

11 0.14194646 1352 andrew gelman stats-2012-05-29-Question 19 of my final exam for Design and Analysis of Sample Surveys

12 0.14182201 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model

13 0.13932472 472 andrew gelman stats-2010-12-17-So-called fixed and random effects

14 0.13797922 1289 andrew gelman stats-2012-04-29-We go to war with the data we have, not the data we want

15 0.13528271 1605 andrew gelman stats-2012-12-04-Write This Book

16 0.13337478 1149 andrew gelman stats-2012-02-01-Philosophy of Bayesian statistics: my reactions to Cox and Mayo

17 0.13185199 2351 andrew gelman stats-2014-05-28-Bayesian nonparametric weighted sampling inference

18 0.12689057 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

19 0.1262674 2008 andrew gelman stats-2013-09-04-Does it matter that a sample is unrepresentative? It depends on the size of the treatment interactions

20 0.12533712 56 andrew gelman stats-2010-05-28-Another argument in favor of expressing conditional probability statements using the population distribution


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.236), (1, 0.062), (2, 0.131), (3, -0.109), (4, 0.045), (5, 0.057), (6, -0.036), (7, 0.056), (8, 0.04), (9, -0.075), (10, -0.001), (11, -0.018), (12, 0.002), (13, -0.015), (14, -0.015), (15, -0.03), (16, -0.053), (17, -0.026), (18, 0.008), (19, 0.028), (20, -0.003), (21, -0.014), (22, 0.008), (23, 0.027), (24, 0.009), (25, -0.045), (26, -0.022), (27, 0.013), (28, -0.008), (29, 0.098), (30, 0.034), (31, -0.067), (32, 0.005), (33, 0.035), (34, -0.036), (35, 0.029), (36, -0.019), (37, -0.052), (38, 0.006), (39, 0.048), (40, -0.009), (41, -0.052), (42, -0.041), (43, -0.022), (44, -0.029), (45, 0.016), (46, -0.015), (47, -0.027), (48, -0.001), (49, -0.005)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97922832 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


2 0.77830797 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

Introduction: I’m involved (with Irv Garfinkel and others) in a planned survey of New York City residents. It’s hard to reach people in the city–not everyone will answer their mail or phone, and you can’t send an interviewer door-to-door in a locked apartment building. (I think it violates IRB to have a plan of pushing all the buzzers by the entrance and hoping someone will let you in.) So the plan is to use multiple modes, including phone, in-person household, random street intercepts, and mail. The question then is how to combine these samples. My suggested approach is to divide the population into poststrata based on various factors (age, ethnicity, family type, housing type, etc.), then to pool responses within each poststratum, then to run some regressions including poststrata and also indicators for mode, to understand how respondents from different modes differ, after controlling for the demographic/geographic adjustments. Maybe this has already been done and written up somewhere? P.
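The pooling step described above can be sketched in a few lines (a hedged toy; the strata, shares, and responses are invented): group respondents from all modes into poststrata, average within each stratum, then weight the strata by known population shares.

```python
from collections import defaultdict

# Toy poststratification: pool responses across survey modes within each
# poststratum, then weight strata by population shares (invented numbers).
respondents = [
    # (poststratum, mode, response)
    ("young_renter", "phone",  4.0),
    ("young_renter", "mail",   3.0),
    ("older_owner",  "phone",  2.0),
    ("older_owner",  "street", 2.5),
]
pop_share = {"young_renter": 0.6, "older_owner": 0.4}  # from census, say

cells = defaultdict(list)
for stratum, mode, y in respondents:
    cells[stratum].append(y)

estimate = sum(share * (sum(cells[s]) / len(cells[s]))
               for s, share in pop_share.items())
print(estimate)  # population-weighted mean across poststrata
```

The suggested regressions with mode indicators would refine this by adjusting for mode effects within strata rather than pooling them blindly.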

3 0.73431677 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

Introduction: Rhoderick Machekano writes: I have a design question which has been bothering me and wonder if you can clear it up for me. In my line of work, we often conveniently select health centers and from those sample patients. When I am doing sample size estimation under this design, do I account for the design effect, since I expect outcomes in patients from the same health center to be correlated? Given that I didn’t randomly sample the health facilities, is my only limitation that I cannot generalize the results and make group-level comparisons in the analysis? My response: You can generalize the results even if you didn’t randomly sample the health facilities. The only thing is that your generalization applies to the implicit population of facilities to which your sample is representative. You could try to move further on this by considering facility-level predictors. Regarding sample size estimation, see chapter 20.
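The design-effect adjustment asked about above is commonly approximated with the Kish formula deff = 1 + (m - 1) * ICC, where m is the cluster size and ICC the within-cluster correlation. A small sketch (the cluster size and ICC values are invented):

```python
def design_effect(cluster_size, icc):
    """Kish approximation to the variance inflation from sampling whole
    clusters (e.g. health centers) instead of independent patients."""
    return 1 + (cluster_size - 1) * icc

def clustered_n(n_srs, cluster_size, icc):
    """Sample size needed under clustering to match a simple random
    sample of size n_srs."""
    return n_srs * design_effect(cluster_size, icc)

# e.g. 20 patients per facility, within-facility correlation 0.05
print(design_effect(20, 0.05))     # roughly 1.95
print(clustered_n(300, 20, 0.05))  # roughly 585 patients
```

Even modest within-facility correlation nearly doubles the required sample here, which is why the design effect matters at the planning stage.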

4 0.72473574 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

Introduction: John Pugliese writes: I was recently in a conversation with some colleagues regarding the evaluation of recent welfare reform in California. The discussion centered around what types of design might allow us to understand the impact of the changes. Experimental designs were out, as random assignment is not feasible. Our data are pre/post, and some of my colleagues believed that the best we can do under these circumstances was a descriptive study; i.e., no causal inference. All of us were concerned about economic and population changes over the pre-to-post period; i.e., over-estimating the effects in an improving economy. I thought a quasi-experimental design was possible using MLM. Briefly, my suggestion was the following: Match our post-participants to a set of pre-participants on relevant person-level factors, and treat the pre/post differences as a random effect at the county level. Next, we would adjust the pre/post differences by changes in economic and populati

5 0.72275507 1289 andrew gelman stats-2012-04-29-We go to war with the data we have, not the data we want

Introduction: This post is by Phil. Psychologists perform experiments on Canadian undergraduate psychology students and draw conclusions that (they believe) apply to humans in general; they publish in Science. A drug company decides to embark on additional trials that will cost tens of millions of dollars based on the results of a careful double-blind study…whose patients are all volunteers from two hospitals. A movie studio holds 9 screenings of a new movie for volunteer viewers and, based on their survey responses, decides to spend another $8 million to re-shoot the ending. A researcher interested in the effect of ventilation on worker performance conducts a months-long study in which ventilation levels are varied and worker performance is monitored…in a single building. In almost all fields of research, most studies are based on convenience samples, or on random samples from a larger population that is itself a convenience sample. The paragraph above gives just a few examples. The benefit

6 0.72267413 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

7 0.72138447 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2

8 0.72086853 70 andrew gelman stats-2010-06-07-Mister P goes on a date

9 0.71977293 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

10 0.71903211 213 andrew gelman stats-2010-08-17-Matching at two levels

11 0.70722347 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

12 0.70549601 212 andrew gelman stats-2010-08-17-Futures contracts, Granger causality, and my preference for estimation to testing

13 0.69447196 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

14 0.69409007 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

15 0.6939711 1691 andrew gelman stats-2013-01-25-Extreem p-values!

16 0.69250286 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

17 0.69067264 2180 andrew gelman stats-2014-01-21-Everything I need to know about Bayesian statistics, I learned in eight schools.

18 0.68992865 1746 andrew gelman stats-2013-03-02-Fishing for cherries

19 0.68955147 85 andrew gelman stats-2010-06-14-Prior distribution for design effects

20 0.68456429 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(4, 0.013), (9, 0.019), (15, 0.014), (16, 0.05), (19, 0.016), (21, 0.016), (24, 0.156), (36, 0.032), (63, 0.015), (72, 0.013), (82, 0.019), (89, 0.031), (91, 0.048), (99, 0.402)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98927307 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


2 0.98441893 2251 andrew gelman stats-2014-03-17-In the best alternative histories, the real world is what’s ultimately real

Introduction: This amusing-yet-so-true video directed by Eléonore Pourriat shows a sex-role-reversed world where women are in charge and men don’t get taken seriously. It’s convincing and affecting, but the twist that interests me comes at the end, when the real world returns. It’s really creepy. And this in turn reminds me of something we discussed here several years ago, the idea that alternative histories are made particularly compelling when they are grounded in the fact that the alternate world is not the real world. Pourriat’s video would have been excellent even without its final scene, but that scene drives the point home in a way that I don’t think would’ve been possible had the video stayed entirely within its artificial world. The point here is that the real world is indeed what is real. This alternative sex-role-reversed world is not actually possible, and what makes it interesting to think about is the contrast to what really is. If you set up an alternative history but you do

3 0.98425168 1596 andrew gelman stats-2012-11-29-More consulting experiences, this time in computational linguistics

Introduction: Bob wrote this long comment that I think is worth posting: I [Bob] have done a fair bit of consulting for my small natural language processing company over the past ten years. Like statistics, natural language processing is something many companies think they want, but have no idea how to do themselves. We almost always handed out “free” consulting. Usually on the phone to people who called us out of the blue. Our blog and tutorials’ Google ranking was pretty much our only approach to marketing, other than occasionally going to business-oriented conferences. Our goal was to sell software licenses (because consulting doesn’t scale, nor does it provide continuing royalty income), but since so few people knew how to use toolkits like ours, we had to help them along the way. We even provided “free” consulting with our startup license package. We were brutally honest with customers, both about our goals and their goals. Their goals were often incompatible with ours (use company X’

4 0.98419523 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

Introduction: James O’Brien writes: How would you explain, to a “classically-trained” hypothesis-tester, that “It’s OK to fit a multilevel model even if some groups have only one observation each”? I [O'Brien] think I understand the logic and the statistical principles at work in this, but I’m having trouble being clear and persuasive. I also feel like I’m contending with some methodological conventional wisdom here. My reply: I’m so used to this idea that I find it difficult to defend it in some sort of general conceptual way. So let me retreat to a more functional defense, which is that multilevel modeling gives good estimates, especially when the number of observations per group is small. One way to see this in any particular example is through cross-validation. Another way is to consider the alternatives. If you try really hard you can come up with a “classical hypothesis testing” approach which will do as well as the multilevel model. It would just take a lot of work. I’d r
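The "good estimates even with one observation per group" claim above can be sketched with a toy partial-pooling calculation (variances are assumed known here to keep the algebra visible; real multilevel software estimates them, and all numbers are invented):

```python
# Toy partial pooling: each group mean is shrunk toward the grand mean,
# with more shrinkage for groups with less data.  Groups "b" and "c"
# have a single observation each.
groups = {"a": [5.0, 4.0, 6.0], "b": [9.0], "c": [1.0]}
sigma2_y = 4.0  # assumed within-group variance
tau2 = 2.0      # assumed between-group variance

all_obs = [y for ys in groups.values() for y in ys]
grand = sum(all_obs) / len(all_obs)

partial_pooled = {}
for g, ys in groups.items():
    n = len(ys)
    ybar = sum(ys) / n
    # precision-weighted compromise between the group mean and grand mean
    w = (n / sigma2_y) / (n / sigma2_y + 1 / tau2)
    partial_pooled[g] = w * ybar + (1 - w) * grand

print(partial_pooled)  # singletons are pulled strongly toward the grand mean
```

The singleton groups get sensible, stabilized estimates instead of their lone noisy observation, which is the functional defense in the reply.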

5 0.98416167 695 andrew gelman stats-2011-05-04-Statistics ethics question

Introduction: A graduate student in public health writes: I have been asked to do the statistical analysis for a medical unit that is delivering a pilot study of a program to [details redacted to prevent identification]. They are using a prospective, nonrandomized, cohort-controlled trial study design. The investigator thinks they can recruit only a small number of treatment and control cases, maybe less than 30 in total. After I told the investigator that I cannot do anything statistically with a sample size that small, he responded that small sample sizes are common in this field, and he sent me an example of an analysis that someone had done on a similar study. So he still wants me to come up with a statistical plan. Is it unethical for me to do anything other than descriptive statistics? I think he should just stick to qualitative research. But the study she mentions above has 40 subjects and apparently had enough power to detect some effects. This is a pilot study after all so the n does n

6 0.98391443 2115 andrew gelman stats-2013-11-27-Three unblinded mice

7 0.98363996 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

8 0.9831531 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

9 0.98309743 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

10 0.9826262 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic

11 0.98247379 2284 andrew gelman stats-2014-04-07-How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll on stories.

12 0.98227525 1149 andrew gelman stats-2012-02-01-Philosophy of Bayesian statistics: my reactions to Cox and Mayo

13 0.98209906 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

14 0.98183364 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

15 0.98178798 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

16 0.98177207 342 andrew gelman stats-2010-10-14-Trying to be precise about vagueness

17 0.9816277 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

18 0.98154104 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research

19 0.9815293 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum

20 0.98148239 48 andrew gelman stats-2010-05-23-The bane of many causes