
1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable


meta info for this blog

Source: html

Introduction: Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended using multilevel models to get inference for small cross-classifications or small areas. We’re working on it.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. [sent-1, score-2.51]

2 Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended using multilevel models to get inference for small cross-classifications or small areas. [sent-2, score-2.416]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('mess', 0.595), ('modeling', 0.306), ('transformations', 0.227), ('extended', 0.219), ('acknowledge', 0.198), ('nonetheless', 0.196), ('regression', 0.188), ('established', 0.187), ('comfortable', 0.186), ('arbitrary', 0.183), ('weighting', 0.181), ('small', 0.175), ('involving', 0.152), ('choices', 0.151), ('selection', 0.134), ('multilevel', 0.122), ('variable', 0.12), ('relevant', 0.107), ('survey', 0.103), ('inference', 0.093), ('standard', 0.092), ('working', 0.088), ('also', 0.077), ('perhaps', 0.074), ('models', 0.074), ('discussion', 0.072), ('using', 0.061), ('many', 0.054), ('re', 0.053), ('get', 0.041)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable


2 0.23706517 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

Introduction: From a comment I made in an email exchange: My work on survey adjustments has very much been inspired by the ideas of Rod Little. Much of my efforts have gone toward the goal of integrating hierarchical modeling (which is so helpful for small-area estimation) with post stratification (which adjusts for known differences between sample and population). In the surveys I’ve dealt with, nonresponse/nonavailability can be a big issue, and I’ve always tried to emphasize that (a) the probability of a person being included in the sample is just about never known, and (b) even if this probability were known, I’d rather know the empirical n/N than the probability p (which is only valid in expectation). Regarding nonparametric modeling: I haven’t done much of that (although I hope to at some point) but Rod and his students have. As I wrote in the first sentence of the above-linked paper, I do think the current theory and practice of survey weighting is a mess, in that much depends on so

3 0.17231528 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

Introduction: A couple years ago we had an amazing all-star session at the Joint Statistical Meetings. The topic was new approaches to survey weighting (which is a mess, as I’m sure you’ve heard). Xiao-Li Meng recommended shrinking weights by taking them to a fractional power (such as square root) instead of trimming the extremes. Rod Little combined design-based and model-based survey inference. Michael Elliott used mixture models for complex survey design. And here’s my introduction to the session.

4 0.15067622 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

Introduction: Vlad Kogan writes: I’ve been using your book on regression and multilevel modeling and have a quick R question for you. Do you happen to know if there is any R package that can estimate a two-stage (instrumental variable) multi-level model? My reply: I don’t know. I’ll post on the blog and maybe there will be a response. You could also try the R help list.

5 0.13756433 2117 andrew gelman stats-2013-11-29-The gradual transition to replicable science

Introduction: Somebody emailed me: I am a researcher at ** University and I have recently read your article on average predictive comparisons for statistical models published 2007 in the journal “Sociological Methodology”. Gelman, Andrew/Iain Pardoe. 2007. “Average Predictive Comparisons for Models with Nonlinearity, Interactions, and Variance Components”. Sociological Methodology 37: 23-51. Currently I am working with multilevel models and find your approach very interesting and useful. May I ask you whether replication materials (e.g. R Code) for this article are available? I had to reply: Hi—I’m embarrassed to say that our R files are a mess! I had ideas of programming the approach more generally as an R package but this has not happened yet.

6 0.13466907 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

7 0.1231882 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

8 0.11936954 1508 andrew gelman stats-2012-09-23-Speaking frankly

9 0.11853756 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

10 0.11786941 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

11 0.1144755 255 andrew gelman stats-2010-09-04-How does multilevel modeling affect the estimate of the grand mean?

12 0.11381061 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making

13 0.10787638 247 andrew gelman stats-2010-09-01-How does Bayes do it?

14 0.10519435 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

15 0.099821381 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

16 0.098155424 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

17 0.095223233 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

18 0.09482687 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

19 0.09362638 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

20 0.092181712 1472 andrew gelman stats-2012-08-28-Migrating from dot to underscore
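
The page does not say how the simValue column is computed. A common choice for tfidf vectors is cosine similarity, and the same-blog value of 1.0000001 would be consistent with that (a vector compared with itself, up to floating-point rounding). The following is a minimal sketch under that assumption, not the site's actual method; the function name `cosine_similarity` and the second post's weights are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# A few weights copied from the tfidf word list above (truncated for brevity).
this_post = {
    "mess": 0.595,
    "modeling": 0.306,
    "transformations": 0.227,
    "weighting": 0.181,
    "survey": 0.103,
}

# Hypothetical weights for another post; these numbers are made up for illustration.
other_post = {"survey": 0.4, "weighting": 0.35, "mess": 0.2, "hierarchical": 0.3}

print(cosine_similarity(this_post, this_post))   # a post compared with itself
print(cosine_similarity(this_post, other_post))  # a post compared with a different one
```

Under this assumption, posts that share heavily weighted terms such as "mess", "weighting" and "survey" would rank near the top of the list, which matches the survey-weighting posts appearing first.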


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.104), (1, 0.09), (2, 0.045), (3, -0.036), (4, 0.079), (5, 0.072), (6, -0.045), (7, -0.039), (8, 0.072), (9, 0.055), (10, 0.044), (11, -0.043), (12, 0.021), (13, 0.066), (14, 0.02), (15, 0.008), (16, -0.057), (17, -0.007), (18, 0.017), (19, 0.015), (20, -0.018), (21, 0.016), (22, 0.001), (23, 0.042), (24, -0.041), (25, -0.045), (26, 0.036), (27, -0.035), (28, -0.089), (29, -0.009), (30, 0.038), (31, 0.031), (32, -0.012), (33, 0.013), (34, -0.044), (35, -0.025), (36, 0.05), (37, 0.026), (38, -0.018), (39, 0.02), (40, -0.022), (41, 0.052), (42, 0.005), (43, -0.079), (44, 0.026), (45, 0.02), (46, -0.015), (47, 0.035), (48, -0.049), (49, -0.022)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98980874 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable


2 0.82628191 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

Introduction: Steve Miller writes: Much of what I do is cross-national analyses of survey data (largely World Values Survey). . . . My big question pertains to (what I would call) exploratory analysis of multilevel data, especially when the group-level predictors are of theoretical importance. A lot of what I do involves analyzing cross-national survey items of citizen attitudes, typically of political leadership. These survey items are usually yes/no responses, or four-part responses indicating a level of agreement (strongly agree, agree, disagree, strongly disagree) that can be condensed into a binary variable. I believe these can be explained by reference to country-level factors. Many of the group-level variables of interest are count variables with a modal value of 0, which can be quite messy. How would you recommend exploring the variation in the dependent variable as it could be explained by the group-level count variable of interest, before fitting the multilevel model itself? When

3 0.77833986 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

Introduction: I received the following question: Is there a classic paper on instrumenting for survey non-response? Some colleagues in public health are going to carry out a survey and I wonder about suggesting that they build in a randomization of response-encouragement (e.g. offering additional $ to a subset of those who don’t respond initially). Can you recommend a basic treatment of this, and why it might or might not make sense compared to IPW using covariates (without an instrument)? My reply: Here’s the best analysis I know of on the effects of incentives for survey response. There have been several survey-experiments on the subject. The short answer is that the effect on nonresponse is small and the outcome is highly variable, hence you can’t very well use it as an instrument in any particular survey. My recommended approach to dealing with nonresponse is to use multilevel regression and poststratification; an example is here. Inverse-probability weighting doesn’t really w

4 0.77178693 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

Introduction: Ryan Seals writes: I’m an epidemiologist at Emory University, and I’m working on a project of release patterns in jails (basically trying to model how long individuals are in jail before they’re released, for purposes of designing short-term health interventions, i.e. HIV testing, drug counseling, etc…). The question lends itself to quantile regression; we’re interested in the # of days it takes for 50% and 75% of inmates to be released. But being a clustered/nested data structure, it also obviously lends itself to multilevel modeling, with the group-level being individual jails. So: do you know of any work on multilevel quantile regression? My quick lit search didn’t yield much, and I don’t see any preprogrammed way to do it in SAS. My reply: To start with, I’m putting in the R keyword here, on the hope that some readers might be able to refer you to an R function that does what you want. Beyond this, I think it should be possible to program something in Bugs. In ARM we hav

5 0.74258852 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

Introduction: Vlad Kogan writes: I’ve been using your book on regression and multilevel modeling and have a quick R question for you. Do you happen to know if there is any R package that can estimate a two-stage (instrumental variable) multi-level model? My reply: I don’t know. I’ll post on the blog and maybe there will be a response. You could also try the R help list.

6 0.71584111 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression

7 0.67882049 704 andrew gelman stats-2011-05-10-Multiple imputation and multilevel analysis

8 0.67414695 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

9 0.66232097 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

10 0.6549443 796 andrew gelman stats-2011-07-10-Matching and regression: two great tastes etc etc

11 0.65425372 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c

12 0.65037048 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

13 0.64990699 1094 andrew gelman stats-2011-12-31-Using factor analysis or principal components analysis or measurement-error models for biological measurements in archaeology?

14 0.64689952 10 andrew gelman stats-2010-04-29-Alternatives to regression for social science predictions

15 0.64509147 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

16 0.64354968 255 andrew gelman stats-2010-09-04-How does multilevel modeling affect the estimate of the grand mean?

17 0.64144027 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models

18 0.63530719 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models

19 0.63156229 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

20 0.62570506 948 andrew gelman stats-2011-10-10-Combining data from many sources


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.07), (7, 0.183), (21, 0.029), (24, 0.115), (40, 0.043), (65, 0.038), (86, 0.035), (99, 0.32)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94529742 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable


2 0.93290013 1592 andrew gelman stats-2012-11-27-Art-math

Introduction: This seems like the sort of thing I would like: Drawing from My Mind’s Eye: Dorothea Rockburne in Conversation with David Cohen Introduced by Nina Samuel Thursday, November 29 6 pm BGC, 38 West 86th Street Benoît Mandelbrot, unusual among mathematicians of the twentieth century, harnessed the power of visual images to express his theories and to pursue new lines of thought. In this conversation artist Dorothea Rockburne will share memories of studying with mathematician Max Dehn at Black Mountain College, of meeting Mandelbrot, and discuss her recent work. Exhibition curator Nina Samuel will discuss the related exhibition “The Islands of Benoît Mandelbrot: Fractals, Chaos, and the Materiality of Thinking,” on view in the BGC Focus Gallery through January 27, 2013. David Cohen is editor and publisher of artcritical.com as well as founder and moderator of The Review Panel. Dorothea Rockburne is a distinguished artist whose work has been inspired by her lifelong st

3 0.92556965 1699 andrew gelman stats-2013-01-31-Fowlerpalooza!

Introduction: Russ Lyons points us to a discussion in Statistics in Medicine of the famous claims by Christakis and Fowler on the contagion of obesity etc. James O’Malley and Christakis and Fowler present the positive case. Andrew Thomas and Tyler VanderWeele present constructive criticism. Christakis and Fowler reply. Coincidentally, a couple weeks ago an epidemiologist was explaining to me the differences between the Framingham Heart Study and the Nurses Health Study and why Framingham got the postmenopausal supplement risks right while Nurses got it wrong. P.S. The journal issue also includes a comment on “A distribution-free test of constant mean in linear mixed effects models.” Wow! I had no idea people still did this sort of thing. How horrible. But I guess that’s what half-life is all about. These ideas last forever, they just become less and less relevant to people.

4 0.92052948 1525 andrew gelman stats-2012-10-08-Ethical standards in different data communities

Introduction: I opened the paper today and saw this from Paul Krugman, on Jack Welch, the former chairman of General Electric, who posted an assertion on Twitter that the [recent unemployment data] had been cooked to help President Obama’s re-election campaign. His claim was quickly picked up by right-wing pundits and media personalities. It was nonsense, of course. Job numbers are prepared by professional civil servants, at an agency that currently has no political appointees. But then maybe Mr. Welch — under whose leadership G.E. reported remarkably smooth earnings growth, with none of the short-term fluctuations you might have expected (fluctuations that reappeared under his successor) — doesn’t know how hard it would be to cook the jobs data. I was curious so I googled *general electric historical earnings*. It was surprisingly difficult to find the numbers! Most of the links just went back to 2011, or to 2008. Eventually I came across this blog by Barry Ritholtz that showed this

5 0.91829908 1603 andrew gelman stats-2012-12-03-Somebody listened to me!

Introduction: Several months ago, I wrote: One challenge, though, is that uncovering the problem [of scientific fraud] and forcing the retraction is a near-thankless job. That’s one reason I don’t mind if Uri Simonsohn is treated as some sort of hero or superstar for uncovering multiple cases of research fraud. Some people might feel there’s something unseemly about Simonsohn doing this . . . OK, fine, but let’s talk incentives. If retractions are a good thing, and fraudsters and plagiarists are not generally going to retract on their own, then somebody’s going to have to do the hard work of discovering, exposing, and confronting scholarly misconduct. If these discoverers, exposers, and confronters are going to be attacked back by their targets (which would be natural enough) and they’re going to be attacked by the fraudsters’ friends and colleagues (also natural) and even have their work disparaged by outsiders who think they’re going too far, then, hey, they need some incentives in the othe

6 0.91799527 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

7 0.91556752 289 andrew gelman stats-2010-09-21-“How segregated is your city?”: A story of why every graph, no matter how clear it seems to be, needs a caption to anchor the reader in some numbers

8 0.91217726 721 andrew gelman stats-2011-05-20-Non-statistical thinking in the US foreign policy establishment

9 0.90750921 402 andrew gelman stats-2010-11-09-Kaggle: forecasting competitions in the classroom

10 0.88641298 975 andrew gelman stats-2011-10-27-Caffeine keeps your Mac awake

11 0.88612938 2165 andrew gelman stats-2014-01-09-San Fernando Valley cityscapes: An example of the benefits of fractal devastation?

12 0.88154817 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness

13 0.87891692 2230 andrew gelman stats-2014-03-02-What is it with Americans in Olympic ski teams from tropical countries?

14 0.87584752 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers

15 0.87414974 2027 andrew gelman stats-2013-09-17-Christian Robert on the Jeffreys-Lindley paradox; more generally, it’s good news when philosophical arguments can be transformed into technical modeling issues

16 0.87183237 1415 andrew gelman stats-2012-07-13-Retractions, retractions: “left-wing enough to not care about truth if it confirms their social theories, right-wing enough to not care as long as they’re getting paid enough”

17 0.87174654 277 andrew gelman stats-2010-09-14-In an introductory course, when does learning occur?

18 0.8704102 1752 andrew gelman stats-2013-03-06-Online Education and Jazz

19 0.86833614 1976 andrew gelman stats-2013-08-10-The birthday problem

20 0.86736536 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science