705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting


meta info for this blog

Source: html

Introduction: A couple years ago we had an amazing all-star session at the Joint Statistical Meetings. The topic was new approaches to survey weighting (which is a mess, as I’m sure you’ve heard). Xiao-Li Meng recommended shrinking weights by taking them to a fractional power (such as square root) instead of trimming the extremes. Rod Little combined design-based and model-based survey inference. Michael Elliott used mixture models for complex survey design. And here’s my introduction to the session.
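To make the contrast between shrinking and trimming concrete, here is a minimal sketch. It is not Meng's actual procedure, which the post does not spell out. The function names, the example weights, and the trimming cap of 2.0 are all hypothetical, and the Kish effective sample size is added here only as a convenient way to compare how variable each set of weights is.

```python
def normalize(weights):
    """Rescale weights so that they average to 1."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


def shrink_by_power(weights, power=0.5):
    """Shrink weights by raising them to a fractional power (0.5 = square root)."""
    return normalize([w ** power for w in weights])


def trim(weights, cap):
    """Trim weights by capping every value above `cap` (a hypothetical cutoff)."""
    return normalize([min(w, cap) for w in weights])


def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Made-up raw weights with one extreme value.
raw = normalize([1.0, 1.0, 4.0, 9.0, 25.0])
shrunk = shrink_by_power(raw, power=0.5)
trimmed = trim(raw, cap=2.0)

for label, w in [("raw", raw), ("shrunk", shrunk), ("trimmed", trimmed)]:
    print(label, [round(x, 3) for x in w], "n_eff =", round(kish_effective_n(w), 2))
```

In this sketch the power transform pulls every weight toward the mean smoothly, so the ordering and relative spacing of the weights survive in compressed form, whereas trimming leaves most weights alone and changes only those above the cap.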


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A couple years ago we had an amazing all-star session at the Joint Statistical Meetings. [sent-1, score-0.745]

2 The topic was new approaches to survey weighting (which is a mess, as I’m sure you’ve heard). [sent-2, score-0.965]

3 Xiao-Li Meng recommended shrinking weights by taking them to a fractional power (such as square root) instead of trimming the extremes. [sent-3, score-1.519]

4 Rod Little combined design-based and model-based survey inference. [sent-4, score-0.452]

5 Michael Elliott used mixture models for complex survey design. [sent-5, score-0.695]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('session', 0.371), ('survey', 0.283), ('trimming', 0.254), ('shrinking', 0.243), ('elliott', 0.243), ('rod', 0.243), ('fractional', 0.222), ('root', 0.198), ('meng', 0.19), ('square', 0.186), ('mess', 0.182), ('introduction', 0.176), ('weights', 0.169), ('combined', 0.169), ('weighting', 0.166), ('mixture', 0.155), ('joint', 0.153), ('recommended', 0.149), ('amazing', 0.147), ('approaches', 0.132), ('complex', 0.125), ('michael', 0.12), ('heard', 0.114), ('power', 0.112), ('taking', 0.097), ('couple', 0.092), ('topic', 0.09), ('instead', 0.087), ('little', 0.079), ('ago', 0.077), ('models', 0.067), ('used', 0.065), ('sure', 0.064), ('years', 0.058), ('statistical', 0.052), ('new', 0.048), ('ve', 0.046)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

2 0.28709257 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

Introduction: From a comment I made in an email exchange: My work on survey adjustments has very much been inspired by the ideas of Rod Little. Much of my efforts have gone toward the goal of integrating hierarchical modeling (which is so helpful for small-area estimation) with post stratification (which adjusts for known differences between sample and population). In the surveys I’ve dealt with, nonresponse/nonavailability can be a big issue, and I’ve always tried to emphasize that (a) the probability of a person being included in the sample is just about never known, and (b) even if this probability were known, I’d rather know the empirical n/N than the probability p (which is only valid in expectation). Regarding nonparametric modeling: I haven’t done much of that (although I hope to at some point) but Rod and his students have. As I wrote in the first sentence of the above-linked paper, I do think the current theory and practice of survey weighting is a mess, in that much depends on so

3 0.22783509 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

Introduction: A couple years ago Rod Little was invited to write an article for the diamond jubilee of the Calcutta Statistical Association Bulletin. His article was published with discussions from Danny Pfeffermann, J. N. K. Rao, Don Rubin, and myself. Here it all is. I’ll paste my discussion below, but it’s worth reading the others’ perspectives too. Especially the part in Rod’s rejoinder where he points out a mistake I made. Survey weights, like sausage and legislation, are designed and best appreciated by those who are placed a respectable distance from their manufacture. For those of us working inside the factory, vigorous discussion of methods is appreciated. I enjoyed Rod Little’s review of the connections between modeling and survey weighting and have just a few comments. I like Little’s discussion of model-based shrinkage of post-stratum averages, which, as he notes, can be seen to correspond to shrinkage of weights. I would only add one thing to his formula at the end of his

4 0.17231528 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable

Introduction: Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended using multilevel models to get inference for small cross-classifications or small areas. We’re working on it.

5 0.15520585 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

Introduction: Since we’re on the topic of nonreplicable research . . . see here (link from here) for a story of a survey that’s so bad that the people who did it won’t say how they did it. I know too many cases where people screwed up in a survey when they were actually trying to get the right answer, for me to trust any report of a survey that doesn’t say what they did. I’m reminded of this survey which may well have been based on a sample of size 6 (again, the people who did it refused to release any description of methodology).

6 0.14854448 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

7 0.14386663 1736 andrew gelman stats-2013-02-24-Rcpp class in Sat 9 Mar in NYC

8 0.14165258 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

9 0.13389304 613 andrew gelman stats-2011-03-15-Gay-married state senator shot down gay marriage

10 0.12927391 1459 andrew gelman stats-2012-08-15-How I think about mixture models

11 0.12366356 1142 andrew gelman stats-2012-01-29-Difficulties with the 1-4-power transformation

12 0.12170991 1903 andrew gelman stats-2013-06-17-Weak identification provides partial information

13 0.10845664 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

14 0.10794364 259 andrew gelman stats-2010-09-06-Inbox zero. Really.

15 0.10761294 2351 andrew gelman stats-2014-05-28-Bayesian nonparametric weighted sampling inference

16 0.097092703 1399 andrew gelman stats-2012-06-28-Life imitates blog

17 0.091169156 1433 andrew gelman stats-2012-07-28-LOL without the CATS

18 0.09028016 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

19 0.085048862 1491 andrew gelman stats-2012-09-10-Update on Levitt paper on child car seats

20 0.082481056 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.081), (1, 0.019), (2, 0.034), (3, -0.012), (4, 0.043), (5, 0.063), (6, -0.009), (7, 0.018), (8, 0.038), (9, -0.058), (10, 0.052), (11, -0.104), (12, -0.012), (13, 0.09), (14, -0.019), (15, -0.021), (16, 0.004), (17, -0.006), (18, 0.04), (19, -0.004), (20, -0.047), (21, -0.024), (22, -0.04), (23, 0.056), (24, -0.059), (25, 0.044), (26, 0.002), (27, 0.007), (28, 0.054), (29, -0.003), (30, 0.016), (31, 0.051), (32, 0.013), (33, 0.018), (34, -0.045), (35, -0.036), (36, 0.038), (37, 0.002), (38, 0.006), (39, 0.003), (40, -0.012), (41, 0.101), (42, 0.097), (43, -0.021), (44, 0.021), (45, -0.007), (46, -0.008), (47, -0.032), (48, 0.004), (49, 0.011)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98625678 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

2 0.84826314 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

3 0.80108172 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

Introduction: From a comment I made in an email exchange: My work on survey adjustments has very much been inspired by the ideas of Rod Little. Much of my efforts have gone toward the goal of integrating hierarchical modeling (which is so helpful for small-area estimation) with post stratification (which adjusts for known differences between sample and population). In the surveys I’ve dealt with, nonresponse/nonavailability can be a big issue, and I’ve always tried to emphasize that (a) the probability of a person being included in the sample is just about never known, and (b) even if this probability were known, I’d rather know the empirical n/N than the probability p (which is only valid in expectation). Regarding nonparametric modeling: I haven’t done much of that (although I hope to at some point) but Rod and his students have. As I wrote in the first sentence of the above-linked paper, I do think the current theory and practice of survey weighting is a mess, in that much depends on so

4 0.79673046 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

Introduction: Since we’re on the topic of nonreplicable research . . . see here (link from here) for a story of a survey that’s so bad that the people who did it won’t say how they did it. I know too many cases where people screwed up in a survey when they were actually trying to get the right answer, for me to trust any report of a survey that doesn’t say what they did. I’m reminded of this survey which may well have been based on a sample of size 6 (again, the people who did it refused to release any description of methodology).

5 0.78280538 725 andrew gelman stats-2011-05-21-People kept emailing me this one so I think I have to blog something

Introduction: Here and here, for example. I just hope they’re using our survey methods and aren’t trying to contact the zombies face-to-face!

6 0.75917476 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

7 0.72969174 385 andrew gelman stats-2010-10-31-Wacky surveys where they don’t tell you the questions they asked

8 0.67882866 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick

9 0.67618603 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

10 0.67329609 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

11 0.67305362 1754 andrew gelman stats-2013-03-08-Cool GSS training video! And cumulative file 1972-2012!

12 0.66847074 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

13 0.66761732 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

14 0.65507579 1437 andrew gelman stats-2012-07-31-Paying survey respondents

15 0.64402795 958 andrew gelman stats-2011-10-14-The General Social Survey is a great resource

16 0.63615137 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

17 0.63505012 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

18 0.62804431 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

19 0.6258164 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

20 0.62537587 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.054), (7, 0.03), (9, 0.048), (17, 0.228), (21, 0.051), (24, 0.081), (39, 0.031), (80, 0.039), (82, 0.036), (86, 0.014), (96, 0.029), (99, 0.223)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.90550268 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

2 0.8402313 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

Introduction: Malka Gorfine writes: We noticed that the important topic of association measures and tests came up again in your blog, and we have few comments in this regard. It is useful to distinguish between the univariate and multivariate methods. A consistent multivariate method can recognise dependence between two vectors of random variables, while a univariate method can only loop over pairs of components and check for dependency between them. There are very few consistent multivariate methods. To the best of our knowledge there are three practical methods: 1) HSIC by Gretton et al. (http://www.gatsby.ucl.ac.uk/~gretton/papers/GreBouSmoSch05.pdf) 2) dcov by Szekely et al. (http://projecteuclid.org/euclid.aoas/1267453933) 3) A method we introduced in Heller et al (Biometrika, 2013, 503—510, http://biomet.oxfordjournals.org/content/early/2012/12/04/biomet.ass070.full.pdf+html, and an R package, HHG, is available as well http://cran.r-project.org/web/packages/HHG/index.html). A

3 0.82617521 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

Introduction: Robert Neumann writes: in the JEP 24(3), page 18, Daron Acemoglu states: Why Development Economics Needs Theory There is no general agreement on how much we should rely on economic theory in motivating empirical work and whether we should try to formulate and estimate “structural parameters.” I (Acemoglu) argue that the answer is largely “yes” because otherwise econometric estimates would lack external validity, in which case they can neither inform us about whether a particular model or theory is a useful approximation to reality, nor would they be useful in providing us guidance on what the effects of similar shocks and policies would be in different circumstances or if implemented in different scales. I therefore define “structural parameters” as those that provide external validity and would thus be useful in testing theories or in policy analysis beyond the specific environment and sample from which they are derived. External validity becomes a particularly challenging t

4 0.81594497 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

Introduction: Malka Gorfine, Ruth Heller, and Yair Heller write a comment on the paper of Reshef et al. that we discussed a few months ago. Just to remind you what’s going on here, here’s my quick summary from December: Reshef et al. propose a new nonlinear R-squared-like measure. Unlike R-squared, this new method depends on a tuning parameter that controls the level of discretization, in a “How long is the coast of Britain” sort of way. The dependence on scale is inevitable for such a general method. Just consider: if you sample 1000 points from the unit bivariate normal distribution, (x,y) ~ N(0,I), you’ll be able to fit them perfectly by a 999-degree polynomial fit to the data. So the scale of the fit matters. The clever idea of the paper is that, instead of going for an absolute measure (which, as we’ve seen, will be scale-dependent), they focus on the problem of summarizing the grid of pairwise dependences in a large set of variables. As they put it: “Imagine a data set with hundreds

5 0.79632211 2324 andrew gelman stats-2014-05-07-Once more on nonparametric measures of mutual information

Introduction: Ben Murrell writes: Our reply to Kinney and Atwal has come out (http://www.pnas.org/content/early/2014/04/29/1403623111.full.pdf) along with their response (http://www.pnas.org/content/early/2014/04/29/1404661111.full.pdf). I feel like they somewhat missed the point. If you’re still interested in this line of discussion, feel free to post, and maybe the Murrells and Kinney can bash it out in your comments! Background: Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets? Heller, Heller, and Gorfine on univariate and multivariate information measures Kinney and Atwal on the maximal information coefficient Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets Gorfine, Heller, Heller, Simon, and Tibshirani don’t like MIC The fun thing is that all these people are sending me their papers, and I’m enough of an outsider in this field that each of the

6 0.79445595 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

7 0.79321963 1616 andrew gelman stats-2012-12-10-John McAfee is a Heinlein hero

8 0.7897138 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models

9 0.78235483 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

10 0.77513051 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

11 0.77046978 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

12 0.76596302 1467 andrew gelman stats-2012-08-23-The pinch-hitter syndrome again

13 0.74459791 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

14 0.7429871 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

15 0.74250495 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

16 0.7334255 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

17 0.73246908 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

18 0.72688413 1272 andrew gelman stats-2012-04-20-More proposals to reform the peer-review system

19 0.7244426 2315 andrew gelman stats-2014-05-02-Discovering general multidimensional associations

20 0.72151148 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions