
1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys


meta info for this blog

Source: html

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday: 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A telephone survey was conducted several years ago, asking people how often they were polled in the past year. [sent-3, score-0.884]

2 I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. [sent-4, score-1.845]

3 From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. [sent-5, score-1.472]

4 Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. [sent-6, score-0.922]

5 Solution to question 27 From yesterday: 27. [sent-7, score-0.162]

6 Which of the following problems were identified with the Burnham et al. [sent-8, score-0.187]

7 ) (a) The survey used cluster sampling, which is inappropriate for estimating individual outcomes such as death. [sent-11, score-0.644]

8 (c) The second-stage sampling was not a probability sample. [sent-14, score-0.233]

9 (d) Survey materials supplied by the authors are incomplete and inconsistent with published descriptions of the survey. [sent-15, score-0.491]

10 Cluster sampling is fine, and the researchers did identify their psu’s. [sent-17, score-0.368]

11 Solution to question 28 You can use weighting or poststratification. [sent-19, score-0.263]

12 With weighting, you need an estimate that a person will participate in the survey. [sent-20, score-0.303]

13 It’s reasonable to suppose that people who responded to more surveys in the past are more likely to respond to this one. [sent-21, score-0.603]

14 With poststrat, you adjust for demographics such as age, ethnicity, sex, and education, that are correlated with survey response rates. [sent-22, score-0.621]
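
As a rough illustration of the weighting idea in sentences 11 to 13, the sketch below compares the naive average with an inverse-propensity-weighted one. The response-propensity model used here (probability of responding proportional to one plus the number of surveys answered last year) is purely a hypothetical assumption chosen for illustration; neither the question nor the solution specifies one, and the function names are made up for this example.

```python
# Shares of respondents by number of surveys in the previous year,
# taken from the (made-up) figures in the exam question.
shares = {0: 0.40, 1: 0.30, 2: 0.15, 3: 0.10, 4: 0.05}


def naive_mean(shares):
    """Unweighted average number of surveys per respondent."""
    return sum(k * p for k, p in shares.items()) / sum(shares.values())


def propensity(k):
    # ASSUMPTION (not from the source): the chance of responding to this
    # survey is proportional to 1 + k, where k is the number of surveys
    # the person answered last year. Any other increasing function of k
    # could be plugged in here instead.
    return 1.0 + k


def weighted_mean(shares, propensity):
    """Average with each group weighted by 1 / (response propensity)."""
    weights = {k: p / propensity(k) for k, p in shares.items()}
    total = sum(weights.values())
    return sum(k * w for k, w in weights.items()) / total


naive = naive_mean(shares)
adjusted = weighted_mean(shares, propensity)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

Under this assumed propensity, the naive average of 1.1 surveys per year drops to roughly 0.57, because frequent participants are down-weighted. The size of the correction depends entirely on the assumed propensity model, so the adjusted figure should be read as a sensitivity check rather than an estimate.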


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('survey', 0.378), ('surveys', 0.271), ('polled', 0.243), ('sampling', 0.233), ('burnham', 0.228), ('participated', 0.222), ('said', 0.192), ('solution', 0.183), ('cluster', 0.18), ('weighting', 0.167), ('average', 0.137), ('identify', 0.135), ('overrepresent', 0.135), ('poststrat', 0.135), ('estimate', 0.126), ('et', 0.114), ('incomplete', 0.106), ('descriptions', 0.101), ('inconsistent', 0.101), ('suppose', 0.1), ('iraq', 0.098), ('supplied', 0.098), ('past', 0.096), ('question', 0.096), ('mortality', 0.095), ('person', 0.094), ('demographics', 0.094), ('exam', 0.092), ('telephone', 0.092), ('overestimate', 0.092), ('inappropriate', 0.086), ('materials', 0.085), ('ethnicity', 0.084), ('participate', 0.083), ('adjust', 0.08), ('indicate', 0.075), ('improved', 0.075), ('conducted', 0.075), ('primary', 0.074), ('identified', 0.073), ('responses', 0.072), ('respondents', 0.07), ('responded', 0.07), ('procedure', 0.07), ('correlated', 0.069), ('sex', 0.069), ('respond', 0.066), ('participants', 0.066), ('worry', 0.066), ('yesterday', 0.066)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys


2 0.38783228 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

Introduction: 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estimating individual outcomes such as death. (b) In their report, Burnham et al. did not identify their primary sampling units. (c) The second-stage sampling was not a probability sample. (d) Survey materials supplied by the authors are incomplete and inconsistent with published descriptions of the survey. Solution to question 26 From yesterday: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination

3 0.23378138 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

Introduction: From a comment I made in an email exchange: My work on survey adjustments has very much been inspired by the ideas of Rod Little. Much of my efforts have gone toward the goal of integrating hierarchical modeling (which is so helpful for small-area estimation) with post stratification (which adjusts for known differences between sample and population). In the surveys I’ve dealt with, nonresponse/nonavailability can be a big issue, and I’ve always tried to emphasize that (a) the probability of a person being included in the sample is just about never known, and (b) even if this probability were known, I’d rather know the empirical n/N than the probability p (which is only valid in expectation). Regarding nonparametric modeling: I haven’t done much of that (although I hope to at some point) but Rod and his students have. As I wrote in the first sentence of the above-linked paper, I do think the current theory and practice of survey weighting is a mess, in that much depends on so

4 0.23159829 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

Introduction: Michael Spagat notifies me that his article criticizing the 2006 study of Burnham, Lafta, Doocy and Roberts has just been published. The Burnham et al. paper (also called, to my irritation (see the last item here), “the Lancet survey”) used a cluster sample to estimate the number of deaths in Iraq in the three years following the 2003 invasion. In his newly-published paper, Spagat writes: [The Spagat article] presents some evidence suggesting ethical violations to the survey’s respondents including endangerment, privacy breaches and violations in obtaining informed consent. Breaches of minimal disclosure standards examined include non-disclosure of the survey’s questionnaire, data-entry form, data matching anonymised interviewer identifications with households and sample design. The paper also presents some evidence relating to data fabrication and falsification, which falls into nine broad categories. This evidence suggests that this survey cannot be considered a reliable or

5 0.21922745 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

Introduction: Sharad had a survey sampling question: We’re trying to use mechanical turk to conduct some surveys, and have quickly discovered that turkers tend to be quite young. We’d really like a representative sample of the U.S., or at the least be able to recruit a diverse enough sample from turk that we can post-stratify to adjust the estimates. The approach we ended up taking is to pay turkers a small amount to answer a couple of screening questions (age & sex), and then probabilistically recruit individuals to complete the full survey (for more money) based on the estimated turk population parameters and our desired target distribution. We use rejection sampling, so the end result is that individuals who are invited to take the full survey look as if they came from a representative sample, at least in terms of age and sex. I’m wondering whether this sort of technique—a two step design in which participants are first screened and then probabilistically selected to mimic a target distributio

6 0.20898663 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

7 0.20540097 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

8 0.20172344 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

9 0.19799161 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

10 0.18956104 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

11 0.1721482 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

12 0.16335702 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

13 0.15976466 107 andrew gelman stats-2010-06-24-PPS in Georgia

14 0.15908523 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

15 0.15821542 1437 andrew gelman stats-2012-07-31-Paying survey respondents

16 0.15569469 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

17 0.15281408 749 andrew gelman stats-2011-06-06-“Sampling: Design and Analysis”: a course for political science graduate students

18 0.15267834 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

19 0.15114711 2191 andrew gelman stats-2014-01-29-“Questioning The Lancet, PLOS, And Other Surveys On Iraqi Deaths, An Interview With Univ. of London Professor Michael Spagat”

20 0.15023145 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.155), (1, 0.016), (2, 0.183), (3, -0.129), (4, 0.084), (5, 0.106), (6, -0.018), (7, 0.041), (8, 0.009), (9, -0.165), (10, 0.099), (11, -0.186), (12, -0.031), (13, 0.226), (14, -0.073), (15, -0.078), (16, 0.025), (17, 0.009), (18, 0.048), (19, 0.009), (20, -0.055), (21, -0.049), (22, -0.096), (23, 0.063), (24, -0.04), (25, 0.037), (26, 0.028), (27, -0.012), (28, 0.054), (29, -0.011), (30, 0.003), (31, 0.05), (32, -0.016), (33, 0.034), (34, -0.111), (35, -0.013), (36, 0.037), (37, 0.013), (38, -0.053), (39, 0.025), (40, -0.019), (41, 0.092), (42, 0.098), (43, -0.017), (44, -0.034), (45, -0.03), (46, -0.004), (47, -0.013), (48, 0.01), (49, 0.023)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99490833 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys


2 0.87518799 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

Introduction: 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance? Solution to question 3 From yesterday: 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

3 0.84103578 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

Introduction: A couple years ago we had an amazing all-star session at the Joint Statistical Meetings. The topic was new approaches to survey weighting (which is a mess, as I’m sure you’ve heard). Xiao-Li Meng recommended shrinking weights by taking them to a fractional power (such as square root) instead of trimming the extremes. Rod Little combined design-based and model-based survey inference. Michael Elliott used mixture models for complex survey design. And here’s my introduction to the session.

4 0.83565897 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

Introduction: A reporter emailed me the other day with a question about a case I’d never heard of before, a company called Herbalife that is being accused of being a pyramid scheme. The reporter pointed me to this document which describes a survey conducted by “a third party firm called Lieberman Research”: Two independent studies took place using real time (aka “river”) sampling, in which respondents were intercepted across a wide array of websites. Sample size of 2,000 adults 18+, matched to U.S. census on age, gender, income, region and ethnicity. “River sampling” in this case appears to mean, according to the reporter, that “people were invited into it through online ads.” The survey found that 5% of U.S. households had purchased Herbalife products during the past three months (with a “0.8% margin of error,” ha ha ha). Then they did a multiplication and a division to estimate that only 8% of households who bought these products were Herbalife distributors: 480,000 active distributor

5 0.82625932 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

Introduction: A couple years ago Rod Little was invited to write an article for the diamond jubilee of the Calcutta Statistical Association Bulletin. His article was published with discussions from Danny Pfefferman, J. N. K. Rao, Don Rubin, and myself. Here it all is. I’ll paste my discussion below, but it’s worth reading the others’ perspectives too. Especially the part in Rod’s rejoinder where he points out a mistake I made. Survey weights, like sausage and legislation, are designed and best appreciated by those who are placed a respectable distance from their manufacture. For those of us working inside the factory, vigorous discussion of methods is appreciated. I enjoyed Rod Little’s review of the connections between modeling and survey weighting and have just a few comments. I like Little’s discussion of model-based shrinkage of post-stratum averages, which, as he notes, can be seen to correspond to shrinkage of weights. I would only add one thing to his formula at the end of his

6 0.82202929 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

7 0.80762506 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

8 0.78164577 1345 andrew gelman stats-2012-05-26-Question 16 of my final exam for Design and Analysis of Sample Surveys

9 0.77868015 385 andrew gelman stats-2010-10-31-Wacky surveys where they don’t tell you the questions they asked

10 0.77180529 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

11 0.76854199 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

12 0.7619468 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

13 0.76110321 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick

14 0.75764692 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

15 0.75345415 1437 andrew gelman stats-2012-07-31-Paying survey respondents

16 0.74802959 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

17 0.74495405 381 andrew gelman stats-2010-10-30-Sorry, Senator DeMint: Most Americans Don’t Want to Ban Gays from the Classroom

18 0.74106431 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

19 0.73578441 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

20 0.73544711 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.011), (2, 0.045), (9, 0.156), (11, 0.011), (16, 0.05), (24, 0.191), (26, 0.012), (35, 0.013), (37, 0.027), (53, 0.017), (69, 0.012), (95, 0.014), (96, 0.017), (99, 0.314)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97911888 560 andrew gelman stats-2011-02-06-Education and Poverty

Introduction: Jonathan Livengood writes: There has been some discussion about the recent PISA results (in which the U.S. comes out pretty badly), for example here and here. The claim being made is that the poor U.S. scores are due to rampant individual- or family-level poverty in the U.S. They claim that when one controls for poverty, the U.S. comes out on top in the PISA standings, and then they infer that poverty causes poor test scores. The further inference is then that the U.S. could improve education by the “simple” action of reducing poverty. Anyway, I was wondering what you thought about their analysis. My reply: I agree this is interesting and I agree it’s hard to know exactly what to say about these comparisons. When I’m stuck in this sort of question, I ask, WWJD? In this case, I think Jennifer would ask what are the potential interventions being considered. Various ideas for changing the school system would perhaps have different effects on different groups of students.

2 0.97358143 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

Introduction: 21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason why it might actually be a good idea. Solution to question 20 From yesterday: 20. Explain in two sentences why we expect survey respondents to be honest about vote preferences but possibly dishonest about reporting unhealthy behaviors. Solution: Respondents tend to be sincere about vote preferences because this affects the outcome of the poll, and people are motivated to have their candidate poll well. This motivation is typically not present in reporting behaviors; you have no particular reason for wanting to affect the average survey response.

3 0.97011703 1332 andrew gelman stats-2012-05-20-Problemen met het boek

Introduction: Regarding the so-called Dutch Book argument for Bayesian inference (the idea that, if your inferences do not correspond to a Bayesian posterior distribution, you can be forced to make incoherent bets and ultimately become a money pump), I wrote: I have never found this argument appealing, because a bet is a game not a decision. A bet requires 2 players, and one player has to offer the bets. I do agree that in some bounded settings (for example, betting on win place show in a horse race), I’d want my bets to be coherent; if they are incoherent (e.g., if my bets correspond to P(A|B)*P(B) not being equal to P(A,B)), then I should be able to do better by examining the incoherence. But in an “open system” (to borrow some physics jargon), I don’t think coherence is possible. There is always new information coming in, and there is always additional prior information in reserve that hasn’t entered the model.

same-blog 4 0.96982765 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys


5 0.9695065 389 andrew gelman stats-2010-11-01-Why it can be rational to vote

Introduction: I think I can best do my civic duty by running this one every Election Day, just like Art Buchwald on Thanksgiving. . . . With a national election coming up, and with the publicity at its maximum, now is a good time to ask, is it rational for you to vote? And, by extension, was it worth your while to pay attention to whatever the candidates and party leaders have been saying for the year or so? With a chance of casting a decisive vote that is comparable to the chance of winning the lottery, what is the gain from being a good citizen and casting your vote? The short answer is, quite a lot. First the bad news. With 100 million voters, your chance that your vote will be decisive–even if the national election is predicted to be reasonably close–is, at best, 1 in a million in a battleground district and much less in a noncompetitive district such as where I live. (The calculation is based on the chance that your district’s vote will be exactly tied, along with the chance that your di

6 0.96950501 1565 andrew gelman stats-2012-11-06-Why it can be rational to vote

7 0.9680782 1532 andrew gelman stats-2012-10-13-A real-life dollar auction game!

8 0.96756756 1424 andrew gelman stats-2012-07-22-Extreme events as evidence for differences in distributions

9 0.96730804 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

10 0.96183848 1142 andrew gelman stats-2012-01-29-Difficulties with the 1-4-power transformation

11 0.95821446 1110 andrew gelman stats-2012-01-10-Jobs in statistics research! In New Jersey!

12 0.95656943 1226 andrew gelman stats-2012-03-22-Story time meets the all-else-equal fallacy and the fallacy of measurement

13 0.95334327 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

14 0.95103866 1715 andrew gelman stats-2013-02-09-Thomas Hobbes would be spinning in his grave

15 0.93220353 1027 andrew gelman stats-2011-11-25-Note to student journalists: Google is your friend

16 0.93211389 1875 andrew gelman stats-2013-05-28-Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from

17 0.92941856 678 andrew gelman stats-2011-04-25-Democrats do better among the most and least educated groups

18 0.92858183 675 andrew gelman stats-2011-04-22-Arrow’s other theorem

19 0.92817658 107 andrew gelman stats-2010-06-24-PPS in Georgia

20 0.92741042 157 andrew gelman stats-2010-07-21-Roller coasters, charity, profit, hmmm