1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys


Meta information for this blog

Source: html

Introduction: 24. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. The following sampling designs are considered: (a) Sample 10 stores, then sample half the vegetables within each of these stores; or (b) Sample 20 stores, then sample one-quarter of the vegetables within each of these stores. Which of these designs has the lowest variance? Why might the higher-variance design still be chosen? Solution to question 23 From yesterday : 23. Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). There is a concern that different interviewers will get different sorts of responses—that is, there may be important interviewer effects. Describe (in two sentences) how you could estimate the interviewer effects within your survey. Can the interviewer effects create problems of reliability of the survey r
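The comparison asked for in question 24 can be explored numerically. The sketch below is an illustration, not the exam's official solution. It uses the textbook variance formula for two-stage simple random sampling with equal-sized clusters. The spoilage rate (6%), the between-store standard deviation (3 percentage points), and the 1,000 vegetables per store are made-up values, and the function name is hypothetical.

```python
def two_stage_variance(n_stores, frac_within, s2_between, s2_within,
                       total_stores=100, veg_per_store=1000):
    """Variance of the estimated mean under two-stage simple random sampling
    with equal-sized clusters, including finite-population corrections."""
    m = frac_within * veg_per_store      # vegetables sampled per store
    f1 = n_stores / total_stores         # first-stage sampling fraction
    f2 = frac_within                     # second-stage sampling fraction
    between = (1 - f1) * s2_between / n_stores
    within = (1 - f2) * s2_within / (n_stores * m)
    return between + within


# Assumed (hypothetical) inputs, not taken from the exam question.
p = 0.06                     # overall spoilage proportion
s2_between = 0.03 ** 2       # variance of store-level spoilage rates
s2_within = p * (1 - p)      # approximate variance of a 0/1 outcome within a store

var_a = two_stage_variance(10, 0.5, s2_between, s2_within)   # design (a)
var_b = two_stage_variance(20, 0.25, s2_between, s2_within)  # design (b)
print(f"design (a): {var_a:.3e}   design (b): {var_b:.3e}")
```

Both designs examine the same total number of vegetables. With these assumed inputs, design (b) has the smaller variance because the between-store term dominates and shrinks as more stores are sampled. If stores were nearly identical (s2_between close to zero), the ordering would flip.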


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 It is desired to estimate the proportion of vegetables that spoil before being sold. [sent-3, score-0.584]

2 The following sampling designs are considered: (a) Sample 10 stores, then sample half the vegetables within each of these stores; or (b) Sample 20 stores, then sample one-quarter of the vegetables within each of these stores. [sent-4, score-1.348]

3 Which of these designs has the lowest variance? [sent-5, score-0.226]

4 Solution to question 23 From yesterday : 23. [sent-7, score-0.056]

5 Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). [sent-8, score-0.603]

6 There is a concern that different interviewers will get different sorts of responses—that is, there may be important interviewer effects. [sent-10, score-1.155]

7 Describe (in two sentences) how you could estimate the interviewer effects within your survey. [sent-11, score-0.873]

8 Can the interviewer effects create problems of reliability of the survey responses? [sent-12, score-1.262]

9 Can the interviewer effects create problems of validity of the survey responses? [sent-14, score-1.235]

10 Solution: You can do a survey experiment and randomly assign different respondents to different interviewers. [sent-16, score-0.734]

11 ) of responses for each interviewer, and compare (ideally using a graph! [sent-19, score-0.362]

12 Yes, interviewer effects can create problems of reliability (different interviewers getting different results) and validity (responses being different from the truth). [sent-21, score-1.746]
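The procedure described in the solution to question 23 (randomly assign respondents to interviewers, then compare responses across interviewers) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the three interviewers, the simulated responses, and the choice of the mean as the per-interviewer summary are all hypothetical.

```python
import random
from statistics import mean


def assign_interviewers(respondent_ids, interviewers, seed=0):
    """Shuffle respondents, then deal them out round-robin so that
    each interviewer gets a random, roughly equal-sized group."""
    rng = random.Random(seed)
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    return {rid: interviewers[i % len(interviewers)]
            for i, rid in enumerate(shuffled)}


def mean_by_interviewer(assignment, responses):
    """Average response within each interviewer's group."""
    groups = {}
    for rid, who in assignment.items():
        groups.setdefault(who, []).append(responses[rid])
    return {who: mean(vals) for who, vals in groups.items()}


# Simulated data: each interviewer shifts answers by an assumed amount.
interviewers = ["A", "B", "C"]
shift = {"A": 0.0, "B": 0.5, "C": -0.5}      # hypothetical interviewer effects
assignment = assign_interviewers(range(300), interviewers)
noise = random.Random(1)
responses = {rid: 5.0 + shift[who] + noise.gauss(0, 1)
             for rid, who in assignment.items()}

for who, avg in sorted(mean_by_interviewer(assignment, responses).items()):
    print(who, round(avg, 2))
```

Because assignment is random, systematic differences between the per-interviewer averages can be attributed to the interviewers rather than to the respondents they happened to get. The printed averages could equally be plotted, as the solution suggests.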


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('interviewer', 0.536), ('responses', 0.309), ('stores', 0.293), ('vegetables', 0.273), ('interviewers', 0.195), ('create', 0.167), ('survey', 0.162), ('different', 0.16), ('reliability', 0.158), ('sample', 0.156), ('designs', 0.146), ('effects', 0.144), ('validity', 0.131), ('sentence', 0.124), ('within', 0.121), ('solution', 0.104), ('spoil', 0.104), ('wash', 0.098), ('supermarket', 0.095), ('problems', 0.095), ('explain', 0.093), ('doctor', 0.089), ('conducting', 0.089), ('behaviors', 0.084), ('lowest', 0.08), ('ideally', 0.079), ('hands', 0.079), ('desired', 0.073), ('assign', 0.073), ('estimate', 0.072), ('sentences', 0.072), ('chosen', 0.068), ('chain', 0.068), ('randomly', 0.066), ('proportion', 0.062), ('often', 0.061), ('respondents', 0.06), ('concern', 0.059), ('truth', 0.059), ('yesterday', 0.056), ('describe', 0.054), ('compare', 0.053), ('experiment', 0.053), ('variance', 0.052), ('half', 0.052), ('sampling', 0.05), ('health', 0.048), ('considered', 0.047), ('design', 0.047), ('sorts', 0.045)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

2 0.66606569 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

Introduction: 23. Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). There is a concern that different interviewers will get different sorts of responses—that is, there may be important interviewer effects. Describe (in two sentences) how you could estimate the interviewer effects within your survey. Can the interviewer effects create problems of reliability of the survey responses? Explain (in one sentence). Can the interviewer effects create problems of validity of the survey responses? Explain (in one sentence). Solution to question 22 From yesterday : 22. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. Three stores are selected at random and are checked: the percent of spoiled vegetables are 3%, 5%, and 10% in the three stores. Give an estimate and standard error for the percentage of spo

3 0.3778393 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

Introduction: 25. You are using multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. Which of the following statements are basically true? (Indicate all that apply.) (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. (d) It would not be appropriate to add a predictor such as per-capita space-program spen

4 0.29631412 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

Introduction: 22. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. Three stores are selected at random and are checked: the percent of spoiled vegetables are 3%, 5%, and 10% in the three stores. Give an estimate and standard error for the percentage of spoiled vegetables for the entire chain. Solution to question 21 From yesterday : 21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason why it might actually be a good idea. Solution: Nonproportional sampling is usually avoided because it makes the analysis more complicated and it results in a higher standard error for estimates of the general population. It might be a good idea her

5 0.15569469 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

6 0.13629369 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

7 0.13459826 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

8 0.11253694 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

9 0.10795443 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

10 0.10499819 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

11 0.10186656 2302 andrew gelman stats-2014-04-23-A short questionnaire regarding the subjective assessment of evidence

12 0.10111818 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

13 0.098439418 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

14 0.097167432 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

15 0.095589958 1352 andrew gelman stats-2012-05-29-Question 19 of my final exam for Design and Analysis of Sample Surveys

16 0.092138536 963 andrew gelman stats-2011-10-18-Question on Type M errors

17 0.088301063 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

18 0.088027395 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

19 0.085041516 1348 andrew gelman stats-2012-05-27-Question 17 of my final exam for Design and Analysis of Sample Surveys

20 0.082805581 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys
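Question 22 in the list above asks for an estimate and standard error from three sampled stores with spoilage rates of 3%, 5%, and 10%. The sketch below shows the usual simple-random-sampling calculation, with and without a finite-population correction. It is an illustration under the assumption that the stores are a simple random sample of the 100, not a quotation of the posted solution, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

rates = [3.0, 5.0, 10.0]   # percent spoiled in the three sampled stores
n, N = len(rates), 100     # stores sampled, stores in the chain

# Stores are equal-sized, so the unweighted mean estimates the chain-wide rate.
estimate = mean(rates)

s = stdev(rates)                    # sample standard deviation (n - 1 denominator)
se = s / sqrt(n)                    # standard error ignoring the finite population
se_fpc = se * sqrt(1 - n / N)       # with the finite-population correction

print(f"estimate = {estimate:.1f}%, se = {se:.2f}, se with fpc = {se_fpc:.2f}")
```

With only 3 of 100 stores sampled, the finite-population correction changes the standard error very little (roughly 2.08 versus 2.05 percentage points).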


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.11), (1, 0.032), (2, 0.142), (3, -0.115), (4, 0.084), (5, 0.056), (6, -0.015), (7, 0.038), (8, 0.003), (9, -0.094), (10, 0.042), (11, -0.109), (12, 0.003), (13, 0.133), (14, -0.034), (15, -0.045), (16, -0.063), (17, -0.016), (18, 0.003), (19, 0.07), (20, -0.039), (21, -0.079), (22, -0.029), (23, 0.057), (24, 0.01), (25, -0.015), (26, 0.022), (27, 0.054), (28, -0.02), (29, -0.033), (30, -0.006), (31, -0.013), (32, 0.002), (33, -0.033), (34, 0.018), (35, -0.033), (36, -0.019), (37, -0.022), (38, 0.024), (39, -0.032), (40, -0.002), (41, 0.006), (42, 0.016), (43, 0.087), (44, -0.024), (45, 0.018), (46, 0.003), (47, -0.009), (48, 0.021), (49, 0.039)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97719032 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

2 0.92684084 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

3 0.87848377 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

4 0.84578133 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

Introduction: 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance? Solution to question 3 From yesterday : 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

5 0.80890739 1348 andrew gelman stats-2012-05-27-Question 17 of my final exam for Design and Analysis of Sample Surveys

Introduction: 17. In a survey of n people, half are asked if they support “the health care law recently passed by Congress” and half are asked if they support “the law known as Obamacare.” The goal is to estimate the effect of the wording on the proportion of Yes responses. How large must n be for the effect to be estimated within a standard error of 5 percentage points? Solution to question 16 From yesterday : 16. You are doing a survey in a war-torn country to estimate what percentage of unemployed men support the rebels in a civil war. Express this as a ratio estimation problem, where goal is to estimate Y.bar/X.bar. What are x and y here? Give the estimate and standard error for the percentage of unemployed men who support the rebels. Solution: x is 1 if the respondent is an unemployed man, 0 otherwise. y is 1 if the respondent is an unemployed man and supports the rebels, 0 otherwise. The estimate is y.bar/x.bar [typo fixed], the standard error is (1/x.bar)*(1/sqrt(n))*s.z, whe

6 0.80612463 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

7 0.76218116 1331 andrew gelman stats-2012-05-19-Question 9 of my final exam for Design and Analysis of Sample Surveys

8 0.76176941 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

9 0.75282234 1345 andrew gelman stats-2012-05-26-Question 16 of my final exam for Design and Analysis of Sample Surveys

10 0.75017822 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

11 0.72241646 1326 andrew gelman stats-2012-05-17-Question 7 of my final exam for Design and Analysis of Sample Surveys

12 0.71991789 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

13 0.71612132 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

14 0.70899028 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

15 0.69451052 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

16 0.69148034 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

17 0.67523146 1441 andrew gelman stats-2012-08-02-“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”

18 0.67175531 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

19 0.66718251 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

20 0.61329949 1352 andrew gelman stats-2012-05-29-Question 19 of my final exam for Design and Analysis of Sample Surveys
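Question 17 in the list above asks how large n must be for a wording effect, estimated as a difference between two half-samples, to have a standard error of 5 percentage points. A minimal sketch of that arithmetic follows. It assumes the conservative case in which both proportions are near 0.5, and the function names are hypothetical.

```python
from math import ceil, sqrt


def se_difference(n, p1=0.5, p2=0.5):
    """Standard error of the difference in proportions when n respondents
    are split evenly between two question wordings."""
    half = n / 2
    return sqrt(p1 * (1 - p1) / half + p2 * (1 - p2) / half)


def required_n(target_se, p1=0.5, p2=0.5):
    """Smallest total n whose difference has standard error <= target_se.
    Rounding guards against floating-point noise before taking the ceiling."""
    raw = 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / target_se ** 2
    return ceil(round(raw, 6))


print(required_n(0.05), se_difference(400))
```

Under the p = 0.5 assumption the standard error simplifies to 1/sqrt(n), so a target of 0.05 corresponds to about 400 respondents in total. Proportions further from 0.5 would need somewhat fewer.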


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.055), (17, 0.151), (21, 0.028), (24, 0.097), (36, 0.011), (37, 0.015), (47, 0.012), (55, 0.012), (63, 0.015), (65, 0.078), (66, 0.025), (91, 0.06), (99, 0.32)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96617162 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

2 0.94575864 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

3 0.93992823 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

Introduction: Malka Gorfine writes: We noticed that the important topic of association measures and tests came up again in your blog, and we have few comments in this regard. It is useful to distinguish between the univariate and multivariate methods. A consistent multivariate method can recognise dependence between two vectors of random variables, while a univariate method can only loop over pairs of components and check for dependency between them. There are very few consistent multivariate methods. To the best of our knowledge there are three practical methods: 1) HSIC by Gretton et al. (http://www.gatsby.ucl.ac.uk/~gretton/papers/GreBouSmoSch05.pdf) 2) dcov by Szekely et al. (http://projecteuclid.org/euclid.aoas/1267453933) 3) A method we introduced in Heller et al (Biometrika, 2013, 503—510, http://biomet.oxfordjournals.org/content/early/2012/12/04/biomet.ass070.full.pdf+html, and an R package, HHG, is available as well http://cran.r-project.org/web/packages/HHG/index.html). A

4 0.93949515 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

Introduction: Robert Neumann writes: in the JEP 24(3), page18, Daron Acemoglu states: Why Development Economics Needs Theory There is no general agreement on how much we should rely on economic theory in motivating empirical work and whether we should try to formulate and estimate “structural parameters.” I (Acemoglu) argue that the answer is largely “yes” because otherwise econometric estimates would lack external validity, in which case they can neither inform us about whether a particular model or theory is a useful approximation to reality, nor would they be useful in providing us guidance on what the effects of similar shocks and policies would be in different circumstances or if implemented in different scales. I therefore define “structural parameters” as those that provide external validity and would thus be useful in testing theories or in policy analysis beyond the specific environment and sample from which they are derived. External validity becomes a particularly challenging t

5 0.93591791 1616 andrew gelman stats-2012-12-10-John McAfee is a Heinlein hero

Introduction: “A small group of mathematicians” Jenny Davidson points to this article by Krugman on Asimov’s Foundation Trilogy. Given the silliness of the topic, Krugman’s piece is disappointingly serious (“Maybe the first thing to say about Foundation is that it’s not exactly science fiction – not really. Yes, it’s set in the future, there’s interstellar travel, people shoot each other with blasters instead of pistols and so on. But these are superficial details . . . the story can sound arid and didactic. . . . you’ll also be disappointed if you’re looking for shoot-em-up action scenes, in which Han Solo and Luke Skywalker destroy the Death Star in the nick of time. . . .”). What really jumped out at me from Krugman’s piece, though, was this line: In Foundation, we learn that a small group of mathematicians have developed “psychohistory”, the aforementioned rigorous science of society. Like Davidson (and Krugman), I read the Foundation books as a child. I remember the “psychohisto

6 0.92794889 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

7 0.92606932 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

8 0.92506385 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models

9 0.9250474 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

10 0.92261654 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting

11 0.92197132 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

12 0.90769154 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

13 0.90210801 1272 andrew gelman stats-2012-04-20-More proposals to reform the peer-review system

14 0.89640915 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

15 0.8955195 1591 andrew gelman stats-2012-11-26-Politics as an escape hatch

16 0.89407879 1819 andrew gelman stats-2013-04-23-Charles Murray’s “Coming Apart” and the measurement of social and political divisions

17 0.89407182 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

18 0.89360255 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions

19 0.89302218 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

20 0.8925696 416 andrew gelman stats-2010-11-16-Is parenting a form of addiction?