1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys


meta infos for this blog

Source: html

Introduction: 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution to question 2 from yesterday: 2. Which of the following are useful goals in a pilot study? (Indicate all that apply.) (a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset. (b) You can see if you find statistical significance in a pre-chosen comparison of interest. (c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest. (d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a po


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. [sent-2, score-0.728]

2 (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) [sent-4, score-0.558]

3 Solution to question 2 from yesterday: 2. [sent-5, score-0.219]

4 Which of the following are useful goals in a pilot study? [sent-6, score-0.616]

5 (a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset. [sent-8, score-0.452]

6 (b) You can see if you find statistical significance in a pre-chosen comparison of interest. [sent-9, score-0.202]

7 (c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest. [sent-10, score-0.097]

8 (d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be. [sent-11, score-1.432]

9 (e) You can talk with survey respondents and get a sense of how they perceived your questions. [sent-12, score-0.288]

10 (f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording. [sent-13, score-0.411]

11 (g) You can check if your sample is approximately representative of your population. [sent-14, score-0.33]

12 The purpose of a pilot study is to test out the data collection. [sent-16, score-0.692]

13 The sample size will be too small for a, b, c, d, and g. [sent-17, score-0.522]

14 In some of their earliest work, Kahneman and Tversky documented the common misconception of researchers that data from a small pilot study should closely match the population. [sent-18, score-1.372]

15 The question would have been clearer if I’d inserted the word “small” before “pilot” in the preamble. [sent-19, score-0.441]
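
A minimal simulation sketch of why a small pilot cannot settle items a, b, c, d, and g above. The true effect (0.2 standard deviations), the group sizes, and the function names are illustrative assumptions, not values from the post. It compares how much the estimated effect bounces around in a small pilot versus a larger study, and how often the estimate has the wrong sign.

```python
import random
import statistics


def simulate_estimates(n_per_group, true_effect=0.2, sd=1.0, n_sims=1000, seed=0):
    """Estimated effects (difference in group means) from repeated simulated studies."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, sd) for _ in range(n_per_group)]
        estimates.append(statistics.mean(treated) - statistics.mean(control))
    return estimates


def summarize(estimates):
    """Return (sd of the estimates, share of estimates with the wrong sign)."""
    spread = statistics.stdev(estimates)
    wrong_sign = sum(e < 0 for e in estimates) / len(estimates)
    return spread, wrong_sign


# Hypothetical sizes: 10 per group for the pilot, 400 per group for the full study.
pilot_spread, pilot_wrong_sign = summarize(simulate_estimates(n_per_group=10))
full_spread, full_wrong_sign = summarize(simulate_estimates(n_per_group=400))

print(f"pilot (n=10/group):  sd of estimate = {pilot_spread:.2f}, wrong sign = {pilot_wrong_sign:.1%}")
print(f"full  (n=400/group): sd of estimate = {full_spread:.2f}, wrong sign = {full_wrong_sign:.1%}")
```

Under these assumptions the pilot estimate varies by more than twice the size of the true effect, so neither its sign nor its magnitude is a reliable guide; the pilot is still useful for testing the data collection itself.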


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('pilot', 0.443), ('servicemembers', 0.306), ('size', 0.186), ('small', 0.182), ('study', 0.17), ('decide', 0.155), ('sample', 0.154), ('question', 0.145), ('preamble', 0.144), ('misconception', 0.144), ('solution', 0.138), ('earliest', 0.138), ('significance', 0.135), ('conclusive', 0.133), ('inserted', 0.129), ('documented', 0.123), ('confirmatory', 0.118), ('learn', 0.115), ('method', 0.111), ('tversky', 0.111), ('nonresponse', 0.111), ('full', 0.111), ('kahneman', 0.109), ('survey', 0.107), ('perceived', 0.101), ('crude', 0.098), ('useful', 0.097), ('examine', 0.097), ('clearer', 0.097), ('representative', 0.092), ('deviation', 0.091), ('military', 0.089), ('match', 0.086), ('closely', 0.086), ('indicate', 0.085), ('approximately', 0.084), ('proportion', 0.082), ('respondents', 0.08), ('difficulties', 0.08), ('purpose', 0.079), ('goals', 0.076), ('yesterday', 0.074), ('needs', 0.073), ('currently', 0.071), ('practical', 0.071), ('word', 0.07), ('estimating', 0.069), ('search', 0.068), ('recall', 0.068), ('comparison', 0.067)]
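
The page does not say how the simValue scores in the following list are computed. One common choice is cosine similarity between tfidf vectors; the sketch below assumes that choice and uses a toy three-document corpus with hypothetical function names, so it illustrates the idea rather than reproducing the numbers on this page.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse tfidf vector (word -> weight) per document."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (made up for illustration).
docs = [
    "pilot study small sample size",
    "pilot study survey respondents question wording",
    "baby name graph",
]
vectors = tfidf_vectors(docs)
sim_self = cosine(vectors[0], vectors[0])
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
print(sim_self, sim_related, sim_unrelated)
```

A document compared with itself scores 1 up to floating-point rounding, documents sharing weighted words score between 0 and 1, and documents with no words in common score 0.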

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

2 0.51797915 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

Introduction: 2. Which of the following are useful goals in a pilot study? (Indicate all that apply.) (a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset. (b) You can see if you find statistical significance in a pre-chosen comparison of interest. (c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest. (d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be. (e) You can talk with survey respondents and get a sense of how they perceived your questions. (f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording. (g) You can check if your sample is approximately representative of your population. Soluti

3 0.40057597 695 andrew gelman stats-2011-05-04-Statistics ethics question

Introduction: A graduate student in public health writes: I have been asked to do the statistical analysis for a medical unit that is delivering a pilot study of a program to [details redacted to prevent identification]. They are using a prospective, nonrandomized, cohort-controlled trial study design. The investigator thinks they can recruit only a small number of treatment and control cases, maybe less than 30 in total. After I told the investigator that I cannot do anything statistically with a sample size that small, he responded that small sample sizes are common in this field, and he sent me an example of analysis that someone had done on a similar study. So he still wants me to come up with a statistical plan. Is it unethical for me to do anything other than descriptive statistics? I think he should just stick to qualitative research. But the study she mentions above has 40 subjects and apparently had enough power to detect some effects. This is a pilot study after all so the n does n

4 0.33414161 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

Introduction: 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance? Solution to question 3 from yesterday: 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

5 0.13518882 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 from yesterday: 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

6 0.1321809 963 andrew gelman stats-2011-10-18-Question on Type M errors

7 0.13090475 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

8 0.13021007 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

9 0.12740642 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

10 0.12001525 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

11 0.11598328 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

12 0.11534019 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

13 0.113942 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

14 0.11383883 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools

15 0.11349923 1074 andrew gelman stats-2011-12-20-Reading a research paper != agreeing with its claims

16 0.1133416 2366 andrew gelman stats-2014-06-09-On deck this week

17 0.11205453 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

18 0.11153894 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

19 0.11078747 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

20 0.10796317 1607 andrew gelman stats-2012-12-05-The p-value is not . . .
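
The solution to question 3, quoted in entry 4 of the list above (post 1320), combines an estimate of the share of the population that is gay with an estimate of the probability of being in the military given that a person is gay. One way to read that combination is Bayes' rule, which also needs the overall share of the population in the military; that third input and all numbers below are assumptions for illustration. The values are placeholders, not real estimates and not figures from the post or from Gary Gates's work.

```python
def proportion_in_group(p_trait, p_group_given_trait, p_group):
    """Bayes' rule: P(trait | group) = P(group | trait) * P(trait) / P(group)."""
    inputs = {
        "p_trait": p_trait,
        "p_group_given_trait": p_group_given_trait,
        "p_group": p_group,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability, got {value}")
    if p_group == 0.0:
        raise ValueError("p_group must be positive")
    return p_group_given_trait * p_trait / p_group


# PLACEHOLDER inputs chosen only to make the arithmetic easy to follow.
p_gay = 0.04                  # assumed share of the population that is gay
p_military_given_gay = 0.005  # assumed P(in military | gay)
p_military = 0.01             # assumed share of the population in the military

estimate = proportion_in_group(p_gay, p_military_given_gay, p_military)
print(f"Implied share of servicemembers who are gay: {estimate:.1%}")
```

The appeal of this indirect route is that each input can come from a general-population source, which sidesteps the two problems named in the question: no random sample of servicemembers, and no expectation of honest answers from them.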


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.18), (1, 0.049), (2, 0.124), (3, -0.186), (4, 0.087), (5, 0.042), (6, -0.054), (7, 0.07), (8, -0.013), (9, -0.165), (10, 0.004), (11, -0.082), (12, 0.054), (13, 0.085), (14, -0.023), (15, -0.081), (16, -0.038), (17, -0.015), (18, 0.04), (19, 0.009), (20, -0.028), (21, -0.053), (22, -0.012), (23, 0.063), (24, -0.069), (25, 0.006), (26, 0.043), (27, -0.003), (28, 0.004), (29, -0.074), (30, 0.004), (31, 0.015), (32, 0.005), (33, 0.058), (34, 0.006), (35, 0.087), (36, -0.075), (37, -0.008), (38, -0.064), (39, 0.001), (40, 0.07), (41, 0.042), (42, -0.022), (43, 0.086), (44, 0.072), (45, -0.048), (46, -0.026), (47, -0.046), (48, 0.017), (49, -0.047)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98722637 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

2 0.89122134 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

Introduction: 2. Which of the following are useful goals in a pilot study? (Indicate all that apply.) (a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset. (b) You can see if you find statistical significance in a pre-chosen comparison of interest. (c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest. (d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be. (e) You can talk with survey respondents and get a sense of how they perceived your questions. (f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording. (g) You can check if your sample is approximately representative of your population. Soluti

3 0.80986452 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

Introduction: Rhoderick Machekano writes: I have a design question which has been bothering me and wonder if you can clear it up for me. In my line of work, we often conveniently select health centers and from those sample patients. When I am doing sample size estimation under this design do I account for the design effect – since I expect outcomes in patients from the same health center to be correlated? Given that I didn’t randomly sample the health facilities, is my only limitation that I cannot generalize the results and make group level comparisons in the analysis? My response: You can generalize the results even if you didn’t randomly sample the health facilities. The only thing is that your generalization applies to the implicit population of facilities of which your sample is representative. You could try to move further on this by considering facility-level predictors. Regarding sample size estimation, see chapter 20.

4 0.79056603 695 andrew gelman stats-2011-05-04-Statistics ethics question

Introduction: A graduate student in public health writes: I have been asked to do the statistical analysis for a medical unit that is delivering a pilot study of a program to [details redacted to prevent identification]. They are using a prospective, nonrandomized, cohort-controlled trial study design. The investigator thinks they can recruit only a small number of treatment and control cases, maybe less than 30 in total. After I told the investigator that I cannot do anything statistically with a sample size that small, he responded that small sample sizes are common in this field, and he sent me an example of analysis that someone had done on a similar study. So he still wants me to come up with a statistical plan. Is it unethical for me to do anything other than descriptive statistics? I think he should just stick to qualitative research. But the study she mentions above has 40 subjects and apparently had enough power to detect some effects. This is a pilot study after all so the n does n

5 0.74996448 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

Introduction: 24. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. The following sampling designs are considered: (a) Sample 10 stores, then sample half the vegetables within each of these stores; or (b) Sample 20 stores, then sample one-quarter of the vegetables within each of these stores. Which of these designs has the lowest variance? Why might the higher-variance design still be chosen? Solution to question 23 from yesterday: 23. Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). There is a concern that different interviewers will get different sorts of responses, that is, there may be important interviewer effects. Describe (in two sentences) how you could estimate the interviewer effects within your survey. Can the interviewer effects create problems of reliability of the survey r

6 0.73214865 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

7 0.72429174 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

8 0.68759674 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

9 0.68443429 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

10 0.67909634 1551 andrew gelman stats-2012-10-28-A convenience sample and selected treatments

11 0.67102748 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

12 0.66477031 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

13 0.66359442 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

14 0.66071385 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

15 0.6567235 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

16 0.65164363 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

17 0.65071058 1331 andrew gelman stats-2012-05-19-Question 9 of my final exam for Design and Analysis of Sample Surveys

18 0.64319843 107 andrew gelman stats-2010-06-24-PPS in Georgia

19 0.63057923 963 andrew gelman stats-2011-10-18-Question on Type M errors

20 0.62693745 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.015), (16, 0.023), (24, 0.155), (52, 0.015), (53, 0.015), (61, 0.011), (86, 0.01), (98, 0.154), (99, 0.486)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98958695 742 andrew gelman stats-2011-06-02-Grouponomics, counterfactuals, and opportunity cost

Introduction: I keep encountering the word “Groupon”–I think it’s some sort of pets.com-style commercial endeavor where people can buy coupons? I don’t really care, and I’ve avoided googling the word out of a general animosity toward our society’s current glorification of get-rich-quick schemes. (As you can tell, I’m still bitter about that whole stock market thing.) Anyway, even without knowing what Groupon actually is, I enjoyed this blog by Kaiser Fung in which he tries to work out some of its economic consequences. He connects the statistical notion of counterfactuals to the concept of opportunity cost from economics. The comments are interesting too.

2 0.98382312 132 andrew gelman stats-2010-07-07-Note to “Cigarettes”

Introduction: To the person who posted an apparently non-spam comment with a URL link to a “cheap cigarettes” website: In case you’re wondering, no, your comment didn’t get caught by the spam filter–I’m not sure why not, given that URL. I put it in the spam file manually. If you’d like to participate in blog discussion in the future, please refrain from including spam links. Thank you. Also, it’s “John Tukey,” not “John Turkey.”

3 0.98330039 1701 andrew gelman stats-2013-01-31-The name that fell off a cliff

Introduction: John Tillinghast points us to this blog entry by Hilary Parker. Here’s what she found: Hey—nice graph! P.S. Those of you who are interested in this sort of thing should check out the Baby Name Wizard blog which is full of thoughtful, data-based explorations about names.

same-blog 4 0.98165452 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

5 0.98138517 955 andrew gelman stats-2011-10-12-Why it doesn’t make sense to chew people out for not reading the help page

Introduction: Karl Broman writes: Barry Rowlingson gave an interesting talk at UseR 2011, “Why R-help must die!” He suggested the Q-and-A type sites Stack Overflow (on programming) and Cross Validated (on statistics), both part of Stack Exchange. I haven’t used R-help recently but I do occasionally send people there. Just to see what was going on there, I clicked on over, did a little searching, and found this delight from a renowned professor of R. There’s something about the “please” there that just makes it all that much more special. (In contrast, the advice here to “please do your homework” just seems rude.) I have a larger (or maybe smaller) point to make, though, which is about the silliness of advice to “read the damn manual” etc. Several years ago I read a fascinating book called City by William Whyte. He and his students had gone around various public places in NYC and observed how people actually behaved: how they walked, sat, stood, and interacted. One of Whyte’s ce

6 0.97845036 1399 andrew gelman stats-2012-06-28-Life imitates blog

7 0.97570938 196 andrew gelman stats-2010-08-10-The U.S. as welfare state

8 0.96861005 1806 andrew gelman stats-2013-04-16-My talk in Chicago this Thurs 6:30pm

9 0.96742511 1 andrew gelman stats-2010-04-22-Political Belief Networks: Socio-cognitive Heterogeneity in American Public Opinion

10 0.96537447 695 andrew gelman stats-2011-05-04-Statistics ethics question

11 0.96349812 2234 andrew gelman stats-2014-03-05-Plagiarism, Arizona style

12 0.96189743 625 andrew gelman stats-2011-03-23-My last post on albedo, I promise

13 0.96161389 2334 andrew gelman stats-2014-05-14-“The subtle funk of just a little poultry offal”

14 0.9609434 635 andrew gelman stats-2011-03-29-Bayesian spam!

15 0.96087635 96 andrew gelman stats-2010-06-18-Course proposal: Bayesian and advanced likelihood statistical methods for zombies.

16 0.96078962 325 andrew gelman stats-2010-10-07-Fitting discrete-data regression models in social science

17 0.95989835 626 andrew gelman stats-2011-03-23-Physics is hard

18 0.95911545 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

19 0.95769978 2362 andrew gelman stats-2014-06-06-Statistically savvy journalism

20 0.95715183 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +-- 2