1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys


Introduction: 25. You are applying multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. Which of the following statements are basically true? (Indicate all that apply.) (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. (d) It would not be appropriate to add a predictor such as per-capita space-program spen
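The mechanics behind the MRP question above can be sketched in a few lines. This is a minimal illustration, not the model from the exam: it assumes a simple varying-intercept model in which each state's estimate is a precision-weighted compromise between the state's own sample mean and the prediction from the state-level regression, followed by poststratification over demographic cells. The function names and all numeric values (sigma_y, sigma_state, the cell counts) are hypothetical.

```python
def pooling_weight(n_state, sigma_y, sigma_state):
    """Weight placed on a state's own sample mean in a varying-intercept model.

    n_state: respondents from the state (hypothetical values below)
    sigma_y: residual sd of individual responses
    sigma_state: sd of state effects left unexplained by state-level predictors
    """
    data_precision = n_state / sigma_y**2
    prior_precision = 1.0 / sigma_state**2
    return data_precision / (data_precision + prior_precision)


def poststratify(cell_estimates, cell_counts):
    """Average cell-level estimates, weighted by population counts per cell."""
    total = sum(cell_counts)
    return sum(e * c for e, c in zip(cell_estimates, cell_counts)) / total


# A small state (5 respondents) versus a large one (200 respondents).
w_small = pooling_weight(n_state=5, sigma_y=0.5, sigma_state=0.1)
w_large = pooling_weight(n_state=200, sigma_y=0.5, sigma_state=0.1)

# Two demographic cells with hypothetical population counts.
state_estimate = poststratify([0.4, 0.6], [300, 100])

print(w_small, w_large, state_estimate)
```

With few respondents the weight on the state's own data is small, so the estimate leans on the regression prediction and on the population cell counts used in poststratification. A state-level predictor that explains more of the between-state variation would shrink sigma_state in this sketch.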


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 You are applying multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. [sent-2, score-0.404]

2 The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. [sent-3, score-0.519]

3 (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. [sent-6, score-1.176]

4 (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. [sent-7, score-0.873]

5 (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. [sent-8, score-0.896]

6 (d) It would not be appropriate to add a predictor such as per-capita space-program spending in the state: by adding such a predictor to the model, you would essentially be assuming what you are trying to prove. [sent-9, score-0.97]

7 Solution to question 24 from yesterday: 24. [sent-10, score-0.066]

8 It is desired to estimate the proportion of vegetables that spoil before being sold. [sent-12, score-0.688]

9 The following sampling designs are considered: (a) Sample 10 stores, then sample half the vegetables within each of these stores; or (b) Sample 20 stores, then sample one-quarter of the vegetables within each of these stores. [sent-13, score-1.594]

10 Which of these designs has the lowest variance? [sent-14, score-0.266]

11 Solution: Design (b) has lower variance but design (a) is cheaper. [sent-16, score-0.289]
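The comparison in the solution to question 24 can be checked numerically. The sketch below uses the textbook variance formula for two-stage sampling with equal-sized clusters and simple random sampling at both stages; the blog post does not state which formula it has in mind, so treat this as one reasonable reading. The number of vegetables per store (M) and both variance components are hypothetical values chosen only for illustration.

```python
def two_stage_variance(N, n, M, m, s2_between, s2_within):
    """Variance of the estimated mean under two-stage sampling.

    N stores in the chain, n sampled; M items per store, m sampled per store.
    s2_between: variance of the store-level spoilage proportions
    s2_within: variance of individual items within a store
    Assumes equal-sized stores and simple random sampling at both stages.
    """
    between = (1 - n / N) * s2_between / n
    within = (1 - m / M) * s2_within / (n * m)
    return between + within


N, M = 100, 1000            # M is hypothetical; the question gives only N
s2_between = 0.03**2        # hypothetical spread of store spoilage rates
p = 0.06                    # hypothetical overall spoilage rate
s2_within = p * (1 - p)     # rough binary-outcome variance within a store

var_a = two_stage_variance(N, 10, M, M // 2, s2_between, s2_within)  # design (a)
var_b = two_stage_variance(N, 20, M, M // 4, s2_between, s2_within)  # design (b)

print(var_a, var_b, var_b < var_a)
```

Both designs examine the same total number of vegetables, so under these assumptions the difference comes mainly from the between-store term, which is divided by the number of stores sampled. Design (a) visits fewer stores, which is the likely source of its lower cost.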


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('stores', 0.345), ('vegetables', 0.321), ('predictor', 0.312), ('mrp', 0.288), ('sample', 0.229), ('designs', 0.172), ('characteristics', 0.168), ('determined', 0.168), ('design', 0.166), ('adding', 0.141), ('estimates', 0.14), ('spending', 0.14), ('entirely', 0.137), ('state', 0.137), ('space', 0.128), ('solution', 0.123), ('variance', 0.123), ('spoil', 0.123), ('cheaper', 0.115), ('supermarket', 0.112), ('states', 0.107), ('support', 0.102), ('within', 0.095), ('almost', 0.095), ('lowest', 0.094), ('poststratification', 0.089), ('dramatically', 0.088), ('graphic', 0.087), ('desired', 0.086), ('estimate', 0.085), ('demographic', 0.084), ('small', 0.081), ('model', 0.08), ('chosen', 0.08), ('chain', 0.08), ('indicate', 0.076), ('presidential', 0.075), ('proportion', 0.073), ('following', 0.071), ('respondents', 0.071), ('turns', 0.07), ('specifically', 0.069), ('improve', 0.067), ('statements', 0.066), ('yesterday', 0.066), ('republican', 0.066), ('assuming', 0.065), ('correlation', 0.064), ('basically', 0.063), ('half', 0.061)]
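For orientation, word weights and similarity values of this kind can be produced along the following lines. The page does not say which tf-idf variant or library generated its scores, so this is only a sketch of one common choice: raw term counts times log(N/df), L2-normalized, with cosine similarity between documents. The three tiny token lists are made up for the example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one L2-normalized {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        weights = {w: tf[w] * math.log(n_docs / df[w]) for w in tf}
        norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
        vectors.append({w: v / norm for w, v in weights.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(u[w] * v.get(w, 0.0) for w in u)


# Hypothetical toy corpus of three tokenized posts.
docs = [
    ["stores", "vegetables", "sample"],
    ["mrp", "predictor", "sample"],
    ["stores", "vegetables", "spoil"],
]
vecs = tfidf(docs)
sim_01 = cosine(vecs[0], vecs[1])
sim_02 = cosine(vecs[0], vecs[2])
print(sim_01, sim_02)
```

In this toy corpus the first post shares two words with the third and only one with the second, so it scores as more similar to the third, which is the same kind of ranking the list below reports.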

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

2 0.54118896 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

Introduction: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. (b) It is not possible to fit an item-response model when you have more questions than students. In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. (d) If two students answer the same number of q

3 0.3778393 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

Introduction: 24. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. The following sampling designs are considered: (a) Sample 10 stores, then sample half the vegetables within each of these stores; or (b) Sample 20 stores, then sample one-quarter of the vegetables within each of these stores. Which of these designs has the lowest variance? Why might the higher-variance design still be chosen? Solution to question 23 from yesterday: 23. Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). There is a concern that different interviewers will get different sorts of responses; that is, there may be important interviewer effects. Describe (in two sentences) how you could estimate the interviewer effects within your survey. Can the interviewer effects create problems of reliability of the survey r

4 0.34561566 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

Introduction: 22. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. Three stores are selected at random and are checked: the percentages of spoiled vegetables are 3%, 5%, and 10% in the three stores. Give an estimate and standard error for the percentage of spoiled vegetables for the entire chain. Solution to question 21 from yesterday: 21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason why it might actually be a good idea. Solution: Nonproportional sampling is usually avoided because it makes the analysis more complicated and it results in a higher standard error for estimates of the general population. It might be a good idea her

5 0.32889771 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

Introduction: This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulation) as a subject, have both become increasingly important in political science—and soon, I expect, will become increasingly important in other social sciences as well. Being able to estimate state-level opinion from national surveys is just such a powerful thing, that if it can be done, people will do it. It’s taken 15 years or so for the method to really catch on, but the ready availability of survey data and of computing power—as well as our increasing comfort level, as a profession, with these techniques, has made MRP become more of a routine research tool. As a method becomes used more and more widely, there will be natural concerns about its domains of applicability. That is the subject of the pres

6 0.30616155 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

7 0.22474059 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

8 0.21450061 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

9 0.20137441 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

10 0.16098499 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

11 0.14012495 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

12 0.13626359 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

13 0.12578924 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

14 0.1254063 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

15 0.1214344 1097 andrew gelman stats-2012-01-03-Libertarians in Space

16 0.1146485 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

17 0.11153894 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

18 0.1112281 1331 andrew gelman stats-2012-05-19-Question 9 of my final exam for Design and Analysis of Sample Surveys

19 0.11050449 1340 andrew gelman stats-2012-05-23-Question 13 of my final exam for Design and Analysis of Sample Surveys

20 0.1098064 1334 andrew gelman stats-2012-05-21-Question 11 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.142), (1, 0.084), (2, 0.246), (3, -0.065), (4, 0.094), (5, 0.091), (6, -0.032), (7, -0.002), (8, 0.008), (9, -0.044), (10, 0.099), (11, -0.089), (12, -0.005), (13, 0.143), (14, -0.035), (15, -0.047), (16, -0.045), (17, -0.024), (18, -0.003), (19, 0.022), (20, -0.004), (21, -0.082), (22, -0.011), (23, -0.005), (24, 0.006), (25, -0.053), (26, -0.013), (27, 0.055), (28, -0.005), (29, -0.009), (30, 0.127), (31, -0.142), (32, 0.061), (33, -0.057), (34, 0.006), (35, 0.016), (36, 0.068), (37, -0.059), (38, 0.125), (39, -0.096), (40, 0.024), (41, -0.056), (42, -0.075), (43, 0.087), (44, 0.041), (45, -0.036), (46, 0.018), (47, 0.129), (48, 0.067), (49, 0.057)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97805285 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

2 0.83119601 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

Introduction: This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulation) as a subject, have both become increasingly important in political science—and soon, I expect, will become increasingly important in other social sciences as well. Being able to estimate state-level opinion from national surveys is just such a powerful thing, that if it can be done, people will do it. It’s taken 15 years or so for the method to really catch on, but the ready availability of survey data and of computing power—as well as our increasing comfort level, as a profession, with these techniques, has made MRP become more of a routine research tool. As a method becomes used more and more widely, there will be natural concerns about its domains of applicability. That is the subject of the pres

3 0.81855279 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

Introduction: Following up on our discussion the other day, Matt Buttice and Ben Highton write: It was nice to see our article mentioned and discussed by Andrew, Jeff Lax, Justin Phillips, and Yair Ghitza on Andrew’s blog in this post on Wednesday. As noted in the post, we recently published an article in Political Analysis on how well multilevel regression and poststratification (MRP) performs at producing estimates of state opinion with conventional national surveys where N≈1,500. Our central claims are that (i) the performance of MRP is highly variable, (ii) in the absence of knowing the true values, it is difficult to determine the quality of the MRP estimates produced on the basis of a single national sample, and, (iii) therefore, our views about the usefulness of MRP in instances where a researcher has a single sample of N≈1,500 are less optimistic than the ones expressed in previous research on the topic. Obviously we were interested in the blog posts. We found them stimulating

4 0.78819829 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

Introduction: To recap: Matt Buttice and Ben Highton recently published an article where they evaluated multilevel regression and poststratification (MRP) on a bunch of political examples estimating state-level attitudes. My Columbia colleagues Jeff Lax, Justin Phillips, and Yair Ghitza added some discussion, giving a bunch of practical tips and pointing to some problems with Buttice and Highton’s evaluations. Buttice and Highton replied, emphasizing the difficulties of comparing methods in the absence of a known ground truth. And Jeff Lax added the following comment, which I think is a good overview of the discussion so far: In the back and forth between us all on details, some points may get lost and disagreements overstated. Where are things at this point? 1. Buttice and Highton (BH) show beyond previous work that MRP performance in making state estimates can vary to an extent that is not directly observable unless one knows the true estimates (in which case one would not be us

5 0.78170931 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

Introduction: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. (b) It is not possible to fit an item-response model when you have more questions than students. In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. (d) If two students answer the same number of q

6 0.66940677 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

7 0.66635358 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

8 0.66048336 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

9 0.61841315 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

10 0.61673623 152 andrew gelman stats-2010-07-17-Distorting the Electoral Connection? Partisan Representation in Confirmation Politics

11 0.61124814 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

12 0.60562962 1331 andrew gelman stats-2012-05-19-Question 9 of my final exam for Design and Analysis of Sample Surveys

13 0.60258693 1361 andrew gelman stats-2012-06-02-Question 23 of my final exam for Design and Analysis of Sample Surveys

14 0.59169251 200 andrew gelman stats-2010-08-11-Separating national and state swings in voting and public opinion, or, How I avoided blogorific embarrassment: An agony in four acts

15 0.58716446 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

16 0.56216514 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

17 0.56012964 1348 andrew gelman stats-2012-05-27-Question 17 of my final exam for Design and Analysis of Sample Surveys

18 0.56010205 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

19 0.55133641 1441 andrew gelman stats-2012-08-02-“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”

20 0.53405863 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.068), (16, 0.069), (24, 0.153), (36, 0.011), (37, 0.015), (65, 0.239), (86, 0.023), (89, 0.012), (91, 0.062), (99, 0.237)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94578505 457 andrew gelman stats-2010-12-07-Whassup with phantom-limb treatment?

Introduction: OK, here’s something that is completely baffling me. I read this article by John Colapinto on the neuroscientist V. S. Ramachandran, who’s famous for his innovative treatment for “phantom limb” pain: His first subject was a young man who a decade earlier had crashed his motorcycle and torn from his spinal column the nerves supplying the left arm. After keeping the useless arm in a sling for a year, the man had the arm amputated above the elbow. Ever since, he had felt unremitting cramping in the phantom limb, as though it were immobilized in an awkward position. . . . Ramachandran positioned a twenty-inch-by-twenty-inch drugstore mirror . . . and told him to place his intact right arm on one side of the mirror and his stump on the other. He told the man to arrange the mirror so that the reflection created the illusion that his intact arm was the continuation of the amputated one. Then Ramachandran asked the man to move his right and left arms . . . “Oh, my God!” the man began

2 0.92937797 1475 andrew gelman stats-2012-08-30-A Stan is Born

Introduction: Stan 1.0.0 and RStan 1.0.0 It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language for expressing models and a different sampler for sampling from their posteriors. RStan is the R interface to Stan. Stan Home Page Stan’s home page is: http://mc-stan.org/ It links everything you need to get started running Stan from the command line, from R, or from C++, including full step-by-step install instructions, a detailed user’s guide and reference manual for the modeling language, and tested ports of most of the BUGS examples. Peruse the Manual If you’d like to learn more, the Stan User’s Guide and Reference Manual is the place to start.

same-blog 3 0.90529728 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

4 0.89725369 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

Introduction: To recap: Matt Buttice and Ben Highton recently published an article where they evaluated multilevel regression and poststratification (MRP) on a bunch of political examples estimating state-level attitudes. My Columbia colleagues Jeff Lax, Justin Phillips, and Yair Ghitza added some discussion, giving a bunch of practical tips and pointing to some problems with Buttice and Highton’s evaluations. Buttice and Highton replied, emphasizing the difficulties of comparing methods in the absence of a known ground truth. And Jeff Lax added the following comment, which I think is a good overview of the discussion so far: In the back and forth between us all on details, some points may get lost and disagreements overstated. Where are things at this point? 1. Buttice and Highton (BH) show beyond previous work that MRP performance in making state estimates can vary to an extent that is not directly observable unless one knows the true estimates (in which case one would not be us

5 0.89654672 1993 andrew gelman stats-2013-08-22-Improvements to Kindle Version of BDA3

Introduction: I let Andrew know about the comments about the defective Kindle version of BDA2 and he wrote to his editor at Chapman and Hall, Rob Calver, who wrote back with this info: I can guarantee that the Kindle version of the third edition will be a substantial improvement. We publish all of our mathematics and statistics books through Kindle now as Print Replica. This means that we send the printer pdf to Amazon and they convert into their Print Replica format, which is essentially just a pdf viewer. We have not experienced very many issues at all with this setup. Unfortunately, there was a period before Amazon launched Print Replica when they converted math/stat books into their Kindle format, and converted them very badly in some cases. Equations were held as images, making them very difficult to read. It appears this was the case with Andrew’s second edition, judging by some of the comments. The third edition [of BDA] will be available through Kindle with a short delay (for Amazo

6 0.89637947 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”

7 0.89400101 1426 andrew gelman stats-2012-07-23-Special effects

8 0.88934982 1021 andrew gelman stats-2011-11-21-Don’t judge a book by its title

9 0.8708958 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

10 0.87017214 671 andrew gelman stats-2011-04-20-One more time-use graph

11 0.85919315 1333 andrew gelman stats-2012-05-20-Question 10 of my final exam for Design and Analysis of Sample Surveys

12 0.85644889 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

13 0.85058331 463 andrew gelman stats-2010-12-11-Compare p-values from privately funded medical trials to those in publicly funded research?

14 0.84832239 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

15 0.83772969 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

16 0.82997048 990 andrew gelman stats-2011-11-04-At the politics blogs . . .

17 0.8252008 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

18 0.82368684 1845 andrew gelman stats-2013-05-07-Is Felix Salmon wrong on free TV?

19 0.81632066 758 andrew gelman stats-2011-06-11-Hey, good news! Your p-value just passed the 0.05 threshold!

20 0.81379759 100 andrew gelman stats-2010-06-19-Unsurprisingly, people are more worried about the economy and jobs than about deficits