1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys


Meta information for this blog post

Source: html

Introduction: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. (b) It is not possible to fit an item-response model when you have more questions than students. In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. (d) If two students answer the same number of q


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 You have just graded an exam with 28 questions and 15 students. [sent-2, score-0.417]

2 You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. [sent-3, score-0.576]

3 (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. [sent-6, score-1.26]

4 (b) It is not possible to fit an item-response model when you have more questions than students. [sent-7, score-0.566]

5 In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. [sent-8, score-1.373]

6 (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. [sent-9, score-1.414]

7 (d) If two students answer the same number of questions correctly, they will have the same estimated ability parameter. [sent-10, score-0.861]

8 (e) Under the model, if a student’s ability parameter has the same value as a particular question’s difficulty parameter, there is a 50% chance the student will get the question right. [sent-11, score-0.827]

9 Solution to question 25, from yesterday: 25. [sent-12, score-0.11]

10 You are using multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. [sent-13, score-0.273]

11 The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. [sent-14, score-0.688]

12 (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. [sent-17, score-0.784]

13 (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. [sent-18, score-0.782]

14 (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. [sent-19, score-0.831]

15 (d) It would not be appropriate to add a predictor such as per-capita space-program spending in the state: by adding such a predictor to the model, you would essentially be assuming what you are trying to prove. [sent-20, score-0.906]

16 (a) is wrong (actually, inference for small states will be driven by the state-level predictor), and (d) is wrong (because the new coefficient will be estimated from the data). [sent-22, score-0.565]
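
The item-response statements in question 26 can be checked numerically. The sketch below assumes the common two-parameter logistic form, Pr(correct) = logistic(discrimination * (ability - difficulty)); the post does not write out the model, so this parametrization, the function name p_correct, and all numeric values are illustrative assumptions. It shows that equal ability and difficulty give a probability of one half, and that shifting abilities and difficulties by a common constant, or rescaling them while dividing the discrimination by the same factor, leaves every probability unchanged, which is why constraints like those in statement (c) are needed for identification.

```python
import math


def p_correct(ability, difficulty, discrimination):
    """Assumed 2PL form: logistic(discrimination * (ability - difficulty))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# Statement (e): when ability equals difficulty the linear predictor is 0,
# so the probability is exactly one half whatever the discrimination.
p_equal = p_correct(ability=0.7, difficulty=0.7, discrimination=2.0)

# Statement (c): the likelihood cannot tell these parameter sets apart.
base = p_correct(1.2, 0.4, 1.5)
# Add the same constant to ability and difficulty (location is not identified).
shifted = p_correct(1.2 + 3.0, 0.4 + 3.0, 1.5)
# Double ability and difficulty, halve discrimination (scale is not identified).
scaled = p_correct(1.2 * 2.0, 0.4 * 2.0, 1.5 / 2.0)

print(p_equal, base, shifted, scaled)
```

Fixing one ability or difficulty at zero removes the shift, and fixing one discrimination at 1 removes the rescaling; in a multilevel fit, the same job could instead be done by prior distributions on the parameters.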


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('predictor', 0.333), ('ability', 0.27), ('mrp', 0.246), ('discrimination', 0.243), ('questions', 0.233), ('model', 0.183), ('difficulty', 0.166), ('students', 0.154), ('fit', 0.15), ('characteristics', 0.144), ('determined', 0.144), ('parameters', 0.14), ('states', 0.138), ('indicate', 0.129), ('correctly', 0.127), ('adding', 0.12), ('estimates', 0.12), ('spending', 0.12), ('entirely', 0.117), ('state', 0.117), ('statements', 0.113), ('question', 0.11), ('space', 0.109), ('basically', 0.108), ('solution', 0.105), ('graded', 0.105), ('small', 0.104), ('number', 0.102), ('estimated', 0.102), ('parameter', 0.101), ('student', 0.095), ('low', 0.094), ('discarding', 0.093), ('support', 0.088), ('answered', 0.086), ('value', 0.085), ('almost', 0.081), ('exam', 0.079), ('driven', 0.077), ('poststratification', 0.076), ('dramatically', 0.075), ('graphic', 0.074), ('combined', 0.073), ('wrong', 0.072), ('true', 0.072), ('demographic', 0.072), ('missed', 0.069), ('high', 0.069), ('set', 0.066), ('presidential', 0.064)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

2 0.64367521 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

Introduction: 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estimating individual outcomes such as death. (b) In their report, Burnham et al. did not identify their primary sampling units. (c) The second-stage sampling was not a probability sample. (d) Survey materials supplied by the authors are incomplete and inconsistent with published descriptions of the survey. Solution to question 26, from yesterday: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination

3 0.54118896 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

Introduction: 25. You are using multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. Which of the following statements are basically true? (Indicate all that apply.) (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. (d) It would not be appropriate to add a predictor such as per-capita space-program spen

4 0.29237631 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

Introduction: This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulation) as a subject, have both become increasingly important in political science—and soon, I expect, will become increasingly important in other social sciences as well. Being able to estimate state-level opinion from national surveys is just such a powerful thing, that if it can be done, people will do it. It’s taken 15 years or so for the method to really catch on, but the ready availability of survey data and of computing power—as well as our increasing comfort level, as a profession, with these techniques, has made MRP become more of a routine research tool. As a method becomes used more and more widely, there will be natural concerns about its domains of applicability. That is the subject of the pres

5 0.2635451 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

Introduction: Following up on our discussion the other day, Matt Buttice and Ben Highton write: It was nice to see our article mentioned and discussed by Andrew, Jeff Lax, Justin Phillips, and Yair Ghitza on Andrew’s blog in this post on Wednesday. As noted in the post, we recently published an article in Political Analysis on how well multilevel regression and poststratification (MRP) performs at producing estimates of state opinion with conventional national surveys where N≈1,500. Our central claims are that (i) the performance of MRP is highly variable, (ii) in the absence of knowing the true values, it is difficult to determine the quality of the MRP estimates produced on the basis of a single national sample, and, (iii) therefore, our views about the usefulness of MRP in instances where a researcher has a single sample of N≈1,500 are less optimistic than the ones expressed in previous research on the topic. Obviously we were interested in the blog posts. We found them stimulating

6 0.21333627 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

7 0.19522269 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

8 0.16200364 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

9 0.1520569 1340 andrew gelman stats-2012-05-23-Question 13 of my final exam for Design and Analysis of Sample Surveys

10 0.14725092 1735 andrew gelman stats-2013-02-24-F-f-f-fake data

11 0.14053042 1337 andrew gelman stats-2012-05-22-Question 12 of my final exam for Design and Analysis of Sample Surveys

12 0.13434604 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

13 0.13167182 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making

14 0.1316366 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys

15 0.12759198 1097 andrew gelman stats-2012-01-03-Libertarians in Space

16 0.12444668 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

17 0.12100535 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys

18 0.11994528 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

19 0.11952542 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

20 0.11913899 1344 andrew gelman stats-2012-05-25-Question 15 of my final exam for Design and Analysis of Sample Surveys


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.21), (1, 0.129), (2, 0.217), (3, -0.026), (4, 0.108), (5, 0.136), (6, 0.024), (7, 0.044), (8, 0.015), (9, -0.02), (10, 0.13), (11, 0.04), (12, -0.078), (13, 0.097), (14, -0.076), (15, -0.067), (16, 0.041), (17, -0.034), (18, -0.036), (19, 0.029), (20, 0.014), (21, -0.067), (22, -0.021), (23, -0.089), (24, 0.009), (25, -0.01), (26, 0.021), (27, 0.004), (28, -0.002), (29, -0.029), (30, 0.124), (31, -0.146), (32, 0.045), (33, -0.013), (34, -0.06), (35, 0.028), (36, 0.092), (37, -0.054), (38, 0.111), (39, -0.132), (40, 0.012), (41, -0.035), (42, -0.083), (43, 0.085), (44, 0.037), (45, -0.038), (46, 0.015), (47, 0.113), (48, 0.047), (49, 0.087)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96651125 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

2 0.89637905 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

3 0.80084145 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

4 0.78054309 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

5 0.74751371 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

6 0.74298704 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

7 0.64991194 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

8 0.64564949 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?

9 0.62269759 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

10 0.61929971 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

11 0.59830976 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

12 0.58671987 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making

13 0.57469362 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c

14 0.56484365 1340 andrew gelman stats-2012-05-23-Question 13 of my final exam for Design and Analysis of Sample Surveys

15 0.5593124 1344 andrew gelman stats-2012-05-25-Question 15 of my final exam for Design and Analysis of Sample Surveys

16 0.55893159 162 andrew gelman stats-2010-07-25-Darn that Lindsey Graham! (or, “Mr. P Predicts the Kagan vote”)

17 0.55012047 962 andrew gelman stats-2011-10-17-Death!

18 0.53812611 250 andrew gelman stats-2010-09-02-Blending results from two relatively independent multi-level models

19 0.53779495 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys

20 0.53579664 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.035), (3, 0.031), (16, 0.082), (24, 0.265), (59, 0.035), (65, 0.083), (86, 0.078), (89, 0.017), (99, 0.261)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97512329 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

2 0.96199191 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

Introduction: They didn’t have room for all four graphs of the time-series decomposition so they just displayed the date-of-year graph: They rotated so the graph fit better on the page. The rotation worked for me, but I was a bit bummed that they put the title and heading of the graph (“The birthrate tends to drop on holidays . . .”) on the left in the Mar-Apr slot, leaving no room to label Leap Day and April Fool’s. I suggested to the graphics people that they put the label at the very top and just shrink the rest of the graph by 5 or 10% so as to not take up any more total space. Then there’d be plenty of space to label Leap Day and April Fool’s. But they didn’t do it, maybe they felt that it wouldn’t look good to have the label right at the top, I dunno.

3 0.95900202 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

Introduction: Ben Highton and Matt Buttice point us to this response addressing some of the issues Jeff Lax raised in his most recent MRP post. P.S. Jeff replies in comments: It sounds like we’ve converged. They acknowledge MRP performance is significantly better on average than reported in their new paper in PA and yet performance variation in terms of correlation to “truth” remains higher than some might have thought. Cool. I hope this sort of blog exchange can be a model of scientific discussion. Instead of a paper just sitting there by itself, it can be openly explored. Ideally, the published paper would include a link to these discussions of Highton, Buttice, Lax, Phillips, and Ghitza, so that readers would automatically get all this information.

4 0.95896065 846 andrew gelman stats-2011-08-09-Default priors update?

Introduction: Ryan King writes: I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance. I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity a

5 0.9576025 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

6 0.95468366 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine

7 0.95323348 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

8 0.95123124 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

9 0.950719 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

10 0.95029175 1240 andrew gelman stats-2012-04-02-Blogads update

11 0.94927883 1208 andrew gelman stats-2012-03-11-Gelman on Hennig on Gelman on Bayes

12 0.94876671 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

13 0.94814605 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

14 0.94565392 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

15 0.94554532 197 andrew gelman stats-2010-08-10-The last great essayist?

16 0.94531596 494 andrew gelman stats-2010-12-31-Type S error rates for classical and Bayesian single and multiple comparison procedures

17 0.94510245 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

18 0.94482124 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

19 0.9435873 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe

20 0.9419831 1155 andrew gelman stats-2012-02-05-What is a prior distribution?