
1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

meta infos for this blog

Source: html

Introduction: 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estimating individual outcomes such as death. (b) In their report, Burnham et al. did not identify their primary sampling units. (c) The second-stage sampling was not a probability sample. (d) Survey materials supplied by the authors are incomplete and inconsistent with published descriptions of the survey. Solution to question 26 From yesterday : 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Which of the following problems were identified with the Burnham et al. [sent-2, score-0.346]

2 (a) The survey used cluster sampling, which is inappropriate for estimating individual outcomes such as death. [sent-5, score-0.46]

3 (c) The second-stage sampling was not a probability sample. [sent-8, score-0.183]

4 (d) Survey materials supplied by the authors are incomplete and inconsistent with published descriptions of the survey. [sent-9, score-0.512]

5 Solution to question 26 From yesterday : 26. [sent-10, score-0.201]

6 You have just graded an exam with 28 questions and 15 students. [sent-11, score-0.506]

7 You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. [sent-12, score-0.771]

8 (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. [sent-15, score-1.415]

9 (b) It is not possible to fit an item-response model when you have more questions than students. [sent-16, score-0.559]

10 In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. [sent-17, score-1.763]

11 (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. [sent-18, score-1.634]

12 (d) If two students answer the same number of questions correctly, they will have the same estimated ability parameter. [sent-19, score-0.922]

13 (e) Under the model, if a student’s ability parameter has the same value as a particular question’s difficulty parameter, there is a 50% chance the student will get the question right. [sent-20, score-1.003]
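Statements (c) and (e) of question 26 are easier to weigh with the model written out. The post does not give a formula, so the sketch below assumes the common two-parameter logistic form, Pr(correct) = logit^-1(discrimination * (ability - difficulty)). The function name prob_correct and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def prob_correct(ability, difficulty, discrimination=1.0):
    """Probability of a correct answer under an assumed two-parameter
    logistic item-response model (not taken from the original post)."""
    logit = discrimination * (ability - difficulty)
    return 1.0 / (1.0 + math.exp(-logit))


# Statement (e): when ability equals difficulty the logit is zero, so the
# probability is one half whatever the discrimination parameter is.
for disc in (0.5, 1.0, 3.0):
    print(disc, prob_correct(ability=0.7, difficulty=0.7, discrimination=disc))

# Statement (c): adding the same constant to every ability and every
# difficulty leaves all probabilities unchanged, which is why one location
# parameter has to be pinned down (for example, set to zero).
base = prob_correct(ability=1.2, difficulty=0.4, discrimination=2.0)
shifted = prob_correct(ability=1.2 + 5.0, difficulty=0.4 + 5.0, discrimination=2.0)
print(math.isclose(base, shifted))
```

Under this assumed form the scale is similarly undetermined: multiplying every discrimination by a constant while dividing every ability and difficulty by it gives the same probabilities, which is the motivation for fixing one discrimination parameter at 1.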

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ability', 0.328), ('discrimination', 0.295), ('questions', 0.283), ('burnham', 0.238), ('difficulty', 0.202), ('students', 0.187), ('sampling', 0.183), ('parameters', 0.17), ('indicate', 0.157), ('correctly', 0.154), ('identified', 0.153), ('survey', 0.148), ('model', 0.139), ('fit', 0.137), ('question', 0.133), ('estimating', 0.128), ('solution', 0.127), ('graded', 0.127), ('number', 0.124), ('parameter', 0.122), ('et', 0.119), ('student', 0.115), ('discarding', 0.113), ('incomplete', 0.111), ('descriptions', 0.105), ('inconsistent', 0.105), ('answered', 0.104), ('value', 0.103), ('iraq', 0.102), ('supplied', 0.102), ('mortality', 0.1), ('exam', 0.096), ('cluster', 0.094), ('inappropriate', 0.09), ('materials', 0.089), ('combined', 0.088), ('missed', 0.084), ('high', 0.084), ('set', 0.08), ('primary', 0.077), ('score', 0.075), ('following', 0.074), ('reduce', 0.073), ('logistic', 0.072), ('identify', 0.071), ('middle', 0.07), ('statements', 0.069), ('putting', 0.068), ('yesterday', 0.068), ('basically', 0.066)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

2 0.64367521 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

Introduction: 26. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (Indicate all that apply.) (a) If a question is answered correctly by students with very low and very high ability, but is missed by students in the middle, it will have a high value for its discrimination parameter. (b) It is not possible to fit an item-response model when you have more questions than students. In order to fit the model, you either need to reduce the number of questions (for example, by discarding some questions or by putting together some questions into a combined score) or increase the number of students in the dataset. (c) To keep the model identified, you can set one of the difficulty parameters or one of the ability parameters to zero and set one of the discrimination parameters to 1. (d) If two students answer the same number of q

3 0.38783228 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

4 0.1938352 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

Introduction: Michael Spagat notifies me that his article criticizing the 2006 study of Burnham, Lafta, Doocy and Roberts has just been published . The Burnham et al. paper (also called, to my irritation (see the last item here ), “the Lancet survey”) used a cluster sample to estimate the number of deaths in Iraq in the three years following the 2003 invasion. In his newly-published paper, Spagat writes: [The Spagat article] presents some evidence suggesting ethical violations to the survey’s respondents including endangerment, privacy breaches and violations in obtaining informed consent. Breaches of minimal disclosure standards examined include non-disclosure of the survey’s questionnaire, data-entry form, data matching anonymised interviewer identifications with households and sample design. The paper also presents some evidence relating to data fabrication and falsification, which falls into nine broad categories. This evidence suggests that this survey cannot be considered a reliable or

5 0.14729214 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

Introduction: In response to my remarks on his online book, Think Bayes, Allen Downey wrote: I [Downey] have a question about one of your comments: My [Gelman's] main criticism with both books is that they talk a lot about inference but not so much about model building or model checking (recall the three steps of Bayesian data analysis). I think it’s ok for an introductory book to focus on inference, which of course is central to the data-analytic process—but I’d like them to at least mention that Bayesian ideas arise in model building and model checking as well. This sounds like something I agree with, and one of the things I tried to do in the book is to put modeling decisions front and center. But the word “modeling” is used in lots of ways, so I want to see if we are talking about the same thing. For example, in many chapters, I start with a simple model of the scenario, do some analysis, then check whether the model is good enough, and iterate. Here’s the discussion of modeling

6 0.14667414 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys

7 0.13963613 1144 andrew gelman stats-2012-01-29-How many parameters are in a multilevel model?

8 0.13934845 2178 andrew gelman stats-2014-01-20-Mailing List Degree-of-Difficulty Difficulty

9 0.13861886 749 andrew gelman stats-2011-06-06-“Sampling: Design and Analysis”: a course for political science graduate students

10 0.13275459 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

11 0.13198294 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

12 0.12949511 1352 andrew gelman stats-2012-05-29-Question 19 of my final exam for Design and Analysis of Sample Surveys

13 0.12922797 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys

14 0.12679167 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

15 0.1267176 481 andrew gelman stats-2010-12-22-The Jumpstart financial literacy survey and the different purposes of tests

16 0.12533136 1353 andrew gelman stats-2012-05-30-Question 20 of my final exam for Design and Analysis of Sample Surveys

17 0.12382531 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

18 0.11916922 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

19 0.11912744 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

20 0.11527281 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.173), (1, 0.11), (2, 0.124), (3, -0.066), (4, 0.099), (5, 0.143), (6, 0.057), (7, 0.072), (8, -0.009), (9, -0.096), (10, 0.127), (11, 0.004), (12, -0.105), (13, 0.121), (14, -0.112), (15, -0.11), (16, 0.069), (17, -0.015), (18, -0.023), (19, 0.019), (20, -0.002), (21, -0.059), (22, -0.045), (23, -0.041), (24, -0.007), (25, 0.049), (26, 0.047), (27, 0.005), (28, 0.037), (29, -0.045), (30, -0.002), (31, -0.039), (32, -0.007), (33, 0.034), (34, -0.127), (35, 0.011), (36, 0.036), (37, -0.031), (38, 0.009), (39, -0.054), (40, -0.002), (41, 0.039), (42, 0.007), (43, 0.05), (44, -0.016), (45, -0.03), (46, 0.003), (47, 0.024), (48, 0.008), (49, 0.047)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98263115 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

2 0.82475775 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

3 0.74509931 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

Introduction: 6. A survey of New York City residents is performed using cluster sampling. The design effect is 3.0. From the survey, the estimated proportion who prefer the Mets to the Yankees is 0.42 with a standard error of 0.05. How many people were in the sample? Solution to question 5 From yesterday : 5. Which of the following better describes changes in public opinion on most issues? (Choose only one.) (a) Dynamic stability: On any given issue, average opinion remains stable but liberals and conservatives move back and forth in opposite directions (the “accordion model”) (b) Uniform swing: Average opinion on an issue can move but the liberals and conservatives don’t move much relative to each other (the distribution of opinions is a “solid block of wood”) (c) Compensating tradeoffs: When considering multiple survey questions on the same general topic, average opinion can move sharply to the left or right on individual questions while the average over all the questions remains st

4 0.72561383 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

Introduction: 5. Which of the following better describes changes in public opinion on most issues? (Choose only one.) (a) Dynamic stability: On any given issue, average opinion remains stable but liberals and conservatives move back and forth in opposite directions (the “accordion model”) (b) Uniform swing: Average opinion on an issue can move but the liberals and conservatives don’t move much relative to each other (the distribution of opinions is a “solid block of wood”) (c) Compensating tradeoffs: When considering multiple survey questions on the same general topic, average opinion can move sharply to the left or right on individual questions while the average over all the questions remains stable (the “rubber band model”) Solution to question 4 From yesterday : 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimate

5 0.71500361 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

6 0.71032351 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

7 0.70540845 1352 andrew gelman stats-2012-05-29-Question 19 of my final exam for Design and Analysis of Sample Surveys

8 0.68801945 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

9 0.66101593 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

10 0.6599251 381 andrew gelman stats-2010-10-30-Sorry, Senator DeMint: Most Americans Don’t Want to Ban Gays from the Classroom

11 0.64739364 1349 andrew gelman stats-2012-05-28-Question 18 of my final exam for Design and Analysis of Sample Surveys

12 0.63425148 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?

13 0.62708223 1353 andrew gelman stats-2012-05-30-Question 20 of my final exam for Design and Analysis of Sample Surveys

14 0.62379497 1344 andrew gelman stats-2012-05-25-Question 15 of my final exam for Design and Analysis of Sample Surveys

15 0.6148414 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting

16 0.60992217 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys

17 0.60929865 1362 andrew gelman stats-2012-06-03-Question 24 of my final exam for Design and Analysis of Sample Surveys

18 0.60851353 2041 andrew gelman stats-2013-09-27-Setting up Jitts online

19 0.60429835 1345 andrew gelman stats-2012-05-26-Question 16 of my final exam for Design and Analysis of Sample Surveys

20 0.59422982 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.012), (3, 0.054), (9, 0.044), (11, 0.013), (16, 0.047), (24, 0.301), (26, 0.013), (35, 0.014), (37, 0.029), (53, 0.014), (59, 0.038), (86, 0.044), (95, 0.034), (99, 0.24)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98714483 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

2 0.96803498 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

Introduction: Sharad had a survey sampling question: We’re trying to use mechanical turk to conduct some surveys, and have quickly discovered that turkers tend to be quite young. We’d really like a representative sample of the U.S., or at the least be able to recruit a diverse enough sample from turk that we can post-stratify to adjust the estimates. The approach we ended up taking is to pay turkers a small amount to answer a couple of screening questions (age & sex), and then probabilistically recruit individuals to complete the full survey (for more money) based on the estimated turk population parameters and our desired target distribution. We use rejection sampling, so the end result is that individuals who are invited to take the full survey look as if they came from a representative sample, at least in terms of age and sex. I’m wondering whether this sort of technique—a two step design in which participants are first screened and then probabilistically selected to mimic a target distributio

3 0.96404761 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

Introduction: From Chris Mulligan: The data come from the Center for Disease Control and cover the years 1969-1988. Chris also gives instructions for how to download the data and plot them in R from scratch (in 30 lines of R code)! And now, the background A few months ago I heard about a study reporting that, during a recent eleven-year period, more babies were born on Valentine’s Day and fewer on Halloween compared to neighboring days: I wrote , What I’d really like to see is a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. While they’re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don’t have fixed dates. It’s so frustrating when people only show part of the story. I was pointed to some tables: and a graph from Matt Stiles: The heatmap is cute but I wanted to se

4 0.96381289 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine

Introduction: Interesting discussion from David Gorski (which I found via this link from Joseph Delaney). I don’t have anything really to add to this discussion except to note the value of this sort of anecdote in a statistics discussion. It’s only n=1 and adds almost nothing to the literature on the effectiveness of various treatments, but a story like this can help focus one’s thoughts on the decision problems.

5 0.96332467 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

Introduction: Tyler Cowen writes : Texas has begun to enforce [a law regarding parallel parking] only recently . . . Up until now, of course, there has been strong net mobility into the state of Texas, so was the previous lack of enforcement so bad? I care not at all about the direction in which people park their cars and I have no opinion on this law, but I have to raise an alarm at Cowen’s argument here. Let me strip it down to its basic form: 1. Until recently, state X had policy A. 2. Up until now, there has been strong net mobility into state X 3. Therefore, the presumption is that policy A is ok. In this particular case, I think we can safely assume that parallel parking regulations have had close to zero impact on the population flows into and out of Texas. More generally, I think logicians could poke some holes into the argument that 1 and 2 above imply 3. For one thing, you could apply this argument to any policy in any state that’s had positive net migration. Hai

6 0.96280766 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

7 0.96224678 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

8 0.96223581 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

9 0.96201444 197 andrew gelman stats-2010-08-10-The last great essayist?

10 0.96123385 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

11 0.96057421 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

12 0.95910001 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census

13 0.95909572 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

14 0.95870042 938 andrew gelman stats-2011-10-03-Comparing prediction errors

15 0.95860493 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

16 0.9585526 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”

17 0.95792365 2143 andrew gelman stats-2013-12-22-The kluges of today are the textbook solutions of tomorrow.

18 0.95760548 1062 andrew gelman stats-2011-12-16-Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

19 0.95757854 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

20 0.9568156 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”