
2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?


meta infos for this blog

Source: html

Introduction: This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulations) as a subject, have both become increasingly important in political science and, I expect, will soon become increasingly important in other social sciences as well. Being able to estimate state-level opinion from national surveys is such a powerful thing that, if it can be done, people will do it. It’s taken 15 years or so for the method to really catch on, but the ready availability of survey data and of computing power, along with our increasing comfort level, as a profession, with these techniques, has made MRP more of a routine research tool. As a method becomes used more and more widely, there will be natural concerns about its domains of applicability. That is the subject of the pres


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 BH expand the number of MRP estimates subjected to validity checks and find varying degrees of MRP success across survey questions. [sent-8, score-0.228]

2 0 points) and expect correlation to “true” state values will be around . [sent-40, score-0.263]

3 Take into account uncertainty around your mrp estimates, especially in substantive work. [sent-52, score-0.838]

4 When assessing MRP estimates against noisy estimates of true state opinion, one needs to adjust correlations to account for this noise. [sent-57, score-0.55]

5 In the BH and in our own previous and current assessments of MRP, MRP estimates were compared to the raw mean by state in large national survey samples, large enough to have good estimates of the true mean by state to be a target for MRP. [sent-58, score-0.594]

6 Since the current goal of our work and of the BH work is to give a sense of the actual correlations to true opinion by state, this is no longer the best way to proceed. [sent-60, score-0.219]

7 Since we are measuring truth using what is in effect still only a sample (here, a sample of what true opinion would be in an infinitely large population in the world of the CCES survey), one must adjust for the reliability of the measure of truth. [sent-61, score-0.562]

8 One can do so using Cronbach’s alpha (based on Spearman-Brown and split-halves correlations between different sample estimates of “truth”). [sent-62, score-0.308]

9 8 so that the correlation of MRP to state values would be about a third higher than the naïve estimate. [sent-64, score-0.284]

10 One also cannot simply say that, since the target is the large CCES survey, the raw state means are the truth by definition and so no correction for reliability is needed. [sent-65, score-0.228]

11 After all, the sample sizes for estimating truth vary across questions and so do reliability scores (from around . [sent-66, score-0.284]

12 That is, even though MRP estimates wouldn’t change, their supposed success as measured through the correlation to the “truth” estimate would suddenly be lower… all because of not correcting for the additional noise in “truth. [sent-71, score-0.28]

13 To be sure, BH are aware of the dangers of noise in estimates and we do not mean to suggest otherwise. [sent-74, score-0.187]

14 For example, they correctly point out the possibility that some of our differential findings about responsiveness across policies may be induced or hidden by variations in quality of the MRP estimates themselves across policies. [sent-75, score-0.204]

15 (b) When computing the MRP estimates, they treat the survey — instead of the census — as the true population. [sent-85, score-0.215]

16 This, unfortunately, leads to highly inefficient estimates of the population distribution within states, which in turn leads to higher variance than necessary in the MRP estimates. [sent-107, score-0.296]

17 The middle plot shows the same results using the alternative method — I am using MRP on the full dataset as the “true value”, and using census population data to poststratify. [sent-115, score-0.272]

18 (It is worth noting that each line is ordered by its value — the top row has the lowest average correlation under each measure, but that means that the rows are not comparable across graphs. [sent-121, score-0.236]

19 But because state is included as a term in the model, this can at worst be considered equivalent to weighting the survey responses by the terms in the model (which is helpful in and of itself, as BH’s “true values” do not account for sample weights). [sent-124, score-0.293]

20 ” Although the method above is quite different than what LP describe, the average correlations are indeed about 40% higher than those reported by BH, on average. [sent-126, score-0.227]
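
The reliability adjustment mentioned in sentences 7 to 9 above can be sketched in a few lines. This is a minimal illustration, assuming the standard Spearman-Brown step-up formula for a split-half correlation and the classical correction for attenuation applied to the "truth" measure only. The function names and the input numbers are hypothetical and are not taken from the post, whose authors may have computed the adjustment differently.

```python
import math


def spearman_brown(split_half_r: float) -> float:
    """Step up a split-half correlation to the reliability of the full sample."""
    return 2.0 * split_half_r / (1.0 + split_half_r)


def disattenuate(observed_r: float, truth_reliability: float) -> float:
    """Correct an observed correlation for noise in the 'truth' measure only.

    The result can exceed 1 when the reliability estimate is itself noisy,
    so it should be read as a rough adjustment rather than an exact value.
    """
    if not 0.0 < truth_reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return observed_r / math.sqrt(truth_reliability)


# Hypothetical inputs, for illustration only.
split_half_r = 0.40  # correlation between two half-sample state means
naive_r = 0.45       # raw correlation of MRP estimates with noisy "truth"

reliability = spearman_brown(split_half_r)
corrected_r = disattenuate(naive_r, reliability)
print(f"reliability={reliability:.3f}, corrected r={corrected_r:.3f}")
```

With these made-up inputs the corrected correlation comes out roughly a third higher than the naive one, which is the size of adjustment the sentences above describe. Because reliability varies across survey questions, the correction would have to be computed separately for each question.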


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('mrp', 0.728), ('bh', 0.444), ('estimates', 0.128), ('lp', 0.117), ('true', 0.11), ('cces', 0.1), ('correlation', 0.093), ('state', 0.083), ('reliability', 0.073), ('truth', 0.072), ('correlations', 0.066), ('sample', 0.066), ('survey', 0.062), ('predictor', 0.061), ('average', 0.06), ('noise', 0.059), ('hereafter', 0.059), ('higher', 0.056), ('values', 0.052), ('phillips', 0.051), ('available', 0.05), ('lax', 0.049), ('using', 0.048), ('buttice', 0.048), ('highton', 0.048), ('responses', 0.047), ('interactions', 0.046), ('value', 0.045), ('method', 0.045), ('surveys', 0.045), ('measure', 0.044), ('census', 0.043), ('opinion', 0.043), ('malecki', 0.043), ('states', 0.042), ('paper', 0.041), ('population', 0.04), ('substantive', 0.04), ('codings', 0.039), ('devtools', 0.039), ('inspecting', 0.039), ('mrpdata', 0.039), ('across', 0.038), ('leads', 0.036), ('median', 0.036), ('yair', 0.036), ('sub', 0.036), ('thrust', 0.036), ('around', 0.035), ('account', 0.035)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999976 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?


2 0.76620221 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

Introduction: To recap: Matt Buttice and Ben Highton recently published an article where they evaluated multilevel regression and poststratification (MRP) on a bunch of political examples estimating state-level attitudes. My Columbia colleagues Jeff Lax, Justin Phillips, and Yair Ghitza added some discussion, giving a bunch of practical tips and pointing to some problems with Buttice and Highton’s evaluations. Buttice and Highton replied, emphasizing the difficulties of comparing methods in the absence of a known ground truth. And Jeff Lax added the following comment, which I think is a good overview of the discussion so far: In the back and forth between us all on details, some points may get lost and disagreements overstated. Where are things at this point? 1. Buttice and Highton (BH) show beyond previous work that MRP performance in making state estimates can vary to an extent that is not directly observable unless one knows the true estimates (in which case one would not be us

3 0.69217974 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

Introduction: Following up on our discussion the other day, Matt Buttice and Ben Highton write: It was nice to see our article mentioned and discussed by Andrew, Jeff Lax, Justin Phillips, and Yair Ghitza on Andrew’s blog in this post on Wednesday. As noted in the post, we recently published an article in Political Analysis on how well multilevel regression and poststratification (MRP) performs at producing estimates of state opinion with conventional national surveys where N≈1,500. Our central claims are that (i) the performance of MRP is highly variable, (ii) in the absence of knowing the true values, it is difficult to determine the quality of the MRP estimates produced on the basis of a single national sample, and, (iii) therefore, our views about the usefulness of MRP in instances where a researcher has a single sample of N≈1,500 are less optimistic than the ones expressed in previous research on the topic. Obviously we were interested in the blog posts. We found them stimulating

4 0.32889771 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

Introduction: 25. You are applying multilevel regression and poststratification (MRP) to a survey of 1500 people to estimate support for the space program, by state. The model is fit using, as a state-level predictor, the Republican presidential vote in the state, which turns out to have a low correlation with support for the space program. Which of the following statements are basically true? (Indicate all that apply.) (a) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the respondents in the sample from that state. (b) For small states, the MRP estimates will be determined almost entirely by the demographic characteristics of the population in that state. (c) Adding a predictor specifically for this model (for example, a measure of per-capita space-program spending in the state) could dramatically improve the estimates of state-level opinion. (d) It would not be appropriate to add a predictor such as per-capita space-program spen

5 0.31945661 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

Introduction: Ben Highton and Matt Buttice point us to this response addressing some of the issues Jeff Lax raised in his most recent MRP post. P.S. Jeff replies in comments: It sounds like we’ve converged. They acknowledge MRP performance is significantly better on average than reported in their new paper in PA and yet performance variation in terms of correlation to “truth” remains higher than some might have thought. Cool. I hope this sort of blog exchange can be a model of scientific discussion. Instead of a paper just sitting there by itself, it can be openly explored. Ideally, the published paper would include a link to these discussions of Highton, Buttice, Lax, Phillips, and Ghitza, so that readers would automatically get all this information.

6 0.29237631 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

7 0.18667211 162 andrew gelman stats-2010-07-25-Darn that Lindsey Graham! (or, “Mr. P Predicts the Kagan vote”)

8 0.14207101 544 andrew gelman stats-2011-01-29-Splitting the data

9 0.12792332 2173 andrew gelman stats-2014-01-15-Postdoc involving pathbreaking work in MRP, Stan, and the 2014 election!

10 0.12032289 70 andrew gelman stats-2010-06-07-Mister P goes on a date

11 0.10654274 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

12 0.10263164 200 andrew gelman stats-2010-08-11-Separating national and state swings in voting and public opinion, or, How I avoided blogorific embarrassment: An agony in four acts

13 0.098042965 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

14 0.096497245 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

15 0.09403342 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

16 0.093399405 2087 andrew gelman stats-2013-11-03-The Employment Nondiscrimination Act is overwhelmingly popular in nearly every one of the 50 states

17 0.090650737 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

18 0.081934109 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

19 0.081268623 962 andrew gelman stats-2011-10-17-Death!

20 0.081175379 2041 andrew gelman stats-2013-09-27-Setting up Jitts online


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.183), (1, 0.049), (2, 0.152), (3, -0.073), (4, 0.078), (5, 0.035), (6, -0.04), (7, -0.022), (8, -0.002), (9, 0.001), (10, 0.076), (11, -0.053), (12, 0.008), (13, 0.073), (14, -0.01), (15, 0.006), (16, -0.031), (17, -0.002), (18, -0.024), (19, -0.002), (20, -0.004), (21, -0.027), (22, -0.064), (23, -0.033), (24, -0.006), (25, -0.028), (26, -0.007), (27, 0.09), (28, 0.05), (29, 0.031), (30, 0.231), (31, -0.135), (32, 0.106), (33, -0.053), (34, 0.028), (35, 0.038), (36, 0.153), (37, -0.029), (38, 0.175), (39, -0.156), (40, 0.109), (41, -0.055), (42, -0.057), (43, -0.033), (44, -0.005), (45, -0.115), (46, 0.02), (47, 0.178), (48, 0.139), (49, 0.112)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96817201 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)


same-blog 2 0.95702988 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?


3 0.95699883 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does


4 0.80151707 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown


5 0.78132576 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys


6 0.68446058 152 andrew gelman stats-2010-07-17-Distorting the Electoral Connection? Partisan Representation in Confirmation Politics

7 0.63290775 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

8 0.57170594 200 andrew gelman stats-2010-08-11-Separating national and state swings in voting and public opinion, or, How I avoided blogorific embarrassment: An agony in four acts

9 0.54333824 162 andrew gelman stats-2010-07-25-Darn that Lindsey Graham! (or, “Mr. P Predicts the Kagan vote”)

10 0.53017783 150 andrew gelman stats-2010-07-16-Gaydar update: Additional research on estimating small fractions of the population

11 0.52156591 159 andrew gelman stats-2010-07-23-Popular governor, small state

12 0.5164361 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity

13 0.49212834 70 andrew gelman stats-2010-06-07-Mister P goes on a date

14 0.47953239 454 andrew gelman stats-2010-12-07-Diabetes stops at the state line?

15 0.47489887 2087 andrew gelman stats-2013-11-03-The Employment Nondiscrimination Act is overwhelmingly popular in nearly every one of the 50 states

16 0.47298935 2147 andrew gelman stats-2013-12-25-Measuring Beauty

17 0.4702417 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

18 0.46597052 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

19 0.46277902 228 andrew gelman stats-2010-08-24-A new efficient lossless compression algorithm

20 0.46265936 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(8, 0.01), (16, 0.075), (21, 0.022), (24, 0.164), (47, 0.013), (63, 0.018), (65, 0.19), (76, 0.033), (86, 0.033), (87, 0.018), (99, 0.285)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97259182 457 andrew gelman stats-2010-12-07-Whassup with phantom-limb treatment?

Introduction: OK, here’s something that is completely baffling me. I read this article by John Colapinto on the neuroscientist V. S. Ramachandran, who’s famous for his innovative treatment for “phantom limb” pain: His first subject was a young man who a decade earlier had crashed his motorcycle and torn from his spinal column the nerves supplying the left arm. After keeping the useless arm in a sling for a year, the man had the arm amputated above the elbow. Ever since, he had felt unremitting cramping in the phantom limb, as though it were immobilized in an awkward position. . . . Ramachandran positioned a twenty-inch-by-twenty-inch drugstore mirror . . . and told him to place his intact right arm on one side of the mirror and his stump on the other. He told the man to arrange the mirror so that the reflection created the illusion that his intact arm was the continuation of the amputated one. Then Ramachandran asked the man to move his right and left arms . . . “Oh, my God!” the man began

2 0.96672404 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”

Introduction: The above is the title of a talk that Thad Tarpey gave at the Joint Statistical Meetings in 2009. Here’s the abstract: Students of statistics are often introduced to George Box’s famous quote: “all models are wrong, some are useful.” In this talk I [Tarpey] argue that this quote, although useful, is wrong. A different and more positive perspective is to acknowledge that a model is simply a means of extracting information of interest from data. The truth is infinitely complex and a model is merely an approximation to the truth. If the approximation is poor or misleading, then the model is useless. In this talk I give examples of correct models that are not true models. I illustrate how the notion of a “wrong” model can lead to wrong conclusions. I’m curious what he had to say; maybe he could post the slides? P.S. And here they are!

3 0.965433 1021 andrew gelman stats-2011-11-21-Don’t judge a book by its title

Introduction: A correspondent writes: I just want to spend a few words to point you to this book I have just found on Amazon: “Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis” by G. Cumming. I have been attracted by the rather unusual and ‘sexy’ title but it seems to be nothing more than an attempt at alerting the psychology community on considering point estimation procedures and confidence intervals, in place of hypothesis testing, the latter being ‘a terrible idea!’ in the author’s own words. Some more quotes here. Then he says: “‘These are hardly new techniques, but I label them ‘The New Statistics’ because using them would for many researchers be quite new, as well as a highly beneficial change!’” Of course the latter is not stated on the book cover. That’s about as bad as writing a book with subtitle, “Why Americans vote the way they do,” but not actually telling the reader why Americans vote the way they do. I guess what I’m saying is:

4 0.96426737 1426 andrew gelman stats-2012-07-23-Special effects

Introduction: I just saw L’Age de Glace 4 and boy are my eyes tired. I’m just glad it wasn’t in 3-D or I probably would’ve thrown up. The special effects were amazing, way beyond George of the Jungle and that ilk. Which was good, as I could only understand about 10% of the dialogue. I’d heard about all this new animation technology but not actually seen it before.

5 0.96082592 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)


6 0.95666063 671 andrew gelman stats-2011-04-20-One more time-use graph

7 0.95611376 1475 andrew gelman stats-2012-08-30-A Stan is Born

8 0.95226771 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

same-blog 9 0.95003211 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

10 0.94883871 1993 andrew gelman stats-2013-08-22-Improvements to Kindle Version of BDA3

11 0.93584049 463 andrew gelman stats-2010-12-11-Compare p-values from privately funded medical trials to those in publicly funded research?

12 0.93133229 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

13 0.92242211 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

14 0.92214096 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

15 0.91408056 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?

16 0.91390002 758 andrew gelman stats-2011-06-11-Hey, good news! Your p-value just passed the 0.05 threshold!

17 0.91307241 1333 andrew gelman stats-2012-05-20-Question 10 of my final exam for Design and Analysis of Sample Surveys

18 0.91117716 100 andrew gelman stats-2010-06-19-Unsurprisingly, people are more worried about the economy and jobs than about deficits

19 0.9082514 1845 andrew gelman stats-2013-05-07-Is Felix Salmon wrong on free TV?

20 0.9002369 416 andrew gelman stats-2010-11-16-Is parenting a form of addiction?