
249 andrew gelman stats-2010-09-01-References on predicting elections


meta information for this blog

Source: html

Introduction: Mike Axelrod writes: I [Axelrod] am interested in building a model that predicts voting on the precinct level, using variables such as party registration, age, sex, income etc. Surely political scientists have worked on this problem. I would be grateful for any reference you could provide in the way of articles and books. My reply: Political scientists have worked on this problem, and it’s easy enough to imagine hierarchical models of the sort discussed in my book with Jennifer. I can picture what I would do if asked to forecast at the precinct level, for example to model exit polls. (In fact, I was briefly hired by the exit poll consortium in 2000 to do this, but then after I told them about hierarchical Bayes, they un-hired me!) But I don’t actually know of any literature on precinct-level forecasting. Perhaps one of you out there knows of some references?
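
To make the hierarchical idea in the reply concrete, here is a minimal sketch of partial pooling for precinct vote shares. It is not the model from the book or anything the exit poll consortium used: the function name partial_pool, the example counts, and the fixed between-precinct standard deviation tau are all assumptions made for illustration. A real hierarchical model would estimate tau from the data and include predictors such as party registration, age, sex, and income.

```python
# Hypothetical sketch: partial pooling of precinct-level vote shares.
# The data passed in and the default value of tau are made up for illustration.

def partial_pool(successes, totals, tau=0.05):
    """Shrink each precinct's raw vote share toward the pooled share.

    successes[j] / totals[j] is the raw share observed in precinct j.
    tau is an ASSUMED between-precinct standard deviation (not estimated here).
    """
    pooled = sum(successes) / sum(totals)
    estimates = []
    for y, n in zip(successes, totals):
        raw = y / n
        # Binomial sampling variance of the raw share, evaluated at the pooled share.
        sampling_var = pooled * (1 - pooled) / n
        # Precision weighting: small precincts (large sampling_var) get a small
        # weight on their own data and are pulled harder toward the pooled share.
        w = tau ** 2 / (tau ** 2 + sampling_var)
        estimates.append(w * raw + (1 - w) * pooled)
    return estimates


if __name__ == "__main__":
    # Two invented precincts: 8 of 10 voters, and 450 of 1000 voters.
    print(partial_pool([8, 450], [10, 1000]))
```

With these invented counts, the 10-voter precinct is pulled most of the way toward the pooled share, while the 1000-voter precinct barely moves. That is the behavior that makes hierarchical models attractive when many precincts have little data.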


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Mike Axelrod writes: I [Axelrod] am interested in building a model that predicts voting on the precinct level, using variables such as party registration, age, sex, income etc. [sent-1, score-1.138]

2 I would be grateful for any reference you could provide in the way of articles and books. [sent-3, score-0.496]

3 My reply: Political scientists have worked on this problem, and it’s easy enough to imagine hierarchical models of the sort discussed in my book with Jennifer. [sent-4, score-0.968]

4 I can picture what I would do if asked to forecast at the precinct level, for example to model exit polls. [sent-5, score-1.186]

5 (In fact, I was briefly hired by the exit poll consortium in 2000 to do this, but then after I told them about hierarchical Bayes, they un-hired me!) [sent-6, score-1.235]

6 But I don’t actually know of any literature on precinct-level forecasting. [sent-7, score-0.125]

7 Perhaps one of you out there knows of some references? [sent-8, score-0.097]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('axelrod', 0.402), ('precinct', 0.372), ('exit', 0.372), ('consortium', 0.201), ('hierarchical', 0.184), ('grateful', 0.176), ('worked', 0.174), ('registration', 0.165), ('scientists', 0.161), ('hired', 0.147), ('predicts', 0.14), ('level', 0.139), ('forecast', 0.126), ('mike', 0.125), ('surely', 0.125), ('briefly', 0.125), ('poll', 0.12), ('references', 0.116), ('political', 0.115), ('sex', 0.109), ('reference', 0.108), ('picture', 0.106), ('building', 0.105), ('voting', 0.102), ('party', 0.099), ('knows', 0.097), ('age', 0.095), ('income', 0.095), ('bayes', 0.094), ('told', 0.086), ('provide', 0.086), ('imagine', 0.084), ('model', 0.084), ('articles', 0.079), ('asked', 0.079), ('discussed', 0.079), ('literature', 0.079), ('variables', 0.078), ('easy', 0.075), ('fact', 0.069), ('interested', 0.063), ('reply', 0.062), ('book', 0.056), ('perhaps', 0.054), ('models', 0.053), ('enough', 0.053), ('sort', 0.049), ('would', 0.047), ('problem', 0.047), ('actually', 0.046)]
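
The page does not say how the tfidf weights or the simValue scores were computed. As a rough sketch of one common approach, the following assumes term frequency times log inverse document frequency for the weights, and cosine similarity between the resulting vectors. The function names tfidf_vectors and cosine and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {word: weight} dict per document.

    Weight = (count / document length) * log(N / document frequency).
    This is one common tf-idf variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {w: (c / len(doc)) * math.log(n_docs / df[w]) for w, c in tf.items()}
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy documents, invented for this example.
    docs = [
        ["precinct", "exit", "poll"],
        ["precinct", "sampling"],
        ["stan", "bayes"],
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the same-blog value of 1.0000001 in the tfidf list.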

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 249 andrew gelman stats-2010-09-01-References on predicting elections


2 0.11528914 107 andrew gelman stats-2010-06-24-PPS in Georgia

Introduction: Lucy Flynn writes: I’m working at a non-profit organization called CRRC in the Republic of Georgia. I’m having a methodological problem and I saw the syllabus for your sampling class online and thought I might be able to ask you about it? We do a lot of complex surveys nationwide; our typical sample design is as follows: - stratify by rural/urban/capital - sub-stratify the rural and urban strata into NE/NW/SE/SW geographic quadrants - select voting precincts as PSUs - select households as SSUs - select individual respondents as TSUs I’m relatively new here, and past practice has been to sample voting precincts with probability proportional to size. It’s desirable because it’s not logistically feasible for us to vary the number of interviews per precinct with precinct size, so it makes the selection probabilities for households more even across precinct sizes. However, I have a complex sampling textbook (Lohr 1999), and it explains how complex it is to calculate sel

3 0.10292824 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models

Introduction: David Williams writes: I am completing my doctoral dissertation dealing with modeling adverse birth outcomes. The models are complex with 9 risk factors, 5 area level variables and 4 individual level variables. I used hierarchical logistic regression (SAS glimmix) to analyze the data. I am now faced with reporting the results. Can you please recommend any references and/or examples that would suggest what results to report in what format? I have found no references and scant examples of reporting such results in tables. My reply: I think graphs are the way to go. I don’t have any immediate ideas beyond what’s in the book with Jennifer. I think this is an important area of research.

4 0.099801362 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

Introduction: In a link to our back-and-forth on causal inference and the use of hierarchical models to bridge between different inferential settings, Elias Bareinboim (a computer scientist who is working with Judea Pearl) writes: In the past week, I have been engaged in a discussion with Andrew Gelman and his blog readers regarding causal inference, selection bias, confounding, and generalizability. I was trying to understand how his method which he calls “hierarchical modeling” would handle these issues and what guarantees it provides. . . . If anyone understands how “hierarchical modeling” can solve a simple toy problem (e.g., M-bias, control of confounding, mediation, generalizability), please share with us. In his post, Bareinboim raises a direct question about hierarchical modeling and also indirectly brings up larger questions about what is convincing evidence when evaluating a statistical method. As I wrote earlier, Bareinboim believes that “The only way investigators can decide w

5 0.097956806 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

Introduction: Yi-Chun Ou writes: I am using a multilevel model with three levels. I read that you wrote a book about multilevel models, and wonder if you can solve the following question. The data structure is like this: Level one: customer (8444 customers) Level two: companys (90 companies) Level three: industry (17 industries) I use 6 level-three variables (i.e. industry characteristics) to explain the variance of the level-one effect across industries. The question here is whether there is an over-fitting problem since there are only 17 industries. I understand that this must be a problem for non-multilevel models, but is it also a problem for multilevel models? My reply: Yes, this could be a problem. I’d suggest combining some of your variables into a common score, or using only some of the variables, or using strong priors to control the inferences. This is an interesting and important area of statistics research, to do this sort of thing systematically. There’s lots o

6 0.091338247 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

7 0.091239437 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

8 0.087786935 2035 andrew gelman stats-2013-09-23-Scalable Stan

9 0.08643581 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

10 0.08466845 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

11 0.083312288 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

12 0.081692807 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

13 0.081333622 1570 andrew gelman stats-2012-11-08-Poll aggregation and election forecasting

14 0.080129921 1070 andrew gelman stats-2011-12-19-The scope for snooping

15 0.07572189 723 andrew gelman stats-2011-05-21-Literary blurb translation guide

16 0.07512939 1597 andrew gelman stats-2012-11-29-What is expected of a consultant

17 0.0741603 604 andrew gelman stats-2011-03-08-More on the missing conservative psychology researchers

18 0.073747426 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

19 0.07217966 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit

20 0.07049638 2255 andrew gelman stats-2014-03-19-How Americans vote


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.128), (1, 0.033), (2, 0.053), (3, 0.048), (4, -0.011), (5, 0.038), (6, -0.021), (7, -0.04), (8, 0.042), (9, 0.058), (10, 0.043), (11, 0.023), (12, 0.006), (13, -0.004), (14, 0.02), (15, -0.013), (16, -0.022), (17, 0.015), (18, 0.02), (19, -0.006), (20, 0.021), (21, -0.021), (22, -0.006), (23, -0.026), (24, -0.002), (25, -0.03), (26, 0.007), (27, -0.032), (28, 0.005), (29, -0.001), (30, -0.02), (31, 0.025), (32, -0.007), (33, -0.009), (34, -0.033), (35, -0.005), (36, 0.011), (37, -0.017), (38, -0.02), (39, 0.034), (40, -0.03), (41, -0.007), (42, 0.038), (43, 0.044), (44, -0.031), (45, -0.072), (46, 0.004), (47, 0.008), (48, 0.007), (49, 0.005)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96114367 249 andrew gelman stats-2010-09-01-References on predicting elections


2 0.67025542 1042 andrew gelman stats-2011-12-05-Timing is everything!

Introduction: A colleague emailed me with a question about the methods used by Groseclose and Milyo in their study of media bias. Before getting to the question, I just wanted to comment that Groseclose has had really bad timing with this project. First off, his article came out in 2005 when everybody was hating Bush. Even the Republicans who reelected him weren’t thrilled with the guy. Then his book came out in 2011. If his book had come out a year ago, that would’ve been perfect: the 2010 elections coming up, lots of anger at the Democrats and Obama, no peer-reviewed criticisms of his work, etc. Instead he waited until 2011, and then look what he got: - Republicans feel they have a chance at winning in 2012 so they’re more interested in fighting and less interested in complaining. - John Gasper shoots down Groseclose/Milyo in the Quarterly Journal of Political Science. That’s gotta hurt. (Until this point, Groseclose could respond to attacks by saying he was waiting until a crit

3 0.6637758 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

Introduction: Dietrich Stoyan writes: I asked the IMS people for an expert in statistics of voting/elections and they wrote me your name. I am a statistician, but never worked in the field voting/elections. It was my son-in-law who asked me for statistical theories in that field. He posed in particular the following problem: The aim of the voting is to come to a ranking of c candidates. Every vote is a permutation of these c candidates. The problem is to have probability distributions in the set of all permutations of c elements. Are there theories for such distributions? I should be very grateful for a fast answer with hints to literature. (I confess that I do not know your books.) My reply: Rather than trying to model the ranks directly, I’d recommend modeling a latent continuous outcome which then implies a distribution on ranks, if the ranks are of interest. There are lots of distributions of c-dimensional continuous outcomes. In political science, the usual way to start is

4 0.65702325 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

Introduction: Yi-Chun Ou writes: I am using a multilevel model with three levels. I read that you wrote a book about multilevel models, and wonder if you can solve the following question. The data structure is like this: Level one: customer (8444 customers) Level two: companys (90 companies) Level three: industry (17 industries) I use 6 level-three variables (i.e. industry characteristics) to explain the variance of the level-one effect across industries. The question here is whether there is an over-fitting problem since there are only 17 industries. I understand that this must be a problem for non-multilevel models, but is it also a problem for multilevel models? My reply: Yes, this could be a problem. I’d suggest combining some of your variables into a common score, or using only some of the variables, or using strong priors to control the inferences. This is an interesting and important area of statistics research, to do this sort of thing systematically. There’s lots o

5 0.65586877 421 andrew gelman stats-2010-11-19-Just chaid

Introduction: Reading somebody else’s statistics rant made me realize the inherent contradictions in much of my own statistical advice. Jeff Lax sent along this article by Philip Schrodt, along with the cryptic comment: Perhaps of interest to you. perhaps not. Not meant to be an excuse for you to rant against hypothesis testing again. In his article, Schrodt makes a reasonable and entertaining argument against the overfitting of data and the overuse of linear models. He states that his article is motivated by the quantitative papers he has been sent to review for journals or conferences, and he explicitly excludes “studies of United States voting behavior,” so at least I think Mister P is off the hook. I notice a bit of incoherence in Schrodt’s position–on one hand, he criticizes “kitchen-sink models” for overfitting and he criticizes “using complex methods without understanding the underlying assumptions” . . . but then later on he suggests that political scientists in this countr

6 0.65346086 1468 andrew gelman stats-2012-08-24-Multilevel modeling and instrumental variables

7 0.65314257 854 andrew gelman stats-2011-08-15-A silly paper that tries to make fun of multilevel models

8 0.64253646 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?

9 0.6421966 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models

10 0.64030439 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

11 0.640109 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

12 0.63912523 1392 andrew gelman stats-2012-06-26-Occam

13 0.63760328 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary

14 0.63061595 1121 andrew gelman stats-2012-01-15-R-squared for multilevel models

15 0.6304518 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

16 0.63020003 851 andrew gelman stats-2011-08-12-year + (1|year)

17 0.62829804 257 andrew gelman stats-2010-09-04-Question about standard range for social science correlations

18 0.62706935 1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model

19 0.62506145 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making

20 0.62257963 1636 andrew gelman stats-2012-12-23-Peter Bartlett on model complexity and sample size


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.031), (16, 0.018), (20, 0.019), (21, 0.056), (24, 0.089), (58, 0.245), (82, 0.022), (86, 0.059), (95, 0.038), (99, 0.298)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9362992 119 andrew gelman stats-2010-06-30-Why is George Apley overrated?

Introduction: A comment by Mark Palko reminded me that, while I’m a huge Marquand fan, I think The Late George Apley is way overrated. My theory is that Marquand’s best books don’t fit into the modernist way of looking at literature, and that the gatekeepers of the 1930s and 1940s, when judging Marquand by these standards, conveniently labeled Apley as his best book because it had a form–Edith-Wharton-style satire–that they could handle. In contrast, Point of No Return and all the other classics are a mixture of seriousness and satire that left critics uneasy. Perhaps there’s a way to study this sort of thing more systematically?

same-blog 2 0.88683057 249 andrew gelman stats-2010-09-01-References on predicting elections


3 0.84609294 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?

Introduction: In an article entitled 16 Weeks, Thomas Basbøll ruthlessly lays out the time constraints that limit what a student will be able to write during a semester and recommends that students follow a plan: Try to be realistic. If you need time for “free writing” or “thought writing” (writing to find out what you think) book that into your calendar as well, but the important part of the challenge is to find time to write down what you already know needs to be written. If you don’t yet know what you’re going to say this semester, then your challenge is, in part, to figure that out. But you should still find at least 30 minutes a day to write down something you know you want to say. Keep in mind that we are only talking about sixteen weeks in the very near future. . . . Assuming that you do have something to say, then, here’s the challenge: write always and only when (and what) your calendar tells you to. Don’t write when “inspired” to do so (unless this happens to coincide with your writing s

4 0.83800095 162 andrew gelman stats-2010-07-25-Darn that Lindsey Graham! (or, “Mr. P Predicts the Kagan vote”)

Introduction: On the basis of two papers and because it is completely obvious, we (meaning me, Justin, and John) predict that Elena Kagan will get confirmed to be an Associate Justice of the Supreme Court. But we also want to see how close we can come to predicting the votes for and against. We actually have two sets of predictions, both using the MRP technique discussed previously on this blog. The first is based on our recent paper in the Journal of Politics showing that support for the nominee in a senator’s home state plays a striking role in whether she or he votes to confirm the nominee. The second is based on a new working paper extending “basic” MRP to show that senators respond far more to their co-partisans than the median voter in their home states. Usually, our vote “predictions” do not differ much, but there is a group of senators who are predicted to vote yes for Kagan with a probability around 50% and the two sets of predictions thus differ for Kagan more than usual.

5 0.82967907 979 andrew gelman stats-2011-10-29-Bayesian inference for the parameter of a uniform distribution

Introduction: Subhash Lele writes: I was wondering if you might know some good references to Bayesian treatment of parameter estimation for U(0,b) type distributions. I am looking for cases where the parameter is on the boundary. I would appreciate any help and advice you could provide. I am, in particular, looking for an MCMC (preferably in WinBUGS) based approach. I figured out the WinBUGS part but I am still curious about the theoretical papers, asymptotics etc. I actually can’t think of any examples! But maybe you, the readers, can. We also should think of the best way to implement this model in Stan. We like to transform to avoid hard boundary constraints, but it seems a bit tacky to do a data-based transformation (which itself would not work if there are latent variables). P.S. I actually saw Lele speak at a statistics conference around 20 years ago. There was a lively exchange between Lele and an older guy who was working on similar problems using a different method. The oth

6 0.82881194 73 andrew gelman stats-2010-06-08-Observational Epidemiology

7 0.8254531 103 andrew gelman stats-2010-06-22-Beach reads, Proust, and income tax

8 0.81608051 815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle

9 0.81606448 1886 andrew gelman stats-2013-06-07-Robust logistic regression

10 0.8006596 1635 andrew gelman stats-2012-12-22-More Pinker Pinker Pinker

11 0.79184234 254 andrew gelman stats-2010-09-04-Bayesian inference viewed as a computational approximation to classical calculations

12 0.79021126 2364 andrew gelman stats-2014-06-08-Regression and causality and variable ordering

13 0.78840363 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.

14 0.78457284 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

15 0.7835145 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

16 0.78229082 2173 andrew gelman stats-2014-01-15-Postdoc involving pathbreaking work in MRP, Stan, and the 2014 election!

17 0.78199428 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”

18 0.78106195 731 andrew gelman stats-2011-05-26-Lottery probability update

19 0.78047454 462 andrew gelman stats-2010-12-10-Who’s holding the pen?, The split screen, and other ideas for one-on-one instruction

20 0.78015018 105 andrew gelman stats-2010-06-23-More on those divorce prediction statistics, including a discussion of the innumeracy of (some) mathematicians