andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-485 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Catherine Bueker writes: I [Bueker] am analyzing the effect of various contextual factors on the voter turnout of naturalized Latino citizens. I have included the natural log of the number of Spanish Language ads run in each state during the election cycle to predict voter turnout. I now want to calculate the predicted probabilities of turnout for those in states with 0 ads, 500 ads, 1000 ads, etc. The problem is that I do not know how to handle the beta coefficient of the LN(Spanish language ads). Is there some way to “unlog” the coefficient? My reply: Calculate these probabilities for specific values of predictors, then graph the predictions of interest. Also, you can average over the other inputs in your model to get summaries. See this article with Pardoe for further discussion.
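The reply can be illustrated with a minimal sketch. It assumes a logistic regression (the post only says "predicted probabilities"), and it assumes the predictor is log(ads + 1) so that 0 ads is defined; the post does not say how zero was handled. All coefficient values, input rows, and function names below are hypothetical, chosen only to show the mechanics: there is no need to "unlog" the coefficient, because you transform the ad count the same way the model did and then read off the predicted probability.

```python
import math


def inv_logit(x):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_turnout(ads, other_inputs, intercept, beta_log_ads, other_betas):
    """Predicted turnout probability for one person at a given ad count."""
    # Assumption: the model used log(ads + 1), so that 0 ads maps to 0.
    linear = intercept + beta_log_ads * math.log(ads + 1)
    linear += sum(b * x for b, x in zip(other_betas, other_inputs))
    return inv_logit(linear)


def average_prediction(ads, rows, intercept, beta_log_ads, other_betas):
    """Average the prediction over the observed values of the other inputs."""
    probs = [
        predicted_turnout(ads, row, intercept, beta_log_ads, other_betas)
        for row in rows
    ]
    return sum(probs) / len(probs)


# Hypothetical fitted coefficients and a tiny hypothetical data set.
INTERCEPT = -1.0
BETA_LOG_ADS = 0.15
OTHER_BETAS = [0.4, -0.2]
ROWS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

for ads in (0, 500, 1000):
    p = average_prediction(ads, ROWS, INTERCEPT, BETA_LOG_ADS, OTHER_BETAS)
    print(f"{ads:5d} ads: average predicted turnout = {p:.3f}")
```

Plotting the averaged probabilities against the ad count gives the graph of predictions the reply recommends. Averaging over the observed rows, rather than plugging in the mean of each input, is one reading of "average over the other inputs".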
sentIndex sentText sentNum sentScore
1 Catherine Bueker writes: I [Bueker] am analyzing the effect of various contextual factors on the voter turnout of naturalized Latino citizens. [sent-1, score-0.831]
2 I have included the natural log of the number of Spanish Language ads run in each state during the election cycle to predict voter turnout. [sent-2, score-1.466]
3 I now want to calculate the predicted probabilities of turnout for those in states with 0 ads, 500 ads, 1000 ads, etc. [sent-3, score-0.79]
4 The problem is that I do not know how to handle the beta coefficient of the LN(Spanish language ads). [sent-4, score-0.594]
5 My reply: Calculate these probabilities for specific values of predictors, then graph the predictions of interest. [sent-6, score-0.433]
6 Also, you can average over the other inputs in your model to get summaries. [sent-7, score-0.224]
7 See this article with Pardoe for further discussion. [sent-8, score-0.032]
wordName wordTfidf (topN-words)
[('ads', 0.59), ('bueker', 0.357), ('spanish', 0.252), ('turnout', 0.233), ('voter', 0.209), ('calculate', 0.209), ('coefficient', 0.175), ('probabilities', 0.163), ('language', 0.155), ('catherine', 0.153), ('latino', 0.147), ('pardoe', 0.147), ('contextual', 0.128), ('inputs', 0.114), ('cycle', 0.114), ('beta', 0.11), ('log', 0.096), ('handle', 0.092), ('analyzing', 0.088), ('predicted', 0.087), ('predictions', 0.079), ('included', 0.077), ('predictors', 0.077), ('predict', 0.076), ('factors', 0.074), ('election', 0.073), ('specific', 0.068), ('values', 0.067), ('natural', 0.067), ('states', 0.064), ('run', 0.062), ('graph', 0.056), ('average', 0.055), ('state', 0.054), ('various', 0.051), ('effect', 0.048), ('number', 0.048), ('reply', 0.047), ('discussion', 0.04), ('problem', 0.036), ('want', 0.034), ('model', 0.032), ('article', 0.032), ('know', 0.026), ('writes', 0.025), ('get', 0.023), ('see', 0.021), ('also', 0.021)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 485 andrew gelman stats-2010-12-25-Unlogging
Introduction: Catherine Bueker writes: I [Bueker] am analyzing the effect of various contextual factors on the voter turnout of naturalized Latino citizens. I have included the natural log of the number of Spanish Language ads run in each state during the election cycle to predict voter turnout. I now want to calculate the predicted probabilities of turnout for those in states with 0 ads, 500 ads, 1000 ads, etc. The problem is that I do not know how to handle the beta coefficient of the LN(Spanish language ads). Is there some way to “unlog” the coefficient? My reply: Calculate these probabilities for specific values of predictors, then graph the predictions of interest. Also, you can average over the other inputs in your model to get summaries. See this article with Pardoe for further discussion.
2 0.11905325 1080 andrew gelman stats-2011-12-24-Latest in blog advertising
Introduction: I received the following message from “Patricia Lopez” of “Premium Link Ads”: Hello, I am interested in placing a text link on your page: http://andrewgelman.com/2011/07/super_sam_fuld/. The link would point to a page on a website that is relevant to your page and may be useful to your site visitors. We would be happy to compensate you for your time if it is something we are able to work out. The best way to reach me is through a direct response to this email. This will help me get back to you about the right link request. Please let me know if you are interested, and if not thanks for your time. Thanks. Usually I just ignore these, but after our recent discussion I decided to reply. I wrote: How much do you pay? But no answer. I wonder what’s going on? I mean, why bother sending the email in the first place if you’re not going to follow up?
3 0.11654638 678 andrew gelman stats-2011-04-25-Democrats do better among the most and least educated groups
Introduction: These are based on raw Pew data, reweighted to adjust for voter turnout by state, income, and ethnicity. No modeling of vote on age, education, and ethnicity. I think our future estimates based on the 9-way model will be better, but these are basically OK, I think. All but six of the dots in the graph are based on sample sizes greater than 30. I published these last year but they’re still relevant, I think. There’s lots of confusion when it comes to education and voting.
4 0.11092904 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys
Introduction: 1. Suppose that, in a survey of 1000 people in a state, 400 say they voted in a recent primary election. Actually, though, the voter turnout was only 30%. Give an estimate of the probability that a nonvoter will falsely state that he or she voted. (Assume that all voters honestly report that they voted.) P.S. The commenters are picking up some of the unintended “Hare and pineapple” ambiguity in my question!
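One straightforward reading of this question can be worked as arithmetic. The sketch below assumes that the sample's true turnout equals the stated 30% and that every voter reports voting, so the reported share is turnout plus the nonvoter share times the false-report rate. It ignores sampling variability and the ambiguity mentioned in the P.S.; the function name is hypothetical.

```python
def false_report_rate(share_reporting_vote, true_turnout):
    """Solve reported = turnout * 1 + (1 - turnout) * p for p."""
    return (share_reporting_vote - true_turnout) / (1.0 - true_turnout)


# 400 of 1000 respondents say they voted; actual turnout was 30%.
estimate = false_report_rate(400 / 1000, 0.30)
print(round(estimate, 3))
```

Under these assumptions, 100 of the 700 nonvoters in the sample falsely report voting, so the estimate is 1/7.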
5 0.10442436 1832 andrew gelman stats-2013-04-29-The blogroll
Introduction: I encourage you to check out our linked blogs. Here’s what they’re all about: Cognitive and Behavioral Science BPS Research Digest: I haven’t been following this one recently, but it has lots of good links, I should probably check it more often. There are a couple things that bother me, though. The blog is sponsored by the British Psychological Society, so this sounds pretty serious. But then they run things like advertising promotions sponsored by a textbook company and highlight iffy experimental claims. For example, in 2010 they ran a wholly uncritical post on the notorious Daryl Bem study that purported to find ESP. After being called on it in the comments, the blogger (Christian Jarrett) responded with, “The stats appear sound. . . . it’s a great study. Rigorously conducted” and even defended “the discussion of quantum physics in the paper.” To be fair, though, and as he points out in comments, Jarrett wrote of Bem’s study: “this isn’t proof of psi, far fr
6 0.102281 1310 andrew gelman stats-2012-05-09-Varying treatment effects, again
7 0.10222967 406 andrew gelman stats-2010-11-10-Translating into Votes: The Electoral Impact of Spanish-Language Ballots
8 0.10091723 1049 andrew gelman stats-2011-12-09-Today in the sister blog
9 0.083721571 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update
10 0.075078353 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation
11 0.072293423 520 andrew gelman stats-2011-01-17-R Advertised
12 0.067659199 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models
13 0.066874661 159 andrew gelman stats-2010-07-23-Popular governor, small state
14 0.066202104 389 andrew gelman stats-2010-11-01-Why it can be rational to vote
15 0.066202104 1565 andrew gelman stats-2012-11-06-Why it can be rational to vote
16 0.064065009 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys
17 0.062791646 1346 andrew gelman stats-2012-05-27-Average predictive comparisons when changing a pair of variables
18 0.062647544 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients
19 0.060811661 951 andrew gelman stats-2011-10-11-Data mining efforts for Obama’s campaign
20 0.060759459 2315 andrew gelman stats-2014-05-02-Discovering general multidimensional associations
topicId topicWeight
[(0, 0.069), (1, 0.023), (2, 0.07), (3, 0.011), (4, 0.024), (5, -0.004), (6, -0.026), (7, -0.025), (8, 0.017), (9, 0.002), (10, 0.023), (11, 0.006), (12, 0.015), (13, -0.027), (14, -0.008), (15, 0.014), (16, 0.007), (17, 0.009), (18, -0.015), (19, 0.005), (20, -0.003), (21, 0.035), (22, 0.01), (23, -0.012), (24, 0.011), (25, 0.007), (26, 0.005), (27, 0.031), (28, -0.015), (29, 0.001), (30, 0.004), (31, -0.009), (32, -0.004), (33, -0.004), (34, -0.012), (35, -0.005), (36, 0.056), (37, -0.035), (38, -0.006), (39, 0.005), (40, -0.005), (41, -0.007), (42, 0.023), (43, -0.016), (44, 0.01), (45, -0.01), (46, -0.003), (47, -0.021), (48, 0.007), (49, 0.003)]
simIndex simValue blogId blogTitle
same-blog 1 0.95141029 485 andrew gelman stats-2010-12-25-Unlogging
Introduction: Catherine Bueker writes: I [Bueker] am analyzing the effect of various contextual factors on the voter turnout of naturalized Latino citizens. I have included the natural log of the number of Spanish Language ads run in each state during the election cycle to predict voter turnout. I now want to calculate the predicted probabilities of turnout for those in states with 0 ads, 500 ads, 1000 ads, etc. The problem is that I do not know how to handle the beta coefficient of the LN(Spanish language ads). Is there some way to “unlog” the coefficient? My reply: Calculate these probabilities for specific values of predictors, then graph the predictions of interest. Also, you can average over the other inputs in your model to get summaries. See this article with Pardoe for further discussion.
2 0.69338703 162 andrew gelman stats-2010-07-25-Darn that Lindsey Graham! (or, “Mr. P Predicts the Kagan vote”)
Introduction: On the basis of two papers and because it is completely obvious, we (meaning me, Justin, and John) predict that Elena Kagan will get confirmed to be an Associate Justice of the Supreme Court. But we also want to see how close we can come to predicting the votes for and against. We actually have two sets of predictions, both using the MRP technique discussed previously on this blog. The first is based on our recent paper in the Journal of Politics showing that support for the nominee in a senator’s home state plays a striking role in whether she or he votes to confirm the nominee. The second is based on a new working paper extending “basic” MRP to show that senators respond far more to their co-partisans than the median voter in their home states. Usually, our vote “predictions” do not differ much, but there is a group of senators who are predicted to vote yes for Kagan with a probability around 50% and the two sets of predictions thus differ for Kagan more than usual.
3 0.653395 292 andrew gelman stats-2010-09-23-Doug Hibbs on the fundamentals in 2010
Introduction: Hibbs, one of the original economy-and-elections guys, writes: The number of House seats won by the president’s party at midterm elections is well explained by three pre-determined or exogenous variables: (1) the number of House seats won by the in-party at the previous on-year election, (2) the vote margin of the in-party’s candidate at the previous presidential election, and (3) the average growth rate of per capita real disposable personal income during the congressional term. Given the partisan division of House seats following the 2008 on-year election, President Obama’s margin of victory in 2008, and the weak growth of per capita real income during the first 6 quarters of the 111th Congress, the Democrats’ chances of holding on to a House majority by winning at least 218 seats at the 2010 midterm election will depend on real income growth in the 3rd quarter of 2010. The data available at this writing indicate that the Democrats will win 211 seats, a loss of 45 from the 2008 o
4 0.63803428 934 andrew gelman stats-2011-09-30-Nooooooooooooooooooo!
Introduction: Michael Axelrod writes: Quantitative historian Allan Lichtman claims to have discovered 13 predictors that determine who will win the popular vote in presidential elections. He predicts Obama will win in 2012. Writing in his New York Times column, “538,” Nate Silver attempted a critique of Lichtman’s prediction. Soon afterward Lichtman wrote a rejoinder. Evidently Lichtman has correctly and publicly predicted the popular vote winners in the last 7 presidential elections. I think he predicted Gore would win in 2000. He got the popular vote winner right, but not the electoral college vote winner. Lichtman presents his methods in his early 1980s book, “The Keys to the White House.” Lichtman consulted with Volodia Keilis-Borok, and used a kernel discriminant analysis approach on election results from 1860-1980 as the training set. I think there is some argument as to scoring because Lichtman claims more than 7 successes. I guess he divided the data into training and validation sets and w
5 0.63726467 406 andrew gelman stats-2010-11-10-Translating into Votes: The Electoral Impact of Spanish-Language Ballots
Introduction: Dan Hopkins sends along this article: [Hopkins] uses regression discontinuity design to estimate the turnout and election impacts of Spanish-language assistance provided under Section 203 of the Voting Rights Act. Analyses of two different data sets – the Latino National Survey and California 1998 primary election returns – show that Spanish-language assistance increased turnout for citizens who speak little English. The California results also demonstrate that election procedures can influence outcomes, as support for ending bilingual education dropped markedly in heavily Spanish-speaking neighborhoods with Spanish-language assistance. The California analyses find hints of backlash among non-Hispanic white precincts, but not with the same size or certainty. Small changes in election procedures can influence who votes as well as what wins. Beyond the direct relevance of these results, I find this paper interesting as an example of research that is fundamentally quantitative. Th
6 0.62932867 270 andrew gelman stats-2010-09-12-Comparison of forecasts for the 2010 congressional elections
7 0.62780416 250 andrew gelman stats-2010-09-02-Blending results from two relatively independent multi-level models
8 0.6251471 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c
9 0.61721259 678 andrew gelman stats-2011-04-25-Democrats do better among the most and least educated groups
11 0.60860974 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary
12 0.60519868 1544 andrew gelman stats-2012-10-22-Is it meaningful to talk about a probability of “65.7%” that Obama will win the election?
13 0.59984374 237 andrew gelman stats-2010-08-27-Bafumi-Erikson-Wlezien predict a 50-seat loss for Democrats in November
15 0.59344578 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs
16 0.59193599 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers
17 0.59023154 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model
18 0.58968788 257 andrew gelman stats-2010-09-04-Question about standard range for social science correlations
19 0.58857894 391 andrew gelman stats-2010-11-03-Some thoughts on election forecasting
20 0.58001643 1570 andrew gelman stats-2012-11-08-Poll aggregation and election forecasting
topicId topicWeight
[(16, 0.047), (21, 0.015), (24, 0.239), (52, 0.242), (69, 0.04), (86, 0.072), (99, 0.19)]
simIndex simValue blogId blogTitle
same-blog 1 0.94530857 485 andrew gelman stats-2010-12-25-Unlogging
Introduction: Catherine Bueker writes: I [Bueker] am analyzing the effect of various contextual factors on the voter turnout of naturalized Latino citizens. I have included the natural log of the number of Spanish Language ads run in each state during the election cycle to predict voter turnout. I now want to calculate the predicted probabilities of turnout for those in states with 0 ads, 500 ads, 1000 ads, etc. The problem is that I do not know how to handle the beta coefficient of the LN(Spanish language ads). Is there some way to “unlog” the coefficient? My reply: Calculate these probabilities for specific values of predictors, then graph the predictions of interest. Also, you can average over the other inputs in your model to get summaries. See this article with Pardoe for further discussion.
2 0.89840525 546 andrew gelman stats-2011-01-31-Infovis vs. statistical graphics: My talk tomorrow (Tues) 1pm at Columbia
Introduction: Infovis vs. statistical graphics. Tues 1 Feb 2011 1pm, Avery Hall room 114. It’s for the Lectures in Planning Series at the School of Architecture, Planning, and Preservation. Background on the talk (joint with Antony Unwin) is here. And here are more of my thoughts on statistical graphics.
3 0.86384541 223 andrew gelman stats-2010-08-21-Statoverflow
Introduction: Skirant Vadali writes: I am writing to seek your help in building a community driven Q&A website tentatively called ‘Statistics Analysis’. I am neither a founder of this website nor do I have any financial stake in its success. By way of background to this website, please see Stackoverflow (http://stackoverflow.com/) and Mathoverflow (http://mathoverflow.net/). Stackoverflow is a Q&A website targeted at software developers and is designed to help them ask questions and get answers from other developers. Mathoverflow is a Q&A website targeted at research mathematicians and is designed to help them ask and answer questions from other mathematicians across the world. The success of both these sites in helping their respective communities is a strong indicator that sites designed along these lines are very useful. The company that runs Stackoverflow (who also host Mathoverflow.net) has recently decided to develop other community driven websites for various other topic are
4 0.84644282 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions
Introduction: Jim Thomson writes: I wonder if you could provide some clarification on the correct way to calculate the finite-population standard deviations for interaction terms in your Bayesian approach to ANOVA (as explained in your 2005 paper, and Gelman and Hill 2007). I understand that it is the SD of the constrained batch coefficients that is of interest, but in most WinBUGS examples I have seen, the SDs are all calculated directly as sd.fin<-sd(beta.main[]) for main effects and sd(beta.int[,]) for interaction effects, where beta.main and beta.int are the unconstrained coefficients, e.g. beta.int[i,j]~dnorm(0,tau). For main effects, I can see that it makes no difference, since the constrained value is calculated by subtracting the mean, and sd(B[]) = sd(B[]-mean(B[])). But the conventional sum-to-zero constraint for interaction terms in linear models is more complicated than subtracting the mean (there are only (n1-1)*(n2-1) free coefficients for an interaction b/w factors with n1 a
5 0.84155852 1246 andrew gelman stats-2012-04-04-Data visualization panel at the New York Public Library this evening!
Introduction: I’ll be participating in a panel (along with Kaiser Fung, Mark Hansen, Tahir Hemphill, and Manuel Lima), “What Makes Good Data Visualization?”, at the 42nd St. library this evening. The event is organized by Isabel Walcott Draves and is part of the Leaders in Software and Art series. This article with Antony Unwin should be relevant (although I won’t be “presenting”; I’ll be part of a panel and we’ll be having a wide-ranging conversation).
7 0.8230322 2096 andrew gelman stats-2013-11-10-Schiminovich is on The Simpsons
8 0.79588354 1531 andrew gelman stats-2012-10-12-Elderpedia
9 0.78485256 1256 andrew gelman stats-2012-04-10-Our data visualization panel at the New York Public Library
10 0.78226733 914 andrew gelman stats-2011-09-16-meta-infographic
11 0.77997577 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine
12 0.77974582 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization
13 0.77875936 2247 andrew gelman stats-2014-03-14-The maximal information coefficient
14 0.7777468 918 andrew gelman stats-2011-09-21-Avoiding boundary estimates in linear mixed models
15 0.77677953 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies
16 0.77464563 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys
17 0.77367401 846 andrew gelman stats-2011-08-09-Default priors update?
18 0.77326554 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample
19 0.77207029 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”
20 0.77156001 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident