2228 andrew gelman stats-2014-02-28-Combining two of my interests


meta info for this blog

Source: html

Introduction: Paul Alper writes: Hi Andrew (or Andy or even Gelman [17 of them]): Go to this link and have some fun with (useless? powerful?) data mining. As the authors say, it is addictive. Paul (no other way to spell it) Alper [215 of us] I’m reminded of this discussion from 2012, “Michael’s a Republican, Susan’s a Democrat.” As I wrote at the time: It’s no surprise that men give more to Republicans and women to Democrats, or that the average contribution to a Republican has a larger dollar value than the average contribution to a Democrat, nor perhaps should we be surprised that “Tom” splits his support between the two parties while “Thomas” is a strong Republican. Still, it’s fun to see the data. Overall, I think this graph understates contributions to Republicans because it doesn’t include those new super-pacs. But the new tool seems to be based on a different dataset, opinion polls rather than campaign contributions. Playing around a bit, I see a lot less variability


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Paul Alper writes: Hi Andrew (or Andy or even Gelman [17 of them]): Go to this link and have some fun with (useless? [sent-1, score-0.225]

2 Paul (no other way to spell it) Alper [215 of us] I’m reminded of this discussion from 2012, “Michael’s a Republican, Susan’s a Democrat. [sent-5, score-0.216]

3 Overall, I think this graph understates contributions to Republicans because it doesn’t include those new super-pacs. [sent-8, score-0.347]

4 But the new tool seems to be based on a different dataset, opinion polls rather than campaign contributions. [sent-9, score-0.559]

5 Playing around a bit, I see a lot less variability in party ID by name (estimated using the survey database) than in partisanship of campaign contributions by name (using the campaign contribution database). [sent-10, score-1.756]

6 In both cases, I’d say the data are fun and worth exploring but we should be careful before assuming the numbers are correct. [sent-13, score-0.571]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('campaign', 0.304), ('contribution', 0.29), ('alper', 0.284), ('database', 0.238), ('fun', 0.225), ('contributions', 0.184), ('republicans', 0.176), ('republican', 0.167), ('paul', 0.167), ('understates', 0.163), ('name', 0.139), ('id', 0.133), ('susan', 0.131), ('spell', 0.129), ('partisanship', 0.127), ('andy', 0.125), ('hi', 0.124), ('democrat', 0.122), ('average', 0.117), ('variability', 0.117), ('dollar', 0.115), ('parties', 0.113), ('exploring', 0.112), ('useless', 0.105), ('tom', 0.101), ('powerful', 0.1), ('polls', 0.096), ('surprise', 0.09), ('dataset', 0.09), ('thomas', 0.089), ('tool', 0.089), ('men', 0.089), ('democrats', 0.087), ('perhaps', 0.087), ('reminded', 0.087), ('playing', 0.086), ('sets', 0.085), ('women', 0.084), ('assuming', 0.083), ('overall', 0.083), ('careful', 0.081), ('party', 0.08), ('michael', 0.077), ('gelman', 0.076), ('surprised', 0.076), ('estimated', 0.076), ('andrew', 0.074), ('using', 0.072), ('data', 0.07), ('opinion', 0.07)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2228 andrew gelman stats-2014-02-28-Combining two of my interests

2 0.24479219 286 andrew gelman stats-2010-09-20-Are the Democrats avoiding a national campaign?

Introduction: Bob Erikson, one of my colleagues at Columbia who knows much more about American politics than I do, sent in the following screed. I’ll post Bob’s note, followed by my comments. Bob writes: Monday morning many of us were startled by the following headline: White House strenuously denies NYT report that it is considering getting aggressive about winning the midterm elections. At first I [Bob] thought I was reading the Onion, but no, it was a sarcastic comment on the blog Talking Points Memo. But the gist of the headline appears to be correct. Indeed, the New York Times reported that White House advisers denied that a national ad campaign was being planned. ‘There’s been no discussion of such a thing at the White House’ What do we make of this? Is there some hidden downside to actually running a national campaign? Of course, money spent nationally is not spent on targeted local campaigns. But that is always the case. What explains the Democrats’ trepidation abou

3 0.18327494 946 andrew gelman stats-2011-10-07-Analysis of Power Law of Participation

Introduction: Rick Wash writes: A colleague at USC (Lian Jian) and I were recently discussing a statistical analysis issue that both of us have run into recently. We both mostly do research about how people use online interactive websites. One property that most of these systems have is known as the “powerlaw of participation”: the distribution of the number of contributions from each person follows a powerlaw. This means that a few people contribute a TON and many, many people are in the “long tail” and contribute very rarely. For example, Facebook posts and Twitter posts both have this distribution, as do comments on blogs and many other forms of user contribution online. This distribution has proven to be a problem when we analyze individual behavior. The basic problem is that we’d like to account for the fact that we have repeated data from many users, but a large number of users only have 1 or 2 data points. For example, Lian recently analyzed data about monetary contributions

4 0.15299344 2255 andrew gelman stats-2014-03-19-How Americans vote

Introduction: An interview with me from 2012: You’re a statistician and wrote a book, Red State, Blue State, Rich State, Poor State, looking at why Americans vote the way they do. In an election year I think it would be a good time to revisit that question, not just for people in the US, but anyone around the world who wants to understand the realities – rather than the stereotypes – of how Americans vote. I regret the title I gave my book. I was too greedy. I wanted it to be an airport bestseller because I figured there were millions of people who are interested in politics and some subset of them are always looking at the statistics. It’s got a very grabby title and as a result people underestimated the content. They thought it was a popularisation of my work, or, at best, an expansion of an article we’d written. But it had tons of original material. If I’d given it a more serious, political science-y title, then all sorts of people would have wanted to read it, because they would

5 0.14685428 1512 andrew gelman stats-2012-09-27-A Non-random Walk Down Campaign Street

Introduction: Political campaigns are commonly understood as random walks, during which, at any point in time, the level of support for any party or candidate is equally likely to go up or down. Each shift in the polls is then interpreted as the result of some combination of news and campaign strategies. A completely different story of campaigns is the mean reversion model in which the elections are determined by fundamental factors of the economy and partisanship; the role of the campaign is to give voters a chance to reach their predetermined positions. The popularity of the random walk model for polls may be partially explained via analogy to the widespread idea that stock prices reflect all available information, as popularized in Burton Malkiel’s book, A Random Walk Down Wall Street. Once the idea has sunk in that short-term changes in the stock market are inherently unpredictable, it is natural for journalists to think the same of polls. For example, political analyst Nate Silver wrote

6 0.14450808 79 andrew gelman stats-2010-06-10-What happens when the Democrats are “fighting Wall Street with one hand, unions with the other,” while the Republicans are fighting unions with two hands?

7 0.12152259 394 andrew gelman stats-2010-11-05-2010: What happened?

8 0.12134027 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study

9 0.1197259 201 andrew gelman stats-2010-08-12-Are all rich people now liberals?

10 0.11555929 1318 andrew gelman stats-2012-05-13-Stolen jokes

11 0.11208269 210 andrew gelman stats-2010-08-16-What I learned from those tough 538 commenters

12 0.10886022 1577 andrew gelman stats-2012-11-14-Richer people continue to vote Republican

13 0.10656866 2141 andrew gelman stats-2013-12-20-Don’t douthat, man! Please give this fallacy a name.

14 0.098854378 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust

15 0.098440811 1556 andrew gelman stats-2012-11-01-Recently in the sister blogs: special pre-election edition!

16 0.09712027 237 andrew gelman stats-2010-08-27-Bafumi-Erikson-Wlezien predict a 50-seat loss for Democrats in November

17 0.09213049 593 andrew gelman stats-2011-02-27-Heat map

18 0.091882214 364 andrew gelman stats-2010-10-22-Politics is not a random walk: Momentum and mean reversion in polling

19 0.090764821 2157 andrew gelman stats-2014-01-02-2013

20 0.090469353 50 andrew gelman stats-2010-05-25-Looking for Sister Right
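
The simValue column above can be read as a similarity between sparse tfidf weight vectors. This page does not say which measure was used; the sketch below assumes cosine similarity, which is a common choice and is consistent with the same-blog score of 1.0. The function name cosine_similarity and the weights in other_post are hypothetical and exist only for illustration; the weights in this_post are the top five terms from the tfidf list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Top five tfidf weights listed for this post (the full list has 50 terms).
this_post = {
    "campaign": 0.304,
    "contribution": 0.29,
    "alper": 0.284,
    "database": 0.238,
    "fun": 0.225,
}

# Hypothetical weights for another post, made up for illustration only.
other_post = {"campaign": 0.25, "democrats": 0.2, "national": 0.3}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Under this assumption a post compared with itself scores 1 (up to floating-point rounding), and two posts that share no weighted terms score 0, so shared high-weight words such as "campaign" are what push a post up the list.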


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.155), (1, -0.058), (2, 0.116), (3, 0.065), (4, -0.015), (5, -0.024), (6, -0.057), (7, -0.026), (8, -0.01), (9, -0.009), (10, 0.035), (11, -0.014), (12, 0.023), (13, -0.003), (14, 0.012), (15, 0.023), (16, 0.004), (17, -0.009), (18, -0.016), (19, 0.014), (20, -0.036), (21, 0.002), (22, 0.031), (23, -0.018), (24, 0.007), (25, 0.019), (26, -0.03), (27, 0.032), (28, 0.018), (29, 0.011), (30, 0.021), (31, 0.043), (32, -0.004), (33, 0.002), (34, -0.02), (35, 0.04), (36, -0.043), (37, 0.004), (38, -0.004), (39, -0.01), (40, 0.021), (41, 0.02), (42, 0.093), (43, -0.006), (44, 0.009), (45, 0.068), (46, 0.044), (47, -0.047), (48, 0.039), (49, 0.019)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.93893081 2228 andrew gelman stats-2014-02-28-Combining two of my interests

2 0.74653399 286 andrew gelman stats-2010-09-20-Are the Democrats avoiding a national campaign?

3 0.74331915 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust

Introduction: A few years ago Larry Bartels presented this graph, a version of which later appeared in his book Unequal Democracy: Larry looked at the data in a number of ways, and the evidence seemed convincing that, at least in the short term, the Democrats were better than Republicans for the economy. This is consistent with Democrats’ general policies of lowering unemployment, as compared to Republicans lowering inflation, and, by comparing first-term to second-term presidents, he found that the result couldn’t simply be explained as a rebound or alternation pattern. The question then arose, why have the Republicans won so many elections? Why aren’t the Democrats consistently dominating? Non-economic issues are part of the story, of course, but lots of evidence shows the economy to be a key concern for voters, so it’s still hard to see how, with a pattern such as shown above, the Republicans could keep winning. Larry had some explanations, largely having to do with timing: under De

4 0.73822612 649 andrew gelman stats-2011-04-05-Internal and external forecasting

Introduction: Some thoughts on the implausibility of Paul Ryan’s 2.8% unemployment forecast. Some general issues arise. P.S. Yes, Democrats also have been known to promote optimistic forecasts!

5 0.71938992 654 andrew gelman stats-2011-04-09-There’s no evidence that voters choose presidential candidates based on their looks

Introduction: Jonathan Chait writes that the most important aspect of a presidential candidate is “political talent”: Republicans have generally understood that an agenda tilted toward the desires of the powerful requires a skilled frontman who can pitch Middle America. Favorite character types include jocks, movie stars, folksy Texans and war heroes. . . . [But the frontrunners for the 2012 Republican nomination] make Michael Dukakis look like John F. Kennedy. They are qualified enough to serve as president, but wildly unqualified to run for president. . . . [Mitch] Daniels’s drawbacks begin — but by no means end — with his lack of height, hair and charisma. . . . [Jeb Bush] suffers from an inherent branding challenge [because of his last name]. . . . [Chris] Christie . . . doesn’t cut a trim figure and who specializes in verbally abusing his constituents. . . . [Haley] Barbour is the comic embodiment of his party’s most negative stereotypes. A Barbour nomination would be the rough equivalent

6 0.71263921 2005 andrew gelman stats-2013-09-02-“Il y a beaucoup de candidats démocrates, et leurs idéologies ne sont pas très différentes. Et la participation est imprévisible.”

7 0.71182418 1512 andrew gelman stats-2012-09-27-A Non-random Walk Down Campaign Street

8 0.70618325 377 andrew gelman stats-2010-10-28-The incoming moderate Republican congressmembers

9 0.70202249 210 andrew gelman stats-2010-08-16-What I learned from those tough 538 commenters

10 0.70135951 312 andrew gelman stats-2010-10-02-“Regression to the mean” is fine. But what’s the “mean”?

11 0.70000666 1388 andrew gelman stats-2012-06-22-Americans think economy isn’t so bad in their city but is crappy nationally and globally

12 0.6936152 828 andrew gelman stats-2011-07-28-Thoughts on Groseclose book on media bias

13 0.69282424 521 andrew gelman stats-2011-01-17-“the Tea Party’s ire, directed at Democrats and Republicans alike”

14 0.69258589 1407 andrew gelman stats-2012-07-06-Statistical inference and the secret ballot

15 0.68745774 967 andrew gelman stats-2011-10-20-Picking on Gregg Easterbrook

16 0.67753297 656 andrew gelman stats-2011-04-11-Jonathan Chait and I agree about the importance of the fundamentals in determining presidential elections

17 0.67681724 394 andrew gelman stats-2010-11-05-2010: What happened?

18 0.67422938 2141 andrew gelman stats-2013-12-20-Don’t douthat, man! Please give this fallacy a name.

19 0.67348999 1635 andrew gelman stats-2012-12-22-More Pinker Pinker Pinker

20 0.67320037 384 andrew gelman stats-2010-10-31-Two stories about the election that I don’t believe


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.019), (16, 0.089), (20, 0.019), (24, 0.085), (43, 0.069), (47, 0.047), (57, 0.046), (63, 0.07), (75, 0.041), (77, 0.016), (86, 0.026), (98, 0.015), (99, 0.363)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98263371 2228 andrew gelman stats-2014-02-28-Combining two of my interests

2 0.9593904 75 andrew gelman stats-2010-06-08-“Is the cyber mob a threat to freedom?”

Introduction: This one was so dumb I couldn’t resist sharing it with you. TEMPLETON BOOK FORUM invites you to “Is the Cyber Mob a Threat to Freedom?” featuring Ron Rosenbaum, Slate, Lee Siegel, The New York Observer, moderated by Michael Goodwin, The New York Post New Threats to Freedom Today’s threats to freedom are “much less visible and obvious than they were in the 20th century and may even appear in the guise of social and political progress,” writes Adam Bellow in his introduction to the new essay collection that he has edited for the Templeton Press. Indeed, Bellow suggests, the danger often lies precisely in our “failure or reluctance to notice them.” According to Ron Rosenbaum and Lee Siegel, in their provocative contributions to the volume, the extraordinary advances made possible by the Internet have come at a sometimes worrisome cost. Rosenbaum focuses on how online anonymity has become a mask encouraging political discourse that is increasingly distorted by vitriol, abuse, and

3 0.95855534 2235 andrew gelman stats-2014-03-06-How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?

Introduction: I had a brief email exchange with Jeff Leek regarding our recent discussions of replication, criticism, and the self-correcting process of science. Jeff writes: (1) I can see the problem with serious, evidence-based criticisms not being published in the same journal (and linked to) studies that are shown to be incorrect. I have been mostly seeing these sorts of things show up in blogs. But I’m not sure that is a bad thing. I think people read blogs more than they read the literature. I wonder if this means that blogs will eventually be a sort of “shadow literature”? (2) I think there is a ton of bad literature out there, just like there is a ton of bad stuff on Google. If we focus too much on the bad stuff we will be paralyzed. I still manage to find good papers despite all the bad papers. (3) I think one positive solution to this problem is to incentivize/publish referee reports and give people credit for a good referee report just like they get credit for a good paper. T

4 0.95729876 544 andrew gelman stats-2011-01-29-Splitting the data

Introduction: Antonio Rangel writes: I’m a neuroscientist at Caltech . . . I’m using the debate on the ESP paper, as I’m sure other labs around the world are, as an opportunity to discuss some basic statistical issues/ideas w/ my lab. Request: Is there any chance you would be willing to share your thoughts about the difference between exploratory “data mining” studies and confirmatory studies? What I have in mind is that one could use a dataset to explore/discover novel hypotheses and then conduct another experiment to test those hypotheses rigorously. It seems that a good combination of both approaches could be the best of both worlds, since the first would lead to novel hypothesis discovery, and the latter to careful testing. . . it is a fundamental issue for neuroscience and psychology. My reply: I know that people talk about this sort of thing . . . but in any real setting, I think I’d want all my data right now to answer any questions I have. I like cross-validation and have used

5 0.9566375 1201 andrew gelman stats-2012-03-07-Inference = data + model

Introduction: A recent article on global warming reminded me of the difficulty of letting the data speak. William Nordhaus shows the following graph: And then he writes: One of the reasons that drawing conclusions on temperature trends is tricky is that the historical temperature series is highly volatile, as can be seen in the figure. The presence of short-term volatility requires looking at long-term trends. A useful analogy is the stock market. Suppose an analyst says that because real stock prices have declined over the last decade (which is true), it follows that there is no upward trend. Here again, an examination of the long-term data would quickly show this to be incorrect. The last decade of temperature and stock market data is not representative of the longer-term trends. The finding that global temperatures are rising over the last century-plus is one of the most robust findings of climate science and statistics. I see what he’s saying, but first, I don’t find the st

6 0.95301962 421 andrew gelman stats-2010-11-19-Just chaid

7 0.95189983 2326 andrew gelman stats-2014-05-08-Discussion with Steven Pinker on research that is attached to data that are so noisy as to be essentially uninformative

8 0.95164102 460 andrew gelman stats-2010-12-09-Statistics gifts?

9 0.95142758 1458 andrew gelman stats-2012-08-14-1.5 million people were told that extreme conservatives are happier than political moderates. Approximately .0001 million Americans learned that the opposite is true.

10 0.95109284 989 andrew gelman stats-2011-11-03-This post does not mention Wegman

11 0.95089114 1253 andrew gelman stats-2012-04-08-Technology speedup graph

12 0.95027339 452 andrew gelman stats-2010-12-06-Followup questions

13 0.95024204 2301 andrew gelman stats-2014-04-22-Ticket to Baaaaarf

14 0.95013708 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons

15 0.95005244 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models

16 0.94966209 1870 andrew gelman stats-2013-05-26-How to understand coefficients that reverse sign when you start controlling for things?

17 0.94957888 1347 andrew gelman stats-2012-05-27-Macromuddle

18 0.94871831 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)

19 0.94869161 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?

20 0.94859326 2279 andrew gelman stats-2014-04-02-Am I too negative?