andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1508 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Even within the realm of writing-about-statistics, there are things I can say in a blog that are much more difficult to include in an academic article. Blogging gives me freedom. But I want to distinguish between two different sorts of frankness. 1. Obnoxiousness: In a blog I can write, “I hate X” as rudely as I’d like without needing to justify myself. 2. Openness: In a blog I can write about the limitations of my work. It’s a real challenge to discuss limitations in a scholarly article, as we’re always looking over our shoulder at what referees might think. Sure, sometimes I can get away with writing “Survey weighting is a mess,” but my impression is that most scholarly articles are relentlessly upbeat. Sort of like how a magazine article typically will have a theme and just plug it over and over. In a blog we can more easily admit uncertainty. Overall, I think blogs are more celebrated for feature 1 above (the freedom to say what you really feel, to be rude, partisan, and politically incorrect), but I think feature 2 (the freedom to express uncertainty) is important too.
sentIndex sentText sentNum sentScore
1 Even within the realm of writing-about-statistics, there are things I can say in a blog that are much more difficult to include in an academic article. [sent-1, score-0.76]
2 But I want to distinguish between two different sorts of frankness. [sent-3, score-0.209]
3 Obnoxiousness: In a blog I can write, “I hate X” as rudely as I’d like without needing to justify myself. [sent-5, score-0.81]
4 Openness: In a blog I can write about the limitations of my work. [sent-7, score-0.565]
5 It’s a real challenge to discuss limitations in a scholarly article, as we’re always looking over our shoulder at what referees might think. [sent-8, score-1.122]
6 Sure, sometimes I can get away with writing “Survey weighting is a mess,” but my impression is that most scholarly articles are relentlessly upbeat. [sent-9, score-0.896]
7 Sort of like how a magazine article typically will have a theme and just plug it over and over. [sent-10, score-0.591]
8 Overall, I think blogs are more celebrated for feature 1 above (the freedom to say what you really feel, to be rude, partisan, and politically incorrect), but I think feature 2 (the freedom to express uncertainty) is important too. [sent-12, score-1.529]
wordName wordTfidf (topN-words)
[('limitations', 0.263), ('scholarly', 0.249), ('freedom', 0.247), ('feature', 0.226), ('rudely', 0.211), ('shoulder', 0.211), ('relentlessly', 0.199), ('blog', 0.187), ('realm', 0.179), ('plug', 0.179), ('needing', 0.156), ('openness', 0.156), ('rude', 0.153), ('celebrated', 0.153), ('justify', 0.146), ('mess', 0.143), ('referees', 0.14), ('incorrect', 0.135), ('theme', 0.135), ('partisan', 0.132), ('weighting', 0.131), ('distinguish', 0.126), ('politically', 0.12), ('blogs', 0.118), ('write', 0.115), ('magazine', 0.114), ('express', 0.111), ('challenge', 0.11), ('hate', 0.11), ('blogging', 0.109), ('admit', 0.104), ('overall', 0.101), ('easily', 0.095), ('uncertainty', 0.093), ('impression', 0.091), ('academic', 0.085), ('sorts', 0.083), ('gives', 0.083), ('article', 0.082), ('typically', 0.081), ('say', 0.081), ('discuss', 0.081), ('articles', 0.079), ('difficult', 0.079), ('away', 0.078), ('include', 0.075), ('survey', 0.074), ('within', 0.074), ('sometimes', 0.069), ('looking', 0.068)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 1508 andrew gelman stats-2012-09-23-Speaking frankly
2 0.14698079 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box
Introduction: Rajiv Sethi writes: I suspect that within a decade, blogs will be a cornerstone of research in economics. Many original and creative contributions to the discipline will first be communicated to the profession (and the world at large) in the form of blog posts, since the medium allows for material of arbitrary length, depth and complexity. Ideas first expressed in this form will make their way (with suitable attribution) into reading lists, doctoral dissertations and more conventionally refereed academic publications. And blogs will come to play a central role in the process of recruitment, promotion and reward at major research universities. This genie is not going back into its bottle. And he thinks this is a good thing: In fact, the refereeing process for blog posts is in some respects more rigorous than that for journal articles. Reports are numerous, non-anonymous, public, rapidly and efficiently produced, and collaboratively constructed. It is not obvious to me [Sethi]
3 0.13341655 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting
Introduction: From a comment I made in an email exchange: My work on survey adjustments has very much been inspired by the ideas of Rod Little. Much of my efforts have gone toward the goal of integrating hierarchical modeling (which is so helpful for small-area estimation) with post stratification (which adjusts for known differences between sample and population). In the surveys I’ve dealt with, nonresponse/nonavailability can be a big issue, and I’ve always tried to emphasize that (a) the probability of a person being included in the sample is just about never known, and (b) even if this probability were known, I’d rather know the empirical n/N than the probability p (which is only valid in expectation). Regarding nonparametric modeling: I haven’t done much of that (although I hope to at some point) but Rod and his students have. As I wrote in the first sentence of the above-linked paper, I do think the current theory and practice of survey weighting is a mess, in that much depends on so
4 0.11936954 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable
Introduction: Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended using multilevel models to get inference for small cross-classifications or small areas. We’re working on it.
5 0.1106849 1697 andrew gelman stats-2013-01-29-Where 36% of all boys end up nowadays
Introduction: My Take a Number feature appears in today’s Times. And here are the graphs that I wish they’d had space to include! Original story here.
6 0.11066614 2245 andrew gelman stats-2014-03-12-More on publishing in journals
7 0.10219125 727 andrew gelman stats-2011-05-23-My new writing strategy
8 0.099413678 752 andrew gelman stats-2011-06-08-Traffic Prediction
9 0.096763797 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?
10 0.09143994 1202 andrew gelman stats-2012-03-08-Between and within-Krugman correlation
11 0.088449061 868 andrew gelman stats-2011-08-24-Blogs vs. real journalism
12 0.087591842 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
13 0.086736806 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet
14 0.085164621 840 andrew gelman stats-2011-08-05-An example of Bayesian model averaging
15 0.084803581 2244 andrew gelman stats-2014-03-11-What if I were to stop publishing in journals?
16 0.084012307 49 andrew gelman stats-2010-05-24-Blogging
18 0.076596104 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?
19 0.07609833 1832 andrew gelman stats-2013-04-29-The blogroll
20 0.074702427 2269 andrew gelman stats-2014-03-27-Beyond the Valley of the Trolls
topicId topicWeight
[(0, 0.133), (1, -0.048), (2, -0.019), (3, -0.003), (4, 0.003), (5, -0.023), (6, 0.048), (7, -0.024), (8, 0.045), (9, -0.023), (10, 0.044), (11, -0.03), (12, -0.005), (13, 0.043), (14, -0.04), (15, -0.004), (16, -0.086), (17, -0.022), (18, -0.018), (19, 0.037), (20, 0.028), (21, -0.047), (22, -0.035), (23, 0.047), (24, 0.04), (25, -0.008), (26, 0.029), (27, 0.004), (28, -0.029), (29, 0.022), (30, 0.035), (31, 0.015), (32, -0.0), (33, 0.025), (34, 0.028), (35, -0.005), (36, 0.032), (37, 0.011), (38, 0.016), (39, -0.047), (40, 0.011), (41, 0.003), (42, 0.018), (43, 0.011), (44, 0.017), (45, -0.002), (46, -0.039), (47, -0.013), (48, -0.028), (49, -0.002)]
simIndex simValue blogId blogTitle
same-blog 1 0.97719425 1508 andrew gelman stats-2012-09-23-Speaking frankly
2 0.79863006 868 andrew gelman stats-2011-08-24-Blogs vs. real journalism
Introduction: I was thinking a bit more about Jonathan Rauch’s lament about the fading of the buggy-whip industry print journalism, in which he mocks bloggers, analogizes blogging to scribbling with spray paint on the side of a building, and writes that the blogosphere is “the single worst medium for sustained, and therefore grown-up, reading and writing and argumentation ever invented.” Yup. Worse than talk radio. Worse than cave painting. Worse than smoke signals, rock ‘n’ roll lyrics, woodcuts, spray-paint graffiti, and every other medium of communication ever invented. OK, he didn’t really mean it. Rauch actually has an ironclad argument here. He’s claiming, in a blog, that blogging is crap. Therefore, if he fills his blog with unsupported exaggerations, that’s fine, as he’s demonstrating that blogging is . . . crap. Not to pile on, but, hey, why not? I was curious what Rauch has blogged on lately, so I googled Jonathan Rauch blog and ended up at this site, which most recently
3 0.78449637 727 andrew gelman stats-2011-05-23-My new writing strategy
Introduction: In high school and college I would write long assignments using a series of outlines. I’d start with a single sheet where I’d write down the key phrases, connect them with lines, and then write more and more phrases until the page was filled up. Then I’d write a series of outlines, culminating in a sentence-level outline that was roughly one line per sentence of the paper. Then I’d write. It worked pretty well. Or horribly, depending on how you look at it. I was able to produce 10-page papers etc. on time. But I think it crippled my writing style for years. It’s taken me a long time to learn how to write directly–to explain clearly what I’ve done and why. And I’m still working on the “why” part. There’s a thin line between verbosity and terseness. I went to MIT and my roommate was a computer science major. He wrote me a word processor on his Atari 800, which did the job pretty well. For my senior thesis I broke down and used the computers in campus. I formatted it in tro
4 0.77670294 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?
Introduction: Journalist Jonathan Rauch writes that the internet is Sturgeon squared: This is the blogosphere. I’m not getting paid to be here. I’m here to get incredibly famous (in my case, even more incredibly famous) so that I can get paid somewhere else. . . . The average quality of newspapers and (published) novels is far, far better than the average quality of blog posts (and–ugh!–comments). This is because people pay for newspapers and novels. What distinguishes newspapers and novels is how much does not get published in them, because people won’t pay for it. Payment is a filter, and a pretty good one. Imperfect, of course. But pointing out the defects of the old model is merely changing the subject if the new model is worse. . . . Yes, the new model is bringing a lot of new content into being. But most of it is bad. And it’s displacing a lot of better content, by destroying the business model for quality. Even in the information economy, there’s no free lunch. . . . Yes, there’s g
5 0.76601261 49 andrew gelman stats-2010-05-24-Blogging
Introduction: Rajiv Sethi quotes Bentley University economics professor Scott Sumner writing on the first anniversary of his blog: Be careful what you wish for. Last February 2nd I [Sumner] started this blog with very low expectations… I knew I wasn’t a good writer . . . And I was also pretty sure that the content was not of much interest to anyone. Now my biggest problem is time–I spend 6 to 10 hours a day on the blog, seven days a week. Several hours are spent responding to reader comments and the rest is spent writing long-winded posts and checking other economics blogs. . . . I [Sumner] don’t think much of the official methodology in macroeconomics. Many of my fellow economists seem to have a Popperian view of the social sciences. You develop a model. You go out and get some data. And then you try to refute the model with some sort of regression analysis. . . . My problem with this view is that it doesn’t reflect the way macro and finance actually work. Instead the models are
6 0.75875127 104 andrew gelman stats-2010-06-22-Seeking balance
7 0.74474269 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others
8 0.74423927 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet
9 0.7384817 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?
10 0.73691285 458 andrew gelman stats-2010-12-08-Blogging: Is it “fair use”?
11 0.7316848 2225 andrew gelman stats-2014-02-26-A good comment on one of my papers
12 0.73035377 2088 andrew gelman stats-2013-11-04-Recently in the sister blog
13 0.72277898 2080 andrew gelman stats-2013-10-28-Writing for free
14 0.72268772 637 andrew gelman stats-2011-03-29-Unfinished business
15 0.72195256 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
16 0.71541232 1463 andrew gelman stats-2012-08-19-It is difficult to convey intonation in typed speech
17 0.71403509 186 andrew gelman stats-2010-08-04-“To find out what happens when you change something, it is necessary to change it.”
18 0.70583194 2126 andrew gelman stats-2013-12-07-If I could’ve done it all over again
19 0.70297962 1225 andrew gelman stats-2012-03-22-Procrastination as a positive productivity strategy
20 0.69472957 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?
topicId topicWeight
[(2, 0.322), (7, 0.018), (15, 0.018), (16, 0.05), (17, 0.023), (21, 0.017), (24, 0.096), (27, 0.029), (45, 0.016), (53, 0.028), (99, 0.276)]
simIndex simValue blogId blogTitle
1 0.97427118 97 andrew gelman stats-2010-06-18-Economic Disparities and Life Satisfaction in European Regions
Introduction: Grazia Pittau, Roberto Zelli, and I came out with a paper investigating the role of economic variables in predicting regional disparities in reported life satisfaction of European Union citizens. We use multilevel modeling to explicitly account for the hierarchical nature of our data, respondents within regions and countries, and for understanding patterns of variation within and between regions. Here’s what we found: - Personal income matters more in poor regions than in rich regions, a pattern that still holds for regions within the same country. - Being unemployed is negatively associated with life satisfaction even after controlled for income variation. Living in high unemployment regions does not alleviate the unhappiness of being out of work. - After controlling for individual characteristics and modeling interactions, regional differences in life satisfaction still remain. Here’s a quick graph; there’s more in the article:
2 0.96730506 663 andrew gelman stats-2011-04-15-Happy tax day!
Introduction: Your taxes pay for the research funding that supports the work we do here, some of which appears on this blog and almost all of which is public, free, and open-source. So, to all of the taxpayers out there in the audience: thank you.
3 0.95977634 489 andrew gelman stats-2010-12-28-Brow inflation
Introduction: In an article headlined, “Hollywood moves away from middlebrow,” Brooks Barnes writes: As Hollywood plowed into 2010, there was plenty of clinging to the tried and true: humdrum remakes like “The Wolfman” and “The A-Team”; star vehicles like “Killers” with Ashton Kutcher and “The Tourist” with Angelina Jolie and Johnny Depp; and shoddy sequels like “Sex and the City 2.” All arrived at theaters with marketing thunder intended to fill multiplexes on opening weekend, no matter the quality of the film. . . . But the audience pushed back. One by one, these expensive yet middle-of-the-road pictures delivered disappointing results or flat-out flopped. Meanwhile, gambles on original concepts paid off. “Inception,” a complicated thriller about dream invaders, racked up more than $825 million in global ticket sales; “The Social Network” has so far delivered $192 million, a stellar result for a highbrow drama. . . . the message that the year sent about quality and originality is real enoug
4 0.95405865 17 andrew gelman stats-2010-05-05-Taking philosophical arguments literally
Introduction: Aaron Swartz writes the following, as a lead-in to an argument in favor of vegetarianism: Imagine you were an early settler of what is now the United States. It seems likely you would have killed native Americans. After all, your parents killed them, your siblings killed them, your friends killed them, the leaders of the community killed them, the President killed them. Chances are, you would have killed them too . . . Or if you see nothing wrong with killing native Americans, take the example of slavery. Again, everyone had slaves and probably didn’t think too much about the morality of it. . . . Are these statements true, though? It’s hard for me to believe that most early settlers (from the context, it looks like Swartz is discussing the 1500s-1700s here) killed native Americans. That is, if N is the number of early settlers, and Y is the number of these settlers who killed at least one Indian, I suspect Y/N is much closer to 0 than to 1. Similarly, it’s not even cl
Introduction: Writing in the Washington Post, Matt Miller wants a billionaire to run for president and “save the country.” We already have two billionaires running for president. (OK, not really. Romney has a mere quarter of a billion bucks, and it’s Huntsman’s dad, not Huntsman himself, who’s the billionaire in that family.) And, according to all reports, NYC mayor Bloomberg would run for president in an instant if he thought he’d have a chance of winning. So we should amend Miller’s article to say that he wants a billionaire presidential candidate who (a) shares the political views of a “senior fellow at the Center for American Progress and co-host of public radio’s ‘Left, Right, and Center’” and (b) has a chance of winning. That shouldn’t be too hard to find, right? Hey, I have an idea! Miller writes that Thomas Friedman just wrote a book arguing that “the right independent candidacy could provide for our dysfunctional politics presents an unrivaled opportunity.” Friedman’s actu
6 0.93174052 549 andrew gelman stats-2011-02-01-“Roughly 90% of the increase in . . .” Hey, wait a minute!
7 0.9273659 1017 andrew gelman stats-2011-11-18-Lack of complete overlap
8 0.9271369 44 andrew gelman stats-2010-05-20-Boris was right
9 0.92600977 1189 andrew gelman stats-2012-02-28-Those darn physicists
same-blog 10 0.92027164 1508 andrew gelman stats-2012-09-23-Speaking frankly
11 0.90928566 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder
12 0.90731549 1663 andrew gelman stats-2013-01-09-The effects of fiscal consolidation
13 0.90522623 1567 andrew gelman stats-2012-11-07-Election reports
14 0.89483619 1260 andrew gelman stats-2012-04-11-Hunger Games survival analysis
15 0.89180255 1872 andrew gelman stats-2013-05-27-More spam!
16 0.88409221 1893 andrew gelman stats-2013-06-11-Folic acid and autism
17 0.8818959 1102 andrew gelman stats-2012-01-06-Bayesian Anova found useful in ecology
18 0.86644948 1954 andrew gelman stats-2013-07-24-Too Good To Be True: The Scientific Mass Production of Spurious Statistical Significance
19 0.83425093 2360 andrew gelman stats-2014-06-05-Identifying pathways for managing multiple disturbances to limit plant invasions
20 0.83325243 1254 andrew gelman stats-2012-04-09-In the future, everyone will publish everything.