1084 andrew gelman stats-2011-12-26-Tweeting the Hits?


meta information for this blog

Source: html

Introduction: Someone sent me an email saying that he liked my little essay, “Descriptive statistics aren’t just for losers.” I had no idea what he was talking about, but it sounded like the kind of thing I’d say, so I searched the blog and found this post, which indeed I really like! I thanked my correspondent for reminding me of this little article I’d forgotten, and he told me he just learned of it via someone’s tweet. This made me think: Maybe I should have a Twitter feed of nothing but old blog entries. I could just go back to 2004 and then go gradually forward, tweeting the items that I judge to remain of interest. Does this make sense? Or is there a better way to do this? Alternatively, I could do it as a separate blog, but that seems a bit . . . recursive.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Someone sent me an email saying that he liked my little essay, “Descriptive statistics aren’t just for losers. [sent-1, score-0.674]

2 ” I had no idea what he was talking about, but it sounded like the kind of thing I’d say, so I searched the blog and found this post , which indeed I really like! [sent-2, score-1.21]

3 I thanked my correspondent for reminding me of this little article I’d forgotten, and he told me he just learned of it via someone’s tweet. [sent-3, score-1.276]

4 This made me think: Maybe I should have a twitter feed of nothing but old blog entries. [sent-4, score-0.834]

5 I could just go back to 2004 and then go gradually forward, tweeting the items that I judge to remain of interest. [sent-5, score-1.031]

6 Alternatively, I could do it as a separate blog, but that seems a bit . [sent-8, score-0.335]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('thanked', 0.275), ('recursive', 0.259), ('reminding', 0.248), ('alternatively', 0.212), ('searched', 0.205), ('feed', 0.202), ('sounded', 0.199), ('essay', 0.187), ('twitter', 0.187), ('forgotten', 0.185), ('blog', 0.183), ('gradually', 0.182), ('correspondent', 0.18), ('descriptive', 0.167), ('little', 0.161), ('judge', 0.159), ('someone', 0.155), ('liked', 0.152), ('items', 0.149), ('remain', 0.14), ('separate', 0.132), ('forward', 0.127), ('learned', 0.127), ('go', 0.124), ('via', 0.121), ('aren', 0.114), ('email', 0.113), ('told', 0.111), ('sent', 0.107), ('old', 0.104), ('kind', 0.103), ('talking', 0.101), ('indeed', 0.089), ('nothing', 0.088), ('saying', 0.084), ('could', 0.08), ('found', 0.075), ('back', 0.073), ('post', 0.071), ('made', 0.07), ('bit', 0.067), ('sense', 0.065), ('thing', 0.062), ('idea', 0.062), ('maybe', 0.06), ('better', 0.06), ('like', 0.06), ('statistics', 0.057), ('seems', 0.056), ('article', 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?

2 0.1029486 91 andrew gelman stats-2010-06-16-RSS mess

Introduction: Apparently some of our new blog entries are appearing as old entries on the RSS feed, meaning that those of you who read the blog using RSS may be missing a lot of good stuff. We’re working on this. But, in the meantime, I recommend you click on the blog itself to see what’s been posted in the last few weeks. Enjoy.

3 0.098683327 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?

Introduction: The second best thing about this story (from Tom Scocca) is that Anderson spells “Tweets” with a capital T. But the best thing is that Scocca is numerate—he compares numbers on the logarithmic scale: Reminding Lake that he only had 169 Twitter followers was the saddest gambit of all. Jon Lee Anderson has 17,866 followers. And Kim Kardashian has, as I write this, 17,489,892 followers. That is: Jon Lee Anderson is 1/1,000 as important on Twitter, by his own standard, as Kim Kardashian. He is 10 times closer to Mitch Lake than he is to Kim Kardashian. How often do we see a popular journalist who understands orders of magnitude? Good job, Tom Scocca! P.S. Based on his “little twerp” comment, I also wonder if Anderson suffers from tall person syndrome—that’s the problem that some people of above-average height have, that they think they’re more important than other people because they literally look down on them. Don’t get me wrong—I have lots of tall friends who are complete

4 0.092756256 1394 andrew gelman stats-2012-06-27-99!

Introduction: Those of you who know what I’m talking about, know what I’m talking about.

5 0.092256702 532 andrew gelman stats-2011-01-23-My Wall Street Journal story

Introduction: I was talking with someone the other day about the book by that Yale law professor who called her kids “garbage” and didn’t let them go to the bathroom when they were studying piano . . . apparently it wasn’t so bad as all that, she was misrepresented by the Wall Street Journal excerpt: “I was very surprised,” she says. “The Journal basically strung together the most controversial sections of the book. And I had no idea they’d put that kind of a title on it. . . . “And while it’s ultimately my responsibility — my strict Chinese mom told me ‘never blame other people for your problems!’ — the one-sided nature of the excerpt has really led to some major misconceptions about what the book says, and about what I really believe.” I don’t completely follow her reasoning here: just because, many years ago, her mother told her a slogan about not blaming other people, therefore she can say, “it’s ultimately my responsibility”? You can see the illogic of this by flipping it around. Wha

6 0.091216758 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?

7 0.088868335 429 andrew gelman stats-2010-11-24-“But you and I don’t learn in isolation either”

8 0.088868335 887 andrew gelman stats-2011-09-02-“It’s like marveling over a plastic flower when there’s a huge garden blooming outside”

9 0.087542228 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

10 0.085044727 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .

11 0.084227651 1787 andrew gelman stats-2013-04-04-Wanna be the next Tyler Cowen? It’s not as easy as you might think!

12 0.082841955 2044 andrew gelman stats-2013-09-30-Query from a textbook author – looking for stories to tell to undergrads about significance

13 0.082411736 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again

14 0.081717037 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

15 0.079860166 2229 andrew gelman stats-2014-02-28-God-leaf-tree

16 0.079530254 2111 andrew gelman stats-2013-11-23-Tables > figures yet again

17 0.078807019 503 andrew gelman stats-2011-01-04-Clarity on my email policy

18 0.077726126 27 andrew gelman stats-2010-05-11-Update on the spam email study

19 0.075966924 2187 andrew gelman stats-2014-01-26-Twitter sucks, and people are gullible as f…

20 0.075040393 390 andrew gelman stats-2010-11-02-Fragment of statistical autobiography
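The simValue column above is presumably a cosine similarity between tfidf vectors of the posts; the page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function names (tfidf_vectors, cosine), the whitespace tokenisation, and the smoothed idf formula are all illustrative choices, not the method used to build this listing.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {word: weight} dict per document.

    Assumption: lower-cased whitespace tokens and a smoothed idf,
    log((1 + N) / (1 + df)) + 1. The original pipeline is unknown.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenised for word in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already L2-normalised sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


docs = [
    "old blog entries on a twitter feed",
    "new blog entries appearing as old entries on the rss feed",
    "traffic congestion pricing on interstate highways",
]
vecs = tfidf_vectors(docs)
ranked = sorted(range(len(docs)), key=lambda i: cosine(vecs[0], vecs[i]), reverse=True)
```

Here ranked lists document indices from most to least similar to the first document, mirroring the simIndex ordering above. A post compared with itself scores approximately 1 rather than exactly 1 because of floating-point rounding, which may be why the same-blog row shows a value such as 1.0000001.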


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.137), (1, -0.066), (2, -0.052), (3, 0.029), (4, 0.014), (5, -0.024), (6, 0.07), (7, -0.006), (8, 0.046), (9, -0.025), (10, 0.01), (11, 0.006), (12, 0.049), (13, 0.01), (14, -0.024), (15, 0.042), (16, -0.039), (17, -0.022), (18, -0.027), (19, 0.022), (20, 0.034), (21, -0.034), (22, -0.032), (23, 0.004), (24, -0.005), (25, 0.006), (26, -0.026), (27, -0.006), (28, -0.022), (29, 0.029), (30, 0.027), (31, 0.025), (32, -0.036), (33, 0.028), (34, 0.016), (35, -0.012), (36, 0.031), (37, -0.013), (38, -0.02), (39, 0.008), (40, -0.016), (41, -0.019), (42, 0.029), (43, -0.014), (44, -0.021), (45, -0.024), (46, -0.037), (47, 0.003), (48, -0.021), (49, -0.057)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97742897 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?

2 0.82454491 458 andrew gelman stats-2010-12-08-Blogging: Is it “fair use”?

Introduction: Dave Kane writes: I [Kane] am involved in a dispute relating to whether or not a blog can be considered part of one’s academic writing. Williams College restricts the use of undergraduate theses as follows: Non-commercial, academic use within the scope of “Fair Use” standards is acceptable. Otherwise, you may not copy or distribute any content without the permission of the copyright holder. Seems obvious enough. Yet some folks think that my use of thesis material in a blog post fails this test because it is not “academic.” See this post for the gory details. Parenthetically, your readers might be interested in the substantive discovery here, the details of the Williams admissions process (which is probably very similar to Columbia’s). Williams places students into academic rating (AR) categories as follows: verbal math composite SAT II ACT AP AR 1: 770-800 750-800 1520-1600 750-800 35-36 mostly 5s AR 2: 730-770 720-750 1450-1520 720-770 33-34 4s an

3 0.81163388 868 andrew gelman stats-2011-08-24-Blogs vs. real journalism

Introduction: I was thinking a bit more about Jonathan Rauch’s lament about the fading of the buggy-whip industry print journalism, in which he mocks bloggers, analogizes blogging to scribbling with spray paint on the side of a building, and writes that the blogosphere is “the single worst medium for sustained, and therefore grown-up, reading and writing and argumentation ever invented.” Yup. Worse than talk radio. Worse than cave painting. Worse than smoke signals, rock ‘n’ roll lyrics, woodcuts, spray-paint graffiti, and every other medium of communication ever invented. OK, he didn’t really mean it. Rauch actually has an ironclad argument here. He’s claiming, in a blog, that blogging is crap. Therefore, if he fills his blog with unsupported exaggerations, that’s fine, as he’s demonstrating that blogging is . . . crap. Not to pile on, but, hey, why not? I was curious what Rauch has blogged on lately, so I googled Jonathan Rauch blog and ended up at this site , which most recently

4 0.80777407 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .

Introduction: . . . sounded exactly like a David Mamet character. I mean, exactly. Or like Eric Bogosian doing a David Mamet character. I only wish I had a good ear for dialogue and could get it down for you. OK, we don’t use the word fuck on this blog but I could substitute something like f*** and you’d get the point. He was on his cell phone and seemed to be talking with his wife or girlfriend, explaining why they should get back together. It was a bit of a cross between Alec Baldwin and Jack Lemmon.

5 0.80666405 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet

Introduction: I made the mistake of googling myself (I know, I know . . .) and came across a couple of rude bloggers criticizing something I’d written. I don’t mind criticism, and lord knows I can be a rude blogger myself at times, but these criticisms were really bad, a mix of already-refuted arguments and new claims that were just flat-out ridiculous. Really bad stuff. I then spent about an hour, on and off, writing a long long post explaining why they were wrong and how they could make their arguments better. But then, before I hit Send, I realized it would be a mistake to post my response. Getting into a fight with these people whom I’d never heard of before . . . what’s the point? If they want to comment on my blog, I will respond (within reason), or if they are well known researchers or journalists, it’s perhaps worth correcting them. Or if they made an interesting argument, sure. But there’s no point in scouring the web looking for bad arguments to refute. That way lies madness. I w

6 0.79327965 220 andrew gelman stats-2010-08-20-Why I blog?

7 0.79267752 1508 andrew gelman stats-2012-09-23-Speaking frankly

8 0.78618574 2036 andrew gelman stats-2013-09-24-“Instead of the intended message that being poor is hard, the takeaway is that rich people aren’t very good with money.”

9 0.77773058 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve

10 0.77398223 727 andrew gelman stats-2011-05-23-My new writing strategy

11 0.77186179 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves

12 0.77175802 1964 andrew gelman stats-2013-08-01-Non-topical blogging

13 0.7664901 104 andrew gelman stats-2010-06-22-Seeking balance

14 0.76536399 1065 andrew gelman stats-2011-12-17-Read this blog on Google Currents

15 0.76093572 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?

16 0.75507224 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others

17 0.75481862 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?

18 0.74725002 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

19 0.74442315 2306 andrew gelman stats-2014-04-26-Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu

20 0.74405628 49 andrew gelman stats-2010-05-24-Blogging


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.034), (16, 0.039), (24, 0.235), (29, 0.028), (53, 0.028), (69, 0.026), (76, 0.247), (77, 0.064), (86, 0.017), (99, 0.165)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.87434751 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?

2 0.84207273 1551 andrew gelman stats-2012-10-28-A convenience sample and selected treatments

Introduction: Charlie Saunders writes: A study has recently been published in the New England Journal of Medicine (NEJM) which uses survival analysis to examine long-acting reversible contraception (e.g. intrauterine devices [IUDs]) vs. short-term commonly prescribed methods of contraception (e.g. oral contraceptive pills) on unintended pregnancies. The authors use a convenience sample of over 7,000 women. I am not well versed-enough in sampling theory to determine the appropriateness of this but it would seem that the use of a non-probability sampling would be a significant drawback. If you could give me your opinion on this, I would appreciate it. The NEJM is one of the top medical journals in the country. Could this type of sampling method coupled with this method of analysis be published in a journal like JASA? My reply: There are two concerns, first that it is a convenience sample and thus not representative of the population, and second that the treatments are chosen rather tha

3 0.83393776 988 andrew gelman stats-2011-11-02-Roads, traffic, and the importance in decision analysis of carefully examining your goals

Introduction: Sandeep Baliga writes : [In a recent study , Gilles Duranton and Matthew Turner write:] For interstate highways in metropolitan areas we [Duranton and Turner] find that VKT (vehicle kilometers traveled) increases one for one with interstate highways, confirming the fundamental law of highway congestion.’ Provision of public transit also simply leads to the people taking public transport being replaced by drivers on the road. Therefore: These findings suggest that both road capacity expansions and extensions to public transit are not appropriate policies with which to combat traffic congestion. This leaves congestion pricing as the main candidate tool to curb traffic congestion. To which I reply: Sure, if your goal is to curb traffic congestion . But what sort of goal is that? Thinking like a microeconomist, my policy goal is to increase people’s utility. Sure, traffic congestion is annoying, but there must be some advantages to driving on that crowded road or pe

4 0.80812573 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

Introduction: Jerzy Wieczorek has an interesting review of the book Graph Design for the Eye and Mind by psychology researcher Stephen Kosslyn. I recommend you read all of Wieczorek’s review (and maybe Kosslyn’s book, but that I haven’t seen), but here I’ll just focus on one point. Here’s Wieczorek summarizing Kosslyn: p. 18-19: the horizontal axis should be for the variable with the “most important part of the data.” See Kosslyn’s Figure 1.6 and 1.7 below. Figure 1.6 clearly shows that one of the sex-by-income groups reacts to age differently than the other three groups do. Figure 1.7 uses sex as the x-axis variable, making it much harder to see this same effect in the data. As a statistician exploring the data, I might make several plots using different groupings… but for communicating my results to an audience, I would choose the one plot that shows the findings most clearly. Those who know me well (or who have read the title of this post) will guess my reaction, whic

5 0.79486799 300 andrew gelman stats-2010-09-28-A calibrated Cook gives Dems the edge in Nov, sez Sandy

Introduction: Sandy Gordon sends along this fun little paper forecasting the 2010 midterm election using expert predictions (the Cook and Rothenberg Political Reports). Gordon’s gimmick is that he uses past performance to calibrate the reports’ judgments based on “solid,” “likely,” “leaning,” and “toss-up” categories, and then he uses the calibrated versions of the current predictions to make his forecast. As I wrote a few weeks ago in response to Nate’s forecasts, I think the right way to go, if you really want to forecast the election outcome, is to use national information to predict the national swing and then do regional, state, and district-level adjustments using whatever local information is available. I don’t see the point of using only the expert forecasts and no other data. Still, Gordon is bringing new information (his calibrations) to the table, so I wanted to share it with you. Ultimately I like the throw-in-everything approach that Nate uses (although I think Nate’s descr

6 0.78273308 1351 andrew gelman stats-2012-05-29-A Ph.D. thesis is not really a marathon

7 0.75614262 1818 andrew gelman stats-2013-04-22-Goal: Rules for Turing chess

8 0.75611991 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

9 0.7553314 1810 andrew gelman stats-2013-04-17-Subway series

10 0.74868488 668 andrew gelman stats-2011-04-19-The free cup and the extra dollar: A speculation in philosophy

11 0.74721718 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

12 0.74614727 2023 andrew gelman stats-2013-09-14-On blogging

13 0.74387461 1875 andrew gelman stats-2013-05-28-Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from

14 0.74287844 337 andrew gelman stats-2010-10-12-Election symposium at Columbia Journalism School

15 0.74254787 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

16 0.74182773 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

17 0.74098581 2247 andrew gelman stats-2014-03-14-The maximal information coefficient

18 0.74056 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

19 0.7401787 1850 andrew gelman stats-2013-05-10-The recursion of pop-econ

20 0.73933935 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors