andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-104 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I’m trying to temporarily kick the blogging habit as I seem to be addicted. I’m currently on a binge and my plan is to schedule a bunch of already-written entries at one per weekday and not blog anything new for a while. Yesterday I fell off the wagon and posted 4 items, but maybe now I can show some restraint. P.S. In keeping with the spirit of this blog, I scheduled it to appear on 13 May, even though I wrote it on 15 Apr. Just about everything you’ve been reading on this blog for the past several weeks (and lots of forthcoming items) was written a month ago. The only exceptions are whatever my cobloggers have been posting and various items that were timely enough that I inserted them in the queue afterward. P.P.S. I bumped it up to 22 Jun because, as of 14 Apr, I was continuing to write new entries. I hope to slow down soon! P.P.P.S. (20 June) I was going to bump it up again–the horizon’s now in mid-July–but I thought, enough is enough! Right now I think that about ha
sentIndex sentText sentNum sentScore
1 I’m trying to temporarily kick the blogging habit as I seem to be addicted. [sent-1, score-0.507]
2 I’m currently on a binge and my plan is to schedule a bunch of already-written entries at one per weekday and not blog anything new for a while. [sent-2, score-0.858]
3 Yesterday I fell off the wagon and posted 4 items, but maybe now I can show some restraint. [sent-3, score-0.388]
4 In keeping with the spirit of this blog, I scheduled it to appear on 13 May, even though I wrote it on 15 Apr. [sent-6, score-0.504]
5 Just about everything you’ve been reading on this blog for the past several weeks (and lots of forthcoming items) were written a month ago. [sent-7, score-0.431]
6 The only exceptions are whatever my cobloggers have been posting and various items that were timely enough that I inserted them in the queue afterward. [sent-8, score-1.37]
7 P.P.S. I bumped it up to 22 Jun because, as of 14 Apr, I was continuing to write new entries. [sent-11, score-0.384]
8 (20 June) I was going to bump it up again–the horizon’s now in mid-July–but I thought, enough is enough! [sent-17, score-0.292]
9 Right now I think that about half of my posts are topical, appearing within a couple days of posting–I often write them in the evening but I like to have them appear between 9 and 10am, eastern time–and half are on a longer delay. [sent-18, score-1.239]
wordName wordTfidf (topN-words)
[('items', 0.295), ('posting', 0.191), ('binge', 0.181), ('bumped', 0.181), ('wagon', 0.181), ('horizon', 0.171), ('timely', 0.171), ('jun', 0.163), ('half', 0.162), ('bump', 0.158), ('cobloggers', 0.153), ('eastern', 0.153), ('inserted', 0.153), ('temporarily', 0.153), ('topical', 0.153), ('appear', 0.152), ('queue', 0.149), ('apr', 0.146), ('kick', 0.146), ('evening', 0.14), ('schedule', 0.135), ('enough', 0.134), ('delay', 0.133), ('scheduled', 0.133), ('forthcoming', 0.131), ('fell', 0.128), ('exceptions', 0.124), ('june', 0.122), ('blog', 0.12), ('appearing', 0.119), ('habit', 0.115), ('spirit', 0.115), ('slow', 0.108), ('entries', 0.105), ('keeping', 0.104), ('continuing', 0.104), ('write', 0.099), ('soon', 0.098), ('weeks', 0.093), ('blogging', 0.093), ('posts', 0.088), ('yesterday', 0.088), ('month', 0.087), ('plan', 0.087), ('longer', 0.085), ('currently', 0.084), ('posted', 0.079), ('days', 0.079), ('per', 0.074), ('bunch', 0.072)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 104 andrew gelman stats-2010-06-22-Seeking balance
2 0.16518588 1964 andrew gelman stats-2013-08-01-Non-topical blogging
Introduction: On a day with four blog posts (and followed by a day with two more), econblogger Mark Thoma wrote : Every once in awhile I [Thoma] kind of need a bit of a break . . . I ran out of energy a few weeks ago . . . I’ll do my best until then, daily links at least somehow and short “echo” posts as usual, but I doubt I’ll have time to say much myself . . . [There's a reason I haven't missed a day posting to the blog in over eight years. When I first started, I was afraid that if I missed a day new readers would bail out . . . I realize a missed day won't kill the blog at this point, but it's still important to me to keep posting every day.] What I do is post once a day; when I write new posts, I schedule them for the future. I currently have approx 2-month lag. Sometimes I post 2 or 3 times in one day, if I have something topical or just something I feel like posting on. Overall, though, I find a benefit to the lag. Posts that are less topical (not tied to the news or to a current o
3 0.13618229 771 andrew gelman stats-2011-06-16-30 days of statistics
Introduction: I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. She suggested that for the next month I just blog about my research ideas. So I think I’ll do that. This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc., for 30 days. Meanwhile, I have a roughly 30-day backlog. So after my next 30 days of stat blogging, the backlog will gradually appear. There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! I’m sure you’ll enjoy it. But you’ll have to wait for all that fun stuff. For the next thirty days, it’s statistics research every day. P.S. I
4 0.12119002 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
Introduction: I post (approximately) once a day and don’t plan to change that. I have enough material to post more often—for example, I could intersperse existing blog posts with summaries of my published papers or of other work that I like; and, beyond this, we currently have a one-to-two-month backlog of posts—but I’m afraid that if the number of posts were doubled, the attention given to each would be roughly halved. Looking at it the other way, I certainly don’t want to reduce my level of posting. Sure, it takes time to blog, but these are things that are important for me to say. If I were to blog less frequently, it would only be because I was pouring all these words into a different vessel, for example a book. For now, though, I think it makes sense to blog and then collect the words later as appropriate. With blogging I get comments, and many of these comments are helpful—either directly (by pointing out errors in my thinking or linking to relevant software or literature) or indirec
5 0.11848311 790 andrew gelman stats-2011-07-08-Blog in motion
Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)
6 0.1171947 259 andrew gelman stats-2010-09-06-Inbox zero. Really.
7 0.11263222 91 andrew gelman stats-2010-06-16-RSS mess
8 0.10808444 1567 andrew gelman stats-2012-11-07-Election reports
10 0.10395022 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys
11 0.10049082 2206 andrew gelman stats-2014-02-10-On deck this week
12 0.087379277 872 andrew gelman stats-2011-08-26-Blog on applied probability modeling
14 0.079774462 413 andrew gelman stats-2010-11-14-Statistics of food consumption
15 0.079348899 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath
16 0.0793081 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?
17 0.078835852 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box
18 0.078311138 1225 andrew gelman stats-2012-03-22-Procrastination as a positive productivity strategy
19 0.078245468 2096 andrew gelman stats-2013-11-10-Schiminovich is on The Simpsons
20 0.075478524 2265 andrew gelman stats-2014-03-24-On deck this week
topicId topicWeight
[(0, 0.108), (1, -0.063), (2, -0.029), (3, 0.023), (4, 0.023), (5, 0.016), (6, 0.053), (7, -0.033), (8, 0.025), (9, -0.066), (10, 0.036), (11, 0.013), (12, 0.057), (13, 0.047), (14, -0.024), (15, 0.04), (16, -0.023), (17, -0.018), (18, -0.045), (19, 0.061), (20, 0.039), (21, -0.008), (22, -0.066), (23, 0.031), (24, 0.023), (25, 0.016), (26, -0.007), (27, -0.002), (28, 0.011), (29, -0.009), (30, 0.003), (31, -0.024), (32, -0.017), (33, -0.011), (34, 0.021), (35, -0.023), (36, 0.01), (37, 0.027), (38, 0.019), (39, -0.051), (40, -0.045), (41, 0.011), (42, -0.027), (43, -0.016), (44, 0.017), (45, -0.006), (46, -0.054), (47, -0.039), (48, -0.036), (49, -0.008)]
simIndex simValue blogId blogTitle
same-blog 1 0.97714674 104 andrew gelman stats-2010-06-22-Seeking balance
2 0.879363 1964 andrew gelman stats-2013-08-01-Non-topical blogging
Introduction: On a day with four blog posts (and followed by a day with two more), econblogger Mark Thoma wrote : Every once in awhile I [Thoma] kind of need a bit of a break . . . I ran out of energy a few weeks ago . . . I’ll do my best until then, daily links at least somehow and short “echo” posts as usual, but I doubt I’ll have time to say much myself . . . [There's a reason I haven't missed a day posting to the blog in over eight years. When I first started, I was afraid that if I missed a day new readers would bail out . . . I realize a missed day won't kill the blog at this point, but it's still important to me to keep posting every day.] What I do is post once a day; when I write new posts, I schedule them for the future. I currently have approx 2-month lag. Sometimes I post 2 or 3 times in one day, if I have something topical or just something I feel like posting on. Overall, though, I find a benefit to the lag. Posts that are less topical (not tied to the news or to a current o
3 0.82308829 790 andrew gelman stats-2011-07-08-Blog in motion
Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)
4 0.81740355 771 andrew gelman stats-2011-06-16-30 days of statistics
Introduction: I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. She suggested that for the next month I just blog about my research ideas. So I think I’ll do that. This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc., for 30 days. Meanwhile, I have a roughly 30-day backlog. So after my next 30 days of stat blogging, the backlog will gradually appear. There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! I’m sure you’ll enjoy it. But you’ll have to wait for all that fun stuff. For the next thirty days, it’s statistics research every day. P.S. I
5 0.80182701 2085 andrew gelman stats-2013-11-02-I’ve already written next year’s April Fools post!
Introduction: Good to have gotten that one out of the way already. (Actually, I wrote it a few months ago. This post is itself in the monthlong+ queue.) I don’t know how easy it is to search this blog by date to find the Fools posts from previous years.
6 0.79666531 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
7 0.77504772 2088 andrew gelman stats-2013-11-04-Recently in the sister blog
8 0.77468896 220 andrew gelman stats-2010-08-20-Why I blog?
9 0.75276065 856 andrew gelman stats-2011-08-16-Our new improved blog! Thanks to Cord Blomquist
10 0.74594295 91 andrew gelman stats-2010-06-16-RSS mess
11 0.7240172 727 andrew gelman stats-2011-05-23-My new writing strategy
12 0.71891212 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others
13 0.71352792 2329 andrew gelman stats-2014-05-11-“What should you talk about?”
14 0.71079516 1508 andrew gelman stats-2012-09-23-Speaking frankly
15 0.70667964 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet
16 0.70307148 2126 andrew gelman stats-2013-12-07-If I could’ve done it all over again
17 0.69497144 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!
18 0.68790257 1905 andrew gelman stats-2013-06-18-There are no fat sprinters
19 0.68648404 2002 andrew gelman stats-2013-08-30-Blogging
20 0.68537188 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?
topicId topicWeight
[(2, 0.018), (15, 0.058), (16, 0.064), (24, 0.094), (49, 0.016), (52, 0.243), (53, 0.034), (59, 0.045), (63, 0.015), (85, 0.019), (86, 0.03), (99, 0.264)]
simIndex simValue blogId blogTitle
1 0.91731077 914 andrew gelman stats-2011-09-16-meta-infographic
Introduction: “Most Popular Infographics you can find around the web” by designer and illustrator Alberto Antoniazzi.
2 0.91444892 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions
Introduction: Jim Thomson writes: I wonder if you could provide some clarification on the correct way to calculate the finite-population standard deviations for interaction terms in your Bayesian approach to ANOVA (as explained in your 2005 paper, and Gelman and Hill 2007). I understand that it is the SD of the constrained batch coefficients that is of interest, but in most WinBUGS examples I have seen, the SDs are all calculated directly as sd.fin<-sd(beta.main[]) for main effects and sd(beta.int[,]) for interaction effects, where beta.main and beta.int are the unconstrained coefficients, e.g. beta.int[i,j]~dnorm(0,tau). For main effects, I can see that it makes no difference, since the constrained value is calculated by subtracting the mean, and sd(B[]) = sd(B[]-mean(B[])). But the conventional sum-to-zero constraint for interaction terms in linear models is more complicated than subtracting the mean (there are only (n1-1)*(n2-1) free coefficients for an interaction b/w factors with n1 a
3 0.90306234 1301 andrew gelman stats-2012-05-05-Related to z-statistics
Introduction: Pawel Sobkowicz writes: ‘How many zombies do you know?’ Using indirect survey methods to measure alien attacks and outbreaks of the undead, Arxiv preprint arXiv:1003.6087, 2010. I hope you would find interesting the following paper, recently posted on arXiv: Aliens on Earth. Are reports of close encounters correct?, arXiv:1203.6805. This is soooooo much better than getting links to bad graphs or to papers on sex ratios!
4 0.89990127 223 andrew gelman stats-2010-08-21-Statoverflow
Introduction: Skirant Vadali writes: I am writing to seek your help in building a community driven Q&A website tentatively called ‘Statistics Analysis’. I am neither a founder of this website nor do I have any financial stake in its success. By way of background to this website, please see Stackoverflow (http://stackoverflow.com/) and Mathoverflow (http://mathoverflow.net/). Stackoverflow is a Q&A website targeted at software developers and is designed to help them ask questions and get answers from other developers. Mathoverflow is a Q&A website targeted at research mathematicians and is designed to help them ask and answer questions from other mathematicians across the world. The success of both these sites in helping their respective communities is a strong indicator that sites designed along these lines are very useful. The company that runs Stackoverflow (who also host Mathoverflow.net) has recently decided to develop other community driven websites for various other topic are
Introduction: Mark Palko points me to a news article by Zack Beauchamp on Jason Richwine, the recent Ph.D. graduate from Harvard’s policy school who left the conservative Heritage Foundation after it came out that his Ph.D. thesis was said to be all about the low IQ’s of Hispanic immigrants. Heritage and others apparently thought this association could discredit their anti-immigration-reform position. Richwine’s mentor Charles Murray was unhappy about the whole episode. Beauchamp’s article is worth reading in that it provides some interesting background, in particular by getting into the details of the Ph.D. review process. In a sense, Beauchamp is too harsh. Flawed Ph.D. theses get published all the time. I’d say that most Ph.D. theses I’ve seen are flawed: usually the plan is to get the papers into shape later, when submitting them to journals. If a student doesn’t go into academia, the thesis typically just sits there and is rarely followed up on. I don’t know the statistics o
6 0.8931675 1531 andrew gelman stats-2012-10-12-Elderpedia
same-blog 7 0.89041305 104 andrew gelman stats-2010-06-22-Seeking balance
8 0.88821518 485 andrew gelman stats-2010-12-25-Unlogging
9 0.88722038 1246 andrew gelman stats-2012-04-04-Data visualization panel at the New York Public Library this evening!
10 0.87288564 889 andrew gelman stats-2011-09-04-The acupuncture paradox
11 0.84881735 1020 andrew gelman stats-2011-11-20-No no no no no
12 0.83820689 546 andrew gelman stats-2011-01-31-Infovis vs. statistical graphics: My talk tomorrow (Tues) 1pm at Columbia
14 0.83243001 948 andrew gelman stats-2011-10-10-Combining data from many sources
15 0.82843763 1369 andrew gelman stats-2012-06-06-Your conclusion is only as good as your data
16 0.82638943 786 andrew gelman stats-2011-07-04-Questions about quantum computing
17 0.81860775 1185 andrew gelman stats-2012-02-26-A statistician’s rants and raves
18 0.80777842 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +-- 2
19 0.80446351 2276 andrew gelman stats-2014-03-31-On deck this week
20 0.7916491 1256 andrew gelman stats-2012-04-10-Our data visualization panel at the New York Public Library