771 andrew gelman stats-2011-06-16-30 days of statistics


meta info for this blog

Source: html

Introduction: I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. She suggested that for the next month I just blog about my research ideas. So I think I’ll do that. This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc., for 30 days. Meanwhile, I have a roughly 30-day backlog. So after my next 30 days of stat blogging, the backlog will gradually appear. There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! I’m sure you’ll enjoy it. But you’ll have to wait for all that fun stuff. For the next thirty days, it’s statistics research every day. P.S. I


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. [sent-1, score-0.338]

2 She suggested that for the next month I just blog about my research ideas. [sent-2, score-0.495]

3 This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc. [sent-4, score-0.675]

4 So after my next 30 days of stat blogging, the backlog will gradually appear. [sent-7, score-0.669]

5 There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! [sent-8, score-0.609]

6 But you’ll have to wait for all that fun stuff. [sent-10, score-0.091]

7 For the next thirty days, it’s statistics research every day. [sent-11, score-0.358]

8 If anything comes up that’s too topical to be held for a month, I’ll post it on one of the sister blogs. [sent-14, score-0.388]

9 As always, my cobloggers can feel free to post here whenever they want on whatever they want. [sent-18, score-0.229]

10 We’ll soon be moving the blog to a new site for the blog. [sent-23, score-0.185]

11 I’ll make an exception to the all-statistics-research rule to update you on that when it occurs. [sent-24, score-0.197]

12 To anybody whose comments don’t appear: As noted earlier, we get thousands of spam comments per hour, so (a) some legitimate comments get caught by the spam filter, and (b) it’s impossible for us to look through the spam to see if anything real got stuck there. [sent-30, score-1.956]

13 Try registering as a commenter, that might help. [sent-31, score-0.145]

14 Or maybe things will be better in a few days with the new blog software. [sent-32, score-0.302]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('spam', 0.3), ('reflections', 0.22), ('ll', 0.21), ('days', 0.2), ('blogging', 0.159), ('acupuncture', 0.154), ('milos', 0.154), ('tribute', 0.154), ('next', 0.154), ('month', 0.148), ('comments', 0.147), ('registering', 0.145), ('miscellaneous', 0.139), ('remote', 0.134), ('haters', 0.134), ('cobloggers', 0.13), ('topical', 0.13), ('backlog', 0.121), ('rants', 0.121), ('plagiarists', 0.121), ('mocking', 0.113), ('exception', 0.113), ('sincere', 0.113), ('thirty', 0.113), ('killed', 0.109), ('star', 0.104), ('blog', 0.102), ('filter', 0.102), ('gradually', 0.102), ('updated', 0.101), ('whenever', 0.099), ('anybody', 0.095), ('legitimate', 0.093), ('stat', 0.092), ('wait', 0.091), ('research', 0.091), ('meanwhile', 0.091), ('projects', 0.088), ('anything', 0.087), ('impossible', 0.087), ('hour', 0.087), ('held', 0.086), ('sister', 0.085), ('update', 0.084), ('stuck', 0.084), ('soon', 0.083), ('commenter', 0.083), ('thousands', 0.082), ('roughly', 0.082), ('answers', 0.082)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 771 andrew gelman stats-2011-06-16-30 days of statistics

2 0.277931 619 andrew gelman stats-2011-03-19-If a comment is flagged as spam, it will disappear forever

Introduction: A commenter wrote (by email): I’ve noticed that you’ve quit approving my comments on your blog. I hope I didn’t anger you in some way or write something you felt was inappropriate. My reply: I have not been unapproving any comments. If you have comments that have not appeared, they have probably been going into the spam filter. I get literally thousands of spam comments a day and so anything that hits the spam filter is gone forever. I think there is a way to register as a commenter; that could help.

3 0.24725768 425 andrew gelman stats-2010-11-21-If your comment didn’t get through . . .

Introduction: It probably got caught in the spam filter. We get tons and tons of spam (including the annoying spam that I have to remove by hand). If your comment was accompanied by an ad or a spam link, then maybe I just deleted it.

4 0.23519206 790 andrew gelman stats-2011-07-08-Blog in motion

Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)

5 0.22734751 132 andrew gelman stats-2010-07-07-Note to “Cigarettes”

Introduction: To the person who posted an apparently non-spam comment with a URL link to a “cheap cigarettes” website: In case you’re wondering, no, your comment didn’t get caught by the spam filter–I’m not sure why not, given that URL. I put it in the spam file manually. If you’d like to participate in blog discussion in the future, please refrain from including spam links. Thank you. Also, it’s “John Tukey,” not “John Turkey.”

6 0.21567863 817 andrew gelman stats-2011-07-23-New blog home

7 0.20281769 1488 andrew gelman stats-2012-09-08-Annals of spam

8 0.15239604 523 andrew gelman stats-2011-01-18-Spam is out of control

9 0.14579698 27 andrew gelman stats-2010-05-11-Update on the spam email study

10 0.13618229 104 andrew gelman stats-2010-06-22-Seeking balance

11 0.13486588 839 andrew gelman stats-2011-08-04-To commenters who are trying to sell something

12 0.12977038 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?

13 0.12865505 2206 andrew gelman stats-2014-02-10-On deck this week

14 0.12653241 220 andrew gelman stats-2010-08-20-Why I blog?

15 0.12091461 889 andrew gelman stats-2011-09-04-The acupuncture paradox

16 0.12078263 826 andrew gelman stats-2011-07-27-The Statistics Forum!

17 0.11809352 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!

18 0.11691257 905 andrew gelman stats-2011-09-14-5 books on essentialism!

19 0.11671093 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys

20 0.11395757 545 andrew gelman stats-2011-01-30-New innovations in spam


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.179), (1, -0.086), (2, -0.093), (3, 0.049), (4, 0.052), (5, 0.016), (6, 0.06), (7, -0.083), (8, 0.05), (9, -0.08), (10, 0.054), (11, 0.046), (12, 0.213), (13, 0.065), (14, -0.064), (15, 0.084), (16, -0.065), (17, -0.083), (18, -0.076), (19, 0.113), (20, 0.105), (21, -0.092), (22, -0.082), (23, -0.086), (24, 0.01), (25, -0.023), (26, 0.01), (27, 0.046), (28, -0.016), (29, -0.025), (30, 0.036), (31, -0.001), (32, -0.015), (33, -0.022), (34, 0.016), (35, 0.091), (36, 0.017), (37, 0.074), (38, -0.016), (39, -0.027), (40, -0.119), (41, 0.044), (42, -0.074), (43, 0.005), (44, 0.017), (45, -0.048), (46, 0.012), (47, -0.003), (48, -0.056), (49, 0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96349233 771 andrew gelman stats-2011-06-16-30 days of statistics

2 0.94174379 817 andrew gelman stats-2011-07-23-New blog home

Introduction: Hi all. We’ve moved the blog and are still working out some bugs. For example, we delete spam comments but sometimes they remain on the blog. A few other things. We should be cleaning it up more in the next few days.

3 0.85282093 790 andrew gelman stats-2011-07-08-Blog in motion

Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)

4 0.8525067 619 andrew gelman stats-2011-03-19-If a comment is flagged as spam, it will disappear forever

Introduction: A commenter wrote (by email): I’ve noticed that you’ve quit approving my comments on your blog. I hope I didn’t anger you in some way or write something you felt was inappropriate. My reply: I have not been unapproving any comments. If you have comments that have not appeared, they have probably been going into the spam filter. I get literally thousands of spam comments a day and so anything that hits the spam filter is gone forever. I think there is a way to register as a commenter; that could help.

5 0.85110569 132 andrew gelman stats-2010-07-07-Note to “Cigarettes”

Introduction: To the person who posted an apparently non-spam comment with a URL link to a “cheap cigarettes” website: In case you’re wondering, no, your comment didn’t get caught by the spam filter–I’m not sure why not, given that URL. I put it in the spam file manually. If you’d like to participate in blog discussion in the future, please refrain from including spam links. Thank you. Also, it’s “John Tukey,” not “John Turkey.”

6 0.82420039 1488 andrew gelman stats-2012-09-08-Annals of spam

7 0.82321978 1709 andrew gelman stats-2013-02-06-The fractal nature of scientific revolutions

8 0.81390452 220 andrew gelman stats-2010-08-20-Why I blog?

9 0.76904541 839 andrew gelman stats-2011-08-04-To commenters who are trying to sell something

10 0.76188421 425 andrew gelman stats-2010-11-21-If your comment didn’t get through . . .

11 0.76049995 523 andrew gelman stats-2011-01-18-Spam is out of control

12 0.75511754 104 andrew gelman stats-2010-06-22-Seeking balance

13 0.72715569 9 andrew gelman stats-2010-04-28-But it all goes to pay for gas, car insurance, and tolls on the turnpike

14 0.72578937 856 andrew gelman stats-2011-08-16-Our new improved blog! Thanks to Cord Blomquist

15 0.70124674 2088 andrew gelman stats-2013-11-04-Recently in the sister blog

16 0.69742382 2075 andrew gelman stats-2013-10-23-PubMed Commons: A system for commenting on articles in PubMed

17 0.69228733 1964 andrew gelman stats-2013-08-01-Non-topical blogging

18 0.68978423 1202 andrew gelman stats-2012-03-08-Between and within-Krugman correlation

19 0.68514156 876 andrew gelman stats-2011-08-28-Vaguely related to the coke-dumping story

20 0.67699176 199 andrew gelman stats-2010-08-11-Note to semi-spammers


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.011), (9, 0.02), (15, 0.046), (16, 0.053), (21, 0.026), (24, 0.097), (42, 0.011), (52, 0.016), (59, 0.227), (63, 0.043), (72, 0.013), (84, 0.011), (99, 0.343)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95856917 1599 andrew gelman stats-2012-11-30-“The scientific literature must be cleansed of everything that is fraudulent, especially if it involves the work of a leading academic”

Introduction: Someone points me to this report from Tilburg University on disgraced psychology researcher Diederik Stapel. The report includes bits like this: When the fraud was first discovered, limiting the harm it caused for the victims was a matter of urgency. This was particularly the case for Mr Stapel’s former PhD students and postdoctoral researchers . . . However, the Committees were of the opinion that the main bulk of the work had not yet even started. . . . Journal publications can often leave traces that reach far into and even beyond scientific disciplines. The self-cleansing character of science calls for fraudulent publications to be withdrawn and no longer to proliferate within the literature. In addition, based on their initial impressions, the Committees believed that there were other serious issues within Mr Stapel’s publications . . . This brought into the spotlight a research culture in which this sloppy science, alongside out-and-out fraud, was able to remain undetected

2 0.95627034 214 andrew gelman stats-2010-08-17-Probability-processing hardware

Introduction: Lyric Semiconductor posted: For over 60 years, computers have been based on digital computing principles. Data is represented as bits (0s and 1s). Boolean logic gates perform operations on these bits. A processor steps through many of these operations serially in order to perform a function. However, today’s most interesting problems are not at all suited to this approach. Here at Lyric Semiconductor, we are redesigning information processing circuits from the ground up to natively process probabilities: from the gate circuits to the processor architecture to the programming language. As a result, many applications that today require a thousand conventional processors will soon run in just one Lyric processor, providing 1,000x efficiencies in cost, power, and size. Om Malik has some more information, also relating to the team and the business. The fundamental idea is that computing architectures work deterministically, even though the world is fundamentally stochastic.

3 0.95533371 853 andrew gelman stats-2011-08-14-Preferential admissions for children of elite colleges

Introduction: Jenny Anderson reports on a discussion of the practice of colleges’ preferential admission of children of alumni: [Richard] Kahlenberg citing research from his book “Affirmative Action for the Rich: Legacy Preferences in College Admissions” made the case that getting into good schools matters — 12 institutions making up less than 1 percent of the U.S. population produced 42 percent of government leaders and 54 percent of corporate leaders. And being a legacy helps improve an applicant’s chances of getting in, with one study finding that being a primary legacy — the son or daughter of an undergraduate alumnus or alumna — increases one’s chance of admission by 45.1 percent. I’d call that 45 percent but I get the basic idea. But then Jeffrey Brenzel of the Yale admissions office replied: “We turn away 80 percent of our legacies, and we feel it every day,” Mr. Brenzel said, adding that he rejected more offspring of the school’s Sterling donors than he accepted this year (

4 0.94070762 965 andrew gelman stats-2011-10-19-Web-friendly visualizations in R

Introduction: Aleks points me to this new tool from Wojciech Gryc. Right now I save my graphs as pdfs or pngs and then upload them to put them on the web. I expect I’ll still be doing this for awhile—I like having full control of what my graphs look like—but Gryc’s default plots might be useful for lots of people making their analyses more accessible. Here’s an example: x = rnorm(30); y = rnorm(30); wv.plot(x, y, "~/Desktop/scatterplot", height=300, width=300, xlim=c(-2.5,2.5), ylim=c(-2.5,2.5), xbreaks=c(0), ybreaks=c(0))

5 0.92876577 229 andrew gelman stats-2010-08-24-Bizarre twisty argument about medical diagnostic tests

Introduction: My cobloggers sometimes write about “Politics Everywhere.” Here’s an example of a political writer taking something that’s not particularly political and trying to twist it into a political context. Perhaps the title should be “political journalism everywhere”. Michael Kinsley writes: Scientists have discovered a spinal fluid test that can predict with 100 percent accuracy whether people who already have memory loss are going to develop full-fledged Alzheimer’s disease. They apparently don’t know whether this test works for people with no memory problems yet, but reading between the lines of the report in the New York Times August 10, it sounds as if they believe it will. . . . This is truly the apple of knowledge: a test that can be given to physically and mentally healthy people in the prime of life, which can identify with perfect accuracy which ones are slowly going to lose their mental capabilities. If your first instinct is, “We should outlaw this test” or at lea

6 0.9281646 1000 andrew gelman stats-2011-11-10-Forecasting 2012: How much does ideology matter?

7 0.92697394 1716 andrew gelman stats-2013-02-09-iPython Notebook

same-blog 8 0.92431372 771 andrew gelman stats-2011-06-16-30 days of statistics

9 0.91948032 34 andrew gelman stats-2010-05-14-Non-academic writings on literature

10 0.91839468 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others

11 0.9172219 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91

12 0.91426635 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

13 0.90937811 1380 andrew gelman stats-2012-06-15-Coaching, teaching, and writing

14 0.90936172 199 andrew gelman stats-2010-08-11-Note to semi-spammers

15 0.9068523 517 andrew gelman stats-2011-01-14-Bayes in China update

16 0.90216595 1190 andrew gelman stats-2012-02-29-Why “Why”?

17 0.9001562 1377 andrew gelman stats-2012-06-13-A question about AIC

18 0.89917564 1415 andrew gelman stats-2012-07-13-Retractions, retractions: “left-wing enough to not care about truth if it confirms their social theories, right-wing enough to not care as long as they’re getting paid enough”

19 0.89582938 766 andrew gelman stats-2011-06-14-Last Wegman post (for now)

20 0.8909446 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical