andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-771 knowledge-graph by maker-knowledge-mining
Introduction: I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. She suggested that for the next month I just blog about my research ideas. So I think I’ll do that. This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc., for 30 days. Meanwhile, I have a roughly 30-day backlog. So after my next 30 days of stat blogging, the backlog will gradually appear. There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! I’m sure you’ll enjoy it. But you’ll have to wait for all that fun stuff. For the next thirty days, it’s statistics research every day. P.S. I
sentIndex sentText sentNum sentScore
1 I was talking with a colleague about one of our research projects and said that I would write something up, if blogging didn’t get in the way. [sent-1, score-0.338]
2 She suggested that for the next month I just blog about my research ideas. [sent-2, score-0.495]
3 This means no mocking of plagiarists, no reflections on literature, no answers to miscellaneous questions about how many groups you need in a multilevel model, no rants about economists, no links to pretty graphs, etc. [sent-4, score-0.675]
4 So after my next 30 days of stat blogging, the backlog will gradually appear. [sent-7, score-0.669]
5 There’s some good stuff there, including reflections on Milos, a (sincere) tribute to the haters, an updated Twitteo Killed the Bloggio Star, a question about acupuncture, and some remote statistical modeling advice I gave that actually worked! [sent-8, score-0.609]
6 But you’ll have to wait for all that fun stuff. [sent-10, score-0.091]
7 For the next thirty days, it’s statistics research every day. [sent-11, score-0.358]
8 If anything comes up that’s too topical to be held for a month, I’ll post it on one of the sister blogs. [sent-14, score-0.388]
9 As always, my cobloggers can feel free to post here whenever they want on whatever they want. [sent-18, score-0.229]
10 We’ll soon be moving the blog to a new site for the blog. [sent-23, score-0.185]
11 I’ll make an exception to the all-statistics-research rule to update you on that when it occurs. [sent-24, score-0.197]
12 To anybody whose comments don’t appear: As noted earlier, we get thousands of spam comments per hour, so (a) some legitimate comments get caught by the spam filter, and (b) it’s impossible for us to look through the spam to see if anything real got stuck there. [sent-30, score-1.956]
13 Try registering as a commenter, that might help. [sent-31, score-0.145]
14 Or maybe things will be better in a few days with the new blog software. [sent-32, score-0.302]
wordName wordTfidf (topN-words)
[('spam', 0.3), ('reflections', 0.22), ('ll', 0.21), ('days', 0.2), ('blogging', 0.159), ('acupuncture', 0.154), ('milos', 0.154), ('tribute', 0.154), ('next', 0.154), ('month', 0.148), ('comments', 0.147), ('registering', 0.145), ('miscellaneous', 0.139), ('remote', 0.134), ('haters', 0.134), ('cobloggers', 0.13), ('topical', 0.13), ('backlog', 0.121), ('rants', 0.121), ('plagiarists', 0.121), ('mocking', 0.113), ('exception', 0.113), ('sincere', 0.113), ('thirty', 0.113), ('killed', 0.109), ('star', 0.104), ('blog', 0.102), ('filter', 0.102), ('gradually', 0.102), ('updated', 0.101), ('whenever', 0.099), ('anybody', 0.095), ('legitimate', 0.093), ('stat', 0.092), ('wait', 0.091), ('research', 0.091), ('meanwhile', 0.091), ('projects', 0.088), ('anything', 0.087), ('impossible', 0.087), ('hour', 0.087), ('held', 0.086), ('sister', 0.085), ('update', 0.084), ('stuck', 0.084), ('soon', 0.083), ('commenter', 0.083), ('thousands', 0.082), ('roughly', 0.082), ('answers', 0.082)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000002 771 andrew gelman stats-2011-06-16-30 days of statistics
2 0.277931 619 andrew gelman stats-2011-03-19-If a comment is flagged as spam, it will disappear forever
Introduction: A commenter wrote (by email): I’ve noticed that you’ve quit approving my comments on your blog. I hope I didn’t anger you in some way or write something you felt was inappropriate. My reply: I have not been unapproving any comments. If you have comments that have not appeared, they have probably been going into the spam filter. I get literally thousands of spam comments a day and so anything that hits the spam filter is gone forever. I think there is a way to register as a commenter; that could help.
3 0.24725768 425 andrew gelman stats-2010-11-21-If your comment didn’t get through . . .
Introduction: It probably got caught in the spam filter. We get tons and tons of spam (including the annoying spam that I have to remove by hand). If your comment was accompanied by an ad or a spam link, then maybe I just deleted it.
4 0.23519206 790 andrew gelman stats-2011-07-08-Blog in motion
Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)
5 0.22734751 132 andrew gelman stats-2010-07-07-Note to “Cigarettes”
Introduction: To the person who posted an apparently non-spam comment with a URL link to a “cheap cigarettes” website: In case you’re wondering, no, your comment didn’t get caught by the spam filter–I’m not sure why not, given that URL. I put it in the spam file manually. If you’d like to participate in blog discussion in the future, please refrain from including spam links. Thank you. Also, it’s “John Tukey,” not “John Turkey.”
6 0.21567863 817 andrew gelman stats-2011-07-23-New blog home
7 0.20281769 1488 andrew gelman stats-2012-09-08-Annals of spam
8 0.15239604 523 andrew gelman stats-2011-01-18-Spam is out of control
9 0.14579698 27 andrew gelman stats-2010-05-11-Update on the spam email study
10 0.13618229 104 andrew gelman stats-2010-06-22-Seeking balance
11 0.13486588 839 andrew gelman stats-2011-08-04-To commenters who are trying to sell something
12 0.12977038 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
13 0.12865505 2206 andrew gelman stats-2014-02-10-On deck this week
14 0.12653241 220 andrew gelman stats-2010-08-20-Why I blog?
15 0.12091461 889 andrew gelman stats-2011-09-04-The acupuncture paradox
16 0.12078263 826 andrew gelman stats-2011-07-27-The Statistics Forum!
17 0.11809352 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!
18 0.11691257 905 andrew gelman stats-2011-09-14-5 books on essentialism!
19 0.11671093 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys
20 0.11395757 545 andrew gelman stats-2011-01-30-New innovations in spam
topicId topicWeight
[(0, 0.179), (1, -0.086), (2, -0.093), (3, 0.049), (4, 0.052), (5, 0.016), (6, 0.06), (7, -0.083), (8, 0.05), (9, -0.08), (10, 0.054), (11, 0.046), (12, 0.213), (13, 0.065), (14, -0.064), (15, 0.084), (16, -0.065), (17, -0.083), (18, -0.076), (19, 0.113), (20, 0.105), (21, -0.092), (22, -0.082), (23, -0.086), (24, 0.01), (25, -0.023), (26, 0.01), (27, 0.046), (28, -0.016), (29, -0.025), (30, 0.036), (31, -0.001), (32, -0.015), (33, -0.022), (34, 0.016), (35, 0.091), (36, 0.017), (37, 0.074), (38, -0.016), (39, -0.027), (40, -0.119), (41, 0.044), (42, -0.074), (43, 0.005), (44, 0.017), (45, -0.048), (46, 0.012), (47, -0.003), (48, -0.056), (49, 0.013)]
simIndex simValue blogId blogTitle
same-blog 1 0.96349233 771 andrew gelman stats-2011-06-16-30 days of statistics
2 0.94174379 817 andrew gelman stats-2011-07-23-New blog home
Introduction: Hi all. We’ve moved the blog and are still working out some bugs. For example, we delete spam comments but sometimes they remain on the blog. A few other things. We should be cleaning it up more in the next few days.
3 0.85282093 790 andrew gelman stats-2011-07-08-Blog in motion
Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)
4 0.8525067 619 andrew gelman stats-2011-03-19-If a comment is flagged as spam, it will disappear forever
Introduction: A commenter wrote (by email): I’ve noticed that you’ve quit approving my comments on your blog. I hope I didn’t anger you in some way or write something you felt was inappropriate. My reply: I have not been unapproving any comments. If you have comments that have not appeared, they have probably been going into the spam filter. I get literally thousands of spam comments a day and so anything that hits the spam filter is gone forever. I think there is a way to register as a commenter; that could help.
5 0.85110569 132 andrew gelman stats-2010-07-07-Note to “Cigarettes”
Introduction: To the person who posted an apparently non-spam comment with a URL link to a “cheap cigarettes” website: In case you’re wondering, no, your comment didn’t get caught by the spam filter–I’m not sure why not, given that URL. I put it in the spam file manually. If you’d like to participate in blog discussion in the future, please refrain from including spam links. Thank you. Also, it’s “John Tukey,” not “John Turkey.”
6 0.82420039 1488 andrew gelman stats-2012-09-08-Annals of spam
7 0.82321978 1709 andrew gelman stats-2013-02-06-The fractal nature of scientific revolutions
8 0.81390452 220 andrew gelman stats-2010-08-20-Why I blog?
9 0.76904541 839 andrew gelman stats-2011-08-04-To commenters who are trying to sell something
10 0.76188421 425 andrew gelman stats-2010-11-21-If your comment didn’t get through . . .
11 0.76049995 523 andrew gelman stats-2011-01-18-Spam is out of control
12 0.75511754 104 andrew gelman stats-2010-06-22-Seeking balance
13 0.72715569 9 andrew gelman stats-2010-04-28-But it all goes to pay for gas, car insurance, and tolls on the turnpike
14 0.72578937 856 andrew gelman stats-2011-08-16-Our new improved blog! Thanks to Cord Blomquist
15 0.70124674 2088 andrew gelman stats-2013-11-04-Recently in the sister blog
16 0.69742382 2075 andrew gelman stats-2013-10-23-PubMed Commons: A system for commenting on articles in PubMed
17 0.69228733 1964 andrew gelman stats-2013-08-01-Non-topical blogging
18 0.68978423 1202 andrew gelman stats-2012-03-08-Between and within-Krugman correlation
19 0.68514156 876 andrew gelman stats-2011-08-28-Vaguely related to the coke-dumping story
20 0.67699176 199 andrew gelman stats-2010-08-11-Note to semi-spammers
topicId topicWeight
[(2, 0.011), (9, 0.02), (15, 0.046), (16, 0.053), (21, 0.026), (24, 0.097), (42, 0.011), (52, 0.016), (59, 0.227), (63, 0.043), (72, 0.013), (84, 0.011), (99, 0.343)]
simIndex simValue blogId blogTitle
Introduction: Someone points me to this report from Tilburg University on disgraced psychology researcher Diederik Stapel. The report includes bits like this: When the fraud was first discovered, limiting the harm it caused for the victims was a matter of urgency. This was particularly the case for Mr Stapel’s former PhD students and postdoctoral researchers . . . However, the Committees were of the opinion that the main bulk of the work had not yet even started. . . . Journal publications can often leave traces that reach far into and even beyond scientific disciplines. The self-cleansing character of science calls for fraudulent publications to be withdrawn and no longer to proliferate within the literature. In addition, based on their initial impressions, the Committees believed that there were other serious issues within Mr Stapel’s publications . . . This brought into the spotlight a research culture in which this sloppy science, alongside out-and-out fraud, was able to remain undetected
2 0.95627034 214 andrew gelman stats-2010-08-17-Probability-processing hardware
Introduction: Lyric Semiconductor posted: For over 60 years, computers have been based on digital computing principles. Data is represented as bits (0s and 1s). Boolean logic gates perform operations on these bits. A processor steps through many of these operations serially in order to perform a function. However, today’s most interesting problems are not at all suited to this approach. Here at Lyric Semiconductor, we are redesigning information processing circuits from the ground up to natively process probabilities: from the gate circuits to the processor architecture to the programming language. As a result, many applications that today require a thousand conventional processors will soon run in just one Lyric processor, providing 1,000x efficiencies in cost, power, and size. Om Malik has some more information, also relating to the team and the business. The fundamental idea is that computing architectures work deterministically, even though the world is fundamentally stochastic.
3 0.95533371 853 andrew gelman stats-2011-08-14-Preferential admissions for children of elite colleges
Introduction: Jenny Anderson reports on a discussion of the practice of colleges’ preferential admission of children of alumni: [Richard] Kahlenberg citing research from his book “Affirmative Action for the Rich: Legacy Preferences in College Admissions” made the case that getting into good schools matters — 12 institutions making up less than 1 percent of the U.S. population produced 42 percent of government leaders and 54 percent of corporate leaders. And being a legacy helps improve an applicant’s chances of getting in, with one study finding that being a primary legacy — the son or daughter of an undergraduate alumnus or alumna — increases one’s chance of admission by 45.1 percent. I’d call that 45 percent but I get the basic idea. But then Jeffrey Brenzel of the Yale admissions office replied: “We turn away 80 percent of our legacies, and we feel it every day,” Mr. Brenzel said, adding that he rejected more offspring of the school’s Sterling donors than he accepted this year (
4 0.94070762 965 andrew gelman stats-2011-10-19-Web-friendly visualizations in R
Introduction: Aleks points me to this new tool from Wojciech Gryc. Right now I save my graphs as pdfs or pngs and then upload them to put them on the web. I expect I’ll still be doing this for a while (I like having full control of what my graphs look like), but Gryc’s default plots might be useful for lots of people making their analyses more accessible. Here’s an example: x = rnorm(30); y = rnorm(30); wv.plot(x, y, "~/Desktop/scatterplot", height=300, width=300, xlim=c(-2.5,2.5), ylim=c(-2.5,2.5), xbreaks=c(0), ybreaks=c(0))
5 0.92876577 229 andrew gelman stats-2010-08-24-Bizarre twisty argument about medical diagnostic tests
Introduction: My cobloggers sometimes write about “Politics Everywhere.” Here’s an example of a political writer taking something that’s not particularly political and trying to twist it into a political context. Perhaps the title should be “political journalism everywhere”. Michael Kinsley writes: Scientists have discovered a spinal fluid test that can predict with 100 percent accuracy whether people who already have memory loss are going to develop full-fledged Alzheimer’s disease. They apparently don’t know whether this test works for people with no memory problems yet, but reading between the lines of the report in the New York Times August 10, it sounds as if they believe it will. . . . This is truly the apple of knowledge: a test that can be given to physically and mentally healthy people in the prime of life, which can identify with perfect accuracy which ones are slowly going to lose their mental capabilities. If your first instinct is, “We should outlaw this test” or at lea
6 0.9281646 1000 andrew gelman stats-2011-11-10-Forecasting 2012: How much does ideology matter?
7 0.92697394 1716 andrew gelman stats-2013-02-09-iPython Notebook
same-blog 8 0.92431372 771 andrew gelman stats-2011-06-16-30 days of statistics
9 0.91948032 34 andrew gelman stats-2010-05-14-Non-academic writings on literature
10 0.91839468 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others
11 0.9172219 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91
12 0.91426635 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
13 0.90937811 1380 andrew gelman stats-2012-06-15-Coaching, teaching, and writing
14 0.90936172 199 andrew gelman stats-2010-08-11-Note to semi-spammers
15 0.9068523 517 andrew gelman stats-2011-01-14-Bayes in China update
16 0.90216595 1190 andrew gelman stats-2012-02-29-Why “Why”?
17 0.9001562 1377 andrew gelman stats-2012-06-13-A question about AIC
19 0.89582938 766 andrew gelman stats-2011-06-14-Last Wegman post (for now)
20 0.8909446 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical