andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1676 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Three different people have pointed me to this post by Ken Regan on statistical evaluation of claims of cheating in chess. So I figured I have to satisfy demand and post something on this. But I have nothing to say. All these topics interest me, but I somehow had difficulty reading through the entire post. I scanned through but what I really wanted to see was some data. Show me a scatterplot, then I’ll get interested. P.S. This is meant as no disparagement of Regan or his blog. I just couldn’t quite get into this particular example.
sentIndex sentText sentNum sentScore
1 Three different people have pointed me to this post by Ken Regan on statistical evaluation of claims of cheating in chess. [sent-1, score-0.9]
2 So I figured I have to satisfy demand and post something on this. [sent-2, score-0.791]
3 All these topics interest me, but I somehow had difficulty reading through the entire post. [sent-4, score-0.758]
4 I scanned through but what I really wanted to see was some data. [sent-5, score-0.478]
5 Show me a scatterplot, then I’ll get interested. [sent-6, score-0.083]
6 This is meant as no disparagement of Regan or his blog. [sent-9, score-0.417]
7 I just couldn’t quite get into this particular example. [sent-10, score-0.254]
wordName wordTfidf (topN-words)
[('regan', 0.553), ('scanned', 0.276), ('disparagement', 0.265), ('ken', 0.223), ('satisfy', 0.213), ('scatterplot', 0.21), ('cheating', 0.205), ('figured', 0.189), ('demand', 0.181), ('evaluation', 0.162), ('somehow', 0.154), ('meant', 0.152), ('post', 0.151), ('difficulty', 0.14), ('topics', 0.136), ('entire', 0.133), ('couldn', 0.133), ('pointed', 0.12), ('wanted', 0.114), ('claims', 0.108), ('show', 0.101), ('interest', 0.1), ('reading', 0.095), ('three', 0.095), ('quite', 0.094), ('nothing', 0.094), ('get', 0.083), ('particular', 0.077), ('ll', 0.067), ('different', 0.058), ('something', 0.057), ('statistical', 0.056), ('really', 0.049), ('example', 0.049), ('people', 0.04), ('see', 0.039)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 1676 andrew gelman stats-2013-01-16-Detecting cheating in chess
2 0.087749042 1448 andrew gelman stats-2012-08-07-Scientific fraud, double standards and institutions protecting themselves
Introduction: Ole Rogeberg writes: After reading your recent post, I thought you might find this interesting – especially the scanned interview that is included at the bottom of the posting. It’s an old OMNI interview with Walter Stewart that was the first thing I read (at a young and impressionable age ;) about the prevalence of errors, fraud and cheating in science, the institutional barriers to tackling it, the often high personal costs to whistleblowers, the difficulty of accessing scientific data to repeat published analyses, and the surprisingly negative attitude towards criticism within scientific communities. Highly recommended entertaining reading – with some good examples of scientific investigations into implausible effects. The post itself contains the info I once dug up about what happened to him later – he seems like an interesting and very determined guy: when the NIH tried to stop him from investigating scientific errors and fraud he went on a hunger strike. No idea what’s h
3 0.073503815 1028 andrew gelman stats-2011-11-26-Tenure lets you handle students who cheat
Introduction: The other day, a friend of mine who is an untenured professor (not in statistics or political science) was telling me about a class where many of the students seemed to be resubmitting papers that they had already written for previous classes. (The supposition was based on internal evidence of the topics of the submitted papers.) It would be possible to check this and then kick the cheating students out of the program—but why do it? It would be a lot of work, also some of the students who are caught might complain, then word would get around that my friend is a troublemaker. And nobody likes a troublemaker. Once my friend has tenure it would be possible to do the right thing. But . . . here’s the hitch: most college instructors do not have tenure, and one result, I suspect, is a decline in ethical standards. This is something I hadn’t thought of in our earlier discussion of job security for teachers: tenure gives you the freedom to kick out cheating students.
4 0.070626304 790 andrew gelman stats-2011-07-08-Blog in motion
Introduction: In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries and comments should be reappearing in the reconstituted blog.)
5 0.066909842 227 andrew gelman stats-2010-08-23-Visualization magazine
Introduction: Aleks pointed me to this.
7 0.065567449 275 andrew gelman stats-2010-09-14-Data visualization at the American Evaluation Association
8 0.062703088 589 andrew gelman stats-2011-02-24-On summarizing a noisy scatterplot with a single comparison of two points
9 0.060610566 1605 andrew gelman stats-2012-12-04-Write This Book
10 0.058056518 1965 andrew gelman stats-2013-08-02-My course this fall on l’analyse bayésienne de données
11 0.056873344 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
12 0.0565947 892 andrew gelman stats-2011-09-06-Info on patent trolls
13 0.056288887 902 andrew gelman stats-2011-09-12-The importance of style in academic writing
14 0.056013316 280 andrew gelman stats-2010-09-16-Meet Hipmunk, a really cool flight-finder that doesn’t actually work
15 0.05566556 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?
17 0.05514231 2245 andrew gelman stats-2014-03-12-More on publishing in journals
18 0.053634595 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.
19 0.052753098 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel
20 0.051997565 886 andrew gelman stats-2011-09-02-The new Helen DeWitt novel
topicId topicWeight
[(0, 0.087), (1, -0.03), (2, -0.029), (3, 0.01), (4, 0.015), (5, -0.019), (6, 0.003), (7, -0.003), (8, 0.024), (9, -0.02), (10, 0.007), (11, 0.02), (12, 0.012), (13, 0.001), (14, -0.003), (15, 0.005), (16, 0.008), (17, -0.004), (18, -0.027), (19, 0.007), (20, -0.008), (21, -0.018), (22, 0.008), (23, 0.001), (24, -0.018), (25, 0.014), (26, -0.004), (27, 0.001), (28, 0.01), (29, 0.004), (30, 0.015), (31, -0.001), (32, -0.013), (33, -0.024), (34, 0.03), (35, -0.008), (36, 0.007), (37, -0.008), (38, 0.022), (39, -0.006), (40, 0.006), (41, -0.007), (42, -0.009), (43, -0.004), (44, -0.012), (45, -0.008), (46, -0.001), (47, -0.005), (48, -0.041), (49, 0.009)]
simIndex simValue blogId blogTitle
same-blog 1 0.96257937 1676 andrew gelman stats-2013-01-16-Detecting cheating in chess
2 0.81124407 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!
Introduction: Basbøll writes: I [Basbøll] have got to come up with forty things to say [in the next few months]. . . . What would you like me to write about? I’ll of course be writing quite a bit about what I’m now calling “article design”, i.e., how to map out the roughly forty paragraphs that a journal article is composed of. And I’ll also be talking about how to plan the writing process that is to produce those paragraphs. The basic principle is still to write at least one paragraph a day in 27 minutes. (You can adapt this in various ways to your own taste; some like 18-minute or even 13-minute paragraphs.) But I’d like to talk about questions of style, too, and even a little bit about epistemology. “Knowledge—academic knowledge, that is—is the ability to compose a coherent prose paragraph about something in 27 minutes,” I always say. I’d like to reflect a little more about what this conception of knowledge really means. This means I’ll have to walk back my recent dismissal of epistemol
3 0.80845046 2329 andrew gelman stats-2014-05-11-“What should you talk about?”
Introduction: Tyler Cowen quotes Robin Hanson: If your main reason for talking is to socialize, you’ll want to talk about whatever everyone else is talking about. Like say the missing Malaysia Airlines plane. But if instead your purpose is to gain and spread useful insight, so that we can all understand more about things that matter, you’ll want to look for relatively neglected topics. . . . One advantage of having this blog on a lag of a month or two is that I can post things, knowing that when my discussion finally appears, it will no longer be topical. Indeed, this post is an example.
4 0.80167395 1351 andrew gelman stats-2012-05-29-A Ph.D. thesis is not really a marathon
Introduction: Thomas Basbøll writes: A blog called The Thesis Whisperer was recently pointed out to me. I [Basbøll] haven’t looked at it closely, but I’ll be reading it regularly for a while before I recommend it. I’m sure it’s a good place to go to discover that you’re not alone, especially when you’re struggling with your dissertation. One post caught my eye immediately. It suggested that writing a thesis is not a sprint, it’s a marathon. As a metaphorical adjustment to a particular attitude about writing, it’s probably going to help some people. But if we think it through, it’s not really a very good analogy. No one is really a “sprinter”; and writing a dissertation is nothing like running a marathon. . . . Here’s Ben’s explication of the analogy at the Thesis Whisperer, which seems initially plausible. …writing a dissertation is a lot like running a marathon. They are both endurance events, they last a long time and they require a consistent and carefully calculated amount of effor
5 0.75871021 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others
Introduction: Thomas Basbøll writes: [Advertising executive] Russell Davies wrote a blog post called “The Tyranny of the Big Idea”. His five-point procedure begins: Start doing stuff. Start executing things which seem right. Do it quickly and do it often. Don’t cling onto anything, good or bad. Don’t repeat much. Take what was good and do it differently. And ends with: “And something else and something else.” This inspires several thoughts, which I’ll take advantage of the blog format to present with no attempt to be cohesively organized. 1. My first concern is the extent to which productivity-enhancing advice such as Davies’s (and Basbøll’s) is zero or even negative-sum, just helping people in the rat race. But, upon reflection, I’d rate the recommendations as positive-sum. If people learn to write better and be more productive, that’s not (necessarily) just positional. 2. Blogging fits with the “Do it quickly and do it often” advice. 3. I wonder what Basbøll thinks abo
6 0.75647134 727 andrew gelman stats-2011-05-23-My new writing strategy
7 0.74351639 1411 andrew gelman stats-2012-07-10-Defining ourselves arbitrarily
8 0.73585927 1269 andrew gelman stats-2012-04-19-Believe your models (up to the point that you abandon them)
9 0.73267448 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!
10 0.72797912 1982 andrew gelman stats-2013-08-15-Blaming scientific fraud on the Kuhnians
11 0.72698802 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
12 0.72234702 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?
13 0.72039157 2172 andrew gelman stats-2014-01-14-Advice on writing research articles
14 0.71823078 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet
15 0.71549779 302 andrew gelman stats-2010-09-28-This is a link to a news article about a scientific paper
16 0.71428269 826 andrew gelman stats-2011-07-27-The Statistics Forum!
17 0.71129709 1964 andrew gelman stats-2013-08-01-Non-topical blogging
18 0.71066779 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.
19 0.70765734 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum
20 0.70674092 1863 andrew gelman stats-2013-05-19-Prose is paragraphs, prose is sentences
topicId topicWeight
[(15, 0.042), (16, 0.08), (21, 0.022), (24, 0.135), (54, 0.197), (61, 0.036), (90, 0.037), (99, 0.296)]
simIndex simValue blogId blogTitle
1 0.9635036 322 andrew gelman stats-2010-10-06-More on the differences between drugs and medical devices
Introduction: Someone who works in statistics in the pharmaceutical industry (but prefers to remain anonymous) sent me this update to our discussion on the differences between approvals of drugs and medical devices: The ‘substantial equivalence’ threshold is very outdated. Basically the FDA has to follow federal law and the law is antiquated and leads to two extraordinarily different paths for device approval. You could have a very simple but first-in-kind device with an easy to understand physiological mechanism of action (e.g., the FDA approved a simple tiny stent that would relieve pressure from a glaucoma patient’s eye this summer). This device would require a standard (likely controlled) trial at the one-sided 0.025 level. Even after the trial it would likely go to a panel where outside experts (e.g., practicing & academic MDs and statisticians) hear evidence from the company and FDA and vote on its safety and efficacy. FDA would then rule, considering the panel’s vote, on whether to appro
2 0.95160997 1938 andrew gelman stats-2013-07-14-Learning how to speak
Introduction: I’ve been trying to reduce my American accent when speaking French. I tried taping my voice and playing it back, but that didn’t help. I couldn’t actually tell that I had a strong accent by listening to myself. My own voice is just too familiar to me. Then Malecki told me about the international phonetic alphabet, which is just great. And there’s even a convenient website that translates. For example, le loup est revenu -> lə lu ε ʀəvny I stared at Malecki’s mouth while he said the phrase, and I finally understood the difference between the two different “oo” sounds. That evening at home I tried it out on the local expert and he laughed at my attempts but grudgingly admitted I was getting better. On about the 10th try, after watching him say it over and over and staring at his mouth, I was finally able to do it! I know this is going to sound stupid to all you linguistics experts out there, but I had no idea that you could figure out how to speak better by staring at s
3 0.93643439 839 andrew gelman stats-2011-08-04-To commenters who are trying to sell something
Introduction: We screen our comments. If you link to an url of the form, http://we’re-selling-you-crap.org, then you go straight into the spam folder. If you want to contribute to the discussion here, fine. Comment without the spam links. If you want to advertise, go elsewhere. It’s customary to pay for ads. We have no plans to advertise your services for free.
same-blog 4 0.93220848 1676 andrew gelman stats-2013-01-16-Detecting cheating in chess
5 0.91256934 358 andrew gelman stats-2010-10-20-When Kerry Met Sally: Politics and Perceptions in the Demand for Movies
Introduction: Jason Roos sends along this article: On election days many of us see a colorful map of the U.S. where each tiny county has a color on the continuum between red and blue. So far we have not used such data to improve the effectiveness of marketing models. In this study, we show that we should. We demonstrate the usefulness of political data via an interesting application–the demand for movies. Using boxoffice data from 25 counties in the U.S. Midwest (21 quarters between 2000 and 2005) we show that by including political data one can improve out-of-sample predictions significantly. Specifically, we estimate the improvement in forecasts due to the addition of political data to be around $43 million per year for the entire U.S. theatrical market. Furthermore, when it comes to movies we depart from previous work in another way. While previous studies have relied on pre-determined movie genres, we estimate perceived movie attributes in a latent space and formulate viewers’ tastes as
6 0.90500367 1889 andrew gelman stats-2013-06-08-Using trends in R-squared to measure progress in criminology??
7 0.89809597 615 andrew gelman stats-2011-03-16-Chess vs. checkers
8 0.89074248 94 andrew gelman stats-2010-06-17-SAT stories
9 0.88645256 1721 andrew gelman stats-2013-02-13-A must-read paper on statistical analysis of experimental data
10 0.88624156 1105 andrew gelman stats-2012-01-08-Econ debate about prices at a fancy restaurant
11 0.88591433 1237 andrew gelman stats-2012-03-30-Statisticians: When We Teach, We Don’t Practice What We Preach
12 0.88518596 1473 andrew gelman stats-2012-08-28-Turing chess run update
13 0.87797219 1578 andrew gelman stats-2012-11-15-Outta control political incorrectness
14 0.87735021 867 andrew gelman stats-2011-08-23-The economics of the mac? A paradox of competition
15 0.87584966 2271 andrew gelman stats-2014-03-28-What happened to the world we knew?
16 0.87281722 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients
17 0.87168336 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll
18 0.87028861 1083 andrew gelman stats-2011-12-26-The quals and the quants
19 0.86653292 502 andrew gelman stats-2011-01-04-Cash in, cash out graph
20 0.86413717 1752 andrew gelman stats-2013-03-06-Online Education and Jazz