andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-509 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: From Dan Goldstein: Pretty good, but really the pie chart should be three-dimensional, shown at an angle, and with one or two of the slices popping out. P.S. They seemed to have placed a link for the Bill James Historical Baseball Abstract. That book’s ok, but what I was really recommending were his Abstracts from 1982-1986, which are something else entirely.
sentIndex sentText sentNum sentScore
1 From Dan Goldstein: Pretty good, but really the pie chart should be three-dimensional, shown at an angle, and with one or two of the slices popping out. [sent-1, score-1.525]
2 They seemed to have placed a link for the Bill James Historical Baseball Abstract. [sent-4, score-0.478]
3 That book’s ok, but what I was really recommending were his Abstracts from 1982-1986, which are something else entirely. [sent-5, score-0.578]
wordName wordTfidf (topN-words)
[('popping', 0.33), ('slices', 0.304), ('goldstein', 0.27), ('angle', 0.27), ('abstracts', 0.27), ('recommending', 0.265), ('pie', 0.257), ('chart', 0.236), ('placed', 0.229), ('baseball', 0.211), ('historical', 0.186), ('dan', 0.186), ('shown', 0.179), ('entirely', 0.176), ('james', 0.166), ('bill', 0.161), ('seemed', 0.137), ('else', 0.127), ('ok', 0.119), ('really', 0.118), ('link', 0.112), ('book', 0.091), ('pretty', 0.086), ('something', 0.068), ('two', 0.066), ('good', 0.058), ('one', 0.035)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 509 andrew gelman stats-2011-01-09-Chartjunk, but in a good cause!
2 0.19364335 697 andrew gelman stats-2011-05-05-A statistician rereads Bill James
Introduction: Ben Lindbergh invited me to write an article for Baseball Prospectus. I first sent him this item on the differences between baseball and politics but he said it was too political for them. I then sent him this review of a book on baseball’s greatest fielders but he said they already had someone slotted to review that book. Then I sent him some reflections on the great Bill James and he published it! If anybody out there knows Bill James, please send this on to him: I have some questions at the end that I’m curious about. Here’s how it begins: I read my first Bill James book in 1984, took my first statistics class in 1985, and began graduate study in statistics the next year. Besides giving me the opportunity to study with the best applied statistician of the late 20th century (Don Rubin) and the best theoretical statistician of the early 21st (Xiao-Li Meng), going to graduate school at Harvard in 1986 gave me the opportunity to sit in a basement room one evening that
Introduction: During our discussion of estimates of teacher performance, Steve Sailer wrote: I suspect we’re going to take years to work the kinks out of overall rating systems. By way of analogy, Bill James kicked off the modern era of baseball statistics analysis around 1975. But he stuck to doing smaller scale analyses and avoided trying to build one giant overall model for rating players. In contrast, other analysts such as Pete Palmer rushed into building overall ranking systems, such as his 1984 book, but they tended to generate curious results such as the greatness of Roy Smalley Jr. James held off until 1999 before unveiling his win share model for overall rankings. I remember looking at Pete Palmer’s book many years ago and being disappointed that he did everything through his Linear Weights formula. A hit is worth X, a walk is worth Y, etc. Some of this is good–it’s presumably an improvement on counting walks as 0 or 1 hits, also an improvement on counting doubles and triples a
4 0.16983125 2116 andrew gelman stats-2013-11-28-“Statistics is what people think math is”
Introduction: My 5books interview (from 2011), where we talk about The Bill James Baseball Abstracts, Judgment under Uncertainty, How Animals Work, The Honest Rainmaker, and How to Talk So Kids Will Listen and Listen So Kids Will Talk.
5 0.15504679 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball
Introduction: Dan Goldstein did an informal study asking people the following question: When two baseball teams play each other on two consecutive days, what is the probability that the winner of the first game will be the winner of the second game? You can make your own guess and then continue reading below. Dan writes: We asked two colleagues knowledgeable in baseball and the mathematics of forecasting. The answers came in between 65% and 70%. The true answer [based on Dan's analysis of a database of baseball games]: 51.3%, a little better than a coin toss. I have to say, I’m surprised his colleagues gave such extreme guesses. I was guessing something like 50%, myself, based on the following very crude reasoning: Suppose two unequal teams are playing, and the chance of team A beating team B is 55%. (This seems like a reasonable average of all matchups, which will include some more extreme disparities but also many more equal contests.) Then the chance of the same team
7 0.13326114 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”
8 0.1258733 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again
9 0.11794385 190 andrew gelman stats-2010-08-07-Mister P makes the big jump from the New York Times to the Washington Post
10 0.11714876 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology
11 0.1165279 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??
12 0.11581452 440 andrew gelman stats-2010-12-01-In defense of jargon
13 0.1100046 642 andrew gelman stats-2011-04-02-Bill James and the base-rate fallacy
16 0.10172087 499 andrew gelman stats-2011-01-03-5 books
17 0.097777933 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.
18 0.092075184 620 andrew gelman stats-2011-03-19-Online James?
19 0.087724827 890 andrew gelman stats-2011-09-05-Error statistics
20 0.08736454 367 andrew gelman stats-2010-10-25-In today’s economy, the rich get richer
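The crude reasoning in the "Probability of successive wins in baseball" excerpt above can be checked with a few lines of arithmetic. This is only a sketch under assumptions that the truncated excerpt does not spell out: the two games are treated as independent, and team A's win probability is the same in both. The function name `prob_same_winner` is hypothetical, introduced here for illustration.

```python
def prob_same_winner(p: float) -> float:
    """Chance that the same team wins both games.

    Assumes (for illustration only) that the two games are independent
    and that team A wins each game with the same probability p.
    Either A wins twice (p * p) or B wins twice ((1 - p) * (1 - p)).
    """
    return p * p + (1 - p) * (1 - p)


if __name__ == "__main__":
    # The 55% figure is the one used in the excerpt's crude reasoning.
    print(round(prob_same_winner(0.55), 3))
```

Under these assumptions a 55% per-game edge gives a repeat-winner probability only slightly above one half, which is in the same neighborhood as the 51.3% figure quoted from Dan Goldstein's data and well below the 65% to 70% guesses.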
topicId topicWeight
[(0, 0.066), (1, -0.044), (2, -0.019), (3, 0.054), (4, 0.031), (5, -0.022), (6, 0.022), (7, 0.019), (8, 0.048), (9, 0.004), (10, 0.015), (11, -0.01), (12, 0.031), (13, -0.02), (14, 0.03), (15, 0.002), (16, 0.01), (17, 0.028), (18, 0.049), (19, -0.087), (20, -0.03), (21, -0.001), (22, 0.016), (23, 0.021), (24, 0.046), (25, 0.026), (26, -0.033), (27, -0.024), (28, -0.033), (29, -0.128), (30, -0.035), (31, -0.031), (32, 0.043), (33, -0.018), (34, -0.079), (35, 0.053), (36, 0.039), (37, -0.013), (38, 0.01), (39, -0.053), (40, 0.154), (41, 0.07), (42, -0.028), (43, 0.02), (44, -0.004), (45, 0.065), (46, -0.032), (47, 0.066), (48, -0.058), (49, 0.017)]
simIndex simValue blogId blogTitle
same-blog 1 0.96585166 509 andrew gelman stats-2011-01-09-Chartjunk, but in a good cause!
2 0.84694505 440 andrew gelman stats-2010-12-01-In defense of jargon
Introduction: Daniel Drezner takes on Bill James.
Introduction: Eric Tassone writes: Probably not blog-worthy/blog-appropriate, but have you heard Bill James discussing the Sandusky & Paterno stuff? I think you discussed once his stance on the Dowd Report, and this seems to be from the same part of his personality—which goes beyond contrarian . . . I have in fact blogged on James (many times) and on Paterno, so yes I think this is blogworthy. On the other hand, most readers of this blog probably don’t care about baseball, football, or William James, so I’ll put the rest below the fold. What is legendary baseball statistician Bill James doing, defending the crime-coverups of legendary coach Joe Paterno? As I wrote in my earlier blog on Paterno, it isn’t always easy to do the right thing, and I have no idea if I’d behave any better if I were in such a situation. The characteristics of a good coach do not necessarily provide what it takes to make good decisions off the field. In this sense even more of the blame should go
4 0.80089676 367 andrew gelman stats-2010-10-25-In today’s economy, the rich get richer
Introduction: I found a $5 bill on the street today.
5 0.78902805 642 andrew gelman stats-2011-04-02-Bill James and the base-rate fallacy
Introduction: I was recently rereading and enjoying Bill James’s Historical Baseball Abstract (the second edition, from 2001). But even the Master is not perfect. Here he is, in the context of the all-time 20th-greatest shortstop (in his reckoning): Are athletes special people? In general, no, but occasionally, yes. Johnny Pesky at 75 was trim, youthful, optimistic, and practically exploding with energy. You rarely meet anybody like that who isn’t an ex-athlete–and that makes athletes seem special. [italics in the original] Hey, I’ve met 75-year-olds like that–and none of them are ex-athletes! That’s probably because I don’t know a lot of ex-athletes. But Bill James . . . he knows a lot of athletes. He went to the bathroom with Tim Raines once! The most I can say is that I saw Rickey Henderson steal a couple bases when he was playing against the Orioles once. Cognitive psychologists talk about the base-rate fallacy, which is the mistake of estimating probabilities without accou
6 0.72045642 697 andrew gelman stats-2011-05-05-A statistician rereads Bill James
8 0.69718146 173 andrew gelman stats-2010-07-31-Editing and clutch hitting
9 0.69381535 1113 andrew gelman stats-2012-01-11-Toshiro Kageyama on professionalism
10 0.64102006 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!
11 0.63456076 942 andrew gelman stats-2011-10-04-45% hitting, 25% fielding, 25% pitching, and 100% not telling us how they did it
12 0.61322409 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders
13 0.60373425 2116 andrew gelman stats-2013-11-28-“Statistics is what people think math is”
14 0.57176012 499 andrew gelman stats-2011-01-03-5 books
16 0.56345689 355 andrew gelman stats-2010-10-20-Andy vs. the Ideal Point Model of Voting
17 0.55451429 1738 andrew gelman stats-2013-02-25-Plaig
18 0.51716214 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball
19 0.51482207 620 andrew gelman stats-2011-03-19-Online James?
topicId topicWeight
[(1, 0.092), (5, 0.067), (15, 0.105), (21, 0.051), (24, 0.062), (38, 0.18), (40, 0.055), (42, 0.05), (55, 0.049), (99, 0.131)]
simIndex simValue blogId blogTitle
same-blog 1 0.83414179 509 andrew gelman stats-2011-01-09-Chartjunk, but in a good cause!
2 0.70877796 1874 andrew gelman stats-2013-05-28-Nostalgia
Introduction: Saw Argo the other day, was impressed by the way it was filmed in such a 70s style, sorta like that movie The Limey or an episode of the Rockford Files. I also felt nostalgia for that relatively nonviolent era. All those hostages and nobody was killed. It’s a good thing the Ayatollah didn’t have some fundamentalist Shiite equivalent of John Yoo telling him to waterboard everybody. At the time we were all so angry and upset about the hostage-taking, but from the perspective of our suicide-bomber era, that whole hostage episode seems so comfortingly mild.
3 0.66346121 393 andrew gelman stats-2010-11-04-Estimating the effect of A on B, and also the effect of B on A
Introduction: Lei Liu writes: I am working with clinicians in infectious disease and international health to study the (possible causal) relation between malnutrition and virus infection episodes (e.g., diarrhea) in babies in developing countries. Basically the clinicians are interested in two questions: does malnutrition cause more diarrhea episodes? does diarrhea lead to malnutrition? The malnutrition status is indicated by height and weight (adjusted, HAZ and WAZ measures) observed every 3 months from birth to 1 year. They also recorded the time of each diarrhea episode during the 1 year follow-up period. They have very solid datasets for analysis. As you can see, this is almost like a chicken and egg problem. I am a layman to causal inference. The method I use is just to do some simple regression. For example, to study the causal relation from malnutrition to diarrhea episodes, I use binary variable (diarrhea yes/no during months 0-3) as response, and use the HAZ at month 0 as covariate
4 0.65102708 1498 andrew gelman stats-2012-09-16-Choices in graphing parallel time series
Introduction: I saw this graph posted by Tyler Cowen: and my first thought was that the bar plot should be replaced by a line plot: Six lines, one for each income category, with each line being a time series of these changes. With a line plot, you can more easily see each time series (these are hard to see in the barplot because you have to follow each color and jump from decade to decade) and also compare the patterns for each category. The line plot pretty much dominates the bar plot. At least that was the theory. Now here’s what actually happened. I downloaded the data as Excel files, saved them as csv, then read them into R. In all, it took close to an hour to get the data set up in the format that was needed to make the graphs. At this point it was pretty easy to make the line plot. But the result was disappointing: The six lines are hard to untangle (sure, a better color scheme might help, but it wouldn’t really solve the problem) and the graph as a whole is much l
Introduction: At the Statistics Forum, we highlight a debate about how statistics should be taught in high schools. Check it out and then please leave your comments there.
6 0.62360126 1032 andrew gelman stats-2011-11-28-Does Avastin work on breast cancer? Should Medicare be paying for it?
7 0.62344873 1073 andrew gelman stats-2011-12-20-Not quite getting the point
9 0.61919677 527 andrew gelman stats-2011-01-20-Cars vs. trucks
10 0.61000699 717 andrew gelman stats-2011-05-17-Statistics plagiarism scandal
12 0.60648823 1449 andrew gelman stats-2012-08-08-Gregor Mendel’s suspicious data
13 0.60111594 1541 andrew gelman stats-2012-10-19-Statistical discrimination again
14 0.60035241 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model
15 0.59867197 1081 andrew gelman stats-2011-12-24-Statistical ethics violation
16 0.59497643 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?
17 0.59154534 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again
18 0.59111518 600 andrew gelman stats-2011-03-04-“Social Psychologists Detect Liberal Bias Within”
20 0.58974814 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression