andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-942 knowledge-graph by maker-knowledge-mining

942 andrew gelman stats-2011-10-04-45% hitting, 25% fielding, 25% pitching, and 100% not telling us how they did it


meta information for this blog

Source: html

Introduction: A University of Delaware press release reports: This month, the Journal of Quantitative Analysis in Sports will feature the article “An Estimate of How Hitting, Pitching, Fielding, and Base-stealing Impact Team Winning Percentages in Baseball.” In it, University of Delaware Prof. Charles Pavitt of the Department of Communication defines the perfect “formula” for Major League Baseball (MLB) teams to use to build the ultimate winning team. Pavitt found hitting accounts for more than 45 percent of teams’ winning records, fielding for 25 percent and pitching for 25 percent. And that the impact of stolen bases is greatly overestimated. He crunched hitting, pitching, fielding and base-stealing records for every MLB team over a 48-year period from 1951-1998 with a method no other researcher has used in this area. In statistical parlance, he used a conceptual decomposition of offense and defense into its component parts and then analyzed recombinations of the parts in intuitively mea


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A University of Delaware press release reports: This month, the Journal of Quantitative Analysis in Sports will feature the article “An Estimate of How Hitting, Pitching, Fielding, and Base-stealing Impact Team Winning Percentages in Baseball. [sent-1, score-0.383]

2 Charles Pavitt of the Department of Communication defines the perfect “formula” for Major League Baseball (MLB) teams to use to build the ultimate winning team. [sent-3, score-0.673]

3 Pavitt found hitting accounts for more than 45 percent of teams’ winning records, fielding for 25 percent and pitching for 25 percent. [sent-4, score-1.348]

4 And that the impact of stolen bases is greatly overestimated. [sent-5, score-0.328]

5 He crunched hitting, pitching, fielding and base-stealing records for every MLB team over a 48-year period from 1951-1998 with a method no other researcher has used in this area. [sent-6, score-0.526]

6 In statistical parlance, he used a conceptual decomposition of offense and defense into its component parts and then analyzed recombinations of the parts in intuitively meaningful ways. [sent-7, score-0.902]

7 The good news is that the numbers add up to less than 100% (I assume the remaining 5% can be attributed to baserunning, strategy, and teamwork—those are the only other variable factors I can think of that could influence winning). [sent-8, score-0.205]

8 The bad news is that the press release does not link to the article or to any technical report. [sent-9, score-0.52]

9 So I have no idea whether to take Pavitt’s claim seriously at all. [sent-10, score-0.071]

10 I think this sort of press release is just silly: the claim is empty without the accompanying analysis. [sent-11, score-0.546]

11 I don’t have high hopes, though, given that the author appears to be analyzing “team winning percentages” rather than runs scored and runs allowed. [sent-12, score-0.864]

12 As Bill James has pointed out, runs scored and allowed are more directly related to offense and defense, and you’re pretty much just throwing away information by looking at winning percentages. [sent-13, score-0.902]

13 It’s hard to know more, though, given that we have no link to the article and I can’t find anything with that title on the web. [sent-14, score-0.124]
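
Sentence 12 says that runs scored and allowed carry more information than winning percentage. One well-known way to connect the two is Bill James's Pythagorean expectation, which estimates winning percentage from runs alone. The sketch below is only an illustration: the post does not say this is the relationship it has in mind, and the function name, the exponent default of 2, and the run totals are assumptions chosen for the example.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage from runs scored and runs allowed.

    Uses the classic form RS^k / (RS^k + RA^k) with k = 2 by default.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        # No information either way: treat as an even team.
        return 0.5
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical season totals, not taken from any real team.
estimate = pythagorean_win_pct(800, 700)
print(f"Estimated winning percentage: {estimate:.3f}")
```

Two teams with the same record can have very different run totals, which is one way to see why collapsing everything to winning percentage discards information.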


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('winning', 0.35), ('pavitt', 0.335), ('fielding', 0.258), ('pitching', 0.251), ('hitting', 0.232), ('mlb', 0.224), ('delaware', 0.192), ('runs', 0.177), ('release', 0.167), ('scored', 0.16), ('press', 0.157), ('offense', 0.155), ('percentages', 0.144), ('team', 0.14), ('records', 0.128), ('teams', 0.126), ('defense', 0.123), ('parts', 0.103), ('impact', 0.097), ('parlance', 0.096), ('percent', 0.093), ('hopes', 0.086), ('decomposition', 0.084), ('stolen', 0.084), ('bases', 0.082), ('accompanying', 0.082), ('intuitively', 0.079), ('defines', 0.075), ('university', 0.072), ('news', 0.072), ('league', 0.071), ('accounts', 0.071), ('claim', 0.071), ('empty', 0.069), ('remaining', 0.067), ('formula', 0.067), ('conceptual', 0.066), ('component', 0.066), ('attributed', 0.066), ('link', 0.065), ('greatly', 0.065), ('ultimate', 0.065), ('charles', 0.064), ('meaningful', 0.062), ('baseball', 0.061), ('analyzed', 0.061), ('throwing', 0.06), ('article', 0.059), ('sports', 0.058), ('build', 0.057)]
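
The page does not say how the simValue scores in the list below are derived from these weights. A minimal sketch, assuming cosine similarity between L2-normalized tf-idf vectors (a common choice, not confirmed by the source); the function names, the idf smoothing, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        # tf = relative frequency; idf = log(N / df) + 1 (assumed smoothing).
        vec = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Dot product of two normalized sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


# Toy documents standing in for blog posts.
docs = [
    ["winning", "pitching", "hitting"],
    ["fielding", "baseball", "hitting"],
    ["bayes", "prior"],
]
vecs = tfidf_vectors(docs)
for i, vec in enumerate(vecs):
    print(i, round(cosine(vecs[0], vec), 4))
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the 1.0000001 reported for the same-blog entry.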

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 942 andrew gelman stats-2011-10-04-45% hitting, 25% fielding, 25% pitching, and 100% not telling us how they did it

2 0.24538311 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

Introduction: Someone just stopped by and dropped off a copy of the book Wizardry: Baseball’s All-time Greatest Fielders Revealed, by Michael Humphreys. I don’t have much to say about the topic–I did see Brooks Robinson play, but I don’t remember any fancy plays. I must have seen Mark Belanger but I don’t really recall. Ozzie Smith was cool but I saw him only on TV. The most impressive thing I ever saw live was Rickey Henderson stealing a base. The best thing about that was that everyone was expecting him to steal the base, and he still was able to do it. But that wasn’t fielding either. Anyway, Humphreys was nice enough to give me a copy of his book, and since I can’t say much (I didn’t have it in me to study the formulas in detail, nor do I know enough to be able to evaluate them), I might as well say what I can say right away. (Note: Humphreys replies to some of these questions in a comment.) 1. Near the beginning, Humphreys says that 10 runs are worth about 1 win. I’ve always b

3 0.1349715 2301 andrew gelman stats-2014-04-22-Ticket to Baaaaarf

Introduction: A link from the comments here took me to the wonderfully named Barfblog and a report by Don Schaffner on some reporting. First, the background: A university in England issued a press release saying that “Food picked up just a few seconds after being dropped is less likely to contain bacteria than if it is left for longer periods of time . . . The findings suggest there may be some scientific basis to the ‘5 second rule’ – the urban myth about it being fine to eat food that has only had contact with the floor for five seconds or less. Although people have long followed the 5 second rule, until now it was unclear whether it actually helped.” According to the press release, the study was “undertaken by final year Biology students” and led by a professor of microbiology. The press release hit the big time, hitting NPR, Slate, Forbes, the Daily News, etc etc. Some typical headlines: “5-second rule backed up by science” — Atlanta Journal Constitution “Eating food off the floo

4 0.13121967 1113 andrew gelman stats-2012-01-11-Toshiro Kageyama on professionalism

Introduction: Following up on our discussion of professionalism (in which Jonathan Chait argued that “the definition of a professional career track” requires pay differentials and the chance to get fired, and I argued the opposite, that a lot of people go into professional careers specifically because of the job security), Austin Frakt pointed me to this description of professionalism from Go master Toshiro Kageyama. This in turn reminds me of a remark of Bill James when he explained lack of surprise that clutch hitting does not show up in the data. He wrote that the underlying idea of clutch hitting is that a player will play particularly well in an important situation where the game or the season is on the line. But, James pointed out, these guys are pros, and the true sign of a professional is that he can always stay concentrated. This argument applies particularly for hitting, maybe less so for pitching, where a pitcher can’t necessarily throw his hardest for 100 pitches in a game.

5 0.11265138 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball

Introduction: Dan Goldstein did an informal study asking people the following question: When two baseball teams play each other on two consecutive days, what is the probability that the winner of the first game will be the winner of the second game? You can make your own guess and then continue reading below. Dan writes: We asked two colleagues knowledgeable in baseball and the mathematics of forecasting. The answers came in between 65% and 70%. The true answer [based on Dan's analysis of a database of baseball games]: 51.3%, a little better than a coin toss. I have to say, I’m surprised his colleagues gave such extreme guesses. I was guessing something like 50%, myself, based on the following very crude reasoning: Suppose two unequal teams are playing, and the chance of team A beating team B is 55%. (This seems like a reasonable average of all matchups, which will include some more extreme disparities but also many more equal contests.) Then the chance of the same team

6 0.10392684 697 andrew gelman stats-2011-05-05-A statistician rereads Bill James

7 0.090944208 2215 andrew gelman stats-2014-02-17-The Washington Post reprints university press releases without editing them

8 0.090846755 652 andrew gelman stats-2011-04-07-Minor-league Stats Predict Major-league Performance, Sarah Palin, and Some Differences Between Baseball and Politics

9 0.086369611 173 andrew gelman stats-2010-07-31-Editing and clutch hitting

10 0.085147038 559 andrew gelman stats-2011-02-06-Bidding for the kickoff

11 0.08362148 1556 andrew gelman stats-2012-11-01-Recently in the sister blogs: special pre-election edition!

12 0.083364129 279 andrew gelman stats-2010-09-15-Electability and perception of electability

13 0.083330527 2124 andrew gelman stats-2013-12-05-Stan (quietly) passes 512 people on the users list

14 0.077922948 1139 andrew gelman stats-2012-01-26-Suggested resolution of the Bem paradox

15 0.077111155 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking

16 0.077014796 2226 andrew gelman stats-2014-02-26-Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable

17 0.076275922 541 andrew gelman stats-2011-01-27-Why can’t I be more like Bill James, or, The use of default and default-like models

18 0.075298116 473 andrew gelman stats-2010-12-17-Why a bonobo won’t play poker with you

19 0.070732832 1381 andrew gelman stats-2012-06-16-The Art of Fielding

20 0.069942497 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.103), (1, -0.042), (2, 0.01), (3, -0.009), (4, 0.0), (5, 0.003), (6, 0.012), (7, -0.024), (8, -0.015), (9, -0.002), (10, -0.005), (11, 0.001), (12, -0.013), (13, -0.015), (14, -0.035), (15, 0.05), (16, 0.022), (17, 0.021), (18, 0.026), (19, -0.019), (20, -0.04), (21, 0.028), (22, 0.002), (23, 0.027), (24, 0.025), (25, 0.018), (26, -0.033), (27, 0.015), (28, -0.03), (29, -0.095), (30, -0.011), (31, -0.009), (32, 0.05), (33, -0.006), (34, -0.021), (35, 0.047), (36, 0.044), (37, -0.001), (38, -0.006), (39, 0.028), (40, 0.068), (41, 0.017), (42, -0.007), (43, -0.022), (44, 0.012), (45, 0.052), (46, -0.032), (47, 0.027), (48, -0.067), (49, -0.012)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96147859 942 andrew gelman stats-2011-10-04-45% hitting, 25% fielding, 25% pitching, and 100% not telling us how they did it

2 0.78743762 1113 andrew gelman stats-2012-01-11-Toshiro Kageyama on professionalism

Introduction: Following up on our discussion of professionalism (in which Jonathan Chait argued that “the definition of a professional career track” requires pay differentials and the chance to get fired, and I argued the opposite, that a lot of people go into professional careers specifically because of the job security), Austin Frakt pointed me to this description of professionalism from Go master Toshiro Kageyama. This in turn reminds me of a remark of Bill James when he explained lack of surprise that clutch hitting does not show up in the data. He wrote that the underlying idea of clutch hitting is that a player will play particularly well in an important situation where the game or the season is on the line. But, James pointed out, these guys are pros, and the true sign of a professional is that he can always stay concentrated. This argument applies particularly for hitting, maybe less so for pitching, where a pitcher can’t necessarily throw his hardest for 100 pitches in a game.

3 0.74926984 1419 andrew gelman stats-2012-07-17-“Faith means belief in something concerning which doubt is theoretically possible.” — William James

Introduction: Eric Tassone writes: Probably not blog-worthy/blog-appropriate, but have you heard Bill James discussing the Sandusky & Paterno stuff? I think you discussed once his stance on the Dowd Report, and this seems to be from the same part of his personality—which goes beyond contrarian . . . I have in fact blogged on James (many times) and on Paterno, so yes I think this is blogworthy. On the other hand, most readers of this blog probably don’t care about baseball, football, or William James, so I’ll put the rest below the fold. What is legendary baseball statistician Bill James doing, defending the crime-coverups of legendary coach Joe Paterno? As I wrote in my earlier blog on Paterno, it isn’t always easy to do the right thing, and I have no idea if I’d behave any better if I were in such a situation. The characteristics of a good coach do not necessarily provide what it takes to make good decisions off the field. In this sense even more of the blame should go

4 0.70867288 642 andrew gelman stats-2011-04-02-Bill James and the base-rate fallacy

Introduction: I was recently rereading and enjoying Bill James’s Historical Baseball Abstract (the second edition, from 2001). But even the Master is not perfect. Here he is, in the context of the all-time 20th-greatest shortstop (in his reckoning): Are athletes special people? In general, no, but occasionally, yes. Johnny Pesky at 75 was trim, youthful, optimistic, and practically exploding with energy. You rarely meet anybody like that who isn’t an ex-athlete–and that makes athletes seem special. [italics in the original] Hey, I’ve met 75-year-olds like that–and none of them are ex-athletes! That’s probably because I don’t know a lot of ex-athletes. But Bill James . . . he knows a lot of athletes. He went to the bathroom with Tim Raines once! The most I can say is that I saw Rickey Henderson steal a couple bases when he was playing against the Orioles once. Cognitive psychologists talk about the base-rate fallacy, which is the mistake of estimating probabilities without accou

5 0.70601237 445 andrew gelman stats-2010-12-03-Getting a job in pro sports… as a statistician

Introduction: Posted at MediaBistro: The Harvard Sports Analysis Collective are the group that tackles problems such as “Who wrote this column: Bill Simmons, Rick Reilly, or Kevin Whitlock?” and “Should a football team give up free touchdowns?” It’s all fun and games, until the students land jobs with major teams. According to the Harvard Crimson, sophomore John Ezekowitz and junior Jason Rosenfeld scored gigs with the Phoenix Suns and the Shanghai Sharks, respectively, in part based on their work for HSAC. It’s perhaps not a huge surprise that the Sharks would be interested in taking advantage of every available statistic. They are owned by Yao Ming, who plays for the Houston Rockets. The Rockets, in turn, employ general manager Daryl Morey who Simmons nicknamed “Dork Elvis” for his ahead of the curve analysis. (See Michael Lewis’s The No Stats All-Star for an example.) But still, it’s very cool to see the pair get an opportunity to change the game.

6 0.68026936 697 andrew gelman stats-2011-05-05-A statistician rereads Bill James

7 0.67555964 473 andrew gelman stats-2010-12-17-Why a bonobo won’t play poker with you

8 0.67146105 509 andrew gelman stats-2011-01-09-Chartjunk, but in a good cause!

9 0.66947794 1115 andrew gelman stats-2012-01-12-Where are the larger-than-life athletes?

10 0.66293126 173 andrew gelman stats-2010-07-31-Editing and clutch hitting

11 0.66252804 367 andrew gelman stats-2010-10-25-In today’s economy, the rich get richer

12 0.65804362 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

13 0.6460796 541 andrew gelman stats-2011-01-27-Why can’t I be more like Bill James, or, The use of default and default-like models

14 0.64353687 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball

15 0.63998401 440 andrew gelman stats-2010-12-01-In defense of jargon

16 0.61935592 802 andrew gelman stats-2011-07-13-Super Sam Fuld Needs Your Help (with Foul Ball stats)

17 0.61514026 652 andrew gelman stats-2011-04-07-Minor-league Stats Predict Major-league Performance, Sarah Palin, and Some Differences Between Baseball and Politics

18 0.61049706 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!

19 0.60555446 2262 andrew gelman stats-2014-03-23-Win probabilities during a sporting event

20 0.5994963 559 andrew gelman stats-2011-02-06-Bidding for the kickoff


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.06), (16, 0.079), (21, 0.013), (24, 0.096), (27, 0.058), (35, 0.214), (63, 0.015), (69, 0.01), (73, 0.017), (79, 0.01), (82, 0.016), (86, 0.014), (89, 0.079), (99, 0.212)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91781646 942 andrew gelman stats-2011-10-04-45% hitting, 25% fielding, 25% pitching, and 100% not telling us how they did it

2 0.9047817 473 andrew gelman stats-2010-12-17-Why a bonobo won’t play poker with you

Introduction: ScienceDaily has posted an article titled Apes Unwilling to Gamble When Odds Are Uncertain: The apes readily distinguished between the different probabilities of winning: they gambled a lot when there was a 100 percent chance, less when there was a 50 percent chance, and only rarely when there was no chance. In some trials, however, the experimenter didn’t remove a lid from the bowl, so the apes couldn’t assess the likelihood of winning a banana. The odds from the covered bowl were identical to those from the risky option: a 50 percent chance of getting the much sought-after banana. But apes of both species were less likely to choose this ambiguous option. Like humans, they showed “ambiguity aversion” — preferring to gamble more when they knew the odds than when they didn’t. Given some of the other differences between chimps and bonobos, Hare and Rosati had expected to find the bonobos to be more averse to ambiguity, but that didn’t turn out to be the case. Thanks to Sta

3 0.87533736 881 andrew gelman stats-2011-08-30-Rickey Henderson and Peter Angelos, together again

Introduction: Today I was reminded of a riddle from junior high: Q: What do you get when you cross an elephant with peanut butter? A: Peanut butter that never forgets, or an elephant that sticks to the roof of your mouth. The occasion was a link from Tyler Cowen to a new book by Garry Kasparov and . . . Peter Thiel. Kasparov we all know about. I still remember how he pulled out a victory in the last game of his tournament with Karpov. Just amazing: he had to win the game, a draw would not be enough. Both players knew that Kasparov had to win. And he did it. A feat as impressive as Kirk Gibson’s off-the-bench game-winning home run in the 1987 Series. Peter Thiel is a more obscure figure. He’s been featured a couple of times on this blog and comes across as your typical overconfident rich dude. It’s an odd combination, sort of like what you might get if Rickey Henderson and Peter Angelos were to write a book about how to reform baseball. Cowen writes, “How can I not pre-orde

4 0.86854774 837 andrew gelman stats-2011-08-04-Is it rational to vote?

Introduction: Hear me interviewed on the topic here. P.S. The interview was fine but I don’t agree with everything on the linked website. For example, this bit: Global warming is not the first case of a widespread fear based on incomplete knowledge turned out to be false or at least greatly exaggerated. Global warming has many of the characteristics of a popular delusion, an irrational fear or cause that is embraced by millions of people because, well, it is believed by millions of people! All right, then.

5 0.83318859 1443 andrew gelman stats-2012-08-04-Bayesian Learning via Stochastic Gradient Langevin Dynamics

Introduction: Burak Bayramli writes: In this paper by Sungjin Ahn, Anoop Korattikara, and Max Welling and this paper by Welling and Yee Whye Teh, there are some arguments on big data and the use of MCMC. Both papers have suggested improvements to speed up MCMC computations. I was wondering what your thoughts were, especially on this paragraph: When a dataset has a billion data-cases (as is not uncommon these days) MCMC algorithms will not even have generated a single (burn-in) sample when a clever learning algorithm based on stochastic gradients may already be making fairly good predictions. In fact, the intriguing results of Bottou and Bousquet (2008) seem to indicate that in terms of “number of bits learned per unit of computation”, an algorithm as simple as stochastic gradient descent is almost optimally efficient. We therefore argue that for Bayesian methods to remain useful in an age when the datasets grow at an exponential rate, they need to embrace the ideas of the stochastic optimiz

6 0.82666481 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking

7 0.81640536 895 andrew gelman stats-2011-09-08-How to solve the Post Office’s problems?

8 0.81390488 591 andrew gelman stats-2011-02-25-Quantitative Methods in the Social Sciences M.A.: Innovative, interdisciplinary social science research program for a data-rich world

9 0.80006349 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.

10 0.7895962 2253 andrew gelman stats-2014-03-17-On deck this week: Revisitings

11 0.78929073 1926 andrew gelman stats-2013-07-05-More plain old everyday Bayesianism

12 0.78715682 296 andrew gelman stats-2010-09-26-A simple semigraphic display

13 0.78160453 80 andrew gelman stats-2010-06-11-Free online course in multilevel modeling

14 0.76719642 392 andrew gelman stats-2010-11-03-Taleb + 3.5 years

15 0.766047 566 andrew gelman stats-2011-02-09-The boxer, the wrestler, and the coin flip, again

16 0.76301795 1264 andrew gelman stats-2012-04-14-Learning from failure

17 0.75933397 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

18 0.75775671 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

19 0.74768722 1130 andrew gelman stats-2012-01-20-Prior beliefs about locations of decision boundaries

20 0.74233341 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?