andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1610 knowledge-graph by maker-knowledge-mining

1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics


meta info for this blog

Source: html

Introduction: Yes, checking calibration of probability forecasts is part of Bayesian statistics. At the end of this post are three figures from Chapter 1 of Bayesian Data Analysis illustrating empirical evaluation of forecasts. But first the background. Why am I bringing this up now? It’s because of something Larry Wasserman wrote the other day: One of the striking facts about [baseball/political forecaster Nate Silver's recent] book is the emphasis that Silver places on frequency calibration. . . . Have no doubt about it: Nate Silver is a frequentist. For example, he says: One of the most important tests of a forecast — I would argue that it is the single most important one — is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. I had some discussion with Larry in the comments section of his blog
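To make the rain-forecast check concrete, here is a minimal sketch of the frequency-calibration computation Silver describes: bin forecasts by stated probability and compare each bin's stated probability with the empirical frequency of the event. The function, binning scheme, and toy data below are illustrative assumptions, not code from the post or from Silver's book.

```python
# Minimal calibration check (illustrative sketch, not from the post):
# group forecasts into probability bins, then compare the average stated
# probability in each bin with the observed frequency of the event.
import numpy as np

def calibration_table(forecasts, outcomes, bins=10):
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (forecasts >= lo) & (forecasts < hi)
        if in_bin.any():
            rows.append((forecasts[in_bin].mean(),  # average stated probability
                         outcomes[in_bin].mean(),   # observed frequency
                         int(in_bin.sum())))        # number of forecasts in bin
    return rows

# Toy data: outcomes drawn consistent with the stated probabilities, so the
# 0.40 bin should show rain about 40 percent of the time.
rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)
y = rng.uniform(size=10_000) < p
for stated, observed, n in calibration_table(p, y):
    print(f"stated {stated:.2f}  observed {observed:.2f}  (n={n})")
```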


Summary: the most important sentences, generated by a tfidf model

sentIndex sentText sentNum sentScore

1 Yes, checking calibration of probability forecasts is part of Bayesian statistics. [sent-1, score-0.657]

2 It’s because of something Larry Wasserman wrote the other day: One of the striking facts about [baseball/political forecaster Nate Silver's recent] book is the emphasis that Silver places on frequency calibration. [sent-5, score-0.362]

3 If over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. [sent-12, score-0.309]

4 I had some discussion with Larry in the comments section of his blog and raised the following point: There is such a thing as Bayesian calibration of probability forecasts. [sent-13, score-0.493]

5 This isn’t the whole story (as always, calibration matters but so does precision). [sent-23, score-0.418]

6 The last time I took (or taught) a theoretical statistics course was almost thirty years ago, but I recall frequentist coverage to be defined with the expectation taken conditional on the value of the unknown parameters theta in the model. [sent-24, score-1.159]

7 The calibration Larry describes above (for another example, see here and scroll down) is unconditional on theta, thus Bayesian. [sent-25, score-0.638]

8 Just about any purely data-based calibration will be Bayesian, as we never know theta. [sent-28, score-0.418]

9 I don’t completely understand his reply, but I think he says that unconditional coverage calculations are frequentist also. [sent-30, score-0.784]

10 In that case, maybe we can divide up the coverage calculations as follows: Unconditional coverage (E(y. [sent-31, score-0.69]

11 (For both modes of inference, unconditional coverage will occur if all the assumptions are true. [sent-34, score-0.659]

12 They’d rather develop methods with good average coverage properties under minimal assumptions. [sent-47, score-0.363]

13 When it comes to frequency evaluation, the point is that Bayesian inference is supposed to be calibrated conditional on any aspect of the data. [sent-50, score-0.577]

14 To return to the title of this post, yes, checking calibration of probability forecasts is part of Bayesian statistics. [sent-51, score-0.657]

15 We have two examples of this calibration in the very first chapter of Bayesian Data Analysis. [sent-52, score-0.518]

16 I think it’s fair enough to agree with Larry that these are frequency calculations. [sent-56, score-0.26]

17 But they are Bayesian frequency calculations by virtue of being conditional on data, not on unknown parameters. [sent-57, score-0.676]

18 I think this should make Larry happy, that frequency evaluation (albeit conditional on y, not theta) is central to modern Bayesian statistics. [sent-59, score-0.605]

19 I think Nate’s doing them in the Bayesian way but I’ll accept Larry’s statement that Nate and I and other applied Bayesians are frequentists too (of the sort that perform our frequency evaluations conditional on observed data rather than unknown parameters). [sent-62, score-0.811]

20 And I do see the conceptual (and, at times, practical) appeal of frequentist methods that allow fewer probability statements but make correspondingly fewer assumptions, even if I don’t usually go that way myself. [sent-63, score-0.491]
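The distinction drawn in sentences 6-11 above can be written out explicitly. The notation here is mine, a sketch rather than a quote from the post: let C(y) be a 90% interval computed from data y, with the unknown parameter theta drawn from a prior p(theta).

```latex
% Frequentist coverage conditions on the unknown parameter:
%   Pr{ theta in C(y) | theta } = 0.90   for every fixed theta.
% Unconditional (Bayesian) calibration averages over theta as well:
\[
\Pr\{\theta \in C(y)\}
  = \int \Pr\{\theta \in C(y) \mid \theta\}\, p(\theta)\, d\theta
  = 0.90 .
\]
```

If the conditional statement holds for every theta, the unconditional one follows for any prior, which is one way to read sentence 11's remark that under either mode of inference unconditional coverage will occur if all the assumptions are true. And since theta is never observed, a purely data-based check (sentence 8) can only ever target the unconditional version.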
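Sentence 13's stronger claim, that Bayesian inference should be calibrated conditional on any aspect of the data, can be checked by simulation. The sketch below uses a toy normal-normal model of my own choosing (theta ~ N(0,1), y | theta ~ N(theta,1), hence theta | y ~ N(y/2, 1/2)); it is an illustration, not the BDA examples mentioned above.

```python
# Check 90% posterior-interval coverage overall and within slices of the data.
# Toy normal-normal model (illustrative assumption, not from the post).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 100_000
theta = rng.normal(0.0, 1.0, size=n)   # parameters drawn from the true prior
y = rng.normal(theta, 1.0)             # data given each theta

post_mean, post_sd = y / 2.0, np.sqrt(0.5)   # exact normal-normal posterior
z = norm.ppf(0.95)                           # half-width of central 90% interval
covered = np.abs(theta - post_mean) <= z * post_sd

print(f"overall coverage: {covered.mean():.3f}")              # ~0.90
for lo, hi in [(-np.inf, -1.0), (-1.0, 1.0), (1.0, np.inf)]:  # any data slice
    sel = (y > lo) & (y <= hi)
    print(f"coverage given y in ({lo}, {hi}]: {covered[sel].mean():.3f}")
```

Because the model here is exactly correct, coverage comes out near 0.90 within every slice of y; under a misspecified prior, it is these data-conditional checks where the miscalibration would show up.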


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('calibration', 0.418), ('bayesian', 0.307), ('coverage', 0.294), ('larry', 0.266), ('frequency', 0.26), ('nate', 0.224), ('theta', 0.216), ('conditional', 0.194), ('unconditional', 0.177), ('frequentist', 0.167), ('rain', 0.146), ('unknown', 0.12), ('silver', 0.115), ('forecasts', 0.109), ('frequentists', 0.103), ('calculations', 0.102), ('chapter', 0.1), ('evaluation', 0.098), ('evaluations', 0.086), ('assumptions', 0.077), ('expectation', 0.077), ('inference', 0.075), ('probability', 0.075), ('occur', 0.07), ('methods', 0.069), ('bayesians', 0.067), ('fewer', 0.065), ('important', 0.06), ('justifiably', 0.059), ('calibrations', 0.056), ('forecaster', 0.056), ('checking', 0.055), ('empirical', 0.054), ('percent', 0.054), ('central', 0.053), ('illustrating', 0.05), ('correspondingly', 0.05), ('data', 0.048), ('calibrated', 0.048), ('parameters', 0.048), ('wary', 0.047), ('book', 0.046), ('albeit', 0.045), ('says', 0.044), ('thirty', 0.043), ('scroll', 0.043), ('tens', 0.042), ('wasserman', 0.042), ('modes', 0.041), ('responsibility', 0.04)]
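The word weights above, and the simValue rankings below (and, analogously, the lsi and lda lists further down), come from vector-space representations of the posts. The actual mining pipeline is not shown in this document; the following is an illustrative reconstruction of tfidf scoring and cosine similarity, with scikit-learn usage and the toy documents as assumptions of mine.

```python
# Illustrative sketch of tf-idf word weights and document similarity,
# analogous to the wordTfidf list above and the simValue rankings below.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "checking calibration of probability forecasts is part of Bayesian statistics",
    "some thoughts on election forecasting and polls",
    "philosophy of Bayes and non-Bayes",
]
vec = TfidfVectorizer()
X = vec.fit_transform(docs)          # rows: documents, columns: tf-idf weights

# Top-weighted words for the first document (cf. the topN-words list above).
row = X[0].toarray().ravel()
words = vec.get_feature_names_out()
print(sorted(zip(words, row), key=lambda t: -t[1])[:5])

# Similarity of every document to the first (cf. simValue, where the
# same-blog entry scores ~1.0 against itself).
print(cosine_similarity(X[0], X).ravel())
```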

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics


2 0.26843494 391 andrew gelman stats-2010-11-03-Some thoughts on election forecasting

Introduction: I’ve written a lot on polls and elections (“a poll is a snapshot, not a forecast,” etc., or see here for a more technical paper with Kari Lock) but had a few things to add in light of Sam Wang’s recent efforts. As a biologist with a physics degree, Wang brings an outsider’s perspective to political forecasting, which can be a good thing. (I’m a bit of an outsider to political science myself, as is my sometime collaborator Nate Silver, who’s done a lot of good work in the past few years.) But there are two places where Wang misses the point, I think. He refers to his method as a “transparent, low-assumption calculation” and compares it favorably to “fancy modeling” and “assumption-laden models.” Assumptions are a bad thing, right? Well, no, I don’t think so. Bad assumptions are a bad thing. Good assumptions are just fine. Similarly for fancy modeling. I don’t see why a model should get credit for not including a factor that might be important. Let me clarify. I

3 0.2618759 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2” to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex

4 0.24512257 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

Introduction: Bayesian inference, conditional on the model and data, conforms to the likelihood principle. But there is more to Bayesian methods than Bayesian inference. See chapters 6 and 7 of Bayesian Data Analysis for much discussion of this point. It saddens me to see that people are still confused on this issue.

5 0.23884897 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

Introduction: Unfortunately, when we deal with scientists, statisticians are often put in a setting reminiscent of Arrow’s paradox, where we are asked to provide estimates that are informative and unbiased and confidence statements that are correct conditional on the data and also on the underlying true parameter. [It's not generally possible for an estimate to do all these things at the same time -- ed.] Larry Wasserman feels that scientists are truly frequentist, and Don Rubin has told me how he feels that scientists interpret all statistical estimates Bayesianly. I have no doubt that both Larry and Don are correct. Voters want lower taxes and more services, and scientists want both Bayesian and frequency coverage; as the saying goes, everybody wants to go to heaven but nobody wants to die.

6 0.23678666 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

7 0.23525733 1868 andrew gelman stats-2013-05-23-Validation of Software for Bayesian Models Using Posterior Quantiles

8 0.22002678 1898 andrew gelman stats-2013-06-14-Progress! (on the understanding of the role of randomization in Bayesian inference)

9 0.21691212 1560 andrew gelman stats-2012-11-03-Statistical methods that work in some settings but not others

10 0.21231055 1634 andrew gelman stats-2012-12-21-Two reviews of Nate Silver’s new book, from Kaiser Fung and Cathy O’Neil

11 0.20450707 317 andrew gelman stats-2010-10-04-Rob Kass on statistical pragmatism, and my reactions

12 0.20041789 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

13 0.19689071 1389 andrew gelman stats-2012-06-23-Larry Wasserman’s statistics blog

14 0.19328116 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon

15 0.19305407 534 andrew gelman stats-2011-01-24-Bayes at the end

16 0.1912387 1469 andrew gelman stats-2012-08-25-Ways of knowing

17 0.18308637 961 andrew gelman stats-2011-10-16-The “Washington read” and the algebra of conditional distributions

18 0.18306805 899 andrew gelman stats-2011-09-10-The statistical significance filter

19 0.17955868 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

20 0.1787506 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.209), (1, 0.171), (2, -0.089), (3, 0.1), (4, -0.174), (5, 0.013), (6, -0.06), (7, 0.102), (8, 0.062), (9, -0.163), (10, 0.004), (11, -0.041), (12, 0.022), (13, 0.003), (14, 0.025), (15, 0.05), (16, 0.022), (17, 0.045), (18, -0.001), (19, 0.007), (20, 0.024), (21, 0.149), (22, -0.0), (23, 0.056), (24, 0.071), (25, -0.009), (26, -0.003), (27, 0.025), (28, 0.024), (29, 0.064), (30, -0.029), (31, 0.042), (32, -0.021), (33, -0.034), (34, 0.043), (35, 0.033), (36, -0.018), (37, 0.011), (38, -0.007), (39, -0.019), (40, 0.038), (41, 0.075), (42, -0.057), (43, -0.066), (44, -0.07), (45, -0.021), (46, 0.086), (47, 0.068), (48, -0.046), (49, 0.035)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97090489 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics


2 0.8128913 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

Introduction: Deborah Mayo recommended that I consider coming up with a new name for the statistical methods that I used, given that the term “Bayesian” has all sorts of associations that I dislike (as discussed, for example, in section 1 of this article). I replied that I agree on Bayesian, I never liked the term and always wanted something better, but I couldn’t think of any convenient alternative. Also, I was finding that Bayesians (even the Bayesians I disagreed with) were reading my research articles, while non-Bayesians were simply ignoring them. So I thought it was best to identify with, and communicate with, those people who were willing to engage with me. More formally, I’m happy defining “Bayesian” as “using inference from the posterior distribution, p(theta|y)”. This says nothing about where the probability distributions come from (thus, no requirement to be “subjective” or “objective”) and it says nothing about the models (thus, no requirement to use the discrete models that hav

3 0.78068048 449 andrew gelman stats-2010-12-04-Generalized Method of Moments, whatever that is

Introduction: Xuequn Hu writes: I am an econ doctoral student, trying to do some empirical work using Bayesian methods. Recently I read a paper (and its discussion) that pitches Bayesian methods against GMM (Generalized Method of Moments), which is quite popular in econometrics for frequentists. I am wondering if you can, here or on your blog, give some insights about these two methods, from the perspective of a Bayesian statistician. I know GMM does not conform to likelihood principle, but Bayesians are often charged with strong distribution assumptions. I can’t actually help on this, since I don’t know what GMM is. My guess is that, like other methods that don’t explicitly use prior estimation, this method will work well if sufficient information is included as data. Which would imply a hierarchical structure.

4 0.77932715 1262 andrew gelman stats-2012-04-12-“Not only defended but also applied”: The perceived absurdity of Bayesian inference

Introduction: Updated version of my paper with Xian: The missionary zeal of many Bayesians of old has been matched, in the other direction, by an attitude among some theoreticians that Bayesian methods are absurd—not merely misguided but obviously wrong in principle. We consider several examples, beginning with Feller’s classic text on probability theory and continuing with more recent cases such as the perceived Bayesian nature of the so-called doomsday argument. We analyze in this note the intellectual background behind various misconceptions about Bayesian statistics, without aiming at a complete historical coverage of the reasons for this dismissal. I love this stuff.

5 0.77815545 1898 andrew gelman stats-2013-06-14-Progress! (on the understanding of the role of randomization in Bayesian inference)

Introduction: Leading theoretical statistician Larry Wasserman in 2008: Some of the greatest contributions of statistics to science involve adding additional randomness and leveraging that randomness. Examples are randomized experiments, permutation tests, cross-validation and data-splitting. These are unabashedly frequentist ideas and, while one can strain to fit them into a Bayesian framework, they don’t really have a place in Bayesian inference. The fact that Bayesian methods do not naturally accommodate such a powerful set of statistical ideas seems like a serious deficiency. To which I responded on the second-to-last paragraph of page 8 here. Larry Wasserman in 2013: Some people say that there is no role for randomization in Bayesian inference. In other words, the randomization mechanism plays no role in Bayes’ theorem. But this is not really true. Without randomization, we can indeed derive a posterior for theta but it is highly sensitive to the prior. This is just a restat

6 0.7766881 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack

7 0.77404362 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

8 0.77342921 1469 andrew gelman stats-2012-08-25-Ways of knowing

9 0.77031815 1560 andrew gelman stats-2012-11-03-Statistical methods that work in some settings but not others

10 0.76945287 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

11 0.7665785 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

12 0.764902 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”

13 0.75982559 534 andrew gelman stats-2011-01-24-Bayes at the end

14 0.75826895 2000 andrew gelman stats-2013-08-28-Why during the 1950-1960's did Jerry Cornfield become a Bayesian?

15 0.7568329 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation

16 0.75288159 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?

17 0.74754232 1781 andrew gelman stats-2013-03-29-Another Feller theory

18 0.74029732 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox

19 0.73890644 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

20 0.73623747 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.021), (5, 0.064), (8, 0.022), (11, 0.06), (15, 0.057), (16, 0.078), (20, 0.012), (24, 0.167), (36, 0.016), (42, 0.013), (68, 0.01), (76, 0.015), (85, 0.018), (86, 0.021), (95, 0.016), (99, 0.285)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96777511 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics


2 0.9631018 1713 andrew gelman stats-2013-02-08-P-values and statistical practice

Introduction: From my new article in the journal Epidemiology: Sander Greenland and Charles Poole accept that P values are here to stay but recognize that some of their most common interpretations have problems. The casual view of the P value as posterior probability of the truth of the null hypothesis is false and not even close to valid under any reasonable model, yet this misunderstanding persists even in high-stakes settings (as discussed, for example, by Greenland in 2011). The formal view of the P value as a probability conditional on the null is mathematically correct but typically irrelevant to research goals (hence, the popularity of alternative—if wrong—interpretations). A Bayesian interpretation based on a spike-and-slab model makes little sense in applied contexts in epidemiology, political science, and other fields in which true effects are typically nonzero and bounded (thus violating both the “spike” and the “slab” parts of the model). I find Greenland and Poole’s perspective t

3 0.95889068 391 andrew gelman stats-2010-11-03-Some thoughts on election forecasting

Introduction: I’ve written a lot on polls and elections (“a poll is a snapshot, not a forecast,” etc., or see here for a more technical paper with Kari Lock) but had a few things to add in light of Sam Wang’s recent efforts. As a biologist with a physics degree, Wang brings an outsider’s perspective to political forecasting, which can be a good thing. (I’m a bit of an outsider to political science myself, as is my sometime collaborator Nate Silver, who’s done a lot of good work in the past few years.) But there are two places where Wang misses the point, I think. He refers to his method as a “transparent, low-assumption calculation” and compares it favorably to “fancy modeling” and “assumption-laden models.” Assumptions are a bad thing, right? Well, no, I don’t think so. Bad assumptions are a bad thing. Good assumptions are just fine. Similarly for fancy modeling. I don’t see why a model should get credit for not including a factor that might be important. Let me clarify. I

4 0.95848668 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?

Introduction: Dave Choi writes: I’m building a course called “Exploring and visualizing data,” for Heinz college in Carnegie Mellon (public policy and information systems). Do you know any books that might be good for such a course? I’m hoping to get non-statisticians to appreciate the statistician’s point of view on this subject. I immediately thought of Bill Cleveland’s 1985 classic, The Elements of Graphing Data, but I wasn’t sure of what comes next. There are a lot of books on how to make graphics in R, but I’m not quite sure that’s the point. And I’m loath to recommend Tufte since it would be kinda scary if a student were to take all of his ideas too seriously. Any suggestions?

5 0.95788813 1387 andrew gelman stats-2012-06-21-Will Tiger Woods catch Jack Nicklaus? And a discussion of the virtues of using continuous data even if your goal is discrete prediction

Introduction: I know next to nothing about golf. My mini-golf scores typically approach the maximum of 7 per hole, and I’ve never actually played macro-golf. I did publish a paper on golf once (A Probability Model for Golf Putting, with Deb Nolan), but it’s not so rare for people to publish papers on topics they know nothing about. Those who can’t, research. But I certainly have the ability to post other people’s ideas. Charles Murray writes: I [Murray] am playing around with the likelihood of Tiger Woods breaking Nicklaus’s record in the Majors. I’ve already gone on record two years ago with the reason why he won’t, but now I’m looking at it from a non-psychological perspective. Given the history of the majors, how far above the average _for other great golfers_ does Tiger have to perform? Here’s the procedure I’ve been working on: 1. For all golfers who have won at least one major since 1934 (the year the Masters began), create 120 lines: one for each Major for each year f

6 0.95617843 1914 andrew gelman stats-2013-06-25-Is there too much coauthorship in economics (and science more generally)? Or too little?

7 0.95580065 1386 andrew gelman stats-2012-06-21-Belief in hell is associated with lower crime rates

8 0.9529742 1578 andrew gelman stats-2012-11-15-Outta control political incorrectness

9 0.9520123 167 andrew gelman stats-2010-07-27-Why don’t more medical discoveries become cures?

10 0.951877 2244 andrew gelman stats-2014-03-11-What if I were to stop publishing in journals?

11 0.95177495 2080 andrew gelman stats-2013-10-28-Writing for free

12 0.95110083 2055 andrew gelman stats-2013-10-08-A Bayesian approach for peer-review panels? and a speculation about Bruno Frey

13 0.95090127 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

14 0.95052981 1225 andrew gelman stats-2012-03-22-Procrastination as a positive productivity strategy

15 0.95036685 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

16 0.94999611 447 andrew gelman stats-2010-12-03-Reinventing the wheel, only more so.

17 0.94960523 2353 andrew gelman stats-2014-05-30-I posted this as a comment on a sociology blog

18 0.94959688 902 andrew gelman stats-2011-09-12-The importance of style in academic writing

19 0.94912159 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

20 0.9483875 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go