andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-534 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: John Cook noticed something: I [Cook] was looking at the preface of an old statistics book and read this: The Bayesian techniques occur at the end of each chapter; therefore they can be omitted if time does not permit their inclusion. This approach is typical. Many textbooks present frequentist statistics with a little Bayesian statistics at the end of each section or at the end of the book. There are a couple ways to look at that. One is simply that Bayesian methods are optional. They must not be that important or they’d get more space. The author even recommends dropping them if pressed for time. Another way to look at this is that Bayesian statistics must be simpler than frequentist statistics since the Bayesian approach to each task requires fewer pages. My reaction: Classical statistics is all about summarizing the data. Bayesian statistics is data + prior information. On those grounds alone, Bayes is more complicated, and it makes sense to do classical statistics first.
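To make the data-versus-data-plus-prior contrast concrete, here is a minimal illustrative sketch (mine, not from the post): a classical summary of a small sample next to a conjugate normal-normal Bayesian analysis of the same numbers. The sample values, the prior mean mu0, and the prior scale tau are all made up, and the known-variance normal model is simply the easiest setting in which the difference shows up.

```python
import math

# Made-up data: a small sample whose mean we want to learn.
y = [2.1, 1.8, 2.6, 2.2, 1.9, 2.4]
n = len(y)

# --- Classical summary: nothing but the data ---
ybar = sum(y) / n
s2 = sum((yi - ybar) ** 2 for yi in y) / (n - 1)   # sample variance
se = math.sqrt(s2 / n)                             # standard error of the mean
ci = (ybar - 1.96 * se, ybar + 1.96 * se)          # approximate 95% interval

# --- Bayesian summary: the same data plus prior information ---
# Assumed prior (purely illustrative): mu ~ Normal(mu0, tau^2),
# with the data variance treated as known and equal to s2.
mu0, tau = 0.0, 1.0
prior_prec = 1.0 / tau ** 2
data_prec = n / s2
post_prec = prior_prec + data_prec
post_mean = (prior_prec * mu0 + data_prec * ybar) / post_prec  # precision-weighted average
post_sd = math.sqrt(1.0 / post_prec)

print(f"classical: estimate {ybar:.2f}, se {se:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"bayesian:  posterior mean {post_mean:.2f}, posterior sd {post_sd:.2f}")
```

The Bayesian answer is a precision-weighted compromise between the sample mean and the prior mean, which is exactly the extra ingredient the classical summary does not use.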
sentIndex sentText sentNum sentScore
1 John Cook noticed something : I [Cook] was looking at the preface of an old statistics book and read this: The Bayesian techniques occur at the end of each chapter; therefore they can be omitted if time does not permit their inclusion. [sent-1, score-1.676]
2 Many textbooks present frequentist statistics with a little Bayesian statistics at the end of each section or at the end of the book. [sent-3, score-1.69]
3 They must not be that important or they’d get more space. [sent-6, score-0.14]
4 The author even recommends dropping them if pressed for time. [sent-7, score-0.593]
5 Another way to look at this is that Bayesian statistics must be simpler than frequentist statistics since the Bayesian approach to each task requires fewer pages. [sent-8, score-1.809]
6 My reaction: Classical statistics is all about summarizing the data. [sent-9, score-0.477]
7 Bayesian statistics is data + prior information. [sent-10, score-0.413]
8 On those grounds alone, Bayes is more complicated, and it makes sense to do classical statistics first. [sent-11, score-0.667]
9 , but estimates, standard errors, and confidence intervals for sure. [sent-13, score-0.271]
wordName wordTfidf (topN-words)
[('statistics', 0.343), ('bayesian', 0.32), ('cook', 0.262), ('frequentist', 0.231), ('end', 0.209), ('pressed', 0.205), ('permit', 0.193), ('classical', 0.19), ('preface', 0.185), ('omitted', 0.169), ('recommends', 0.153), ('dropping', 0.147), ('must', 0.14), ('grounds', 0.134), ('summarizing', 0.134), ('approach', 0.131), ('simpler', 0.124), ('occur', 0.122), ('task', 0.121), ('textbooks', 0.121), ('alone', 0.115), ('fewer', 0.113), ('techniques', 0.112), ('intervals', 0.109), ('look', 0.106), ('therefore', 0.103), ('complicated', 0.103), ('reaction', 0.101), ('requires', 0.099), ('confidence', 0.098), ('necessarily', 0.096), ('noticed', 0.096), ('bayes', 0.09), ('section', 0.09), ('author', 0.088), ('chapter', 0.087), ('present', 0.084), ('errors', 0.082), ('old', 0.078), ('simply', 0.073), ('ways', 0.073), ('john', 0.073), ('estimates', 0.071), ('couple', 0.07), ('prior', 0.07), ('looking', 0.066), ('standard', 0.064), ('little', 0.06), ('methods', 0.06), ('since', 0.058)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 534 andrew gelman stats-2011-01-24-Bayes at the end
2 0.23697183 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo
Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2″ to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex
3 0.20537557 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
Introduction: Robert Bell pointed me to this post by Brad De Long on Bayesian statistics, and then I also noticed this from Noah Smith, who wrote: My impression is that although the Bayesian/Frequentist debate is interesting and intellectually fun, there’s really not much “there” there… despite being so-hip-right-now, Bayesian is not the Statistical Jesus. I’m happy to see the discussion going in this direction. Twenty-five years ago or so, when I got into this biz, there were some serious anti-Bayesian attitudes floating around in mainstream statistics. Discussions in the journals sometimes devolved into debates of the form, “Bayesians: knaves or fools?”. You’d get all sorts of free-floating skepticism about any prior distribution at all, even while people were accepting without question (and doing theory on) logistic regressions, proportional hazards models, and all sorts of strong strong models. (In the subfield of survey sampling, various prominent researchers would refuse to mode
4 0.19305407 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics
Introduction: Yes, checking calibration of probability forecasts is part of Bayesian statistics. At the end of this post are three figures from Chapter 1 of Bayesian Data Analysis illustrating empirical evaluation of forecasts. But first the background. Why am I bringing this up now? It’s because of something Larry Wasserman wrote the other day : One of the striking facts about [baseball/political forecaster Nate Silver's recent] book is the emphasis the Silver places on frequency calibration. . . . Have no doubt about it: Nate Silver is a frequentist. For example, he says: One of the most important tests of a forecast — I would argue that it is the single most important one — is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. I had some discussion with Larry in the comments section of h
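The calibration check described here is simple enough to state as code. Below is a minimal sketch (mine, not from the post or from Bayesian Data Analysis): group forecasts by their stated probability and compare that probability with the observed frequency of the event. The forecast-outcome pairs are made up.

```python
from collections import defaultdict

# Made-up (stated probability, outcome) pairs, e.g. "40% chance of rain"
# paired with whether it actually rained (1) or not (0).
forecasts = [(0.4, 1), (0.4, 0), (0.4, 0), (0.4, 1), (0.4, 0),
             (0.8, 1), (0.8, 1), (0.8, 0), (0.8, 1), (0.8, 1)]

# Group outcomes by the stated probability; with real forecasts one would
# bin nearby values (say 0.35-0.45) rather than match them exactly.
bins = defaultdict(list)
for p, outcome in forecasts:
    bins[p].append(outcome)

for p in sorted(bins):
    outcomes = bins[p]
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {p:.0%}: observed frequency {observed:.0%} over {len(outcomes)} forecasts")
```

Well-calibrated forecasts are those for which the stated and observed frequencies agree, over the long run, in every bin.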
5 0.19037877 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)
Introduction: My article with Cosma Shalizi has appeared in the British Journal of Mathematical and Statistical Psychology. I’m so glad this paper has come out. I’d been thinking about writing such a paper for almost 20 years. What got me to actually do it was an invitation a few years ago to write a chapter on Bayesian statistics for a volume on the philosophy of social sciences. Once I started doing that, I realized I had enough for a journal article. I contacted Cosma because he, unlike me, was familiar with the post-1970 philosophy literature (my knowledge went only up to Popper, Kuhn, and Lakatos). We submitted it to a couple statistics journals that didn’t want it (for reasons that weren’t always clear ), but ultimately I think it ended up in the right place, as psychologists have been as serious as anyone in thinking about statistical foundations in recent years. Here’s the issue of the journal , which also includes an introduction, several discussions, and a rejoinder: Prior app
6 0.18310404 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon
7 0.18280861 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle
9 0.17380036 247 andrew gelman stats-2010-09-01-How does Bayes do it?
10 0.17320135 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
11 0.16106735 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
12 0.16053136 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?
13 0.15494636 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation
14 0.15161353 1868 andrew gelman stats-2013-05-23-Validation of Software for Bayesian Models Using Posterior Quantiles
15 0.15026166 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox
16 0.14512384 1469 andrew gelman stats-2012-08-25-Ways of knowing
17 0.14370038 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
18 0.1415067 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?
19 0.14040591 746 andrew gelman stats-2011-06-05-An unexpected benefit of Arrow’s other theorem
20 0.13879457 1151 andrew gelman stats-2012-02-03-Philosophy of Bayesian statistics: my reactions to Senn
topicId topicWeight
[(0, 0.189), (1, 0.159), (2, -0.138), (3, 0.072), (4, -0.145), (5, 0.051), (6, -0.064), (7, 0.157), (8, 0.028), (9, -0.09), (10, 0.026), (11, -0.08), (12, 0.065), (13, 0.042), (14, 0.089), (15, 0.055), (16, -0.065), (17, 0.063), (18, 0.023), (19, -0.066), (20, 0.059), (21, 0.134), (22, -0.018), (23, 0.01), (24, 0.041), (25, -0.033), (26, -0.046), (27, -0.033), (28, -0.026), (29, 0.006), (30, 0.036), (31, 0.026), (32, -0.002), (33, -0.025), (34, 0.03), (35, 0.037), (36, -0.022), (37, 0.069), (38, -0.027), (39, 0.004), (40, 0.004), (41, -0.023), (42, -0.032), (43, -0.008), (44, -0.009), (45, -0.026), (46, 0.022), (47, 0.034), (48, 0.045), (49, 0.025)]
simIndex simValue blogId blogTitle
same-blog 1 0.98665595 534 andrew gelman stats-2011-01-24-Bayes at the end
2 0.8491351 2000 andrew gelman stats-2013-08-28-Why during the 1950-1960′s did Jerry Cornfield become a Bayesian?
Introduction: Joel Greenhouse writes: I saw your recent paper on Feller [see here and, for a more fanciful theory, here]. Looks like it was fun to write. I recently wrote a paper that asks an orthogonal question to yours. Why during the 1950-1960′s did Jerry Cornfield become a Bayesian? It appeared in Statistics in Medicine – “On becoming a Bayesian: Early correspondences between J. Cornfield and L. J. Savage.” In his paper, Greenhouse writes: Jerome Cornfield was arguably the leading proponent for the use of Bayesian methods in biostatistics during the 1960s. Prior to 1963, however, Cornfield had no publications in the area of Bayesian statistics. At a time when frequentist methods were the dominant influence on statistical practice, Cornfield went against the mainstream and embraced Bayes. . . . Cornfield’s interest in Bayesian methods began prior to 1961 and that the clarity of his Bayesian outlook began to take shape following Birnbaum’s ASA paper on the likelihood principle
4 0.82400155 1259 andrew gelman stats-2012-04-11-How things sound to us, versus how they sound to others
Introduction: Hykel Hosni noticed this bit from the Lindley Prize page of the Society for Bayesian Analysis: Lindley became a great missionary for the Bayesian gospel. The atmosphere of the Bayesian revival is captured in a comment by Rivett on Lindley’s move to University College London and the premier chair of statistics in Britain: “it was as though a Jehovah’s Witness had been elected Pope.” From my perspective, this was amusing (if commonplace): a group of rationalists jocularly characterizing themselves as religious fanatics. And some of this is in response to intense opposition from outsiders (see the Background section here). That’s my view. I’m an insider, a statistician who’s heard all jokes about religious Bayesians, from Bayesian and non-Bayesian statisticians alike. But Hosni is an outsider, and here’s how he sees the above-quoted paragraph: Research, however, is not a matter of faith but a matter of arguments, which should always be evaluated with the utmost intellec
Introduction: Updated version of my paper with Xian: The missionary zeal of many Bayesians of old has been matched, in the other direction, by an attitude among some theoreticians that Bayesian methods are absurd—not merely misguided but obviously wrong in principle. We consider several examples, beginning with Feller’s classic text on probability theory and continuing with more recent cases such as the perceived Bayesian nature of the so-called doomsday argument. We analyze in this note the intellectual background behind various misconceptions about Bayesian statistics, without aiming at a complete historical coverage of the reasons for this dismissal. I love this stuff.
6 0.7954728 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?
7 0.78420973 449 andrew gelman stats-2010-12-04-Generalized Method of Moments, whatever that is
8 0.78279114 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack
9 0.77523178 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?
10 0.77515233 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)
11 0.77509534 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation
12 0.76318765 83 andrew gelman stats-2010-06-13-Silly Sas lays out old-fashioned statistical thinking
13 0.76194477 1781 andrew gelman stats-2013-03-29-Another Feller theory
14 0.7586804 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?
15 0.75719637 1151 andrew gelman stats-2012-02-03-Philosophy of Bayesian statistics: my reactions to Senn
16 0.75630647 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials
17 0.75358111 1438 andrew gelman stats-2012-07-31-What is a Bayesian?
18 0.74602479 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks
19 0.74449933 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle
20 0.74023783 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
topicId topicWeight
[(15, 0.017), (16, 0.02), (21, 0.014), (24, 0.207), (29, 0.035), (55, 0.018), (56, 0.143), (86, 0.074), (99, 0.356)]
simIndex simValue blogId blogTitle
1 0.97498262 780 andrew gelman stats-2011-06-27-Bridges between deterministic and probabilistic models for binary data
Introduction: For the analysis of binary data, various deterministic models have been proposed, which are generally simpler to fit and easier to understand than probabilistic models. We claim that corresponding to any deterministic model is an implicit stochastic model in which the deterministic model fits imperfectly, with errors occurring at random. In the context of binary data, we consider a model in which the probability of error depends on the model prediction. We show how to fit this model using a stochastic modification of deterministic optimization schemes. The advantages of fitting the stochastic model explicitly (rather than implicitly, by simply fitting a deterministic model and accepting the occurrence of errors) include quantification of uncertainty in the deterministic model’s parameter estimates, better estimation of the true model error rate, and the ability to check the fit of the model nontrivially. We illustrate this with a simple theoretical example of item response data and w
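As a rough illustration of wrapping a deterministic classifier in an explicit error model, here is a toy sketch of my own, not the model in the paper: I assume the deterministic model emits 0/1 predictions and that errors occur at a rate that may differ according to the prediction, then estimate the two error rates by maximum likelihood (which here is just counting). The predictions and outcomes are made up.

```python
import math

# Toy error model (illustrative, not the paper's): Pr(y != d) = eps0 when
# the deterministic prediction d is 0, and eps1 when d is 1.
preds    = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]   # deterministic predictions (made up)
observed = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]   # actual binary outcomes (made up)

def error_rate(d_value):
    """Maximum-likelihood estimate of the error rate for one prediction value."""
    pairs = [(d, y) for d, y in zip(preds, observed) if d == d_value]
    errors = sum(1 for d, y in pairs if d != y)
    return errors / len(pairs), len(pairs)

eps0, n0 = error_rate(0)
eps1, n1 = error_rate(1)
print(f"error rate when the model predicts 0: {eps0:.2f} (n={n0})")
print(f"error rate when the model predicts 1: {eps1:.2f} (n={n1})")

# Log likelihood of the stochastic version: a quantitative measure of fit
# that the bare deterministic model does not provide.
loglik = 0.0
for d, y in zip(preds, observed):
    eps = eps1 if d == 1 else eps0
    loglik += math.log(1 - eps) if d == y else math.log(eps)
print(f"log likelihood under the fitted error model: {loglik:.2f}")
```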
Introduction: The story starts in September, when psychology professor Fred Oswald wrote me: I [Oswald] wanted to point out this paper in Science (Ramirez & Beilock, 2010) examining how students’ emotional writing improves their test performance in high-pressure situations. Although replication is viewed as the hallmark of research, this paper replicates implausibly large d-values and correlations across studies, leading me to be more suspicious of the findings (not less, as is generally the case). He also pointed me to this paper: Experimental disclosure and its moderators: A meta-analysis. Frattaroli, Joanne Psychological Bulletin, Vol 132(6), Nov 2006, 823-865. Disclosing information, thoughts, and feelings about personal and meaningful topics (experimental disclosure) is purported to have various health and psychological consequences (e.g., J. W. Pennebaker, 1993). Although the results of 2 small meta-analyses (P. G. Frisina, J. C. Borod, & S. J. Lepore, 2004; J. M. Smyth
3 0.96634859 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals
Introduction: Another entry in the growing literature on systematic flaws in the scientific research literature. This time the bad tidings come from Marjan Bakker and Jelte Wicherts, who write : Around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers’ expectations. Their research also had a qualitative component: To obtain a better understanding of the origins of the errors made in the reporting of statistics, we contacted the authors of the articles with errors in the second study and asked them to send us the raw data. Regrettably, only 24% of the authors shared their data, despite our request
same-blog 4 0.96570736 534 andrew gelman stats-2011-01-24-Bayes at the end
5 0.96020925 984 andrew gelman stats-2011-11-01-David MacKay sez . . . 12??
Introduction: I’ve recently been reading David MacKay’s 2003 book, Information Theory, Inference, and Learning Algorithms. It’s great background for my Bayesian computation class because he has lots of pictures and detailed discussions of the algorithms. (Regular readers of this blog will not be surprised to hear that I hate all the Occam-factor stuff that MacKay talks about, but overall it’s a great book.) Anyway, I happened to notice the following bit, under the heading, “How many samples are needed?”: In many problems, we really only need about twelve independent samples from P(x). Imagine that x is an unknown vector such as the amount of corrosion present in each of 10 000 underground pipelines around Cambridge, and φ(x) is the total cost of repairing those pipelines. The distribution P(x) describes the probability of a state x given the tests that have been carried out on some pipelines and the assumptions about the physics of corrosion. The quantity Φ is the expected cost of the repa
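As I read this excerpt, the "about twelve samples" claim is the usual Monte Carlo arithmetic: with K independent draws the estimate of the expectation has standard deviation sigma/sqrt(K), and 1/sqrt(12) is roughly 0.29, so a dozen draws already pin the answer down to about a third of sigma. A quick check of that arithmetic by simulation (my own sketch; the Normal(10, 2) stand-in for φ(x) is made up, not MacKay's pipeline example):

```python
import math
import random

random.seed(1)

# Stand-in for phi(x) evaluated at a posterior draw x: purely for illustration,
# pretend it is Normal with mean 10 and standard deviation 2.
true_mean, sigma = 10.0, 2.0
K = 12  # number of independent samples, as in MacKay's "about twelve"

# Repeat the K-draw estimate many times to see how variable it is.
estimates = []
for _ in range(10_000):
    draws = [random.gauss(true_mean, sigma) for _ in range(K)]
    estimates.append(sum(draws) / K)

emp_sd = math.sqrt(sum((e - true_mean) ** 2 for e in estimates) / len(estimates))
print(f"theory:     sigma / sqrt(K) = {sigma / math.sqrt(K):.3f}")
print(f"simulation: sd of {K}-draw estimates = {emp_sd:.3f}")
```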
6 0.9507069 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences
7 0.94970727 1929 andrew gelman stats-2013-07-07-Stereotype threat!
9 0.94734168 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model
10 0.94560599 1011 andrew gelman stats-2011-11-15-World record running times vs. distance
11 0.94433135 14 andrew gelman stats-2010-05-01-Imputing count data
12 0.94347674 1158 andrew gelman stats-2012-02-07-The more likely it is to be X, the more likely it is to be Not X?
13 0.94288838 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
14 0.93946177 426 andrew gelman stats-2010-11-22-Postdoc opportunity here at Columbia — deadline soon!
15 0.93863046 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor
16 0.93681753 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)
18 0.93568623 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
19 0.93524098 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”
20 0.93510246 899 andrew gelman stats-2011-09-10-The statistical significance filter