andrew gelman stats-2011-07-20-811: Kind of Bayesian
Source: html
Introduction: Astrophysicist Andrew Jaffe pointed me to this and discussion of my philosophy of statistics (which is, in turn, my rational reconstruction of the statistical practice of Bayesians such as Rubin and Jaynes). Jaffe’s summary is fair enough and I only disagree in a few points:

1. Jaffe writes: Subjective probability, at least the way it is actually used by practicing scientists, is a sort of “as-if” subjectivity — how would an agent reason if her beliefs were reflected in a certain set of probability distributions? This is why when I discuss probability I try to make the pedantic point that all probabilities are conditional, at least on some background prior information or context.

I agree, and my problem with the usual procedures used for Bayesian model comparison and Bayesian model averaging is not that these approaches are subjective but that the particular models being considered don’t make sense. I’m thinking of the sorts of models that say the truth is either A or B or C. As discussed in chapter 6 of BDA, I prefer continuous model expansion to discrete model averaging. Either way, we’re doing Bayesian inference conditional on a model; I’d just rather do it on a model that I like. There is some relevant statistical analysis here, I think, about how these different sorts of models perform under different real-world situations.

2. Jaffe writes that I view my philosophy as “Popperian rather than Kuhnian.” In my paper with Shalizi, we speak of our philosophy as containing elements of Popper, Kuhn, and Lakatos. In particular, we can make a Kuhnian identification of Bayesian inference within a model as “normal science” and model checking and replacement as “scientific revolution.” (From a Lakatosian perspective, I identify various responses to model checks as different forms of operations in a scientific research programme, ranging from exception-handling through modification of the protective belt of auxiliary hypotheses through full replacement of a model.)

3. Jaffe writes that I “make a rather strange leap: deciding amongst any discrete set of parameters falls into the category of model comparison.” This reveals that I wasn’t so clear in stating my position. I’m not saying that a Bayesian such as myself shouldn’t or wouldn’t apply Bayesian inference to a discrete-parameter model. What I was saying is that my philosophy isn’t complete. My incoherence is that I don’t really have a clear rule of when it’s OK to do Bayesian model averaging and when it’s not. As noted in my recent article, I don’t think this incoherence is fatal (all other statistical frameworks I know of have incoherence issues) but it’s interesting.
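The contrast between discrete model averaging ("the truth is either A or B or C") and continuous model expansion can be sketched numerically. The sketch below is mine, not from the post or from BDA: the data, the candidate values A, B, C, and the prior settings are all made-up toy numbers, and the continuous version uses the standard conjugate normal-normal update.

```python
import math

# Toy data, assumed drawn from N(theta, sigma=1). Values are illustrative only.
y = [1.2, 0.8, 1.5, 1.1]
n, sigma = len(y), 1.0
ybar = sum(y) / n

# --- Discrete model averaging: the truth is "either A or B or C" ---
# Each candidate model fixes theta at a point value; with equal prior model
# weights, posterior model probabilities are proportional to the likelihood.
candidates = {"A": 0.0, "B": 1.0, "C": 2.0}  # hypothetical theta values

def log_lik(theta):
    # Log-likelihood of the data under N(theta, sigma), dropping constants.
    return sum(-0.5 * ((yi - theta) / sigma) ** 2 for yi in y)

logs = {m: log_lik(t) for m, t in candidates.items()}
mmax = max(logs.values())  # subtract the max for numerical stability
weights = {m: math.exp(l - mmax) for m, l in logs.items()}
z = sum(weights.values())
post_model = {m: w / z for m, w in weights.items()}
# Model-averaged posterior mean of theta over the discrete menu.
bma_mean = sum(post_model[m] * candidates[m] for m in candidates)

# --- Continuous model expansion: embed the menu in one larger model ---
# Instead of three point hypotheses, give theta a continuous prior
# N(mu0, tau^2); the posterior is then available in closed form.
mu0, tau = 1.0, 2.0  # hypothetical prior settings
post_prec = 1.0 / tau**2 + n / sigma**2
post_mean = (mu0 / tau**2 + n * ybar / sigma**2) / post_prec
post_sd = post_prec ** -0.5
```

Both computations are "Bayesian inference conditional on a model"; the difference is whether the model space is a discrete menu of point hypotheses or a single continuous family that contains them all as special cases.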