andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1510 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Hogg writes: At the end of this article you wonder about consistency. Have you ever considered the possibility that utility might resolve some of the problems? I have no idea if it would—I am not advocating that position—I just get some kind of intuition from phrases like “Judgment is required to decide…”. Perhaps there is a coherent and objective description of what is—or could be—done under a coherent “utility” model (like a utility that could be objectively agreed upon and computed). Utilities are usually subjective—true—but priors are usually subjective too. My reply: I’m happy to think about utility, for some particular problem or class of problems, going to the effort of assigning costs and benefits to different outcomes. I agree that a utility analysis, even if (necessarily) imperfect, can usefully focus discussion. For example, if a statistical method for selecting variables is justified on the basis of cost, I like the idea of attempting to quantify the costs of gathering and handling predictors, as compared to the costs of errors in predictions for new data.
sentIndex sentText sentNum sentScore
1 Hogg writes: At the end of this article you wonder about consistency. [sent-1, score-0.086]
2 Have you ever considered the possibility that utility might resolve some of the problems? [sent-2, score-0.464]
3 I have no idea if it would—I am not advocating that position—I just get some kind of intuition from phrases like “Judgment is required to decide…”. [sent-3, score-0.203]
4 Perhaps there is a coherent and objective description of what is—or could be—done under a coherent “utility” model (like a utility that could be objectively agreed upon and computed). [sent-4, score-1.061]
5 Utilities are usually subjective—true—but priors are usually subjective too. [sent-5, score-0.33]
6 My reply: I’m happy to think about utility, for some particular problem or class of problems, going to the effort of assigning costs and benefits to different outcomes. [sent-6, score-0.438]
7 I agree that a utility analysis, even if (necessarily) imperfect, can usefully focus discussion. [sent-7, score-0.48]
8 For example, if a statistical method for selecting variables is justified on the basis of cost, I like the idea of attempting to quantify the costs of gathering and handling predictors, as compared to the costs of errors in predictions for new data. [sent-8, score-0.965]
9 But the problem of incoherence as discussed at the end of my article—that’s something different. [sent-9, score-0.188]
10 Here I’m referring to two fundamental problems with Bayesian data analysis as I practice it: 1. [sent-10, score-0.247]
11 I prefer continuous model expansion to discrete model averaging—but the former can be seen as just a limiting case of the latter. [sent-11, score-0.571]
12 So really I need a better understanding of what sorts of model expansions work well and what sorts run into trouble. [sent-12, score-0.516]
13 From a Bayesian perspective, the trouble typically arises from the joint prior distribution over the larger, expanded space. [sent-13, score-0.184]
14 Default choices such as prior independence often create problems that were not so obvious when the model was set up. [sent-14, score-0.512]
15 2. My procedure of model building, inference, and model checking requires outside human intervention. [sent-16, score-0.516]
16 How could a computer do it, if you wanted to program a computer to do Bayesian data analysis? [sent-17, score-0.36]
17 How can our brains do anything approximating Bayesian data analysis? [sent-18, score-0.21]
18 Neither the computer nor the brain has a “homunculus” that can sit outside, make graphs, and do posterior predictive checks. [sent-19, score-0.26]
19 I don’t have a great answer to this right now, but I suspect that the natural or artificial intelligence actually would need some external module to check model fit. [sent-20, score-0.552]
20 This connects to the familiar “aha” feeling and to the fractal nature of scientific revolutions. [sent-21, score-0.19]
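The last few sentences above describe model checking as something that needs a “module” sitting outside the fitted model to make graphs and run posterior predictive checks. Purely as an illustration of the kind of check being referred to (this is not Gelman’s procedure, just a textbook posterior predictive check on an assumed toy model like the one in the post-781 snippet further down), here is a minimal sketch; the data, test statistic, and simulation sizes are all invented.

```python
import numpy as np

# Minimal posterior predictive check -- an illustration only, not Gelman's actual
# workflow. The model is the toy normal setup from the post-781 snippet below:
# y_i ~ N(theta, sigma^2) with sigma known, prior theta ~ N(0, 10^2).
rng = np.random.default_rng(0)
sigma, prior_sd = 10.0, 10.0
y = rng.normal(3.0, sigma, size=10)            # stand-in "observed" data

# Conjugate posterior for theta (known sigma, prior mean 0).
n, ybar = len(y), y.mean()
post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)
post_mean = post_var * (n * ybar / sigma**2)

# Replicated datasets y_rep drawn from the posterior predictive distribution.
theta_draws = rng.normal(post_mean, np.sqrt(post_var), size=4000)
y_rep = rng.normal(theta_draws[:, None], sigma, size=(4000, n))

# The "outside module": compare a test statistic T on observed vs. replicated data.
T_obs = y.var(ddof=1)
T_rep = y_rep.var(axis=1, ddof=1)
print(f"T(y) = {T_obs:.1f}, Pr(T(y_rep) >= T(y)) = {(T_rep >= T_obs).mean():.2f}")
```

The comparison step at the end is exactly the part that sits outside the fitted model: nothing in the posterior itself tells the computer (or the brain) to compute T(y) and look at it.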
wordName wordTfidf (topN-words)
[('utility', 0.382), ('costs', 0.206), ('model', 0.199), ('computer', 0.18), ('coherent', 0.151), ('problems', 0.139), ('subjective', 0.138), ('bayesian', 0.131), ('expansions', 0.119), ('homunculus', 0.119), ('advocating', 0.119), ('outside', 0.118), ('approximating', 0.114), ('aha', 0.11), ('analysis', 0.108), ('objectively', 0.104), ('utilities', 0.104), ('revolutions', 0.102), ('incoherence', 0.102), ('module', 0.102), ('gathering', 0.1), ('sorts', 0.099), ('usefully', 0.098), ('expanded', 0.098), ('fractal', 0.098), ('brains', 0.096), ('usually', 0.096), ('selecting', 0.093), ('assigning', 0.093), ('attempting', 0.093), ('connects', 0.092), ('quantify', 0.092), ('artificial', 0.09), ('limiting', 0.09), ('hogg', 0.09), ('imperfect', 0.089), ('independence', 0.088), ('handling', 0.088), ('justified', 0.087), ('prior', 0.086), ('end', 0.086), ('phrases', 0.084), ('expansion', 0.083), ('intelligence', 0.082), ('resolve', 0.082), ('computed', 0.082), ('sit', 0.08), ('averaging', 0.079), ('external', 0.079), ('agreed', 0.074)]
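The wordName/wordTfidf list above attaches a weight to each top word. The exact term-frequency and normalization scheme used by the mining pipeline is not documented in this file, so the following is only a generic TF-IDF sketch (scikit-learn, with an invented three-document corpus) showing how weights of this general shape are typically computed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Generic TF-IDF sketch. The weighting/normalization used by the mining pipeline
# that produced the list above is not documented here; this is only the standard
# scikit-learn computation on an invented three-document corpus.
corpus = [
    "utility analysis can usefully focus discussion of costs and benefits",
    "continuous model expansion versus discrete model averaging",
    "posterior predictive checks seem to need an outside module",
]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)               # rows: documents, columns: words

# Print (word, weight) pairs for the first document, highest weight first.
weights = dict(zip(vectorizer.get_feature_names_out(), X[0].toarray().ravel()))
for word, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    if w > 0:
        print(f"('{word}', {w:.3f})")
```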
simIndex simValue blogId blogTitle
same-blog 1 1.0000002 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis
2 0.29268578 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
Introduction: I’ve been writing a lot about my philosophy of Bayesian statistics and how it fits into Popper’s ideas about falsification and Kuhn’s ideas about scientific revolutions. Here’s my long, somewhat technical paper with Cosma Shalizi. Here’s our shorter overview for the volume on the philosophy of social science. Here’s my latest try (for an online symposium), focusing on the key issues. I’m pretty happy with my approach–the familiar idea that Bayesian data analysis iterates the three steps of model building, inference, and model checking–but it does have some unresolved (maybe unresolvable) problems. Here are a couple mentioned in the third of the above links. Consider a simple model with independent data y_1, y_2, …, y_10 ~ N(θ,σ^2), with a prior distribution θ ~ N(0,10^2) and σ known and taking on some value of approximately 10. Inference about θ is straightforward, as is model checking, whether based on graphs or numerical summaries such as the sample variance and skewness
3 0.2885339 1200 andrew gelman stats-2012-03-06-Some economists are skeptical about microfoundations
Introduction: A few months ago, I wrote: Economists seem to rely heavily on a sort of folk psychology, a relic of the 1920s-1950s in which people calculate utilities (or act as if they are doing so) in order to make decisions. A central tenet of economics is that inference or policy recommendation be derived from first principles from this folk-psychology model. This just seems silly to me, as if astronomers justified all their calculations with an underlying appeal to Aristotle’s mechanics. Or maybe the better analogy is the Stalinist era in which everything had to be connected to Marxist principles (followed, perhaps, by an equationful explanation of how the world can be interpreted as if Marxism were valid). Mark Thoma and Paul Krugman seem to agree with me on this one (as does my Barnard colleague Rajiv Sethi). They don’t go so far as to identify utility etc as folk psychology, but maybe that will come next. P.S. Perhaps this will clarify: In a typical economics research pap
4 0.23327431 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
Introduction: Konrad Scheffler writes: I was interested by your paper “Induction and deduction in Bayesian data analysis” and was wondering if you would entertain a few questions: – Under the banner of objective Bayesianism, I would posit something like this as a description of Bayesian inference: “Objective Bayesian probability is not a degree of belief (which would necessarily be subjective) but a measure of the plausibility of a hypothesis, conditional on a formally specified information state. One way of specifying a formal information state is to specify a model, which involves specifying both a prior distribution (typically for a set of unobserved variables) and a likelihood function (typically for a set of observed variables, conditioned on the values of the unobserved variables). Bayesian inference involves calculating the objective degree of plausibility of a hypothesis (typically the truth value of the hypothesis is a function of the variables mentioned above) given such a
5 0.20439702 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
Introduction: In response to this article by Cosma Shalizi and myself on the philosophy of Bayesian statistics, David Hogg writes: I [Hogg] agree–even in physics and astronomy–that the models are not “True” in the God-like sense of being absolute reality (that is, I am not a realist); and I have argued (a philosophically very naive paper, but hey, I was new to all this) that for pretty fundamental reasons we could never arrive at the True (with a capital “T”) model of the Universe. The goal of inference is to find the “best” model, where “best” might have something to do with prediction, or explanation, or message length, or (horror!) our utility. Needless to say, most of my physics friends *are* realists, even in the face of “effective theories” as Newtonian mechanics is an effective theory of GR and GR is an effective theory of “quantum gravity” (this plays to your point, because if you think any theory is possibly an effective theory, how could you ever find Truth?). I also liked the i
6 0.20095876 811 andrew gelman stats-2011-07-20-Kind of Bayesian
7 0.19862223 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
8 0.19556253 1431 andrew gelman stats-2012-07-27-Overfitting
9 0.1948438 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning
10 0.19152386 922 andrew gelman stats-2011-09-24-Economists don’t think like accountants—but maybe they should
11 0.19069862 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?
12 0.18301541 1941 andrew gelman stats-2013-07-16-Priors
13 0.17718257 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
14 0.17050049 1155 andrew gelman stats-2012-02-05-What is a prior distribution?
15 0.16998163 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing
16 0.16975555 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?
18 0.16308871 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves
19 0.15500703 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model
20 0.14948229 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors
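The table above ranks other posts by a simValue score against this one. Nothing in the file says how simValue is computed; a common choice for this kind of ranking is cosine similarity between per-post feature vectors (TF-IDF weights or topic weights), so here is a hedged sketch with invented vectors, reusing two titles from the list purely as labels.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two per-post feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Invented feature vectors; the titles come from the list above purely as labels.
current = np.array([0.38, 0.21, 0.20, 0.18, 0.15])
candidates = {
    "781 The holes in my philosophy of Bayesian data analysis": np.array([0.10, 0.25, 0.30, 0.05, 0.20]),
    "1200 Some economists are skeptical about microfoundations": np.array([0.40, 0.05, 0.02, 0.01, 0.03]),
}

print(f"{cosine_similarity(current, current):.7f}  same-blog (should be ~1.0)")
for title, vec in sorted(candidates.items(), key=lambda kv: -cosine_similarity(current, kv[1])):
    print(f"{cosine_similarity(current, vec):.8f}  {title}")
```

Under this reading, the same-blog value of roughly 1.0000002 would just be 1.0 plus floating-point noise, but again the actual metric is not stated in the file.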
topicId topicWeight
[(0, 0.24), (1, 0.209), (2, -0.047), (3, 0.076), (4, -0.079), (5, -0.022), (6, 0.001), (7, 0.049), (8, 0.032), (9, 0.042), (10, 0.002), (11, 0.023), (12, -0.049), (13, 0.013), (14, -0.039), (15, 0.005), (16, 0.094), (17, -0.035), (18, 0.001), (19, 0.038), (20, -0.017), (21, -0.039), (22, -0.035), (23, -0.033), (24, -0.091), (25, 0.018), (26, 0.043), (27, -0.048), (28, -0.011), (29, -0.025), (30, -0.02), (31, -0.018), (32, 0.026), (33, 0.028), (34, 0.018), (35, -0.009), (36, -0.007), (37, -0.003), (38, -0.003), (39, 0.017), (40, 0.013), (41, -0.013), (42, -0.03), (43, 0.053), (44, 0.021), (45, -0.025), (46, 0.035), (47, -0.003), (48, 0.021), (49, 0.04)]
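The topicId/topicWeight line above stores a sparse list of (topic index, weight) pairs. As a reading aid only, this sketch parses a line in that format into a dense NumPy vector; the assumption of 50 topics comes from the indices 0–49 visible in the line, not from any documentation.

```python
import re
import numpy as np

def parse_topic_weights(line: str, n_topics: int = 50) -> np.ndarray:
    """Turn a '[(0, 0.24), (1, 0.209), ...]' line into a dense weight vector."""
    vec = np.zeros(n_topics)
    for idx, weight in re.findall(r"\((\d+),\s*(-?[\d.]+)\)", line):
        vec[int(idx)] = float(weight)
    return vec

dense = parse_topic_weights("[(0, 0.24), (1, 0.209), (2, -0.047), (16, 0.094)]")
print(dense[:5], dense[16])
```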
simIndex simValue blogId blogTitle
same-blog 1 0.97494256 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis
2 0.91320866 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
3 0.89364588 811 andrew gelman stats-2011-07-20-Kind of Bayesian
Introduction: Astrophysicist Andrew Jaffe pointed me to this and discussion of my philosophy of statistics (which is, in turn, my rational reconstruction of the statistical practice of Bayesians such as Rubin and Jaynes). Jaffe’s summary is fair enough and I only disagree in a few points: 1. Jaffe writes: Subjective probability, at least the way it is actually used by practicing scientists, is a sort of “as-if” subjectivity — how would an agent reason if her beliefs were reflected in a certain set of probability distributions? This is why when I discuss probability I try to make the pedantic point that all probabilities are conditional, at least on some background prior information or context. I agree, and my problem with the usual procedures used for Bayesian model comparison and Bayesian model averaging is not that these approaches are subjective but that the particular models being considered don’t make sense. I’m thinking of the sorts of models that say the truth is either A or
4 0.89267546 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
Introduction: David Rohde writes: I have been thinking a lot lately about your Bayesian model checking approach. This is in part because I have been working on exploratory data analysis and, wishing to avoid controversy and mathematical statistics, we omitted model checking from our discussion. This is something that the refereeing process picked us up on, and we ultimately added a critical discussion of null-hypothesis testing to our paper. The exploratory technique we discussed was essentially a 2D histogram approach, but we used Polya models as a formal model for the histogram. We are currently working on a new paper, and we are thinking through how or if we should do “confirmatory analysis” or model checking in the paper. What I find most admirable about your statistical work is that you clearly use the Bayesian approach to do useful applied statistical analysis. My own attempts at applied Bayesian analysis make me greatly admire your applied successes. On the other hand it may be t
6 0.88607591 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
7 0.87403882 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor
8 0.85689676 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
10 0.84903193 1431 andrew gelman stats-2012-07-27-Overfitting
11 0.84363621 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model
12 0.84268838 1723 andrew gelman stats-2013-02-15-Wacky priors can work well?
13 0.83855259 1208 andrew gelman stats-2012-03-11-Gelman on Hennig on Gelman on Bayes
14 0.83571595 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox
15 0.83071411 1392 andrew gelman stats-2012-06-26-Occam
16 0.82643235 1529 andrew gelman stats-2012-10-11-Bayesian brains?
17 0.82390356 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics
18 0.81454062 1216 andrew gelman stats-2012-03-17-Modeling group-level predictors in a multilevel regression
19 0.80673236 1200 andrew gelman stats-2012-03-06-Some economists are skeptical about microfoundations
20 0.79505455 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
topicId topicWeight
[(15, 0.031), (16, 0.049), (21, 0.014), (24, 0.211), (34, 0.021), (55, 0.025), (66, 0.021), (72, 0.016), (84, 0.019), (86, 0.04), (87, 0.018), (94, 0.086), (99, 0.319)]
simIndex simValue blogId blogTitle
same-blog 1 0.98135477 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis
2 0.97366583 1211 andrew gelman stats-2012-03-13-A personal bit of spam, just for me!
Introduction: Hi Andrew, I came across your site while searching for blogs and posts around American obesity and wanted to reach out to get your readership’s feedback on an infographic my team built which focuses on the obesity of America and where we could end up at the going rate. If you’re interested, let’s connect. Have a great weekend! Thanks. *** I have to say, that’s pretty pitiful, to wish someone a “great weekend” on a Tuesday! This guy’s gotta ratchet up his sophistication a few notches if he ever wants to get a job as a spammer for a major software company, for example.
3 0.96666455 582 andrew gelman stats-2011-02-20-Statisticians vs. everybody else
Introduction: Statisticians are literalists. When someone says that the U.K. boundary commission’s delay in redistricting gave the Tories an advantage equivalent to 10 percent of the vote, we’re the kind of person who looks it up and claims that the effect is less than 0.7 percent. When someone says, “Since 1968, with the single exception of the election of George W. Bush in 2000, Americans have chosen Republican presidents in times of perceived danger and Democrats in times of relative calm,” we’re like, Hey, really? And we go look that one up too. And when someone says that engineers have more sons and nurses have more daughters . . . well, let’s not go there. So, when I was pointed to this blog by Michael O’Hare making the following claim, in the context of K-12 education in the United States: My [O'Hare's] favorite examples of this junk [educational content with no workplace value] are spelling and pencil-and-paper algorithm arithmetic. These are absolutely critical for a clerk
4 0.96632814 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
Introduction: Ilya Lipkovich writes: I read with great interest your 2008 paper [with Aleks Jakulin, Grazia Pittau, and Yu-Sung Su] on weakly informative priors for logistic regression and also followed an interesting discussion on your blog. This discussion was within the Bayesian community in relation to the validity of priors. However I would like to approach it rather from a broader perspective on predictive modeling, bringing in ideas from the machine/statistical learning approach. Actually you were the first to bring it up by mentioning in your paper “borrowing ideas from computer science” on cross-validation when comparing predictive ability of your proposed priors with other choices. However, using cross-validation for comparing method performance is not the only or primary use of CV in machine learning. Most machine-learning methods have some “meta” or complexity parameters and use cross-validation to tune them up. For example, one of your comparison methods is BBR which actually
5 0.96627223 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)
Introduction: A student writes: I have a question about an earlier recommendation of yours on the choice of the prior distribution for the precision hyperparameter of a normal distribution, and a reference for the recommendation. If I recall correctly I have read that you have suggested to use Gamma(1.4, 0.4) instead of Gamma(0.01,0.01) for the prior distribution of the precision hyperparameter of a normal distribution. I would very much appreciate it if you would have the time to point me to this publication of yours. The reason is that I have used the prior distribution (Gamma(1.4, 0.4)) in a study which we now revise for publication, and where a reviewer questions the choice of the distribution (claiming that it is too informative!). I am well aware that you in recent publications (Prior distributions for variance parameters in hierarchical models. Bayesian Analysis; Data Analysis using regression and multilevel/hierarchical models) suggest to model the precision as pow(standard deviatio
6 0.96616822 899 andrew gelman stats-2011-09-10-The statistical significance filter
7 0.966061 1746 andrew gelman stats-2013-03-02-Fishing for cherries
8 0.9658426 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?
9 0.96514916 970 andrew gelman stats-2011-10-24-Bell Labs
11 0.96472895 783 andrew gelman stats-2011-06-30-Don’t stop being a statistician once the analysis is done
12 0.96468651 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value
13 0.96440095 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors
14 0.96429288 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution
15 0.96423191 2149 andrew gelman stats-2013-12-26-Statistical evidence for revised standards
16 0.96355963 1941 andrew gelman stats-2013-07-16-Priors
18 0.96313679 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?
19 0.96307844 2140 andrew gelman stats-2013-12-19-Revised evidence for statistical standards
20 0.96288395 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
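Entry 4 in the list above (post 2129) quotes a reader pointing out that machine-learning methods usually tune their complexity (“meta”) parameters by cross-validation. As a generic illustration of that idea only (the ridge model, penalty grid, and simulated data below are invented and have nothing to do with the BBR comparison mentioned in the snippet), here is a minimal k-fold cross-validation sketch.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Generic sketch of tuning a complexity parameter by k-fold cross-validation.
# The ridge model, penalty grid, and simulated data are invented; nothing here
# reproduces the BBR comparison mentioned in the snippet above.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
beta = np.concatenate([rng.normal(size=5), np.zeros(15)])   # only 5 real predictors
y = X @ beta + rng.normal(scale=2.0, size=200)

penalties = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {a: cross_val_score(Ridge(alpha=a), X, y, cv=5).mean() for a in penalties}
print({a: round(s, 3) for a, s in scores.items()})
print("selected penalty:", max(scores, key=scores.get))
```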