andrew gelman stats-2013-04-07: X on JLP (knowledge-graph entry by maker-knowledge-mining; source: html)
Introduction: Christian Robert writes on the Jeffreys-Lindley paradox. I have nothing to add to this beyond my recent comments: To me, the Lindley paradox falls apart because of its noninformative prior distribution on the parameter of interest. If you really think there’s a high probability the parameter is nearly exactly zero, I don’t see the point of the model saying that you have no prior information at all on the parameter. In short: my criticism of so-called Bayesian hypothesis testing is that it’s insufficiently Bayesian. To clarify, I’m speaking of all the examples I’ve ever worked on in social and environmental science, where in some settings I can imagine a parameter being very close to zero and in other settings I can imagine a parameter taking on just about any value in a wide range, but where I’ve never seen an example in which a parameter could be either right at zero or taking on any possible value. But such examples might occur in areas of application that I haven’t worked on.

We are not in disagreement; we’re just looking at different aspects of the problem.
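To make the paradox concrete, here is a minimal numerical sketch (my illustration, not from the post; the sample size, sigma, and the grid of prior scales tau are hypothetical choices). It contrasts a classical two-sided p-value with the Bayes factor for a point null against a N(0, tau^2) alternative: as the prior under the alternative is made flatter, the Bayes factor swings toward the null even though the p-value stays at 0.05, which is exactly the behavior being attributed to the noninformative prior.

```python
import numpy as np
from scipy import stats

# Hypothetical setup: n draws with known sigma, sample mean chosen so that
# the two-sided p-value for H0: theta = 0 sits right at 0.05.
n, sigma = 10_000, 1.0
se = sigma / np.sqrt(n)
xbar = 1.96 * se

z = xbar / se
p_value = 2 * (1 - stats.norm.cdf(abs(z)))   # ~0.05: classical test "rejects" H0

def bayes_factor_01(xbar, se, tau):
    """BF for H0: theta = 0 vs H1: theta ~ N(0, tau^2), via marginal densities of xbar."""
    m0 = stats.norm.pdf(xbar, 0, se)                       # marginal under H0
    m1 = stats.norm.pdf(xbar, 0, np.sqrt(se**2 + tau**2))  # marginal under H1
    return m0 / m1

print(f"p-value = {p_value:.3f}")
for tau in [0.1, 1.0, 10.0, 100.0]:
    print(f"tau = {tau:6.1f}   BF01 = {bayes_factor_01(xbar, se, tau):9.2f}")
# The flatter the prior under H1 (larger tau), the more BF01 favors the null:
# with tau = 1 the data favor H0 by roughly 15 to 1 despite the 5% p-value.
```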
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 1792 andrew gelman stats-2013-04-07-X on JLP
2 0.89123052 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox
Introduction: Essentially the same text as the post above: the earlier post opens “From a couple years ago but still relevant, I think:” and frames the final clarification as a P.S. responding to Bill’s comment, but the argument is otherwise identical.
3 0.22631675 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors
Introduction: Following up on Christian’s post [link fixed] on the topic, I’d like to offer a few thoughts of my own. In BDA, we express the idea that a noninformative prior is a placeholder: you can use the noninformative prior to get the analysis started, then if your posterior distribution is less informative than you would like, or if it does not make sense, you can go back and add prior information. Same thing for the data model (the “likelihood”), for that matter: it often makes sense to start with something simple and conventional and then go from there. So, in that sense, noninformative priors are no big deal; they’re just a way to get started. Just don’t take them too seriously. Traditionally in statistics we’ve worked with the paradigm of a single highly informative dataset with only weak external information. But if the data are sparse and prior information is strong, we have to think differently. And, when you increase the dimensionality of a problem, both these things happen.
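As a toy illustration of that workflow (my example, not from the post; the data values and the N(0, 1) prior are made up), here is the conjugate normal-mean calculation: start from a flat prior, and if the resulting posterior is wider than makes sense, swap in an informative prior and update again.

```python
import numpy as np
from scipy import stats

y = np.array([1.2, 0.8, 1.9, 1.4])     # small, noisy dataset; known data sd = 1
n = len(y)
se = 1 / np.sqrt(n)

# Step 1: flat (noninformative) prior; the posterior is just N(ybar, se^2).
flat_post = stats.norm(y.mean(), se)

# Step 2: if that posterior is unreasonably wide, add prior information,
# e.g. theta ~ N(0, 1), via the conjugate normal-normal update.
prior_mu, prior_sd = 0.0, 1.0
post_var = 1 / (1 / prior_sd**2 + n / 1.0**2)
post_mu = post_var * (prior_mu / prior_sd**2 + n * y.mean() / 1.0**2)
informative_post = stats.norm(post_mu, np.sqrt(post_var))

print("flat-prior posterior 95% interval:  ", flat_post.interval(0.95))
print("informative posterior 95% interval: ", informative_post.interval(0.95))
```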
Introduction: Type S error: when your estimate has the wrong sign, compared to the true value of the parameter. Type M error: when the magnitude of your estimate is far off, compared to the true value of the parameter. More here.
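Both error types are easy to estimate by simulation. A minimal sketch (the true effect of 0.1 and standard error of 1 are hypothetical, chosen to make the problem vivid): among estimates that reach statistical significance, check how often the sign is wrong (Type S) and by how much the magnitude is exaggerated on average (a Type M summary).

```python
import numpy as np

rng = np.random.default_rng(0)
theta, se = 0.1, 1.0                        # small true effect, noisy estimates
est = rng.normal(theta, se, 1_000_000)      # a million simulated studies

signif = np.abs(est) > 1.96 * se                           # significant at the 5% level
type_s = np.mean(est[signif] * theta < 0)                  # wrong sign, given significance
exaggeration = np.mean(np.abs(est[signif])) / abs(theta)   # Type M: |estimate| / |theta|

print(f"P(significant)     = {np.mean(signif):.3f}")   # ~0.05
print(f"Type S error rate  = {type_s:.2f}")            # ~0.39: wrong sign about 40% of the time
print(f"exaggeration ratio = {exaggeration:.0f}x")     # ~22x overestimate of |theta|
```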
5 0.20928937 160 andrew gelman stats-2010-07-23-Unhappy with improvement by a factor of 10^29
Introduction: I have an optimization problem: I have a complicated physical model that predicts energy and thermal behavior of a building, given the values of a slew of parameters, such as insulation effectiveness, window transmissivity, etc. I’m trying to find the parameter set that best fits several weeks of thermal and energy use data from the real building that we modeled. (Of course I would rather explore parameter space and come up with probability distributions for the parameters, and maybe that will come later, but for now I’m just optimizing.) To do the optimization, colleagues and I implemented a “particle swarm optimization” algorithm on a massively parallel machine. This involves giving each of about 120 “particles” an initial position in parameter space, then letting them move around, trying to move to better positions according to a specific algorithm. We gave each particle an initial position sampled from our prior distribution for each parameter. So far we’ve run about 140 iterations…
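For readers unfamiliar with the algorithm, here is a serial, minimal particle swarm sketch (my own illustration; the inertia and attraction constants w, c1, c2 and the toy objective are hypothetical, and the real application ran a far more elaborate building model in parallel). Each particle remembers its personal best position and is pulled toward both it and the swarm’s global best; the initial positions play the role of draws from the prior.

```python
import numpy as np

def pso(f, bounds, n_particles=120, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box; bounds is a list of (lo, hi) pairs, one per parameter."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    # Initial positions: in the post's setup these would be draws from the prior.
    x = rng.uniform(lo, hi, (n_particles, len(bounds)))
    v = np.zeros_like(x)
    pbest = x.copy()                           # each particle's best position so far
    pbest_f = np.apply_along_axis(f, 1, x)
    for _ in range(n_iter):
        g = pbest[np.argmin(pbest_f)]          # global best so far
        r1, r2 = rng.random((2, *x.shape))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # inertia + two pulls
        x = np.clip(x + v, lo, hi)
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
    return pbest[np.argmin(pbest_f)], pbest_f.min()

# Toy objective standing in for the building model's misfit to the data.
best, misfit = pso(lambda p: (p[0] - 1.2)**2 + (p[1] + 0.5)**2, [(-5, 5), (-5, 5)])
```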
6 0.18812665 1941 andrew gelman stats-2013-07-16-Priors
7 0.18366732 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable
8 0.18250383 1355 andrew gelman stats-2012-05-31-Lindley’s paradox
9 0.15054087 1155 andrew gelman stats-2012-02-05-What is a prior distribution?
10 0.14926758 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence
11 0.14854495 184 andrew gelman stats-2010-08-04-That half-Cauchy prior
12 0.14295274 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
13 0.13279186 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions
14 0.13242 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?
15 0.13196363 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability
16 0.13060364 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox
17 0.12930349 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
18 0.1259286 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors
19 0.12517761 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?
20 0.12443578 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization
simIndex simValue blogId blogTitle
same-blog 1 0.98441666 1792 andrew gelman stats-2013-04-07-X on JLP
2 0.97318542 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox
3 0.73364955 160 andrew gelman stats-2010-07-23-Unhappy with improvement by a factor of 10^29
4 0.65912157 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable
Introduction: David Kessler, Peter Hoff, and David Dunson write: Marginally specified priors for nonparametric Bayesian estimation. Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects of such a parameter, but may have real information about functionals of the parameter, such as the population mean or variance. This article proposes a new framework for nonparametric Bayes inference in which the prior distribution for a possibly infinite-dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a nonparametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard nonparametric prior distributions in common use, and inherit the large support of the standard priors upon which they are based. …
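A stripped-down version of the idea as I read the abstract (my sketch; the N(10, 2^2) prior on the mean and the Dirichlet construction are illustrative choices, not the paper’s): put an informative prior on one functional (the population mean) and a nonparametric prior on the distribution conditional on that functional.

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_marginally_specified_prior(n_atoms=50):
    # Informative prior on the functional: population mean mu ~ N(10, 2^2).
    mu = rng.normal(10, 2)
    # Nonparametric part: a random discrete distribution (atoms + Dirichlet weights)...
    atoms = rng.normal(0, 1, n_atoms)
    weights = rng.dirichlet(np.ones(n_atoms))
    # ...then shift the atoms so the distribution's mean equals mu exactly, making
    # the marginal prior on the mean exactly the informative N(10, 4).
    atoms = atoms - atoms @ weights + mu
    return atoms, weights

atoms, weights = draw_marginally_specified_prior()
print("mean of the drawn distribution:", atoms @ weights)  # equals the mu that was drawn
```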
5 0.65660024 1130 andrew gelman stats-2012-01-20-Prior beliefs about locations of decision boundaries
Introduction: Forest Gregg writes: I want to incorporate a prior belief into an estimation of a logistic regression classifier of points distributed in a 2d space. My prior belief is a funny kind of prior, though. It’s a belief about where the decision boundary between classes should fall. Over the 2d space, I lay a grid, and I believe that a decision boundary that separates any two classes should fall along any of the grid lines with some probability, and that the decision boundary should fall anywhere except a grid line with a much lower probability. For the two-class case, and a logistic regression model parameterized by W and data X, my prior could perhaps be expressed Pr(W) = (normalizing constant)/exp(d), where d = f(grid, W, X) such that when logistic(W^T X) = .5 and X is ‘far’ from grid lines, then d is large. Have you ever seen a model like this, or do you have any notions about a good avenue to pursue? My real data consist of geocoded Craigslist postings that are labeled with the…
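One way to sketch such a prior (my guess at an implementation, not Gregg’s or Gelman’s; the grid spacing, penalty weight lam, and simulated data are all hypothetical) is MAP estimation: maximize the logistic log-likelihood minus lam times d, where d is the average distance from points on the fitted decision boundary to the nearest grid line, matching a prior proportional to exp(-d).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
spacing = 1.0                                  # hypothetical grid: lines at integer coordinates

# Fake 2-d points whose true class boundary lies near the vertical grid line x1 = 2.
X = rng.uniform(0, 5, (200, 2))
y = (X[:, 0] + rng.normal(0, 0.3, 200) > 2).astype(float)

def neg_log_posterior(w, lam=5.0, n_pts=50):
    b, w1, w2 = w
    z = b + X @ np.array([w1, w2])
    nll = -np.sum(y * z - np.logaddexp(0, z))  # logistic regression log-likelihood, negated
    # Sample points along the decision boundary b + w1*x1 + w2*x2 = 0.
    x1 = np.linspace(0, 5, n_pts)
    pts = np.column_stack([x1, -(b + w1 * x1) / (w2 + 1e-12)])
    # d = mean distance from those points to the nearest grid line; the prior
    # Pr(W) proportional to exp(-lam * d) favors boundaries lying on grid lines.
    d = np.min(np.abs(pts - spacing * np.round(pts / spacing)), axis=1).mean()
    return nll + lam * d

w_map = minimize(neg_log_posterior, x0=np.array([-2.0, 1.0, 0.1]), method="Nelder-Mead").x
```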
6 0.64738321 1941 andrew gelman stats-2013-07-16-Priors
7 0.64234298 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter
8 0.6377849 1713 andrew gelman stats-2013-02-08-P-values and statistical practice
10 0.62369049 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence
11 0.6207599 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
12 0.61642545 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization
13 0.61574447 1089 andrew gelman stats-2011-12-28-Path sampling for models of varying dimension
14 0.60923672 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability
15 0.60420543 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors
16 0.60405374 184 andrew gelman stats-2010-08-04-That half-Cauchy prior
17 0.59197664 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?
19 0.58785731 1155 andrew gelman stats-2012-02-05-What is a prior distribution?
20 0.57394528 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors
simIndex simValue blogId blogTitle
1 0.99470544 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox
same-blog 2 0.99325991 1792 andrew gelman stats-2013-04-07-X on JLP
3 0.99161124 2029 andrew gelman stats-2013-09-18-Understanding posterior p-values
Introduction: David Kaplan writes: I came across your paper “Understanding Posterior Predictive P-values”, and I have a question regarding your statement “If a posterior predictive p-value is 0.4, say, that means that, if we believe the model, we think there is a 40% chance that tomorrow’s value of T(y_rep) will exceed today’s T(y).” This is perfectly understandable to me and represents the idea of calibration. However, I am unsure how this relates to statements about fit. If T is the LR chi-square or Pearson chi-square, then your statement that there is a 40% chance that tomorrow’s value exceeds today’s value indicates bad fit, I think. Yet, some literature indicates that high p-values suggest good fit. Could you clarify this? My reply: I think that “fit” depends on the question being asked. In this case, I’d say the model fits for this particular purpose, even though it might not fit for other purposes. And here’s the abstract of the paper: Posterior predictive p-values do not…
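The calibration statement can be checked directly by simulation. A minimal sketch under made-up assumptions (normal data with known sd 1, flat prior on the mean, T = sample maximum; none of this is from the paper): draw theta from the posterior, draw one replicated dataset per draw, and take the proportion of replications where T(y_rep) >= T(y).

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0, 1, 100)                  # observed data; model: y_i ~ N(theta, 1)
n = len(y)

# With a flat prior and known sd = 1, the posterior is theta | y ~ N(ybar, 1/n).
n_sims = 10_000
theta = rng.normal(y.mean(), 1 / np.sqrt(n), n_sims)

def T(d):
    return np.max(d, axis=-1)              # test statistic: the sample maximum

y_rep = rng.normal(theta[:, None], 1.0, (n_sims, n))   # one replicated dataset per draw
ppp = np.mean(T(y_rep) >= T(y))            # Pr(T(y_rep) >= T(y) | y)
print(f"posterior predictive p-value: {ppp:.2f}")
```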
4 0.98550689 896 andrew gelman stats-2011-09-09-My homework success
Introduction: A friend writes to me: You will be amused to know that students in our Bayesian Inference paper at 4th year found solutions to exercises from your book on-line. The amazing thing was that some of them were dumb enough to copy out solutions verbatim. However, I thought you might like to know you have done well in this class! I’m happy to hear this. I worked hard on those solutions!
Introduction: Pointing to this news article by Megan McArdle discussing a recent study of Medicaid recipients, Jonathan Falk writes: Forget the interpretation for a moment, and the political spin, but haven’t we reached an interesting point when a journalist says things like: When you do an RCT with more than 12,000 people in it, and your defense of your hypothesis is that maybe the study just didn’t have enough power, what you’re actually saying is “the beneficial effects are probably pretty small”. and A good Bayesian (and aren’t most of us supposed to be good Bayesians these days?) should be updating in light of this new information. Given this result, what is the likelihood that Obamacare will have a positive impact on the average health of Americans? Every one of us, for or against, should be revising that probability downwards. I’m not saying that you have to revise it to zero; I certainly haven’t. But however high it was yesterday, it should be somewhat lower today. This…
6 0.98511034 1080 andrew gelman stats-2011-12-24-Latest in blog advertising
8 0.98326075 1208 andrew gelman stats-2012-03-11-Gelman on Hennig on Gelman on Bayes
9 0.98315525 1465 andrew gelman stats-2012-08-21-D. Buggin
10 0.98274839 1155 andrew gelman stats-2012-02-05-What is a prior distribution?
11 0.98270798 1240 andrew gelman stats-2012-04-02-Blogads update
12 0.98191679 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine
13 0.98175478 502 andrew gelman stats-2011-01-04-Cash in, cash out graph
14 0.98171318 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?
15 0.98115778 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?
17 0.98079026 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles
18 0.98053789 2247 andrew gelman stats-2014-03-14-The maximal information coefficient
19 0.98023909 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors