andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-2129 knowledge-graph by maker-knowledge-mining

2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters


meta infos for this blog

Source: html

Introduction: Ilya Lipkovich writes: I read with great interest your 2008 paper [with Aleks Jakulin, Grazia Pittau, and Yu-Sung Su] on weakly informative priors for logistic regression and also followed an interesting discussion on your blog. This discussion was within the Bayesian community in relation to the validity of priors. However, I would like to approach it from a broader perspective on predictive modeling, bringing in ideas from the machine/statistical learning approach. Actually, you were the first to bring this up, mentioning in your paper “borrowing ideas from computer science” on cross-validation when comparing the predictive ability of your proposed priors with other choices. However, using cross-validation for comparing method performance is not the only or primary use of CV in machine learning. Most machine-learning methods have some “meta” or complexity parameters and use cross-validation to tune them. For example, one of your comparison methods is BBR, which actually resorts to CV for selecting the prior variance.


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 Ilya Lipkovich writes: I read with great interest your 2008 paper [with Aleks Jakulin, Grazia Pittau, and Yu-Sung Su] on weakly informative priors for logistic regression and also followed an interesting discussion on your blog. [sent-1, score-0.392]

2 However, I would like to approach it from a broader perspective on predictive modeling, bringing in ideas from the machine/statistical learning approach. [sent-3, score-0.409]

3 Actually, you were the first to bring this up, mentioning in your paper “borrowing ideas from computer science” on cross-validation when comparing the predictive ability of your proposed priors with other choices. [sent-4, score-0.535]

4 Most machine-learning methods have some “meta” or complexity parameters and use cross-validation to tune them. [sent-6, score-0.623]

5 For example, one of your comparison methods is BBR which actually resorts to CV for selecting the prior variance (whether you use Laplace or Gaussian priors). [sent-7, score-0.245]

6 This makes their method essentially equivalent to ridge regression or the lasso with the tuning parameter selected by cross-validation, so there is really not much Bayesian flavor left there. [sent-8, score-0.887]
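The equivalence in sentence 6 can be sketched concretely. Below is a minimal pure-Python toy (hypothetical data and helper names, not the actual BBR implementation): a one-predictor ridge fit in which the shrinkage parameter `lam` plays the role of the inverse prior variance, and `lam` itself is chosen by k-fold cross-validation.

```python
# Toy sketch: ridge regression through the origin, y ~ b*x, with the
# shrinkage parameter lam (the "prior variance" knob) picked by k-fold CV.
# Data and function names are illustrative, not from the paper.

def ridge_fit(xs, ys, lam):
    """Ridge slope: b = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def cv_error(xs, ys, lam, k=5):
    """Mean squared prediction error over k CV folds for a given lam."""
    n = len(xs)
    total, cnt = 0.0, 0
    for fold in range(k):
        test = list(range(fold, n, k))            # every k-th point held out
        held = set(test)
        train = [i for i in range(n) if i not in held]
        b = ridge_fit([xs[i] for i in train], [ys[i] for i in train], lam)
        for i in test:
            total += (ys[i] - b * xs[i]) ** 2
            cnt += 1
    return total / cnt

def select_lam(xs, ys, grid):
    """Choose the shrinkage (prior variance) by cross-validation."""
    return min(grid, key=lambda lam: cv_error(xs, ys, lam))
```

Larger `lam` shrinks the slope toward zero, exactly as a tighter Gaussian prior on the coefficient would; CV picks the amount of shrinkage from the data, which is Lipkovich's point about there being little Bayesian flavor left.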

7 From my personal communication with David Madigan, I do not remember whether he ever advocated using default priors; he seemed to like the CV approach to choosing them (and that was the whole point of making the algorithm fast), as most people in the statistical learning community would (e.g. [sent-11, score-0.563]

8 Now, it is unclear from your paper whether, when comparing your automated Cauchy priors with BBR, you let BBR choose the optimal tuning parameter or used the default values. [sent-14, score-1.119]

9 If you let BBR tune parameters then you should have performed a “double cross-validation,” allowing BBR to select a (possibly different) value of tuning parameter (prior variance) on each fold of your “outer cross-validation,” based on a separate “inner CV” within that fold. [sent-15, score-0.836]
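The "double cross-validation" described in sentence 9 can be written as a generic skeleton. This is a sketch under assumed `fit(train_points, lam)` and `error(model, point)` callables, not code from the paper or from BBR:

```python
# Sketch of double (nested) cross-validation: the outer loop measures the
# predictive error of the whole procedure, while an inner CV run inside
# each outer training fold re-selects the tuning parameter (prior
# variance), possibly choosing a different value on each outer fold.

def kfold_indices(n, k):
    """Yield (train, test) index lists for k folds of n items."""
    for fold in range(k):
        test = list(range(fold, n, k))
        held = set(test)
        train = [i for i in range(n) if i not in held]
        yield train, test

def inner_cv_select(data, idx, grid, fit, error, k=5):
    """Inner CV: choose lam using only the outer-training indices idx."""
    def cv_err(lam):
        total, cnt = 0.0, 0
        for tr, te in kfold_indices(len(idx), k):
            model = fit([data[idx[i]] for i in tr], lam)
            total += sum(error(model, data[idx[i]]) for i in te)
            cnt += len(te)
        return total / cnt
    return min(grid, key=cv_err)

def double_cv(data, grid, fit, error, k=5):
    """Outer CV: honestly evaluate the select-then-fit procedure."""
    total, cnt = 0.0, 0
    for tr, te in kfold_indices(len(data), k):
        lam = inner_cv_select(data, tr, grid, fit, error)  # per-fold choice
        model = fit([data[i] for i in tr], lam)
        total += sum(error(model, data[i]) for i in te)
        cnt += len(te)
    return total / cnt
```

The key property is that each outer test fold never influences the tuning-parameter choice used to predict it, so the outer error estimate is not optimistically biased by the tuning step.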

10 If you used automated priors then you might not have done justice to the BBR. [sent-16, score-0.441]

11 But then you may say that it would be unfair to let them choose the optimal prior variance via CV if your method uses automated priors. [sent-17, score-0.581]

12 If we leave the Bayesian grounds and move to the statistical learning (or “computer science” in your interpretation) turf, then what is the optimal way to fit a predictive model? [sent-20, score-0.44]

13 From reading your paper it seems that you believe in the existence of default priors, which translates into having default complexity parameters when performing statistical learning. [sent-21, score-0.669]

14 This seems to be in contrast with what the “authorities” in the statistical learning literature tell us: they reject as a popular myth the idea that one can preset complexity parameters in any large-scale predictive modeling. [sent-22, score-0.692]

15 The answer may be that your approach with automated priors is intended only for settings with just a few predictors? [sent-25, score-0.51]

16 Or is there a deeper philosophical split between the Bayesian and statistical learning communities? [sent-26, score-0.213]

17 It depends on the structure of the problem: the more replication, the more it is possible to estimate such tuning parameters internally. [sent-31, score-0.587]
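Gelman's point in sentence 17, that replication makes it possible to estimate tuning parameters internally, has a closed-form illustration in the simplest hierarchical model (a sketch I am supplying here, not the model from the paper): with J replicated measurements y_j ~ Normal(theta_j, 1) and prior theta_j ~ Normal(0, tau^2), marginally y_j ~ Normal(0, 1 + tau^2), so the prior variance has a direct marginal maximum-likelihood estimate with no cross-validation at all.

```python
# Internal estimation of a "tuning parameter" from replication:
# y_j ~ Normal(theta_j, 1), theta_j ~ Normal(0, tau^2) implies the
# marginal y_j ~ Normal(0, 1 + tau^2), so the MLE of 1 + tau^2 is the
# mean of y_j^2, giving tau^2_hat = max(0, mean(y^2) - 1).

def estimate_tau2(ys):
    """Marginal MLE of the prior variance tau^2 (truncated at zero)."""
    mean_sq = sum(y * y for y in ys) / len(ys)
    return max(0.0, mean_sq - 1.0)

def shrink(ys, tau2):
    """Posterior means: each y_j shrunk toward 0 by the estimated prior."""
    w = tau2 / (tau2 + 1.0)  # shrinkage factor in [0, 1)
    return [w * y for y in ys]
```

With many replicated groups, mean(y^2) pins down tau^2 well; with only a handful, the estimate is noisy and can collapse to zero, which is exactly why the amount of replication governs how far one can push internal estimation of tuning parameters.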

18 In our paper we were particularly interested in cases where the number of predictors is small. [sent-33, score-0.257]

19 If there are general differences between statistics and machine learning here, then, it’s not on the philosophy of automated priors or whatever; it’s that in statistics we often talk about small problems with only a few predictors (see any statistics textbooks, including mine! [sent-34, score-0.842]

20 ), whereas machine learning methods tend to be applied to problems with large numbers of predictors. [sent-35, score-0.297]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('bbr', 0.416), ('tuning', 0.384), ('cv', 0.236), ('priors', 0.228), ('automated', 0.213), ('parameter', 0.211), ('learning', 0.161), ('predictors', 0.153), ('parameters', 0.152), ('complexity', 0.134), ('predictive', 0.123), ('bayesian', 0.113), ('default', 0.113), ('preset', 0.108), ('optimal', 0.104), ('method', 0.099), ('prior', 0.092), ('hyperparameters', 0.092), ('tune', 0.089), ('fast', 0.087), ('machine', 0.087), ('algorithm', 0.086), ('community', 0.082), ('selecting', 0.08), ('flavor', 0.08), ('comparing', 0.077), ('variance', 0.073), ('approach', 0.069), ('reject', 0.067), ('selected', 0.061), ('modeling', 0.056), ('paper', 0.056), ('logistic', 0.056), ('madigan', 0.054), ('borrowing', 0.054), ('sparseness', 0.054), ('statistical', 0.052), ('regression', 0.052), ('computer', 0.051), ('pittau', 0.051), ('turf', 0.051), ('su', 0.051), ('ilya', 0.051), ('cauchy', 0.051), ('estimate', 0.051), ('grazia', 0.049), ('translates', 0.049), ('inner', 0.049), ('large', 0.049), ('number', 0.048)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

2 0.23633686 846 andrew gelman stats-2011-08-09-Default priors update?

Introduction: Ryan King writes: I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance. I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity a

3 0.1950226 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

Introduction: A couple days ago we discussed some remarks by Tony O’Hagan and Jim Berger on weakly informative priors. Jim followed up on Deborah Mayo’s blog with this: Objective Bayesian priors are often improper (i.e., have infinite total mass), but this is not a problem when they are developed correctly. But not every improper prior is satisfactory. For instance, the constant prior is known to be unsatisfactory in many situations. The ‘solution’ pseudo-Bayesians often use is to choose a constant prior over a large but bounded set (a ‘weakly informative’ prior), saying it is now proper and so all is well. This is not true; if the constant prior on the whole parameter space is bad, so will be the constant prior over the bounded set. The problem is, in part, that some people confuse proper priors with subjective priors and, having learned that true subjective priors are fine, incorrectly presume that weakly informative proper priors are fine. I have a few reactions to this: 1. I agree

4 0.18670043 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

Introduction: Following up on our discussion of the other day , Nick Firoozye writes: One thing I meant by my initial query (but really didn’t manage to get across) was this: I have no idea what my prior would be on many many models, but just like Utility Theory expects ALL consumers to attach a utility to any and all consumption goods (even those I haven’t seen or heard of), Bayesian Stats (almost) expects the same for priors. (Of course it’s not a religious edict much in the way Utility Theory has, since there is no theory of a “modeler” in the Bayesian paradigm—nonetheless there is still an expectation that we should have priors over all sorts of parameters which mean almost nothing to us). For most models with sufficient complexity, I also have no idea what my informative priors are actually doing and the only way to know anything is through something I can see and experience, through data, not parameters or state variables. My question was more on the—let’s use the prior to come up

5 0.18377107 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable

Introduction: David Kessler, Peter Hoff, and David Dunson write : Marginally specified priors for nonparametric Bayesian estimation Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects of such a parameter, but may have real information about functionals of the parameter, such the population mean or variance. This article proposes a new framework for nonparametric Bayes inference in which the prior distribution for a possibly infinite-dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a nonparametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard nonparametric prior distributions in common use, and inherit the large support of the standard priors upon which they are based. Ad

6 0.18276007 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

7 0.18088281 1441 andrew gelman stats-2012-08-02-“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”

8 0.16689508 1740 andrew gelman stats-2013-02-26-“Is machine learning a subset of statistics?”

9 0.16655645 1941 andrew gelman stats-2013-07-16-Priors

10 0.15906377 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

11 0.15569445 1486 andrew gelman stats-2012-09-07-Prior distributions for regression coefficients

12 0.15527686 1465 andrew gelman stats-2012-08-21-D. Buggin

13 0.15489051 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

14 0.15488499 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

15 0.14754216 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox

16 0.14547326 1431 andrew gelman stats-2012-07-27-Overfitting

17 0.1454293 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

18 0.14511982 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

19 0.14497632 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

20 0.14369939 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.223), (1, 0.233), (2, -0.009), (3, 0.053), (4, -0.029), (5, 0.005), (6, 0.052), (7, 0.002), (8, -0.138), (9, 0.083), (10, 0.034), (11, -0.008), (12, 0.05), (13, 0.055), (14, -0.013), (15, -0.007), (16, -0.034), (17, -0.002), (18, -0.005), (19, -0.037), (20, -0.028), (21, -0.003), (22, 0.041), (23, 0.055), (24, 0.004), (25, 0.015), (26, -0.005), (27, 0.024), (28, -0.01), (29, 0.016), (30, 0.034), (31, -0.011), (32, 0.017), (33, -0.017), (34, 0.003), (35, -0.033), (36, 0.005), (37, -0.001), (38, 0.01), (39, -0.038), (40, -0.051), (41, 0.006), (42, -0.02), (43, 0.095), (44, -0.04), (45, -0.013), (46, 0.018), (47, 0.018), (48, 0.012), (49, 0.012)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9648819 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

2 0.88598031 846 andrew gelman stats-2011-08-09-Default priors update?

3 0.86243826 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters

Introduction: John Lawson writes: I have been experimenting using Bayesian Methods to estimate variance components, and I have noticed that even when I use a noninformative prior, my estimates are never close to the method of moments or REML estimates. In every case I have tried, the sum of the Bayesian estimated variance components is always larger than the sum of the estimates obtained by method of moments or REML. For data sets I have used that arise from a simple one-way random effects model, the Bayesian estimates of the between groups variance component is usually larger than the method of moments or REML estimates. When I use a uniform prior on the between standard deviation (as you recommended in your 2006 paper ) rather than an inverse gamma prior on the between variance component, the between variance component is usually reduced. However, for the dyestuff data in Davies(1949, p74), the opposite appears to be the case. I am a worried that the Bayesian estimators of the varian

4 0.84945315 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

Introduction: For awhile I’ve been fitting most of my multilevel models using lmer/glmer, which gives point estimates of the group-level variance parameters (maximum marginal likelihood estimate for lmer and an approximation for glmer). I’m usually satisfied with this–sure, point estimation understates the uncertainty in model fitting, but that’s typically the least of our worries. Sometimes, though, lmer/glmer estimates group-level variances at 0 or estimates group-level correlation parameters at +/- 1. Typically, when this happens, it’s not that we’re so sure the variance is close to zero or that the correlation is close to 1 or -1; rather, the marginal likelihood does not provide a lot of information about these parameters of the group-level error distribution. I don’t want point estimates on the boundary. I don’t want to say that the unexplained variance in some dimension is exactly zero. One way to handle this problem is full Bayes: slap a prior on sigma, do your Gibbs and Metropolis

5 0.8470158 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable

6 0.83697718 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

7 0.83603221 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

8 0.8025018 1465 andrew gelman stats-2012-08-21-D. Buggin

9 0.80067259 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

10 0.79655248 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

11 0.79110926 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

12 0.78512055 1466 andrew gelman stats-2012-08-22-The scaled inverse Wishart prior distribution for a covariance matrix in a hierarchical model

13 0.78300899 184 andrew gelman stats-2010-08-04-That half-Cauchy prior

14 0.77440959 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

15 0.75571316 1486 andrew gelman stats-2012-09-07-Prior distributions for regression coefficients

16 0.7550928 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

17 0.75042945 247 andrew gelman stats-2010-09-01-How does Bayes do it?

18 0.74211341 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

19 0.74099272 1941 andrew gelman stats-2013-07-16-Priors

20 0.74041402 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.034), (8, 0.018), (15, 0.041), (16, 0.045), (21, 0.02), (24, 0.27), (30, 0.015), (44, 0.012), (55, 0.046), (56, 0.038), (69, 0.014), (79, 0.028), (95, 0.017), (99, 0.289)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97976768 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!

Introduction: Andre de Boer writes: In my profession as a risk manager I encountered this graph: I can’t figure out what kind of regression this is, would you be so kind to enlighten me? The points represent (maturity,yield) of bonds. My reply: That’s a fun problem, reverse-engineering a curve fit! My first guess is lowess, although it seems too flat and asympoty on the right side of the graph to be lowess. Maybe a Gaussian process? Looks too smooth to be a spline. I guess I’ll go with my original guess, on the theory that lowess is the most accessible smoother out there, and if someone fit something much more complicated they’d make more of a big deal about it. On the other hand, if the curve is an automatic output of some software (Excel? Stata?) then it could be just about anything. Does anyone have any ideas?

same-blog 2 0.97762573 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

3 0.9767161 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

Introduction: After learning of a news article by Amy Harmon on problems with medical trials–sometimes people are stuck getting the placebo when they could really use the experimental treatment, and it can be a life-or-death difference, John Langford discusses some fifteen-year-old work on optimal design in machine learning and makes the following completely reasonable point: With reasonable record keeping of existing outcomes for the standard treatments, there is no need to explicitly assign people to a control group with the standard treatment, as that approach is effectively explored with great certainty. Asserting otherwise would imply that the nature of effective treatments for cancer has changed between now and a year ago, which denies the value of any clinical trial. . . . Done the right way, the clinical trial for a successful treatment would start with some initial small pool (equivalent to “phase 1″ in the article) and then simply expanded the pool of participants over time as it

4 0.97654045 197 andrew gelman stats-2010-08-10-The last great essayist?

Introduction: I recently read a bizarre article by Janet Malcolm on a murder trial in NYC. What threw me about the article was that the story was utterly commonplace (by the standards of today’s headlines): divorced mom kills ex-husband in a custody dispute over their four-year-old daughter. The only interesting features were (a) the wife was a doctor and the husband were a dentist, the sort of people you’d expect to sue rather than slay, and (b) the wife hired a hitman from within the insular immigrant community that she (and her husband) belonged to. But, really, neither of these was much of a twist. To add to the non-storyness of it all, there were no other suspects, the evidence against the wife and the hitman was overwhelming, and even the high-paid defense lawyers didn’t seem to be making much of an effort to convince anyone of their client’s innocents. (One of the closing arguments was that one aspect of the wife’s story was so ridiculous that it had to be true. In the lawyer’s wo

5 0.97569078 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

Introduction: Our discussion on data visualization continues. One one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe , who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004 )–but I am concerned that our dialogue with the graphic

6 0.97373676 1240 andrew gelman stats-2012-04-02-Blogads update

7 0.97361201 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

8 0.97281945 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

9 0.97201121 896 andrew gelman stats-2011-09-09-My homework success

10 0.97174948 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox

11 0.97082633 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

12 0.97050947 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

13 0.97003329 2358 andrew gelman stats-2014-06-03-Did you buy laundry detergent on their most recent trip to the store? Also comments on scientific publication and yet another suggestion to do a study that allows within-person comparisons

14 0.96974826 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

15 0.96974194 2247 andrew gelman stats-2014-03-14-The maximal information coefficient

16 0.96946216 2143 andrew gelman stats-2013-12-22-The kluges of today are the textbook solutions of tomorrow.

17 0.96904123 2099 andrew gelman stats-2013-11-13-“What are some situations in which the classical approach (or a naive implementation of it, based on cookbook recipes) gives worse results than a Bayesian approach, results that actually impeded the science?”

18 0.96835661 85 andrew gelman stats-2010-06-14-Prior distribution for design effects

19 0.96831048 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

20 0.96802801 1080 andrew gelman stats-2011-12-24-Latest in blog advertising