andrew_gelman_stats-2011-801 knowledge-graph by maker-knowledge-mining

801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter


meta info for this blog

Source: html

Introduction: Nick Polson and James Scott write: We generalize the half-Cauchy prior for a global scale parameter to the wider class of hypergeometric inverted-beta priors. We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. Finally, we prove a result that characterizes the frequentist risk of the Bayes estimators under all priors in the class. These arguments provide an alternative, classical justification for the use of the half-Cauchy prior in Bayesian hierarchical models, complementing the arguments in Gelman (2006). This makes me happy, of course. It’s great to be validated. The only thing I didn’t catch is how they set the scale parameter for the half-Cauchy prior. In my 2006 paper I frame it as a weakly informative prior and recommend that the scale be set based on actual prior knowledge. But Polson and Scott are talking about a default choice. I used to think that such a . . .
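Since the post turns on how the half-Cauchy scale is set, here is a minimal Python/scipy sketch (an editorial illustration, not from the post) of what different scale choices imply for the prior on a group-level scale tau; the values of A tried below are assumptions, though Gelman (2006) uses A = 25 for the eight-schools example.

```python
from scipy import stats

# What a half-Cauchy(0, A) prior on a group-level scale tau implies
# for a few choices of A. The choices of A are assumptions; Gelman
# (2006) uses A = 25 for the eight schools, while Polson and Scott
# study a default choice.
for A in (1.0, 5.0, 25.0):
    hc = stats.halfcauchy(scale=A)
    print(f"A = {A:5.1f}  median = {hc.median():6.2f}  "
          f"P(tau > 10A) = {hc.sf(10 * A):.3f}")
```

The heavy tail is the point: whatever A is, the prior keeps roughly 6% of its mass beyond 10A, so the likelihood can still pull tau far out when the data demand it.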


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Nick Polson and James Scott write: We generalize the half-Cauchy prior for a global scale parameter to the wider class of hypergeometric inverted-beta priors. [sent-1, score-0.997]

2 We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. [sent-2, score-1.187]

3 Finally, we prove a result that characterizes the frequentist risk of the Bayes estimators under all priors in the class. [sent-3, score-0.7]

4 These arguments provide an alternative, classical justification for the use of the half-Cauchy prior in Bayesian hierarchical models, complementing the arguments in Gelman (2006). [sent-4, score-0.897]

5 The only thing I didn’t catch is how they set the scale parameter for the half-Cauchy prior. [sent-7, score-0.515]

6 In my 2006 paper I frame it as a weakly informative prior and recommend that the scale be set based on actual prior knowledge. [sent-8, score-0.988]

7 But Polson and Scott are talking about a default choice. [sent-9, score-0.245]

8 I used to think that such a default would not really be possible but given our recent success with automatic priors for regularized point estimates, now I’m thinking that a reasonable default might be possible in the full Bayes case too. [sent-10, score-1.296]

9 I found the above article while looking on Polson’s site for this excellent paper , which considers in a more theoretical way some of the themes that Jennifer, Masanao, and I are exploring in our research on hierarchical models and multiple comparisons. [sent-13, score-0.749]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('polson', 0.444), ('default', 0.245), ('priors', 0.22), ('prior', 0.213), ('hierarchical', 0.203), ('scale', 0.201), ('scott', 0.187), ('hypergeometric', 0.157), ('arguments', 0.149), ('bayes', 0.138), ('regularized', 0.137), ('parameter', 0.136), ('masanao', 0.121), ('expressions', 0.119), ('estimators', 0.117), ('derive', 0.117), ('densities', 0.112), ('considers', 0.111), ('moments', 0.11), ('justification', 0.11), ('nick', 0.108), ('automatic', 0.107), ('characterizes', 0.106), ('wider', 0.105), ('frame', 0.102), ('exploring', 0.102), ('themes', 0.1), ('generalize', 0.1), ('weakly', 0.098), ('prove', 0.096), ('possible', 0.096), ('set', 0.089), ('catch', 0.089), ('frequentist', 0.088), ('marginal', 0.086), ('global', 0.085), ('jennifer', 0.083), ('site', 0.082), ('bayesian', 0.082), ('models', 0.079), ('used', 0.076), ('james', 0.074), ('success', 0.074), ('risk', 0.073), ('normal', 0.073), ('classical', 0.073), ('alternative', 0.072), ('informative', 0.072), ('excellent', 0.072), ('variance', 0.071)]
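As a rough illustration of where lists like the one above could come from, here is a minimal scikit-learn sketch; the toy corpus and the sentence-scoring rule are assumptions for illustration, not the miner's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for the miner's corpus: each "document" is one sentence
# of the post. Everything here (corpus, scoring rule) is an assumption.
sentences = [
    "We generalize the half-Cauchy prior for a global scale parameter.",
    "We derive expressions for posterior moments and marginal densities.",
    "But Polson and Scott are talking about a default choice.",
]
vec = TfidfVectorizer()
X = vec.fit_transform(sentences)  # rows: sentences, columns: words

# Per-word weights, analogous to the (wordName, wordTfidf) list above.
weights = X.sum(axis=0).A1
top = sorted(zip(vec.get_feature_names_out(), weights),
             key=lambda pair: -pair[1])[:5]
print(top)

# One plausible sentence score, as in the summary above: the total
# tf-idf weight of the words in the sentence.
print(X.sum(axis=1).A1)
```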

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter


2 0.29973304 846 andrew gelman stats-2011-08-09-Default priors update?

Introduction: Ryan King writes: I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance. I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity a

3 0.26372305 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

Introduction: A couple days ago we discussed some remarks by Tony O’Hagan and Jim Berger on weakly informative priors. Jim followed up on Deborah Mayo’s blog with this: Objective Bayesian priors are often improper (i.e., have infinite total mass), but this is not a problem when they are developed correctly. But not every improper prior is satisfactory. For instance, the constant prior is known to be unsatisfactory in many situations. The ‘solution’ pseudo-Bayesians often use is to choose a constant prior over a large but bounded set (a ‘weakly informative’ prior), saying it is now proper and so all is well. This is not true; if the constant prior on the whole parameter space is bad, so will be the constant prior over the bounded set. The problem is, in part, that some people confuse proper priors with subjective priors and, having learned that true subjective priors are fine, incorrectly presume that weakly informative proper priors are fine. I have a few reactions to this: 1. I agree

4 0.22342274 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

Introduction: For a while I’ve been fitting most of my multilevel models using lmer/glmer, which gives point estimates of the group-level variance parameters (maximum marginal likelihood estimate for lmer and an approximation for glmer). I’m usually satisfied with this–sure, point estimation understates the uncertainty in model fitting, but that’s typically the least of our worries. Sometimes, though, lmer/glmer estimates group-level variances at 0 or estimates group-level correlation parameters at +/- 1. Typically, when this happens, it’s not that we’re so sure the variance is close to zero or that the correlation is close to 1 or -1; rather, the marginal likelihood does not provide a lot of information about these parameters of the group-level error distribution. I don’t want point estimates on the boundary. I don’t want to say that the unexplained variance in some dimension is exactly zero. One way to handle this problem is full Bayes: slap a prior on sigma, do your Gibbs and Metropolis . . . (A small numeric sketch of this boundary problem appears just after this list.)

5 0.21863645 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

Introduction: I’ve had a couple of email conversations in the past couple days on dependence in multivariate prior distributions. Modeling the degrees of freedom and scale parameters in the t distribution First, in our Stan group we’ve been discussing the choice of priors for the degrees-of-freedom parameter in the t distribution. I wrote that also there’s the question of parameterization. It does not necessarily make sense to have independent priors on the df and scale parameters. In some sense, the meaning of the scale parameter changes with the df. Prior dependence between correlation and scale parameters in the scaled inverse-Wishart model The second case of parameterization in prior distribution arose from an email I received from Chris Chatham pointing me to this exploration by Matt Simpson of the scaled inverse-Wishart prior distribution for hierarchical covariance matrices. Simpson writes: A popular prior for Σ is the inverse-Wishart distribution [ not the same as the

6 0.19918215 1941 andrew gelman stats-2013-07-16-Priors

7 0.19871613 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

8 0.19732815 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable

9 0.19692987 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

10 0.19419967 1155 andrew gelman stats-2012-02-05-What is a prior distribution?

11 0.18558571 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

12 0.18276007 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

13 0.17922637 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters

14 0.17594297 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

15 0.17433551 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

16 0.17363903 1486 andrew gelman stats-2012-09-07-Prior distributions for regression coefficients

17 0.17286497 1465 andrew gelman stats-2012-08-21-D. Buggin

18 0.17037268 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

19 0.15269864 1564 andrew gelman stats-2012-11-06-Choose your default, or your default will choose you (election forecasting edition)

20 0.15140225 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model
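The boundary-estimate problem in entry 4 above can be shown numerically. The sketch below is an illustration under assumptions: it uses the standard eight-schools data, profiles the marginal likelihood of the group-level scale tau with the mean profiled out, and stands in a gamma(2, .) boundary-avoiding prior for the regularization the excerpt alludes to (the post itself ends by recommending full Bayes instead).

```python
import numpy as np
from scipy import stats

# Eight-schools data: estimated effects and their standard errors.
y = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
s = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

def profile_loglik(tau):
    # Marginal model with mu profiled out: y_j ~ N(mu_hat, s_j^2 + tau^2).
    v = s ** 2 + tau ** 2
    mu_hat = np.sum(y / v) / np.sum(1.0 / v)
    return stats.norm.logpdf(y, mu_hat, np.sqrt(v)).sum()

taus = np.linspace(0.0, 30.0, 601)
ll = np.array([profile_loglik(t) for t in taus])
# A gamma(2, rate) prior on tau is zero at tau = 0, so the penalized
# maximizer cannot sit on the boundary. The rate here is an assumption.
pen = ll + stats.gamma.logpdf(taus, a=2, scale=25.0)

print("profile-likelihood tau-hat:", taus[ll.argmax()])   # at the boundary for these data
print("penalized tau-hat:         ", taus[pen.argmax()])  # pulled into the interior
```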


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.173), (1, 0.226), (2, -0.018), (3, 0.069), (4, -0.068), (5, -0.069), (6, 0.141), (7, 0.012), (8, -0.212), (9, 0.074), (10, 0.015), (11, 0.008), (12, 0.089), (13, 0.063), (14, 0.061), (15, -0.005), (16, -0.022), (17, 0.016), (18, 0.019), (19, 0.017), (20, -0.044), (21, -0.007), (22, 0.02), (23, 0.051), (24, 0.016), (25, -0.025), (26, 0.005), (27, 0.011), (28, -0.009), (29, -0.027), (30, -0.012), (31, -0.025), (32, 0.059), (33, -0.033), (34, -0.053), (35, -0.016), (36, 0.012), (37, -0.016), (38, -0.006), (39, 0.026), (40, -0.033), (41, 0.033), (42, 0.024), (43, 0.002), (44, -0.047), (45, -0.005), (46, -0.046), (47, -0.021), (48, -0.015), (49, -0.016)]
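The dense (topicId, topicWeight) vector above is the form gensim's LSI model produces. A minimal sketch of how such vectors and the cosine-similarity list below could be computed; the toy three-document corpus and num_topics=2 are assumptions, the real corpus being the full set of blog posts.

```python
from gensim import corpora, models, similarities

# Toy corpus; the real one is the full blog archive.
docs = [
    "half cauchy prior global scale parameter hierarchical model",
    "default priors hierarchical generalized linear mixed models",
    "weakly informative priors improper constant prior bounded set",
]
texts = [d.split() for d in docs]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lsi = models.LsiModel(corpus, id2word=dictionary, num_topics=2)
index = similarities.MatrixSimilarity(lsi[corpus])

query = lsi[dictionary.doc2bow(docs[0].split())]
print(query)               # dense [(topicId, topicWeight), ...] as above
print(list(index[query]))  # cosine similarities; the same doc scores ~1.0
```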

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98796171 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter


2 0.92926121 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable

Introduction: David Kessler, Peter Hoff, and David Dunson write: Marginally specified priors for nonparametric Bayesian estimation. Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects of such a parameter, but may have real information about functionals of the parameter, such as the population mean or variance. This article proposes a new framework for nonparametric Bayes inference in which the prior distribution for a possibly infinite-dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a nonparametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard nonparametric prior distributions in common use, and inherit the large support of the standard priors upon which they are based. Ad

3 0.91896504 846 andrew gelman stats-2011-08-09-Default priors update?


4 0.86735171 468 andrew gelman stats-2010-12-15-Weakly informative priors and imprecise probabilities

Introduction: Giorgio Corani writes: Your work on weakly informative priors is close to some research I [Corani] did (together with Prof. Zaffalon) in recent years using the so-called imprecise probabilities. The idea is to work with a set of priors (containing even very different priors); to update them via Bayes’ rule and then compute a set of posteriors. The set of priors is convex and the priors are Dirichlet (thus, conjugate to the likelihood); this allows one to compute the set of posteriors exactly and efficiently. I [Corani] have used this approach for classification, extending naive Bayes and TAN to imprecise probabilities. Classifiers based on imprecise probabilities return more classes when they find that the most probable class is prior-dependent, i.e., if picking different priors in the convex set leads to identifying different classes as the most probable one. Instead of returning a single (unreliable) prior-dependent class, credal classifiers in this case preserve reliability by . . . (A two-class sketch of this idea appears after this list.)

5 0.85835415 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors


6 0.85097349 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

7 0.85004568 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

8 0.84176797 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

9 0.84041631 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters

10 0.83404654 1155 andrew gelman stats-2012-02-05-What is a prior distribution?

11 0.82898587 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

12 0.80569494 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

13 0.7912727 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

14 0.78919059 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”

15 0.78760576 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

16 0.77746409 1465 andrew gelman stats-2012-08-21-D. Buggin

17 0.77650553 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

18 0.77140701 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

19 0.75768304 1466 andrew gelman stats-2012-08-22-The scaled inverse Wishart prior distribution for a covariance matrix in a hierarchical model

20 0.75293827 1941 andrew gelman stats-2013-07-16-Priors
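To make entry 4's imprecise-probability idea concrete, here is a minimal two-class sketch: update every Beta prior in a convex set and report the resulting range of posterior class probabilities. The prior strength s = 2 and the toy counts are assumptions, loosely following Walley's imprecise Dirichlet model rather than Corani's actual classifiers.

```python
from scipy import stats

# Two-class version of the convex-set-of-priors update: the set of
# Beta(s*t, s*(1-t)) priors, t in [0, 1], with prior strength s = 2.
# s and the toy counts are assumptions for illustration.
successes, failures = 3, 1
s = 2.0
post_means = [stats.beta(s * t + successes, s * (1 - t) + failures).mean()
              for t in (0.0, 1.0)]  # extreme points suffice: means are monotone in t
lo, hi = min(post_means), max(post_means)
print(f"posterior P(class 1) lies in [{lo:.3f}, {hi:.3f}]")
# A credal classifier returns both classes whenever the 'most probable
# class' flips somewhere in this interval, instead of a single guess.
```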


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.038), (16, 0.037), (17, 0.015), (21, 0.011), (24, 0.278), (33, 0.03), (36, 0.013), (55, 0.015), (63, 0.014), (73, 0.152), (86, 0.027), (87, 0.013), (90, 0.015), (95, 0.014), (99, 0.233)]
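The sparse (topicId, topicWeight) vector above matches gensim's LDA output, which drops topics below a probability threshold (hence the missing topicIds). A minimal sketch along the lines of the lsi one above; the toy corpus, num_topics, and random_state are assumptions.

```python
from gensim import corpora, models

# Same toy-corpus caveat as the lsi sketch above.
texts = [["half", "cauchy", "prior", "scale"],
         ["default", "priors", "hierarchical", "models"],
         ["stan", "python", "open", "source"]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = models.LdaModel(corpus, id2word=dictionary, num_topics=4,
                      random_state=0)
# Sparse [(topicId, topicWeight), ...]; low-weight topics are dropped,
# which is why only some topicIds show up in lists like the one above.
print(lda.get_document_topics(corpus[0], minimum_probability=0.01))
```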

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97089076 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter


2 0.95738953 917 andrew gelman stats-2011-09-20-Last post on Hipmunk

Introduction: There was some confusion on my last try , so let me explain one more time . . . The flights where Hipmunk failed (see here for background) were not obscure itineraries. One of them was a nonstop from New York to Cincinnati; another was from NY to Durham, North Carolina; and yet another was a trip to Midway in Chicago. In that last case, Hipmunk showed no nonstops at all—which will come as a surprise to the passengers on the Southwest Airlines flight I was on a couple days ago! In these cases, Hipmunk didn’t even do the courtesy of flashing a message telling me to try elsewhere. I don’t understand. How hard would it be for the program to automatically do a Kayak search and find all the flights? Hipmunk’s graphics are great, though. Lee Wilkinson reports: Check out the figure below from The Grammar of Graphics. Dan Rope invented this graphic and programmed it in Java in the late 1990′s. We shopped this graph around to Orbitz and Expedia but they weren’t interested. So I

3 0.94474208 1748 andrew gelman stats-2013-03-04-PyStan!

Introduction: Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely free and open-source. P.S. Because Stan is open-source, it of course would also be possible for people to translate Stan into Python, or to take whatever features they like from Stan and incorporate them into a Python package. That’s fine too. But we think it would make sense in addition for users to be able to run Stan directly from Python, in the same way that it can be run from R. (A sketch of calling Stan from Python appears at the end of this page.)

4 0.93267584 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”

Introduction: Tim Hanson sends along this paper (coauthored with Adam Branscum and Wesley Johnson): Eliciting information from experts for use in constructing prior distributions for logistic regression coefficients can be challenging. The task is especially difficult when the model contains many predictor variables, because the expert is asked to provide summary information about the probability of “success” for many subgroups of the population. Often, however, experts are confident only in their assessment of the population as a whole. This paper is about incorporating such overall, marginal or averaged, information easily into a logistic regression data analysis by using g-priors. We present a version of the g-prior such that the prior distribution on the probability of success can be set to closely match a beta distribution, when averaged over the set of predictors in a logistic regression. A simple data augmentation formulation that can be implemented in standard statistical software pac

5 0.92717779 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good

Introduction: This post is by David K. Park and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. And they failed miserably. An article from opennews.org outlines six major fallacies Buzzfeed committed, the best of which resulted in the Kansas effect: “Pornhub’s writeup omitted any explicit description of their methodology—this is never a good sign—but it seems to have involved mapping the IP addresses from which users visited the site to physical addresses and reverse geocoding those to get states…. a large percentage of IP addresses could not be resolved to an address any more specific than “USA.” When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of—you guessed it—Kansas!” As a result, Kansas was 2.95 std dev above the mean. Those pervs! from: https://source.opennews.org/en-US/learning/distrust-your-data/

6 0.91266894 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

7 0.9125762 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

8 0.91221118 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

9 0.91108876 197 andrew gelman stats-2010-08-10-The last great essayist?

10 0.91046071 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

11 0.90982556 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

12 0.90874374 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine

13 0.90834773 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

14 0.9082309 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

15 0.90816581 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

16 0.90791762 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

17 0.90738928 2143 andrew gelman stats-2013-12-22-The kluges of today are the textbook solutions of tomorrow.

18 0.90673685 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys

19 0.90646315 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

20 0.90619791 1224 andrew gelman stats-2012-03-21-Teaching velocity and acceleration
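Following up the PyStan entry above: a minimal sketch of calling Stan from Python through the pystan 2.x interface. The model here (a normal mean with a half-Cauchy prior on the scale, echoing the main post) and the toy data are illustrative assumptions.

```python
import pystan

# Minimal model echoing the main post: a half-Cauchy prior on a scale
# parameter, implemented via a Cauchy density plus a <lower=0> constraint.
model_code = """
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  sigma ~ cauchy(0, 5);   // half-Cauchy(0, 5); the scale 5 is an assumption
  y ~ normal(mu, sigma);
}
"""
sm = pystan.StanModel(model_code=model_code)
fit = sm.sampling(data={"N": 4, "y": [2.1, -0.3, 1.4, 0.7]},
                  iter=1000, chains=4)
print(fit)  # posterior summaries for mu and sigma
```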