andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-916 knowledge-graph by maker-knowledge-mining

916 andrew gelman stats-2011-09-18-Multimodality in hierarchical models


meta info for this blog

Source: html

Introduction: Jim Hodges posted a note to the Bugs mailing list that I thought could be of more general interest: Is multi-modality a common experience? I [Hodges] think the answer is “nobody knows in any generality”. Here are some examples of bimodality that certainly do *not* involve the kind of labeling problems that arise in mixture models. The only systematic study of multimodality I know of is Liu J, Hodges JS (2003). Posterior bimodality in the balanced one-way random effects model. J. Royal Stat. Soc., Ser. B, 65:247-255. The surprise of this paper is that in the simplest possible hierarchical model (analyzed using the standard inverse-gamma priors for the two variances), bimodality occurs quite readily, although it is much less common to have two modes that are big enough so that you’d actually get a noticeable fraction of MCMC draws from both of them. Because the restricted likelihood (= the marginal posterior for the two variances, if you’ve put flat priors on them) is
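The bimodality Hodges describes can be probed directly: for the balanced one-way random-effects model, the restricted likelihood of the two variances is available in closed form, so you can evaluate it on a grid and look for multiple local maxima rather than relying on MCMC to visit both modes. The sketch below is illustrative code (not from the Liu and Hodges paper), assuming the standard REML likelihood for the balanced one-way layout; the data are simulated, and the grid ranges and group sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a balanced one-way random-effects model:
# y_ij = mu + a_i + e_ij,  a_i ~ N(0, s2a),  e_ij ~ N(0, s2e)
m, n = 5, 4                                       # groups, obs per group
a = rng.normal(0.0, 1.0, size=m)
y = 10.0 + a[:, None] + rng.normal(0.0, 1.0, size=(m, n))

# Sufficient statistics for the restricted likelihood
ybar_i = y.mean(axis=1)
ssw = ((y - ybar_i[:, None]) ** 2).sum()          # within-group SS
ssb = n * ((ybar_i - ybar_i.mean()) ** 2).sum()   # between-group SS

def log_restricted_lik(s2a, s2e):
    """Log restricted likelihood (up to a constant) in the two variances.

    For the balanced one-way model this is
      -m(n-1)/2 * log(s2e) - SSW/(2 s2e)
      -(m-1)/2  * log(s2e + n s2a) - SSB/(2 (s2e + n s2a)).
    With flat priors on (s2a, s2e) this is also the log marginal posterior.
    """
    v = s2e + n * s2a
    return (-0.5 * m * (n - 1) * np.log(s2e) - ssw / (2 * s2e)
            - 0.5 * (m - 1) * np.log(v) - ssb / (2 * v))

# Evaluate on a grid (broadcasting builds the full 2-D surface),
# then profile out s2e and count interior local maxima in s2a.
s2a_grid = np.linspace(1e-3, 10.0, 400)
s2e_grid = np.linspace(1e-3, 10.0, 400)
ll = log_restricted_lik(s2a_grid[:, None], s2e_grid[None, :])
profile = ll.max(axis=1)                          # profile over s2e
interior = (profile[1:-1] > profile[:-2]) & (profile[1:-1] > profile[2:])
print("interior local maxima in the s2a profile:", int(interior.sum()))
```

With small m and n and particular draws, this profile can show two interior maxima, which is the phenomenon the paper documents; a contour plot of `ll` makes the geometry easier to see than any mode count.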


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Jim Hodges posted a note to the Bugs mailing list that I thought could be of more general interest: Is multi-modality a common experience? [sent-1, score-0.149]

2 Here are some examples of bimodality that certainly do *not* involve the kind of labeling problems that arise in mixture models. [sent-3, score-0.628]

3 The only systematic study of multimodality I know of is Liu J, Hodges JS (2003). [sent-4, score-0.094]

4 Posterior bimodality in the balanced one-way random effects model. [sent-5, score-0.609]

5 Some algebra and geometry for hierarchical models, applied to diagnostics (with discussion). [sent-14, score-0.298]

6 Here a simple, harmless-looking two-level model with normal errors and random effect had a bimodal posterior. [sent-16, score-0.309]

7 I don’t know what features of the data, model, and priors produced this. [sent-17, score-0.176]

8 My former student Brian Reich also got bimodal posteriors fitting the models and data described in this paper: Reich BJ, Hodges JS, Carlin BP (2007). [sent-18, score-0.242]

9 Spatial analysis of periodontal data using conditionally autoregressive priors having two types of neighbor relations. [sent-19, score-0.506]

10 However, those fits don’t appear in this paper (long story). [sent-21, score-0.098]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('bimodality', 0.471), ('hodges', 0.444), ('js', 0.283), ('reich', 0.188), ('priors', 0.176), ('bimodal', 0.164), ('bf', 0.159), ('royal', 0.137), ('variances', 0.131), ('conflict', 0.115), ('paper', 0.098), ('arise', 0.096), ('wakefield', 0.094), ('bp', 0.094), ('multimodality', 0.094), ('noticeable', 0.094), ('conditionally', 0.089), ('autoregressive', 0.085), ('neighbor', 0.085), ('likelihood', 0.083), ('unimodal', 0.082), ('hierarchical', 0.081), ('geometry', 0.078), ('posteriors', 0.078), ('posterior', 0.077), ('mailing', 0.076), ('model', 0.074), ('generality', 0.074), ('common', 0.073), ('diagnostics', 0.073), ('jon', 0.072), ('random', 0.071), ('two', 0.071), ('readily', 0.07), ('carlin', 0.068), ('balanced', 0.067), ('liu', 0.067), ('algebra', 0.066), ('restricted', 0.066), ('modes', 0.066), ('simplest', 0.064), ('brian', 0.064), ('spatial', 0.063), ('fraction', 0.061), ('labeling', 0.061), ('occurs', 0.061), ('jim', 0.061), ('produces', 0.061), ('journal', 0.06), ('flat', 0.057)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 916 andrew gelman stats-2011-09-18-Multimodality in hierarchical models


2 0.15718329 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

Introduction: Steve Peterson writes: I recently submitted a proposal on applying a Bayesian analysis to gender comparisons on motivational constructs. I had an idea on how to improve the model I used and was hoping you could give me some feedback. The data come from a survey based on 5-point Likert scales. Different constructs are measured for each student as scores derived from averaging a student’s responses on particular subsets of survey questions. (I suppose it is not uncontroversial to treat these scores as interval measures and would be interested to hear if you have any objections.) I am comparing genders on each construct. Researchers typically use t-tests to do so. To use a Bayesian approach I applied the programs written in R and JAGS by John Kruschke for estimating the difference of means: http://www.indiana.edu/~kruschke/BEST/ An issue in that analysis is that the distributions of student scores are not normal. There was skewness in some of the distributions and not always in

3 0.10995239 1459 andrew gelman stats-2012-08-15-How I think about mixture models

Introduction: Larry Wasserman refers to finite mixture models as “beasts” and jokes that they “should be avoided at all costs.” I’ve thought a lot about mixture models, ever since using them in an analysis of voting patterns that was published in 1990. First off, I’d like to say that our model was useful so I’d prefer not to pay the cost of avoiding it. For a quick description of our mixture model and its context, see pp. 379-380 of my article in the Jim Berger volume). Actually, our case was particularly difficult because we were not even fitting a mixture model to data, we were fitting it to latent data and using the model to perform partial pooling. My difficulties in trying to fit this model inspired our discussion of mixture models in Bayesian Data Analysis (page 109 in the second edition, in the section on “Counterexamples to the theorems”). I agree with Larry that if you’re fitting a mixture model, it’s good to be aware of the problems that arise if you try to estimate

4 0.10886344 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

Introduction: A couple days ago we discussed some remarks by Tony O’Hagan and Jim Berger on weakly informative priors. Jim followed up on Deborah Mayo’s blog with this: Objective Bayesian priors are often improper (i.e., have infinite total mass), but this is not a problem when they are developed correctly. But not every improper prior is satisfactory. For instance, the constant prior is known to be unsatisfactory in many situations. The ‘solution’ pseudo-Bayesians often use is to choose a constant prior over a large but bounded set (a ‘weakly informative’ prior), saying it is now proper and so all is well. This is not true; if the constant prior on the whole parameter space is bad, so will be the constant prior over the bounded set. The problem is, in part, that some people confuse proper priors with subjective priors and, having learned that true subjective priors are fine, incorrectly presume that weakly informative proper priors are fine. I have a few reactions to this: 1. I agree

5 0.10642103 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

Introduction: Following up on our discussion of the other day , Nick Firoozye writes: One thing I meant by my initial query (but really didn’t manage to get across) was this: I have no idea what my prior would be on many many models, but just like Utility Theory expects ALL consumers to attach a utility to any and all consumption goods (even those I haven’t seen or heard of), Bayesian Stats (almost) expects the same for priors. (Of course it’s not a religious edict much in the way Utility Theory has, since there is no theory of a “modeler” in the Bayesian paradigm—nonetheless there is still an expectation that we should have priors over all sorts of parameters which mean almost nothing to us). For most models with sufficient complexity, I also have no idea what my informative priors are actually doing and the only way to know anything is through something I can see and experience, through data, not parameters or state variables. My question was more on the—let’s use the prior to come up

6 0.10563207 432 andrew gelman stats-2010-11-27-Neumann update

7 0.10353515 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

8 0.10160786 846 andrew gelman stats-2011-08-09-Default priors update?

9 0.099254683 1941 andrew gelman stats-2013-07-16-Priors

10 0.093887493 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

11 0.093056671 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

12 0.092162788 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

13 0.091048658 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

14 0.091013558 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

15 0.09068206 1486 andrew gelman stats-2012-09-07-Prior distributions for regression coefficients

16 0.087783001 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

17 0.087355122 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

18 0.086579278 1465 andrew gelman stats-2012-08-21-D. Buggin

19 0.085642748 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

20 0.084681854 427 andrew gelman stats-2010-11-23-Bayesian adaptive methods for clinical trials


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.133), (1, 0.115), (2, -0.004), (3, -0.002), (4, -0.008), (5, -0.028), (6, 0.035), (7, -0.051), (8, -0.045), (9, 0.041), (10, 0.034), (11, 0.023), (12, -0.007), (13, 0.008), (14, 0.013), (15, -0.02), (16, 0.022), (17, 0.022), (18, -0.025), (19, 0.008), (20, -0.02), (21, -0.013), (22, 0.008), (23, -0.01), (24, -0.012), (25, 0.01), (26, -0.051), (27, 0.007), (28, 0.031), (29, -0.012), (30, -0.034), (31, -0.022), (32, -0.017), (33, 0.007), (34, -0.002), (35, 0.017), (36, -0.036), (37, -0.025), (38, -0.004), (39, 0.026), (40, -0.007), (41, 0.016), (42, 0.002), (43, 0.02), (44, -0.019), (45, -0.003), (46, -0.044), (47, -0.028), (48, 0.02), (49, -0.023)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95223701 916 andrew gelman stats-2011-09-18-Multimodality in hierarchical models


2 0.83288628 1674 andrew gelman stats-2013-01-15-Prior Selection for Vector Autoregressions

Introduction: Brendan Nyhan sends along this paper by Domenico Giannone, Michele Lenza, and Giorgio Primiceri: Vector autoregressions are flexible time series models that can capture complex dynamic interrelationships among macroeconomic variables. However, their dense parameterization leads to unstable inference and inaccurate out-of-sample forecasts, particularly for models with many variables. A solution to this problem is to use informative priors, in order to shrink the richly parameterized unrestricted model towards a parsimonious naive benchmark, and thus reduce estimation uncertainty. This paper studies the optimal choice of the informativeness of these priors, which we treat as additional parameters, in the spirit of hierarchical modeling. This approach is theoretically grounded, easy to implement, and greatly reduces the number and importance of subjective choices in the setting of the prior. Moreover, it performs very well both in terms of out-of-sample forecasting—as well as factor

3 0.80299181 846 andrew gelman stats-2011-08-09-Default priors update?

Introduction: Ryan King writes: I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance. I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity a

4 0.77941978 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

Introduction: Alexander Volfovsky and Peter Hoff write : ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays t

5 0.77838838 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

Introduction: Anirban Bhattacharya, Debdeep Pati, Natesh Pillai, and David Dunson write : Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the corresponding frequentist literature, very little is known about the properties of such priors. Focusing on a broad class of shrinkage priors, we provide precise results on prior and posterior concentration. Interestingly, we demonstrate that most commonly used shrinkage priors, including the Bayesian Lasso, are suboptimal in hig

6 0.77580917 184 andrew gelman stats-2010-08-04-That half-Cauchy prior

7 0.77074218 1459 andrew gelman stats-2012-08-15-How I think about mixture models

8 0.74949324 1465 andrew gelman stats-2012-08-21-D. Buggin

9 0.74635375 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)

10 0.74164182 1287 andrew gelman stats-2012-04-28-Understanding simulations in terms of predictive inference?

11 0.74033439 1466 andrew gelman stats-2012-08-22-The scaled inverse Wishart prior distribution for a covariance matrix in a hierarchical model

12 0.73569506 1374 andrew gelman stats-2012-06-11-Convergence Monitoring for Non-Identifiable and Non-Parametric Models

13 0.73528826 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

14 0.73513937 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

15 0.72967833 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

16 0.72765803 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

17 0.72337931 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

18 0.71840066 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

19 0.71500361 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

20 0.71368295 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(8, 0.242), (9, 0.017), (15, 0.02), (16, 0.062), (21, 0.026), (24, 0.125), (36, 0.015), (75, 0.019), (86, 0.061), (99, 0.232)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96250403 1822 andrew gelman stats-2013-04-24-Samurai sword-wielding Mormon bishop pharmaceutical statistician stops mugger

Introduction: Brett Keller points us to this feel-good story of the day: A Samurai sword-wielding Mormon bishop helped a neighbor woman escape a Tuesday morning attack by a man who had been stalking her. Kent Hendrix woke up Tuesday to his teenage son pounding on his bedroom door and telling him somebody was being mugged in front of their house. The 47-year-old father of six rushed out the door and grabbed the weapon closest to him — a 29-inch high carbon steel Samurai sword. . . . Hendrix, a pharmaceutical statistician, was one of several neighbors who came to the woman’s aid after she began yelling for help . . . Too bad the whole “statistician” thing got buried in the middle of the article. Fair enough, though: I don’t know what it takes to become a Mormon bishop, but I assume it’s more effort than what it takes to learn statistics.

same-blog 2 0.89972639 916 andrew gelman stats-2011-09-18-Multimodality in hierarchical models


3 0.87517893 1128 andrew gelman stats-2012-01-19-Sharon Begley: Worse than Stephen Jay Gould?

Introduction: Commenter Tggp links to a criticism of science journalist Sharon Begley by science journalist Matthew Hutson. I learned of this dispute after reporting that Begley had received the American Statistical Association’s Excellence in Statistical Reporting Award, a completely undeserved honor, if Hutson is to believed. The two journalists have somewhat similar profiles: Begley was science editor at Newsweek (she’s now at Reuters) and author of “Train Your Mind, Change Your Brain: How a New Science Reveals Our Extraordinary Potential to Transform Ourselves,” and Hutson was news editor at Psychology Today and wrote the similarly self-helpy-titled, “The 7 Laws of Magical Thinking: How Irrational Beliefs Keep Us Happy, Healthy, and Sane.” Hutson writes : Psychological Science recently published a fascinating new study on jealousy. I was interested to read Newsweek’s 1300-word article covering the research by their science editor, Sharon Begley. But part-way through the article, I

4 0.86674726 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.

5 0.86236608 1133 andrew gelman stats-2012-01-21-Judea Pearl on why he is “only a half-Bayesian”

Introduction: In an article published in 2001, Pearl wrote: I [Pearl] turned Bayesian in 1971, as soon as I began reading Savage’s monograph The Foundations of Statistical Inference [Savage, 1962]. The arguments were unassailable: (i) It is plain silly to ignore what we know, (ii) It is natural and useful to cast what we know in the language of probabilities, and (iii) If our subjective probabilities are erroneous, their impact will get washed out in due time, as the number of observations increases. Thirty years later, I [Pearl] am still a devout Bayesian in the sense of (i), but I now doubt the wisdom of (ii) and I know that, in general, (iii) is false. He elaborates: The bulk of human knowledge is organized around causal, not probabilistic relationships, and the grammar of probability calculus is insufficient for capturing those relationships. Specifically, the building blocks of our scientific and everyday knowledge are elementary facts such as “mud does not cause rain” and “symptom

6 0.85261172 1378 andrew gelman stats-2012-06-13-Economists . . .

7 0.8441751 317 andrew gelman stats-2010-10-04-Rob Kass on statistical pragmatism, and my reactions

8 0.83406454 1056 andrew gelman stats-2011-12-13-Drawing to Learn in Science

9 0.83057487 647 andrew gelman stats-2011-04-04-Irritating pseudo-populism, backed up by false statistics and implausible speculations

10 0.81190199 994 andrew gelman stats-2011-11-06-Josh Tenenbaum presents . . . a model of folk physics!

11 0.8020466 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

12 0.79998326 198 andrew gelman stats-2010-08-11-Multilevel modeling in R on a Mac

13 0.79364681 85 andrew gelman stats-2010-06-14-Prior distribution for design effects

14 0.79208171 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism

15 0.78811413 220 andrew gelman stats-2010-08-20-Why I blog?

16 0.77641732 478 andrew gelman stats-2010-12-20-More on why “all politics is local” is an outdated slogan

17 0.76724583 1729 andrew gelman stats-2013-02-20-My beef with Brooks: the alternative to “good statistics” is not “no statistics,” it’s “bad statistics”

18 0.75479239 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

19 0.7537443 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?

20 0.74819404 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis