andrew_gelman_stats-2011-850 knowledge-graph by maker-knowledge-mining

850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model


meta info for this blog

Source: html

Introduction: Ramu Sudhagoni writes: I am working on combining three longitudinal studies using a Bayesian hierarchical technique. In each study, I have at least 70 subjects followed up at 5 different visit months. My model consists of 10 different covariates, including longitudinal and cross-sectional effects. Mixed models were used to fit the three studies individually using a Bayesian approach, and I noticed that a few covariates were significant. When I combined the studies using a three-level hierarchical approach, all the covariates became non-significant at the population level, and large estimates were found for the variance parameters at the population level. I am struggling to understand why I am getting large variances at the population level and wider credible intervals. I assumed non-informative normal priors for all my cross-sectional and longitudinal effects, and non-informative inverse-gamma priors for the variance parameters. I followed the approach explained by Inoue et al. (Title: Combining Longitudinal Studies
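
One generic reason for behavior like this is worth checking before anything else: with only three studies, the between-study variance is barely identified, so its posterior can be very wide, and that uncertainty propagates into every population-level estimate. Below is a minimal Python sketch of the point, a standard random-effects meta-analysis on made-up numbers, not Sudhagoni's actual model.

import numpy as np

beta_hat = np.array([0.8, 1.1, 0.5])    # hypothetical per-study estimates of one effect
se = np.array([0.30, 0.25, 0.35])       # hypothetical per-study standard errors
tau_grid = np.linspace(0.001, 5, 2000)  # grid over the between-study sd tau

def log_marginal(tau):
    # marginal likelihood of the study estimates with the common mean
    # integrated out under a flat prior (standard meta-analysis algebra)
    v = se**2 + tau**2
    w = 1.0 / v
    mu_hat = np.sum(w * beta_hat) / np.sum(w)
    return (-0.5 * np.sum(np.log(v))
            - 0.5 * np.sum(w * (beta_hat - mu_hat) ** 2)
            - 0.5 * np.log(np.sum(w)))

logp = np.array([log_marginal(t) for t in tau_grid])  # flat prior on tau
post = np.exp(logp - logp.max())
dx = tau_grid[1] - tau_grid[0]
post /= post.sum() * dx
cdf = np.cumsum(post) * dx
lo, hi = tau_grid[np.searchsorted(cdf, [0.025, 0.975])]
print(f"95% interval for tau: [{lo:.2f}, {hi:.2f}]")  # very wide with only 3 studies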


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Ramu Sudhagoni writes: I am working on combining three longitudinal studies using a Bayesian hierarchical technique. [sent-1, score-1.247]

2 In each study, I have at least 70 subjects followed up at 5 different visit months. [sent-2, score-0.337]

3 My model consists of 10 different covariates, including longitudinal and cross-sectional effects. [sent-3, score-1.109]

4 Mixed models were used to fit the three studies individually using a Bayesian approach, and I noticed that a few covariates were significant. [sent-4, score-1.172]

5 When I combined the studies using a three-level hierarchical approach, all the covariates became non-significant at the population level, and large estimates were found for the variance parameters at the population level. [sent-5, score-1.886]

6 I am struggling to understand why I am getting large variances at the population level and wider credible intervals. [sent-6, score-1.02]

7 I assumed non-informative normal priors for all my cross-sectional and longitudinal effects, and non-informative inverse-gamma priors for the variance parameters. [sent-7, score-1.528]

8 I followed the approach explained by Inoue et al. [sent-8, score-0.383]

9 My reply: I don’t know, but I’d recommend you graph your data and fitted model so you can try to understand where the estimates are coming from. [sent-10, score-0.561]

10 Also, get rid of those inverse-gamma priors, which aren’t noninformative at all! [sent-11, score-0.227]
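
That last point is easy to verify numerically. A small scipy sketch (mine, not code from the post): the supposedly noninformative inverse-gamma(eps, eps) prior on a variance puts its mass in completely different places depending on eps, so with only three studies it can drive the posterior for a hierarchical variance.

from scipy import stats

# 1% and 99% prior quantiles for a variance under inverse-gamma(eps, eps):
# they move by orders of magnitude as eps changes, so the prior is far from flat
for eps in [1.0, 0.1, 0.01]:
    prior = stats.invgamma(a=eps, scale=eps)
    print(f"eps={eps}: {prior.ppf(0.01):.3g} / {prior.ppf(0.99):.3g}")

A uniform or half-Cauchy prior on the standard deviation, as in several of the related posts below, is the usual replacement.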


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('longitudinal', 0.474), ('covariates', 0.323), ('priors', 0.232), ('combining', 0.196), ('population', 0.187), ('studies', 0.17), ('psa', 0.166), ('level', 0.162), ('three', 0.16), ('approach', 0.158), ('sectional', 0.156), ('variance', 0.15), ('hierarchical', 0.143), ('credible', 0.131), ('consists', 0.128), ('individually', 0.126), ('visit', 0.117), ('struggling', 0.116), ('variances', 0.116), ('rid', 0.114), ('estimates', 0.114), ('cross', 0.113), ('noninformative', 0.113), ('wider', 0.111), ('combined', 0.104), ('using', 0.104), ('large', 0.099), ('understand', 0.098), ('assumed', 0.094), ('mixed', 0.094), ('fitted', 0.093), ('subjects', 0.089), ('bayesian', 0.086), ('became', 0.086), ('explained', 0.082), ('noticed', 0.077), ('normal', 0.077), ('followed', 0.073), ('title', 0.073), ('recommend', 0.072), ('et', 0.07), ('aren', 0.069), ('parameters', 0.067), ('different', 0.066), ('model', 0.065), ('follow', 0.065), ('coming', 0.062), ('graph', 0.057), ('fit', 0.054), ('including', 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

2 0.20022213 1209 andrew gelman stats-2012-03-12-As a Bayesian I want scientists to report their data non-Bayesianly

Introduction: Philipp Doebler writes: I was quite happy that recently you shared some thoughts of yours and others on meta-analysis. I especially liked the slides by Chris Schmid that you linked from your blog. A large portion of my work deals with meta-analysis, and I am also fond of using Bayesian methods (actually two of the projects I am working on are very Bayesian), though I cannot say I have opinions with respect to the underlying philosophy. I would say, though, that I do share your view that there are good reasons to use informative priors. The reason I am writing to you is that this leads to the following dilemma, which is puzzling me. Say a number of scientists conduct similar studies over the years, and all of them did this in a Bayesian fashion. If each of the groups used informative priors based on the research of existing groups, the priors could become more and more informative over the years, since more and more is known about the subject. At least in smallish studies these p
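
Doebler's compounding-prior worry can be made concrete with a toy conjugate model (my sketch, with hypothetical counts): if each group takes the previous posterior as its prior, by the third study the "prior" already carries most of the information.

from scipy import stats

a, b = 1.0, 1.0                          # start from a flat Beta(1, 1) prior
studies = [(12, 20), (7, 15), (18, 25)]  # hypothetical (successes, trials)
for k, (s, n) in enumerate(studies, 1):
    a, b = a + s, b + (n - s)            # conjugate update; posterior becomes next prior
    print(f"after study {k}: prior for the next group is Beta({a:.0f}, {b:.0f}), "
          f"sd = {stats.beta(a, b).std():.3f}")

This is one argument for each study also reporting its raw data or likelihood, so that later analysts can apply priors of their own choosing.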

3 0.18053742 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

Introduction: Yi-Chun Ou writes: I am using a multilevel model with three levels. I read that you wrote a book about multilevel models, and wonder if you can answer the following question. The data structure is like this: Level one: customer (8444 customers) Level two: company (90 companies) Level three: industry (17 industries) I use 6 level-three variables (i.e. industry characteristics) to explain the variance of the level-one effect across industries. The question here is whether there is an over-fitting problem, since there are only 17 industries. I understand that this must be a problem for non-multilevel models, but is it also a problem for multilevel models? My reply: Yes, this could be a problem. I’d suggest combining some of your variables into a common score, or using only some of the variables, or using strong priors to control the inferences. This is an interesting and important area of statistics research, to do this sort of thing systematically. There’s lots o
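
The "combine some of your variables into a common score" suggestion can be as simple as z-scoring and averaging, as in this sketch on simulated numbers (the setup, not Ou's data):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(17, 6))              # 17 industries x 6 characteristics
Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each predictor
score = Z.mean(axis=1)                    # one composite score per industry
print(score.round(2))
# the group-level regression then estimates one coefficient from 17 industries
# instead of six, which is far less prone to overfitting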

4 0.17055047 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

Introduction: Fred Wu writes: I work at National Prescribing Services in Australia. I have a database representing, say, antidiabetic drug utilisation for the entire Australia in the past few years. I planned to do a longitudinal analysis across GP Division Network (112 divisions in AUS) using mixed-effects models (or, as you call them in your book, varying intercepts and varying slopes) on these data. The problem here is: as the data actually represent the population who use antidiabetic drugs in AUS, should I use 112 fixed dummy variables to capture the random variations, or use a varying intercept and varying slope for the model? Because someone may argue that divisions in AUS, like states in the USA, can hardly be considered as drawn from a “superpopulation”, fixed dummies should be used. What I think is that the population is those who use the drugs; but what will happen when the rest need to use them? In terms of exchangeability, using varying intercepts and varying slopes can be justified. Also you provided in y
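
The two options in Wu's question are easy to set side by side on fake data. A sketch using statsmodels (nothing here is his actual drug-utilisation model):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_div, n_per = 112, 20
div = np.repeat(np.arange(n_div), n_per)
x = rng.normal(size=n_div * n_per)
y = 0.5 * x + rng.normal(size=n_div)[div] + rng.normal(size=n_div * n_per)
df = pd.DataFrame({"y": y, "x": x, "division": div})

fixed = smf.ols("y ~ x + C(division)", data=df).fit()             # 112 dummies
varying = smf.mixedlm("y ~ x", data=df, groups="division").fit()  # partial pooling
print(f"slope with fixed dummies: {fixed.params['x']:.3f}; "
      f"with varying intercepts: {varying.params['x']:.3f}")
# the slopes agree closely; the difference is in the division intercepts,
# which the mixed model shrinks toward their common mean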

5 0.16760331 2294 andrew gelman stats-2014-04-17-If you get to the point of asking, just do it. But some difficulties do arise . . .

Introduction: Nelson Villoria writes: I find the multilevel approach very useful for a problem I am dealing with, and I was wondering whether you could point me to some references about poolability tests for multilevel models. I am working with time series of cross-sectional data and I want to test whether the data supports cross-sectional and/or time pooling. In a standard panel data setting I do this with Chow tests and/or CUSUM. Are these ideas directly transferable to the multilevel setting? My reply: I think you should do partial pooling. Once the question arises, just do it. Other models are just special cases. I don’t see the need for any test. That said, if you do a group-level model, you need to consider including group-level averages of individual predictors (see here). And if the number of groups is small, there can be real gains from using an informative prior distribution on the hierarchical variance parameters. This is something that Jennifer and I do not discuss in our
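
The "group-level averages of individual predictors" device mentioned in the reply is a one-liner in pandas (a sketch; the column names are hypothetical):

import pandas as pd

# df has an individual-level predictor x and a group identifier g
df = pd.DataFrame({"g": ["a", "a", "b", "b", "b"],
                   "x": [1.0, 2.0, 0.5, 1.5, 2.5]})
df["x_group_mean"] = df.groupby("g")["x"].transform("mean")
df["x_within"] = df["x"] - df["x_group_mean"]
print(df)
# a multilevel regression can then include x_within at the individual level
# and x_group_mean at the group level, giving each kind of variation its
# own coefficient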

6 0.16656905 846 andrew gelman stats-2011-08-09-Default priors update?

7 0.15986162 1465 andrew gelman stats-2012-08-21-D. Buggin

8 0.15130389 936 andrew gelman stats-2011-10-02-Covariate Adjustment in RCT - Model Overfitting in Multilevel Regression

9 0.13407904 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

10 0.13329208 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

11 0.13142413 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters

12 0.13088244 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

13 0.12925494 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

14 0.12901711 857 andrew gelman stats-2011-08-17-Bayes pays

15 0.1273614 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

16 0.1247328 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

17 0.12377684 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

18 0.11891209 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

19 0.11779556 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

20 0.11744394 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.157), (1, 0.177), (2, 0.053), (3, -0.016), (4, 0.032), (5, 0.0), (6, 0.016), (7, -0.015), (8, -0.042), (9, 0.105), (10, 0.039), (11, 0.007), (12, 0.058), (13, 0.041), (14, 0.071), (15, 0.031), (16, -0.018), (17, 0.011), (18, -0.007), (19, 0.037), (20, -0.039), (21, 0.025), (22, -0.029), (23, 0.024), (24, -0.005), (25, -0.047), (26, -0.047), (27, 0.029), (28, 0.015), (29, 0.044), (30, -0.026), (31, -0.051), (32, -0.014), (33, -0.03), (34, -0.007), (35, 0.055), (36, -0.034), (37, -0.038), (38, 0.024), (39, 0.042), (40, -0.012), (41, -0.001), (42, 0.067), (43, 0.024), (44, -0.04), (45, -0.063), (46, -0.007), (47, 0.02), (48, -0.015), (49, -0.012)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97456908 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

2 0.79276133 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

Introduction: Dean Eckles writes: I remember reading on your blog that you were working on some tools to fit multilevel models that also include “fixed” effects — such as continuous predictors — that are also estimated with shrinkage (for example, an L1 or L2 penalty). Any new developments on this front? I often find myself wanting to fit a multilevel model to some data, but also needing to include a number of “fixed” effects, mainly continuous variables. This makes me wary of overfitting to these predictors, so then I’d want to use some kind of shrinkage. As far as I can tell, the main option for doing this now is to go fully Bayesian and use a Gibbs sampler. With MCMCglmm or BUGS/JAGS I could just specify a prior on the fixed effects that corresponds to a desired penalty. However, this is pretty slow, especially with a large data set and because I’d like to select the penalty parameter by cross-validation (which is where this isn’t very Bayesian, I guess?). My reply: We allow info
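
For the L2 case there is also a quick non-Bayesian route: plain ridge regression with the penalty chosen by cross-validation, which is essentially the workflow Eckles describes. A sketch on simulated data (scikit-learn; not whatever tools the reply goes on to mention):

import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 30))            # 30 continuous "fixed" predictors
y = 0.5 * X[:, 0] + rng.normal(size=500)
fit = RidgeCV(alphas=np.logspace(-2, 3, 20)).fit(X, y)  # penalty picked by CV
print(fit.alpha_, fit.coef_[:3].round(3))
# the same shrinkage arises in a fully Bayesian fit from normal(0, sigma_beta)
# priors on the coefficients, with sigma_beta playing the role of the penalty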

3 0.75746065 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

Introduction: Alexander Volfovsky and Peter Hoff write : ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays t

4 0.75503296 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters

Introduction: John Lawson writes: I have been experimenting with using Bayesian methods to estimate variance components, and I have noticed that even when I use a noninformative prior, my estimates are never close to the method-of-moments or REML estimates. In every case I have tried, the sum of the Bayesian estimated variance components is always larger than the sum of the estimates obtained by method of moments or REML. For data sets I have used that arise from a simple one-way random effects model, the Bayesian estimate of the between-groups variance component is usually larger than the method-of-moments or REML estimate. When I use a uniform prior on the between standard deviation (as you recommended in your 2006 paper) rather than an inverse-gamma prior on the between variance component, the between variance component is usually reduced. However, for the dyestuff data in Davies (1949, p. 74), the opposite appears to be the case. I am worried that the Bayesian estimators of the varian
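
Lawson's comparison is easy to reproduce in stripped-down form. A sketch on simulated one-way data (mine, not his data sets), comparing the method-of-moments estimate of the between-group sd tau with posterior means under a uniform prior on tau and an inverse-gamma(0.001, 0.001) prior on tau^2; to keep it one-dimensional, the within-group variance is fixed at its pooled estimate and the grand mean is integrated out under a flat prior.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
J, n = 6, 10
y = 0.5 * rng.normal(size=J)[:, None] + rng.normal(size=(J, n))  # true tau = 0.5

ybar = y.mean(axis=1)
s2_within = y.var(axis=1, ddof=1).mean()               # pooled within-group variance
tau2_mom = max(ybar.var(ddof=1) - s2_within / n, 0.0)  # method of moments

tau = np.linspace(1e-4, 4, 4000)
v = tau**2 + s2_within / n
S = (J - 1) * ybar.var(ddof=1)
loglik = -0.5 * (J - 1) * np.log(v) - 0.5 * S / v      # grand mean integrated out
dx = tau[1] - tau[0]
priors = {"uniform on tau": np.zeros_like(tau),
          "inv-gamma on tau^2": stats.invgamma(0.001, scale=0.001).logpdf(tau**2)
                                + np.log(2 * tau)}     # Jacobian for tau^2 -> tau
for name, logprior in priors.items():
    lp = loglik + logprior
    post = np.exp(lp - lp.max())
    post /= post.sum() * dx
    print(f"{name}: posterior mean of tau = {(tau * post).sum() * dx:.3f}")
print(f"method of moments: tau = {tau2_mom ** 0.5:.3f}")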

5 0.75394058 846 andrew gelman stats-2011-08-09-Default priors update?

Introduction: Ryan King writes: I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance. I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity a

6 0.75323212 1209 andrew gelman stats-2012-03-12-As a Bayesian I want scientists to report their data non-Bayesianly

7 0.750135 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

8 0.74646866 1465 andrew gelman stats-2012-08-21-D. Buggin

9 0.73895514 1102 andrew gelman stats-2012-01-06-Bayesian Anova found useful in ecology

10 0.73197275 1674 andrew gelman stats-2013-01-15-Prior Selection for Vector Autoregressions

11 0.7220962 2033 andrew gelman stats-2013-09-23-More on Bayesian methods and multilevel modeling

12 0.72163701 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

13 0.71985817 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

14 0.71844459 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

15 0.71450084 2294 andrew gelman stats-2014-04-17-If you get to the point of asking, just do it. But some difficulties do arise . . .

16 0.71449035 184 andrew gelman stats-2010-08-04-That half-Cauchy prior

17 0.70941323 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

18 0.70285225 1877 andrew gelman stats-2013-05-30-Infill asymptotics and sprawl asymptotics

19 0.70048416 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

20 0.6969884 851 andrew gelman stats-2011-08-12-year + (1|year)


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.017), (16, 0.029), (21, 0.02), (24, 0.147), (71, 0.016), (86, 0.036), (89, 0.214), (99, 0.401)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98467207 1855 andrew gelman stats-2013-05-13-Stan!

Introduction: Guy Freeman writes: I thought you’d all like to know that Stan was used and referenced in a peer-reviewed Rapid Communications paper on influenza. Thank you for this excellent modelling language and sampler, which made it possible to carry out this work quickly! I haven’t actually read the paper, but I’m happy to see Stan getting around like that.

2 0.98313928 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit

Introduction: It worked on this one. Good maze designers know this trick and are careful to design multiple branches in each direction. Back when I was in junior high, I used to make huge mazes, and the basic idea was to anticipate what the solver might try to do and to make the maze difficult by postponing the point at which he would realize a path was going nowhere. For example, you might have 6 branches: one dead end, two pairs that form loops going back to the start, and one that is the correct solution. You do this from both directions and add some twists and turns, and there you are. But the maze designer aiming for the naive solver–the sap who starts from the entrance and goes toward the exit–can simplify matters by just having 6 branches: five dead ends and one winner. This sort of thing is easy to solve in the reverse direction. I’m surprised the Times didn’t do better for their special puzzle issue.
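
The trick itself is just graph search run from the other end: breadth-first search does not care which endpoint you call the start, so handing it the exit solves the maze "backwards". A toy sketch (maze and code mine):

from collections import deque

maze = ["#########",
        "#S..#...#",
        "##.##.#.#",
        "#..#..#.#",
        "#.##.##.#",
        "#....#.E#",
        "#########"]

def solve_backwards(maze):
    grid = [list(row) for row in maze]
    find = lambda c: next((r, k) for r, row in enumerate(grid)
                          for k, ch in enumerate(row) if ch == c)
    start, exit_ = find("S"), find("E")
    q, seen = deque([(exit_, [exit_])]), {exit_}  # the search begins at the exit
    while q:
        cell, path = q.popleft()
        if cell == start:
            return path[::-1]                     # reverse to read entrance -> exit
        r, k = cell
        for nxt in ((r + 1, k), (r - 1, k), (r, k + 1), (r, k - 1)):
            if grid[nxt[0]][nxt[1]] != "#" and nxt not in seen:
                seen.add(nxt)
                q.append((nxt, path + [nxt]))

print(solve_backwards(maze))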

3 0.98151529 1756 andrew gelman stats-2013-03-10-He said he was sorry

Introduction: Yes, it can be done : Hereby I contact you to clarify the situation that occurred with the publication of the article entitled *** which was published in Volume 11, Issue 3 of *** and I made the mistake of declaring as an author. This chapter is a plagiarism of . . . I wish to express and acknowledge that I am solely responsible for this . . . I recognize the gravity of the offense committed, since there is no justification for so doing. Therefore, and as a sign of shame and regret I feel in this situation, I will publish this letter, in order to set an example for other researchers do not engage in a similar error. No more, and to please accept my apologies, Sincerely, *** P.S. Since we’re on Retraction Watch already, I’ll point you to this unrelated story featuring a hilarious photo of a fraudster, who in this case was a grad student in psychology who faked his data and “has agreed to submit to a three-year supervisory period for any work involving funding from the

4 0.97962707 833 andrew gelman stats-2011-07-31-Untunable Metropolis

Introduction: Michael Margolis writes: What are we to make of it when a Metropolis-Hastings step just won’t tune? That is, the acceptance rate is zero at expected-jump-size X, and way above 1/2 at X-exp(-16) (i.e., machine precision). I’ve solved my practical problem by writing that I would have liked to include results from a diffuse prior, but couldn’t. But I’m bothered by the poverty of my intuition. And since everything I’ve read says this is an issue of efficiency, rather than accuracy, I wonder if I could solve it just by running massive and heavily thinned chains. My reply: I can’t see how this could happen in a well-specified problem! I suspect it’s a bug. Otherwise try rescaling your variables so that your parameters will have values on the order of magnitude of 1. To which Margolis responded: I hardly wrote any of the code, so I can’t speak to the bug question — it’s binomial kriging from the R package geoRglm. And there are no covariates to scale — just the zero and one
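
For readers wondering what is being "tuned" here: a generic random-walk Metropolis adapts its step size during warmup until the acceptance rate is sensible. A sketch with a standard-normal target (my code and numbers, nothing from geoRglm):

import numpy as np

def metropolis(logpost, x0, step, n_iter=5000, n_warmup=1000, seed=0):
    rng = np.random.default_rng(seed)
    x, lp = x0, logpost(x0)
    draws, accepts = [], 0
    for i in range(n_iter):
        prop = x + step * rng.normal()            # random-walk proposal
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            x, lp, accepts = prop, lp_prop, accepts + 1
        if i < n_warmup and (i + 1) % 100 == 0:   # crude warmup adaptation
            step *= 1.1 if accepts / (i + 1) > 0.44 else 0.9
        draws.append(x)
    return np.array(draws[n_warmup:]), step

draws, step = metropolis(lambda x: -0.5 * x**2, x0=0.0, step=10.0)
print(f"adapted step {step:.2f}; draws mean {draws.mean():.2f}, sd {draws.std():.2f}")
# if no step size in between gives a sane acceptance rate, a bug or badly
# scaled parameters is the likelier culprit, as the reply suggests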

5 0.97939312 566 andrew gelman stats-2011-02-09-The boxer, the wrestler, and the coin flip, again

Introduction: Mike Grosskopf writes: I came across your blog the other day and noticed your paper about “The Boxer, the Wrestler, and the Coin Flip” . . . I do not understand the objection to the robust Bayesian inference for conditioning on X=Y in the problem as you describe in the paper. The paper talks about how using robust Bayes when conditioning on X=Y “degrades our inference about the coin flip” and “has led us to the claim that we can say nothing at all about the coin flip”. Does that have to be the case, however? While conditioning on X=Y does mean that p({X=1}|{X=Y}I) = p({Y=1}|{X=Y}I), I don’t see why it has to mean that both have the same π-distribution where Pr(Y = 1) = π. Which type of inference is being done about Y in the problem? If you are trying to make an inference on the result of the fight between the boxer and the wrestler that has already happened, in which your friend tells you that either the boxer won and he flipped heads with a coin or the boxer lost a

same-blog 6 0.97189581 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

7 0.96826577 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

8 0.96582639 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

9 0.96522301 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

10 0.9624688 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book

11 0.9616558 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity

12 0.96099722 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

13 0.95700932 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)

14 0.95422304 1903 andrew gelman stats-2013-06-17-Weak identification provides partial information

15 0.95183039 1215 andrew gelman stats-2012-03-16-The “hot hand” and problems with hypothesis testing

16 0.94862354 1708 andrew gelman stats-2013-02-05-Wouldn’t it be cool if Glenn Hubbard were consulting for Herbalife and I were on the other side?

17 0.9434967 1290 andrew gelman stats-2012-04-30-I suppose it’s too late to add Turing’s run-around-the-house-chess to the 2012 London Olympics?

18 0.94293231 2267 andrew gelman stats-2014-03-26-Is a steal really worth 9 points?

19 0.93989205 1702 andrew gelman stats-2013-02-01-Don’t let your standard errors drive your research agenda

20 0.93722868 1839 andrew gelman stats-2013-05-04-Jesus historian Niall Ferguson and the improving standards of public discourse