andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-753 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Zoltan Fazekas writes: I am a 2nd year graduate student in political science at the University of Vienna. In my empirical research I often employ multilevel modeling, and recently I came across a situation that kept me wondering for quite a while. As I did not find much on this in the literature, and considering the topics that you work on and blog about, I figured I would try to contact you. The situation is as follows: in a linear multilevel model, there are two important individual level predictors (x1 and x2) and a set of controls. Let us assume that there is a theoretically grounded argument suggesting that an interaction between x1 and x2 should be included in the model (x1 * x2). Both x1 and x2 are allowed to vary randomly across groups. Would this directly imply that the coefficient of the interaction should also be allowed to vary across countries? This is even more pressing if there is no specific hypothesis on the variance of the conditional effect across countries. And then i
sentIndex sentText sentNum sentScore
1 In my empirical research I often employ multilevel modeling, and recently I came across a situation that kept me wondering for quite a while. [sent-2, score-0.512]
2 The situation is as follows: in a linear multilevel model, there are two important individual level predictors (x1 and x2) and a set of controls. [sent-4, score-0.619]
3 Let us assume that there is a theoretically grounded argument suggesting that an interaction between x1 and x2 should be included in the model (x1 * x2). [sent-5, score-0.478]
4 Both x1 and x2 are allowed to vary randomly across groups. [sent-6, score-0.567]
5 Would this directly imply that the coefficient of the interaction should also be allowed to vary across countries? [sent-7, score-0.857]
6 This is even more pressing if there is no specific hypothesis on the variance of the conditional effect across countries. [sent-8, score-0.362]
7 And then if we add predictors on the second level for the x1 and x2 slopes, would these also be expected to influence the coefficient of the interaction between x1 and x2? [sent-9, score-1.012]
8 The last step refers to the situation in which x1 is modeled as a function of u1 on the second level, whereas x2 is modeled as a function of u2, making the assumption that x1 is independent of u2, and x2 of u1. [sent-10, score-0.457]
9 If the coefficient of the x1*x2 interaction is allowed to vary, we would have three varying parameters. [sent-11, score-0.583]
10 I came across this situation in research carried out with a colleague, in which political information was the response variable and education and media exposure (for example) were the two variables of interest. [sent-15, score-0.395]
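To make the setup concrete, here is a minimal sketch in R using lme4 (the package behind lmer(), which comes up later in the exchange). It is not taken from the original analysis: the data frame and variable names (dat, info for political information, educ, media, country) are hypothetical stand-ins, and the centering step anticipates the advice in the reply below.

library(lme4)

# Grand-mean center the two predictors of interest (hypothetical data frame 'dat')
dat$educ_c  <- dat$educ  - mean(dat$educ,  na.rm = TRUE)
dat$media_c <- dat$media - mean(dat$media, na.rm = TRUE)

# Varying intercept and varying slopes; the interaction coefficient is held
# constant across countries
m1 <- lmer(info ~ educ_c * media_c + (1 + educ_c + media_c | country), data = dat)

# Same model, but also letting the interaction coefficient vary across countries
m2 <- lmer(info ~ educ_c * media_c + (1 + educ_c * media_c | country), data = dat)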
11 My reply: First off, except in unusual cases you should center (or approximately center) x1 and x2 before putting them in the model and letting their coefficients vary. [sent-17, score-0.343]
12 Second, if these coefs vary, you should almost certainly allow the intercept in your regression to vary. [sent-18, score-0.638]
13 Fourth, it’s ok to let the main effects vary but have the interaction not vary. [sent-20, score-0.759]
14 But if the interaction varies, really both main effects should vary. [sent-21, score-0.337]
15 (a) If you don’t center the predictors, or if you allow the coefs for x1 and x2 to vary without letting the intercept vary, your inferences are not even close to invariant to the selection of predictors in the model, even if the predictors are noise. [sent-27, score-1.684]
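As an illustration of point (a), not from the original post, here is what the contrast looks like in lme4 syntax, reusing the hypothetical dat, educ_c, media_c, and country from the sketch above:

# Slopes vary but the intercept is forced to be common across countries
# (the "0 +" suppresses the varying intercept); inferences from this model
# are not invariant to which other predictors are included.
m_slopes_only <- lmer(info ~ educ_c * media_c + (0 + educ_c + media_c | country), data = dat)

# Varying intercept along with the varying slopes, as recommended above
m_full <- lmer(info ~ educ_c * media_c + (1 + educ_c + media_c | country), data = dat)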
16 Fazekas then got back to me: I did not mention it, but: I always center the variables (given some cross-group aims, mostly grand-mean centering) and always run a varying-intercept, varying-slopes model. [sent-30, score-1.048]
17 When there is a predictor on the second level for a slope, I almost automatically include it as a predictor for the intercept as well. [sent-31, score-0.501]
18 As far as I’ve seen, in lmer() this is the default, plus it was clear from your book that only in very specific cases should one run varying-slope, fixed-intercept models. [sent-32, score-0.699]
19 So this would have been the default setup (with variables grand-mean centered):
y = b0 + b1 x1 + b2 x2 + b3 x1*x2 + b controls + e
b0 = g0 + ee0
b1 = g1 + ee1
b2 = g2 + ee2
But in this case your fourth point is reassuring, because generally there are not that many second-level units. [sent-33, score-0.879]
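In lmer() notation this default setup might be sketched as follows (a hedged illustration, with y, x1, x2, controls, group, and dat as placeholder names; the single controls term stands in for the full set of control variables):

# b0, b1, b2 vary by group (the ee terms); b3, the interaction coefficient,
# is constant across groups. The formula x1 * x2 expands to x1 + x2 + x1:x2.
m_default <- lmer(y ~ x1 * x2 + controls + (1 + x1 + x2 | group), data = dat)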
20 The extended specification, with second-level predictors, would be:
y = b0 + b1 x1 + b2 x2 + b3 x1*x2 + b controls + e
b0 = g0 + g01 u1 + g02 u2 + ee0
b1 = g1 + g11 u1 + ee1
b2 = g2 + g21 u2 + ee2
From how I understood your answer, even in this scenario the same logic of efficiency would apply. [sent-34, score-0.391]
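The extended specification maps onto lmer() in the same way, with the group-level predictors entering as main effects (for the intercept equation) and as cross-level interactions (for the slope equations); again a sketch with placeholder names:

# u1 and u2 predict the intercept (g01, g02); x1:u1 gives the cross-level
# term g11 and x2:u2 gives g21; the intercept and slopes still vary by group
m_extended <- lmer(y ~ x1 * x2 + u1 + u2 + x1:u1 + x2:u2 + controls +
                     (1 + x1 + x2 | group), data = dat)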
wordName wordTfidf (topN-words)
[('vary', 0.307), ('interaction', 0.273), ('intercept', 0.241), ('predictors', 0.231), ('coefs', 0.221), ('fazekas', 0.18), ('situation', 0.16), ('level', 0.16), ('center', 0.155), ('invariance', 0.154), ('second', 0.153), ('across', 0.145), ('varying', 0.135), ('coefficient', 0.132), ('slopes', 0.124), ('fourth', 0.117), ('slope', 0.115), ('let', 0.115), ('allow', 0.114), ('letting', 0.107), ('grand', 0.106), ('controls', 0.105), ('efficiency', 0.099), ('predictor', 0.094), ('variables', 0.09), ('default', 0.085), ('doable', 0.082), ('model', 0.081), ('int', 0.077), ('invariant', 0.077), ('empirical', 0.075), ('variance', 0.074), ('stabilize', 0.074), ('burning', 0.074), ('function', 0.072), ('strikingly', 0.069), ('specific', 0.069), ('multilevel', 0.068), ('grounded', 0.064), ('employ', 0.064), ('puzzled', 0.064), ('main', 0.064), ('would', 0.063), ('run', 0.062), ('lmer', 0.062), ('setup', 0.062), ('certainly', 0.062), ('considerations', 0.061), ('extended', 0.061), ('theoretically', 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary
Introduction: Chris Che-Castaldo writes: I am trying to compute variance components for a hierarchical model where the group level has two binary predictors and their interaction. When I model each of these three predictors as N(0, tau) the model will not converge, perhaps because the number of coefficients in each batch is so small (2 for the main effects and 4 for the interaction). Although I could simply leave all these predictors as unmodeled fixed effects, the last sentence of section 21.2 on page 462 of Gelman and Hill (2007) suggests this would not be a wise course of action: For example, it is not clear how to define the (finite) standard deviation of variables that are included in interactions. I am curious – is there still no clear-cut way to directly compute the finite standard deviation for binary unmodeled variables that are also part of an interaction, as well as for the interaction itself? My reply: I’d recommend including these in your model (it’s probably easiest to do so
3 0.19149579 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups
Introduction: Fred Wu writes: I work at National Prescribing Services in Australia. I have a database representing, say, antidiabetic drug utilisation for the whole of Australia over the past few years. I planned to do a longitudinal analysis across the GP Division Network (112 divisions in AUS) using mixed-effects models (or, as you call them in your book, varying-intercept and varying-slope models) on these data. The problem here is: as the data actually represent the population who use antidiabetic drugs in AUS, should I use 112 fixed dummy variables to capture the random variations, or use varying intercepts and varying slopes in the model? Because someone may argue that divisions in AUS, or states in the USA, can hardly be considered to come from a “superpopulation”, so fixed dummies should be used. What I think is that the population are those who use the drugs; what will happen when the rest need to use them? In terms of exchangeability, using varying intercepts and varying slopes can be justified. Also you provided in y
4 0.18405502 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?
Introduction: Yi-Chun Ou writes: I am using a multilevel model with three levels. I read that you wrote a book about multilevel models, and wonder if you can answer the following question. The data structure is like this:
Level one: customer (8444 customers)
Level two: company (90 companies)
Level three: industry (17 industries)
I use 6 level-three variables (i.e. industry characteristics) to explain the variance of the level-one effect across industries. The question here is whether there is an over-fitting problem since there are only 17 industries. I understand that this must be a problem for non-multilevel models, but is it also a problem for multilevel models? My reply: Yes, this could be a problem. I’d suggest combining some of your variables into a common score, or using only some of the variables, or using strong priors to control the inferences. This is an interesting and important area of statistics research, to do this sort of thing systematically. There’s lots o
5 0.17743894 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models
Introduction: David Hsu writes: I have a (perhaps) simple question about uncertainty in parameter estimates using multilevel models: what is an appropriate threshold for measuring parameter uncertainty in a multilevel model? The reason why I ask is that I set out to do a crossed two-way model with two varying intercepts, similar to your flight simulator example in your 2007 book. The difference is that I have a lot of predictors specific to each cell (I think equivalent to airport and pilot in your example), and after modeling this in JAGS, I happily find that the predictors are much less important than the variability by cell (airport and pilot effects). Happily because this is what I am writing a paper about. However, I then went to check subsets of predictors using lm() and lmer(). I understand that they all use different estimation methods, but what I can’t figure out is why the errors on all of the coefficient estimates are *so* different. For example, using JAGS, and th
6 0.16889779 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
7 0.16709195 472 andrew gelman stats-2010-12-17-So-called fixed and random effects
8 0.14265393 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making
9 0.14109929 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample
10 0.13262534 823 andrew gelman stats-2011-07-26-Including interactions or not
11 0.12831762 1144 andrew gelman stats-2012-01-29-How many parameters are in a multilevel model?
12 0.12792517 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points
13 0.12754136 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations
14 0.12489819 255 andrew gelman stats-2010-09-04-How does multilevel modeling affect the estimate of the grand mean?
15 0.12355409 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals
16 0.12238123 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science
17 0.11981864 77 andrew gelman stats-2010-06-09-Sof[t]
18 0.11858758 936 andrew gelman stats-2011-10-02-Covariate Adjustment in RCT - Model Overfitting in Multilevel Regression
19 0.112813 948 andrew gelman stats-2011-10-10-Combining data from many sources
20 0.11280398 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions
topicId topicWeight
[(0, 0.215), (1, 0.096), (2, 0.09), (3, -0.043), (4, 0.086), (5, 0.045), (6, 0.023), (7, -0.045), (8, 0.041), (9, 0.142), (10, 0.017), (11, 0.039), (12, 0.056), (13, -0.035), (14, 0.048), (15, 0.006), (16, -0.065), (17, -0.017), (18, 0.0), (19, 0.001), (20, 0.008), (21, 0.041), (22, 0.025), (23, -0.004), (24, -0.009), (25, -0.071), (26, -0.03), (27, 0.038), (28, -0.062), (29, -0.007), (30, 0.013), (31, -0.013), (32, -0.014), (33, 0.028), (34, 0.005), (35, -0.044), (36, 0.013), (37, -0.012), (38, 0.016), (39, 0.001), (40, -0.009), (41, -0.069), (42, 0.041), (43, -0.007), (44, 0.006), (45, 0.006), (46, 0.053), (47, -0.02), (48, -0.003), (49, 0.023)]
simIndex simValue blogId blogTitle
same-blog 1 0.97276771 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary
2 0.84863484 2296 andrew gelman stats-2014-04-19-Index or indicator variables
Introduction: Someone who doesn’t want his name shared (for the perhaps reasonable reason that he’ll “one day not be confused, and would rather my confusion not live on online forever”) writes: I’m exploring HLMs and stan, using your book with Jennifer Hill as my field guide to this new territory. I think I have a generally clear grasp on the material, but wanted to be sure I haven’t gone astray. The problem I’m working on involves a multi-nation survey of students, and I’m especially interested in understanding the effects of country, religion, and sex, and the interactions among those factors (using IRT to estimate individual-level ability, then estimating individual, school, and country effects). Following the basic approach laid out in chapter 13 for such interactions between levels, I think I need to create a matrix of indicator variables for religion and sex. Elsewhere in the book, you recommend against indicator variables in favor of a single index variable. Am I right in thinking t
3 0.83756751 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?
4 0.83630294 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions
Introduction: Jim Thomson writes: I wonder if you could provide some clarification on the correct way to calculate the finite-population standard deviations for interaction terms in your Bayesian approach to ANOVA (as explained in your 2005 paper, and Gelman and Hill 2007). I understand that it is the SD of the constrained batch coefficients that is of interest, but in most WinBUGS examples I have seen, the SDs are all calculated directly as sd.fin<-sd(beta.main[]) for main effects and sd(beta.int[,]) for interaction effects, where beta.main and beta.int are the unconstrained coefficients, e.g. beta.int[i,j]~dnorm(0,tau). For main effects, I can see that it makes no difference, since the constrained value is calculated by subtracting the mean, and sd(B[]) = sd(B[]-mean(B[])). But the conventional sum-to-zero constraint for interaction terms in linear models is more complicated than subtracting the mean (there are only (n1-1)*(n2-1) free coefficients for an interaction b/w factors with n1 a
6 0.8057043 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
7 0.80490124 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?
8 0.79409027 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models
9 0.77865154 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups
10 0.77736109 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c
11 0.76609558 1121 andrew gelman stats-2012-01-15-R-squared for multilevel models
12 0.74892873 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs
13 0.74878329 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression
14 0.74498743 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects
15 0.74163181 2294 andrew gelman stats-2014-04-17-If you get to the point of asking, just do it. But some difficulties do arise . . .
16 0.73950535 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model
17 0.73813838 417 andrew gelman stats-2010-11-17-Clutering and variance components
18 0.73722094 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model
19 0.73406464 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions
20 0.7336114 397 andrew gelman stats-2010-11-06-Multilevel quantile regression
topicId topicWeight
[(7, 0.026), (11, 0.02), (13, 0.018), (16, 0.109), (21, 0.019), (24, 0.152), (31, 0.065), (59, 0.026), (79, 0.024), (86, 0.045), (89, 0.02), (99, 0.332)]
simIndex simValue blogId blogTitle
same-blog 1 0.98152804 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary
2 0.97605276 2207 andrew gelman stats-2014-02-11-My talks in Bristol this Wed and London this Thurs
Introduction: 1. Causality and statistical learning (Wed 12 Feb 2014, 16:00, at University of Bristol): Causal inference is central to the social and biomedical sciences. There are unresolved debates about the meaning of causality and the methods that should be used to measure it. As a statistician, I am trained to say that randomized experiments are a gold standard, yet I have spent almost all my applied career analyzing observational data. In this talk we shall consider various approaches to causal reasoning from the perspective of an applied statistician who recognizes the importance of causal identification, yet must learn from available information. This is a good one. They laughed their asses off when I did it in Ann Arbor. But it has serious stuff too. As George Carlin (or, for that matter, John or Brad) might say, it’s funny because it’s true. Here are some old slides, but I plan to mix in a bit of new material. 2. Theoretical Statistics is the Theory of Applied Statistics
3 0.97181779 242 andrew gelman stats-2010-08-29-The Subtle Micro-Effects of Peacekeeping
Introduction: Eric Mvukiyehe and Cyrus Samii write: We [Mvukiyehe and Samii] use original survey data and administrative data to test a theory of the micro-level impacts of peacekeeping. The theory proposes that through the creation of local security bubbles and also through direct assistance, peacekeeping deployments contribute to economic and social revitalization that may contribute to more durable peace. This theory guides the design of current United Nations peacekeeping operations, and has been proposed as one of the explanations for peacekeeping’s well-documented association with more durable peace. Our evidence paints a complex picture that deviates substantially from the theory. We do not find evidence for local security bubbles around deployment base areas, and we do not find that deployments were substantial contributors to local social infrastructure. In addition, we find a negative relationship between deployment basing locations and NGO contributions to social infrastructure.
4 0.9689126 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity
Introduction: Nick Obradovich saw our graphs and regressions showing that the most popular governors tended to come from small states and suggested looking at unemployment rates. (I’d used change in per-capita income as my economic predictor, following the usual practice in political science.) Here’s the graph that got things started: And here’s what Obradovich wrote: It seems that average unemployment rate is more strongly negatively correlated with positive governor approval ratings than is population. The unemployment rate and state size are positively correlated. Anyway, when I include state unemployment rate in the regressions, it pulls the significance away from state population. I do economic data work much of the day, so when I read your post this morning and looked at your charts, state unemployment rates jumped right out at me as a potential confound. I passed this suggestion on to Hanfei, who ran some regressions: lm (popularity ~ c.log.statepop + c.unemployment)
5 0.96849954 1391 andrew gelman stats-2012-06-25-A question about the Tiger Mom: what if she’d had boys instead of girls?
Introduction: I was just thinking about that Yale professor who wrote that book, remember, she screamed at her daughters all the time and didn’t let them go to the bathroom while they were practicing piano (violin?), Asian parenting-style etc etc. I was just wondering . . . what if she’d had sons rather than daughters? What are the rules? Would bringing up boys be the job of mellow white dad rather than intense Asian mom? If mom were still in charge, would the boys be doing piano, or something else like hockey? Tiger Mom seems to be into traditional values so I’m assuming she’d want her kids doing something sex-stereotyped? But sports would be tough, her (hypothetical) boys would have to compete with big strong athletes and would be less likely to be winners so then she couldn’t brag about her amazing parenting skills. I really don’t know the answer to this one. Maybe some of our readers who are Yale law professors can enlighten us?
6 0.96817082 599 andrew gelman stats-2011-03-03-Two interesting posts elsewhere on graphics
7 0.96812153 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value
9 0.9669193 1921 andrew gelman stats-2013-07-01-Going meta on Niall Ferguson
10 0.96677095 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness
11 0.9666515 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics
12 0.96638703 1673 andrew gelman stats-2013-01-15-My talk last night at the visualization meetup
13 0.96610719 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation
14 0.9660576 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
16 0.96597964 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models
17 0.96595883 2182 andrew gelman stats-2014-01-22-Spell-checking example demonstrates key aspects of Bayesian data analysis
18 0.96590984 1163 andrew gelman stats-2012-02-12-Meta-analysis, game theory, and incentives to do replicable research
19 0.96577394 1529 andrew gelman stats-2012-10-11-Bayesian brains?
20 0.96542573 1061 andrew gelman stats-2011-12-16-CrossValidated: A place to post your statistics questions