888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?


meta info for this blog

Source: html

Introduction: A research psychologist writes in with a question that’s so long that I’ll put my answer first, then put the question itself below the fold. Here’s my reply: As I wrote in my Anova paper and in my book with Jennifer Hill, I do think that multilevel models can completely replace Anova. At the same time, I think the central idea of Anova should persist in our understanding of these models. To me the central idea of Anova is not F-tests or p-values or sums of squares, but rather the idea of predicting an outcome based on factors with discrete levels, and understanding these factors using variance components. The continuous or categorical response thing doesn’t really matter so much to me. I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. I don’t want to throw away interactions just because they’re not statistically significant. I’d rather partially pool them toward zero using an inform


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. [sent-6, score-0.485]

2 Regarding your conceptual point, yes yes yes yes yes I agree that you should use those continuous variables, don’t chop them up as binary, that would just throw away info. [sent-12, score-0.742]

3 And now here’s the question: Recently, there has been a shift in the field away from ANOVA to the use of mixed effects logit models. [sent-14, score-0.721]

4 Learning to program it is relatively easy, learning how to use it appropriately, and especially, understanding how to interpret logit models is much harder. [sent-24, score-0.773]

5 And I have overheard too many discussions about interactions amongst my poli sci and economist friends, especially in logit models, to not be somewhat sceptical of the advice in said paper. [sent-25, score-0.438]

6 The main impetus for the shift away from ANOVA to logit is two-fold: 1) arguing that we actually have categorical response data, and 2) a demonstration of a spurious interaction effect in ANOVA – as in, it’s significant in ANOVA (even using transformed data) but not in the logit model. [sent-31, score-1.366]

7 As far as I can tell, the interpretation of interactions in logit is very tricky. [sent-33, score-0.438]

8 Given all the complications, I am loath to throw away a result because it was not significant in a logit model. [sent-46, score-0.632]

9 But according to Golder and colleagues “the coefficient and standard error on the interaction term does not tell us the direction, magnitude, or significance of the ‘interaction effect’”. [sent-48, score-0.464]

10 “Just because the interaction term is significant in the log odds model, it doesn’t mean that the probability difference in differences will be significant for values of the covariate of interest. [sent-53, score-0.665]

11 Paradoxically, even if the interaction term is not significant in the log odds model, the probability difference in differences may be significant for some values of the covariate. [sent-54, score-0.665]
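
The point in sentences 10 and 11 can be illustrated with a small calculation. The following is a minimal sketch, not taken from the post: the coefficient values are hypothetical, the two predictors are assumed to be binary, and it looks only at point estimates, not at significance. It shows that a logit model with an interaction coefficient of exactly zero can still imply a nonzero difference in differences on the probability scale, and that the size of that difference depends on the baseline.

```python
import math


def inv_logit(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_diff_in_diff(b0, b1, b2, b12):
    """Difference in differences on the probability scale.

    Assumes a logit model with two binary predictors x1, x2:
    logit(p) = b0 + b1*x1 + b2*x2 + b12*x1*x2
    """
    p00 = inv_logit(b0)
    p10 = inv_logit(b0 + b1)
    p01 = inv_logit(b0 + b2)
    p11 = inv_logit(b0 + b1 + b2 + b12)
    # Effect of x1 when x2 = 1, minus effect of x1 when x2 = 0.
    return (p11 - p01) - (p10 - p00)


# Hypothetical coefficients; the interaction on the log-odds scale is zero in both cases.
did_baseline_half = prob_diff_in_diff(b0=0.0, b1=1.0, b2=1.0, b12=0.0)
did_baseline_low = prob_diff_in_diff(b0=-1.0, b1=1.0, b2=1.0, b12=0.0)

print("baseline logit  0:", round(did_baseline_half, 4))  # clearly nonzero
print("baseline logit -1:", round(did_baseline_low, 4))   # essentially zero
```

The same zero coefficient gives different answers on the probability scale depending on where the baseline sits on the logistic curve, which is one reason the p-value of the interaction term cannot be read the way it is in an ANOVA.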

12 So reading off the p values for an interaction term is not a straightforward matter, or should I say, using them to directly reject the hypothesis that there is an interaction is not the same as in an ANOVA. [sent-58, score-0.65]

13 Since I care about their overall performance, why would I use an approximation, or put differently, a single sample of their performance, to test whether learning methods affect overall performance. [sent-82, score-0.394]

14 Moreover, it gets rid of including the random variation in an individual’s performance on an item. [sent-83, score-0.327]

15 My understanding of the difference (from the perspective of assumptions) is that random effects are more efficient but biased, and that in other disciplines the choice of a random effects model would have to be tested and justified. [sent-99, score-0.666]

16 The push to use mixed effects models has been predicated on ‘the fact that ordinary logit models provide no direct way to model random subject and item effects’. [sent-108, score-0.97]

17 But given what I just talked about, random vs fixed effects, bias doesn’t seem to be too much of a concern…) My reluctance seems to be supported by Kennedy (1998). [sent-115, score-0.439]

18 But, even though I use SPSS, I can program in it – I learned to use it back in the SPSS for DOS days – so using an improved ANOVA model is something I could do with some work). [sent-121, score-0.402]

19 Unlike what seems to be the case for practitioners of regression (from what I gleaned from a presentation and paper by Golder and colleagues), I was taught to be careful interpreting main effects given a significant interaction in an ANOVA. [sent-123, score-0.696]

20 Regression clearly has some benefits, in particular co-efficients, but I am unconvinced that logit is the way to go. [sent-126, score-0.36]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('logit', 0.36), ('anova', 0.325), ('interaction', 0.253), ('golder', 0.167), ('random', 0.129), ('significant', 0.129), ('binary', 0.127), ('multicollinearity', 0.125), ('performance', 0.12), ('effects', 0.118), ('categorical', 0.117), ('continuous', 0.115), ('variables', 0.111), ('overall', 0.109), ('fixed', 0.107), ('interpret', 0.098), ('use', 0.095), ('person', 0.093), ('model', 0.09), ('treated', 0.085), ('http', 0.085), ('away', 0.084), ('understanding', 0.082), ('term', 0.081), ('learning', 0.081), ('rid', 0.078), ('interactions', 0.078), ('variable', 0.077), ('moreover', 0.077), ('conceptual', 0.074), ('odds', 0.073), ('given', 0.073), ('basically', 0.071), ('bias', 0.068), ('biased', 0.067), ('error', 0.067), ('spss', 0.066), ('predictor', 0.065), ('mixed', 0.064), ('colleagues', 0.063), ('using', 0.063), ('yes', 0.063), ('seems', 0.062), ('responses', 0.061), ('regression', 0.061), ('throw', 0.059), ('something', 0.059), ('https', 0.059), ('models', 0.057), ('drives', 0.057)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

2 0.28141224 472 andrew gelman stats-2010-12-17-So-called fixed and random effects

Introduction: Someone writes: I am hoping you can give me some advice about when to use fixed and random effects models. I am currently working on a paper that examines the effect of . . . by comparing states . . . It got reviewed . . . by three economists and all suggest that we run a fixed effects model. We ran a hierarchical model in the paper that allows the intercept and slope to vary before and after . . . My question is which is correct? We have run it both ways and really it makes no difference which model you run, the results are very similar. But for my own learning, I would really like to understand which to use under what circumstances. Is the fact that we use the whole population reason enough to just run a fixed effect model? Perhaps you can suggest a good reference to this question of when to run a fixed vs. random effects model. I’m not always sure what is meant by a “fixed effects model”; see my paper on Anova for discussion of the problems with this terminology: http://w

3 0.26565865 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

Introduction: Chris Che-Castaldo writes: I am trying to compute variance components for a hierarchical model where the group level has two binary predictors and their interaction. When I model each of these three predictors as N(0, tau) the model will not converge, perhaps because the number of coefficients in each batch is so small (2 for the main effects and 4 for the interaction). Although I could simply leave all these as predictors as unmodeled fixed effects, the last sentence of section 21.2 on page 462 of Gelman and Hill (2007) suggests this would not be a wise course of action: For example, it is not clear how to define the (finite) standard deviation of variables that are included in interactions. I am curious – is there still no clear cut way to directly compute the finite standard deviation for binary unmodeled variables that are also part of an interaction as well as the interaction itself? My reply: I’d recommend including these in your model (it’s probably easiest to do so

4 0.2638272 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

Introduction: Alexander Volfovsky and Peter Hoff write : ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays t

5 0.23417822 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit

Introduction: Jeff writes: How far off is bglmer and can it handle ordered logit or multinom logit? My reply: bglmer is very close. No ordered logit but I was just talking about it with Sophia today. My guess is that the easiest way to fit a hierarchical ordered logit or multinom logit will be to use stan. For right now I’d recommend using glmer/bglmer to fit the ordered logits in order (e.g., 1 vs. 2,3,4, then 2 vs. 3,4, then 3 vs. 4). Or maybe there’s already a hierarchical multinomial logit in mcmcpack or somewhere?
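
The workaround described above (fitting the ordered logits "in order") amounts to recoding one ordered outcome into a series of binary outcomes. The sketch below shows only that recoding step, under one reading of the excerpt: each later comparison is assumed to drop the observations already split off by an earlier one. The function name `sequential_splits` and the example data are hypothetical, and the model fitting itself is left out.

```python
def sequential_splits(y, levels):
    """Recode an ordered outcome into a sequence of binary data sets.

    For levels [1, 2, 3, 4] this yields 1 vs. 2,3,4, then 2 vs. 3,4,
    then 3 vs. 4. Each data set is a list of (row_index, indicator)
    pairs, where indicator is 1 if the outcome equals the current
    level and 0 if it is higher. Rows with lower outcomes are dropped
    (an assumption about how the comparisons are meant to nest).
    """
    splits = []
    for i, level in enumerate(levels[:-1]):
        remaining = set(levels[i:])
        kept = [
            (idx, 1 if value == level else 0)
            for idx, value in enumerate(y)
            if value in remaining
        ]
        splits.append((level, kept))
    return splits


# Hypothetical ordered responses on a 1-4 scale.
y = [1, 3, 2, 4, 4, 1, 2]
for level, kept in sequential_splits(y, [1, 2, 3, 4]):
    print(f"level {level} vs. higher: {kept}")
```

Each binary data set could then be passed, together with the predictors for the kept rows, to whatever binary multilevel logistic routine is available.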

6 0.22511294 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

7 0.20678015 417 andrew gelman stats-2010-11-17-Clutering and variance components

8 0.20010771 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

9 0.18490437 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

10 0.17932488 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?

11 0.16889779 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary

12 0.16781141 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks

13 0.16665836 823 andrew gelman stats-2011-07-26-Including interactions or not

14 0.16357714 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?

15 0.1582206 1605 andrew gelman stats-2012-12-04-Write This Book

16 0.15721275 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

17 0.15299316 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

18 0.14985025 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

19 0.14947124 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

20 0.14882237 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.343), (1, 0.134), (2, 0.115), (3, -0.131), (4, 0.105), (5, 0.003), (6, 0.002), (7, -0.044), (8, 0.104), (9, 0.078), (10, -0.016), (11, 0.038), (12, 0.033), (13, -0.088), (14, 0.03), (15, 0.039), (16, -0.036), (17, -0.019), (18, -0.041), (19, 0.019), (20, 0.011), (21, 0.023), (22, 0.04), (23, -0.028), (24, -0.034), (25, -0.037), (26, 0.003), (27, 0.003), (28, -0.037), (29, -0.04), (30, 0.019), (31, 0.063), (32, 0.016), (33, 0.006), (34, 0.026), (35, -0.066), (36, -0.051), (37, 0.036), (38, -0.02), (39, 0.002), (40, 0.007), (41, -0.106), (42, 0.026), (43, 0.059), (44, -0.003), (45, 0.021), (46, 0.027), (47, -0.03), (48, 0.035), (49, 0.056)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97062731 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

2 0.85906416 753 andrew gelman stats-2011-06-09-Allowing interaction terms to vary

Introduction: Zoltan Fazekas writes: I am a 2nd year graduate student in political science at the University of Vienna. In my empirical research I often employ multilevel modeling, and recently I came across a situation that kept me wondering for quite a while. As I did not find much on this in the literature and considering the topics that you work on and blog about, I figured I will try to contact you. The situation is as follows: in a linear multilevel model, there are two important individual level predictors (x1 and x2) and a set of controls. Let us assume that there is a theoretically grounded argument suggesting that an interaction between x1 and x2 should be included in the model (x1 * x2). Both x1 and x2 are let to vary randomly across groups. Would this directly imply that the coefficient of the interaction should also be left to vary across country? This is even more burning if there is no specific hypothesis on the variance of the conditional effect across countries. And then i

3 0.85453779 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions

Introduction: Jim Thomson writes: I wonder if you could provide some clarification on the correct way to calculate the finite-population standard deviations for interaction terms in your Bayesian approach to ANOVA (as explained in your 2005 paper, and Gelman and Hill 2007). I understand that it is the SD of the constrained batch coefficients that is of interest, but in most WinBUGS examples I have seen, the SDs are all calculated directly as sd.fin<-sd(beta.main[]) for main effects and sd(beta.int[,]) for interaction effects, where beta.main and beta.int are the unconstrained coefficients, e.g. beta.int[i,j]~dnorm(0,tau). For main effects, I can see that it makes no difference, since the constrained value is calculated by subtracting the mean, and sd(B[]) = sd(B[]-mean(B[])). But the conventional sum-to-zero constraint for interaction terms in linear models is more complicated than subtracting the mean (there are only (n1-1)*(n2-1) free coefficients for an interaction b/w factors with n1 a
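
The distinction raised in the question above can be checked numerically. This is a rough sketch with made-up coefficient values: it assumes the sum-to-zero constraint for a two-way interaction is imposed by double-centering (subtract row and column means, add back the grand mean), and it uses the sample standard deviation with an n-1 divisor, as R's `sd` does. It is not the calculation from the 2005 paper or from Gelman and Hill (2007).

```python
import statistics


def double_center(matrix):
    """Impose sum-to-zero constraints on a two-way array.

    Subtracts row means and column means and adds back the grand mean,
    so that every row and every column of the result sums to zero.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [
        sum(matrix[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)
    ]
    grand_mean = sum(row_means) / n_rows
    return [
        [matrix[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Main effects: subtracting the mean leaves the sd unchanged.
beta_main = [0.4, -1.1, 0.9, 0.2]  # hypothetical unconstrained draws
main_mean = statistics.mean(beta_main)
centered_main = [b - main_mean for b in beta_main]
print(statistics.stdev(beta_main), statistics.stdev(centered_main))

# Interactions: the constrained and unconstrained sds generally differ.
beta_int = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]  # hypothetical 2 x 3 batch
constrained = double_center(beta_int)
flat_raw = [b for row in beta_int for b in row]
flat_constrained = [b for row in constrained for b in row]
print(statistics.stdev(flat_raw), statistics.stdev(flat_constrained))
```

In this sketch the unconstrained interaction sd is larger because it still contains variation that the constraint assigns to the main effects and the overall mean.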

4 0.85298198 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

Introduction: Mike Johns writes: Are you familiar with the work of Ai and Norton on interactions in logit/probit models? I’d be curious to hear your thoughts. Ai, C.R. and Norton E.C. 2003. Interaction terms in logit and probit models. Economics Letters 80(1): 123-129. A peer ref just cited this paper in reaction to a logistic model we tested and claimed that the “only” way to test an interaction in logit/probit regression is to use the cross derivative method of Ai & Norton. I’ve never heard of this issue or method. It leaves me wondering what the interaction term actually tests (something Ai & Norton don’t discuss) and why such an important discovery is not more widely known. Is this an issue that is of particular relevance to econometric analysis because they approach interactions from the difference-in-difference perspective? Full disclosure, I’m coming from a social science/epi background. Thus, i’m not interested in the d-in-d estimator; I want to know if any variables modify the rela

5 0.84500003 1070 andrew gelman stats-2011-12-19-The scope for snooping

Introduction: Macartan Humphreys sent the following question to David Madigan and me: I am working on a piece on the registration of research designs (to prevent snooping). As part of it we want to give some estimates for the “scope for snooping” and how this can be affected by different registration requirements. So we want to answer questions of the form: “Say in truth there is no relation between x and y, you were willing to mess about with models until you found a significant relation between them, what are the chances that you would succeed if: 1. You were free to choose the indicators for x and y 2. You were free to choose h control variable from some group of k possible controls 3. You were free to divide up the sample in k ways to examine heterogeneous treatment effects 4. You were free to select from some set of k reasonable models” People have thought a lot about the first problem of choosing your indicators; we have done a set of simulations to answer the other questions
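
For a rough sense of scale on the "scope for snooping" question, here is a back-of-the-envelope sketch. It is not the simulation mentioned in the excerpt: it assumes the k analyses are independent tests at a fixed significance level, which is unrealistic when all of them reuse the same data, so the numbers are only a reference point. The function name is hypothetical.

```python
def chance_of_false_positive(k, alpha=0.05):
    """Probability that at least one of k independent null tests is significant.

    Assumes no true relation between x and y and independent tests,
    each with false-positive rate alpha.
    """
    return 1.0 - (1.0 - alpha) ** k


for k in (1, 5, 10, 20):
    print(k, round(chance_of_false_positive(k), 3))
```

Because analyses run on the same data are usually positively correlated, the true chance of finding some significant result is typically lower than this independent-tests figure for the same k.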

6 0.83362001 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs

7 0.81025845 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

8 0.80886346 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model

9 0.80778027 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c

10 0.79986912 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

11 0.7994113 1703 andrew gelman stats-2013-02-02-Interaction-based feature selection and classification for high-dimensional biological data

12 0.79760003 86 andrew gelman stats-2010-06-14-“Too much data”?

13 0.79472387 553 andrew gelman stats-2011-02-03-is it possible to “overstratify” when assigning a treatment in a randomized control trial?

14 0.79160446 1121 andrew gelman stats-2012-01-15-R-squared for multilevel models

15 0.79052627 401 andrew gelman stats-2010-11-08-Silly old chi-square!

16 0.78731644 257 andrew gelman stats-2010-09-04-Question about standard range for social science correlations

17 0.78729331 726 andrew gelman stats-2011-05-22-Handling multiple versions of an outcome variable

18 0.78444397 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

19 0.78001481 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

20 0.77729172 2296 andrew gelman stats-2014-04-19-Index or indicator variables


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.018), (15, 0.028), (16, 0.082), (21, 0.03), (24, 0.184), (40, 0.029), (45, 0.013), (63, 0.034), (76, 0.01), (79, 0.046), (86, 0.04), (89, 0.015), (99, 0.32)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98377049 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

2 0.98011971 1671 andrew gelman stats-2013-01-13-Preregistration of Studies and Mock Reports

Introduction: The traditional system of scientific and scholarly publishing is breaking down in two different directions. On one hand, we are moving away from relying on a small set of journals as gatekeepers: the number of papers and research projects is increasing, the number of publication outlets is increasing, and important manuscripts are being posted on SSRN, Arxiv, and other nonrefereed sites. At the same time, many researchers are worried about the profusion of published claims that turn out to not replicate or in plain language, to be false. This concern is not new–some prominent discussions include Rosenthal (1979), Ioannidis (2005), and Vul et al. (2009)–but there is a growing sense that the scientific signal is being swamped by noise. I recently had the opportunity to comment in the journal Political Analysis on two papers, one by Humphreys, Sierra, and Windt, and one by Monogan, on the preregistration of studies and mock reports. Here’s the issue of the journal. Given the hi

3 0.97976208 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

4 0.97860193 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2″ to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex

5 0.97859418 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model

Introduction: Daniel Lakeland asks, “Where do likelihoods come from?” He describes a class of problems where you have a deterministic dynamic model that you want to fit to data. The data won’t fit perfectly so, if you want to do Bayesian inference, you need to introduce an error model. This looks a little bit different from the usual way that models are presented in statistics textbooks, where the focus is typically on the random error process, not on the deterministic part of the model. A focus on the error process makes sense in some applications that have inherent randomness or variation (for example, genetics, psychology, and survey sampling) but not so much in the physical sciences, where the deterministic model can be complicated and is typically the essence of the study. Often in these sorts of studies, the starting point (and sometimes the ending point) is what the physicists call “nonlinear least squares” or what we would call normally-distributed errors. That’s what we did for our

6 0.97837299 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

7 0.97789574 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

8 0.97767603 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

9 0.97754931 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

10 0.97753882 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

11 0.97749221 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

12 0.97674775 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

13 0.976722 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value

14 0.97653997 807 andrew gelman stats-2011-07-17-Macro causality

15 0.97649896 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

16 0.97624022 747 andrew gelman stats-2011-06-06-Research Directions for Machine Learning and Algorithms

17 0.97611868 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

18 0.97558177 351 andrew gelman stats-2010-10-18-“I was finding the test so irritating and boring that I just started to click through as fast as I could”

19 0.97553927 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

20 0.97550869 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?