Introduction: A research psychologist writes in with a question that’s so long that I’ll put my answer first, then put the question itself below the fold. Here’s my reply: As I wrote in my Anova paper and in my book with Jennifer Hill, I do think that multilevel models can completely replace Anova. At the same time, I think the central idea of Anova should persist in our understanding of these models. To me the central idea of Anova is not F-tests or p-values or sums of squares, but rather the idea of predicting an outcome based on factors with discrete levels, and understanding these factors using variance components. The continuous or categorical response thing doesn’t really matter so much to me. I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. I don’t want to throw away interactions just because they’re not statistically significant. I’d rather partially pool them toward zero using an inform

1 I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. [sent-6, score-0.485]

2 Regarding your conceptual point, yes yes yes yes yes I agree that you should use those continuous variables, don’t chop them up as binary, that would just throw away info. [sent-12, score-0.742]

3 And now here’s the question: Recently, there has been a shift in the field away from ANOVA to the use of mixed effects logit models. [sent-14, score-0.721]

4 Learning to program it is relatively easy; learning how to use it appropriately and, especially, understanding how to interpret logit models are much harder. [sent-24, score-0.773]

5 And I have overheard too many discussions about interactions amongst my poli sci and economist friends, especially in logit models, to not be somewhat sceptical of the advice in said paper. [sent-25, score-0.438]

6 The main impetus for the shift away from ANOVA to logit is two-fold: 1) arguing that we actually have categorical response data, and 2) a demonstration of a spurious interaction effect in ANOVA – as in, it’s significant in ANOVA (even using transformed data) but not in the logit model. [sent-31, score-1.366]

7 As far as I can tell, the interpretation of interactions in logit is very tricky. [sent-33, score-0.438]

8 Given all the complications, I am loath to throw away a result because it was not significant in a logit model. [sent-46, score-0.632]

9 But according to Golder and colleagues, “the coefficient and standard error on the interaction term does not tell us the direction, magnitude, or significance of the ‘interaction effect’”. [sent-48, score-0.464]

10 htm ) “Just because the interaction term is significant in the log odds model, it doesn’t mean that the probability difference in differences will be significant for values of the covariate of interest. [sent-53, score-0.665]

11 Paradoxically, even if the interaction term is not significant in the log odds model, the probability difference in differences may be significant for some values of the covariate. [sent-54, score-0.665]

12 So reading off the p values for an interaction term is not a straightforward matter, or should I say, using them to directly reject the hypothesis that there is an interaction is not the same as in an ANOVA. [sent-58, score-0.65]

13 Since I care about their overall performance, why would I use an approximation, or put differently, a single sample of their performance, to test whether learning methods affect overall performance? [sent-82, score-0.394]

14 Moreover, it gets rid of including the random variation in an individual’s performance on an item. [sent-83, score-0.327]

15 My understanding of the difference (from the perspective of assumptions) is that random effects are more efficient but biased, and that in other disciplines the choice of a random effects model would have to be tested and justified. [sent-99, score-0.666]

16 The push to use mixed effects models has been predicated on ‘the fact that ordinary logit models provide no direct way to model random subject and item effects’. [sent-108, score-0.97]

17 But given what I just talked about, random vs fixed effects, bias doesn’t seem to be too much of a concern…) My reluctance seems to be supported by Kennedy (1998). [sent-115, score-0.439]

18 But, even though I use SPSS, I can program in it – I learned to use it back in the SPSS for DOS days – so using an improved ANOVA model is something I could do with some work). [sent-121, score-0.402]

19 Unlike what seems to be the case for practitioners of regression (from what I gleaned from a presentation and paper by Golder and colleagues), I was taught to be careful interpreting main effects given a significant interaction in an ANOVA. [sent-123, score-0.696]

20 Regression clearly has some benefits, in particular coefficients, but I am unconvinced that logit is the way to go. [sent-126, score-0.36]
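
The point quoted above, that the interaction coefficient in a log-odds model need not match the "difference in differences" on the probability scale, can be checked with a little arithmetic. The sketch below is only an illustration: the 2x2 design, the function names, and all coefficient values are hypothetical and do not come from the post or from any of the papers it mentions.

```python
import math


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Map a probability to log-odds."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients for a logit model with two binary factors a and b.
B0, B_A, B_B = -2.0, 1.0, 1.0


def prob(a: int, b: int, b_ab: float) -> float:
    """Predicted probability for cell (a, b) given interaction coefficient b_ab."""
    return inv_logit(B0 + B_A * a + B_B * b + b_ab * a * b)


def diff_in_diff(b_ab: float) -> float:
    """Effect of a when b=1 minus effect of a when b=0, on the probability scale."""
    return (prob(1, 1, b_ab) - prob(0, 1, b_ab)) - (prob(1, 0, b_ab) - prob(0, 0, b_ab))


# Case 1: no interaction on the log-odds scale, yet the probability-scale
# difference in differences is not zero.
did_no_interaction = diff_in_diff(0.0)

# Case 2: choose the interaction coefficient that makes the probability-scale
# difference in differences exactly zero; it turns out to be non-zero.
target_p11 = prob(1, 0, 0.0) + prob(0, 1, 0.0) - prob(0, 0, 0.0)
b_ab_zero_did = logit(target_p11) - (B0 + B_A + B_B)
did_with_interaction = diff_in_diff(b_ab_zero_did)

print(f"b_ab = 0.000 -> probability diff-in-diff = {did_no_interaction:.4f}")
print(f"b_ab = {b_ab_zero_did:.3f} -> probability diff-in-diff = {did_with_interaction:.4f}")
```

Under these made-up numbers, a zero interaction coefficient coexists with a positive difference in differences in probabilities, and a negative interaction coefficient coexists with a difference in differences of zero. This says nothing about statistical significance, which would also depend on the data and the standard errors; it only shows that the two scales can disagree about whether an interaction is present.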

same-blog 1 0.98377049 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

2 0.98011971 1671 andrew gelman stats-2013-01-13-Preregistration of Studies and Mock Reports

Introduction: The traditional system of scientific and scholarly publishing is breaking down in two different directions. On one hand, we are moving away from relying on a small set of journals as gatekeepers: the number of papers and research projects is increasing, the number of publication outlets is increasing, and important manuscripts are being posted on SSRN, Arxiv, and other nonrefereed sites. At the same time, many researchers are worried about the profusion of published claims that turn out to not replicate or, in plain language, to be false. This concern is not new–some prominent discussions include Rosenthal (1979), Ioannidis (2005), and Vul et al. (2009)–but there is a growing sense that the scientific signal is being swamped by noise. I recently had the opportunity to comment in the journal Political Analysis on two papers, one by Humphreys, Sierra, and Windt, and one by Monogan, on the preregistration of studies and mock reports. Here’s the issue of the journal. Given the hi

3 0.97976208 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

Introduction: Alexander Volfovsky and Peter Hoff write : ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays t

4 0.97860193 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2” to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex

5 0.97859418 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model

Introduction: Daniel Lakeland asks, “Where do likelihoods come from?” He describes a class of problems where you have a deterministic dynamic model that you want to fit to data. The data won’t fit perfectly so, if you want to do Bayesian inference, you need to introduce an error model. This looks a little bit different from the usual way that models are presented in statistics textbooks, where the focus is typically on the random error process, not on the deterministic part of the model. A focus on the error process makes sense in some applications that have inherent randomness or variation (for example, genetics, psychology, and survey sampling) but not so much in the physical sciences, where the deterministic model can be complicated and is typically the essence of the study. Often in these sorts of studies, the starting point (and sometimes the ending point) is what the physicists call “nonlinear least squares” or what we would call normally-distributed errors. That’s what we did for our

6 0.97837299 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

7 0.97789574 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

8 0.97767603 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

9 0.97754931 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

10 0.97753882 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

11 0.97749221 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

12 0.97674775 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

13 0.976722 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value

14 0.97653997 807 andrew gelman stats-2011-07-17-Macro causality

15 0.97649896 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

16 0.97624022 747 andrew gelman stats-2011-06-06-Research Directions for Machine Learning and Algorithms

17 0.97611868 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

18 0.97558177 351 andrew gelman stats-2010-10-18-“I was finding the test so irritating and boring that I just started to click through as fast as I could”

19 0.97553927 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

20 0.97550869 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?