andrew_gelman_stats andrew_gelman_stats-2014 andrew_gelman_stats-2014-2357 knowledge-graph by maker-knowledge-mining

2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression


meta info for this blog

Source: html

Introduction: Haynes Goddard writes: I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is no theory at all, just the variation in the data. I don’t find the topic on your blog, and wonder if you have addressed the issue. My reply: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once. To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not e


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Haynes Goddard writes: I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. [sent-1, score-0.881]

2 I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. [sent-2, score-1.215]

3 I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is no theory at all, just the variation in the data. [sent-3, score-1.467]

4 I don’t find the topic on your blog, and wonder if you have addressed the issue. [sent-4, score-0.097]

5 My reply: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. [sent-5, score-0.818]

6 For example, Jennifer and I don’t mention stepwise regression in our book, not even once. [sent-6, score-1.056]

7 To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not enough data to estimate their coefficients in any meaningful way. [sent-7, score-1.422]

8 This sort of problem comes up all the time, for example here’s an example from my research, a meta-analysis of the effects of incentives in sample surveys. [sent-8, score-0.218]

9 The trouble with stepwise regression is that, at any given step, the model is fit using unconstrained least squares. [sent-9, score-1.161]

10 I prefer methods such as factor analysis or lasso that group or constrain the coefficient estimates in some way. [sent-10, score-0.415]
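Following up on sentences 9 and 10 above: a minimal sketch, written here for illustration (it is not from the post), contrasting forward stepwise selection, which refits unconstrained least squares at each step, with the lasso, which keeps every candidate predictor but constrains the coefficients. It assumes scikit-learn and simulated data; all settings are illustrative only.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LassoCV, LinearRegression

# Many candidate predictors, only a few truly informative, modest sample size.
X, y = make_regression(n_samples=100, n_features=40, n_informative=5,
                       noise=10.0, random_state=0)

# Forward stepwise: greedily add predictors, refitting plain least squares each step.
stepwise = SequentialFeatureSelector(LinearRegression(), n_features_to_select=5,
                                     direction="forward", cv=5).fit(X, y)
kept = np.flatnonzero(stepwise.get_support())
ols_on_kept = LinearRegression().fit(X[:, kept], y)

# Lasso: keep every predictor but penalize the sum of absolute coefficients.
lasso = LassoCV(cv=5).fit(X, y)

print("stepwise kept predictors:", kept)
print("lasso nonzero coefficients:", np.flatnonzero(lasso.coef_ != 0))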


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('stepwise', 0.721), ('regression', 0.265), ('biostats', 0.138), ('goddard', 0.138), ('constrain', 0.12), ('outlier', 0.114), ('defaults', 0.111), ('detection', 0.109), ('unconstrained', 0.109), ('categorical', 0.107), ('parametric', 0.102), ('pie', 0.102), ('wang', 0.102), ('lasso', 0.1), ('survival', 0.097), ('addressed', 0.097), ('theory', 0.094), ('slowly', 0.09), ('charts', 0.088), ('meaningful', 0.085), ('econometrics', 0.085), ('grad', 0.083), ('lee', 0.082), ('incentives', 0.08), ('stats', 0.08), ('course', 0.078), ('motivation', 0.075), ('coefficient', 0.074), ('jennifer', 0.073), ('text', 0.071), ('address', 0.071), ('latest', 0.07), ('behind', 0.07), ('mention', 0.07), ('coefficients', 0.07), ('example', 0.069), ('trouble', 0.066), ('predictors', 0.065), ('factor', 0.065), ('poor', 0.065), ('noticed', 0.064), ('learned', 0.064), ('material', 0.063), ('variation', 0.062), ('program', 0.06), ('popular', 0.058), ('appear', 0.058), ('statisticians', 0.056), ('considered', 0.056), ('prefer', 0.056)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression

Introduction: Haynes Goddard writes: I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is no theory at all, just the variation in the data. I don’t find the topic on your blog, and wonder if you have addressed the issue. My reply: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once. To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not e

2 0.39441055 1535 andrew gelman stats-2012-10-16-Bayesian analogue to stepwise regression?

Introduction: Bill Harris writes: On pp. 250-251 of BDA second edition, you write about multiple comparisons, and you write about stepwise regression on p. 405. How would you look at stepwise regression analyses in light of the multiple comparisons problem? Is there an issue? My reply: In this case I think the right approach is to keep all the coefs but partially pool them toward 0 (after suitable transformation). But then the challenge is coming up with a general way to construct good prior distributions. I’m still thinking about that one! Yet another approach is to put something together purely nonparametrically as with Bart.
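The reply above suggests keeping all the coefficients and partially pooling them toward zero rather than selecting among them. A rough stand-in, written here for illustration only (it is not the prior construction the post is asking about), is ridge-type shrinkage on standardized predictors, with the amount of pooling chosen by cross-validation over a penalty grid:

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n, p = 60, 30                                    # fewer data points than we would like
X = rng.normal(size=(n, p))
beta = np.concatenate([rng.normal(size=5), np.zeros(p - 5)])   # a few real signals
y = X @ beta + rng.normal(scale=2.0, size=n)

# Standardize (a "suitable transformation"), then shrink every coefficient toward 0;
# unlike stepwise selection, no predictor is dropped from the model.
model = make_pipeline(StandardScaler(),
                      RidgeCV(alphas=np.logspace(-2, 3, 30))).fit(X, y)
print(model[-1].coef_.round(2))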

3 0.21129006 2348 andrew gelman stats-2014-05-26-On deck this week

Introduction: Mon: WAIC and cross-validation in Stan! Tues: A whole fleet of gremlins: Looking more carefully at Richard Tol’s twice-corrected paper, “The Economic Effects of Climate Change” Wed: Just wondering Thurs: When you believe in things that you don’t understand Fri: I posted this as a comment on a sociology blog Sat: “Building on theories used to describe magnets, scientists have put together a model that captures something very different . . .” Sun: Why we hate stepwise regression

4 0.18337865 2356 andrew gelman stats-2014-06-02-On deck this week

Introduction: Mon: Why we hate stepwise regression Tues: Did you buy laundry detergent on their most recent trip to the store? Also comments on scientific publication and yet another suggestion to do a study that allows within-person comparisons Wed: All the Assumptions That Are My Life Thurs: Identifying pathways for managing multiple disturbances to limit plant invasions Fri: Statistically savvy journalism Sat: “Does researching casual marijuana use cause brain abnormalities?” Sun: Regression and causality and variable ordering

5 0.1362469 1769 andrew gelman stats-2013-03-18-Tibshirani announces new research result: A significance test for the lasso

Introduction: Lasso and me: For a long time I was wrong about lasso. Lasso (“least absolute shrinkage and selection operator”) is a regularization procedure that shrinks regression coefficients toward zero, and in its basic form is equivalent to maximum penalized likelihood estimation with a penalty function that is proportional to the sum of the absolute values of the regression coefficients. I first heard about lasso from a talk that Trevor Hastie and Rob Tibshirani gave at Berkeley in 1994 or 1995. They demonstrated that it shrunk regression coefficients to zero. I wasn’t impressed, first because it seemed like no big deal (if that’s the prior you use, that’s the shrinkage you get) and second because, from a Bayesian perspective, I don’t want to shrink things all the way to zero. In the sorts of social and environmental science problems I’ve worked on, just about nothing is zero. I’d like to control my noisy estimates but there’s nothing special about zero. At the end of the talk I stood
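The post characterizes the lasso verbally; in standard notation (not spelled out in the post), the penalized least-squares objective it is describing is

\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^{2} + \lambda \sum_{j=1}^{p} \lvert\beta_j\rvert \right\},

where larger values of the penalty parameter \lambda shrink more coefficients exactly to zero.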

6 0.11825454 796 andrew gelman stats-2011-07-10-Matching and regression: two great tastes etc etc

7 0.11418616 1735 andrew gelman stats-2013-02-24-F-f-f-fake data

8 0.11404345 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

9 0.10643578 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

10 0.10176391 451 andrew gelman stats-2010-12-05-What do practitioners need to know about regression?

11 0.089351542 1849 andrew gelman stats-2013-05-09-Same old same old

12 0.087345935 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points

13 0.083821215 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!

14 0.081724837 1094 andrew gelman stats-2011-12-31-Using factor analysis or principal components analysis or measurement-error models for biological measurements in archaeology?

15 0.080091529 247 andrew gelman stats-2010-09-01-How does Bayes do it?

16 0.076325804 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

17 0.075041294 1870 andrew gelman stats-2013-05-26-How to understand coefficients that reverse sign when you start controlling for things?

18 0.074206822 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

19 0.073809944 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

20 0.073633432 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.131), (1, 0.06), (2, 0.025), (3, -0.015), (4, 0.07), (5, 0.031), (6, -0.007), (7, -0.024), (8, 0.054), (9, 0.067), (10, 0.027), (11, 0.046), (12, 0.045), (13, 0.017), (14, 0.049), (15, 0.02), (16, -0.048), (17, 0.009), (18, 0.014), (19, -0.004), (20, 0.014), (21, 0.045), (22, 0.001), (23, 0.026), (24, 0.008), (25, 0.011), (26, 0.063), (27, -0.082), (28, -0.048), (29, -0.032), (30, 0.055), (31, 0.048), (32, 0.008), (33, 0.008), (34, 0.006), (35, -0.036), (36, 0.009), (37, 0.018), (38, -0.039), (39, -0.013), (40, 0.025), (41, 0.049), (42, -0.021), (43, -0.054), (44, 0.086), (45, 0.024), (46, -0.025), (47, 0.024), (48, 0.04), (49, -0.056)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96863693 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression

Introduction: Haynes Goddard writes: I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is no theory at all, just the variation in the data. I don’t find the topic on your blog, and wonder if you have addressed the issue. My reply: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once. To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not e

2 0.85374063 796 andrew gelman stats-2011-07-10-Matching and regression: two great tastes etc etc

Introduction: Matthew Bogard writes: Regarding the book Mostly Harmless Econometrics, you state : A casual reader of the book might be left with the unfortunate impression that matching is a competitor to regression rather than a tool for making regression more effective. But in fact isn’t that what they are arguing, that, in a ‘mostly harmless way’ regression is in fact a matching estimator itself? “Our view is that regression can be motivated as a particular sort of weighted matching estimator, and therefore the differences between regression and matching estimates are unlikely to be of major empirical importance” (Chapter 3 p. 70) They seem to be distinguishing regression (without prior matching) from all other types of matching techniques, and therefore implying that regression can be a ‘mostly harmless’ substitute or competitor to matching. My previous understanding, before starting this book was as you say, that matching is a tool that makes regression more effective. I have n

3 0.84229422 1535 andrew gelman stats-2012-10-16-Bayesian analogue to stepwise regression?

Introduction: Bill Harris writes: On pp. 250-251 of BDA second edition, you write about multiple comparisons, and you write about stepwise regression on p. 405. How would you look at stepwise regression analyses in light of the multiple comparisons problem? Is there an issue? My reply: In this case I think the right approach is to keep all the coefs but partially pool them toward 0 (after suitable transformation). But then the challenge is coming up with a general way to construct good prior distributions. I’m still thinking about that one! Yet another approach is to put something together purely nonparametrically as with Bart.

4 0.81029165 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

Introduction: David Hoaglin writes: After seeing it cited, I just read your paper in Technometrics. The home radon levels provide an interesting and instructive example. I [Hoaglin] have a different take on the difficulty of interpreting the estimated coefficient of the county-level basement proportion (gamma-sub-2) on page 434. An important part of the difficulty involves “other things being equal.” That sounds like the widespread interpretation of a regression coefficient as telling how the dependent variable responds to change in that predictor when the other predictors are held constant. Unfortunately, as a general interpretation, that language is oversimplified; it doesn’t reflect how regression actually works. The appropriate general interpretation is that the coefficient tells how the dependent variable responds to change in that predictor after allowing for simultaneous change in the other predictors in the data at hand. Thus, in the county-level regression gamma-sub-2 summarize
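Hoaglin's preferred reading of a coefficient, "after allowing for simultaneous change in the other predictors," can be checked numerically. The sketch below is my illustration (not from the post) of the Frisch-Waugh-Lovell identity: the multiple-regression coefficient on x1 equals the slope from regressing y on the part of x1 left over after regressing x1 on the remaining predictors.

import numpy as np

rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)               # correlated predictors
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Joint fit: intercept, x1, x2.
X = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Residualize x1 on the other predictor(s), then regress y on that residual.
Z = np.column_stack([np.ones(n), x2])
x1_res = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
b_fwl = (x1_res @ y) / (x1_res @ x1_res)

print(b_full[1], b_fwl)                          # the two x1 coefficients agree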

5 0.80448139 1094 andrew gelman stats-2011-12-31-Using factor analysis or principal components analysis or measurement-error models for biological measurements in archaeology?

Introduction: Greg Campbell writes: I am a Canadian archaeologist (BSc in Chemistry) researching the past human use of European Atlantic shellfish. After two decades of practice I am finally getting a MA in archaeology at Reading. I am seeing if the habitat or size of harvested mussels (Mytilus edulis) can be reconstructed from measurements of the umbo (the pointy end, and the only bit that survives well in archaeological deposits) using log-transformed measurements (or allometry; relationships between dimensions are more likely exponential than linear). Of course multivariate regressions in most statistics packages (Minitab, SPSS, SAS) assume you are trying to predict one variable from all the others (a Model I regression), and use ordinary least squares to fit the regression line. For organismal dimensions this makes little sense, since all the dimensions are (at least in theory) free to change their mutual proportions during growth. So there is no predictor and predicted, mutual variation of
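Campbell's point is that ordinary least squares treats one dimension as the predictor and minimizes error only in the other, which is awkward when both log-dimensions vary mutually. One common Model II alternative is reduced (standardized) major axis regression; a tiny sketch, with my own illustrative function name and simulated data, is below.

import numpy as np

def rma_fit(x, y):
    """Reduced major axis slope and intercept for two mutually varying measurements."""
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept

# Illustrative allometry-style data: both log-dimensions measured with error.
rng = np.random.default_rng(2)
latent = rng.normal(size=200)
log_umbo = latent + rng.normal(scale=0.2, size=200)
log_length = 0.7 * latent + rng.normal(scale=0.2, size=200)
print(rma_fit(log_umbo, log_length))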

6 0.79732734 451 andrew gelman stats-2010-12-05-What do practitioners need to know about regression?

7 0.79715925 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!

8 0.77611601 1870 andrew gelman stats-2013-05-26-How to understand coefficients that reverse sign when you start controlling for things?

9 0.74405277 1967 andrew gelman stats-2013-08-04-What are the key assumptions of linear regression?

10 0.74387467 1814 andrew gelman stats-2013-04-20-A mess with which I am comfortable

11 0.72659904 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

12 0.70278084 1849 andrew gelman stats-2013-05-09-Same old same old

13 0.7021327 10 andrew gelman stats-2010-04-29-Alternatives to regression for social science predictions

14 0.69858634 375 andrew gelman stats-2010-10-28-Matching for preprocessing data for causal inference

15 0.69021952 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models

16 0.68832731 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

17 0.68226802 327 andrew gelman stats-2010-10-07-There are never 70 distinct parameters

18 0.67741877 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

19 0.65811592 14 andrew gelman stats-2010-05-01-Imputing count data

20 0.65746492 146 andrew gelman stats-2010-07-14-The statistics and the science


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.015), (6, 0.014), (16, 0.038), (21, 0.077), (24, 0.146), (38, 0.016), (40, 0.013), (42, 0.117), (49, 0.017), (62, 0.03), (69, 0.069), (90, 0.025), (99, 0.305)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95565844 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression

Introduction: Haynes Goddard writes: I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is no theory at all, just the variation in the data. I don’t find the topic on your blog, and wonder if you have addressed the issue. My reply: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once. To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not e

2 0.95478404 60 andrew gelman stats-2010-05-30-What Auteur Theory and Freshwater Economics have in common

Introduction: Mark Palko writes : We’ll define freshwater economics as the theory that economic behavior (and perhaps most non-economic behavior) can be explained using the concepts of rational actors and efficient markets and auteur theory as the idea that most films (particularly great films) represent the artistic vision of a single author (almost always the director) and the best way to approach one of those films is through the body of work of its author. Both of these definitions are oversimplified and a bit unfair but they will get the discussion started. . . . Compared to their nearest neighbors, film criticism and economics (particularly macroeconomics) are both difficult, messy fields. Films are collaborative efforts where individual contributions defy attribution and creative decisions often can’t be distinguished from accidents of filming. Worse yet, most films are the product of large corporations which means that dozens of VPs and executives might have played a role (sometimes

3 0.95141494 1692 andrew gelman stats-2013-01-25-Freakonomics Experiments

Introduction: Stephen Dubner writes : Freakonomics Experiments is a set of simple experiments about complex issues—whether to break up with your significant other, quit your job, or start a diet, just to name a few. . . . a collaboration between researchers at the University of Chicago, Freakonomics, and—we hope!—you. Steve Levitt and John List, of the University of Chicago, run the experimental and statistical side of things. Stephen Dubner, Steve Levitt, and the Freakonomics staff have given these experiments the Freakonomics twist you’re used to. Once you flip the coin, you become a member of the most important part of the collaboration, the Freakonomics Experiments team. Without your participation, we couldn’t complete any of this research. . . . You’ll choose a question that you are facing today, such as whether to quit your job or buy a house. Then you’ll provide us some background information about yourself. After that, you’ll flip the coin to find out what you should do in your situati

4 0.94410402 1535 andrew gelman stats-2012-10-16-Bayesian analogue to stepwise regression?

Introduction: Bill Harris writes: On pp. 250-251 of BDA second edition, you write about multiple comparisons, and you write about stepwise regression on p. 405. How would you look at stepwise regression analyses in light of the multiple comparisons problem? Is there an issue? My reply: In this case I think the right approach is to keep all the coefs but partially pool them toward 0 (after suitable transformation). But then the challenge is coming up with a general way to construct good prior distributions. I’m still thinking about that one! Yet another approach is to put something together purely nonparametrically as with Bart.

5 0.94169104 808 andrew gelman stats-2011-07-18-The estimated effect size is implausibly large. Under what models is this a piece of evidence that the true effect is small?

Introduction: Paul Pudaite writes in response to my discussion with Bartels regarding effect sizes and measurement error models: You [Gelman] wrote: “I actually think there will be some (non-Gaussian) models for which, as y gets larger, E(x|y) can actually go back toward zero.” I [Pudaite] encountered this phenomenon some time in the ’90s. See this graph which shows the conditional expectation of X given Z, when Z = X + Y and the probability density functions of X and Y are, respectively, exp(-x^2) and 1/(y^2+1) (times appropriate constants). As the magnitude of Z increases, E[X|Z] shrinks to zero. I wasn’t sure it was worth the effort to try to publish a two paragraph paper. I suspect that this is true whenever the tail of one distribution is ‘sufficiently heavy’ with respect to the tail of the other. Hmm, I suppose there might be enough substance in a paper that attempted to characterize this outcome for, say, unimodal symmetric distributions. Maybe someone can do this? I think i
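Pudaite's calculation is easy to reproduce numerically. The sketch below is mine (it assumes scipy): it computes E[X | Z=z] by direct integration for Z = X + Y, with the density of X proportional to exp(-x^2) and that of Y proportional to 1/(y^2+1); the conditional expectation rises for small z and then shrinks back toward zero as z grows, matching the graph described above.

import numpy as np
from scipy.integrate import quad

def e_x_given_z(z):
    # Unnormalized joint density f_X(x) * f_Y(z - x); the constants cancel in the ratio.
    joint = lambda x: np.exp(-x**2) / ((z - x)**2 + 1.0)
    num, _ = quad(lambda x: x * joint(x), -np.inf, np.inf)
    den, _ = quad(joint, -np.inf, np.inf)
    return num / den

for z in [0.5, 1, 2, 4, 8, 16, 32]:
    print(z, round(e_x_given_z(z), 4))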

6 0.93757439 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

7 0.93734759 713 andrew gelman stats-2011-05-15-1-2 social scientist + 1-2 politician = ???

8 0.93719214 590 andrew gelman stats-2011-02-25-Good introductory book for statistical computation?

9 0.93694901 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack

10 0.93674099 1223 andrew gelman stats-2012-03-20-A kaleidoscope of responses to Dubner’s criticisms of our criticisms of Freaknomics

11 0.93643904 1936 andrew gelman stats-2013-07-13-Economic policy does not occur in a political vacuum

12 0.93519342 1138 andrew gelman stats-2012-01-25-Chris Schmid on Evidence Based Medicine

13 0.93395776 2128 andrew gelman stats-2013-12-09-How to model distributions that have outliers in one direction

14 0.93305367 492 andrew gelman stats-2010-12-30-That puzzle-solving feeling

15 0.93303072 483 andrew gelman stats-2010-12-23-Science, ideology, and human origins

16 0.93039382 746 andrew gelman stats-2011-06-05-An unexpected benefit of Arrow’s other theorem

17 0.92753863 1726 andrew gelman stats-2013-02-18-What to read to catch up on multivariate statistics?

18 0.92749447 1921 andrew gelman stats-2013-07-01-Going meta on Niall Ferguson

19 0.92628312 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study

20 0.92592841 1844 andrew gelman stats-2013-05-06-Against optimism about social science