andrew_gelman_stats-2011-524 knowledge-graph by maker-knowledge-mining

524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons


meta info for this blog

Source: html

Introduction: Bill Harris writes: I’ve read your paper and presentation showing why you don’t usually worry about multiple comparisons. I see how that applies when you are comparing results across multiple settings (states, etc.). Does the same principle hold when you are exploring data to find interesting relationships? For example, you have some data, and you’re trying a series of models to see which gives you the most useful insight. Do you try your models on a subset of the data so you have another subset for confirmatory analysis later, or do you simply throw all the data against your models? My reply: I’d like to estimate all the relationships at once and use a multilevel model to do partial pooling to handle the multiplicity issues. That said, in practice, in my applied work I’m always bouncing back and forth between different hypotheses and different datasets, and often I learn a lot when next year’s data come in and I can modify my hypotheses. The trouble with the classical
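The partial-pooling reply can be made concrete with a small numerical sketch. This is only an illustration, not the method from the paper: it assumes a set of group estimates with known standard errors and fits a normal hierarchical model by method of moments, shrinking each estimate toward the grand mean (the function name `partial_pool` and the moment-based fit are choices made here for brevity).

```python
def partial_pool(est, se):
    """Shrink group estimates toward the grand mean (method-of-moments
    empirical Bayes for a normal hierarchical model)."""
    J = len(est)
    mu = sum(est) / J
    # observed variance of the estimates minus the average sampling variance
    # gives a crude between-group variance tau^2 (floored at zero)
    between = sum((y - mu) ** 2 for y in est) / (J - 1)
    tau2 = max(between - sum(s * s for s in se) / J, 0.0)
    # noisier estimates (large se) are pulled harder toward mu
    return [mu + (tau2 / (tau2 + s * s)) * (y - mu) for y, s in zip(est, se)]
```

Because all comparisons are estimated jointly and shrunk, no separate multiplicity adjustment is applied; the shrinkage itself tempers the extreme estimates.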


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Bill Harris writes: I’ve read your paper and presentation showing why you don’t usually worry about multiple comparisons. [sent-1, score-0.551]

2 I see how that applies when you are comparing results across multiple settings (states, etc.). [sent-2, score-0.571]

3 Does the same principle hold when you are exploring data to find interesting relationships? [sent-4, score-0.486]

4 For example, you have some data, and you’re trying a series of models to see which gives you the most useful insight. [sent-5, score-0.398]

5 Do you try your models on a subset of the data so you have another subset for confirmatory analysis later, or do you simply throw all the data against your models? [sent-6, score-1.319]

6 My reply: I’d like to estimate all the relationships at once and use a multilevel model to do partial pooling to handle the multiplicity issues. [sent-7, score-0.733]

7 That said, in practice, in my applied work I’m always bouncing back and forth between different hypotheses and different datasets, and often I learn a lot when next year’s data come in and I can modify my hypotheses. [sent-8, score-1.364]

8 The trouble with the classical hypothesis-testing framework, at least for me, is that so-called statistical hypotheses are very precise things, whereas the sorts of hypotheses that arise in science and social science are vaguer and are not so amenable to “testing” in the classical sense. [sent-9, score-2.025]
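The scores attached to each sentence above come from a tfidf model. A minimal sketch of that style of extractive scoring, assuming each sentence is treated as its own “document” for the idf computation (the mining pipeline’s exact weighting and normalization are not specified):

```python
import math
from collections import Counter

def score_sentences(sentences):
    """Score each tokenized sentence by the summed tf-idf weight of its words."""
    n = len(sentences)
    # document frequency: in how many sentences each word appears
    df = Counter(w for s in sentences for w in set(s))
    scores = []
    for s in sentences:
        tf = Counter(s)
        # idf smoothed so words present in every sentence still count a little
        scores.append(sum(tf[w] * math.log((1 + n) / (1 + df[w])) for w in tf))
    return scores
```

Longer sentences rich in words that are rare elsewhere in the post score highest, which is consistent with sentence 8 receiving the top score above.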


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('hypotheses', 0.363), ('relationships', 0.267), ('subset', 0.263), ('amenable', 0.199), ('classical', 0.196), ('bouncing', 0.191), ('confirmatory', 0.163), ('multiple', 0.161), ('harris', 0.16), ('models', 0.159), ('modify', 0.155), ('data', 0.143), ('exploring', 0.137), ('pooling', 0.137), ('applies', 0.13), ('forth', 0.125), ('datasets', 0.124), ('partial', 0.122), ('handle', 0.119), ('precise', 0.118), ('presentation', 0.111), ('throw', 0.109), ('arise', 0.108), ('framework', 0.108), ('hold', 0.106), ('settings', 0.105), ('worry', 0.103), ('science', 0.102), ('trouble', 0.1), ('principle', 0.1), ('comparing', 0.1), ('bill', 0.097), ('showing', 0.096), ('testing', 0.096), ('whereas', 0.095), ('series', 0.089), ('multilevel', 0.088), ('different', 0.083), ('practice', 0.083), ('states', 0.083), ('sorts', 0.083), ('gives', 0.083), ('later', 0.081), ('usually', 0.08), ('learn', 0.08), ('simply', 0.076), ('across', 0.075), ('applied', 0.071), ('next', 0.07), ('useful', 0.067)]
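The weights above form a sparse tf-idf vector for the post. The simValue entries in the lists that follow look like cosine similarities between such vectors (the pipeline’s actual measure is unstated; note the same-blog entry scores 1.0, as self-similarity would). A sketch of cosine similarity over word-weight dicts:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse tf-idf vectors (word -> weight dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

For any nonzero vector, cosine(u, u) is 1.0; vectors with no shared words score 0.0.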

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons


2 0.19498082 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

Introduction: Robert Bloomfield writes: Most of the people in my field (accounting, which is basically applied economics and finance, leavened with psychology and organizational behavior) use ‘positive research methods’, which are typically described as coming to the data with a predefined theory, and using hypothesis testing to accept or reject the theory’s predictions. But a substantial minority use ‘interpretive research methods’ (sometimes called qualitative methods, for those that call positive research ‘quantitative’). No one seems entirely happy with the definition of this method, but I’ve found it useful to think of it as an attempt to see the world through the eyes of your subjects, much as Jane Goodall lived with gorillas and tried to see the world through their eyes. Interpretive researchers often criticize positive researchers by noting that the latter don’t make the best use of their data, because they come to the data with a predetermined theory, and only test a narrow set of h

3 0.17248073 114 andrew gelman stats-2010-06-28-More on Bayesian deduction-induction

Introduction: Kevin Bryan wrote: I read your new article on deduction/induction under Bayes. There are a couple interesting papers from economic decision theory which are related that you might find interesting. Samuelson et al have a (very) recent paper about what happens when you have some Bayesian and some non-Bayesian hypotheses. (I mentioned this one on my blog earlier this year.) Essentially, the Bayesian hypotheses are forced to “make predictions” in every future period (“if the unemployment rate is x%, the president is reelected with pr=x”), whereas other forms of reasoning (say, analogies: “If the unemployment rate is above 10%, the president will not be reelected”). Imagine you have some prior over, say, the economy and elections, with 99.9% of the hypotheses being Bayesian and the rest being analogies as above. Then 100 years from now, because the analogies are so hard to refute, using deduction will push the proportion of Bayesian hypotheses toward zero. There is a

4 0.15700689 544 andrew gelman stats-2011-01-29-Splitting the data

Introduction: Antonio Rangel writes: I’m a neuroscientist at Caltech . . . I’m using the debate on the ESP paper, as I’m sure other labs around the world are, as an opportunity to discuss some basic statistical issues/ideas w/ my lab. Request: Is there any chance you would be willing to share your thoughts about the difference between exploratory “data mining” studies and confirmatory studies? What I have in mind is that one could use a dataset to explore/discover novel hypotheses and then conduct another experiment to test those hypotheses rigorously. It seems that a good combination of both approaches could be the best of both worlds, since the first would lead to novel hypothesis discovery, and the latter to careful testing. . . it is a fundamental issue for neuroscience and psychology. My reply: I know that people talk about this sort of thing . . . but in any real setting, I think I’d want all my data right now to answer any questions I have. I like cross-validation and have used

5 0.14558654 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one

Introduction: After I gave my talk at an econ seminar on Why We (Usually) Don’t Care About Multiple Comparisons, I got the following comment: One question that came up later was whether your argument is really with testing in general, rather than only with testing in multiple comparison settings. My reply: Yes, my argument is with testing in general. But it arises with particular force in multiple comparisons. With a single test, we can just say we dislike testing so we use confidence intervals or Bayesian inference instead, and it’s no problem—really more of a change in emphasis than a change in methods. But with multiple tests, the classical advice is not simply to look at type 1 error rates but more specifically to make a multiplicity adjustment, for example to make confidence intervals wider to account for multiplicity. I don’t want to do this! So here there is a real battle to fight. P.S. Here’s the article (with Jennifer and Masanao), to appear in the Journal of Research on

6 0.1334568 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models

7 0.13230208 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

8 0.12865388 112 andrew gelman stats-2010-06-27-Sampling rate of human-scaled time series

9 0.12668926 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems

10 0.12198928 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

11 0.11770605 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

12 0.11740114 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

13 0.1152327 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

14 0.11385874 1989 andrew gelman stats-2013-08-20-Correcting for multiple comparisons in a Bayesian regression model

15 0.11347898 704 andrew gelman stats-2011-05-10-Multiple imputation and multilevel analysis

16 0.11027274 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

17 0.11022341 2007 andrew gelman stats-2013-09-03-Popper and Jaynes

18 0.10951585 608 andrew gelman stats-2011-03-12-Single or multiple imputation?

19 0.10619156 1469 andrew gelman stats-2012-08-25-Ways of knowing

20 0.10442236 2326 andrew gelman stats-2014-05-08-Discussion with Steven Pinker on research that is attached to data that are so noisy as to be essentially uninformative


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.204), (1, 0.094), (2, -0.017), (3, -0.024), (4, 0.02), (5, -0.008), (6, -0.088), (7, 0.0), (8, 0.072), (9, 0.043), (10, 0.001), (11, 0.034), (12, -0.009), (13, -0.046), (14, 0.025), (15, 0.012), (16, -0.042), (17, -0.051), (18, 0.006), (19, -0.046), (20, 0.002), (21, -0.034), (22, -0.036), (23, 0.035), (24, -0.065), (25, -0.071), (26, -0.009), (27, -0.023), (28, 0.045), (29, -0.013), (30, 0.034), (31, 0.003), (32, 0.056), (33, -0.03), (34, -0.034), (35, 0.024), (36, 0.026), (37, -0.02), (38, 0.035), (39, 0.008), (40, 0.009), (41, 0.046), (42, -0.023), (43, -0.011), (44, -0.045), (45, -0.025), (46, 0.017), (47, -0.009), (48, -0.044), (49, -0.11)]
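The 50 LSI topic weights above are coordinates of this post in a latent semantic space. As an illustration (assuming raw term counts and a truncated SVD; the pipeline’s actual preprocessing is unknown), such coordinates can be computed like this:

```python
import numpy as np

def lsi_coords(docs, k=2):
    """Project tokenized docs into a k-dimensional latent semantic space."""
    vocab = sorted({w for d in docs for w in d})
    idx = {w: j for j, w in enumerate(vocab)}
    # raw term-document count matrix: rows = docs, cols = vocab
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d:
            X[i, idx[w]] += 1.0
    # X ~= U S Vt; rows of U*S are the per-document topic weights
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * S[:k]
```

Each row of the result is a list of (topicId, topicWeight) values of the kind shown above, one coordinate per retained singular direction.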

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9576478 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons


2 0.77554286 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

Introduction: James O’Brien writes: How would you explain, to a “classically-trained” hypothesis-tester, that “It’s OK to fit a multilevel model even if some groups have only one observation each”? I [O'Brien] think I understand the logic and the statistical principles at work in this, but I’m having trouble being clear and persuasive. I also feel like I’m contending with some methodological conventional wisdom here. My reply: I’m so used to this idea that I find it difficult to defend it in some sort of general conceptual way. So let me retreat to a more functional defense, which is that multilevel modeling gives good estimates, especially when the number of observations per group is small. One way to see this in any particular example is through cross-validation. Another way is to consider the alternatives. If you try really hard you can come up with a “classical hypothesis testing” approach which will do as well as the multilevel model. It would just take a lot of work. I’d r

3 0.75105476 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou

4 0.74613506 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

Introduction: Robert Bloomfield writes: Most of the people in my field (accounting, which is basically applied economics and finance, leavened with psychology and organizational behavior) use ‘positive research methods’, which are typically described as coming to the data with a predefined theory, and using hypothesis testing to accept or reject the theory’s predictions. But a substantial minority use ‘interpretive research methods’ (sometimes called qualitative methods, for those that call positive research ‘quantitative’). No one seems entirely happy with the definition of this method, but I’ve found it useful to think of it as an attempt to see the world through the eyes of your subjects, much as Jane Goodall lived with gorillas and tried to see the world through their eyes. Interpretive researchers often criticize positive researchers by noting that the latter don’t make the best use of their data, because they come to the data with a predetermined theory, and only test a narrow set of h

5 0.74573237 704 andrew gelman stats-2011-05-10-Multiple imputation and multilevel analysis

Introduction: Robert Birkelbach: I am writing my Bachelor Thesis in which I want to assess the reading competencies of German elementary school children using the PIRLS2006 data. My levels are classrooms and the individuals. However, my dependent variable is a multiple imputed (m=5) reading test. The problem I have is, that I do not know, whether I can just calculate 5 linear multilevel models and then average all the results (the coefficients, standard deviation, bic, intra class correlation, R2, t-statistics, p-values etc) or if I need different formulas for integrating the results of the five models into one because it is a multilevel analysis? Do you think there’s a better way in solving my problem? I would greatly appreciate if you could help me with a problem regarding my analysis — I am quite a newbie to multilevel modeling and especially to multiple imputation. Also: Is it okay to use frequentist models when the multiple imputation was done bayesian? Would the different philosophies of sc

6 0.74095792 948 andrew gelman stats-2011-10-10-Combining data from many sources

7 0.73474693 2294 andrew gelman stats-2014-04-17-If you get to the point of asking, just do it. But some difficulties do arise . . .

8 0.73213667 1270 andrew gelman stats-2012-04-19-Demystifying Blup

9 0.73189294 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

10 0.72976398 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

11 0.72966242 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

12 0.72851455 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

13 0.72418445 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models

14 0.7208119 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

15 0.71817493 421 andrew gelman stats-2010-11-19-Just chaid

16 0.71373689 1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis

17 0.70586473 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

18 0.70316958 424 andrew gelman stats-2010-11-21-Data cleaning tool!

19 0.70292729 690 andrew gelman stats-2011-05-01-Peter Huber’s reflections on data analysis

20 0.70224506 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.102), (17, 0.032), (21, 0.026), (24, 0.123), (25, 0.057), (30, 0.019), (36, 0.017), (42, 0.019), (63, 0.03), (76, 0.015), (86, 0.038), (99, 0.422)]
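The LDA weights above are per-document topic proportions. As an illustration only (the mining pipeline’s LDA implementation, topic count, and hyperparameters are unknown), a tiny collapsed Gibbs sampler that produces such (topicId, topicWeight) pairs:

```python
import random

def lda_topic_weights(docs, K=2, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-doc topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    wid = {w: j for j, w in enumerate(vocab)}
    # random initial topic assignment for every token
    z = [[rng.randrange(K) for _ in d] for d in docs]
    ndk = [[0] * K for _ in docs]        # doc -> topic counts
    nkw = [[0] * V for _ in range(K)]    # topic -> word counts
    nk = [0] * K                         # topic totals
    for d, doc in enumerate(docs):
        for n, w in enumerate(doc):
            k = z[d][n]
            ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[d][n]
                # remove the token, then resample its topic from the
                # full conditional p(k) ∝ (ndk+alpha)*(nkw+beta)/(nk+V*beta)
                ndk[d][k] -= 1; nkw[k][wid[w]] -= 1; nk[k] -= 1
                weights = [(ndk[d][j] + alpha) * (nkw[j][wid[w]] + beta)
                           / (nk[j] + V * beta) for j in range(K)]
                r = rng.random() * sum(weights)
                k = K - 1
                for j, wt in enumerate(weights):
                    r -= wt
                    if r <= 0:
                        k = j
                        break
                z[d][n] = k
                ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
    # smoothed topic proportions per document; each row sums to 1
    return [[(ndk[d][j] + alpha) / (len(doc) + K * alpha) for j in range(K)]
            for d, doc in enumerate(docs)]
```

The sparse weight list above presumably keeps only the topics whose proportion exceeds some threshold, which is why most of the topic ids are missing.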

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98453343 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons


2 0.98048115 2235 andrew gelman stats-2014-03-06-How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?

Introduction: I had a brief email exchange with Jeff Leek regarding our recent discussions of replication, criticism, and the self-correcting process of science. Jeff writes: (1) I can see the problem with serious, evidence-based criticisms not being published in the same journal (and linked to) studies that are shown to be incorrect. I have been mostly seeing these sorts of things show up in blogs. But I’m not sure that is a bad thing. I think people read blogs more than they read the literature. I wonder if this means that blogs will eventually be a sort of “shadow literature”? (2) I think there is a ton of bad literature out there, just like there is a ton of bad stuff on Google. If we focus too much on the bad stuff we will be paralyzed. I still manage to find good papers despite all the bad papers. (3) I think one positive solution to this problem is to incentivize/publish referee reports and give people credit for a good referee report just like they get credit for a good paper. T

3 0.97985333 859 andrew gelman stats-2011-08-18-Misunderstanding analysis of covariance

Introduction: Jeremy Miles writes: Are you familiar with Miller and Chapman’s (2001) article, Misunderstanding Analysis of Covariance, saying that ANCOVA (and therefore, I suppose, regression) should not be used when groups differ on a covariate? It has caused a moderate splash in psychology circles. I wondered if you had any thoughts on it. I had not heard of the article so I followed the link . . . ugh! Already on the very first column of the very first page they confuse nonadditivity with nonlinearity. I could probably continue with, “and it gets worse,” but since nobody’s paying me to read this one, I’ll stop reading right there on the first page! I prefer when people point me to good papers to read. . . .

4 0.97947353 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

Introduction: My (coauthored) books on Bayesian data analysis and applied regression are like almost all the other statistics textbooks out there, in that we spend most of our time on the basic distributions such as normal and logistic and then, only as an aside, discuss robust models such as t and robit. Why aren’t the t and robit front and center? Sure, I can see starting with the normal (at least in the Bayesian book, where we actually work out all the algebra), but then why don’t we move on immediately to the real stuff? This isn’t just (or mainly) a question of textbooks or teaching; I’m really thinking here about statistical practice. My statistical practice. Should t and robit be the default? If not, why not? Some possible answers: 10. Estimating the degrees of freedom in the error distribution isn’t so easy, and throwing this extra parameter into the model could make inference unstable. 9. Real data usually don’t have outliers. In practice, fitting a robust model costs you

5 0.97904032 1577 andrew gelman stats-2012-11-14-Richer people continue to vote Republican

Introduction: From the exit polls: This is all pretty obvious but it seemed worth posting because some people still don’t seem to get it. For example, Jay Cost, writing in the Weekly Standard: The Democratic party now dominates the Upper East Side of Manhattan, as well as the wealthiest neighborhoods in the most powerful cities. And yet Republicans are still effectively castigated as the party of the rich. They are not — at least not any more than the Democratic party is. Arguably, both the Democrats and the Republicans are “the party of the rich.” But Republicans more so than Democrats (see above graph, also consider the debates over the estate tax and upper-income tax rates). Cost writes: Sure, the GOP favors tax rate reductions to generate economic growth, but the Democratic party has proven itself ready, willing, and able to dole out benefits to the well-heeled rent-seekers who swarm Washington, D.C. looking for favors from Uncle Sam. But he’s missing the point. The par

6 0.9790262 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

7 0.978944 2255 andrew gelman stats-2014-03-19-How Americans vote

8 0.97892874 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics

9 0.97803152 1100 andrew gelman stats-2012-01-05-Freakonomics: Why ask “What went wrong?”

10 0.97748768 451 andrew gelman stats-2010-12-05-What do practitioners need to know about regression?

11 0.97740483 315 andrew gelman stats-2010-10-03-He doesn’t trust the fit . . . r=.999

12 0.9772979 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

13 0.97723997 2158 andrew gelman stats-2014-01-03-Booze: Been There. Done That.

14 0.97723776 2084 andrew gelman stats-2013-11-01-Doing Data Science: What’s it all about?

15 0.97716713 1289 andrew gelman stats-2012-04-29-We go to war with the data we have, not the data we want

16 0.97701377 989 andrew gelman stats-2011-11-03-This post does not mention Wegman

17 0.97693115 2279 andrew gelman stats-2014-04-02-Am I too negative?

18 0.97669053 154 andrew gelman stats-2010-07-18-Predictive checks for hierarchical models

19 0.97667348 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

20 0.97650576 690 andrew gelman stats-2011-05-01-Peter Huber’s reflections on data analysis