1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids


meta infos for this blog

Source: html

Introduction: Yarden Katz writes: I’m probably not the first to point this out, but just in case, you might be interested in this article by T. Florian Jaeger, Daniel Pontillo, and Peter Graff on a statistical dispute [regarding the claim, "Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa"]. Seems directly relevant to your article on multiple hypothesis testing and associated talk at the Voodoo correlations meeting. Curious to know your thoughts on this if you think it’s blog-worthy. Here’s the abstract of the paper: Atkinson (Reports, 15 April 2011, p. 346) argues that the phonological complexity of languages reflects the loss of phonemic distinctions due to successive founder events during human migration (the serial founder hypothesis). Statistical simulations show that the type I error rate of Atkinson’s analysis is hugely inflated. The data at best support only a weak interpretation of the serial founder hypothesis. My reaction: I d


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Florian Jaeger, Daniel Pontillo, and Peter Graff on a statistical dispute [regarding the claim, "Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa"]. [sent-2, score-0.145]

2 Seems directly relevant to your article on multiple hypothesis testing and associated talk at the Voodoo correlations meeting. [sent-3, score-0.239]

3 346) argues that the phonological complexity of languages reflects the loss of phonemic distinctions due to successive founder events during human migration (the serial founder hypothesis). [sent-6, score-2.295]

4 Statistical simulations show that the type I error rate of Atkinson’s analysis is hugely inflated. [sent-7, score-0.146]

5 The data at best support only a weak interpretation of the serial founder hypothesis. [sent-8, score-0.852]

6 My reaction: I did not look at either the science or the statistics in detail so I can’t judge the arguments being made on the two sides, but one thing I would like to comment on, and disagree with, is the implication that the goal of a statistical analysis is to find a correct p-value. [sent-9, score-0.205]

7 For example, in his response, Quentin Atkinson writes, “What we really want to know, however, is the probability of finding an effect of distance from any origin by chance that is at least as large as the effect we observe in the real data. [sent-10, score-0.444]

8 ” I don’t always mind p-values—it can often be useful to check model fit by comparing observed to potentially replicated data, thus giving a sense of whether an observed pattern can easily be explained by chance. [sent-11, score-0.423]

9 But I’d prefer to summarize via inferences on scientifically-meaningful parameters of the model, for example the magnitudes of the different founder events (or whatever is a reasonable way to look at these models). [sent-12, score-0.838]

10 0001, I think it makes more sense to convert these discrepancies into statements about directly interpretable and generalizable parameters. [sent-15, score-0.454]
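The quantity described in sentence 7 (the chance of an effect of distance from any origin at least as large as the observed one) can be approximated with a max-statistic permutation test. What follows is only a minimal sketch under strong assumptions: locations are reduced to hypothetical one-dimensional positions, the effect is a plain Pearson correlation, and the function names are invented for illustration. It is not the analysis used by Atkinson or by Jaeger, Pontillo, and Graff.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def best_origin_stat(positions, values, origins):
    """Most negative distance/value correlation over all candidate origins."""
    return min(
        pearson([abs(p - o) for p in positions], values) for o in origins
    )


def max_stat_p_value(positions, values, origins, n_perm=2000, seed=0):
    """Permutation p-value that accounts for having searched over origins.

    Each shuffle breaks any link between position and value; the best
    origin is then re-selected on the shuffled data, so the null
    distribution reflects the same search that produced the observed statistic.
    """
    rng = random.Random(seed)
    observed = best_origin_stat(positions, values, origins)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_origin_stat(positions, shuffled, origins) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Testing a single pre-chosen origin against an unadjusted null would ignore the search over origins, which is one way a type I error rate can become inflated. Even an adjusted p-value, though, says nothing about the magnitude of the effect, which is the concern raised in sentences 9 and 10.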


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('founder', 0.468), ('atkinson', 0.375), ('serial', 0.322), ('phonemic', 0.25), ('events', 0.12), ('graff', 0.114), ('successive', 0.107), ('observed', 0.106), ('voodoo', 0.103), ('generalizable', 0.103), ('migration', 0.103), ('effect', 0.101), ('distinctions', 0.099), ('katz', 0.099), ('quentin', 0.096), ('origin', 0.094), ('interpretable', 0.094), ('hypothesis', 0.092), ('parameters', 0.091), ('magnitudes', 0.088), ('discrepancies', 0.088), ('africa', 0.086), ('directly', 0.086), ('convert', 0.083), ('hugely', 0.081), ('languages', 0.081), ('replicated', 0.08), ('observe', 0.08), ('diversity', 0.079), ('dispute', 0.079), ('reflects', 0.078), ('supports', 0.078), ('expansion', 0.075), ('april', 0.073), ('implication', 0.073), ('summarize', 0.071), ('complexity', 0.07), ('distance', 0.068), ('model', 0.067), ('sides', 0.067), ('judge', 0.066), ('statistical', 0.066), ('argues', 0.066), ('simulations', 0.065), ('potentially', 0.064), ('peter', 0.064), ('loss', 0.063), ('daniel', 0.063), ('weak', 0.062), ('correlations', 0.061)]
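The similarity values in the list below are presumably cosine similarities between tf-idf vectors like the one above. The page does not state its exact weighting, so the following is a sketch under assumptions: whitespace tokenization, a smoothed idf, and L2 normalization are all illustrative choices, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized {term: weight} dict per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {
            # Smoothed idf (an assumption, not the page's documented formula).
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())
```

Under this scheme a document compared with itself scores 1.0, consistent with the same-blog entry in the tfidf list, while documents sharing no terms score 0.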

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

2 0.088502616 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

Introduction: Robert Bell pointed me to this post by Brad De Long on Bayesian statistics, and then I also noticed this from Noah Smith, who wrote: My impression is that although the Bayesian/Frequentist debate is interesting and intellectually fun, there’s really not much “there” there… despite being so-hip-right-now, Bayesian is not the Statistical Jesus. I’m happy to see the discussion going in this direction. Twenty-five years ago or so, when I got into this biz, there were some serious anti-Bayesian attitudes floating around in mainstream statistics. Discussions in the journals sometimes devolved into debates of the form, “Bayesians: knaves or fools?”. You’d get all sorts of free-floating skepticism about any prior distribution at all, even while people were accepting without question (and doing theory on) logistic regressions, proportional hazards models, and all sorts of strong strong models. (In the subfield of survey sampling, various prominent researchers would refuse to mode

3 0.087144889 1605 andrew gelman stats-2012-12-04-Write This Book

Introduction: This post is by Phil Price. I’ve been preparing a review of a new statistics textbook aimed at students and practitioners in the “physical sciences,” as distinct from the social sciences and also distinct from people who intend to take more statistics courses. I figured that since it’s been years since I looked at an intro stats textbook, I should look at a few others and see how they differ from this one, so in addition to the book I’m reviewing I’ve looked at some other textbooks aimed at similar audiences: Milton and Arnold; Hines, Montgomery, Goldsman, and Borror; and a few others. I also looked at the table of contents of several more. There is a lot of overlap in the coverage of these books — they all have discussions of common discrete and continuous distributions, joint distributions, descriptive statistics, parameter estimation, hypothesis testing, linear regression, ANOVA, factorial experimental design, and a few other topics. I can see how, from a statisti

4 0.083711259 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism

Introduction: Rob Kass’s article on statistical pragmatism is scheduled to appear in Statistical Science along with some discussions. Here are my comments. I agree with Rob Kass’s point that we can and should make use of statistical methods developed under different philosophies, and I am happy to take the opportunity to elaborate on some of his arguments. I’ll discuss the following: - Foundations of probability - Confidence intervals and hypothesis tests - Sampling - Subjectivity and belief - Different schools of statistics Foundations of probability. Kass describes probability theory as anchored upon physical randomization (coin flips, die rolls and the like) but being useful more generally as a mathematical model. I completely agree but would also add another anchoring point: calibration. Calibration of probability assessments is an objective, not subjective process, although some subjectivity (or scientific judgment) is necessarily involved in the choice of events used

5 0.083542347 1883 andrew gelman stats-2013-06-04-Interrogating p-values

Introduction: This article is a discussion of a paper by Greg Francis for a special issue, edited by E. J. Wagenmakers, of the Journal of Mathematical Psychology. Here’s what I wrote: Much of statistical practice is an effort to reduce or deny variation and uncertainty. The reduction is done through standardization, replication, and other practices of experimental design, with the idea being to isolate and stabilize the quantity being estimated and then average over many cases. Even so, however, uncertainty persists, and statistical hypothesis testing is in many ways an endeavor to deny this, by reporting binary accept/reject decisions. Classical statistical methods produce binary statements, but there is no reason to assume that the world works that way. Expressions such as Type 1 error, Type 2 error, false positive, and so on, are based on a model in which the world is divided into real and non-real effects. To put it another way, I understand the general scientific distinction of real vs

6 0.081123084 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

7 0.080571279 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

8 0.078940667 317 andrew gelman stats-2010-10-04-Rob Kass on statistical pragmatism, and my reactions

9 0.078443557 1024 andrew gelman stats-2011-11-23-Of hypothesis tests and Unitarians

10 0.077801093 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

11 0.075155891 1941 andrew gelman stats-2013-07-16-Priors

12 0.075030386 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

13 0.074095421 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?

14 0.07264547 404 andrew gelman stats-2010-11-09-“Much of the recent reported drop in interstate migration is a statistical artifact”

15 0.071492963 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

16 0.071132936 1665 andrew gelman stats-2013-01-10-That controversial claim that high genetic diversity, or low genetic diversity, is bad for the economy

17 0.070379972 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing

18 0.070291005 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

19 0.069762059 1434 andrew gelman stats-2012-07-29-FindTheData.org

20 0.068152741 2007 andrew gelman stats-2013-09-03-Popper and Jaynes


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.159), (1, 0.06), (2, 0.013), (3, -0.033), (4, -0.011), (5, -0.03), (6, -0.014), (7, 0.007), (8, 0.031), (9, -0.013), (10, -0.045), (11, 0.018), (12, -0.007), (13, -0.048), (14, -0.012), (15, 0.005), (16, -0.021), (17, -0.029), (18, -0.008), (19, -0.011), (20, 0.003), (21, -0.019), (22, 0.005), (23, -0.015), (24, -0.035), (25, 0.006), (26, -0.005), (27, 0.006), (28, 0.009), (29, -0.038), (30, -0.004), (31, 0.0), (32, 0.005), (33, -0.005), (34, -0.036), (35, 0.004), (36, 0.023), (37, -0.027), (38, 0.017), (39, 0.014), (40, -0.006), (41, -0.006), (42, -0.019), (43, 0.009), (44, 0.004), (45, 0.012), (46, 0.006), (47, -0.034), (48, 0.008), (49, -0.011)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96801287 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

2 0.84770906 1883 andrew gelman stats-2013-06-04-Interrogating p-values

Introduction: This article is a discussion of a paper by Greg Francis for a special issue, edited by E. J. Wagenmakers, of the Journal of Mathematical Psychology. Here’s what I wrote: Much of statistical practice is an effort to reduce or deny variation and uncertainty. The reduction is done through standardization, replication, and other practices of experimental design, with the idea being to isolate and stabilize the quantity being estimated and then average over many cases. Even so, however, uncertainty persists, and statistical hypothesis testing is in many ways an endeavor to deny this, by reporting binary accept/reject decisions. Classical statistical methods produce binary statements, but there is no reason to assume that the world works that way. Expressions such as Type 1 error, Type 2 error, false positive, and so on, are based on a model in which the world is divided into real and non-real effects. To put it another way, I understand the general scientific distinction of real vs

3 0.8252123 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

Introduction: Robert Bloomfield writes: Most of the people in my field (accounting, which is basically applied economics and finance, leavened with psychology and organizational behavior) use ‘positive research methods’, which are typically described as coming to the data with a predefined theory, and using hypothesis testing to accept or reject the theory’s predictions. But a substantial minority use ‘interpretive research methods’ (sometimes called qualitative methods, for those that call positive research ‘quantitative’). No one seems entirely happy with the definition of this method, but I’ve found it useful to think of it as an attempt to see the world through the eyes of your subjects, much as Jane Goodall lived with gorillas and tried to see the world through their eyes. Interpretive researchers often criticize positive researchers by noting that the latter don’t make the best use of their data, because they come to the data with a predetermined theory, and only test a narrow set of h

4 0.80782586 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

Introduction: Sam Seaver writes: I [Seaver] happened to be reading an ironic article by Karl Friston when I learned something new about frequentist vs bayesian, namely Lindley’s paradox, on page 12. The text is as follows: So why are we worried about trivial effects? They are important because the probability that the true effect size is exactly zero is itself zero and could cause us to reject the null hypothesis inappropriately. This is a fallacy of classical inference and is not unrelated to Lindley’s paradox (Lindley 1957). Lindley’s paradox describes a counterintuitive situation in which Bayesian and frequentist approaches to hypothesis testing give opposite results. It occurs when; (i) a result is significant by a frequentist test, indicating sufficient evidence to reject the null hypothesis d=0 and (ii) priors render the posterior probability of d=0 high, indicating strong evidence that the null hypothesis is true. In his original treatment, Lindley (1957) showed that – under a parti

5 0.80744404 1095 andrew gelman stats-2012-01-01-Martin and Liu: Probabilistic inference based on consistency of model with data

Introduction: What better way to start the new year than with some hard-core statistical theory? Ryan Martin and Chuanhai Liu send along a new paper on inferential models: Probability is a useful tool for describing uncertainty, so it is natural to strive for a system of statistical inference based on probabilities for or against various hypotheses. But existing probabilistic inference methods struggle to provide a meaningful interpretation of the probabilities across experiments in sufficient generality. In this paper we further develop a promising new approach based on what are called inferential models (IMs). The fundamental idea behind IMs is that there is an unobservable auxiliary variable that itself describes the inherent uncertainty about the parameter of interest, and that posterior probabilistic inference can be accomplished by predicting this unobserved quantity. We describe a simple and intuitive three-step construction of a random set of candidate parameter values, each being co

6 0.80563241 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

7 0.80395234 1299 andrew gelman stats-2012-05-04-Models, assumptions, and data summaries

8 0.80094892 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value

9 0.79959589 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

10 0.79352415 1605 andrew gelman stats-2012-12-04-Write This Book

11 0.79088247 1409 andrew gelman stats-2012-07-08-Is linear regression unethical in that it gives more weight to cases that are far from the average?

12 0.78798449 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

13 0.77671659 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems

14 0.77343959 1215 andrew gelman stats-2012-03-16-The “hot hand” and problems with hypothesis testing

15 0.77105081 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

16 0.76743978 1861 andrew gelman stats-2013-05-17-Where do theories come from?

17 0.76281792 2149 andrew gelman stats-2013-12-26-Statistical evidence for revised standards

18 0.76158923 2243 andrew gelman stats-2014-03-11-The myth of the myth of the myth of the hot hand

19 0.75796801 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

20 0.75555223 2029 andrew gelman stats-2013-09-18-Understanding posterior p-values


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.016), (15, 0.015), (16, 0.066), (21, 0.032), (22, 0.011), (24, 0.157), (28, 0.01), (30, 0.233), (84, 0.024), (86, 0.042), (88, 0.017), (95, 0.023), (99, 0.241)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95227098 41 andrew gelman stats-2010-05-19-Updated R code and data for ARM

Introduction: Patricia and I have cleaned up some of the R and Bugs code and collected the data for almost all the examples in ARM. See here for links to zip files with the code and data.

2 0.95024896 1188 andrew gelman stats-2012-02-28-Reference on longitudinal models?

Introduction: Antonio Ramos writes: The book with Hill has very little on longitudinal models. So do you recommend any reference to complement your book on covariance structures typical from these models, such as AR(1), Antedependence, Factor Analytic, etc? I am very much interested in BUGS code for these basic models as well as how to extend them to more complex situations. My reply: There is a book by Banerjee, Carlin, and Gelfand on Bayesian space-time models. Beyond that, I think there is good work in psychometrics on covariance structures but I don’t know the literature.

3 0.93057656 179 andrew gelman stats-2010-08-03-An Olympic size swimming pool full of lithium water

Introduction: As part of his continuing plan to sap etc etc., Aleks pointed me to an article by Max Miller reporting on a recommendation from Jacob Appel: Adding trace amounts of lithium to the drinking water could limit suicides. . . . Communities with higher than average amounts of lithium in their drinking water had significantly lower suicide rates than communities with lower levels. Regions of Texas with lower lithium concentrations had an average suicide rate of 14.2 per 100,000 people, whereas those areas with naturally higher lithium levels had a dramatically lower suicide rate of 8.7 per 100,000. The highest levels in Texas (150 micrograms of lithium per liter of water) are only a thousandth of the minimum pharmaceutical dose, and have no known deleterious effects. I don’t know anything about this and am offering no judgment on it; I’m just passing it on. The research studies are here and here. I am skeptical, though, about this part of the argument: We are not talking a

4 0.91684437 1416 andrew gelman stats-2012-07-14-Ripping off a ripoff

Introduction: I opened the newspaper today (recall that this blog is on an approximately one-month delay) to see a moderately horrifying story about art appraisers who are deterred by fear of lawsuits from expressing an opinion about possible forgeries. Maybe this trend will come to science too? Perhaps Brett Pelham will sue Uri Simonsohn for the pain, suffering, and loss of income occurring from the questioning of his Dennis the dentist paper? Or maybe I’ll be sued by some rogue sociologist for publicly questioning his data dredging? Anyway, what amused me about the NYT article on art forgery was that two of the artists featured in the discussion were . . . Andy Warhol and Roy Lichtenstein! Warhol is famous for diluting the notion of the unique art object and for making works of art in a “Factory,” and Lichtenstein is famous for ripping off the style and imagery of comic book artists. It’s funny for the two of them, of all people, to come up in a discussion of authenticity. Or maybe it

5 0.90884054 1259 andrew gelman stats-2012-04-11-How things sound to us, versus how they sound to others

Introduction: Hykel Hosni noticed this bit from the Lindley Prize page of the Society for Bayesian Analysis: Lindley became a great missionary for the Bayesian gospel. The atmosphere of the Bayesian revival is captured in a comment by Rivett on Lindley’s move to University College London and the premier chair of statistics in Britain: “it was as though a Jehovah’s Witness had been elected Pope.” From my perspective, this was amusing (if commonplace): a group of rationalists jocularly characterizing themselves as religious fanatics. And some of this is in response to intense opposition from outsiders (see the Background section here). That’s my view. I’m an insider, a statistician who’s heard all jokes about religious Bayesians, from Bayesian and non-Bayesian statisticians alike. But Hosni is an outsider, and here’s how he sees the above-quoted paragraph: Research, however, is not a matter of faith but a matter of arguments, which should always be evaluated with the utmost intellec

6 0.90166026 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program

same-blog 7 0.90137756 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

8 0.88926899 1265 andrew gelman stats-2012-04-15-Progress in U.S. education; also, a discussion of what it takes to hit the op-ed pages

9 0.88617539 593 andrew gelman stats-2011-02-27-Heat map

10 0.87959361 1623 andrew gelman stats-2012-12-14-GiveWell charity recommendations

11 0.87828207 1768 andrew gelman stats-2013-03-18-Mertz’s reply to Unz’s response to Mertz’s comments on Unz’s article

12 0.86875248 1831 andrew gelman stats-2013-04-29-The Great Race

13 0.86796927 1429 andrew gelman stats-2012-07-26-Our broken scholarly publishing system

14 0.85202801 1497 andrew gelman stats-2012-09-15-Our blog makes connections!

15 0.82361341 631 andrew gelman stats-2011-03-28-Explaining that plot.

16 0.82120013 2073 andrew gelman stats-2013-10-22-Ivy Jew update

17 0.81803888 450 andrew gelman stats-2010-12-04-The Joy of Stats

18 0.81718349 1936 andrew gelman stats-2013-07-13-Economic policy does not occur in a political vacuum

19 0.81019187 852 andrew gelman stats-2011-08-13-Checking your model using fake data

20 0.80905867 109 andrew gelman stats-2010-06-25-Classics of statistics