andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1691 knowledge-graph by maker-knowledge-mining

1691 andrew gelman stats-2013-01-25-Extreem p-values!


meta info for this blog

Source: html

Introduction: Joshua Vogelstein writes: I know you’ve discussed this on your blog in the past, but I don’t know exactly how you’d answer the following query: Suppose you run an analysis and obtain a p-value of 10^-300. What would you actually report? I’m fairly confident that I’m not that confident :) I’m guessing: “p-value \approx 0.” One possibility is to determine the accuracy with which one *could* in theory know, by virtue of the sample size, and say that p-value is less than or equal to that? For example, if I used a Monte Carlo approach to generate the null distribution with 10,000 samples, and I found that the observed value was more extreme than all of the sample values, then I might say that p is less than or equal to 1/10,000. My reply: Mosteller and Wallace talked a bit about this in their book, the idea that there are various other 1-in-a-million possibilities (for example, the data were faked somewhere before they got to you) so p-values such as 10^-6 don’t really mean anything. On the other hand, in some fields such as genetics with extreem multiple comparisons issues, they demand p-values on the order of 10^-6 before doing anything at all. Here I think the solution is multilevel modeling (which may well be done implicitly as part of a classical multiple comparisons adjustment procedure). In general, I think the way to go is to move away from p-values and instead focus directly on effect sizes.
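The Monte Carlo point above has a standard concrete form. Below is a minimal sketch in Python (the language, the seed, and the standard-normal null are illustrative assumptions, not anything from the post) of the usual (r + 1) / (n + 1) estimator, which builds in exactly the "p is less than or equal to 1/10,000" floor that Vogelstein describes:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_p_value(observed, null_draws):
    """Monte Carlo p-value via the (r + 1) / (n + 1) estimator, where r is
    the number of null draws at least as extreme as the observed statistic.
    When r = 0 this returns 1 / (n + 1), the resolution limit of the
    simulation, so the honest report is "p <= 1/(n+1)", never "p = 0"."""
    null_draws = np.asarray(null_draws)
    r = int(np.sum(null_draws >= observed))
    return (r + 1) / (null_draws.size + 1)

# 10,000 null samples, as in the example above; the observed statistic is
# far beyond every draw, so the estimate sits at the floor.
null = rng.standard_normal(10_000)
print(mc_p_value(8.0, null))  # about 1e-4, no matter how extreme the data
```

However small the true p-value (even 10^-300), the simulation can never report anything below this floor.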


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Joshua Vogelstein writes: I know you’ve discussed this on your blog in the past, but I don’t know exactly how you’d answer the following query: Suppose you run an analysis and obtain a p-value of 10^-300. [sent-1, score-0.485]

2 I’m fairly confident that I’m not that confident :) I’m guessing: “p-value \approx 0. [sent-3, score-0.692]

3 ” One possibility is to determine the accuracy with which one *could* in theory know, by virtue of the sample size, and say that p-value is less than or equal to that? [sent-4, score-1.027]

4 For example, if I used a Monte Carlo approach to generate the null distribution with 10,000 samples, and I found that the observed value was more extreme than all of the sample values, then I might say that p is less than or equal to 1/10,000. [sent-5, score-0.985]

5 My reply: Mosteller and Wallace talked a bit about this in their book, the idea that there are various other 1-in-a-million possibilities (for example, the data were faked somewhere before they got to you) so p-values such as 10^-6 don’t really mean anything. [sent-6, score-0.518]

6 On the other hand, in some fields such as genetics with extreem multiple comparisons issues, they demand p-values on the order of 10^-6 before doing anything at all. [sent-7, score-0.783]

7 Here I think the solution is multilevel modeling (which may well be done implicitly as part of a classical multiple comparisons adjustment procedure). [sent-8, score-0.854]

8 In general, I think the way to go is to move away from p-values and instead focus directly on effect sizes. [sent-9, score-0.248]
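Sentences 7 and 8 suggest multilevel modeling and a focus on effect sizes. Here is a minimal partial-pooling sketch (a normal-normal model fit by the method of moments, with made-up numbers), offered only as an illustration of the shrinkage idea, not as code from the post:

```python
import numpy as np

def partial_pool(y, se):
    """Shrink J noisy effect estimates y_j, with known standard errors se_j,
    toward their precision-weighted mean: the basic multilevel-model move
    that sidesteps per-comparison p-value thresholds."""
    y, se = np.asarray(y, float), np.asarray(se, float)
    tau2 = max(np.var(y, ddof=1) - np.mean(se**2), 0.0)  # between-group var
    mu = np.average(y, weights=1.0 / (se**2 + tau2 + 1e-12))
    shrink = se**2 / (se**2 + tau2 + 1e-12)  # 1 = pool fully, 0 = no pooling
    return mu + (1.0 - shrink) * (y - mu)

y = np.array([2.8, 0.1, -0.4, 3.1, 0.2])  # hypothetical raw effect estimates
se = np.full(5, 1.0)                      # hypothetical standard errors
print(partial_pool(y, se))  # estimates pulled toward the common mean
```

The point of the shrunken estimates is that extreme raw effects get pulled in automatically, with no per-test p-value threshold needed.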


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('confident', 0.286), ('equal', 0.225), ('approx', 0.187), ('mosteller', 0.181), ('comparisons', 0.175), ('vogelstein', 0.175), ('wallace', 0.16), ('query', 0.16), ('faked', 0.16), ('multiple', 0.158), ('joshua', 0.155), ('sample', 0.14), ('genetics', 0.139), ('virtue', 0.137), ('possibilities', 0.135), ('carlo', 0.129), ('demand', 0.128), ('adjustment', 0.127), ('obtain', 0.126), ('monte', 0.123), ('fairly', 0.12), ('implicitly', 0.118), ('generate', 0.118), ('talked', 0.118), ('possibility', 0.117), ('null', 0.117), ('guessing', 0.111), ('accuracy', 0.111), ('determine', 0.11), ('less', 0.108), ('procedure', 0.107), ('samples', 0.107), ('sizes', 0.106), ('somewhere', 0.105), ('fields', 0.103), ('extreme', 0.101), ('know', 0.1), ('observed', 0.097), ('classical', 0.096), ('solution', 0.094), ('move', 0.087), ('multilevel', 0.086), ('values', 0.086), ('size', 0.084), ('focus', 0.082), ('exactly', 0.08), ('order', 0.08), ('say', 0.079), ('run', 0.079), ('directly', 0.079)]
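For reference, a (word, weight) table like the one above can be produced as follows. This is a minimal sketch assuming scikit-learn's TfidfVectorizer; the repository's actual pipeline and corpus are not shown here, and the three documents are stand-ins.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [  # hypothetical mini-corpus in place of the full blog archive
    "p-value monte carlo null distribution sample size comparisons",
    "multilevel modeling multiple comparisons adjustment shrinkage",
    "effect sizes confidence intervals instead of significance testing",
]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)          # sparse (n_docs x n_terms) matrix
terms = vec.get_feature_names_out()

# Top-weighted terms for document 0, the analogue of the list above.
weights = X[0].toarray().ravel()
top = sorted(zip(terms, weights), key=lambda p: -p[1])[:5]
print([(w, round(s, 3)) for w, s in top])
```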

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1691 andrew gelman stats-2013-01-25-Extreem p-values!


2 0.16543353 1989 andrew gelman stats-2013-08-20-Correcting for multiple comparisons in a Bayesian regression model

Introduction: Joe Northrup writes: I have a question about correcting for multiple comparisons in a Bayesian regression model. I believe I understand the argument in your 2012 paper in Journal of Research on Educational Effectiveness that when you have a hierarchical model there is shrinkage of estimates towards the group-level mean and thus there is no need to add any additional penalty to correct for multiple comparisons. In my case I do not have hierarchically structured data—i.e. I have only 1 observation per group but have a categorical variable with a large number of categories. Thus, I am fitting a simple multiple regression in a Bayesian framework. Would putting a strong, mean 0, multivariate normal prior on the betas in this model accomplish the same sort of shrinkage (it seems to me that it would) and do you believe this is a valid way to address criticism of multiple comparisons in this setting? My reply: Yes, I think this makes sense. One way to address concerns of multiple com

3 0.14677928 1382 andrew gelman stats-2012-06-17-How to make a good fig?

Introduction: Joshua Vogelstein writes: Are you aware of a paper that explains current best practice of figure generation, in general? i’m thinking things like: have legends and labels that are legible, etc. seems like you or hadley shoulda written some such thing by now…. My reply: A couple of sources I can think of are: one of the appendixes in my book with Jennifer, and the book by Rafe Donahue.

4 0.12851974 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one

Introduction: After I gave my talk at an econ seminar on Why We (Usually) Don’t Care About Multiple Comparisons, I got the following comment: One question that came up later was whether your argument is really with testing in general, rather than only with testing in multiple comparison settings. My reply: Yes, my argument is with testing in general. But it arises with particular force in multiple comparisons. With a single test, we can just say we dislike testing so we use confidence intervals or Bayesian inference instead, and it’s no problem—really more of a change in emphasis than a change in methods. But with multiple tests, the classical advice is not simply to look at type 1 error rates but more specifically to make a multiplicity adjustment, for example to make confidence intervals wider to account for multiplicity. I don’t want to do this! So here there is a real battle to fight. P.S. Here’s the article (with Jennifer and Masanao), to appear in the Journal of Research on

5 0.12412185 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

Introduction: A sociologist writes in: Samuel Lucas has just published a paper in Quality and Quantity arguing that anything less than a full probability sample of higher levels in HLMs yields biased and unusable results. If I follow him correctly, he is arguing that not only are the SEs too small, but the parameter estimates themselves are biased and we cannot say in advance whether the bias is positive or negative. Lucas has thrown down a big gauntlet, advising us to throw away our data unless the sample of macro units is right and ignore the published results that fail this standard. Extreme. Is there another conclusion to be drawn? Other advice to be given? A Bayesian path out of the valley? Here’s the abstract to Lucas’s paper: The multilevel model has become a staple of social research. I textually and formally explicate sample design features that, I contend, are required for unbiased estimation of macro-level multilevel model parameters and the use of tools for statistical infe

6 0.12098058 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things

7 0.1169934 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?

8 0.1147639 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

9 0.11212885 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

10 0.11156446 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

11 0.10873526 963 andrew gelman stats-2011-10-18-Question on Type M errors

12 0.10462032 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

13 0.1041715 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

14 0.10326618 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

15 0.10162812 1746 andrew gelman stats-2013-03-02-Fishing for cherries

16 0.098311692 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

17 0.097521499 1535 andrew gelman stats-2012-10-16-Bayesian analogue to stepwise regression?

18 0.095393315 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

19 0.095277637 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

20 0.094780862 80 andrew gelman stats-2010-06-11-Free online course in multilevel modeling


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.195), (1, 0.052), (2, 0.055), (3, -0.076), (4, 0.054), (5, -0.006), (6, 0.02), (7, 0.01), (8, 0.035), (9, -0.046), (10, -0.016), (11, -0.034), (12, 0.02), (13, -0.026), (14, 0.025), (15, -0.015), (16, -0.062), (17, -0.023), (18, 0.038), (19, -0.029), (20, 0.015), (21, 0.02), (22, 0.014), (23, 0.051), (24, -0.047), (25, -0.03), (26, 0.013), (27, 0.015), (28, 0.019), (29, -0.024), (30, 0.019), (31, -0.017), (32, 0.03), (33, 0.062), (34, -0.051), (35, -0.02), (36, 0.026), (37, -0.002), (38, -0.003), (39, 0.027), (40, -0.035), (41, 0.064), (42, -0.005), (43, -0.06), (44, 0.01), (45, -0.031), (46, 0.014), (47, 0.0), (48, -0.004), (49, -0.05)]
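The (topicId, topicWeight) pairs above are this document's coordinates in the LSI latent space, and the simValue entries below are similarities computed in that space (typically cosine). A minimal sketch, assuming scikit-learn's TruncatedSVD as the LSI step (gensim's LsiModel is another common choice; the repository's actual implementation is not shown here):

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [  # hypothetical mini-corpus in place of the full blog archive
    "extreme p-values monte carlo null distribution",
    "multiple comparisons multilevel model shrinkage effects",
    "survey sampling design stratification precincts households",
]

X = TfidfVectorizer().fit_transform(docs)
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

print(Z[0])                      # per-topic weights, like the pairs above
print(cosine_similarity(Z)[0])   # simValue of doc 0 against each document
```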

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96096748 1691 andrew gelman stats-2013-01-25-Extreem p-values!


2 0.75245953 1746 andrew gelman stats-2013-03-02-Fishing for cherries

Introduction: Someone writes: I’m currently trying to make sense of the Army’s preliminary figures on their Comprehensive Soldier Fitness programme, which I found here. That report (see for example table 4 on p.15) has only a few very small “effect sizes” with p<.01 on some of the subscales and nothing significant on the rest. It looks to me like it's not much different from random noise, which I suspect might be caused by the large N (and there's more to come, because N for the whole programme will be in excess of 1 million). While googling on the subject of large N, I came across this entry in your blog. My question is, does that imply that when one has a large N – and, thus, presumably, large statistical power – one should systematically reduce alpha as well? Is there any literature on this? Does one always/sometimes/never need to take Lindley’s “paradox” into account? And a supplementary question: can it ever be legitimate to quote a result as significant for one DV (“Social fi

3 0.71224803 107 andrew gelman stats-2010-06-24-PPS in Georgia

Introduction: Lucy Flynn writes: I’m working at a non-profit organization called CRRC in the Republic of Georgia. I’m having a methodological problem and I saw the syllabus for your sampling class online and thought I might be able to ask you about it? We do a lot of complex surveys nationwide; our typical sample design is as follows: - stratify by rural/urban/capital - sub-stratify the rural and urban strata into NE/NW/SE/SW geographic quadrants - select voting precincts as PSUs - select households as SSUs - select individual respondents as TSUs I’m relatively new here, and past practice has been to sample voting precincts with probability proportional to size. It’s desirable because it’s not logistically feasible for us to vary the number of interviews per precinct with precinct size, so it makes the selection probabilities for households more even across precinct sizes. However, I have a complex sampling textbook (Lohr 1999), and it explains how complex it is to calculate sel

4 0.70381135 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

Introduction: David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between freshman and veteran members of Congress. Later I learned that such tests apply only to samples because their purpose is to tell you whether the difference in the observed sample is likely to exist in the population. If you have data for the whole population, like all members of the 103rd House of Representatives, you do not need a test to discern the true difference in the population. Sometimes researchers assume some sort of superpopulation like “all possible Congresses” or “Congresses across all time” and that the members of any given Congress constitute a sample. In my current work in education research, it is sometimes asserted t

5 0.70077586 1330 andrew gelman stats-2012-05-19-Cross-validation to check missing-data imputation

Introduction: Aureliano Crameri writes: I have questions regarding one technique you and your colleagues described in your papers: the cross validation (Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box, with reference to Gelman, King, and Liu, 1998). I think this is the technique I need for my purpose, but I am not sure I understand it right. I want to use the multiple imputation to estimate the outcome of psychotherapies based on longitudinal data. First I have to demonstrate that I am able to get unbiased estimates with the multiple imputation. The expected bias is the overestimation of the outcome of dropouts. I will test my imputation strategies by means of a series of simulations (delete values, impute, compare with the original). Due to the complexity of the statistical analyses I think I need at least 200 cases. Now I don’t have so many cases without any missings. My data have missing values in different variables. The proportion of missing values is

6 0.69961828 2159 andrew gelman stats-2014-01-04-“Dogs are sensitive to small variations of the Earth’s magnetic field”

7 0.69838065 1605 andrew gelman stats-2012-12-04-Write This Book

8 0.69755876 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

9 0.69719511 608 andrew gelman stats-2011-03-12-Single or multiple imputation?

10 0.69400972 1018 andrew gelman stats-2011-11-19-Tempering and modes

11 0.69377047 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

12 0.68777567 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

13 0.68722129 833 andrew gelman stats-2011-07-31-Untunable Metropolis

14 0.68627512 704 andrew gelman stats-2011-05-10-Multiple imputation and multilevel analysis

15 0.68612516 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one

16 0.66809404 726 andrew gelman stats-2011-05-22-Handling multiple versions of an outcome variable

17 0.66624177 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

18 0.66336769 1702 andrew gelman stats-2013-02-01-Don’t let your standard errors drive your research agenda

19 0.66169387 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems

20 0.66117257 212 andrew gelman stats-2010-08-17-Futures contracts, Granger causality, and my preference for estimation to testing


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.012), (16, 0.105), (18, 0.18), (21, 0.013), (24, 0.162), (35, 0.032), (55, 0.015), (73, 0.039), (89, 0.013), (99, 0.336)]
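The LDA pairs read the same way, except that each weight is a topic proportion in the document's inferred mixture, with near-zero topics dropped. A minimal sketch with scikit-learn's LatentDirichletAllocation, again an assumed stand-in for the repository's actual model:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [  # hypothetical mini-corpus; LDA expects raw term counts
    "extreme p-values monte carlo null distribution",
    "multiple comparisons multilevel model shrinkage effects",
    "red state blue state income culture war politics",
]

counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(counts)  # rows: per-document topic mixtures

# Keep only non-negligible (topicId, topicWeight) pairs, as in the list.
print([(k, round(w, 3)) for k, w in enumerate(theta[0]) if w > 0.05])
```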

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9590798 969 andrew gelman stats-2011-10-22-Researching the cost-effectiveness of political lobbying organisations

Introduction: Sally Murray from Giving What We Can writes: We are an organisation that assesses different charitable (/fundable) interventions, to estimate which are the most cost-effective (measured in terms of the improvement of life for people in developing countries gained for every dollar invested). Our research guides and encourages greater donations to the most cost-effective charities we thus identify, and our members have so far pledged a total of $14m to these causes, with many hundreds more relying on our advice in a less formal way. I am specifically researching the cost-effectiveness of political lobbying organisations. We are initially focusing on organisations that lobby for ‘big win’ outcomes such as increased funding of the most cost-effective NTD treatments/ vaccine research, changes to global trade rules (potentially) and more obscure lobbies such as “Keep Antibiotics Working”. We’ve a great deal of respect for your work and the superbly rational way you go about it, and

2 0.95820123 456 andrew gelman stats-2010-12-07-The red-state, blue-state war is happening in the upper half of the income distribution

Introduction: As we said in Red State, Blue State, it’s not the Prius vs. the pickup truck, it’s the Prius vs. the Hummer. Here’s the graph: Or, as Ross Douthat put it in an op-ed yesterday: This means that a culture war that’s often seen as a clash between liberal elites and a conservative middle America looks more and more like a conflict within the educated class — pitting Wheaton and Baylor against Brown and Bard, Redeemer Presbyterian Church against the 92nd Street Y, C. S. Lewis devotees against the Philip Pullman fan club. Our main motivation for doing this work was to change how the news media think about America’s political divisions, and so it’s good to see our ideas getting mainstreamed and moving toward conventional wisdom. P.S. Here’s the time series of graphs showing how the pattern that we and Douthat noticed, of a battle between coastal states and middle America that is occurring among upper-income Americans, is relatively recent, having arisen in the Clinton ye

same-blog 3 0.95689052 1691 andrew gelman stats-2013-01-25-Extreem p-values!


4 0.95392996 698 andrew gelman stats-2011-05-05-Shocking but not surprising

Introduction: Much-honored playwright Tony Kushner was set to receive one more honor–a degree from John Jay College–but it was suddenly taken away from him on an 11-1 vote of the trustees of the City University of New York. This was the first rejection of an honorary degree nomination since 1961. The news article focuses on one trustee, Jeffrey Wiesenfeld, an investment adviser and onetime political aide, who opposed Kushner’s honorary degree, but to me the relevant point is that the committee as a whole voted 11-1 to ding him. Kushner said, “I’m sickened,” he added, “that this is happening in New York City. Shocked, really.” I can see why he’s shocked, but perhaps it’s not so surprising that it’s happening in NYC. Recall the famous incident from 1940 in which Bertrand Russell was invited and then uninvited to teach at City College. The problem that time was Russell’s views on free love (as they called it back then). There seems to be a long tradition of city college officials being will

5 0.95306516 1967 andrew gelman stats-2013-08-04-What are the key assumptions of linear regression?

Introduction: Andy Cooper writes: A link to an article, “Four Assumptions Of Multiple Regression That Researchers Should Always Test”, has been making the rounds on Twitter. Their first rule is “Variables are Normally distributed.” And they seem to be talking about the independent variables – but then later bring in tests on the residuals (while admitting that the normally-distributed error assumption is a weak assumption). I thought we had long-since moved away from transforming our independent variables to make them normally distributed for statistical reasons (as opposed to standardizing them for interpretability, etc.) Am I missing something? I agree that leverage and influence are important, but normality of the variables? The article is from 2002, so it might be dated, but given the popularity of the tweet, I thought I’d ask your opinion. My response: There’s some useful advice on that page but overall I think the advice was dated even in 2002. In section 3.6 of my book wit

6 0.95183343 718 andrew gelman stats-2011-05-18-Should kids be able to bring their own lunches to school?

7 0.95067084 1292 andrew gelman stats-2012-05-01-Colorless green facts asserted resolutely

8 0.94404674 1319 andrew gelman stats-2012-05-14-I hate to get all Gerd Gigerenzer on you here, but . . .

9 0.9435482 2046 andrew gelman stats-2013-10-01-I’ll say it again

10 0.94173145 588 andrew gelman stats-2011-02-24-In case you were wondering, here’s the price of milk

11 0.93958771 621 andrew gelman stats-2011-03-20-Maybe a great idea in theory, didn’t work so well in practice

12 0.92384219 1204 andrew gelman stats-2012-03-08-The politics of economic and statistical models

13 0.92298931 114 andrew gelman stats-2010-06-28-More on Bayesian deduction-induction

14 0.92246926 1074 andrew gelman stats-2011-12-20-Reading a research paper != agreeing with its claims

15 0.91615999 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

16 0.91349345 1382 andrew gelman stats-2012-06-17-How to make a good fig?

17 0.91117811 1922 andrew gelman stats-2013-07-02-They want me to send them free material and pay for the privilege

18 0.90716708 2181 andrew gelman stats-2014-01-21-The Commissar for Traffic presents the latest Five-Year Plan

19 0.90541101 2239 andrew gelman stats-2014-03-09-Reviewing the peer review process?

20 0.90127861 2148 andrew gelman stats-2013-12-25-Spam!