480 andrew gelman stats-2010-12-21-Instead of “confidence interval,” let’s say “uncertainty interval” (knowledge-graph by maker-knowledge-mining)
Source: html
Introduction: I’ve become increasingly uncomfortable with the term “confidence interval,” for several reasons:

- The well-known difficulties in interpretation (officially the confidence statement can be interpreted only on average, but people typically implicitly give the Bayesian interpretation to each case).
- The ambiguity between confidence intervals and predictive intervals. (See the footnote in BDA where we discuss the difference between “inference” and “prediction” in the classical framework.)
- The awkwardness of explaining that confidence intervals are big in noisy situations where you have less confidence, and small when you have more confidence.

So here’s my proposal: let’s use the term “uncertainty interval” instead. The uncertainty interval tells you how much uncertainty you have. That works pretty well, I think.

P.S. As of this writing, “confidence interval” outGoogles “uncertainty interval” by the huge margin of 9.5 million to 54,000. So we . . .
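The first difficulty above, that the confidence statement officially holds only on average over repeated sampling, is easy to check by simulation. A minimal sketch, not from the post; the normal model with known sigma, the sample size, and the seed are illustrative assumptions:

```python
# Simulate the "on average" coverage property of a 95% confidence interval:
# across many repeated samples, about 95% of the intervals cover the true mean,
# but any single interval either covers it or it doesn't.
import math
import random
import statistics

random.seed(1)
true_mu, sigma, n, reps = 5.0, 2.0, 30, 2000

covered = 0
for _ in range(reps):
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    se = sigma / math.sqrt(n)          # known-sigma standard error
    lo, hi = xbar - 1.96 * se, xbar + 1.96 * se
    covered += (lo <= true_mu <= hi)

print(covered / reps)  # typically close to 0.95
```

Each individual interval is either right or wrong; only the long-run rate is 95%, which is exactly the interpretation people tend to skip past.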
Similar posts:

- Introduction: Peter Bergman points me to this discussion from Cyrus of a presentation by Guido Imbens on design of randomized experiments. Cyrus writes: The standard analysis that Imbens proposes includes (1) a Fisher-type permutation test of the sharp null hypothesis–what Imbens referred to as “testing”–along with (2) a Neyman-type point estimate of the sample average treatment effect and confidence interval–what Imbens referred to as “estimation.” . . . Imbens claimed that testing and estimation are separate enterprises with separate goals and that the two should not be confused. I [Cyrus] took it as a warning against proposals that use “inverted” tests in order to produce point estimates and confidence intervals. There is no reason that such confidence intervals will have accurate coverage except under rather dire assumptions, meaning that they are not “confidence intervals” in the way that we usually think of them. I agree completely. This is something I’ve been saying for a long . . .

- 1672 andrew gelman stats-2013-01-14-How do you think about the values in a confidence interval?
  Introduction: Philip Jones writes: As an interested reader of your blog, I wondered if you might consider a blog entry sometime on the following question I posed on CrossValidated (StackExchange). I originally posed the question based on my uncertainty about 95% CIs: “Are all values within the 95% CI equally likely (probable), or are the values at the ‘tails’ of the 95% CI less likely than those in the middle of the CI closer to the point estimate?” I posed this question based on discordant information I found at a couple of different web sources (I posted these sources in the body of the question). I received some interesting replies, and the replies were not unanimous; in fact there is some serious disagreement there! After seeing this disagreement, I naturally thought of you, and whether you might be able to clear this up. Please note I am not referring to credible intervals, but rather to the common medical-journal reporting standard of confidence intervals. My response: First . . .

- 1333 andrew gelman stats-2012-05-20-Question 10 of my final exam for Design and Analysis of Sample Surveys
  Introduction: 10. Out of a random sample of 100 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office. Solution to question 9 from yesterday: 9. Out of a population of 100 medical records, 40 are randomly sampled and then audited. 10 of the 40 audits reveal fraud. From this information, give an estimate, standard error, and 95% confidence interval for the proportion of audits in the population with fraud. Solution: the estimate is p.hat = 10/40 = 0.25. The SE is sqrt(1-f)*sqrt(p.hat*(1-p.hat)/n) = sqrt(1-0.4)*sqrt(0.25*0.75/40) = 0.053. The 95% interval is [0.25 +/- 2*0.053] = [0.14, 0.36].

- 1334 andrew gelman stats-2012-05-21-Question 11 of my final exam for Design and Analysis of Sample Surveys
- 2248 andrew gelman stats-2014-03-15-Problematic interpretations of confidence intervals
- 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work
- 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one
- 1021 andrew gelman stats-2011-11-21-Don’t judge a book by its title
- 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe
- 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism
- 1452 andrew gelman stats-2012-08-09-Visually weighting regression displays
- 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model
- 1178 andrew gelman stats-2012-02-21-How many data points do you really have?
- 2142 andrew gelman stats-2013-12-21-Chasing the noise
- 2240 andrew gelman stats-2014-03-10-On deck this week: Things people sent me
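The survey-sampling exam solution quoted above (estimate, finite-population-corrected standard error, and 95% interval) can be checked in a few lines. A sketch; the variable names are mine:

```python
# Question 9: audit 40 of 100 medical records, find 10 frauds.
# Estimate, SE with finite-population correction, and 95% interval.
import math

n_pop, n, x = 100, 40, 10
p_hat = x / n                                          # 0.25
f = n / n_pop                                          # sampling fraction, 0.4
se = math.sqrt(1 - f) * math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - 2 * se, p_hat + 2 * se

print(round(se, 3), round(lo, 2), round(hi, 2))  # 0.053 0.14 0.36
```

The sqrt(1-f) factor shrinks the standard error because 40% of the finite population has already been observed; without it the SE would be about 0.068 rather than 0.053.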
More similar posts:

- 1331 andrew gelman stats-2012-05-19-Question 9 of my final exam for Design and Analysis of Sample Surveys
- 1881 andrew gelman stats-2013-06-03-Boot
- 306 andrew gelman stats-2010-09-29-Statistics and the end of time
- 1470 andrew gelman stats-2012-08-26-Graphs showing regression uncertainty: the code!
- 1478 andrew gelman stats-2012-08-31-Watercolor regression
- 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox
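Philip Jones’s question, whether values at the tails of a 95% CI are as likely as those near the point estimate, has a simple partial answer under a normal sampling model (with the usual caveat that likelihood is not posterior probability unless the prior is flat). The likelihood at the interval’s edge is a fixed fraction, exp(-z²/2), of the likelihood at its center, regardless of the standard error. A sketch, not from the post:

```python
# Relative likelihood of a parameter value at the edge of a 95% interval
# versus at the center (the point estimate), under a normal sampling model.
# The ratio exp(-z^2 / 2) with z = 1.96 does not depend on the standard error.
import math

z = 1.96
ratio = math.exp(-z ** 2 / 2)
print(round(ratio, 3))  # 0.146
```

So a value at the boundary of the 95% CI is supported by the data roughly 7 times less strongly than the point estimate; the values in the interval are far from equally plausible.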
Posts similar by topic:

- 910 andrew gelman stats-2011-09-15-Google Refine
  Introduction: Tools worth knowing about: Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase. A recent discussion on the Polmeth list about the ANES Cumulative File is a setting where I think Refine might help (admittedly 49760×951 is bigger than I’d really like to deal with in the browser with js, but on a subset, yes). [I might write this example up later.] Go watch the screencast videos for Refine. Data-entry problems are rampant in stuff we all use: leading or trailing spaces; mixed decimal indicators; different units or transformations used in the same column; mixed lettercase leading to false duplicates; and that’s only the beginning. Refine certainly would help find duplicates, and it counts things for you too. Just counting rows is too much for researchers sometimes (see yesterday’s post)! Refine 2.0 adds some data-collection tools . . .

- 1937 andrew gelman stats-2013-07-13-Meritocracy rerun
  Introduction: I’ve said it here so often, this time I put it on the sister blog. . . .

- 1629 andrew gelman stats-2012-12-18-It happened in Connecticut
  Introduction: From the sister blog, some reasons why the political reaction might be different this time.

- 974 andrew gelman stats-2011-10-26-NYC jobs in applied statistics, psychometrics, and causal inference!
  Introduction: The Center for the Promotion of Research Involving Innovative Statistical Methodology at the Steinhardt School of Education has two job openings! One is for an assistant/associate tenure-track position for an applied statistician or psychometrician. The other is for a postdoc in causal inference and sensitivity analysis. Jennifer Hill and Marc Scott at the Steinhardt school are just great! We’re working together on various research projects, so if you manage to get one of these jobs, maybe you can collaborate with us here at Columbia too. So I have every interest in encouraging the very best people to apply for these jobs.

- 831 andrew gelman stats-2011-07-30-A Wikipedia riddle!
- 592 andrew gelman stats-2011-02-26-“Do you need ideal conditions to do great work?”
- 661 andrew gelman stats-2011-04-14-NYC 1950
- 479 andrew gelman stats-2010-12-20-WWJD? U can find out!
- 270 andrew gelman stats-2010-09-12-Comparison of forecasts for the 2010 congressional elections
- 1270 andrew gelman stats-2012-04-19-Demystifying Blup
- 1912 andrew gelman stats-2013-06-24-Bayesian quality control?
- 900 andrew gelman stats-2011-09-11-Symptomatic innumeracy
- 246 andrew gelman stats-2010-08-31-Somewhat Bayesian multilevel modeling
- 194 andrew gelman stats-2010-08-09-Data Visualization
- 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”
- 2181 andrew gelman stats-2014-01-21-The Commissar for Traffic presents the latest Five-Year Plan