andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1838 knowledge-graph by maker-knowledge-mining

1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism


meta info for this blog

Source: html

Introduction: Pointing to this news article by Megan McArdle discussing a recent study of Medicaid recipients, Jonathan Falk writes: Forget the interpretation for a moment, and the political spin, but haven’t we reached an interesting point when a journalist says things like: When you do an RCT with more than 12,000 people in it, and your defense of your hypothesis is that maybe the study just didn’t have enough power, what you’re actually saying is “the beneficial effects are probably pretty small”. and A good Bayesian—and aren’t most of us supposed to be good Bayesians these days?—should be updating in light of this new information. Given this result, what is the likelihood that Obamacare will have a positive impact on the average health of Americans? Every one of us, for or against, should be revising that probability downwards. I’m not saying that you have to revise it to zero; I certainly haven’t. But however high it was yesterday, it should be somewhat lower today. This
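McArdle's power argument can be made concrete with a back-of-the-envelope calculation. The sketch below uses made-up numbers (roughly 6,000 people per arm and a 20% baseline rate on some binary health marker; this is not the actual Oregon study design) and the standard normal approximation for a two-proportion test:

```python
from math import sqrt
from statistics import NormalDist

def detectable_difference(n_per_arm, p0, alpha=0.05, power=0.80):
    """Smallest difference in proportions detectable at the given power,
    using the usual normal approximation.  All inputs here are
    illustrative assumptions, not the actual study's design parameters."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)                 # two-sided test
    z_beta = z(power)
    se = sqrt(2 * p0 * (1 - p0) / n_per_arm)   # se of the difference in proportions
    return (z_alpha + z_beta) * se

# With ~6,000 people per arm and a 20% baseline rate, the minimum
# detectable difference is about 2 percentage points:
mde = detectable_difference(6000, 0.20)
```

Under these assumptions a 12,000-person trial could detect a difference of about two percentage points, so appealing to low power concedes that any benefit on such a marker is probably smaller than that.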


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Given this result, what is the likelihood that Obamacare will have a positive impact on the average health of Americans? [sent-4, score-0.223]

2 Also this sensible understanding of statistical significance and effect sizes: But that doesn’t mean Medicaid has no effect on health. [sent-9, score-0.204]

3 It means that Medicaid had no statistically significant effect on three major health markers during a two-year study. [sent-10, score-0.646]

4 But this result is kind of weird, because it’s not coupled with a statistically significant increase in the use of anti-depressants. [sent-16, score-0.364]

5 McArdle is forgetting that the difference between “significant” and “not significant” is not itself statistically significant . [sent-21, score-0.228]

6 ” Also I’d prefer she’d talk with some public health experts rather than relying on sources such as, “as Josh Barro pointed out on Twitter. [sent-24, score-0.223]

7 With regard to the larger questions, I agree with McArdle that ultimately the goals are health and economic security, not health insurance or even health care. [sent-28, score-0.768]

8 She proposes replacing Medicaid with “free mental health clinics, or cash. [sent-29, score-0.285]

9 ” The challenge is that we seem to have worked ourselves into an expensive, paperwork-soaked health-care system, and it’s not clear to me that free mental health clinics or even cash would do the trick. [sent-30, score-0.393]

10 Carroll writes: Most people who get health insurance are healthy. [sent-45, score-0.382]

11 If 8 people’s lives in the study were saved in some way by the coverage, the total statistic holds. [sent-55, score-0.323]

12 I’m guessing that McArdle’s would reply that there’s no evidence that 8 people’s lives were saved in the Oregon study. [sent-57, score-0.256]

13 Thus, numbers such as 100,000 lives saved are possible , but other things are possible too. [sent-58, score-0.209]

14 McArdle describes Obamacare as “a $1 trillion program to treat mild depression. [sent-60, score-0.407]

15 ” I’m not sure where the trillion dollars comes from. [sent-61, score-0.379]

16 health care spending at $7000 per person per year, that’s a total of 2. [sent-64, score-0.32]

17 3 trillion, which would correspond to an additional trillion over a five-year period? [sent-69, score-0.299]

18 1 trillion dollars don’t want to give up any of their share! [sent-73, score-0.379]

19 If a policy will reduce mild depression, I assume it would have some eventual effect on severe depression too, no? [sent-75, score-0.442]

20 I’m like many (I suspect, most) Americans who already have health insurance in that I don’t actually know what’s in that famous health-care bill. [sent-77, score-0.367]
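Sentence 5 (the difference between “significant” and “not significant” is not itself statistically significant) takes only a few lines to verify. A minimal numerical sketch, in units where each study's standard error is 1:

```python
from math import sqrt

# Two independent studies of the same effect, equal standard errors.
se = 1.0
z1, z2 = 2.5, 1.0            # one "significant" result, one not

diff = (z1 - z2) * se        # estimated difference between the two results
se_diff = sqrt(se**2 + se**2)  # se of a difference of independent estimates
z_diff = diff / se_diff      # about 1.06: nowhere near significant
```

A significant result in one comparison and a nonsignificant result in another is weak evidence that the two comparisons actually differ, which is why the contrast in sentence 4 (depression down, anti-depressant use flat) needs care.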


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('mcardle', 0.487), ('medicaid', 0.37), ('trillion', 0.299), ('health', 0.223), ('carroll', 0.177), ('significant', 0.167), ('obamacare', 0.147), ('depression', 0.137), ('mild', 0.108), ('barro', 0.108), ('clinics', 0.108), ('saved', 0.107), ('effect', 0.102), ('lives', 0.102), ('insurance', 0.099), ('markers', 0.093), ('increase', 0.084), ('oregon', 0.081), ('dollars', 0.08), ('josh', 0.073), ('study', 0.068), ('mental', 0.062), ('coverage', 0.061), ('statistically', 0.061), ('people', 0.06), ('journalist', 0.057), ('close', 0.057), ('news', 0.052), ('result', 0.052), ('reduce', 0.051), ('spending', 0.051), ('flux', 0.049), ('hindrance', 0.049), ('ballpark', 0.049), ('rct', 0.049), ('uninsured', 0.049), ('supposed', 0.048), ('plan', 0.047), ('evidence', 0.047), ('falk', 0.046), ('frankly', 0.046), ('law', 0.046), ('every', 0.046), ('total', 0.046), ('famous', 0.045), ('saying', 0.045), ('healthier', 0.044), ('eventual', 0.044), ('acknowledges', 0.044), ('sounds', 0.044)]
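For readers wondering what these weights are: tf-idf scores a word highly when it is frequent in this post but rare across the corpus, and the post-to-post similarity values below are cosine similarity between such weight vectors. A self-contained sketch with toy documents, using plain tf·idf rather than whatever exact variant this pipeline used:

```python
from collections import Counter
from math import log, sqrt

docs = [
    "medicaid study health insurance effect",
    "medicaid health coverage oregon study",
    "bayesian prior posterior inference",
]

def tfidf(docs):
    """Plain tf-idf: term frequency times log(N / document frequency)."""
    n = len(docs)
    tokenized = [doc.split() for doc in docs]
    df = Counter(word for toks in tokenized for word in set(toks))
    return [
        {w: tf / len(toks) * log(n / df[w]) for w, tf in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

vecs = tfidf(docs)
sim_medicaid = cosine(vecs[0], vecs[1])  # positive: shared distinctive words
sim_bayes = cosine(vecs[0], vecs[2])     # zero: no vocabulary overlap
```

Note that a word appearing in every document gets idf = log(1) = 0, which is why only distinctive words like 'mcardle' and 'medicaid' dominate the list above.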

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism


2 0.26283735 1767 andrew gelman stats-2013-03-17-The disappearing or non-disappearing middle class

Introduction: Despite the title, this post is mostly not about economics or even politics but rather about the central role of comparisons in statistics and statistical graphics. It started when someone pointed me to this article in which Megan McArdle points out the misleadingness of a graph that seems to show a bimodal income distribution but only by combining cells in the tail: McArdle makes a good point: of course, if you spread the histogram along a uniform scale (or, for that matter, a log scale), you don’t see that bump at the high end. McArdle reproduces some Census charts showing income stability over the past few decades: Before I had a chance to write about this, I noticed that Mark Palko did the job for me. Palko writes: To the extent that statistics includes data visualization, this is definitely bad statistics. When trying to depict trends and relationships, you generally want to get as much of the pertinent information as possible into the same grap

3 0.19271666 311 andrew gelman stats-2010-10-02-Where do our taxes go?

Introduction: Mark Palko links to a blog by Megan McArdle which reproduces a list entitled, “What You Paid For: 2009 tax receipt for a taxpayer earning $34,140 and paying $5,400 in federal income tax and FICA (selected items).” McArdle writes, “isn’t it possible that the widespread support for programs like Social Security and Medicare rests on the fact that most people don’t realize just how big a portion of your paycheck those programs consume?” But, as Palko points out, the FICA and Medicare withholdings are actually already right there on your W-2 form. So the real problem is not a lack of information but that people aren’t reading their W-2 forms more carefully. (Also, I don’t know if people are so upset about their withholdings for Social Security and Medicare, given that they’ll be getting that money back when they retire.) I’m more concerned about the list itself, though. I think a lot of cognitive-perceptual effects are involved in what gets a separate line item, and what doesn

4 0.15174046 1263 andrew gelman stats-2012-04-13-Question of the week: Will the authors of a controversial new study apologize to busy statistician Don Berry for wasting his time reading and responding to their flawed article?

Introduction: Aaron Carroll shoots down a politically-loaded claim about cancer survival. Lots of useful background from science reporter Sharon Begley: With the United States spending more on healthcare than any other country — $2.5 trillion, or just over $8,000 per capita, in 2009 — the question has long been, is it worth it? At least for spending on cancer, a controversial new study answers with an emphatic “yes.” . . . Experts shown an advance copy of the paper by Reuters argued that the tricky statistics of cancer outcomes tripped up the authors. “This study is pure folly,” said biostatistician Dr. Don Berry of MD Anderson Cancer Center in Houston. “It’s completely misguided and it’s dangerous. Not only are the authors’ analyses flawed but their conclusions are also wrong.” Ouch. Arguably the study shouldn’t be getting any coverage at all, but given that it’s in the news, it’s good to see it get shot down. I wonder if the authors will respond to Don Berry and say they’re sorr

5 0.12059437 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

Introduction: The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, the difference between “significant” and “not significant” is not itself statistically significant. When people bring this up, they keep referring to the difference between p=0.05 and p=0.06, making the familiar (and correct) point about the arbitrariness of the conventional p-value threshold of 0.05. And, sure, I agree with this, but everybody knows that already. The point Hal and I were making was that even apparently large differences in p-values are not statistically significant. For example, if you have one study with z=2.5 (almost significant at the 1% level!) and another with z=1 (not statistically significant at all, only 1 se from zero!), then their difference has a z of about 1 (again, not statistically significant at all). So it’s not just a comparison of 0.05 vs. 0.06, even a differenc

6 0.11100685 15 andrew gelman stats-2010-05-03-Public Opinion on Health Care Reform

7 0.10800163 585 andrew gelman stats-2011-02-22-“How has your thinking changed over the past three years?”

8 0.10583578 713 andrew gelman stats-2011-05-15-1-2 social scientist + 1-2 politician = ???

9 0.10168618 465 andrew gelman stats-2010-12-13-$3M health care prediction challenge

10 0.09852089 2255 andrew gelman stats-2014-03-19-How Americans vote

11 0.096132562 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

12 0.094344132 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

13 0.093046814 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

14 0.092298768 899 andrew gelman stats-2011-09-10-The statistical significance filter

15 0.091597654 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

16 0.091317669 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

17 0.091061942 1147 andrew gelman stats-2012-01-30-Statistical Murder

18 0.090321332 2171 andrew gelman stats-2014-01-13-Postdoc with Liz Stuart on propensity score methods when the covariates are measured with error

19 0.090222612 820 andrew gelman stats-2011-07-25-Design of nonrandomized cluster sample study

20 0.089580186 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.203), (1, -0.063), (2, 0.054), (3, -0.053), (4, -0.017), (5, -0.035), (6, 0.012), (7, 0.039), (8, -0.0), (9, -0.031), (10, -0.094), (11, -0.019), (12, 0.059), (13, -0.003), (14, 0.015), (15, 0.025), (16, 0.038), (17, 0.007), (18, 0.006), (19, 0.015), (20, 0.016), (21, 0.033), (22, -0.005), (23, 0.017), (24, -0.027), (25, 0.023), (26, -0.008), (27, -0.02), (28, -0.004), (29, -0.014), (30, -0.017), (31, 0.028), (32, -0.0), (33, 0.011), (34, 0.043), (35, 0.045), (36, -0.04), (37, 0.011), (38, 0.014), (39, -0.005), (40, -0.002), (41, -0.027), (42, -0.08), (43, 0.038), (44, 0.033), (45, -0.02), (46, -0.009), (47, 0.007), (48, -0.014), (49, -0.033)]
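The weights above come from latent semantic indexing: factor the term-document matrix with a truncated SVD and describe each post by its coordinates on the top singular vectors, so that posts with similar word co-occurrence patterns land near each other even without exact word overlap. A toy version (the 3×4 count matrix is hypothetical, and numpy's SVD stands in for whatever the pipeline actually ran):

```python
import numpy as np

# Toy term-document matrix: rows are documents, columns are word counts.
X = np.array([
    [2, 1, 0, 0],   # doc about medicaid/health terms
    [1, 2, 0, 0],   # another medicaid/health doc
    [0, 0, 2, 1],   # doc about bayesian terms
], dtype=float)

# LSI: project documents onto the top-k left singular vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_topics = U[:, :k] * s[:k]   # each row = a document's k topic weights

# Documents 0 and 1 land on the same topic axis; document 2 is orthogonal.
```

Truncating to k dimensions is what gives LSI its smoothing effect: the dropped low-variance directions mostly encode noise in the individual word choices.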

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9591915 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism


2 0.88058293 2030 andrew gelman stats-2013-09-19-Is coffee a killer? I don’t think the effect is as high as was estimated from the highest number that came out of a noisy study

Introduction: Thomas Lumley writes : The Herald  has a story about hazards of coffee. The picture caption says Men who drink more than four cups a day are 56 per cent more likely to die. which is obviously not true: deaths, as we’ve observed before, are fixed at one per customer.  The story says It’s not that people are dying at a rapid rate. But men who drink more than four cups a day are 56 per cent more likely to die and women have double the chance compared with moderate drinkers, according to the The University of Queensland and the University of South Carolina study. What  the study  actually reported was rates of death: over an average of 17 years, men who drink more than four cups a day died at about a 21% higher rate, with little evidence of any difference in men.  After they considered only men and women under 55 (which they don’t say was something they had planned to do), and attempted to control for a whole bunch of other factors, the rate increase went to 56% for me

3 0.85719121 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking

Introduction: People keep pointing me to this excellent news article by David Brown, about a scientist who was convicted of data manipulation: In all, 330 patients were randomly assigned to get either interferon gamma-1b or placebo injections. Disease progression or death occurred in 46 percent of those on the drug and 52 percent of those on placebo. That was not a significant difference, statistically speaking. When only survival was considered, however, the drug looked better: 10 percent of people getting the drug died, compared with 17 percent of those on placebo. However, that difference wasn’t “statistically significant,” either. Specifically, the so-called P value — a mathematical measure of the strength of the evidence that there’s a true difference between a treatment and placebo — was 0.08. . . . Technically, the study was a bust, although the results leaned toward a benefit from interferon gamma-1b. Was there a group of patients in which the results tipped? Harkonen asked the statis

4 0.83714032 66 andrew gelman stats-2010-06-03-How can news reporters avoid making mistakes when reporting on technical issues? Or, Data used to justify “Data Used to Justify Health Savings Can Be Shaky” can be shaky

Introduction: Reed Abelson and Gardiner Harris report in the New York Times that some serious statistical questions have been raised about the Dartmouth Atlas of Health Care, an influential project that reports huge differences in health care costs and practices in different places in the United States, suggesting large potential cost savings if more efficient practices are used. (A claim that is certainly plausible to me, given this notorious graph ; see here for background.) Here’s an example of a claim from the Dartmouth Atlas (just picking something that happens to be featured on their webpage right now): Medicare beneficiaries who move to some regions receive many more diagnostic tests and new diagnoses than those who move to other regions. This study, published in the New England Journal of Medicine, raises important questions about whether being given more diagnoses is beneficial to patients and may help to explain recent controversies about regional differences in spending. A

5 0.80401391 67 andrew gelman stats-2010-06-03-More on that Dartmouth health care study

Introduction: Hank Aaron at the Brookings Institution, who knows a lot more about policy than I do, had some interesting comments on the recent New York Times article about problems with the Dartmouth health care atlas. which I discussed a few hours ago . Aaron writes that much of the criticism in that newspaper article was off-base, but that there are real difficulties in translating the Dartmouth results (finding little relation between spending and quality of care) to cost savings in the real world. Aaron writes: The Dartmouth research, showing huge variation in the use of various medical procedures and large variations in per patient spending under Medicare, has been a revelation and a useful one. There is no way to explain such variation on medical grounds and it is problematic. But readers, including my former colleague Orszag, have taken an oversimplistic view of what the numbers mean and what to do about them. There are three really big problems with the common interpreta

6 0.79499865 2223 andrew gelman stats-2014-02-24-“Edlin’s rule” for routinely scaling down published estimates

7 0.78152412 646 andrew gelman stats-2011-04-04-Graphical insights into the safety of cycling.

8 0.77682298 584 andrew gelman stats-2011-02-22-“Are Wisconsin Public Employees Underpaid?”

9 0.77239078 1364 andrew gelman stats-2012-06-04-Massive confusion about a study that purports to show that exercise may increase heart risk

10 0.77216613 179 andrew gelman stats-2010-08-03-An Olympic size swimming pool full of lithium water

11 0.76996613 1263 andrew gelman stats-2012-04-13-Question of the week: Will the authors of a controversial new study apologize to busy statistician Don Berry for wasting his time reading and responding to their flawed article?

12 0.76748687 2114 andrew gelman stats-2013-11-26-“Please make fun of this claim”

13 0.76696408 2090 andrew gelman stats-2013-11-05-How much do we trust a new claim that early childhood stimulation raised earnings by 42%?

14 0.76517069 702 andrew gelman stats-2011-05-09-“Discovered: the genetic secret of a happy life”

15 0.76510239 791 andrew gelman stats-2011-07-08-Censoring on one end, “outliers” on the other, what can we do with the middle?

16 0.75897163 284 andrew gelman stats-2010-09-18-Continuing efforts to justify false “death panels” claim

17 0.75498098 526 andrew gelman stats-2011-01-19-“If it saves the life of a single child…” and other nonsense

18 0.75345266 1662 andrew gelman stats-2013-01-09-The difference between “significant” and “non-significant” is not itself statistically significant

19 0.74940062 1906 andrew gelman stats-2013-06-19-“Behind a cancer-treatment firm’s rosy survival claims”

20 0.74876165 156 andrew gelman stats-2010-07-20-Burglars are local


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.071), (21, 0.024), (24, 0.442), (45, 0.011), (53, 0.015), (61, 0.022), (86, 0.025), (99, 0.223)]
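And the lda weights: LDA models each post as a mixture over topics, with a Dirichlet prior on the mixture weights. The sparsity above (a couple of topics, like 24 and 99, carrying most of the mass) is exactly what a small Dirichlet concentration produces. A quick numpy illustration of that prior; the topic count and alpha here are made up, not the pipeline's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Document-topic weights in LDA are a draw from Dirichlet(alpha).
# With alpha < 1 the draw is sparse: most topics get almost no weight.
n_topics, alpha = 100, 0.1
theta = rng.dirichlet(np.full(n_topics, alpha))

theta_sorted = np.sort(theta)
top5 = theta_sorted[-5:].sum()      # a handful of topics dominate...
bottom50 = theta_sorted[:50].sum()  # ...while half the topics get ~nothing
```

This is why the lda list shows only a few topic IDs per post while the lsi list assigns every post a small weight on every dimension.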

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99101293 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

Introduction: Steve Ziliak points me to this article by the always-excellent Carl Bialik, slamming hypothesis tests. I only wish Carl had talked with me before so hastily posting, though! I would’ve argued with some of the things in the article. In particular, he writes: Reese and Brad Carlin . . . suggest that Bayesian statistics are a better alternative, because they tackle the probability that the hypothesis is true head-on, and incorporate prior knowledge about the variables involved. Brad Carlin does great work in theory, methods, and applications, and I like the bit about the prior knowledge (although I might prefer the more general phrase “additional information”), but I hate that quote! My quick response is that the hypothesis of zero effect is almost never true! The problem with the significance testing framework–Bayesian or otherwise–is in the obsession with the possibility of an exact zero effect. The real concern is not with zero, it’s with claiming a positive effect whe

2 0.98945951 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

Introduction: A couple days ago we discussed some remarks by Tony O’Hagan and Jim Berger on weakly informative priors. Jim followed up on Deborah Mayo’s blog with this: Objective Bayesian priors are often improper (i.e., have infinite total mass), but this is not a problem when they are developed correctly. But not every improper prior is satisfactory. For instance, the constant prior is known to be unsatisfactory in many situations. The ‘solution’ pseudo-Bayesians often use is to choose a constant prior over a large but bounded set (a ‘weakly informative’ prior), saying it is now proper and so all is well. This is not true; if the constant prior on the whole parameter space is bad, so will be the constant prior over the bounded set. The problem is, in part, that some people confuse proper priors with subjective priors and, having learned that true subjective priors are fine, incorrectly presume that weakly informative proper priors are fine. I have a few reactions to this: 1. I agree

3 0.98886502 938 andrew gelman stats-2011-10-03-Comparing prediction errors

Introduction: Someone named James writes: I’m working on a classification task, sentence segmentation. The classifier algorithm we use (BoosTexter, a boosted learning algorithm) classifies each word independently conditional on its features, i.e. a bag-of-words model, so any contextual clues need to be encoded into the features. The feature extraction system I am proposing in my thesis uses a heteroscedastic LDA to transform data to produce the features the classifier runs on. The HLDA system has a couple parameters I’m testing, and I’m running a 3×2 full factorial experiment. That’s the background which may or may not be relevant to the question. The output of each trial is a class (there are only 2 classes, right now) for every word in the dataset. Because of the nature of the task, one class strongly predominates, say 90-95% of the data. My question is this: in terms of overall performance (we use F1 score), many of these trials are pretty close together, which leads me to ask whethe

4 0.98787665 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census

Introduction: In his new book, “What is Your Race? The Census and Our Flawed Efforts to Classify Americans,” former Census Bureau director Ken Prewitt recommends taking the race question off the decennial census: He recommends gradual changes, integrating the race and national origin questions while improving both. In particular, he would replace the main “race” question by a “race or origin” question, with the instruction to “Mark one or more” of the following boxes: “White,” “Black, African Am., or Negro,” “Hispanic, Latino, or Spanish origin,” “American Indian or Alaska Native,” “Asian”, “Native Hawaiian or Other Pacific Islander,” and “Some other race or origin.” Then the next question is to write in “specific race, origin, or enrolled or principal tribe.” Prewitt writes: His suggestion is to go with these questions in 2020 and 2030, then in 2040 “drop the race question and use only the national origin question.” He’s also relying on the American Community Survey to gather a lo

5 0.98734128 1479 andrew gelman stats-2012-09-01-Mothers and Moms

Introduction: Philip Cohen asks , “Why are mothers becoming moms?” These aren’t just two words for the same thing: in political terms “mother” is merely descriptive while “mom” is more positive. Indeed, we speak of “mom and apple pie” as unquestionable American icons. Cohen points out that motherhood is sometimes but not always respected in political discourse: On the one hand, both President Obama and pundit Hilary Rosen have now called motherhood the world’s hardest job. And with the Romneys flopping onto the all-mothers-work bandwagon, it appears we’re reaching a rare rhetorical consensus. On the other hand, the majority in both major political parties agrees that poor single mothers and their children need one thing above all – a (real) job, one that provides the “dignity of an honest day’s work.” For welfare purposes, taking care of children is not only not the toughest job in the world, it is more akin to nothing at all. When Bill Clinton’s endorsed welfare-to-work he famously decla

6 0.98724639 38 andrew gelman stats-2010-05-18-Breastfeeding, infant hyperbilirubinemia, statistical graphics, and modern medicine

7 0.98643148 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

8 0.9859457 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research

9 0.98424172 1787 andrew gelman stats-2013-04-04-Wanna be the next Tyler Cowen? It’s not as easy as you might think!

10 0.98176837 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

11 0.98099077 545 andrew gelman stats-2011-01-30-New innovations in spam

12 0.98062634 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

13 0.98059511 240 andrew gelman stats-2010-08-29-ARM solutions

14 0.97403646 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

15 0.97389716 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

16 0.97265697 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

17 0.97153211 2229 andrew gelman stats-2014-02-28-God-leaf-tree

18 0.9649955 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

19 0.96398044 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

20 0.96375382 1437 andrew gelman stats-2012-07-31-Paying survey respondents