andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1288 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Cassie Murdoch points to a report from a corporate survey: Sixty-two percent of U.S. employees say it’s not likely they or a family member will be diagnosed with a serious illness like cancer, a survey indicates. The Aflac WorkForces Report, a survey of nearly 1,900 benefits decision-makers and more than 6,100 U.S. workers, also indicated 55 percent said they were not very or not at all likely to be diagnosed with a chronic illness, such as heart disease or diabetes. Here are some actual statistics: The American Cancer Society, Cancer Facts & Figures 2012, said 1-in-3 women and 1-in-2 men will be diagnosed with cancer at some point in their lives, and the National Safety Council, Injury Facts 2011 edition, says more than 38.9 million injuries occur in a year requiring medical treatment. The American Heart Association, Heart Disease & Stroke Statistics 2012, said 1-in-6 U.S. deaths were caused by coronary heart disease, Tillman said. And some details on the survey:
sentIndex sentText sentNum sentScore
1 Cassie Murdoch points to a report from a corporate survey: Sixty-two percent of U. [sent-1, score-0.281]
2 employees say it’s not likely they or a family member will be diagnosed with a serious illness like cancer, a survey indicates. [sent-3, score-1.471]
3 The Aflac WorkForces Report, a survey of nearly 1,900 benefits decision-makers and more than 6,100 U. [sent-4, score-0.321]
4 workers, also indicated 55 percent said they were not very or not at all likely to be diagnosed with a chronic illness, such as heart disease or diabetes. [sent-6, score-1.404]
5 Here are some actual statistics: The American Cancer Society, Cancer Facts & Figures 2012, said 1-in-3 women and 1-in-2 men will be diagnosed with cancer at some point in their lives, and the National Safety Council, Injury Facts 2011 edition, says more than 38. [sent-7, score-0.821]
6 9 million injuries occur in a year requiring medical treatment. [sent-8, score-0.267]
7 The American Heart Association, Heart Disease & Stroke Statistics 2012, said 1-in-6 U. [sent-9, score-0.105]
8 deaths were caused by coronary heart disease, Tillman said. [sent-11, score-0.598]
9 And some details on the survey: The survey was conducted in January and February by Research Now. [sent-12, score-0.397]
10 The first 3,151 worker interviews were nationally representative, while the remaining 3,000 interviews were conducted among the Top 30 designated market areas. [sent-13, score-0.918]
11 Did these people really say that neither they nor a family member will have a serious illness? [sent-14, score-0.423]
12 I’m used to seeing wacky survey findings, but this one is ridiculous. [sent-17, score-0.354]
wordName wordTfidf (topN-words)
[('diagnosed', 0.37), ('heart', 0.323), ('illness', 0.313), ('cancer', 0.283), ('survey', 0.26), ('disease', 0.224), ('interviews', 0.165), ('member', 0.144), ('conducted', 0.137), ('facts', 0.137), ('cassie', 0.123), ('coronary', 0.123), ('family', 0.122), ('percent', 0.113), ('injuries', 0.111), ('stroke', 0.111), ('designated', 0.107), ('chronic', 0.107), ('said', 0.105), ('murdoch', 0.104), ('nationally', 0.099), ('injury', 0.097), ('worker', 0.097), ('serious', 0.094), ('wacky', 0.094), ('february', 0.092), ('council', 0.091), ('employees', 0.089), ('american', 0.089), ('report', 0.087), ('requiring', 0.083), ('indicated', 0.083), ('deaths', 0.082), ('safety', 0.082), ('remaining', 0.082), ('corporate', 0.081), ('january', 0.08), ('likely', 0.079), ('workers', 0.077), ('ridiculous', 0.076), ('representative', 0.074), ('occur', 0.073), ('edition', 0.073), ('caused', 0.07), ('figures', 0.067), ('market', 0.066), ('lives', 0.064), ('men', 0.063), ('neither', 0.063), ('benefits', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick
2 0.23593232 21 andrew gelman stats-2010-05-07-Environmentally induced cancer “grossly underestimated”? Doubtful.
Introduction: The (U.S.) “President’s Cancer Panel” has released its 2008-2009 annual report, which includes a cover letter that says “the true burden of environmentally induced cancer has been grossly underestimated.” The report itself discusses exposures to various types of industrial chemicals, some of which are known carcinogens, in some detail, but gives nearly no data or analysis to suggest that these exposures are contributing to significant numbers of cancers. In fact, there is pretty good evidence that they are not. The plot above shows age-adjusted cancer mortality for men, by cancer type, in the U.S. The plot below shows the same for women. In both cases, the cancers with the highest mortality rates are shown, but not all cancers (e.g. brain cancer is not shown). For what it’s worth, I’m not sure how trustworthy the rates are from the 1930s — it seems possible that reporting, autopsies, or both, were less careful during the Great Depression — so I suggest focusing on the r
3 0.18093944 1766 andrew gelman stats-2013-03-16-“Nightshifts Linked to Increased Risk for Ovarian Cancer”
Introduction: Zosia Chustecka writes: Much of the previous work on the link between cancer and nightshifts has focused on breast cancer . . . The latest report, focusing on ovarian cancer, was published in the April issue of Occupational and Environmental Medicine. This increase in the risk for ovarian cancer with nightshift work is consistent with, and of similar magnitude to, the risk for breast cancer, say lead author Parveen Bhatti, PhD, and colleagues from the epidemiology program at the Fred Hutchinson Cancer Research Center in Seattle, Washington. The researchers examined data from a local population-based cancer registry that is part of the Surveillance Epidemiology and End Results (SEER) Program. They identified 1101 women with advanced epithelial ovarian cancer, 389 with borderline disease, and 1832 without ovarian cancer (control group). The women, who were 35 to 74 years of age, were asked about the hours they worked, and specifically whether they had ever worked the nig
4 0.15420315 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it
Introduction: Since we’re on the topic of nonreplicable research . . . see here (link from here) for a story of a survey that’s so bad that the people who did it won’t say how they did it. I know too many cases where people screwed up in a survey when they were actually trying to get the right answer, for me to trust any report of a survey that doesn’t say what they did. I’m reminded of this survey which may well have been based on a sample of size 6 (again, the people who did it refused to release any description of methodology).
5 0.1438466 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys
Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim
9 0.1106821 2367 andrew gelman stats-2014-06-10-Spring forward, fall back, drop dead?
10 0.10234843 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys
11 0.10231312 1906 andrew gelman stats-2013-06-19-“Behind a cancer-treatment firm’s rosy survival claims”
12 0.10105883 333 andrew gelman stats-2010-10-10-Psychiatric drugs and the reduction in crime
13 0.098450221 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?
14 0.090114117 450 andrew gelman stats-2010-12-04-The Joy of Stats
15 0.089344151 1725 andrew gelman stats-2013-02-17-“1.7%” ha ha ha
16 0.087220542 385 andrew gelman stats-2010-10-31-Wacky surveys where they don’t tell you the questions they asked
17 0.085684776 1480 andrew gelman stats-2012-09-02-“If our product is harmful . . . we’ll stop making it.”
18 0.084321104 2205 andrew gelman stats-2014-02-10-More on US health care overkill
19 0.084113412 673 andrew gelman stats-2011-04-20-Upper-income people still don’t realize they’re upper-income
20 0.079317138 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science
topicId topicWeight
[(0, 0.09), (1, -0.051), (2, 0.071), (3, -0.061), (4, 0.001), (5, 0.031), (6, -0.016), (7, 0.031), (8, -0.022), (9, -0.033), (10, -0.008), (11, -0.092), (12, 0.017), (13, 0.102), (14, -0.031), (15, 0.021), (16, 0.052), (17, 0.022), (18, 0.05), (19, -0.012), (20, -0.042), (21, 0.024), (22, -0.081), (23, -0.005), (24, -0.022), (25, 0.035), (26, -0.064), (27, -0.056), (28, 0.04), (29, -0.003), (30, -0.076), (31, 0.066), (32, 0.023), (33, 0.039), (34, -0.028), (35, 0.003), (36, 0.053), (37, 0.01), (38, -0.012), (39, 0.043), (40, -0.034), (41, 0.015), (42, 0.062), (43, -0.01), (44, 0.017), (45, 0.015), (46, 0.011), (47, 0.041), (48, -0.06), (49, 0.006)]
simIndex simValue blogId blogTitle
same-blog 1 0.97356206 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick
2 0.77310669 21 andrew gelman stats-2010-05-07-Environmentally induced cancer “grossly underestimated”? Doubtful.
3 0.70417368 730 andrew gelman stats-2011-05-25-Rechecking the census
Introduction: Sam Roberts writes: The Census Bureau [reported] that though New York City’s population reached a record high of 8,175,133 in 2010, the gain of 2 percent, or 166,855 people, since 2000 fell about 200,000 short of what the bureau itself had estimated. Public officials were incredulous that a city that lures tens of thousands of immigrants each year and where a forest of new buildings has sprouted could really have recorded such a puny increase. How, they wondered, could Queens have grown by only one-tenth of 1 percent since 2000? How, even with a surge in foreclosures, could the number of vacant apartments have soared by nearly 60 percent in Queens and by 66 percent in Brooklyn? That does seem a bit suspicious. So the newspaper did its own survey: Now, a house-to-house New York Times survey of three representative square blocks where the Census Bureau said vacancies had increased and the population had declined since 2000 suggests that the city’s outrage is somewhat ju
4 0.69006127 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys
5 0.6760329 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it
6 0.67115062 1766 andrew gelman stats-2013-03-16-“Nightshifts Linked to Increased Risk for Ovarian Cancer”
7 0.66273314 1741 andrew gelman stats-2013-02-27-Thin scientists say it’s unhealthy to be fat
8 0.65916866 381 andrew gelman stats-2010-10-30-Sorry, Senator DeMint: Most Americans Don’t Want to Ban Gays from the Classroom
10 0.651456 385 andrew gelman stats-2010-10-31-Wacky surveys where they don’t tell you the questions they asked
11 0.65064478 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups
12 0.64134061 1115 andrew gelman stats-2012-01-12-Where are the larger-than-life athletes?
13 0.63654613 1906 andrew gelman stats-2013-06-19-“Behind a cancer-treatment firm’s rosy survival claims”
14 0.6293965 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq
15 0.6123932 12 andrew gelman stats-2010-04-30-More on problems with surveys estimating deaths in war zones
17 0.60567212 705 andrew gelman stats-2011-05-10-Some interesting unpublished ideas on survey weighting
18 0.5941481 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys
19 0.5837037 1345 andrew gelman stats-2012-05-26-Question 16 of my final exam for Design and Analysis of Sample Surveys
20 0.56098562 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys
topicId topicWeight
[(9, 0.019), (16, 0.056), (20, 0.014), (24, 0.034), (25, 0.013), (46, 0.013), (99, 0.728)]
simIndex simValue blogId blogTitle
same-blog 1 0.99949813 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick
2 0.99827653 756 andrew gelman stats-2011-06-10-Christakis-Fowler update
Introduction: After I posted on Russ Lyons’s criticisms of Nicholas Christakis and James Fowler’s work on social networks, several people emailed in with links to related articles. (Nobody wants to comment on the blog anymore; all I get is emails.) Here they are: Political scientists Hans Noel and Brendan Nyhan wrote a paper called “The ‘Unfriending’ Problem: The Consequences of Homophily in Friendship Retention for Causal Estimates of Social Influence” in which they argue that the Christakis-Fowler results are subject to bias because of patterns in the time course of friendships. Statisticians Cosma Shalizi and Andrew Thomas wrote a paper called “Homophily and Contagion Are Generically Confounded in Observational Social Network Studies” arguing that analyses such as those of Christakis and Fowler cannot hope to disentangle different sorts of network effects. And Christakis and Fowler reply to Noel and Nyhan, Shalizi and Thomas, Lyons, and others in an article that begins: H
3 0.99805903 1813 andrew gelman stats-2013-04-19-Grad students: Participate in an online survey on statistics education
Introduction: Joan Garfield, a leading researcher in statistics education, is conducting a survey of graduate students who teach or assist with the teaching of statistics. She writes: We want to invite them to take a short survey that will enable us to collect some baseline data that we may use in a grant proposal we are developing. The project would provide summer workshops and ongoing support for graduate students who will be teaching or assisting with teaching introductory statistics classes. If the grant is funded, we would invite up to 40 students from around the country who are entering graduate programs in statistics to participate in a three-year training and support program. The goal of this program is to help these students become expert and flexible teachers of statistics, and to support them as they move through their teaching experiences as graduate students. Here’s the online survey. Garfield writes, “Your responses are completely voluntary and anonymous. Results w
4 0.99768132 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models
Introduction: There are a few things I want to do: 1. Understand a fitted model using tools such as average predictive comparisons, R-squared, and partial pooling factors. In defining these concepts, Iain and I came up with some clever tricks, including (but not limited to): - Separating the inputs and averaging over all possible values of the input not being altered (for average predictive comparisons); - Defining partial pooling without referring to a raw-data or maximum-likelihood or no-pooling estimate (these don’t necessarily exist when you’re fitting logistic regression with sparse data); - Defining an R-squared for each level of a multilevel model. The methods get pretty complicated, though, and they have some loose ends–in particular, for average predictive comparisons with continuous input variables. So now we want to implement these in R and put them into arm along with bglmer etc. 2. Setting up coefplot so it works more generally (that is, so the graphics look nice
5 0.99761051 180 andrew gelman stats-2010-08-03-Climate Change News
Introduction: I. State of the Climate report The National Oceanic and Atmospheric Administration recently released their “State of the Climate Report” for 2009. The report has chapters discussing global climate (temperatures, water vapor, cloudiness, alpine glaciers,…); oceans (ocean heat content, sea level, sea surface temperatures, etc.); the arctic (sea ice extent, permafrost, vegetation, and so on); Antarctica (weather observations, sea ice extent,…), and regional climates. NOAA also provides a nice page that lets you display any of 11 relevant time-series datasets (land-surface air temperature, sea level, ocean heat content, September arctic sea-ice extent, sea-surface temperature, northern hemisphere snow cover, specific humidity, glacier mass balance, marine air temperature, tropospheric temperature, and stratospheric temperature). Each of the plots overlays data from several databases (not necessarily independent of each other), and you can select which ones to include or leave
6 0.99758148 174 andrew gelman stats-2010-08-01-Literature and life
7 0.99744266 1483 andrew gelman stats-2012-09-04-“Bestselling Author Caught Posting Positive Reviews of His Own Work on Amazon”
8 0.99691159 860 andrew gelman stats-2011-08-18-Trolls!
9 0.99675661 1434 andrew gelman stats-2012-07-29-FindTheData.org
10 0.99633986 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys
11 0.99631023 740 andrew gelman stats-2011-06-01-The “cushy life” of a University of Illinois sociology professor
12 0.99605602 23 andrew gelman stats-2010-05-09-Popper’s great, but don’t bother with his theory of probability
13 0.99558884 521 andrew gelman stats-2011-01-17-“the Tea Party’s ire, directed at Democrats and Republicans alike”
14 0.9952876 589 andrew gelman stats-2011-02-24-On summarizing a noisy scatterplot with a single comparison of two points
15 0.99519539 6 andrew gelman stats-2010-04-27-Jelte Wicherts lays down the stats on IQ
16 0.99519539 90 andrew gelman stats-2010-06-16-Oil spill and corn production
17 0.99519539 122 andrew gelman stats-2010-07-01-MCMC machine
18 0.99519539 299 andrew gelman stats-2010-09-27-what is = what “should be” ??
19 0.99519539 632 andrew gelman stats-2011-03-28-Wobegon on the Potomac
20 0.99519539 826 andrew gelman stats-2011-07-27-The Statistics Forum!