“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”
Andrew Gelman, statistics blog, 2012-08-02 (post 1441)
Introduction: David Radwin writes: I am seeking a statistic measuring an estimate’s reliability or stability as an alternative to the coefficient of variation (CV), also known as the relative standard error. The CV is the standard error of an estimate (proportion, mean, regression coefficient, etc.) divided by the estimate itself, usually expressed as a percentage. For example, if a survey finds 15% unemployment with a 6% standard error, the CV is .06/.15 = .4 = 40%. Some US government agencies flag or suppress as unreliable any estimate with a CV over a certain threshold such as 30% or 50%. But this standard can be arbitrary (for example, 85% employment would have a much lower CV of .06/.85 = 7%), and the CV has other drawbacks I won’t elaborate here. I don’t need an evaluation of the wisdom of using the CV or anything else for measuring an estimate’s stability, but one of my projects calls for such a measure and I would like to find a better alternative. Can you or your blog readers suggest a different measure of reliability?
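The arithmetic behind the complaint can be sketched in a few lines (the numbers are the hypothetical ones from the example above): the same 6% standard error produces very different CVs depending on which side of the proportion you happen to report.

```python
# The coefficient of variation (relative standard error) as described
# above: standard error divided by the estimate itself.

def coefficient_of_variation(estimate, standard_error):
    """Relative standard error of an estimate."""
    return standard_error / estimate

se = 0.06
cv_unemployment = coefficient_of_variation(0.15, se)  # 15% unemployment
cv_employment = coefficient_of_variation(0.85, se)    # 85% employment

print(f"CV (unemployment): {cv_unemployment:.0%}")  # 40%
print(f"CV (employment):   {cv_employment:.1%}")    # 7.1%
```

So a 30% or 50% suppression threshold flags the unemployment estimate but not the logically equivalent employment estimate.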
My reply: If you are stuck here, go back to first principles. If you just need a measure, you can supply both the estimate and the standard error. But it sounds like you are looking for a rule of some sort? Maybe it would help to try to quantify the gains and losses from classifying an estimate as “stable.” I’m not saying that the formal decision analysis should decide your answer, but it could give you some intuition about what various proposed procedures are doing. Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.
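The decision-analysis idea can be sketched in miniature (the loss values and the little population of estimates here are entirely made up for illustration): assign a cost to publishing an unreliable estimate and a cost to suppressing a reliable one, then see how a proposed CV threshold fares in expected loss.

```python
# Hypothetical losses: publishing a bad number is assumed ten times
# worse than withholding a good one. Both values are illustrative.
loss_publish_unreliable = 10.0
loss_suppress_reliable = 1.0

# Assumed population of estimates: (CV, is it actually reliable?).
# These pairs are invented for the sketch, not from any real survey.
estimates = [(0.05, True), (0.10, True), (0.25, True),
             (0.35, False), (0.45, False), (0.60, False)]

def expected_loss(threshold):
    """Average loss of the rule 'suppress any estimate with CV > threshold'."""
    total = 0.0
    for cv, reliable in estimates:
        suppressed = cv > threshold
        if suppressed and reliable:
            total += loss_suppress_reliable       # withheld a good estimate
        elif not suppressed and not reliable:
            total += loss_publish_unreliable      # published a bad estimate
    return total / len(estimates)

for t in (0.30, 0.50):
    print(f"threshold {t:.0%}: expected loss {expected_loss(t):.2f}")
```

Under these invented assumptions the 30% threshold beats the 50% one; the point is not the particular numbers but that writing the losses down makes explicit what any proposed cutoff rule is implicitly assuming.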