241 andrew gelman stats-2010-08-29-Ethics and statistics in development research


meta information for this blog

Source: html

Introduction: From Banerjee and Duflo, “The Experimental Approach to Development Economics,” Annual Review of Economics (2009):

“One issue with the explicit acknowledgment of randomization as a fair way to allocate the program is that implementers may find that the easiest way to present it to the community is to say that an expansion of the program is planned for the control areas in the future (especially when such is indeed the case, as in phased-in design).”

I can’t quite figure out whether Banerjee and Duflo are saying that they would lie and tell people that an expansion is planned when it isn’t, or whether they’re deploring that other people do it. I’m not bothered by a lot of the deception in experimental research–for example, I think the Milgram obedience experiment was just fine–but somehow the above deception bothers me. It just seems wrong to tell people that an expansion is planned if it’s not.

P.S. Overall the article is pretty good. My only real problem with it is that


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 I can’t quite figure out whether Banerjee and Duflo are saying that they would lie and tell people that an expansion is planned when it isn’t, or whether they’re deploring that other people do it. [sent-2, score-0.781]

2 I’m not bothered by a lot of the deception in experimental research–for example, I think the Milgram obedience experiment was just fine–but somehow the above deception bothers me. [sent-3, score-0.84]

3 It just seems wrong to tell people that an expansion is planned if it’s not. [sent-4, score-0.636]

4 My only real problem with it is that when discussing data analysis, they pretty much ignore the statistical literature and just look at econometrics. [sent-8, score-0.183]

5 In the long run, that’s fine—any relevant developments in statistics should eventually make their way over to the econometrics literature. [sent-9, score-0.156]

6 But for now I think it’s a drawback in that it encourages a focus on theory and testing rather than modeling and scientific understanding. [sent-10, score-0.466]

7 Rather, I’m suggesting that their statistical methods might not be allowing them to get the most out of their data–and that they’re looking in the wrong place when researching better methods. [sent-12, score-0.573]

8 The problem, I think, is that they (like many economists) think of statistical methods not as a tool for learning but as a tool for rigor. [sent-13, score-0.503]

9 So they gravitate toward math-heavy methods based on testing, asymptotics, and abstract theories, rather than toward complex modeling. [sent-14, score-0.541]

10 The result is a disconnect between statistical methods and applied goals. [sent-15, score-0.427]
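
The page does not say how sentScore is computed. One reading that agrees with the numbers shown is that a sentence's score is the sum of the tfidf weights of its words: for sentence 3, the listed weights for 'wrong' (0.074), 'tell' (0.09), 'expansion' (0.236) and 'planned' (0.236) add up to 0.636, the score reported. The Python sketch below works under that assumption only; the function name sentence_score and the simple tokenizer are hypothetical and not taken from the site.

```python
import re

# A few word weights copied from the tfidf word list on this page.
weights = {"wrong": 0.074, "tell": 0.09, "expansion": 0.236, "planned": 0.236}


def sentence_score(sentence, weights):
    """Sum the weight of every token occurrence; unknown words count as 0.

    Assumption: this is only a guess at the scoring rule, not the
    site's documented method.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(weights.get(tok, 0.0) for tok in tokens)


sentence_3 = ("It just seems wrong to tell people that an expansion "
              "is planned if it's not.")
score = sentence_score(sentence_3, weights)
print(round(score, 3))
```

Only the top 50 words are listed on the page, so a sentence containing lower-weighted words that are not shown could not be checked this way.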


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('bannerjee', 0.34), ('duflo', 0.325), ('planned', 0.236), ('expansion', 0.236), ('deception', 0.226), ('asymptotics', 0.209), ('testing', 0.164), ('methods', 0.139), ('tool', 0.124), ('acknowledgment', 0.12), ('allocate', 0.12), ('obedience', 0.12), ('statistical', 0.116), ('drawback', 0.113), ('experimental', 0.112), ('milgram', 0.108), ('tests', 0.105), ('gravitate', 0.105), ('program', 0.104), ('treatment', 0.102), ('researching', 0.101), ('toward', 0.099), ('rather', 0.099), ('titles', 0.099), ('economics', 0.096), ('distributional', 0.095), ('bootstrap', 0.091), ('disconnect', 0.091), ('tell', 0.09), ('heterogeneity', 0.09), ('encourages', 0.09), ('bothers', 0.088), ('easiest', 0.088), ('randomization', 0.087), ('fine', 0.082), ('developments', 0.082), ('explicit', 0.082), ('lie', 0.081), ('applied', 0.081), ('instrumental', 0.079), ('nonparametric', 0.077), ('annual', 0.076), ('allowing', 0.075), ('wrong', 0.074), ('econometrics', 0.074), ('whether', 0.069), ('claiming', 0.068), ('suggesting', 0.068), ('bothered', 0.068), ('ignore', 0.067)]
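
As a rough illustration of where a word list like the one above could come from, here is a minimal Python sketch of tf-idf on a toy corpus. The weighting variant (term frequency divided by document length, times log(N/df)) is an assumption, since the page does not state which formula or normalization it uses; the toy documents and the name tfidf_top_words are hypothetical.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=3):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each word appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    n_docs = len(tokenized)
    tf = Counter(tokenized[doc_index])
    total = sum(tf.values())
    scores = {w: (c / total) * math.log(n_docs / df[w]) for w, c in tf.items()}
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


toy_docs = [
    "deception in experimental research",
    "statistical methods in research",
    "bayesian methods in statistics",
]
top = tfidf_top_words(toy_docs, 0, top_n=2)
print(top)
```

Words that occur in every document (here "in") get weight zero, which is why common words are absent from a list like the one above and rare names rank near the top.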

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research

2 0.30158728 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou

3 0.21042864 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

Introduction: Nick Brown is bothered by this article , “An unscented Kalman filter approach to the estimation of nonlinear dynamical systems models,” by Sy-Miin Chow, Emilio Ferrer, and John Nesselroade. The introduction of the article cites a bunch of articles in serious psych/statistics journals. The question is, are such advanced statistical techniques really needed, or even legitimate, with the kind of very rough data that is usually available in psych applications? Or is it just fishing in the hope of discovering patterns that are not really there? I wrote: It seems like a pretty innocuous literature review. I agree that many of the applications are silly (for example, they cite the work of the notorious John Gottman in fitting a predator-prey model to spousal relations (!)), but overall they just seem to be presenting very standard ideas for the mathematical-psychology audience. It’s not clear whether advanced techniques are always appropriate here, but they come in through a natura

4 0.13595033 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

Introduction: Avi sent along this old paper from Bryk and Raudenbush, who write: The presence of heterogeneity of variance across groups indicates that the standard statistical model for treatment effects no longer applies. Specifically, the assumption that treatments add a constant to each subject’s development fails. An alternative model is required to represent how treatment effects are distributed across individuals. We develop in this article a simple statistical model to demonstrate the link between heterogeneity of variance and random treatment effects. Next, we illustrate with results from two previously published studies how a failure to recognize the substantive importance of heterogeneity of variance obscured significant results present in these data. The article concludes with a review and synthesis of techniques for modeling variances. Although these methods have been well established in the statistical literature, they are not widely known by social and behavioral scientists. T

5 0.12387586 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing

Introduction: Ken Rice writes: In the recent discussion on stopping rules I saw a comment that I wanted to chip in on, but thought it might get a bit lost, in the already long thread. Apologies in advance if I misinterpreted what you wrote, or am trying to tell you things you already know. The comment was: “In Bayesian decision making, there is a utility function and you choose the decision with highest expected utility. Making a decision based on statistical significance does not correspond to any utility function.” … which immediately suggests this little 2010 paper; A Decision-Theoretic Formulation of Fisher’s Approach to Testing, The American Statistician, 64(4) 345-349. It contains utilities that lead to decisions that very closely mimic classical Wald tests, and provides a rationale for why this utility is not totally unconnected from how some scientists think. Some (old) slides discussing it are here . A few notes, on things not in the paper: * I know you don’t like squared-

6 0.11020663 1582 andrew gelman stats-2012-11-18-How to teach methods we don’t like?

7 0.11015551 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

8 0.10424538 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

9 0.10164268 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

10 0.10112751 1750 andrew gelman stats-2013-03-05-Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

11 0.10103368 744 andrew gelman stats-2011-06-03-Statistical methods for healthcare regulation: rating, screening and surveillance

12 0.097197786 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles

13 0.096236736 498 andrew gelman stats-2011-01-02-Theoretical vs applied statistics

14 0.095933519 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

15 0.092641726 1431 andrew gelman stats-2012-07-27-Overfitting

16 0.092151903 2210 andrew gelman stats-2014-02-13-Stopping rules and Bayesian analysis

17 0.092147551 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

18 0.091731943 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

19 0.091510542 32 andrew gelman stats-2010-05-14-Causal inference in economics

20 0.091232538 2245 andrew gelman stats-2014-03-12-More on publishing in journals


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.191), (1, 0.041), (2, -0.047), (3, -0.063), (4, -0.021), (5, 0.013), (6, -0.088), (7, 0.009), (8, 0.044), (9, 0.041), (10, -0.053), (11, 0.002), (12, 0.012), (13, -0.057), (14, 0.022), (15, -0.03), (16, -0.039), (17, -0.014), (18, -0.021), (19, -0.002), (20, 0.003), (21, -0.057), (22, 0.017), (23, 0.061), (24, -0.034), (25, 0.009), (26, -0.005), (27, 0.009), (28, -0.019), (29, 0.024), (30, -0.011), (31, 0.035), (32, 0.028), (33, -0.017), (34, -0.018), (35, -0.037), (36, -0.005), (37, 0.012), (38, -0.026), (39, 0.012), (40, 0.007), (41, 0.017), (42, -0.016), (43, -0.005), (44, 0.021), (45, 0.029), (46, -0.021), (47, -0.046), (48, -0.003), (49, 0.043)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97077978 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research

2 0.81604415 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

Introduction: Nick Brown is bothered by this article , “An unscented Kalman filter approach to the estimation of nonlinear dynamical systems models,” by Sy-Miin Chow, Emilio Ferrer, and John Nesselroade. The introduction of the article cites a bunch of articles in serious psych/statistics journals. The question is, are such advanced statistical techniques really needed, or even legitimate, with the kind of very rough data that is usually available in psych applications? Or is it just fishing in the hope of discovering patterns that are not really there? I wrote: It seems like a pretty innocuous literature review. I agree that many of the applications are silly (for example, they cite the work of the notorious John Gottman in fitting a predator-prey model to spousal relations (!)), but overall they just seem to be presenting very standard ideas for the mathematical-psychology audience. It’s not clear whether advanced techniques are always appropriate here, but they come in through a natura

3 0.804066 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

Introduction: John Cook writes : When I hear someone say “personalized medicine” I want to ask “as opposed to what?” All medicine is personalized. If you are in an emergency room with a broken leg and the person next to you is lapsing into a diabetic coma, the two of you will be treated differently. The aim of personalized medicine is to increase the degree of personalization, not to introduce personalization. . . . This to me is a statistical way of thinking, to change an “Is it or isn’t it?” question into a “How much?” question. This distinction arises in many settings but particularly in discussions of causal inference, for example here and here , where I use the “statistical thinking” approach of imagining everything as being on some continuous scale, in contrast to computer scientist Elias Bareinboim and psychology researcher Steven Sloman, both of whom prefer what might be called the “civilian” or “common sense” idea that effects are either real or not, or that certain data can

4 0.79706705 738 andrew gelman stats-2011-05-30-Works well versus well understood

Introduction: John Cook discusses the John Tukey quote, “The test of a good procedure is how well it works, not how well it is understood.” Cook writes: At some level, it’s hard to argue against this. Statistical procedures operate on empirical data, so it makes sense that the procedures themselves be evaluated empirically. But I [Cook] question whether we really know that a statistical procedure works well if it isn’t well understood. Specifically, I’m skeptical of complex statistical methods whose only credentials are a handful of simulations. “We don’t have any theoretical results, but hey, it works well in practice. Just look at the simulations.” Every method works well on the scenarios its author publishes, almost by definition. If the method didn’t handle a scenario well, the author would publish a different scenario. I agree with Cook but would give a slightly different emphasis. I’d say that a lot of methods can work when they are done well. See the second meta-principle liste

5 0.79687542 744 andrew gelman stats-2011-06-03-Statistical methods for healthcare regulation: rating, screening and surveillance

Introduction: Here is my discussion of a recent article by David Spiegelhalter, Christopher Sherlaw-Johnson, Martin Bardsley, Ian Blunt, Christopher Wood and Olivia Grigg, that is scheduled to appear in the Journal of the Royal Statistical Society: I applaud the authors’ use of a mix of statistical methods to attack an important real-world problem. Policymakers need results right away, and I admire the authors’ ability and willingness to combine several different modeling and significance testing ideas for the purposes of rating and surveillance. That said, I am uncomfortable with the statistical ideas here, for three reasons. First, I feel that the proposed methods, centered as they are around data manipulation and corrections for uncertainty, have serious defects compared to a more model-based approach. My problem with methods based on p-values and z-scores–however they happen to be adjusted–is that they draw discussion toward error rates, sequential analysis, and other technical statistical

6 0.78626531 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

7 0.78362995 2151 andrew gelman stats-2013-12-27-Should statistics have a Nobel prize?

8 0.76984483 1645 andrew gelman stats-2012-12-31-Statistical modeling, causal inference, and social science

9 0.76727885 1883 andrew gelman stats-2013-06-04-Interrogating p-values

10 0.76324278 2127 andrew gelman stats-2013-12-08-The never-ending (and often productive) race between theory and practice

11 0.75955099 1979 andrew gelman stats-2013-08-13-Convincing Evidence

12 0.75557172 1880 andrew gelman stats-2013-06-02-Flame bait

13 0.7458598 1750 andrew gelman stats-2013-03-05-Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

14 0.74201179 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

15 0.74175143 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

16 0.73890346 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

17 0.73171657 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

18 0.72963572 1861 andrew gelman stats-2013-05-17-Where do theories come from?

19 0.72948444 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?

20 0.7264269 155 andrew gelman stats-2010-07-19-David Blackwell


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.03), (16, 0.012), (21, 0.022), (24, 0.608), (42, 0.013), (53, 0.017), (99, 0.182)]
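
The topic vectors on this page are sparse lists of (topicId, topicWeight) pairs. A common way to compare two such vectors is cosine similarity, sketched below in Python. That the site uses cosine similarity is an assumption: the measure is not stated, and the same-blog entries in the lsi and lda lists score slightly below 1.0, so the actual procedure may differ. The second vector, other_post, is invented for illustration.

```python
import math


def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors given as (id, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only ids present in both vectors contribute to the dot product.
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# LDA topic weights for this post, copied from the list above.
this_post = [(15, 0.03), (16, 0.012), (21, 0.022), (24, 0.608),
             (42, 0.013), (53, 0.017), (99, 0.182)]
# Hypothetical topic weights for some other post (not from the page).
other_post = [(24, 0.5), (99, 0.3)]

sim_self = cosine_sparse(this_post, this_post)
sim_other = cosine_sparse(this_post, other_post)
print(sim_self, sim_other)
```

Because most of this post's weight sits on topic 24, any post dominated by the same topic would score high under this measure regardless of its subject, which may explain why the lda list below is full of unrelated posts.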

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99749571 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

Introduction: Jouni Kerman did a cool bit of research justifying the Beta (1/3, 1/3) prior as noninformative for binomial data, and the Gamma (1/3, 0) prior for Poisson data. You probably thought that nothing new could be said about noninformative priors in such basic problems, but you were wrong! Here’s the story : The conjugate binomial and Poisson models are commonly used for estimating proportions or rates. However, it is not well known that the conventional noninformative conjugate priors tend to shrink the posterior quantiles toward the boundary or toward the middle of the parameter space, making them thus appear excessively informative. The shrinkage is always largest when the number of observed events is small. This behavior persists for all sample sizes and exposures. The effect of the prior is therefore most conspicuous and potentially controversial when analyzing rare events. As alternative default conjugate priors, I [Jouni] introduce Beta(1/3, 1/3) and Gamma(1/3, 0), which I cal

2 0.99687231 1437 andrew gelman stats-2012-07-31-Paying survey respondents

Introduction: I agree with Casey Mulligan that participants in government surveys should be paid, and I think it should be part of the code of ethics for commercial pollsters to compensate their respondents also. As Mulligan points out, if a survey is worth doing, it should be worth compensating the participants for their time and effort. P.S. Just to clarify, I do not recommend that Census surveys be made voluntary, I just think that respondents (who can be required to participate) should be paid a small amount. P.P.S. More rant here.

3 0.99561149 471 andrew gelman stats-2010-12-17-Attractive models (and data) wanted for statistical art show.

Introduction: I have agreed to do a local art exhibition in February. An excuse to think about form, colour and style for plotting almost individual observation likelihoods – while invoking the artist’s privilege of refusing to give interpretations of their own work. In order to make it possibly less dry I’ll try to use intuitive suggestive captions like in this example TheTyranyof13.pdf thereby sidestepping the technical discussions like here RadfordNealBlog. Suggested models and data sets (or even submissions) would be most appreciated. I will likely be sticking to realism, i.e., plots that represent ‘statistical reality’ faithfully. K?

4 0.99484825 240 andrew gelman stats-2010-08-29-ARM solutions

Introduction: People sometimes email asking if a solution set is available for the exercises in ARM. The answer, unfortunately, is no. Many years ago, I wrote up 50 solutions for BDA and it was a lot of work–really, it was like writing a small book in itself. The trouble is that, once I started writing them up, I wanted to do it right, to set a good example. That’s a lot more effort than simply scrawling down some quick answers.

5 0.99074203 545 andrew gelman stats-2011-01-30-New innovations in spam

Introduction: I received the following (unsolicited) email today: Hello Andrew, I’m interested in whether you are accepting guest article submissions for your site Statistical Modeling, Causal Inference, and Social Science? I’m the owner of the recently created nonprofit site OnlineEngineeringDegree.org and am interested in writing / submitting an article for your consideration to be published on your site. Is that something you’d be willing to consider, and if so, what specs in terms of topics or length requirements would you be looking for? Thanks you for your time, and if you have any questions or are interested, I’d appreciate you letting me know. Sincerely, Samantha Rhodes Huh? P.S. My vote for most obnoxious spam remains this one , which does its best to dilute whatever remains of the reputation of Wolfram Research. Or maybe that particular bit of spam was written by a particularly awesome cellular automaton that Wolfram discovered? I guess in the world of big-time software

6 0.98741537 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

7 0.97419679 38 andrew gelman stats-2010-05-18-Breastfeeding, infant hyperbilirubinemia, statistical graphics, and modern medicine

8 0.97257042 59 andrew gelman stats-2010-05-30-Extended Binary Format Support for Mac OS X

same-blog 9 0.96715474 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research

10 0.96168882 938 andrew gelman stats-2011-10-03-Comparing prediction errors

11 0.96159154 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census

12 0.95980555 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors

13 0.95946437 1479 andrew gelman stats-2012-09-01-Mothers and Moms

14 0.95877862 2229 andrew gelman stats-2014-02-28-God-leaf-tree

15 0.9554559 613 andrew gelman stats-2011-03-15-Gay-married state senator shot down gay marriage

16 0.9554559 712 andrew gelman stats-2011-05-14-The joys of working in the public domain

17 0.9554559 723 andrew gelman stats-2011-05-21-Literary blurb translation guide

18 0.9554559 1242 andrew gelman stats-2012-04-03-Best lottery story ever

19 0.9554559 1252 andrew gelman stats-2012-04-08-Jagdish Bhagwati’s definition of feminist sincerity

20 0.95008814 373 andrew gelman stats-2010-10-27-It’s better than being forwarded the latest works of you-know-who