andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-907 knowledge-graph by maker-knowledge-mining

907 andrew gelman stats-2011-09-14-Reproducibility in Practice


meta info for this blog

Source: html

Introduction: In light of the recent article about drug-target research and replication (Andrew blogged it here) and l’affaire Potti, I have mentioned the “Forensic Bioinformatics” paper (Baggerly & Coombes 2009) to several colleagues in passing this week. I have concluded that it has not gotten the attention it deserves, though it has been discussed on this blog before too. Figure 1 from Baggerly & Coombes 2009. The authors try to reproduce published data, and end up “reverse engineering” what the original authors had to have done. Some examples: §2.2: “Training data sensitive/resistant labels are reversed.” §2.4: “Only 84/122 test samples are distinct; some samples are labeled both sensitive and resistant.” §2.7: Almost half of the data is incorrectly labeled resistant. §3.2: “This offset involves a single row shift: for example, … [data from] row 98 were used instead of those from row 97.” §5.4: “Poor documentation led a report on drug A to include a heatmap


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Figure 1 from Baggerly & Coombes 2009 The authors try to reproduce published data, and end up “reverse engineering” what the original authors had to have done. [sent-3, score-0.229]

2 §2.4: “Only 84/122 test samples are distinct; some samples are labeled both sensitive and resistant.” [sent-7, score-0.346]

3 §2.7: Almost half of the data is incorrectly labeled resistant. [sent-9, score-0.291]

4 §3.2: “This offset involves a single row shift: for example, … [data from] row 98 were used instead of those from row 97.” [sent-11, score-0.761]

5 §5.4: “Poor documentation led a report on drug A to include a heatmap for drug B and a gene list for drug C.” [sent-13, score-0.797]

6 These results are based on simple visual inspection and counting, and are not documented further. [sent-14, score-0.184]

7 Continuing in the usual theme of my occasional posts, I’ll share what reproducible research means for me in practice. [sent-16, score-0.083]

8 Here is my xetex template if you care about typography. [sent-19, score-0.153]

9 Eventually I save objects that took a long time to compute, set their evaluation to false, and then load the saved object immediately below, but crucially I still have their generative code right there. [sent-20, score-0.787]

10 Rdata") @ So once I was satisfied that computation1 produced the object. [sent-25, score-0.076]

11 1 of my dreams, I could just flip eval=FALSE on the first code chunk and save myself the hassle. [sent-26, score-0.551]
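The save-then-load pattern in sentences 9–11 can be sketched as a pair of Sweave chunks. This is a minimal illustration only, not the author's actual template: the chunk names, the file name computation1.Rdata, and the stand-in function some_long_mcmc_run are all assumptions.

```
% Sketch of the cache-but-keep-the-code pattern (hypothetical names).
<<computation1, eval=FALSE>>=
# Expensive step; flip eval=TRUE to regenerate the object from scratch.
fit <- some_long_mcmc_run(data)        # hypothetical long-running call
save(fit, file = "computation1.Rdata")
@

<<load-computation1>>=
# Load the cached object; the generative code stays right above it.
load("computation1.Rdata")
@
```

Once the first chunk has produced a satisfactory object, setting eval=FALSE on it skips the recomputation on every run while the generative code remains visible in the document.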

12 It is generally not painful to leave all the pre-processing (data loading, joining, and recoding) in the first code chunk. [sent-28, score-0.575]

13 This prevents you from ending up with a stylized data file whose provenance you no longer know, because you actually redo everything from scratch every time. [sent-29, score-0.669]

14 It sometimes makes sense to separate this out into a file that you source(). [sent-30, score-0.253]
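The source() separation in sentence 14 might look like the following; the file name prep.R and the column names are illustrative, not taken from the post.

```
## prep.R -- hypothetical pre-processing script: loading, joining, recoding.
raw    <- read.csv("raw-data.csv")
sites  <- read.csv("sites.csv")
merged <- merge(raw, sites, by = "site_id")          # join
merged$treated <- as.integer(merged$arm == "drug")   # recode

## In the first chunk of paper.Rnw:
## <<prep>>=
## source("prep.R")
## @
```

Because the first chunk reruns prep.R on every build, the analysis always starts from the raw data rather than from a hand-edited intermediate file.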

15 For presentations or other destinations, I can just copy the paper.Rnw, make any necessary changes (to the size in the code chunk argument, for example, to make Beamer-friendly images). [sent-31, score-0.083; sent-33, score-0.467]

17 …Rnw on this will ensure that my names are consistent (“pres-gfx-codechunkname.pdf”) and I don’t do something completely different or accidentally use the wrong model on the wrong graphic. [sent-35, score-0.076; sent-36, score-0.083]

19 I could, if I had truly mastered ediff, easily merge any changes I made for presentation back to the paper. [sent-37, score-0.303]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('file', 0.253), ('sweave', 0.231), ('row', 0.223), ('baggerly', 0.211), ('coombes', 0.211), ('chunk', 0.19), ('code', 0.184), ('load', 0.178), ('save', 0.177), ('drug', 0.177), ('template', 0.153), ('labeled', 0.128), ('samples', 0.109), ('mastered', 0.105), ('forensic', 0.105), ('heatmap', 0.105), ('loading', 0.105), ('merge', 0.105), ('recoding', 0.105), ('inspection', 0.099), ('false', 0.097), ('painful', 0.095), ('dreams', 0.095), ('changes', 0.093), ('crucially', 0.092), ('offset', 0.092), ('redo', 0.092), ('potti', 0.089), ('joining', 0.087), ('data', 0.086), ('documented', 0.085), ('accidentally', 0.083), ('presentations', 0.083), ('reproducible', 0.083), ('documentation', 0.081), ('scratch', 0.08), ('gene', 0.08), ('generative', 0.08), ('stylized', 0.079), ('prevent', 0.079), ('authors', 0.078), ('incorrectly', 0.077), ('ensure', 0.076), ('satisfied', 0.076), ('saved', 0.076), ('gross', 0.074), ('reproduce', 0.073), ('distinct', 0.073), ('deserves', 0.073), ('almost', 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 907 andrew gelman stats-2011-09-14-Reproducibility in Practice


2 0.2361128 360 andrew gelman stats-2010-10-21-Forensic bioinformatics, or, Don’t believe everything you read in the (scientific) papers

Introduction: Hadley Wickham sent me this , by Keith Baggerly and Kevin Coombes: In this report we [Baggerly and Coombes] examine several related papers purporting to use microarray-based signatures of drug sensitivity derived from cell lines to predict patient response. Patients in clinical trials are currently being allocated to treatment arms on the basis of these results. However, we show in five case studies that the results incorporate several simple errors that may be putting patients at risk. One theme that emerges is that the most common errors are simple (e.g., row or column offsets); conversely, it is our experience that the most simple errors are common. This is horrible! But, in a way, it’s not surprising. I make big mistakes in my applied work all the time. I mean, all the time. Sometimes I scramble the order of the 50 states, or I’m plotting a pure noise variable, or whatever. But usually I don’t drift too far from reality because I have a lot of cross-checks and I (or my

3 0.14378637 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?

Introduction: David Karger writes: Your recent post on sharing data was of great interest to me, as my own research in computer science asks how to incentivize and lower barriers to data sharing. I was particularly curious about your highlighting of effort as the major dis-incentive to sharing. I would love to hear more, as this question of effort is on we specifically target in our development of tools for data authoring and publishing. As a straw man, let me point out that sharing data technically requires no more than posting an excel spreadsheet online. And that you likely already produced that spreadsheet during your own analytic work. So, in what way does such low-tech publishing fail to meet your data sharing objectives? Our own hypothesis has been that the effort is really quite low, with the problem being a lack of *immediate/tangible* benefits (as opposed to the long-term values you accurately describe). To attack this problem, we’re developing tools (and, since it appear

4 0.13725612 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

Introduction: This post is by Phil A recent post on this blog discusses a prominent case of an Excel error leading to substantially wrong results from a statistical analysis. Excel is notorious for this because it is easy to add a row or column of data (or intermediate results) but forget to update equations so that they correctly use the new data. That particular error is less common in a language like R because R programmers usually refer to data by variable name (or by applying functions to a named variable), so the same code works even if you add or remove data. Still, there is plenty of opportunity for errors no matter what language one uses. Andrew ran into problems fairly recently, and also blogged about another instance. I’ve never had to retract a paper, but that’s partly because I haven’t published a whole lot of papers. Certainly I have found plenty of substantial errors pretty late in some of my data analyses, and I obviously don’t have sufficient mechanisms in place to be sure

5 0.10930879 1059 andrew gelman stats-2011-12-14-Looking at many comparisons may increase the risk of finding something statistically significant by epidemiologists, a population with relatively low multilevel modeling consumption

Introduction: To understand the above title, see here . Masanao writes: This report claims that eating meat increases the risk of cancer. I’m sure you can’t read the page but you probably can understand the graphs. Different bars represent subdivision in the amount of the particular type of meat one consumes. And each chunk is different types of meat. Left is for male right is for female. They claim that the difference is significant, but they are clearly not!! I’m for not eating much meat but this is just way too much… Here’s the graph: I don’t know what to think. If you look carefully you can find one or two statistically significant differences but overall the pattern doesn’t look so compelling. I don’t know what the top and bottom rows are, though. Overall, the pattern in the top row looks like it could represent a real trend, while the graphs on the bottom row look like noise. This could be a good example for our multiple comparisons paper. If the researchers won’t

6 0.10142946 41 andrew gelman stats-2010-05-19-Updated R code and data for ARM

7 0.09269838 2078 andrew gelman stats-2013-10-26-“The Bayesian approach to forensic evidence”

8 0.088585675 42 andrew gelman stats-2010-05-19-Updated solutions to Bayesian Data Analysis homeworks

9 0.08567372 1754 andrew gelman stats-2013-03-08-Cool GSS training video! And cumulative file 1972-2012!

10 0.085059933 198 andrew gelman stats-2010-08-11-Multilevel modeling in R on a Mac

11 0.079790026 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

12 0.079603866 852 andrew gelman stats-2011-08-13-Checking your model using fake data

13 0.075539157 2137 andrew gelman stats-2013-12-17-Replication backlash

14 0.07442151 1883 andrew gelman stats-2013-06-04-Interrogating p-values

15 0.074365988 702 andrew gelman stats-2011-05-09-“Discovered: the genetic secret of a happy life”

16 0.074306905 154 andrew gelman stats-2010-07-18-Predictive checks for hierarchical models

17 0.074011967 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again

18 0.073187493 1716 andrew gelman stats-2013-02-09-iPython Notebook

19 0.070626304 1009 andrew gelman stats-2011-11-14-Wickham R short course

20 0.070174821 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.15), (1, -0.007), (2, -0.018), (3, -0.024), (4, 0.052), (5, -0.037), (6, -0.004), (7, -0.051), (8, 0.005), (9, -0.019), (10, -0.019), (11, 0.013), (12, -0.019), (13, -0.039), (14, 0.006), (15, 0.019), (16, 0.016), (17, -0.012), (18, 0.006), (19, -0.0), (20, 0.023), (21, 0.027), (22, -0.022), (23, -0.006), (24, -0.039), (25, 0.011), (26, 0.008), (27, -0.018), (28, 0.051), (29, 0.009), (30, 0.027), (31, -0.011), (32, 0.003), (33, 0.041), (34, 0.039), (35, -0.021), (36, -0.018), (37, 0.039), (38, -0.003), (39, 0.008), (40, -0.003), (41, -0.002), (42, 0.005), (43, -0.002), (44, -0.007), (45, 0.042), (46, -0.036), (47, -0.016), (48, 0.047), (49, -0.007)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95442873 907 andrew gelman stats-2011-09-14-Reproducibility in Practice


2 0.85743344 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?


3 0.81178188 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead

Introduction: Christian Robert posts these thoughts : I [Ross Ihaka] have been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R. One of the worst problems is scoping. Consider the following little gem. f =function() { if (runif(1) > .5) x = 10 x } The x being returned by this function is randomly local or global. There are other examples where variables alternate between local and non-local throughout the body of a function. No sensible language would allow this. It’s ugly and it makes optimisation really difficult. This isn’t the only problem, even weirder things happen because of interactions between scoping and lazy evaluation. In light of this, I [Ihaka] have come to the c

4 0.78727615 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?


5 0.78580093 1716 andrew gelman stats-2013-02-09-iPython Notebook

Introduction: Burak Bayramli writes: I wanted to inform you on iPython Notebook technology – allowing markup, Python code to reside in one document. Someone ported one of your examples from ARM . iPynb file is actually a live document, can be downloaded and reran locally, hence change of code on document means change of images, results. Graphs (as well as text output) which are generated by the code, are placed inside the document automatically. No more referencing image files seperately. For now running notebooks locally require a notebook server, but that part can live “on the cloud” as part of an educational software. Viewers, such as nbviewer.ipython.org, do not even need that much, since all recent results of a notebook are embedded in the notebook itself. A lot of people are excited about this; Also out of nowhere, Alfred P. Sloan Foundation dropped a $1.15 million grant on the developers of ipython which provided some extra energy on the project. Cool. We’ll have to do that ex

6 0.76360244 266 andrew gelman stats-2010-09-09-The future of R

7 0.76347184 360 andrew gelman stats-2010-10-21-Forensic bioinformatics, or, Don’t believe everything you read in the (scientific) papers

8 0.76106817 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns

9 0.74785835 470 andrew gelman stats-2010-12-16-“For individuals with wine training, however, we find indications of a positive relationship between price and enjoyment”

10 0.7462644 2355 andrew gelman stats-2014-05-31-Jessica Tracy and Alec Beall (authors of the fertile-women-wear-pink study) comment on our Garden of Forking Paths paper, and I comment on their comments

11 0.74021345 1805 andrew gelman stats-2013-04-16-Memo to Reinhart and Rogoff: I think it’s best to admit your errors and go on from there

12 0.73500675 736 andrew gelman stats-2011-05-29-Response to “Why Tables Are Really Much Better Than Graphs”

13 0.72998172 1808 andrew gelman stats-2013-04-17-Excel-bashing

14 0.7256043 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

15 0.7252863 2337 andrew gelman stats-2014-05-18-Never back down: The culture of poverty and the culture of journalism

16 0.72454602 1369 andrew gelman stats-2012-06-06-Your conclusion is only as good as your data

17 0.72389424 268 andrew gelman stats-2010-09-10-Fighting Migraine with Multilevel Modeling

18 0.72211361 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R

19 0.719356 2352 andrew gelman stats-2014-05-29-When you believe in things that you don’t understand

20 0.71514881 1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(4, 0.16), (16, 0.055), (21, 0.052), (24, 0.116), (47, 0.033), (63, 0.029), (76, 0.02), (84, 0.073), (86, 0.017), (89, 0.016), (96, 0.026), (97, 0.013), (99, 0.244)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94566226 947 andrew gelman stats-2011-10-08-GiveWell sez: Cost-effectiveness of de-worming was overstated by a factor of 100 (!) due to a series of sloppy calculations

Introduction: Alexander at GiveWell writes : The Disease Control Priorities in Developing Countries (DCP2), a major report funded by the Gates Foundation . . . provides an estimate of $3.41 per disability-adjusted life-year (DALY) for the cost-effectiveness of soil-transmitted-helminth (STH) treatment, implying that STH treatment is one of the most cost-effective interventions for global health. In investigating this figure, we have corresponded, over a period of months, with six scholars who had been directly or indirectly involved in the production of the estimate. Eventually, we were able to obtain the spreadsheet that was used to generate the $3.41/DALY estimate. That spreadsheet contains five separate errors that, when corrected, shift the estimated cost effectiveness of deworming from $3.41 to $326.43. [I think they mean to say $300 -- ed.] We came to this conclusion a year after learning that the DCP2’s published cost-effectiveness estimate for schistosomiasis treatment – another kind of

2 0.93592745 1618 andrew gelman stats-2012-12-11-The consulting biz

Introduction: I received the following (unsolicited) email: Hello, *** LLC, a ***-based market research company, has a financial client who is interested in speaking with a statistician who has done research in the field of Alzheimer’s Disease and preferably familiar with the SOLA and BAPI trials. We offer an honorarium of $200 for a 30 minute telephone interview. Please advise us if you have an employment or consulting agreement with any organization or operate professionally pursuant to an organization’s code of conduct or employee manual that may control activities by you outside of your regular present and former employment, such as participating in this consulting project for MedPanel. If there are such contracts or other documents that do apply to you, please forward MedPanel a copy of each such document asap as we are obligated to review such documents to determine if you are permitted to participate as a consultant for MedPanel on a project with this particular client. If you are

same-blog 3 0.9355039 907 andrew gelman stats-2011-09-14-Reproducibility in Practice


4 0.92477572 1801 andrew gelman stats-2013-04-13-Can you write a program to determine the causal order?

Introduction: Mike Zyphur writes: Kaggle.com has launched a competition to determine what’s an effect and what’s a cause. They’ve got correlated variables, they’re deprived of context, and you’re asked to determine the causal order. $5,000 prizes. I followed the link and the example they gave didn’t make much sense to me (the two variables were temperature and altitude of cities in Germany, and they said that altitude causes temperature). It has the feeling to me of one of those weird standardized tests we used to see sometimes in school, where there’s no real correct answer so the goal is to figure out what the test-writer wanted you to say. Nonetheless, this might be of interest, so I’m passing it along to you.

5 0.91399169 1919 andrew gelman stats-2013-06-29-R sucks

Introduction: I was trying to make some new graphs using 5-year-old R code and I got all these problems because I was reading in files with variable names such as “co.fipsid” and now R is automatically changing them to “co_fipsid”. Or maybe the names had underbars all along, and the old R had changed them into dots. Whatever. I understand that backward compatibility can be hard to maintain, but this is just annoying.

6 0.91353405 1918 andrew gelman stats-2013-06-29-Going negative

7 0.90661949 238 andrew gelman stats-2010-08-27-No radon lobby

8 0.8930707 113 andrew gelman stats-2010-06-28-Advocacy in the form of a “deliberative forum”

9 0.89044797 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

10 0.87921381 1829 andrew gelman stats-2013-04-28-Plain old everyday Bayesianism!

11 0.876284 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

12 0.87399244 2212 andrew gelman stats-2014-02-15-Mary, Mary, why ya buggin

13 0.87203443 1470 andrew gelman stats-2012-08-26-Graphs showing regression uncertainty: the code!

14 0.86974883 2078 andrew gelman stats-2013-10-26-“The Bayesian approach to forensic evidence”

15 0.86770707 2000 andrew gelman stats-2013-08-28-Why during the 1950-1960′s did Jerry Cornfield become a Bayesian?

16 0.86358255 1997 andrew gelman stats-2013-08-24-Measurement error in monkey studies

17 0.85385597 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

18 0.85156602 2053 andrew gelman stats-2013-10-06-Ideas that spread fast and slow

19 0.85066295 1605 andrew gelman stats-2012-12-04-Write This Book

20 0.85004729 360 andrew gelman stats-2010-10-21-Forensic bioinformatics, or, Don’t believe everything you read in the (scientific) papers