
1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing



Introduction: Several months ago, Sam Behseta, the new editor of Chance magazine, asked me if I’d like to have a column. I said yes, I’d like to write on ethics and statistics. My first column was called “Open Data and Open Methods” and I discussed the ethical obligation to share data and make our computations transparent wherever possible. In my column, I recounted a story from a bit over 20 years ago when I noticed a problem in a published analysis (involving electromagnetic fields and calcium flow in chicken brains) and contacted the researcher in charge of the study, who would not share his data with me. Two of the people from that research team—biologist Carl Blackman and statistician Dennis House—saw my Chance column and felt that I had misrepresented the situation and had criticized them unfairly. Blackman and House expressed their concerns in letters to the editor which were just published, along with my reply, in the latest issue of Chance . Seeing as I posted my article here, I


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Several months ago, Sam Behseta, the new editor of Chance magazine, asked me if I’d like to have a column. [sent-1, score-0.137]

2 I said yes, I’d like to write on ethics and statistics. [sent-2, score-0.247]

3 My first column was called “Open Data and Open Methods” and I discussed the ethical obligation to share data and make our computations transparent wherever possible. [sent-3, score-0.871]

4 In my column, I recounted a story from a bit over 20 years ago when I noticed a problem in a published analysis (involving electromagnetic fields and calcium flow in chicken brains) and contacted the researcher in charge of the study, who would not share his data with me. [sent-4, score-0.769]

5 Two of the people from that research team—biologist Carl Blackman and statistician Dennis House—saw my Chance column and felt that I had misrepresented the situation and had criticized them unfairly. [sent-5, score-0.389]

6 Blackman and House expressed their concerns in letters to the editor which were just published, along with my reply, in the latest issue of Chance. [sent-6, score-0.571]

7 I encourage all of you who are interested in ethics and data sharing to take a look. [sent-9, score-0.572]

8 As I wrote in my response, I appreciate the letters of Dr. [sent-10, score-0.353]

9 House and I hope that readers will benefit from seeing both their perspectives and mine—just as researchers in general can benefit from seeing multiple analyses of publicly shared data. [sent-12, score-0.892]

10 Please don’t put any criticisms of Blackman or House (or me! [sent-15, score-0.135]

11 I appreciate that they put in the effort to respond, and my purpose in posting their letters here is to give a forum for their views. [sent-17, score-0.633]

12 General comments about ethics and data sharing would be fine, but no need to focus on this particular case. [sent-18, score-0.496]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('blackman', 0.516), ('house', 0.26), ('ethics', 0.247), ('letters', 0.229), ('column', 0.194), ('seeing', 0.168), ('sharing', 0.161), ('chance', 0.144), ('editor', 0.137), ('behseta', 0.129), ('appreciate', 0.124), ('benefit', 0.123), ('share', 0.117), ('misrepresented', 0.112), ('wherever', 0.106), ('carl', 0.102), ('transparent', 0.102), ('biologist', 0.102), ('open', 0.101), ('computations', 0.1), ('brains', 0.098), ('chicken', 0.096), ('dennis', 0.092), ('flow', 0.09), ('data', 0.088), ('sam', 0.086), ('contacted', 0.085), ('obligation', 0.085), ('perspectives', 0.084), ('criticized', 0.083), ('publicly', 0.082), ('forum', 0.08), ('charge', 0.08), ('ethical', 0.079), ('shared', 0.078), ('encourage', 0.076), ('published', 0.075), ('ago', 0.074), ('mine', 0.071), ('expressed', 0.071), ('criticisms', 0.07), ('magazine', 0.069), ('concerns', 0.068), ('posting', 0.068), ('involving', 0.067), ('purpose', 0.067), ('general', 0.066), ('latest', 0.066), ('put', 0.065), ('fields', 0.064)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing


2 0.23478088 1237 andrew gelman stats-2012-03-30-Statisticians: When We Teach, We Don’t Practice What We Preach

Introduction: My new Chance ethics column (cowritten with Eric Loken). Click through and take a look. It’s a short article and I really like it. And here’s more Chance.

3 0.19236256 1117 andrew gelman stats-2012-01-13-What are the important issues in ethics and statistics? I’m looking for your input!

Introduction: I’ve recently started a regular column on ethics, appearing every three months in Chance magazine. My first column, “Open Data and Open Methods,” is here, and my second column, “Statisticians: When we teach, we don’t practice what we preach” (coauthored with Eric Loken) will be appearing in the next issue. Statistical ethics is a wide-open topic, and I’d be very interested in everyone’s thoughts, questions, and stories. I’d like to get beyond generic questions such as, Is it right to do a randomized trial when you think the treatment is probably better than the control?, and I’d also like to avoid the really easy questions such as, Is it ethical to copy Wikipedia entries and then sell the resulting publication for $2800 a year? [Note to people who are sick of hearing about this particular story: I'll consider stopping my blogging on it, the moment that the people involved consider apologizing for their behavior.] Please insert your thoughts, questions, stories, links, et

4 0.18417731 1588 andrew gelman stats-2012-11-23-No one knows what it’s like to be the bad man

Introduction: Part 1. The ideal policy Basbøll, as always, gets right to the point: Andrew Gelman is not the plagiarism police because there is no such thing as the plagiarism police. But, he continues: There is, at any self-respecting university and any self-respecting academic journal, a plagiarism policy, and there sure as hell is a “morality” of writing in the world of scholarship. The cardinal rule is: don’t use other people’s words or ideas without attributing those words or ideas to the people you got them from. What to do when the plagiarism (or, perhaps, sloppy quotation, to use a less loaded word) comes to light? Everyone makes mistakes, but if you make one you have to correct it. Don’t explain why your mistake isn’t very serious or “set things right” by pointing to the “obvious” signs of your good intentions. . . . Don’t say you’ve cleared it with the original author. The real victim of your crime is not the other writer; it’s your reader. That’s whose trust you’ve be

5 0.14477253 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?

Introduction: David Karger writes: Your recent post on sharing data was of great interest to me, as my own research in computer science asks how to incentivize and lower barriers to data sharing. I was particularly curious about your highlighting of effort as the major dis-incentive to sharing. I would love to hear more, as this question of effort is one we specifically target in our development of tools for data authoring and publishing. As a straw man, let me point out that sharing data technically requires no more than posting an excel spreadsheet online. And that you likely already produced that spreadsheet during your own analytic work. So, in what way does such low-tech publishing fail to meet your data sharing objectives? Our own hypothesis has been that the effort is really quite low, with the problem being a lack of *immediate/tangible* benefits (as opposed to the long-term values you accurately describe). To attack this problem, we’re developing tools (and, since it appear

6 0.11389209 1590 andrew gelman stats-2012-11-26-I need a title for my book on ethics and statistics!!

7 0.10486461 158 andrew gelman stats-2010-07-22-Tenants and landlords

8 0.1048535 991 andrew gelman stats-2011-11-04-Insecure researchers aren’t sharing their data

9 0.10438026 237 andrew gelman stats-2010-08-27-Bafumi-Erikson-Wlezien predict a 50-seat loss for Democrats in November

10 0.10133134 1212 andrew gelman stats-2012-03-14-Controversy about a ranking of philosophy departments, or How should we think about statistical results when we can’t see the raw data?

11 0.09877336 1164 andrew gelman stats-2012-02-13-Help with this problem, win valuable prizes

12 0.094574898 250 andrew gelman stats-2010-09-02-Blending results from two relatively independent multi-level models

13 0.092864394 1268 andrew gelman stats-2012-04-18-Experimenting on your intro stat course, as a way of teaching experimentation in your intro stat course (and also to improve the course itself)

14 0.087741017 1581 andrew gelman stats-2012-11-17-Horrible but harmless?

15 0.081231296 1399 andrew gelman stats-2012-06-28-Life imitates blog

16 0.081209965 434 andrew gelman stats-2010-11-28-When Small Numbers Lead to Big Errors

17 0.081118248 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

18 0.080812171 374 andrew gelman stats-2010-10-27-No matter how famous you are, billions of people have never heard of you.

19 0.079295889 2353 andrew gelman stats-2014-05-30-I posted this as a comment on a sociology blog

20 0.078032762 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science
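
The page does not say how the tfidf weights and similarity values listed above were produced. As a rough illustration only, the following sketch shows one common way to get numbers of this kind: per-word tf-idf weights for each post, L2-normalized, with cosine similarity between posts. The whitespace tokenizer, the smoothed idf formula, and the function names (`tfidf_vectors`, `cosine`, `top_words`) are all assumptions made for this example and are not taken from the original pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document, L2-normalized."""
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (an assumption): log((1 + N) / (1 + df)) + 1.
        vec = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else vec)
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


def top_words(vec, n=5):
    """The n highest-weighted words of one document vector."""
    return sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Under these assumptions a post compared with itself scores 1.0, and posts that share no words score 0, so a ranked list can be built by sorting the other posts by `cosine` against the current one.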


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.14), (1, -0.048), (2, -0.032), (3, -0.015), (4, -0.018), (5, -0.001), (6, -0.014), (7, -0.043), (8, -0.019), (9, -0.024), (10, 0.038), (11, 0.008), (12, 0.011), (13, -0.009), (14, -0.034), (15, 0.048), (16, -0.024), (17, 0.004), (18, 0.034), (19, 0.041), (20, 0.002), (21, 0.07), (22, 0.039), (23, -0.027), (24, -0.05), (25, 0.053), (26, -0.014), (27, -0.045), (28, -0.011), (29, -0.003), (30, 0.024), (31, -0.004), (32, -0.008), (33, 0.052), (34, -0.004), (35, 0.006), (36, 0.008), (37, -0.002), (38, 0.007), (39, 0.034), (40, -0.007), (41, 0.014), (42, 0.022), (43, -0.001), (44, -0.025), (45, -0.015), (46, -0.036), (47, -0.058), (48, 0.024), (49, 0.001)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94262123 1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing


2 0.7057085 1117 andrew gelman stats-2012-01-13-What are the important issues in ethics and statistics? I’m looking for your input!

Introduction: I’ve recently started a regular column on ethics, appearing every three months in Chance magazine. My first column, “Open Data and Open Methods,” is here, and my second column, “Statisticians: When we teach, we don’t practice what we preach” (coauthored with Eric Loken) will be appearing in the next issue. Statistical ethics is a wide-open topic, and I’d be very interested in everyone’s thoughts, questions, and stories. I’d like to get beyond generic questions such as, Is it right to do a randomized trial when you think the treatment is probably better than the control?, and I’d also like to avoid the really easy questions such as, Is it ethical to copy Wikipedia entries and then sell the resulting publication for $2800 a year? [Note to people who are sick of hearing about this particular story: I'll consider stopping my blogging on it, the moment that the people involved consider apologizing for their behavior.] Please insert your thoughts, questions, stories, links, et

3 0.6899904 1756 andrew gelman stats-2013-03-10-He said he was sorry

Introduction: Yes, it can be done : Hereby I contact you to clarify the situation that occurred with the publication of the article entitled *** which was published in Volume 11, Issue 3 of *** and I made the mistake of declaring as an author. This chapter is a plagiarism of . . . I wish to express and acknowledge that I am solely responsible for this . . . I recognize the gravity of the offense committed, since there is no justification for so doing. Therefore, and as a sign of shame and regret I feel in this situation, I will publish this letter, in order to set an example for other researchers do not engage in a similar error. No more, and to please accept my apologies, Sincerely, *** P.S. Since we’re on Retraction Watch already, I’ll point you to this unrelated story featuring a hilarious photo of a fraudster, who in this case was a grad student in psychology who faked his data and “has agreed to submit to a three-year supervisory period for any work involving funding from the

4 0.68339372 2148 andrew gelman stats-2013-12-25-Spam!

Introduction: This one totally faked me out at first. It was an email from “Nick Bagnall” that began: Dear Dr. Gelman, I made contact last year regarding your work in the CMG: Reconstructing Climate from Tree Ring Data project. We are about to start producing the 2014 edition and I wanted to discuss this with you as we still remain keen to feature your work. Research Media are producing a special publication in February of 2014, within this report we will be working with a small selected number of PI’s with a focus on geosciences, atmospheric and geospace sciences and earth Sciences.. At this point, I’m thinking: Hmmm, I don’t remember this guy, is this some sort of collaborative project that I’d forgotten about? The message then continues: The publication is called International Innovation . . . Huh? This doesn’t sound so good. The email then goes on with some very long lists, and then finally the kicker: The total cost for each article produced in this report is fixed a

5 0.67617023 1835 andrew gelman stats-2013-05-02-7 ways to separate errors from statistics

Introduction: Betsey Stevenson and Justin Wolfers have been inspired by the recent Reinhart and Rogoff debacle to list “six ways to separate lies from statistics” in economics research: 1. “Focus on how robust a finding is, meaning that different ways of looking at the evidence point to the same conclusion.” 2. Don’t confuse statistical with practical significance. 3. “Be wary of scholars using high-powered statistical techniques as a bludgeon to silence critics who are not specialists.” 4. “Don’t fall into the trap of thinking about an empirical finding as ‘right’ or ‘wrong.’ At best, data provide an imperfect guide.” 5. “Don’t mistake correlation for causation.” 6. “Always ask ‘so what?’” I like all these points, especially #4, which I think doesn’t get said enough. As I wrote a few months ago, high-profile social science research aims for proof, not for understanding—and that’s a problem. My addition to the list If you compare my title above to that of Stevenson

6 0.67224842 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

7 0.66569161 1212 andrew gelman stats-2012-03-14-Controversy about a ranking of philosophy departments, or How should we think about statistical results when we can’t see the raw data?

8 0.65917015 1640 andrew gelman stats-2012-12-26-What do people do wrong? WSJ columnist is looking for examples!

9 0.65866792 1922 andrew gelman stats-2013-07-02-They want me to send them free material and pay for the privilege

10 0.64052868 1074 andrew gelman stats-2011-12-20-Reading a research paper != agreeing with its claims

11 0.6397683 1525 andrew gelman stats-2012-10-08-Ethical standards in different data communities

12 0.63947725 1103 andrew gelman stats-2012-01-06-Unconvincing defense of the recent Russian elections, and a problem when an official organ of an academic society has low standards for publication

13 0.63398743 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers

14 0.63116413 2355 andrew gelman stats-2014-05-31-Jessica Tracy and Alec Beall (authors of the fertile-women-wear-pink study) comment on our Garden of Forking Paths paper, and I comment on their comments

15 0.62776703 1588 andrew gelman stats-2012-11-23-No one knows what it’s like to be the bad man

16 0.62626839 2352 andrew gelman stats-2014-05-29-When you believe in things that you don’t understand

17 0.62583369 989 andrew gelman stats-2011-11-03-This post does not mention Wegman

18 0.625512 2309 andrew gelman stats-2014-04-28-Crowdstorming a dataset

19 0.62427318 907 andrew gelman stats-2011-09-14-Reproducibility in Practice

20 0.62198442 677 andrew gelman stats-2011-04-24-My NOAA story


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.06), (16, 0.07), (21, 0.044), (24, 0.154), (27, 0.178), (63, 0.018), (73, 0.024), (76, 0.017), (86, 0.014), (87, 0.012), (95, 0.065), (99, 0.233)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95544964 802 andrew gelman stats-2011-07-13-Super Sam Fuld Needs Your Help (with Foul Ball stats)

Introduction: I was pleasantly surprised to have my recreational reading about baseball in the New Yorker interrupted by a digression on statistics. Sam Fuld of the Tampa Bay Rays was the subject of a Ben McGrath profile in the 4 July 2011 issue of the New Yorker, in an article titled Super Sam. After quoting a minor-league trainer who described Fuld as “a bit of a geek” (who isn’t these days?), McGrath gets into that lovely New Yorker detail: One could have pointed out the more persuasive and telling examples, such as the fact that in 2005, after his first pro season, with the Class-A Peoria Chiefs, Fuld applied for a fall internship with Stats, Inc., the research firm that supplies broadcasters with much of the data and analysis that you hear in sports telecasts. After a description of what they had him doing, reviewing footage of games and cataloguing, he said “I thought, They have a stat for everything, but they don’t have any stats regarding foul balls.” Fuld’s

2 0.9319551 134 andrew gelman stats-2010-07-08-“What do you think about curved lines connecting discrete data-points?”

Introduction: John Keltz writes: What do you think about curved lines connecting discrete data-points? (For example, here.) The problem with the smoothed graph is it seems to imply that something is going on in between the discrete data points, which is false. However, the straight-line version isn’t representing actual events either; it is just helping the eye connect each point. So maybe the curved version is also just helping the eye connect each point, and looks better doing it. In my own work (value-added modeling of achievement test scores) I use straight lines, but I guess I am not too bothered when people use smoothing. I’d appreciate your input. Regular readers will be unsurprised that, yes, I have an opinion on this one, and that this opinion is connected to some more general ideas about statistical graphics. In general I’m not a fan of the curved lines. They’re ok, but I don’t really see the point. I can connect the dots just fine without the curves. The more general id

3 0.92962945 930 andrew gelman stats-2011-09-28-Wiley Wegman chutzpah update: Now you too can buy a selection of garbled Wikipedia articles, for a mere $1400-$2800 per year!

Introduction: Someone passed on to me a message from his university library announcing that the journal “Wiley Interdisciplinary Reviews: Computational Statistics” is no longer free. Librarians have to decide what to do, so I thought I’d offer the following consumer guide:

Item | Wiley Computational Statistics journal | Wikipedia
Frequency | 6 issues per year | Continuously updated
Includes articles from Wikipedia? | Yes | Yes
Cites the Wikipedia sources it uses? | No | Yes
Edited by recipient of ASA Founders Award? | Yes | No
Articles are subject to rigorous review? | No | Yes
Errors, when discovered, get fixed? | No | Yes
Number of vertices in n-dimensional hypercube? | 2n | 2^n
Easy access to Brady Bunch trivia? | No | Yes
Cost (North America) | $1400-$2800 | $0
Cost (UK) | £986-£1972 | £0
Cost (Europe) | €1213-€2426 | €0

The choice seems pretty clear to me! It’s funny for the Wiley journal to start charging now

4 0.92734587 347 andrew gelman stats-2010-10-17-Getting arm and lme4 running on the Mac

Introduction: Our “arm” package in R requires Doug Bates’s “lme4” which fits multilevel models. lme4 is currently having some problems on the Mac. But installation on the Mac can be done; it just takes a bit of work. I have two sets of instructions below. From Yu-Sung: If you have the Mac OS DVD, you should install the Xcode developer packages from it. Otherwise, install them from here. After this, do the following in R: install.packages("lme4", type = "source") Then you will have lme4 in R and you can install arm without a problem. And, from David Ozonoff: I installed the lme4 package via the Package Installer but this didn’t work, of course. I then installed, via this link, gfortran which seemed to put the libraries in the right place (I had earlier installed via Fink the gcc42 compiler, so I’m not sure if this is required or not). I then ran, in R, this: install.packages(c("Matrix","lme4"), repos="http://R-Forge.R-project.org") This does not appear to work since it wi

5 0.92454457 343 andrew gelman stats-2010-10-15-?

Introduction: How am I supposed to handle this sort of thing? (See below.) I just stuck it in one of my email folders without responding, but then I wondered . . . what’s it all about? Is there some sort of Glengarry Glen Ross-like parallel world where down-on-their-luck Jack Lemmons of the public relations world send out electronic cold calls? More than anything else, this sort of thing makes me glad I have a steady job. Here’s the (unsolicited) email, which came with the subject line “Please help a reporter do his job”: Dear Andrew, As an Editor for the Bulldog Reporter (www.bulldogreporter.com/dailydog), a media relations trade publication, my job is to help ensure that my readers have accurate info about you and send you the best quality pitches. By taking five minutes or less to answer my questions (pasted below), you’ll receive targeted PR pitches from our client base that will match your beat and interests. Any help or direction is appreciated. Here are my questions. We have you listed

same-blog 6 0.92399776 1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing

7 0.918257 1472 andrew gelman stats-2012-08-28-Migrating from dot to underscore

8 0.91067517 465 andrew gelman stats-2010-12-13-$3M health care prediction challenge

9 0.90639573 173 andrew gelman stats-2010-07-31-Editing and clutch hitting

10 0.90157437 708 andrew gelman stats-2011-05-12-Improvement of 5 MPG: how many more auto deaths?

11 0.90131319 804 andrew gelman stats-2011-07-15-Static sensitivity analysis

12 0.89576423 1869 andrew gelman stats-2013-05-24-In which I side with Neyman over Fisher

13 0.8864392 652 andrew gelman stats-2011-04-07-Minor-league Stats Predict Major-league Performance, Sarah Palin, and Some Differences Between Baseball and Politics

14 0.88317513 66 andrew gelman stats-2010-06-03-How can news reporters avoid making mistakes when reporting on technical issues? Or, Data used to justify “Data Used to Justify Health Savings Can Be Shaky” can be shaky

15 0.8768115 2132 andrew gelman stats-2013-12-13-And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

16 0.86676657 1113 andrew gelman stats-2012-01-11-Toshiro Kageyama on professionalism

17 0.86632389 341 andrew gelman stats-2010-10-14-Confusion about continuous probability densities

18 0.86484635 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box

19 0.86130583 1982 andrew gelman stats-2013-08-15-Blaming scientific fraud on the Kuhnians

20 0.8611511 2319 andrew gelman stats-2014-05-05-Can we make better graphs of global temperature history?