andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-2015 knowledge-graph by maker-knowledge-mining

2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study


meta infos for this blog

Source: html

Introduction: I’ve been following with mild interest the recent news stories on the lawbreaking at the Steven A. Cohen hedge fund, for the silly reason that I gave a paid lecture for them a few years ago. I wasn’t thinking too hard about whether they would be using my wonderful statistical ideas to be more effective at insider trading . . . Recently Paul Alper sent me an email pointing out that one of the lawbreakers involved is named Gilman—perhaps he’s related to me? Everyone is related to everyone else but I don’t know my relation to this particular guy. I actually have an aunt whose last name is Gilman. Here’s how it happened. A few years after my father was born (but before the birth of his sister), my grandfather changed his name from Gelman to Gilman. The story was that he was tired of people always calling him Gilman so he just changed his name. I’d call that a true commitment to the descriptive approach to linguistics. On the minus side, he gave my father’s older sister Luther


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I’ve been following with mild interest the recent news stories on the lawbreaking at the Steven A. [sent-1, score-0.109]

2 Cohen hedge fund, for the silly reason that I gave a paid lecture for them a few years ago. [sent-2, score-0.54]

3 I wasn’t thinking too hard about whether they would be using my wonderful statistical ideas to be more effective at insider trading . [sent-3, score-0.618]

4 Recently Paul Alper sent me an email pointing out that one of the lawbreakers involved is named Gilman—perhaps he’s related to me? [sent-6, score-0.275]

5 Everyone is related to everyone else but I don’t know my relation to this particular guy. [sent-7, score-0.315]

6 I actually have an aunt whose last name is Gilman. [sent-8, score-0.313]

7 A few years after my father was born (but before the birth of his sister), my grandfather changed his name from Gelman to Gilman. [sent-10, score-0.862]

8 The story was that he was tired of people always calling him Gilman so he just changed his name. [sent-11, score-0.419]

9 I’d call that a true commitment to the descriptive approach to linguistics. [sent-12, score-0.2]

10 On the minus side, he gave my father’s older sister Lutheria, middle name Burbank (after the famous agriculturalist). [sent-13, score-0.737]

11 Getting back to the hedge fund scandal, Alper writes: The article is fascinating on many levels: clinical trials, insider trading, failings of the medical profession, pernicious influence of money, need for recognition, mistakes of the elderly. [sent-15, score-1.37]

12 Murray Gell-Mann is probably getting $258,000 a year too, but I think he’s worth it. [sent-17, score-0.09]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('gilman', 0.327), ('hedge', 0.245), ('alper', 0.245), ('insider', 0.239), ('trading', 0.219), ('father', 0.213), ('fund', 0.208), ('name', 0.179), ('sister', 0.164), ('failings', 0.149), ('grandfather', 0.14), ('changed', 0.138), ('aunt', 0.134), ('pernicious', 0.134), ('gave', 0.124), ('everyone', 0.12), ('scandal', 0.115), ('related', 0.112), ('mild', 0.109), ('commitment', 0.109), ('tired', 0.106), ('minus', 0.106), ('murray', 0.104), ('cohen', 0.103), ('profession', 0.101), ('birth', 0.099), ('calling', 0.098), ('lecture', 0.098), ('recognition', 0.095), ('born', 0.093), ('fascinating', 0.093), ('descriptive', 0.091), ('older', 0.09), ('getting', 0.09), ('trials', 0.087), ('wonderful', 0.087), ('clinical', 0.084), ('relation', 0.083), ('pointing', 0.082), ('named', 0.081), ('steven', 0.078), ('mistakes', 0.077), ('always', 0.077), ('middle', 0.074), ('paid', 0.073), ('effective', 0.073), ('paul', 0.072), ('influence', 0.072), ('levels', 0.071), ('medical', 0.069)]
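The page does not say how sentScore is derived, but the listed scores are consistent with a simple rule: split each sentence on whitespace and sum the weights of the tokens that exactly match a word in the topN list (case-sensitive, punctuation left attached, so "fund," and "Gilman." match nothing). The following is a minimal sketch under that assumption, not the generator's actual code; `sentence_score` and `WEIGHTS` are hypothetical names, and only a subset of the weights is copied in.

```python
# Subset of the topN word weights listed above for this post.
WEIGHTS = {
    "hedge": 0.245, "fund": 0.208, "gave": 0.124, "paid": 0.073,
    "lecture": 0.098, "aunt": 0.134, "name": 0.179,
    "everyone": 0.12, "related": 0.112, "relation": 0.083,
}


def sentence_score(sentence, weights=WEIGHTS):
    """Sum the weights of whitespace tokens that exactly match a listed word.

    Assumption: no lowercasing and no punctuation stripping, which is what
    the listed scores suggest (e.g. "fund," contributes nothing).
    """
    return sum(weights.get(token, 0.0) for token in sentence.split())


s2 = ("Cohen hedge fund, for the silly reason that I gave a paid lecture "
      "for them a few years ago.")
s6 = "I actually have an aunt whose last name is Gilman."

print(round(sentence_score(s2), 3))  # 0.54, matching sent-2
print(round(sentence_score(s6), 3))  # 0.313, matching sent-8
```

Under this rule, capitalized or punctuated tokens such as "Steven", "Alper" and "trading," are ignored, which would explain why sentence 1 scores only 0.109 (the weight of "mild").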

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study


2 0.16514486 1003 andrew gelman stats-2011-11-11-$

Introduction: Felix Salmon relates the story of an economics Nobel Prize winner getting paid by a hedge fund. It would all seem pretty silly—sort of like Coca-Cola featuring Michael Jordan in their ads—except that hedge funds are disreputable nowadays and so it seems vaguely sleazy for a scholar to trade on his academic reputation to make free money in this way. It falls roughly in the same category as that notorious b-school prof in Inside Job who got $125K for writing a b.s. report about the financial stability of Iceland—and then, when they came back to him later and asked how he could’ve written it, he basically said: Hey, I don’t know anything about Iceland, I was just taking their money! That said, if a hedge fund offered me $125K to sit on their board, I’d probably take it! It’s hard to turn down free money. Or maybe not, I don’t really know. So far, when companies have paid me $, it’s been to do something for them, to consult or give a short course. I’d like to think that if

3 0.1521531 905 andrew gelman stats-2011-09-14-5 books on essentialism!

Introduction: At the sister blog .

4 0.12499882 1937 andrew gelman stats-2013-07-13-Meritocracy rerun

Introduction: I’ve said it here so often, this time I put it on the sister blog. . . .

5 0.12134027 2228 andrew gelman stats-2014-02-28-Combining two of my interests

Introduction: Paul Alper writes: Hi Andrew (or Andy or even Gelman [17 of them]): Go to this link and have some fun with (useless? powerful?) data mining. As the authors say, it is addictive. Paul (no other way to spell it) Alper [215 of us] I’m reminded of this discussion from 2012, “Michael’s a Republican, Susan’s a Democrat.” As I wrote at the time: It’s no surprise that men give more to Republicans and women to Democrats, or that the average contribution to a Republican has a larger dollar value than the average contribution to a Democrat, nor perhaps should we be surprised that “Tom” splits his support between the two parties while “Thomas” is a strong Republican. Still, it’s fun to see the data. Overall, I think this graph understates contributions to Republicans because it doesn’t include those new super-pacs. But the new tool seems to be based on a different dataset, opinion polls rather than campaign contributions. Playing around a bit, I see a lot less variability

6 0.10505104 1819 andrew gelman stats-2013-04-23-Charles Murray’s “Coming Apart” and the measurement of social and political divisions

7 0.098053128 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91

8 0.093300506 1629 andrew gelman stats-2012-12-18-It happened in Connecticut

9 0.09269876 1169 andrew gelman stats-2012-02-15-Charles Murray on the new upper class

10 0.089367792 2160 andrew gelman stats-2014-01-06-Spam names

11 0.087499805 900 andrew gelman stats-2011-09-11-Symptomatic innumeracy

12 0.086297862 411 andrew gelman stats-2010-11-13-Ethical concerns in medical trials

13 0.079659536 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

14 0.07444559 1534 andrew gelman stats-2012-10-15-The strange reappearance of Matthew Klam

15 0.07429558 1836 andrew gelman stats-2013-05-02-Culture clash

16 0.073867656 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos

17 0.073726028 1832 andrew gelman stats-2013-04-29-The blogroll

18 0.072050385 2280 andrew gelman stats-2014-04-03-As the boldest experiment in journalism history, you admit you made a mistake

19 0.07155975 525 andrew gelman stats-2011-01-19-Thiel update

20 0.071241856 2138 andrew gelman stats-2013-12-18-In Memoriam Dennis Lindley


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.12), (1, -0.069), (2, -0.023), (3, 0.025), (4, -0.013), (5, -0.013), (6, 0.035), (7, -0.011), (8, 0.025), (9, 0.018), (10, -0.012), (11, -0.018), (12, 0.023), (13, 0.026), (14, -0.013), (15, 0.023), (16, -0.008), (17, -0.044), (18, 0.016), (19, 0.039), (20, 0.015), (21, -0.017), (22, -0.041), (23, 0.012), (24, 0.07), (25, -0.039), (26, -0.036), (27, -0.023), (28, 0.013), (29, -0.021), (30, -0.002), (31, 0.055), (32, -0.019), (33, 0.067), (34, -0.001), (35, -0.032), (36, 0.031), (37, -0.011), (38, -0.041), (39, 0.044), (40, 0.01), (41, -0.042), (42, -0.002), (43, 0.032), (44, -0.01), (45, 0.017), (46, 0.008), (47, -0.019), (48, 0.066), (49, -0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95234841 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study


2 0.70088601 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve

Introduction: I was at a work-related event today [actually, last month; these non-topical blog entries are on approximately one-month delay], but not connected to the statistics or political science departments. There were a few people there I knew well, and they were introducing me to others. Then at some point when I was talking with one of the more important people in the room, a sixtyish guy comes by and stands next to us. I put out my hand and introduce myself. He looks at me in puzzlement, spits out his first name, and without a pause starts talking to the person I’d been speaking with. After about a minute of talk, he walks away, and the important person and I continued our conversation. No big deal . . . but, I have to admit, I haven’t had that experience very often recently. I’m often at events where I know everyone (or almost everyone) and they know me, and I’m also often at events where I know very few people and have to introduce myself. But it’s rare to be somewhere where I’m

3 0.69352287 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves

Introduction: I received the following two emails within fifteen minutes of each other. First, from “Alexa Russell,” subject line “An idea for a blog post: The Role, Importance, and Power of Words”: Hi Andrew, I’m a researcher/writer for a resource covering the importance of English proficiency in today’s workplace. I came across your blog andrewgelman.com as I was conducting research and I’m interested in contributing an article to your blog because I found the topics you cover very engaging. I’m thinking about writing an article that looks at how the Internet has changed the way English is used today; not only has its syntax changed as a result of the Internet Revolution, but the amount of job opportunities has also shifted as a result of this shift. I’d be happy to work with you on the topic if you have any insights. Thanks, and I look forward to hearing from you soon. Best, Alexa Second, From “Maricel Anderson,” subject line “An idea for a blog post: Healthcare Management and Geri

4 0.67936003 321 andrew gelman stats-2010-10-05-Racism!

Introduction: Last night I spoke at the Columbia Club of New York, along with some of my political science colleagues, in a panel about politics, the economy, and the forthcoming election. The discussion was fine . . . until one guy in the audience accused us of bias based on what he imputed as our ethnicity. One of the panelists replied by asking the questioner what of all the things we had said was biased, and the questioner couldn’t actually supply any examples. It makes sense that the questioner couldn’t come up with a single example of bias on our part, considering that we were actually presenting facts . At some level, the questioner’s imputation of our ethnicity and accusation of bias isn’t so horrible. When talking with my friends, I engage in casual ethnic stereotyping all the time–hey, it’s a free country!–and one can certainly make the statistical argument that you can guess people’s ethnicities from their names, appearance, and speech patterns, and in turn you can infer a lot

5 0.67079037 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91

Introduction: Obit here . I think I have a cousin with the same last name as this guy, so maybe we’re related by marriage in some way. (By that standard we’re also related to Marge Simpson and, I seem to recall, the guy who wrote the scripts for Dark Shadows.)

6 0.66766602 805 andrew gelman stats-2011-07-16-Hey–here’s what you missed in the past 30 days!

7 0.65960151 720 andrew gelman stats-2011-05-20-Baby name wizards

8 0.6514892 458 andrew gelman stats-2010-12-08-Blogging: Is it “fair use”?

9 0.65029943 2126 andrew gelman stats-2013-12-07-If I could’ve done it all over again

10 0.64669472 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

11 0.63972336 1177 andrew gelman stats-2012-02-20-Joshua Clover update

12 0.63408375 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?

13 0.63114953 1600 andrew gelman stats-2012-12-01-$241,364.83 – $13,000 = $228,364.83

14 0.63100553 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

15 0.62607163 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?

16 0.62567782 912 andrew gelman stats-2011-09-15-n = 2

17 0.62356472 1937 andrew gelman stats-2013-07-13-Meritocracy rerun

18 0.60713148 1432 andrew gelman stats-2012-07-27-“Get off my lawn”-blogging

19 0.60426068 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

20 0.60414624 698 andrew gelman stats-2011-05-05-Shocking but not surprising


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.039), (16, 0.097), (18, 0.013), (24, 0.19), (30, 0.034), (40, 0.012), (42, 0.123), (57, 0.064), (59, 0.029), (63, 0.013), (75, 0.029), (82, 0.023), (86, 0.013), (98, 0.011), (99, 0.218)]
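The simValue column is not defined on this page. A common way to compare sparse topic vectors like the one above is cosine similarity, so the sketch below shows that computation as an assumption, not as the generator's confirmed method. The listed same-blog values in the lsi and lda lists are below 1.0, so the actual pipeline evidently differs in some detail from a plain self-comparison. The `cosine` function and the second vector are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as (id, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only ids present in both vectors contribute to the dot product.
    dot = sum(w * db[i] for i, w in da.items() if i in db)
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# LDA topic weights for this post, copied from the list above.
this_post = [(5, 0.039), (16, 0.097), (18, 0.013), (24, 0.19), (30, 0.034),
             (40, 0.012), (42, 0.123), (57, 0.064), (59, 0.029), (63, 0.013),
             (75, 0.029), (82, 0.023), (86, 0.013), (98, 0.011), (99, 0.218)]

# Hypothetical second post, for illustration only.
other_post = [(24, 0.30), (42, 0.10), (99, 0.25), (7, 0.05)]

print(cosine(this_post, other_post))
```

Representing each vector as a dictionary keeps the computation proportional to the number of non-zero topics, which suits the short topic lists shown here.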

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94813699 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study


2 0.93944669 1002 andrew gelman stats-2011-11-10-“Venetia Orcutt, GWU med school professor, quits after complaints of no-show class”

Introduction: She was assigned to teach a class in “evidence-based medicine”! ( link from my usual news source). I wonder what was in the syllabus? If anyone has a copy, feel free to send to me and I will post it here. My favorite part of the story, though, is this: Almost all physician assistant students refused to comment to a reporter Tuesday, saying they’d been told by the department not to talk to media. Talk about obedience to authority! They’re studying in a program that offers nonexistent courses, but then they follow the department’s gag order.

3 0.93078345 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

Introduction: Dan Goldstein asks what I think of this : My reply: It’s hard for me to imagine a compelling reason for anyone to go to London, Ontario–but, hey, I guess there’s all kinds of people in this world! More seriously, I see the appeal of the graph but it’s a bit busy for my taste. Over the years I’ve moved toward small multiples rather than single busy graphs. That’s one reason why I prefer Tufte’s second book to his first book. The Napoleon-in-Russia graph is a bad model, in that it inspires people to try to cram lots of variables on a single graph. Dan wrote back: I [Dan] like it as a travel planning graph, it gives you what you want to know (how hot will the days be, how cold will the nights be, will it rain) but is a bit easier on the brain than a table of highs and lows. Also makes it easy to see the trend. I agree the 2nd axis doesn’t help.

4 0.9281022 808 andrew gelman stats-2011-07-18-The estimated effect size is implausibly large. Under what models is this a piece of evidence that the true effect is small?

Introduction: Paul Pudaite writes in response to my discussion with Bartels regarding effect sizes and measurement error models: You [Gelman] wrote: “I actually think there will be some (non-Gaussian) models for which, as y gets larger, E(x|y) can actually go back toward zero.” I [Pudaite] encountered this phenomenon some time in the ’90s. See this graph which shows the conditional expectation of X given Z, when Z = X + Y and the probability density functions of X and Y are, respectively, exp(-x^2) and 1/(y^2+1) (times appropriate constants). As the magnitude of Z increases, E[X|Z] shrinks to zero. I wasn’t sure it was worth the effort to try to publish a two paragraph paper. I suspect that this is true whenever the tail of one distribution is ‘sufficiently heavy’ with respect to the tail of the other. Hmm, I suppose there might be enough substance in a paper that attempted to characterize this outcome for, say, unimodal symmetric distributions. Maybe someone can do this? I think i

5 0.92644095 1535 andrew gelman stats-2012-10-16-Bayesian analogue to stepwise regression?

Introduction: Bill Harris writes: On pp. 250-251 of BDA second edition, you write about multiple comparisons, and you write about stepwise regression on p. 405. How would you look at stepwise regression analyses in light of the multiple comparisons problem? Is there an issue? My reply: In this case I think the right approach is to keep all the coefs but partially pool them toward 0 (after suitable transformation). But then the challenge is coming up with a general way to construct good prior distributions. I’m still thinking about that one! Yet another approach is to put something together purely nonparametrically as with Bart.

6 0.89037299 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science

7 0.88976288 1936 andrew gelman stats-2013-07-13-Economic policy does not occur in a political vacuum

8 0.88881373 492 andrew gelman stats-2010-12-30-That puzzle-solving feeling

9 0.88700855 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!

10 0.88692379 2040 andrew gelman stats-2013-09-26-Difficulties in making inferences about scientific truth from distributions of published p-values

11 0.88365185 2323 andrew gelman stats-2014-05-07-Cause he thinks he’s so-phisticated

12 0.88326222 1881 andrew gelman stats-2013-06-03-Boot

13 0.88293135 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

14 0.88198137 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations

15 0.88169688 60 andrew gelman stats-2010-05-30-What Auteur Theory and Freshwater Economics have in common

16 0.87818348 1223 andrew gelman stats-2012-03-20-A kaleidoscope of responses to Dubner’s criticisms of our criticisms of Freakonomics

17 0.87719768 1713 andrew gelman stats-2013-02-08-P-values and statistical practice

18 0.8768661 1080 andrew gelman stats-2011-12-24-Latest in blog advertising

19 0.8764233 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe

20 0.87549073 1155 andrew gelman stats-2012-02-05-What is a prior distribution?