andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1293 knowledge-graph by maker-knowledge-mining

1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon


meta info for this blog

Source: html

Introduction: Upon reading this, Susan remarked, “Don’t you think it’s interesting that a guy who promotes smoking has a last name of ‘Huff’? Reminds me of the Dennis/Dentist studies.” Good point. P.S. As discussed in the linked thread, the great statistician R. A. Fisher was notorious for minimizing the risks of smoking. How does this connect to Fisher’s name, one might ask?


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Upon reading this, Susan remarked, “Don’t you think it’s interesting that a guy who promotes smoking has a last name of ‘Huff’? [sent-1, score-1.257]

2 As discussed in the linked thread, the great statistician R. [sent-6, score-0.507]

3 Fisher was notorious for minimizing the risks of smoking. [sent-8, score-0.632]

4 How does this connect to Fisher’s name, one might ask? [sent-9, score-0.287]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('fisher', 0.397), ('promotes', 0.324), ('huff', 0.305), ('minimizing', 0.26), ('name', 0.26), ('remarked', 0.246), ('susan', 0.246), ('smoking', 0.226), ('risks', 0.203), ('thread', 0.202), ('connect', 0.197), ('notorious', 0.169), ('linked', 0.167), ('upon', 0.166), ('reminds', 0.159), ('guy', 0.131), ('statistician', 0.128), ('ask', 0.123), ('discussed', 0.12), ('reading', 0.105), ('last', 0.093), ('great', 0.092), ('interesting', 0.083), ('might', 0.058), ('good', 0.053), ('think', 0.035), ('one', 0.032)]
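The page does not state how the similarity values in the following list are derived from these word weights. A minimal sketch, assuming cosine similarity between sparse tfidf vectors (a common choice, not confirmed by the source): the function name `cosine_similarity` and the weights in `other_post` are hypothetical and only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {word: weight} dicts."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# A few of the weights listed above for this post.
this_post = {"fisher": 0.397, "promotes": 0.324, "huff": 0.305, "smoking": 0.226}
# Hypothetical weights for another post (made up for illustration).
other_post = {"huff": 0.4, "smoking": 0.5, "tobacco": 0.6}

print(cosine_similarity(this_post, this_post))   # a post compared with itself
print(cosine_similarity(this_post, other_post))  # partial vocabulary overlap
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the same-blog value listed, and posts sharing few weighted words score much lower.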

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon

2 0.25167108 1285 andrew gelman stats-2012-04-27-“How to Lie with Statistics” guy worked for the tobacco industry to mock studies of the risks of smoking statistics

Introduction: Remember How to Lie With Statistics? It turns out that the author worked for the cigarette companies. John Mashey points to this, from Robert Proctor’s book, “Golden Holocaust: Origins of the Cigarette Catastrophe and the Case for Abolition”: Darrell Huff, author of the wildly popular (and aptly named) How to Lie With Statistics, was paid to testify before Congress in the 1950s and then again in the 1960s, with the assigned task of ridiculing any notion of a cigarette-disease link. On March 22, 1965, Huff testified at hearings on cigarette labeling and advertising, accusing the recent Surgeon General’s report of myriad failures and “fallacies.” Huff peppered his attack with amusing asides and anecdotes, lampooning spurious correlations like that between the size of Dutch families and the number of storks nesting on rooftops–which proves not that storks bring babies but rather that people with large families tend to have larger houses (which therefore attract more storks).

3 0.15834315 1869 andrew gelman stats-2013-05-24-In which I side with Neyman over Fisher

Introduction: As a data analyst and a scientist, Fisher > Neyman, no question. But as a theorist, Fisher came up with ideas that worked just fine in his applications but can fall apart when people try to apply them too generally. Here’s an example that recently came up. Deborah Mayo pointed me to a comment by Stephen Senn on the so-called Fisher and Neyman null hypotheses. In an experiment with n participants (or, as we used to say, subjects or experimental units), the Fisher null hypothesis is that the treatment effect is exactly 0 for every one of the n units, while the Neyman null hypothesis is that the individual treatment effects can be negative or positive but have an average of zero. Senn explains why Neyman’s hypothesis in general makes no sense—the short story is that Fisher’s hypothesis seems relevant in some problems (sometimes we really are studying effects that are zero or close enough for all practical purposes), whereas Neyman’s hypothesis just seems weird (it’s implausible

4 0.1451188 2283 andrew gelman stats-2014-04-06-An old discussion of food deserts

Introduction: I happened to be reading an old comment thread from 2012 (follow the link from here) and came across this amusing exchange: Perhaps this is the paper Jonathan was talking about? Here’s more from the thread: Anyway, I don’t have anything to add right now, I just thought it was an interesting discussion.

5 0.13754715 1880 andrew gelman stats-2013-06-02-Flame bait

Introduction: Mark Palko asks what I think of this article by Francisco Louca, who writes about “‘hybridization’, a synthesis between Fisherian and Neyman-Pearsonian precepts, defined as a number of practical proceedings for statistical testing and inference that were developed notwithstanding the original authors, as an eventual convergence between what they considered to be radically irreconcilable.” To me, the statistical ideas in this paper are too old-fashioned. The issue is not that the Neyman-Pearson and Fisher approaches are “irreconcilable” but rather that neither does the job in the sort of hard problems that face statistical science today. I’m thinking of technically difficult models such as hierarchical Gaussian processes and also challenges that arise with small sample size and multiple testing. Neyman, Pearson, and Fisher all were brilliant, and they all developed statistical methods that remain useful today, but I think their foundations are out of date. Yes, we currently use m

6 0.12473273 2339 andrew gelman stats-2014-05-19-On deck this week

7 0.11471913 1249 andrew gelman stats-2012-04-06-Thinking seriously about social science research

8 0.11176335 349 andrew gelman stats-2010-10-18-Bike shelf

9 0.10298683 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91

10 0.095741324 1480 andrew gelman stats-2012-09-02-“If our product is harmful . . . we’ll stop making it.”

11 0.088970333 293 andrew gelman stats-2010-09-23-Lowess is great

12 0.08808668 507 andrew gelman stats-2011-01-07-Small world: MIT, asymptotic behavior of differential-difference equations, Susan Assmann, subgroup analysis, multilevel modeling

13 0.082178637 289 andrew gelman stats-2010-09-21-“How segregated is your city?”: A story of why every graph, no matter how clear it seems to be, needs a caption to anchor the reader in some numbers

14 0.081355564 2126 andrew gelman stats-2013-12-07-If I could’ve done it all over again

15 0.080652088 2160 andrew gelman stats-2014-01-06-Spam names

16 0.07471405 1534 andrew gelman stats-2012-10-15-The strange reappearance of Matthew Klam

17 0.073902547 2320 andrew gelman stats-2014-05-05-On deck this month

18 0.069557197 504 andrew gelman stats-2011-01-05-For those of you in the U.K., also an amusing paradox involving the infamous hookah story

19 0.069505006 1705 andrew gelman stats-2013-02-04-Recently in the sister blog

20 0.069467649 1583 andrew gelman stats-2012-11-19-I can’t read this interview with me


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.06), (1, -0.032), (2, -0.029), (3, 0.015), (4, -0.003), (5, -0.012), (6, 0.027), (7, 0.016), (8, 0.032), (9, -0.005), (10, -0.008), (11, 0.003), (12, 0.021), (13, 0.002), (14, 0.022), (15, -0.011), (16, -0.002), (17, -0.013), (18, 0.022), (19, -0.053), (20, -0.014), (21, -0.013), (22, 0.032), (23, 0.013), (24, 0.007), (25, -0.032), (26, -0.036), (27, 0.02), (28, -0.023), (29, 0.008), (30, 0.025), (31, 0.042), (32, 0.014), (33, 0.03), (34, -0.024), (35, -0.03), (36, 0.024), (37, -0.009), (38, -0.007), (39, 0.01), (40, -0.008), (41, 0.003), (42, 0.059), (43, 0.018), (44, 0.03), (45, 0.005), (46, 0.004), (47, -0.031), (48, 0.02), (49, 0.01)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94892401 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon

2 0.6793291 1780 andrew gelman stats-2013-03-28-Racism!

Introduction: I was reading a book of Alfred Kazin’s letters—I don’t know if they’d be so interesting to someone who hadn’t already read a bunch of his stuff, but I found them pretty interesting—and came across this amazing bit, dated August 11, 1957: No, really, Al. Tell us what you really feel. This was in his private diary, so I can’t really criticize him for it. And all of us have private thoughts, sometimes publicly expressed, that are unworthy of our better self. For example, once I was crossing a street and a taxi driver came dangerously close, and I screamed at him, “Go back to your own country, you #&@#%*^&.” So I’m not claiming that I’m any better than Kazin. I just thought that quote was pretty amazing. I guess that’s how (some) people thought, back in the fifties. Also interesting that he wrote “ass-hole” in that context. The hyphen surprised me, also I don’t think people would use that word in this way anymore. Nowadays I think of an asshole as a person, not a place.

3 0.62244087 430 andrew gelman stats-2010-11-25-The von Neumann paradox

Introduction: Like Steve Hsu, I too would love to read a definitive biography of John von Neumann (or, as we’d say in the U.S., “John Neumann”). I’ve read little things about him in various places such as Stanislaw Ulam’s classic autobiography, and two things I’ve repeatedly noticed are: 1. Neumann comes off as an obnoxious, self-satisfied jerk. He just seems like the kind of guy I wouldn’t like in real life. 2. All these great men seem to really have loved the guy. It’s hard for me to reconcile the two impressions above. Of course, lots of people have a good side and a bad side, but what’s striking here is that my impressions of Neumann’s bad side come from the very stories that his friends use to demonstrate how lovable he was! So, yes, I’d like to see the biography–but only if it could resolve this paradox. Also, I don’t know how relevant this is, but Neumann shares one thing with the more-lovable Ulam and the less-lovable Mandelbrot: all had Jewish backgrounds but didn’t seem to

4 0.61954635 1316 andrew gelman stats-2012-05-12-black and Black, white and White

Introduction: I’ve always thought it looked strange to see people referred to in print as Black or White rather than black or white. For example consider this sentence: “A black guy was walking down the street and he saw a bunch of white guys standing around.” That looks fine, whereas “A Black guy was walking down the street and he saw a bunch of White guys standing around”—that looks weird to me, as if the encounter was taking place in an Ethnic Studies seminar. But maybe I’m wrong on this. Jay Livingston argues that black and white are colors whereas Black and White are races (or, as I would prefer to say, ethnic categories) and illustrates with this picture of a white person and a White person: In conversation, I sometimes talk about pink people, brown people, and tan people, but that won’t work in a research paper. P.S. I suspect Carp will argue that I’m being naive: meanings of words change across contexts and over time. To which I reply: Sure, but I still have to choose h

5 0.61528188 1827 andrew gelman stats-2013-04-27-Continued fractions!!

Introduction: Upon reading this note by John Cook on continued fractions, I wrote: If you like continued fractions, I recommend you read the relevant parts of the classic Numerical Methods That Work. The details are probably obsolete but it’s fun reading (at least, if you think that sort of thing is fun to read). I then looked up Acton in Wikipedia and was surprised to see he’s still alive. And he wrote a second book (published at the age of 77!). I wonder if it’s any good. It’s sobering to read Numerical Methods That Work: it’s so wonderful and so readable, yet in this modern era there’s really not much reason to read it. Perhaps William Goldman (hey, I checked: he’s still alive too!) or some equivalent could prepare a 50-page “good parts” version that could still be useful as a basic textbook.

6 0.61097163 30 andrew gelman stats-2010-05-13-Trips to Cleveland

7 0.60753971 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91

8 0.60175228 52 andrew gelman stats-2010-05-26-Intellectual property

9 0.5990473 1354 andrew gelman stats-2012-05-30-“I didn’t marry a horn, I married a man”

10 0.59657836 886 andrew gelman stats-2011-09-02-The new Helen DeWitt novel

11 0.5944913 1505 andrew gelman stats-2012-09-20-“Joseph Anton”

12 0.58726776 1285 andrew gelman stats-2012-04-27-“How to Lie with Statistics” guy worked for the tobacco industry to mock studies of the risks of smoking statistics

13 0.58539659 1639 andrew gelman stats-2012-12-26-Impersonators

14 0.58443421 664 andrew gelman stats-2011-04-16-Dilbert update: cartooning can give you the strength to open jars with your bare hands

15 0.57794017 657 andrew gelman stats-2011-04-11-Note to Dilbert: The difference between Charlie Sheen and Superman is that the Man of Steel protected Lois Lane, he didn’t bruise her

16 0.577896 28 andrew gelman stats-2010-05-12-Alert: Incompetent colleague wastes time of hardworking Wolfram Research publicist

17 0.57668769 1977 andrew gelman stats-2013-08-11-Debutante Hill

18 0.57356513 1534 andrew gelman stats-2012-10-15-The strange reappearance of Matthew Klam

19 0.57163531 1442 andrew gelman stats-2012-08-03-Double standard? Plagiarizing journos get slammed, plagiarizing profs just shrug it off

20 0.57144147 824 andrew gelman stats-2011-07-26-Milo and Milo


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.048), (16, 0.153), (21, 0.038), (22, 0.042), (24, 0.21), (27, 0.073), (33, 0.06), (47, 0.057), (99, 0.155)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96958476 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon

2 0.87997532 1871 andrew gelman stats-2013-05-27-Annals of spam

Introduction: I received the following email, subject line “Want to Buy Text Link from andrewgelman.com”: Dear, I am Mary Taylor. I have started a link building campaign for my growing websites. For this, I need your cooperation. The campaign is quite diverse and large scale and if you take some time to understand it – it will benefit us. First I want to clarify that I do not want “blogroll” ”footer” or any other type of “site wide links”. Secondly I want links from inner pages of site – with good page rank of course. Third links should be within text so that Google may not mark them as spam – not for you and not for me. Hence this link building will cause almost no harm to your site or me. Because content links are fine with Google. Now I should come to the requirements. I will accept links from Page Rank 3 to as high as you have got. Also kindly note that I can buy 1 to 50 links from one site – so you should understand the scale of the project. If you have multiple sites with co

3 0.87039399 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

Introduction: I continue to be the go-to guy for bad graphs. Today (i.e., 22 Feb), I received an email from Gary Rosin: I [Rosin] thought you might be interested in this graph showing the decline in median prices of homes since 1997. It exaggerates the proportions by using $150,000 as the floor, rather than zero. Indeed. Here’s the graph: A line plot, rather than a bar plot, would be appropriate here. Also, it’s weird that the headline says “10 years” but the graph has only 6 years. Why not give some perspective and show, say, 30 years?

4 0.86723375 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations

Introduction: Vincent Yip writes: I have read your paper [with Kobi Abayomi and Marc Levy] regarding multiple imputation application. In order to diagnostic my imputed data, I used Kolmogorov-Smirnov (K-S) tests to compare the distribution differences between the imputed and observed values of a single attribute as mentioned in your paper. My question is: For example I have this attribute X with the following data: (NA = missing) Original dataset: 1, NA, 3, 4, 1, 5, NA Imputed dataset: 1, 2 , 3, 4, 1, 5, 6 a) in order to run the KS test, will I treat the observed data as 1, 3, 4,1, 5? b) and for the observed data, will I treat 1, 2 , 3, 4, 1, 5, 6 as the imputed dataset for the K-S test? or just 2 ,6? c) if I used m=5, I will have 5 set of imputed data sets. How would I apply K-S test to 5 of them and compare to the single observed distribution? Do I combine the 5 imputed data set into one by averaging each imputed values so I get one single imputed data and compare with the ob

5 0.86152017 177 andrew gelman stats-2010-08-02-Reintegrating rebels into civilian life: Quasi-experimental evidence from Burundi

Introduction: Michael Gilligan, Eric Mvukiyehe, and Cyrus Samii write: We [Gilligan, Mvukiyehe, and Samii] use original survey data, collected in Burundi in the summer of 2007, to show that a World Bank ex-combatant reintegration program implemented after Burundi’s civil war caused significant economic reintegration for its beneficiaries but that this economic reintegration did not translate into greater political and social reintegration. Previous studies of reintegration programs have found them to be ineffective, but these studies have suffered from selection bias: only ex-combatants who self selected into those programs were studied. We avoid such bias with a quasi-experimental research design made possible by an exogenous bureaucratic failure in the implementation of program. One of the World Bank’s implementing partners delayed implementation by almost a year due to an unforeseen contract dispute. As a result, roughly a third of ex-combatants had their program benefits withheld for reas

6 0.85791135 1080 andrew gelman stats-2011-12-24-Latest in blog advertising

7 0.85617715 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

8 0.85592139 548 andrew gelman stats-2011-02-01-What goes around . . .

9 0.85130084 1607 andrew gelman stats-2012-12-05-The p-value is not . . .

10 0.8509115 411 andrew gelman stats-2010-11-13-Ethical concerns in medical trials

11 0.85075098 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

12 0.85064101 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!

13 0.84847593 1155 andrew gelman stats-2012-02-05-What is a prior distribution?

14 0.84816587 2095 andrew gelman stats-2013-11-09-Typo in Ghitza and Gelman MRP paper

15 0.84622741 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

16 0.84581387 488 andrew gelman stats-2010-12-27-Graph of the year

17 0.84580618 2316 andrew gelman stats-2014-05-03-“The graph clearly shows that mammography adds virtually nothing to survival and if anything, decreases survival (and increases cost and provides unnecessary treatment)”

18 0.84420437 503 andrew gelman stats-2011-01-04-Clarity on my email policy

19 0.84213603 1082 andrew gelman stats-2011-12-25-Further evidence of a longstanding principle of statistics

20 0.84117925 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update