andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1232 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: The list includes “hunting” but not “fishing,” so that’s cool. I wonder how they’d feel about a question involving different cuts of meat. In any case, I’m happy to see that “Bayes” is not on the banned list. P.S. Russell explains.
sentIndex sentText sentNum sentScore
1 The list includes “hunting” but not “fishing,” so that’s cool. [sent-1, score-0.404]
2 I wonder how they’d feel about a question involving different cuts of meat. [sent-2, score-1.084]
3 In any case, I’m happy to see that “Bayes” is not on the banned list. [sent-3, score-0.653]
wordName wordTfidf (topN-words)
[('banned', 0.41), ('hunting', 0.384), ('cuts', 0.34), ('fishing', 0.334), ('russell', 0.321), ('explains', 0.247), ('involving', 0.235), ('includes', 0.225), ('bayes', 0.2), ('happy', 0.183), ('list', 0.179), ('wonder', 0.17), ('feel', 0.141), ('question', 0.108), ('case', 0.098), ('different', 0.09), ('see', 0.06)]
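The word list above appears to be ranked by tf-idf weight, heaviest first. As a minimal sketch of how such a "topN-words" selection could be made from (word, weight) pairs: the pairs below are copied from the listing, while the helper name `top_n_words` is a hypothetical introduced here for illustration and is not taken from the tool that generated this page.

```python
# (word, tf-idf weight) pairs copied from the listing above.
word_tfidf = [
    ('banned', 0.41), ('hunting', 0.384), ('cuts', 0.34), ('fishing', 0.334),
    ('russell', 0.321), ('explains', 0.247), ('involving', 0.235),
    ('includes', 0.225), ('bayes', 0.2), ('happy', 0.183), ('list', 0.179),
    ('wonder', 0.17), ('feel', 0.141), ('question', 0.108), ('case', 0.098),
    ('different', 0.09), ('see', 0.06),
]


def top_n_words(pairs, n):
    """Return the n highest-weighted (word, weight) pairs, heaviest first.

    Hypothetical helper for illustration only.
    """
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


print(top_n_words(word_tfidf, 3))
# → [('banned', 0.41), ('hunting', 0.384), ('cuts', 0.34)]
```

Sorting explicitly, rather than trusting the input order, keeps the sketch correct even if the pairs arrive unsorted.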
simIndex simValue blogId blogTitle
same-blog 1 1.0 1232 andrew gelman stats-2012-03-27-Banned in NYC school tests
2 0.11159396 1067 andrew gelman stats-2011-12-18-Christopher Hitchens was a Bayesian
Introduction: 1. We Bayesian statisticians like to say there are three kinds of statisticians: a. Bayesians; b. People who are Bayesians but don’t realize it (that is, they act in coherence with some unstated probability); c. Failed Bayesians (that is, people whose inference could be improved by some attention to coherence). So, if a statistician does great work, we are inclined to claim this person for the Bayesian cause, even if he or she vehemently denies any Bayesian leanings. 2. In his autobiography, Bertrand Russell tells the story of when he went to prison for opposing World War 1: I [Russell] was much cheered on my arrival by the warden at the gate, who had to take particulars about me. He asked my religion, and I replied ‘agnostic.’ He asked how to spell it, and remarked with a sigh: “Well, there are many religions, but I suppose they all worship the same God.” This remark kept me cheerful for about a week. 3. In an op-ed today, Ross Douthat argues that celebrated a
3 0.08637692 3 andrew gelman stats-2010-04-26-Bayes in the news…in a somewhat frustrating way
Introduction: I’m not sure how the New York Times defines a blog versus an article, so perhaps this post should be called “Bayes in the blogs.” Whatever. A recent NY Times article/blog post discusses a classic Bayes’ Theorem application — probability that the patient has cancer, given a “positive” mammogram — and purports to give a solution that is easy for students to understand because it doesn’t require Bayes’ Theorem, which is of course complicated and confusing. You can see my comment (#17) here.
4 0.080120459 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again
Introduction: Someone wrote in: We are about to conduct a voting list experiment. We came across your comment recommending that each item be removed from the list. Would greatly appreciate it if you take a few minutes to spell out your recommendation in a little more detail. In particular: (a) Why are you “uneasy” about list experiments? What would strengthen your confidence in list experiments? (b) What do you mean by “each item be removed”? As you know, there are several non-sensitive items and one sensitive item in a list experiment. Do you mean that the non-sensitive items should be removed one-by-one for the control group or are you suggesting a multiple arm design in which each arm of the experiment has one non-sensitive item removed. What would be achieved by this design? I replied: I’ve always been a bit skeptical about list experiments, partly because I worry that the absolute number of items on the list could itself affect the response. For example, someone might not want to che
5 0.079337284 577 andrew gelman stats-2011-02-16-Annals of really really stupid spam
Introduction: This came in the inbox today: Dear Dr. Gelman, GenWay recently found your article titled “Multiple imputation for model checking: completed-data plots with missing and latent data.” (Biometrics. 2005 Mar;61(1):74-85.) and thought you might be interested in learning about our superior quality signaling proteins. GenWay prides itself on being a leader in customer service aiming to exceed your expectations with the quality and price of our products. With more than 60,000 reagents backed by our outstanding guarantee you are sure to find the products you have been searching for. Please feel free to visit the following resource pages: * Apoptosis Pathway (product list) * Adipocytokine (product list) * Cell Cycle Pathway (product list) * Jak STAT (product list) * GnRH (product list) * MAPK (product list) * mTOR (product list) * T Cell Receptor (product list) * TGF-beta (product list) * Wnt (product list) * View All Pathways
7 0.060009483 90 andrew gelman stats-2010-06-16-Oil spill and corn production
8 0.058252167 1382 andrew gelman stats-2012-06-17-How to make a good fig?
10 0.057898056 517 andrew gelman stats-2011-01-14-Bayes in China update
11 0.057758175 698 andrew gelman stats-2011-05-05-Shocking but not surprising
12 0.055600904 688 andrew gelman stats-2011-04-30-Why it’s so relaxing to think about social issues
13 0.055043057 1146 andrew gelman stats-2012-01-30-Convenient page of data sources from the Washington Post
14 0.05483833 219 andrew gelman stats-2010-08-20-Some things are just really hard to believe: more on choosing your facts.
15 0.053484634 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys
16 0.052933302 1951 andrew gelman stats-2013-07-22-Top 5 stat papers since 2000?
17 0.052758239 2108 andrew gelman stats-2013-11-20-That’s crazy talk!
18 0.051480737 2052 andrew gelman stats-2013-10-05-Give me a ticket for an aeroplane
19 0.05100663 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?
20 0.050624929 2140 andrew gelman stats-2013-12-19-Revised evidence for statistical standards
topicId topicWeight
[(0, 0.052), (1, 0.005), (2, -0.002), (3, 0.002), (4, 0.003), (5, 0.012), (6, 0.01), (7, -0.008), (8, 0.012), (9, -0.016), (10, -0.018), (11, -0.007), (12, 0.007), (13, 0.012), (14, 0.034), (15, 0.007), (16, -0.009), (17, 0.011), (18, 0.0), (19, 0.008), (20, 0.006), (21, -0.015), (22, 0.009), (23, -0.014), (24, 0.017), (25, -0.008), (26, -0.002), (27, 0.001), (28, -0.029), (29, 0.011), (30, 0.004), (31, 0.013), (32, 0.001), (33, 0.015), (34, -0.009), (35, -0.02), (36, 0.015), (37, 0.005), (38, -0.025), (39, 0.019), (40, -0.006), (41, -0.016), (42, 0.035), (43, 0.019), (44, -0.002), (45, 0.009), (46, 0.002), (47, -0.04), (48, -0.012), (49, 0.001)]
simIndex simValue blogId blogTitle
same-blog 1 0.94881612 1232 andrew gelman stats-2012-03-27-Banned in NYC school tests
2 0.62799072 560 andrew gelman stats-2011-02-06-Education and Poverty
Introduction: Jonathan Livengood writes: There has been some discussion about the recent PISA results (in which the U.S. comes out pretty badly), for example here and here. The claim being made is that the poor U.S. scores are due to rampant individual- or family-level poverty in the U.S. They claim that when one controls for poverty, the U.S. comes out on top in the PISA standings, and then they infer that poverty causes poor test scores. The further inference is then that the U.S. could improve education by the “simple” action of reducing poverty. Anyway, I was wondering what you thought about their analysis. My reply: I agree this is interesting and I agree it’s hard to know exactly what to say about these comparisons. When I’m stuck in this sort of question, I ask, WWJD? In this case, I think Jennifer would ask what are the potential interventions being considered. Various ideas for changing the school system would perhaps have different effects on different groups of students.
Introduction: Peter Bergman writes: is it possible to “overstratify” when assigning a treatment in a randomized control trial? I [Bergman] have a sample size of roughly 400 people, and several binary variables correlate strongly with the outcome of interest and would also define interesting subgroups for analysis. The problem is, stratifying over all of these (five or six) variables leaves me with strata that have only 1 person in them. I have done some background reading on whether there is a rule of thumb for the maximum number of variables to stratify. There does not seem to be much agreement (some say there should be between N/50-N/100 strata, others say as few as possible). In economics, the paper I looked to is here, which seems to summarize literature related to clinical trials. In short, my question is: is it bad to have several strata with 1 person in them? Should I group these people in with another stratum? P.S. In the paper I mention above, they also say it is important to inc
4 0.57579315 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon
Introduction: Upon reading this, Susan remarked, “Don’t you think it’s interesting that a guy who promotes smoking has a last name of ‘Huff’? Reminds me of the Dennis/Dentist studies.” Good point. P.S. As discussed in the linked thread, the great statistician R. A. Fisher was notorious for minimizing the risks of smoking. How does this connect to Fisher’s name, one might ask?
5 0.57447678 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)
Introduction: Louis Mittel writes: The premise of the column this guy is starting is interesting: Noah Davis interviews a smart person and then interviews the smartest person that smart person knows and so on. It reminded me of you mentioning survey design strategy of asking people about other people, like “How many people do you know named Stuart?” or “How many people do you know that have had an abortion?” Ignoring the interview aspect of what this guy is doing, I think there’s some cool questions about the distribution/path behavior of smartest-person-I-know chains (say, seeded at random). Do they loop? If so, how long do they run before looping, how large are the loops? What parts of the population do they explore? Do you know of anything that’s been done on something like this? My reply: Interesting question. It could be asked of any referral chain, for example asking a sequence of people, “Who’s the tallest person you know?” or “Who’s the best piano player you know” or “Who’
8 0.55935061 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91
10 0.53759491 1316 andrew gelman stats-2012-05-12-black and Black, white and White
11 0.53723276 1296 andrew gelman stats-2012-05-03-Google Translate for code, and an R help-list bot
12 0.53349781 1411 andrew gelman stats-2012-07-10-Defining ourselves arbitrarily
13 0.53105801 577 andrew gelman stats-2011-02-16-Annals of really really stupid spam
14 0.52806032 1410 andrew gelman stats-2012-07-09-Experimental work on market-based or non-market-based incentives
15 0.52733696 2296 andrew gelman stats-2014-04-19-Index or indicator variables
16 0.52681118 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study
17 0.52547342 1241 andrew gelman stats-2012-04-02-Fixed effects and identification
18 0.52186519 1780 andrew gelman stats-2013-03-28-Racism!
19 0.52029252 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression
20 0.51999205 1457 andrew gelman stats-2012-08-13-Retro ethnic slurs
topicId topicWeight
[(16, 0.099), (21, 0.611), (99, 0.085)]
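The page does not say how simValue is derived from these sparse topic weights. A minimal sketch, assuming cosine similarity between topic-weight vectors (a common choice, but an assumption here): `cosine_similarity` and `other_post` are hypothetical names, and `other_post` holds invented values purely for illustration. The sketch does not reproduce the page's numbers. For instance, the same-blog entry below is listed at about 0.94, whereas a vector compared with itself under cosine similarity gives 1.0.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as {topicId: weight}."""
    dot = sum(weight * b.get(topic, 0.0) for topic, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Topic weights for this post, copied from the listing above.
this_post = dict([(16, 0.099), (21, 0.611), (99, 0.085)])

# Hypothetical second post: values invented for illustration only.
other_post = {21: 0.5, 40: 0.3}

print(cosine_similarity(this_post, this_post))   # close to 1.0
print(cosine_similarity(this_post, other_post))  # between 0 and 1
```

Storing each post as a dictionary keyed by topicId mirrors the sparse (topicId, weight) pairs in the listing, so topics absent from a post contribute nothing to the dot product.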
simIndex simValue blogId blogTitle
same-blog 1 0.94353372 1232 andrew gelman stats-2012-03-27-Banned in NYC school tests
2 0.79266983 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs
Introduction: By popular demand, here’s my R script for the time-use graphs:
# The data
a1 <- c(4.2,3.2,11.1,1.3,2.2,2.0)
a2 <- c(3.9,3.2,10.0,0.8,3.1,3.1)
a3 <- c(6.3,2.5,9.8,0.9,2.2,2.4)
a4 <- c(4.4,3.1,9.8,0.8,3.3,2.7)
a5 <- c(4.8,3.0,9.9,0.7,3.3,2.4)
a6 <- c(4.0,3.4,10.5,0.7,3.3,2.1)
a <- rbind(a1,a2,a3,a4,a5,a6)
avg <- colMeans (a)
avg.array <- t (array (avg, rev(dim(a))))
diff <- a - avg.array
country.name <- c("France", "Germany", "Japan", "Britain", "USA", "Turkey")
# The line plots
par (mfrow=c(2,3), mar=c(4,4,2,.5), mgp=c(2,.7,0), tck=-.02, oma=c(3,0,4,0), bg="gray96", fg="gray30")
for (i in 1:6){
  plot (c(1,6), c(-1,1.7), xlab="", ylab="", xaxt="n", yaxt="n", bty="l", type="n")
  lines (1:6, diff[i,], col="blue")
  points (1:6, diff[i,], pch=19, col="black")
  if (i>3){
    axis (1, c(1,3,5), c ("Work,\nstudy", "Eat,\nsleep", "Leisure"), mgp=c(2,1.5,0), tck=0, cex.axis=1.2)
    axis (1, c(2,4,6), c ("Unpaid\nwork", "Personal\nCare", "Other"), mgp=c(2,1.5,0),
3 0.7332865 2298 andrew gelman stats-2014-04-21-On deck this week
Introduction:
Mon: Ticket to Baaaath
Tues: Ticket to Baaaaarf
Wed: Thinking of doing a list experiment? Here’s a list of reasons why you should think again
Thurs: An open site for researchers to post and share papers
Fri: Questions about “Too Good to Be True”
Sat: Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu
Sun: White stripes and dead armadillos
4 0.7173965 2272 andrew gelman stats-2014-03-29-I agree with this comment
Introduction: The anonymous commenter puts it well: The problem is simple, the researchers are disproving always false null hypotheses and taking this disproof as near proof that their theory is correct.
5 0.69154036 227 andrew gelman stats-2010-08-23-Visualization magazine
Introduction: Aleks pointed me to this.
6 0.67536867 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings
7 0.59124959 1857 andrew gelman stats-2013-05-15-Does quantum uncertainty have a place in everyday applied statistics?
8 0.5853098 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough
9 0.58514726 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again
10 0.58072484 432 andrew gelman stats-2010-11-27-Neumann update
11 0.56641436 854 andrew gelman stats-2011-08-15-A silly paper that tries to make fun of multilevel models
12 0.56359792 1049 andrew gelman stats-2011-12-09-Today in the sister blog
15 0.546395 62 andrew gelman stats-2010-06-01-Two Postdoc Positions Available on Bayesian Hierarchical Modeling
16 0.52681726 1401 andrew gelman stats-2012-06-30-David Hogg on statistics
17 0.50955027 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
20 0.48078525 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?