andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-2134 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Aaron Edlin writes: This story is interesting in its own right . . . I have a question so I thought I would ask a Bayesian statistician. One fact I learned on reading this article is that Oswald had a job in the building that Kennedy drove by before Kennedy’s route was chosen. So Oswald didn’t get the job to shoot Kennedy. Does this “prove” there was no conspiracy or indeed have any bearing on the likelihood of that inference? My reply: I actually have a friend whose father worked on the Warren Commission so I’ve long been convinced that Oswald acted alone. But, sure, this piece of information should shift the probability a bit. The difficulty is that the amount (and even the direction) of the shifting of the probability depends on the model you are assuming for various possibilities, and these possibilities themselves are not so clearly defined. Bayesian statistician Jay Kadane wrote a book a few years ago on the Sacco and Vanzetti case, going into the evidence supplie
sentIndex sentText sentNum sentScore
1 I have a question so I thought I would ask a Bayesian statistician. [sent-4, score-0.07]
2 One fact I learned on reading this article is that Oswald had a job in the building that Kennedy drove by before Kennedy’s route was chosen. [sent-5, score-0.701]
3 Does this “prove” there was no conspiracy or indeed have any bearing on the likelihood of that inference? [sent-7, score-0.474]
4 My reply: I actually have a friend whose father worked on the Warren Commission so I’ve long been convinced that Oswald acted alone. [sent-8, score-0.724]
5 But, sure, this piece of information should shift the probability a bit. [sent-9, score-0.519]
6 The difficulty is that the amount (and even the direction) of the shifting of the probability depends on the model you are assuming for various possibilities, and these possibilities themselves are not so clearly defined. [sent-10, score-0.99]
7 Bayesian statistician Jay Kadane wrote a book a few years ago on the Sacco and Vanzetti case, going into the evidence supplied by each piece of information. [sent-11, score-0.56]
8 The whole piece of work was impressive but it was hard for me to follow, there were so many little details. [sent-12, score-0.517]
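The point in sentences 5 and 6 (the evidence should shift the probability, but the amount and even the direction depend on the assumed model) can be illustrated with the odds form of Bayes' rule. The following is a minimal sketch, not anything from the original post: the prior of 0.10 and the three likelihood ratios are hypothetical numbers chosen only to show how the same piece of evidence moves the posterior up, down, or not at all depending on how likely one assumes it to be under each hypothesis.

```python
def posterior_probability(prior, likelihood_ratio):
    """Update P(H) given the likelihood ratio P(E | H) / P(E | not H)."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical prior probability of a conspiracy (an assumption for illustration).
PRIOR_CONSPIRACY = 0.10

# Hypothetical likelihood ratios for the evidence "Oswald had the job before
# the route was chosen" under three different assumed models.
ASSUMED_MODELS = [
    ("evidence less likely under conspiracy", 0.5),
    ("evidence equally likely either way", 1.0),
    ("evidence more likely under conspiracy", 1.5),
]

if __name__ == "__main__":
    for label, ratio in ASSUMED_MODELS:
        updated = posterior_probability(PRIOR_CONSPIRACY, ratio)
        print(f"{label}: prior {PRIOR_CONSPIRACY:.3f} -> posterior {updated:.3f}")
```

A likelihood ratio below 1 lowers the probability, a ratio above 1 raises it, and a ratio of exactly 1 leaves it unchanged, which is why the conclusion depends so heavily on how the competing possibilities are defined.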
wordName wordTfidf (topN-words)
[('oswald', 0.458), ('piece', 0.294), ('kennedy', 0.287), ('possibilities', 0.241), ('bearing', 0.175), ('acted', 0.167), ('conspiracy', 0.157), ('edlin', 0.149), ('warren', 0.146), ('commission', 0.146), ('shifting', 0.143), ('drove', 0.136), ('job', 0.136), ('supplied', 0.134), ('aaron', 0.133), ('route', 0.133), ('father', 0.133), ('shoot', 0.128), ('probability', 0.117), ('jay', 0.115), ('prove', 0.114), ('convinced', 0.112), ('shift', 0.108), ('impressive', 0.101), ('bayesian', 0.096), ('friend', 0.096), ('depends', 0.095), ('building', 0.091), ('assuming', 0.089), ('difficulty', 0.088), ('learned', 0.085), ('amount', 0.085), ('whose', 0.084), ('likelihood', 0.082), ('direction', 0.081), ('worked', 0.076), ('clearly', 0.074), ('statistician', 0.073), ('follow', 0.073), ('details', 0.073), ('ask', 0.07), ('whole', 0.068), ('indeed', 0.06), ('reading', 0.06), ('fact', 0.06), ('evidence', 0.059), ('inference', 0.059), ('various', 0.058), ('long', 0.056), ('little', 0.054)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 2134 andrew gelman stats-2013-12-14-Oswald evidence
Introduction: Aaron Edlin writes: This story is interesting in its own right . . . I have a question so I thought I would ask a Bayesian statistician. One fact I learned on reading this article is that Oswald had a job in the building that Kennedy drove by before Kennedy’s route was chosen. So Oswald didn’t get the job to shoot Kennedy. Does this “prove” there was no conspiracy or indeed have any bearing on the likelihood of that inference? My reply: I actually have a friend whose father worked on the Warren Commission so I’ve long been convinced that Oswald acted alone. But, sure, this piece of information should shift the probability a bit. The difficulty is that the amount (and even the direction) of the shifting of the probability depends on the model you are assuming for various possibilities, and these possibilities themselves are not so clearly defined. Bayesian statistician Jay Kadane wrote a book a few years ago on the Sacco and Vanzetti case, going into the evidence supplie
Introduction: The story starts in September, when psychology professor Fred Oswald wrote me: I [Oswald] wanted to point out this paper in Science (Ramirez & Beilock, 2010) examining how students’ emotional writing improves their test performance in high-pressure situations. Although replication is viewed as the hallmark of research, this paper replicates implausibly large d-values and correlations across studies, leading me to be more suspicious of the findings (not less, as is generally the case). He also pointed me to this paper: Experimental disclosure and its moderators: A meta-analysis. Frattaroli, Joanne Psychological Bulletin, Vol 132(6), Nov 2006, 823-865. Disclosing information, thoughts, and feelings about personal and meaningful topics (experimental disclosure) is purported to have various health and psychological consequences (e.g., J. W. Pennebaker, 1993). Although the results of 2 small meta-analyses (P. G. Frisina, J. C. Borod, & S. J. Lepore, 2004; J. M. Smyth
Introduction: We recently considered a pair of studies that came out a while ago involving children and political orientation: Andrew Oswald and Nattavudh Powdthavee found that, in Great Britain, parents of girls were more likely to support left-wing parties, compared to parents of boys. And, in the other direction, Dalton Conley and Emily Rauscher found with survey data from the United States that parents of girls were more likely to support the Republican party, compared to parents of boys. As I discussed the other day, the latest version of the Conley and Rauscher study came with some incoherent evolutionary theorizing. There was also some discussion regarding the differences between the two studies. Oswald sent me some relevant comments: It will be hard in cross-sections like the GSS to solve the problem of endogeneity bias. Conservative families may want to have boys, and may use that as a stopping rule. If so, they will end up with a disproportionately large number of girls. Without l
4 0.11304613 2223 andrew gelman stats-2014-02-24-“Edlin’s rule” for routinely scaling down published estimates
Introduction: A few months ago I reacted (see further discussion in comments here ) to a recent study on early childhood intervention, in which researchers Paul Gertler, James Heckman, Rodrigo Pinto, Arianna Zanolini, Christel Vermeerch, Susan Walker, Susan M. Chang, and Sally Grantham-McGregor estimated that a particular intervention on young children had raised incomes on young adults by 42%. I wrote: Major decisions on education policy can turn on the statistical interpretation of small, idiosyncratic data sets — in this case, a study of 129 Jamaican children. . . . Overall, I have no reason to doubt the direction of the effect, namely, that psychosocial stimulation should be good. But I’m skeptical of the claim that income differed by 42%, due to the reason of the statistical significance filter . In section 2.3, the authors are doing lots of hypothesizing based on some comparisons being statistically significant and others being non-significant. There’s nothing wrong with speculation, b
5 0.1071571 630 andrew gelman stats-2011-03-27-What is an economic “conspiracy theory”?
Introduction: Reviewing a research article by Michael Spence and Sandile Hlatshwayo about globalization (a paper with the sobering message that “higher-paying jobs [are] likely to follow low-paying jobs in leaving US”), Tyler Cowen writes : It is also a useful corrective to the political conspiracy theories of changes in the income distribution. . . Being not-so-blissfully ignorant of macroeconomics, I can focus on the political question, namely these conspiracy theories. I’m not quite sure what Cowen is referring to here–he neglects to provide a link to the conspiracy theories–but I’m guessing he’s referring to the famous graph by Piketty and Saez showing how very high-end incomes (top 1% or 0.1%) have, since the 1970s, risen much more dramatically in the U.S. than in other countries, along with claims by Paul Krugman and others that much of this difference can be explained by political changes in the U.S. In particular, top tax rates in the U.S. have declined since the 1970s and the p
6 0.10128544 1935 andrew gelman stats-2013-07-12-“A tangle of unexamined emotional impulses and illogical responses”
7 0.095322192 2182 andrew gelman stats-2014-01-22-Spell-checking example demonstrates key aspects of Bayesian data analysis
8 0.090409786 2055 andrew gelman stats-2013-10-08-A Bayesian approach for peer-review panels? and a speculation about Bruno Frey
9 0.089358889 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
10 0.088740081 1055 andrew gelman stats-2011-12-13-Data sharing update
12 0.084746286 54 andrew gelman stats-2010-05-27-Hype about conditional probability puzzles
13 0.082941353 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle
14 0.081581183 967 andrew gelman stats-2011-10-20-Picking on Gregg Easterbrook
15 0.079802491 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
16 0.078597613 1912 andrew gelman stats-2013-06-24-Bayesian quality control?
17 0.077403761 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
18 0.072654366 676 andrew gelman stats-2011-04-23-The payoff: $650. The odds: 1 in 500,000.
19 0.067177549 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
20 0.067128941 1678 andrew gelman stats-2013-01-17-Wanted: 365 stories of statistics
topicId topicWeight
[(0, 0.133), (1, 0.023), (2, -0.032), (3, 0.033), (4, -0.048), (5, -0.0), (6, 0.033), (7, 0.032), (8, 0.035), (9, -0.051), (10, -0.003), (11, -0.016), (12, 0.001), (13, -0.008), (14, 0.019), (15, -0.001), (16, 0.051), (17, 0.032), (18, 0.011), (19, 0.013), (20, -0.018), (21, 0.018), (22, 0.002), (23, 0.007), (24, -0.006), (25, 0.005), (26, -0.008), (27, 0.004), (28, 0.013), (29, 0.003), (30, -0.012), (31, -0.008), (32, 0.014), (33, 0.04), (34, 0.002), (35, -0.004), (36, -0.007), (37, -0.018), (38, 0.01), (39, -0.018), (40, -0.006), (41, 0.0), (42, 0.026), (43, -0.031), (44, -0.011), (45, -0.021), (46, 0.06), (47, 0.027), (48, -0.038), (49, 0.052)]
simIndex simValue blogId blogTitle
same-blog 1 0.96902639 2134 andrew gelman stats-2013-12-14-Oswald evidence
2 0.7403971 1067 andrew gelman stats-2011-12-18-Christopher Hitchens was a Bayesian
Introduction: 1. We Bayesian statisticians like to say there are three kinds of statisticians: a. Bayesians; b. People who are Bayesians but don’t realize it (that is, they act in coherence with some unstated probability); c. Failed Bayesians (that is, people whose inference could be improved by some attention to coherence). So, if a statistician does great work, we are inclined to claim this person for the Bayesian cause, even if he or she vehemently denies any Bayesian leanings. 2. In his autobiography, Bertrand Russell tells the story of when he went to prison for opposing World War 1: I [Russell] was much cheered on my arrival by the warden at the gate, who had to take particulars about me. He asked my religion, and I replied ‘agnostic.’ He asked how to spell it, and remarked with a sigh: “Well, there are many religions, but I suppose they all worship the same God.” This remark kept me cheerful for about a week. 3. In an op-ed today, Ross Douthat argues that celebrated a
3 0.73048335 1332 andrew gelman stats-2012-05-20-Problemen met het boek
Introduction: Regarding the so-called Dutch Book argument for Bayesian inference (the idea that, if your inferences do not correspond to a Bayesian posterior distribution, you can be forced to make incoherent bets and ultimately become a money pump), I wrote: I have never found this argument appealing, because a bet is a game not a decision. A bet requires 2 players, and one player has to offer the bets. I do agree that in some bounded settings (for example, betting on win place show in a horse race), I’d want my bets to be coherent; if they are incoherent (e.g., if my bets correspond to P(A|B)*P(B) not being equal to P(A,B)), then I should be able to do better by examining the incoherence. But in an “open system” (to borrow some physics jargon), I don’t think coherence is possible. There is always new information coming in, and there is always additional prior information in reserve that hasn’t entered the model.
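The coherence condition mentioned in the excerpt above, that P(A|B)*P(B) should equal P(A,B), can be checked mechanically for any stated set of probabilities. This is a minimal sketch under stated assumptions: the function names and the example numbers are hypothetical and are not taken from the post.

```python
import math


def coherence_gap(p_a_given_b, p_b, p_a_and_b):
    """Return P(A|B) * P(B) - P(A,B); zero means the three numbers agree."""
    for value in (p_a_given_b, p_b, p_a_and_b):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_a_given_b * p_b - p_a_and_b


def is_coherent(p_a_given_b, p_b, p_a_and_b, tol=1e-9):
    """True when the stated probabilities satisfy the product rule within tol."""
    gap = coherence_gap(p_a_given_b, p_b, p_a_and_b)
    return math.isclose(gap, 0.0, abs_tol=tol)


if __name__ == "__main__":
    # Hypothetical stated probabilities: the first set obeys the product rule,
    # the second does not and leaves room to improve the bets.
    print(is_coherent(0.5, 0.4, 0.2))
    print(is_coherent(0.5, 0.4, 0.3))
    print(coherence_gap(0.5, 0.4, 0.3))
```

A nonzero gap only flags an inconsistency within a fixed, bounded set of bets; it says nothing about the open-system case the excerpt describes, where new information keeps arriving.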
Introduction: Yes, checking calibration of probability forecasts is part of Bayesian statistics. At the end of this post are three figures from Chapter 1 of Bayesian Data Analysis illustrating empirical evaluation of forecasts. But first the background. Why am I bringing this up now? It’s because of something Larry Wasserman wrote the other day : One of the striking facts about [baseball/political forecaster Nate Silver's recent] book is the emphasis that Silver places on frequency calibration. . . . Have no doubt about it: Nate Silver is a frequentist. For example, he says: One of the most important tests of a forecast — I would argue that it is the single most important one — is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated. I had some discussion with Larry in the comments section of h
5 0.71799743 1560 andrew gelman stats-2012-11-03-Statistical methods that work in some settings but not others
Introduction: David Hogg pointed me to this post by Larry Wasserman: 1. The Horwitz-Thompson estimator satisfies the following condition: for every , where — the parameter space — is the set of all functions . (There are practical improvements to the Horwitz-Thompson estimator that we discussed in our earlier posts but we won’t revisit those here.) 2. A Bayes estimator requires a prior for . In general, if is not a function of then (1) will not hold. . . . 3. If you let be a function if , (1) still, in general, does not hold. 4. If you make a function if in just the right way, then (1) will hold. . . . There is nothing wrong with doing this, but in our opinion this is not in the spirit of Bayesian inference. . . . 7. This example is only meant to show that Bayesian estimators do not necessarily have good frequentist properties. This should not be surprising. There is no reason why we should in general expect a Bayesian method to have a frequentist property
6 0.71020806 1438 andrew gelman stats-2012-07-31-What is a Bayesian?
7 0.70441979 1781 andrew gelman stats-2013-03-29-Another Feller theory
8 0.70317632 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox
10 0.70295763 566 andrew gelman stats-2011-02-09-The boxer, the wrestler, and the coin flip, again
11 0.69681573 427 andrew gelman stats-2010-11-23-Bayesian adaptive methods for clinical trials
12 0.69601041 1529 andrew gelman stats-2012-10-11-Bayesian brains?
13 0.6943751 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack
14 0.6933012 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation
15 0.68465972 1868 andrew gelman stats-2013-05-23-Validation of Software for Bayesian Models Using Posterior Quantiles
16 0.68447548 2322 andrew gelman stats-2014-05-06-Priors I don’t believe
17 0.6772542 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
18 0.6772328 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things
19 0.67622918 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials
20 0.67490101 1912 andrew gelman stats-2013-06-24-Bayesian quality control?
topicId topicWeight
[(3, 0.036), (4, 0.016), (13, 0.037), (16, 0.127), (20, 0.023), (24, 0.062), (35, 0.019), (44, 0.02), (56, 0.078), (61, 0.094), (72, 0.035), (75, 0.03), (99, 0.314)]
simIndex simValue blogId blogTitle
same-blog 1 0.97062516 2134 andrew gelman stats-2013-12-14-Oswald evidence
2 0.93036544 9 andrew gelman stats-2010-04-28-But it all goes to pay for gas, car insurance, and tolls on the turnpike
Introduction: As a New Yorker I think I’m obliged to pass on the occasional Jersey joke (most recently, this one , which annoyingly continues to attract spam comments). I’ll let the above title be my comment on this entry from Tyler Cowen entitled, “Which Americans are ‘best off’?”: If you consult human development indices the answer is Asians living in New Jersey. The standard is: The index factors in life expectancy at birth, educational degree attainment among adults 25-years or older, school enrollment for people at least three years old and median annual gross personal earnings. More generally, these sorts of rankings and indexes seem to be cheap ways of grabbing headlines. This has always irritated me but really maybe I should go with the flow and invent a few of these indexes myself.
3 0.92451102 1158 andrew gelman stats-2012-02-07-The more likely it is to be X, the more likely it is to be Not X?
Introduction: This post is by Phil Price. A paper by Wood, Douglas, and Sutton looks at “Beliefs in Contradictory Conspiracy Theories.” Unfortunately the subjects were 140 undergraduate psychology students, so one wonders how general the results are. I found this sort of arresting: In Study 1 (n=137), the more participants believed that Princess Diana faked her own death, the more they believed she was murdered. In Study 2 (n=102), the more participants believed that Osama Bin Laden was already dead when U.S. Special Forces raided his compound in Pakistan, the more they believed he is still alive. As the article says, “conspiracy advocates’ distrust of official narratives may be so strong that many alternative theories are simultaneously endorsed in spite of any contradictions between them.” But I think the authors overstate things when they say “One would think that there ought to be a negative correlation between beliefs in contradictory accounts of events — the more one believes in
4 0.9240768 21 andrew gelman stats-2010-05-07-Environmentally induced cancer “grossly underestimated”? Doubtful.
Introduction: The (U.S.) “President’s Cancer Panel” has released its 2008-2009 annual report, which includes a cover letter that says “the true burden of environmentally induced cancer has been grossly underestimated.” The report itself discusses exposures to various types of industrial chemicals, some of which are known carcinogens, in some detail, but gives nearly no data or analysis to suggest that these exposures are contributing to significant numbers of cancers. In fact, there is pretty good evidence that they are not. The plot above shows age-adjusted cancer mortality for men, by cancer type, in the U.S. The plot below shows the same for women. In both cases, the cancers with the highest mortality rates are shown, but not all cancers (e.g. brain cancer is not shown). For what it’s worth, I’m not sure how trustworthy the rates are from the 1930s — it seems possible that reporting, autopsies, or both, were less careful during the Great Depression — so I suggest focusing on the r
5 0.92041999 2106 andrew gelman stats-2013-11-19-More on “data science” and “statistics”
Introduction: After reading Rachel and Cathy’s book , I wrote that “Statistics is the least important part of data science . . . I think it would be fair to consider statistics as a subset of data science. . . . it’s not the most important part of data science, or even close.” But then I received “Data Science for Business,” by Foster Provost and Tom Fawcett, in the mail. I might not have opened the book at all (as I’m hardly in the target audience) but for seeing a blurb by Chris Volinsky, a statistician whom I respect a lot. So I flipped through the book and it indeed looked pretty good. It moves slowly but that’s appropriate for an intro book. But what surprised me, given the book’s title and our recent discussion on the nature of data science, was that the book was 100% statistics! It had some math (for example, definitions of various distance measures), some simple algebra, some conceptual graphs such as ROC curve, some tables and graphs of low-dimensional data summaries—but almost
6 0.91919178 2070 andrew gelman stats-2013-10-20-The institution of tenure
7 0.9182272 54 andrew gelman stats-2010-05-27-Hype about conditional probability puzzles
8 0.91660738 2066 andrew gelman stats-2013-10-17-G+ hangout for test run of BDA course
10 0.91274697 2280 andrew gelman stats-2014-04-03-As the boldest experiment in journalism history, you admit you made a mistake
11 0.91135007 2197 andrew gelman stats-2014-02-04-Peabody here.
13 0.90979761 722 andrew gelman stats-2011-05-20-Why no Wegmania?
14 0.90978098 892 andrew gelman stats-2011-09-06-Info on patent trolls
15 0.90965509 935 andrew gelman stats-2011-10-01-When should you worry about imputed data?
16 0.90952951 967 andrew gelman stats-2011-10-20-Picking on Gregg Easterbrook
17 0.90948874 2349 andrew gelman stats-2014-05-26-WAIC and cross-validation in Stan!
18 0.90932763 14 andrew gelman stats-2010-05-01-Imputing count data
19 0.90876019 561 andrew gelman stats-2011-02-06-Poverty, educational performance – and can be done about it
20 0.90872478 1652 andrew gelman stats-2013-01-03-“The Case for Inductive Theory Building”