andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1942 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Washington Post columnist Richard Cohen brings up one of my research topics: In New York City, blacks make up a quarter of the population, yet they represent 78 percent of all shooting suspects — almost all of them young men. We know them from the nightly news. Those statistics represent the justification for New York City’s controversial stop-and-frisk program, which amounts to racial profiling writ large. After all, if young black males are your shooters, then it ought to be young black males whom the police stop and frisk. I have two comments on this. First, my research with Jeff Fagan and Alex Kiss (based on data from the late 1990s, so maybe things have changed) found that the NYPD was stopping blacks and hispanics at a rate higher than their previous arrest rates: To briefly summarize our findings, blacks and Hispanics represented 51% and 33% of the stops while representing only 26% and 24% of the New York City population. Compared with the number of arrests of
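The stop and population shares quoted in the introduction above can be turned into simple representation ratios. The following is a minimal sketch, not the analysis from the paper: it only divides each group's share of stops by its share of the city population, using the four percentages quoted in the text. The function and variable names are illustrative assumptions. The paper's main comparison uses previous-year arrests rather than population as the baseline, and this sketch does not reproduce that.

```python
# Illustrative only: ratio of share-of-stops to share-of-population,
# using the percentages quoted in the introduction.
# Names (representation_ratio, SHARES) are hypothetical, not from the source.

def representation_ratio(stop_share: float, population_share: float) -> float:
    """Return how many times over a group appears among stops
    relative to its share of the population."""
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return stop_share / population_share

# (percent of stops, percent of New York City population), as quoted in the text
SHARES = {
    "blacks": (51, 26),
    "Hispanics": (33, 24),
}

ratios = {
    group: representation_ratio(stops, population)
    for group, (stops, population) in SHARES.items()
}

for group, ratio in ratios.items():
    print(f"{group}: stopped at {ratio:.2f} times their population share")
```

A ratio above 1 means the group is over-represented among stops relative to population. It says nothing about whether the stops were justified, which is why the paper compares against arrest rates instead.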
sentIndex sentText sentNum sentScore
1 Washington Post columnist Richard Cohen brings up one of my research topics: In New York City, blacks make up a quarter of the population, yet they represent 78 percent of all shooting suspects — almost all of them young men. [sent-1, score-0.837]
2 Those statistics represent the justification for New York City’s controversial stop-and-frisk program, which amounts to racial profiling writ large. [sent-3, score-0.432]
3 After all, if young black males are your shooters, then it ought to be young black males whom the police stop and frisk. [sent-4, score-1.301]
4 Compared with the number of arrests of each group in the previous year (used as a proxy for the rate of criminal behavior), blacks were stopped 23% more often than whites and Hispanics were stopped 39% more often than whites. [sent-7, score-1.155]
5 Controlling for precinct actually increased these discrepancies, with minorities between 1. [sent-8, score-0.067]
6 5 times as often as whites (compared with the groups’ previous arrest rates in the precincts where they were stopped) for the most common categories of stops (violent crimes and drug crimes), with smaller differences for property and drug crimes. [sent-10, score-1.025]
7 And things may have changed since 1998-1999 (which is when our data are from). [sent-12, score-0.072]
8 But the data we have here shows the police were disproportionately stopping minorities. [sent-13, score-0.523]
9 The other thing is, I don’t think Cohen is necessarily being fair to the police when he describes the stop-and-frisk program as “racial profiling. [sent-14, score-0.424]
10 ” As we wrote in our paper, “It is quite reasonable to suppose that effective policing requires stopping and questioning many people to gather information about any given crime. [sent-15, score-0.166]
11 ” It could well be that a statistical pattern of stops could arise from individual decisions that are not based on race but instead are based on characteristics that are correlated with race. [sent-16, score-0.259]
12 I have no idea what the police are doing—my only experience here is with the numbers. [sent-17, score-0.357]
13 I think that Cohen is, on one hand, way too quick to dismiss the numbers with his blanket statement that “young black males are your shooters” and on the other hand may be way too quick to describe police work as racial profiling. [sent-18, score-0.942]
14 As a bonus, Slate columnist Matthew Yglesias connects this to one of my other research interests: Bayesian inference. [sent-21, score-0.173]
15 I won’t comment on Yglesias’s remarks except to point out that Bayes’ theorem is a two-way street. [sent-22, score-0.087]
16 If you have to make individual decisions by maximizing the probability of success (catching a criminal, if you are the police), then profiling can be a logical strategy. [sent-24, score-0.34]
17 Reasons not to profile include, “equal protection of the laws” etc. [sent-25, score-0.06]
18 and also indirect effects of what one might call the “profiling culture,” effects such as hassling innocent people, reducing trust in the police, empowerment of Bernard Goetz and George Zimmerman to go around shooting people, etc. [sent-26, score-0.261]
19 Bayes’ theorem is relevant in all these calculations but the issues here are not trivial. [sent-27, score-0.087]
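The remark that Bayes' theorem is a two-way street can be made concrete with a small calculation. This is a hedged sketch under stated assumptions, not a result from the post: the 78 percent and one-quarter figures come from the quoted column, but the overall base rate `P_SUSPECT` is a purely hypothetical number chosen for illustration, and all names are invented for this example. The point is only that P(group | suspect) and P(suspect | group) can differ by orders of magnitude.

```python
# Illustrative only: Bayes' theorem relating P(group | suspect) to P(suspect | group).
# P_SUSPECT is a HYPOTHETICAL base rate, not a figure from the source.

def posterior(p_group_given_suspect: float, p_suspect: float, p_group: float) -> float:
    """P(suspect | group) = P(group | suspect) * P(suspect) / P(group)."""
    return p_group_given_suspect * p_suspect / p_group

P_GROUP_GIVEN_SUSPECT = 0.78  # share of shooting suspects, as quoted in the column
P_GROUP = 0.25                # "a quarter of the population", as quoted
P_SUSPECT = 0.001             # HYPOTHETICAL: assumed fraction of all residents who are suspects

p_suspect_given_group = posterior(P_GROUP_GIVEN_SUSPECT, P_SUSPECT, P_GROUP)

print(f"P(group | suspect) = {P_GROUP_GIVEN_SUSPECT:.2f}")
print(f"P(suspect | group) = {p_suspect_given_group:.5f}  (under the assumed base rate)")
```

Under any small base rate the second probability stays small even when the first is large, so almost everyone stopped on group membership alone would be innocent. That is the "hassling innocent people" cost the post lists among the reasons not to profile.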
wordName wordTfidf (topN-words)
[('police', 0.357), ('blacks', 0.243), ('cohen', 0.213), ('hispanics', 0.209), ('profiling', 0.202), ('stops', 0.186), ('males', 0.179), ('columnist', 0.173), ('shooters', 0.17), ('young', 0.169), ('stopping', 0.166), ('racial', 0.156), ('arrest', 0.154), ('stopped', 0.146), ('black', 0.124), ('city', 0.124), ('shooting', 0.117), ('yglesias', 0.115), ('crimes', 0.11), ('criminal', 0.11), ('previous', 0.102), ('whites', 0.098), ('york', 0.097), ('theorem', 0.087), ('drug', 0.087), ('zimmerman', 0.077), ('empowerment', 0.077), ('represent', 0.074), ('decisions', 0.073), ('fagan', 0.073), ('changed', 0.072), ('rates', 0.07), ('precincts', 0.07), ('nypd', 0.07), ('bayes', 0.068), ('precinct', 0.067), ('innocent', 0.067), ('arrests', 0.067), ('program', 0.067), ('maximizing', 0.065), ('quick', 0.063), ('bernard', 0.062), ('catching', 0.062), ('often', 0.061), ('rate', 0.061), ('suspects', 0.061), ('compared', 0.061), ('proxy', 0.06), ('profile', 0.06), ('discrepancies', 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics
2 0.13271871 156 andrew gelman stats-2010-07-20-Burglars are local
Introduction: This makes sense: In the land of fiction, it’s the criminal’s modus operandi – his method of entry, his taste for certain jewellery and so forth – that can be used by detectives to identify his handiwork. The reality according to a new analysis of solved burglaries in the Northamptonshire region of England is that these aspects of criminal behaviour are on their own unreliable as identifying markers, most likely because they are dictated by circumstances rather than the criminal’s taste and style. However, the geographical spread and timing of a burglar’s crimes are distinctive, and could help with police investigations. And, as a bonus, more Tourette’s pride! P.S. On yet another unrelated topic from the same blog, I wonder if the researchers in this study are aware that the difference between “significant” and “not significant” is not itself statistically significant .
3 0.12129223 2210 andrew gelman stats-2014-02-13-Stopping rules and Bayesian analysis
Introduction: I happened to receive two questions about stopping rules on the same day. First, from Tom Cunningham: I’ve been arguing with my colleagues about whether the stopping rule is relevant (a presenter disclosed that he went out to collect more data because the first experiment didn’t get significant results) — and I believe you have some qualifications to the Bayesian irrelevance argument but I don’t properly understand them. Then, from Benjamin Kay: I have a question that may be of interest for your blog. I was reading about the early history of AIDS and learned that the the trial of AZT was ended early because it was so effective : The trial reported in the New England Journal of medicine, had produced a dramatic result. Before the planned 24 week duration of the study, after a mean period of participation of about 120 days, nineteen participants receiving placebo had died while there was only a single death among those receiving AZT. This appeared to be a momentous break
4 0.10982613 312 andrew gelman stats-2010-10-02-“Regression to the mean” is fine. But what’s the “mean”?
Introduction: In the context of a discussion of Democratic party strategies, Matthew Yglesias writes : Given where things stood in January 2009, large House losses were essentially inevitable. The Democratic majority elected in 2008 was totally unsustainable and was doomed by basic regression to the mean. I’d like to push back on this, if for no other reason than that I didn’t foresee all this back in January 2009. Regression to the mean is a fine idea, but what’s the “mean” that you’re regressing to? Here’s a graph I made a couple years ago , showing the time series of Democratic vote share in congressional and presidential elections: Take a look at the House vote in 2006 and 2008. Is this a blip, just begging to be slammed down in 2010 by a regression to the mean? Or does it represent a return to form, back to the 55% level of support that the Democrats had for most of the previous fifty years? It’s not so obvious what to think–at least, not simply from looking at the graph.
5 0.10900676 673 andrew gelman stats-2011-04-20-Upper-income people still don’t realize they’re upper-income
Introduction: Catherine Rampell highlights this stunning Gallup Poll result: 6 percent of Americans in households earning over $250,000 a year think their taxes are “too low.” Of that same group, 26 percent said their taxes were “about right,” and a whopping 67 percent said their taxes were “too high.” OK, fine. Most people don’t like taxes. No surprise there. But get this next part: And yet when this same group of high earners was asked whether “upper-income people” paid their fair share in taxes, 30 percent said “upper-income people” paid too little, 30 percent said it was a “fair share,” and 38 percent said it was too much. 30 percent of these upper-income people say that upper-income people pay too little, but only 6 percent say that they personally pay too little. 38% say that upper-income people pay too much, but 67% say they personally pay too much. Rampell attributes this to people’s ignorance about population statistics–these 250K+ families just don’t realize t
8 0.10016466 1385 andrew gelman stats-2012-06-20-Reconciling different claims about working-class voters
10 0.098812431 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science
11 0.097274169 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over
12 0.096621349 3 andrew gelman stats-2010-04-26-Bayes in the news…in a somewhat frustrating way
14 0.089717567 647 andrew gelman stats-2011-04-04-Irritating pseudo-populism, backed up by false statistics and implausible speculations
15 0.080699295 788 andrew gelman stats-2011-07-06-Early stopping and penalized likelihood
16 0.080601305 1183 andrew gelman stats-2012-02-25-Calibration!
17 0.0803525 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way
18 0.079616591 350 andrew gelman stats-2010-10-18-Subtle statistical issues to be debated on TV.
19 0.077543572 420 andrew gelman stats-2010-11-18-Prison terms for financial fraud?
20 0.074922316 2327 andrew gelman stats-2014-05-09-Nicholas Wade and the paradox of racism
topicId topicWeight
[(0, 0.145), (1, -0.023), (2, 0.018), (3, -0.011), (4, -0.032), (5, 0.007), (6, -0.021), (7, 0.02), (8, -0.012), (9, -0.0), (10, -0.049), (11, -0.009), (12, 0.017), (13, 0.009), (14, 0.024), (15, 0.041), (16, 0.043), (17, -0.002), (18, 0.012), (19, 0.014), (20, 0.012), (21, 0.002), (22, -0.035), (23, -0.027), (24, 0.005), (25, -0.023), (26, -0.03), (27, 0.01), (28, 0.05), (29, -0.004), (30, -0.0), (31, -0.004), (32, -0.002), (33, 0.008), (34, 0.016), (35, -0.002), (36, 0.002), (37, 0.009), (38, -0.009), (39, 0.009), (40, -0.006), (41, -0.011), (42, -0.015), (43, -0.019), (44, 0.01), (45, 0.012), (46, -0.03), (47, -0.012), (48, 0.013), (49, 0.004)]
simIndex simValue blogId blogTitle
same-blog 1 0.95345515 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics
2 0.84214592 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes
Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analysis indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r
3 0.76527619 1789 andrew gelman stats-2013-04-05-Elites have alcohol problems too!
Introduction: Speaking of Tyler Cowen, I’m puzzled by this paragraph of his: Guns, like alcohol, have many legitimate uses, and they are enjoyed by many people in a responsible manner. In both cases, there is an elite which has absolutely no problems handling the institution in question, but still there is the question of whether the nation really can have such bifurcated social norms, namely one set of standards for the elite and another set for everybody else. I don’t know anything about guns so I’ll set that part aside. My bafflement is with the claim that “there is an elite which has absolutely no problem handling [alcohol].” Is he kidding? Unless Cowen is circularly defining “an elite” as the subset of elites who don’t have an alcohol problem, I don’t buy this claim. And I actually think it’s a serious problem, that various “elites” are so sure that they have “absolutely no problem” that they do dangerous, dangerous things. Consider the notorious incident when Dick Cheney shot a
Introduction: Thomas Lumley writes : The Herald has a story about hazards of coffee. The picture caption says Men who drink more than four cups a day are 56 per cent more likely to die. which is obviously not true: deaths, as we’ve observed before, are fixed at one per customer. The story says It’s not that people are dying at a rapid rate. But men who drink more than four cups a day are 56 per cent more likely to die and women have double the chance compared with moderate drinkers, according to the The University of Queensland and the University of South Carolina study. What the study actually reported was rates of death: over an average of 17 years, men who drink more than four cups a day died at about a 21% higher rate, with little evidence of any difference in men. After they considered only men and women under 55 (which they don’t say was something they had planned to do), and attempted to control for a whole bunch of other factors, the rate increase went to 56% for me
Introduction: There’s a paradigm in applied statistics that goes something like this: 1. There is a scientific or policy question of some theoretical or practical importance. 2. Researchers gather data on relevant outcomes and perform a statistical analysis, ideally leading to a clear conclusion (p less than 0.05, or a strong posterior distribution, or good predictive performance, or high reliability and validity, whatever). 3. This conclusion informs policy. This paradigm has room for positive findings (for example, that a new program is statistically significantly better, or statistically significantly worse than what came before) or negative findings (data are inconclusive, further study is needed), even if negative findings seem less likely to make their way into the textbooks. But what happens when step 2 simply isn’t possible. This came up a few years ago—nearly 10 years ago, now!—with the excellent paper by Donohue and Wolfers which explained why it’s just about impossible to
6 0.74032116 624 andrew gelman stats-2011-03-22-A question about the economic benefits of universities
7 0.73605114 333 andrew gelman stats-2010-10-10-Psychiatric drugs and the reduction in crime
8 0.73261052 1386 andrew gelman stats-2012-06-21-Belief in hell is associated with lower crime rates
10 0.72749192 526 andrew gelman stats-2011-01-19-“If it saves the life of a single child…” and other nonsense
11 0.72746009 12 andrew gelman stats-2010-04-30-More on problems with surveys estimating deaths in war zones
12 0.72512329 116 andrew gelman stats-2010-06-29-How to grab power in a democracy – in 5 easy non-violent steps
15 0.72348142 1623 andrew gelman stats-2012-12-14-GiveWell charity recommendations
16 0.72083497 1397 andrew gelman stats-2012-06-27-Stand Your Ground laws and homicides
17 0.71923208 940 andrew gelman stats-2011-10-03-It depends upon what the meaning of the word “firm” is.
18 0.71782809 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking
19 0.71488243 1086 andrew gelman stats-2011-12-27-The most dangerous jobs in America
20 0.71173358 67 andrew gelman stats-2010-06-03-More on that Dartmouth health care study
topicId topicWeight
[(9, 0.03), (13, 0.166), (15, 0.056), (16, 0.071), (21, 0.034), (24, 0.121), (29, 0.012), (45, 0.014), (62, 0.017), (63, 0.015), (67, 0.011), (83, 0.017), (86, 0.024), (98, 0.015), (99, 0.249)]
simIndex simValue blogId blogTitle
1 0.9598304 172 andrew gelman stats-2010-07-30-Why don’t we have peer reviewing for oral presentations?
Introduction: Panos Ipeirotis writes in his blog post : Everyone who has attended a conference knows that the quality of the talks is very uneven. There are talks that are highly engaging, entertaining, and describe nicely the research challenges and solutions. And there are talks that are a waste of time. Either the presenter cannot present clearly, or the presented content is impossible to digest within the time frame of the presentation. We already have reviewing for the written part. The program committee examines the quality of the written paper and vouch for its technical content. However, by looking at a paper it is impossible to know how nicely it can be presented. Perhaps the seemingly solid but boring paper can be a very entertaining presentation. Or an excellent paper may be written by a horrible presenter. Why not having a second round of reviewing, where the authors of accepted papers submit their presentations (slides and a YouTube video) for presentation to the conference.
2 0.95366967 1789 andrew gelman stats-2013-04-05-Elites have alcohol problems too!
Introduction: Speaking of Tyler Cowen, I’m puzzled by this paragraph of his: Guns, like alcohol, have many legitimate uses, and they are enjoyed by many people in a responsible manner. In both cases, there is an elite which has absolutely no problems handling the institution in question, but still there is the question of whether the nation really can have such bifurcated social norms, namely one set of standards for the elite and another set for everybody else. I don’t know anything about guns so I’ll set that part aside. My bafflement is with the claim that “there is an elite which has absolutely no problem handling [alcohol].” Is he kidding? Unless Cowen is circularly defining “an elite” as the subset of elites who don’t have an alcohol problem, I don’t buy this claim. And I actually think it’s a serious problem, that various “elites” are so sure that they have “absolutely no problem” that they do dangerous, dangerous things. Consider the notorious incident when Dick Cheney shot a
3 0.94895595 234 andrew gelman stats-2010-08-25-Modeling constrained parameters
Introduction: Mike McLaughlin writes: In general, is there any way to do MCMC with a fixed constraint? E.g., suppose I measure the three internal angles of a triangle with errors ~dnorm(0, tau) where tau might be different for the three measurements. This would be an easy BUGS/WinBUGS/JAGS exercise but suppose, in addition, I wanted to include prior information to the effect that the three angles had to total 180 degrees exactly. Is this feasible? Could you point me to any BUGS model in which a constraint of this type is implemented? Note: Even in my own (non-hierarchical) code which tends to be component-wise, random-walk Metropolis with tuned Laplacian proposals, I cannot see how I could incorporate such a constraint. My reply: See page 508 of Bayesian Data Analysis (2nd edition). We have an example of such a model there (from this paper with Bois and Jiang).
same-blog 4 0.94480252 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics
5 0.94223762 1137 andrew gelman stats-2012-01-24-Difficulties in publishing non-replications of implausible findings
Introduction: Eric Tassone points me to this news article by Christopher Shea on the challenges of debunking ESP. Shea writes : Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a team that tried, but failed, to replicate those results. Here, he tells the Chronicle of Higher Education’s Tom Bartlett about the difficulties he’s had getting the results published. Several journals told the team they wouldn’t publish a study that did no more than disprove a previous study. . . . An editor at another journal said he’d “only accept our paper if we ran a fourth experiment where we got a believer [in ESP] to run all the participants, to control for . . . experimenter effects.” My reaction is, this isn’t as easy a question as it might seem. At first, one’s reaction might share Ritchie’s frustration that a shoddy paper by Bem got p
6 0.93626589 437 andrew gelman stats-2010-11-29-The mystery of the U-shaped relationship between happiness and age
7 0.93090189 971 andrew gelman stats-2011-10-25-Apply now for Earth Institute postdoctoral fellowships at Columbia University
9 0.92402279 1509 andrew gelman stats-2012-09-24-Analyzing photon counts
10 0.92354012 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R
11 0.92146301 1559 andrew gelman stats-2012-11-02-The blog is back
13 0.91313708 1916 andrew gelman stats-2013-06-27-The weirdest thing about the AJPH story
14 0.90527958 1852 andrew gelman stats-2013-05-12-Crime novels for economists
15 0.89943588 1907 andrew gelman stats-2013-06-20-Amazing retro gnu graphics!
16 0.89684427 800 andrew gelman stats-2011-07-13-I like lineplots
17 0.89343464 2069 andrew gelman stats-2013-10-19-R package for effect size calculations for psychology researchers
18 0.88580728 1545 andrew gelman stats-2012-10-23-Two postdoc opportunities to work with our research group!! (apply by 15 Nov 2012)
19 0.88499838 1672 andrew gelman stats-2013-01-14-How do you think about the values in a confidence interval?
20 0.88099325 1933 andrew gelman stats-2013-07-10-Please send all comments to -dev-ripley