andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1541 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Mark Johnstone writes: I’ve recently been investigating a new European Court of Justice ruling on insurance calculations (on behalf of MoneySuperMarket) and I found something related to statistics that caught my attention. . . . The ruling (which comes into effect in December 2012) states that insurers in Europe can no longer provide different premiums based on gender. Despite the fact that women are statistically safer drivers, unless it’s biologically proven there is a causal relationship between being female and being a safer driver, this is now seen as an act of discrimination (more on this from the Wall Street Journal). However, where do you stop with this? What about age? What about other factors? And what does this mean for the application of statistics in general? Is it inherently unjust in this context? One proposal has been to fit ‘black boxes’ into cars so more individual data can be collected, as opposed to relying heavily on aggregates. For fans of data and s
sentIndex sentText sentNum sentScore
1 Mark Johnstone writes: I’ve recently been investigating a new European Court of Justice ruling on insurance calculations (on behalf of MoneySuperMarket) and I found something related to statistics that caught my attention. [sent-1, score-0.972]
2 The ruling (which comes into effect in December 2012) states that insurers in Europe can no longer provide different premiums based on gender. [sent-5, score-0.546]
3 Despite the fact that women are statistically safer drivers, unless it’s biologically proven there is a causal relationship between being female and being a safer driver, this is now seen as an act of discrimination (more on this from the Wall Street Journal). [sent-6, score-1.42]
4 And what does this mean for the application of statistics in general? [sent-10, score-0.193]
5 One proposal has been to fit ‘black boxes’ into cars so more individual data can be collected, as opposed to relying heavily on aggregates. [sent-12, score-0.552]
6 For fans of data and statistics, the law poses some interesting challenges. [sent-13, score-0.358]
7 And I’d love to see somebody digging into this further from a statistical point-of-view. [sent-14, score-0.258]
8 I don’t have much to add here, beyond the usual Bayesian point that, if we have enough data on individuals, this will be more important than average rates, and also the usual political point that good information might not get used if the rulemakers have particular sympathy for unsafe drivers. [sent-15, score-0.575]
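The Bayesian point in sentence 8 can be made concrete with a small sketch. It assumes a Gamma-Poisson model, which the post itself does not specify: a driver's claim rate gets a Gamma prior centered on the group average, and observed claims are Poisson. The function names and all numbers (a group rate of 0.10 claims per year, a prior worth 5 driver-years) are hypothetical and chosen only for illustration.

```python
def posterior_claim_rate(group_rate, prior_strength, claims, years):
    """Posterior mean claim rate for one driver (Gamma prior, Poisson claims).

    Assumed prior: Gamma(shape=group_rate * prior_strength, rate=prior_strength),
    so the prior mean equals the group average and prior_strength acts like a
    number of "pseudo driver-years" of evidence behind that average.
    """
    shape = group_rate * prior_strength + claims
    rate = prior_strength + years
    return shape / rate


def weight_on_individual(prior_strength, years):
    """Share of the posterior mean that comes from the driver's own record."""
    return years / (prior_strength + years)


if __name__ == "__main__":
    group_rate = 0.10      # hypothetical group-average claims per year
    prior_strength = 5.0   # hypothetical: prior counts as 5 driver-years
    # A hypothetical driver who averages 0.30 claims per observed year.
    for years, claims in [(0, 0), (2, 0.6), (20, 6), (200, 60)]:
        est = posterior_claim_rate(group_rate, prior_strength, claims, years)
        w = weight_on_individual(prior_strength, years)
        print(f"years={years:>3}  weight on own data={w:.2f}  estimate={est:.3f}")
```

With no individual data the estimate is just the group average; as observed driver-years accumulate (for example from a black box), the weight on the driver's own record approaches 1 and the group average matters less and less.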
wordName wordTfidf (topN-words)
[('ruling', 0.302), ('safer', 0.273), ('drivers', 0.236), ('unjust', 0.173), ('unsafe', 0.173), ('digging', 0.173), ('biologically', 0.173), ('insurers', 0.163), ('poses', 0.151), ('usual', 0.139), ('boxes', 0.136), ('behalf', 0.134), ('december', 0.132), ('justice', 0.129), ('fans', 0.126), ('sympathy', 0.124), ('investigating', 0.122), ('discrimination', 0.121), ('relying', 0.119), ('proven', 0.119), ('driver', 0.119), ('insurance', 0.117), ('inherently', 0.115), ('court', 0.115), ('european', 0.115), ('europe', 0.115), ('cars', 0.113), ('heavily', 0.113), ('statistics', 0.109), ('female', 0.109), ('proposal', 0.105), ('collected', 0.103), ('wall', 0.103), ('opposed', 0.102), ('calculations', 0.099), ('relationship', 0.094), ('black', 0.093), ('street', 0.092), ('act', 0.09), ('caught', 0.089), ('despite', 0.087), ('somebody', 0.085), ('stop', 0.085), ('women', 0.085), ('application', 0.084), ('unless', 0.083), ('individuals', 0.083), ('longer', 0.081), ('law', 0.081), ('factors', 0.079)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000002 1541 andrew gelman stats-2012-10-19-Statistical discrimination again
2 0.097757943 708 andrew gelman stats-2011-05-12-Improvement of 5 MPG: how many more auto deaths?
Introduction: This entry was posted by Phil Price. A colleague is looking at data on car (and SUV and light truck) collisions and casualties. He’s interested in causal relationships. For instance, suppose car manufacturers try to improve gas mileage without decreasing acceleration. The most likely way they will do that is to make cars lighter. But perhaps lighter cars are more dangerous; how many more people will die for each mpg increase in gas mileage? There are a few different data sources, all of them seriously deficient from the standpoint of answering this question. Deaths are very well reported, so if someone dies in an auto accident you can find out what kind of car they were in, what other kinds of cars (if any) were involved in the accident, whether the person was a driver or passenger, and so on. But it’s hard to normalize: OK, I know that N people who were passengers in a particular model of car died in car accidents last year, but I don’t know how many passenger-miles that
3 0.094557166 1693 andrew gelman stats-2013-01-25-Subsidized driving
Introduction: This post is by Phil. This DC Streets Blog post gives a concise summary of a report by “The Tax Foundation”. The money shot is here, a table that shows what fraction of spending on roads in each state in the U.S. is covered by local, state, and federal gas taxes, tolls, registration fees, etc. (Click on the ‘rank’ table heading to put it in useful order). The national average is 51%, and in no state do drivers directly pay more than 80% of the cost of the roads and highways. That means that, nationwide, half the cost of the roads is paid out of general government funds. Even if it were 100% this still wouldn’t cover additional government costs of driving (such as military spending to protect the oil supply, and law enforcement costs, etc.) but I’ll ignore those in this post. Of course, most of the general funds that make up the difference are themselves paid by people who drive, so this isn’t as grossly unfair as it seems. But it’s still pretty unfair, and it is a huge “market di
4 0.081469111 437 andrew gelman stats-2010-11-29-The mystery of the U-shaped relationship between happiness and age
Introduction: For a while I’ve been curious (see also here) about the U-shaped relation between happiness and age (with people least happy, on average, in their forties, and happier before and after). But when I tried to demonstrate it to my intro statistics course, using the General Social Survey, I couldn’t find the famed U, or anything like it. Using pooled GSS data mixes age, period, and cohort, so I tried throwing in some cohort effects (indicators for decades) and a couple other variables, but still couldn’t find that U. So I was intrigued when I came across this paper by Paul Frijters and Tony Beatton, who write: Whilst the majority of psychologists have concluded there is not much of a relationship at all, the economic literature has unearthed a possible U-shape relationship. In this paper we [Frijters and Beatton] replicate the U-shape for the German SocioEconomic Panel (GSOEP), and we investigate several possible explanations for it. They write: What is the relationship
5 0.079972871 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
Introduction: Robert Bell pointed me to this post by Brad De Long on Bayesian statistics, and then I also noticed this from Noah Smith, who wrote: My impression is that although the Bayesian/Frequentist debate is interesting and intellectually fun, there’s really not much “there” there… despite being so-hip-right-now, Bayesian is not the Statistical Jesus. I’m happy to see the discussion going in this direction. Twenty-five years ago or so, when I got into this biz, there were some serious anti-Bayesian attitudes floating around in mainstream statistics. Discussions in the journals sometimes devolved into debates of the form, “Bayesians: knaves or fools?”. You’d get all sorts of free-floating skepticism about any prior distribution at all, even while people were accepting without question (and doing theory on) logistic regressions, proportional hazards models, and all sorts of strong strong models. (In the subfield of survey sampling, various prominent researchers would refuse to mode
6 0.074617751 1640 andrew gelman stats-2012-12-26-What do people do wrong? WSJ columnist is looking for examples!
7 0.073047079 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?
8 0.072364114 367 andrew gelman stats-2010-10-25-In today’s economy, the rich get richer
9 0.072219983 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research
10 0.070999518 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath
11 0.067520611 15 andrew gelman stats-2010-05-03-Public Opinion on Health Care Reform
12 0.067493439 1591 andrew gelman stats-2012-11-26-Politics as an escape hatch
14 0.064796209 1860 andrew gelman stats-2013-05-17-How can statisticians help psychologists do their research better?
15 0.064704262 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models
16 0.06414403 1621 andrew gelman stats-2012-12-13-Puzzles of criminal justice
17 0.06406413 235 andrew gelman stats-2010-08-25-Term Limits for the Supreme Court?
18 0.062382337 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation
19 0.061613761 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
20 0.061557338 487 andrew gelman stats-2010-12-27-Alfred Kahn
topicId topicWeight
[(0, 0.144), (1, -0.007), (2, 0.013), (3, -0.015), (4, -0.014), (5, 0.018), (6, -0.041), (7, 0.023), (8, -0.013), (9, 0.025), (10, -0.022), (11, -0.017), (12, 0.015), (13, 0.022), (14, 0.014), (15, 0.016), (16, 0.03), (17, 0.014), (18, -0.0), (19, -0.017), (20, 0.019), (21, 0.008), (22, 0.001), (23, -0.023), (24, 0.036), (25, 0.053), (26, -0.032), (27, -0.016), (28, -0.02), (29, -0.007), (30, 0.029), (31, 0.017), (32, -0.009), (33, -0.008), (34, 0.005), (35, 0.062), (36, 0.007), (37, -0.001), (38, -0.003), (39, 0.054), (40, -0.001), (41, -0.013), (42, -0.002), (43, -0.009), (44, -0.014), (45, -0.011), (46, 0.01), (47, -0.019), (48, 0.015), (49, 0.001)]
simIndex simValue blogId blogTitle
same-blog 1 0.94175559 1541 andrew gelman stats-2012-10-19-Statistical discrimination again
2 0.75206739 940 andrew gelman stats-2011-10-03-It depends upon what the meaning of the word “firm” is.
Introduction: David Hogg pointed me to this news article by Angela Saini: It’s not often that the quiet world of mathematics is rocked by a murder case. But last summer saw a trial that sent academics into a tailspin, and has since swollen into a fevered clash between science and the law. At its heart, this is a story about chance. And it begins with a convicted killer, “T”, who took his case to the court of appeal in 2010. Among the evidence against him was a shoeprint from a pair of Nike trainers, which seemed to match a pair found at his home. While appeals often unmask shaky evidence, this was different. This time, a mathematical formula was thrown out of court. The footwear expert made what the judge believed were poor calculations about the likelihood of the match, compounded by a bad explanation of how he reached his opinion. The conviction was quashed. . . . “The impact will be quite shattering,” says Professor Norman Fenton, a mathematician at Queen Mary, University of London.
3 0.72933906 32 andrew gelman stats-2010-05-14-Causal inference in economics
Introduction: Aaron Edlin points me to this issue of the Journal of Economic Perspectives that focuses on statistical methods for causal inference in economics. (Michael Bishop’s page provides some links.) To quickly summarize my reactions to Angrist and Pischke’s book: I pretty much agree with them that the potential-outcomes or natural-experiment approach is the most useful way to think about causality in economics and related fields. My main amendments to Angrist and Pischke would be to recognize that: 1. Modeling is important, especially modeling of interactions. It’s unfortunate to see a debate between experimentalists and modelers. Some experimenters (not Angrist and Pischke) make the mistake of avoiding models: Once they have their experimental data, they check their brains at the door and do nothing but simple differences, not realizing how much more can be learned. Conversely, some modelers are unduly dismissive of experiments and formal observational studies, forgetting t
4 0.7264632 1427 andrew gelman stats-2012-07-24-More from the sister blog
Introduction: Anthropologist Bruce Mannheim reports that a recent well-publicized study on the genetics of native Americans, which used genetic analysis to find “at least three streams of Asian gene flow,” is in fact a confirmation of a long-known fact. Mannheim writes: This three-way distinction was known linguistically since the 1920s (for example, Sapir 1921). Basically, it’s a division among the Eskimo-Aleut languages, which straddle the Bering Straits even today, the Athabaskan languages (which were discovered to be related to a small Siberian language family only within the last few years, not by Greenberg as Wade suggested), and everything else. This is not to say that the results from genetics are unimportant, but it’s good to see how it fits with other aspects of our understanding.
5 0.72087908 1114 andrew gelman stats-2012-01-12-Controversy about average personality differences between men and women
Introduction: Blogger Echidne pointed me to a recent article, “The Distance Between Mars and Venus: Measuring Global Sex Differences in Personality,” by Marco Del Giudice, Tom Booth, and Paul Irwing, who find: Sex differences in personality are believed to be comparatively small. However, research in this area has suffered from significant methodological limitations. We advance a set of guidelines for overcoming those limitations: (a) measure personality with a higher resolution than that afforded by the Big Five; (b) estimate sex differences on latent factors; and (c) assess global sex differences with multivariate effect sizes. . . . We found a global effect size D = 2.71, corresponding to an overlap of only 10% between the male and female distributions. Even excluding the factor showing the largest univariate ES [effect size], the global effect size was D = 1.71 (24% overlap). Echidne quotes a news article in which one of the study’s authors goes overboard: “Psychologically, men a
6 0.71874565 2043 andrew gelman stats-2013-09-29-The difficulties of measuring just about anything
7 0.70413285 98 andrew gelman stats-2010-06-19-Further thoughts on happiness and life satisfaction research
8 0.69405413 1767 andrew gelman stats-2013-03-17-The disappearing or non-disappearing middle class
10 0.68825877 490 andrew gelman stats-2010-12-29-Brain Structure and the Big Five
11 0.68487942 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?
12 0.68371987 161 andrew gelman stats-2010-07-24-Differences in color perception by sex, also the Bechdel test for women in movies
13 0.68324506 708 andrew gelman stats-2011-05-12-Improvement of 5 MPG: how many more auto deaths?
14 0.67824578 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics
15 0.67408556 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences
17 0.67058796 1906 andrew gelman stats-2013-06-19-“Behind a cancer-treatment firm’s rosy survival claims”
18 0.66329658 561 andrew gelman stats-2011-02-06-Poverty, educational performance – and can be done about it
19 0.6621353 1910 andrew gelman stats-2013-06-22-Struggles over the criticism of the “cannabis users and IQ change” paper
topicId topicWeight
[(15, 0.34), (16, 0.046), (21, 0.028), (23, 0.012), (24, 0.128), (58, 0.015), (59, 0.014), (63, 0.024), (66, 0.013), (72, 0.027), (79, 0.015), (84, 0.013), (89, 0.013), (99, 0.222)]
simIndex simValue blogId blogTitle
1 0.98123431 439 andrew gelman stats-2010-11-30-Of psychology research and investment tips
Introduction: A few days after “Dramatic study shows participants are affected by psychological phenomena from the future” (see here), the British Psychological Society follows up with “Can psychology help combat pseudoscience?” Somehow I’m reminded of that bit of financial advice which says, if you want to save some money, your best investment is to pay off your credit card bills.
2 0.94778001 908 andrew gelman stats-2011-09-14-Type M errors in the lab
Introduction: Jeff points us to this news article by Asher Mullard: Bayer halts nearly two-thirds of its target-validation projects because in-house experimental findings fail to match up with published literature claims, finds a first-of-a-kind analysis on data irreproducibility. An unspoken industry rule alleges that at least 50% of published studies from academic laboratories cannot be repeated in an industrial setting, wrote venture capitalist Bruce Booth in a recent blog post. A first-of-a-kind analysis of Bayer’s internal efforts to validate ‘new drug target’ claims now not only supports this view but suggests that 50% may be an underestimate; the company’s in-house experimental data do not match literature claims in 65% of target-validation projects, leading to project discontinuation. . . . Khusru Asadullah, Head of Target Discovery at Bayer, and his colleagues looked back at 67 target-validation projects, covering the majority of Bayer’s work in oncology, women’s health and cardiov
3 0.9296037 1081 andrew gelman stats-2011-12-24-Statistical ethics violation
Introduction: A colleague writes: When I was in NYC I went to this party by a group of Japanese bio-scientists. There, one guy told me about how the biggest pharmaceutical company in Japan did their statistics. They ran 100 different tests and reported the most significant one. (This was in 2006 and he said they stopped doing this a few years back, so they were doing this until pretty recently…) I’m not sure if this was 100 multiple comparisons or 100 different kinds of test, but I’m sure they wouldn’t want to disclose their data… Ouch!
4 0.92191446 834 andrew gelman stats-2011-08-01-I owe it all to the haters
Introduction: Sometimes when I submit an article to a journal it is accepted right away or with minor alterations. But many of my favorite articles were rejected or had to go through an exhausting series of revisions. For example, this influential article had a very hostile referee and we had to seriously push the journal editor to accept it. This one was rejected by one or two journals before finally appearing with discussion. This paper was rejected by the American Political Science Review with no chance of revision and we had to publish it in the British Journal of Political Science, which was a bit odd given that the article was 100% about American politics. And when I submitted this instant classic (actually at the invitation of the editor), the referees found it to be trivial, and the editor did me the favor of publishing it but only by officially labeling it as a discussion of another article that appeared in the same issue. Some of my most influential papers were accepted right
5 0.91969621 1394 andrew gelman stats-2012-06-27-99!
Introduction: Those of you who know what I’m talking about, know what I’m talking about.
same-blog 6 0.90246022 1541 andrew gelman stats-2012-10-19-Statistical discrimination again
8 0.8841784 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again
9 0.88065326 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education
10 0.87153167 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?
11 0.86428916 1800 andrew gelman stats-2013-04-12-Too tired to mock
12 0.86356723 1794 andrew gelman stats-2013-04-09-My talks in DC and Baltimore this week
13 0.86338758 2278 andrew gelman stats-2014-04-01-Association for Psychological Science announces a new journal
14 0.85482883 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression
15 0.83701223 762 andrew gelman stats-2011-06-13-How should journals handle replication studies?
16 0.8047781 1833 andrew gelman stats-2013-04-30-“Tragedy of the science-communication commons”
17 0.80126691 1998 andrew gelman stats-2013-08-25-A new Bem theory
18 0.79188883 2188 andrew gelman stats-2014-01-27-“Disappointed with your results? Boost your scientific paper”
20 0.78612053 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”