andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1859 knowledge-graph by maker-knowledge-mining
Introduction: I was asked to write an article for the Committee of Presidents of Statistical Societies (COPSS) 50th anniversary volume. Here it is (it’s labeled as “Chapter 1,” which isn’t right; that’s just what came out when I used the template that was supplied). The article begins as follows: The field of statistics continues to be divided into competing schools of thought. In theory one might imagine choosing the uniquely best method for each problem as it arises, but in practice we choose for ourselves (and recommend to others) default principles, models, and methods to be used in a wide variety of settings. This article briefly considers the informal criteria we use to decide what methods to use and what principles to apply in statistics problems. And then I follow up with these sections: Statistics: the science of defaults; Ways of knowing; The pluralist’s dilemma. And here’s the concluding paragraph: Statistics is a young science in which progress is being made in many
sentIndex sentText sentNum sentScore
1 I was asked to write an article for the Committee of Presidents of Statistical Societies (COPSS) 50th anniversary volume. [sent-1, score-0.112]
2 Here it is (it’s labeled as “Chapter 1,” which isn’t right; that’s just what came out when I used the template that was supplied). [sent-2, score-0.099]
3 The article begins as follows: The field of statistics continues to be divided into competing schools of thought. [sent-3, score-0.142]
4 In theory one might imagine choosing the uniquely best method for each problem as it arises, but in practice we choose for ourselves (and recommend to others) default principles, models, and methods to be used in a wide variety of settings. [sent-4, score-1.009]
5 This article briefly considers the informal criteria we use to decide what methods to use and what principles to apply in statistics problems. [sent-5, score-1.098]
6 And then I follow up with these sections: Statistics: the science of defaults; Ways of knowing; The pluralist’s dilemma. And here’s the concluding paragraph: Statistics is a young science in which progress is being made in many areas. [sent-6, score-0.675]
7 Practitioners have a wide variety of statistical approaches to choose from, and researchers have many potential directions to study. [sent-8, score-0.686]
8 A casual and introspective review suggests that there are many different criteria we use to decide that a statistical method is worthy of routine use. [sent-9, score-1.004]
9 Regular blog readers will recognize many of these themes, but I hope this particular presentation has some added value. [sent-11, score-0.174]
10 And this is as good a place as any to thank my many correspondents who’ve helped contribute to the development and expression of these ideas. [sent-12, score-0.381]
wordName wordTfidf (topN-words)
[('many', 0.174), ('criteria', 0.167), ('wide', 0.155), ('knowing', 0.146), ('variety', 0.145), ('statistics', 0.142), ('default', 0.141), ('decide', 0.138), ('principles', 0.138), ('introspective', 0.136), ('success', 0.128), ('optimality', 0.128), ('dilemma', 0.123), ('concluding', 0.123), ('uniquely', 0.123), ('copss', 0.123), ('choose', 0.12), ('correspondents', 0.119), ('centuries', 0.119), ('methods', 0.119), ('marketplace', 0.115), ('anniversary', 0.112), ('defaults', 0.109), ('practice', 0.107), ('benchmark', 0.105), ('proofs', 0.105), ('use', 0.105), ('toy', 0.103), ('lean', 0.102), ('presidents', 0.1), ('method', 0.099), ('societies', 0.099), ('template', 0.099), ('supplied', 0.099), ('psychometrics', 0.099), ('worthy', 0.097), ('ways', 0.097), ('considers', 0.096), ('modeling', 0.095), ('committee', 0.094), ('practitioners', 0.094), ('developments', 0.093), ('directions', 0.092), ('genetics', 0.091), ('regularization', 0.091), ('ranging', 0.089), ('routine', 0.088), ('informal', 0.088), ('helped', 0.088), ('sections', 0.088)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?
2 0.1837301 1469 andrew gelman stats-2012-08-25-Ways of knowing
Introduction: In this discussion from last month, computer science student and Judea Pearl collaborator Elias Barenboim expressed an attitude that hierarchical Bayesian methods might be fine in practice but that they lack theory, that Bayesians can’t succeed in toy problems. I posted a P.S. there which might not have been noticed so I will put it here: I now realize that there is some disagreement about what constitutes a “guarantee.” In one of his comments, Barenboim writes, “the assurance we have that the result must hold as long as the assumptions in the model are correct should be regarded as a guarantee.” In that sense, yes, we have guarantees! It is fundamental to Bayesian inference that the result must hold if the assumptions in the model are correct. We have lots of that in Bayesian Data Analysis (particularly in the first four chapters but implicitly elsewhere as well), and this is also covered in the classic books by Lindley, Jaynes, and others. This sort of guarantee is indeed p
3 0.17007847 2317 andrew gelman stats-2014-05-04-Honored oldsters write about statistics
Introduction: The new book titled: Past, Present, and Future of Statistical Science is now available for download. The official description makes the book sound pretty stuffy: Past, Present, and Future of Statistical Science, commissioned by the Committee of Presidents of Statistical Societies (COPSS) to celebrate its 50th anniversary and the International Year of Statistics, will be published in April by Taylor & Francis/CRC Press. Through the contributions of a distinguished group of 50 statisticians, the book showcases the breadth and vibrancy of statistics, describes current challenges and new opportunities, highlights the exciting future of statistical science, and provides guidance for future statisticians. Contributors are past COPSS award honorees. But it actually has lots of good stuff, including the chapter by Tibshirani which I discussed last year (in the context of the “bet on sparsity principle”), and chapters by XL and other fun people. Also my own chapter, How do we choo
Introduction: Statistics is the science of defaults. One of the differences between statistics and other branches of engineering is that we have a special love for default procedures, perhaps because so many statistical problems are routine (or, at least, people would like them to be). We have standard estimates for all sorts of models, books of statistical tests, and default settings for everything. Recently I’ve been working on default weakly informative priors (which are not the same as the typically noninformative “reference priors” of the Bayesian literature). From a Bayesian point of view, the appropriate default procedure could be defined as that which is appropriate for the population of problems that one might be studying. More generally, much of our job as statisticians is to come up with methods that will be used by others in routine practice. (Much of the rest of our job is to come up with methods for evaluating new and existing statistical methods, and methods for coming up wi
5 0.15316403 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults
Introduction: On statisticians and statistical software: Statisticians are particularly sensitive to default settings, which makes sense considering that statistics is, in many ways, a science based on defaults. What is a “statistical method” if not a recommended default analysis, backed up by some combination of theory and experience?
6 0.12262139 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings
7 0.10809194 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes
8 0.10780108 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
9 0.10733984 1979 andrew gelman stats-2013-08-13-Convincing Evidence
10 0.10675746 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences
11 0.10643853 361 andrew gelman stats-2010-10-21-Tenure-track statistics job at Teachers College, here at Columbia!
12 0.10504733 2245 andrew gelman stats-2014-03-12-More on publishing in journals
13 0.10297214 2151 andrew gelman stats-2013-12-27-Should statistics have a Nobel prize?
14 0.10060051 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning
15 0.10051496 557 andrew gelman stats-2011-02-05-Call for book proposals
16 0.10021264 746 andrew gelman stats-2011-06-05-An unexpected benefit of Arrow’s other theorem
19 0.094266027 1165 andrew gelman stats-2012-02-13-Philosophy of Bayesian statistics: my reactions to Wasserman
20 0.092245474 1605 andrew gelman stats-2012-12-04-Write This Book
topicId topicWeight
[(0, 0.192), (1, 0.04), (2, -0.089), (3, -0.012), (4, -0.027), (5, 0.05), (6, -0.106), (7, 0.026), (8, -0.016), (9, 0.03), (10, 0.001), (11, -0.032), (12, -0.01), (13, -0.003), (14, -0.019), (15, 0.006), (16, -0.031), (17, 0.009), (18, -0.008), (19, -0.025), (20, 0.031), (21, -0.03), (22, -0.028), (23, 0.07), (24, 0.025), (25, 0.055), (26, 0.011), (27, 0.06), (28, 0.018), (29, -0.025), (30, 0.015), (31, 0.063), (32, 0.037), (33, -0.002), (34, 0.005), (35, -0.005), (36, 0.006), (37, 0.023), (38, -0.002), (39, -0.015), (40, -0.021), (41, -0.02), (42, 0.012), (43, 0.01), (44, -0.001), (45, -0.022), (46, -0.055), (47, -0.009), (48, 0.002), (49, 0.02)]
simIndex simValue blogId blogTitle
same-blog 1 0.9852106 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?
2 0.84990311 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults
Introduction: On statisticians and statistical software: Statisticians are particularly sensitive to default settings, which makes sense considering that statistics is, in many ways, a science based on defaults. What is a “statistical method” if not a recommended default analysis, backed up by some combination of theory and experience?
3 0.79963845 1979 andrew gelman stats-2013-08-13-Convincing Evidence
Introduction: Keith O’Rourke and I wrote an article that begins: Textbooks on statistics emphasize care and precision, via concepts such as reliability and validity in measurement, random sampling and treatment assignment in data collection, and causal identification and bias in estimation. But how do researchers decide what to believe and what to trust when choosing which statistical methods to use? How do they decide the credibility of methods? Statisticians and statistical practitioners seem to rely on a sense of anecdotal evidence based on personal experience and on the attitudes of trusted colleagues. Authorship, reputation, and past experience are thus central to decisions about statistical procedures. It’s for a volume on theoretical or methodological research on authorship, functional roles, reputation, and credibility in social media, edited by Sorin Matei and Elisa Bertino.
4 0.7911948 2151 andrew gelman stats-2013-12-27-Should statistics have a Nobel prize?
Introduction: Xiao-Li says yes: The most compelling reason for having highly visible awards in any field is to enhance its ability to attract future talent. Virtually all the media and public attention our profession received in recent years has been on the utility of statistics in all walks of life. We are extremely happy for and proud of this recognition—it is long overdue. However, the media and public have given much more attention to the Fields Medal than to the COPSS Award, even though the former has hardly been about direct or even indirect impact on everyday life. Why this difference? . . . these awards arouse media and public interest by featuring how ingenious the awardees are and how difficult the problems they solved, much like how conquering Everest bestows admiration not because the admirers care or even know much about Everest itself but because it represents the ultimate physical feat. In this sense, the biggest winner of the Fields Medal is mathematics itself: enticing the brig
5 0.78895324 557 andrew gelman stats-2011-02-05-Call for book proposals
Introduction: Rob Calver writes: Large and complex datasets are becoming prevalent in the social and behavioral sciences and statistical methods are crucial for the analysis and interpretation of such data. The Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences Series aims to capture new developments in statistical methodology with particular relevance to applications in the social and behavioral sciences. It seeks to promote appropriate use of statistical, econometric and psychometric methods in these applied sciences by publishing a broad range of monographs, textbooks and handbooks. The scope of the series is wide, including applications of statistical methodology in sociology, psychology, economics, education, marketing research, political science, criminology, public policy, demography, survey methodology and official statistics. The titles included in the series are designed to appeal to applied statisticians, as well as students, researchers and practitioners from the
6 0.78050953 744 andrew gelman stats-2011-06-03-Statistical methods for healthcare regulation: rating, screening and surveillance
7 0.77586842 498 andrew gelman stats-2011-01-02-Theoretical vs applied statistics
10 0.74601912 2317 andrew gelman stats-2014-05-04-Honored oldsters write about statistics
11 0.74066061 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences
12 0.73174268 1740 andrew gelman stats-2013-02-26-“Is machine learning a subset of statistics?”
13 0.73106611 1889 andrew gelman stats-2013-06-08-Using trends in R-squared to measure progress in criminology??
14 0.7298789 155 andrew gelman stats-2010-07-19-David Blackwell
15 0.72905648 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research
16 0.72751135 214 andrew gelman stats-2010-08-17-Probability-processing hardware
18 0.71916306 1110 andrew gelman stats-2012-01-10-Jobs in statistics research! In New Jersey!
19 0.71110785 738 andrew gelman stats-2011-05-30-Works well versus well understood
20 0.70520973 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?
topicId topicWeight
[(2, 0.017), (10, 0.1), (16, 0.094), (21, 0.042), (24, 0.081), (42, 0.019), (53, 0.022), (55, 0.018), (58, 0.011), (61, 0.015), (63, 0.025), (74, 0.011), (77, 0.043), (84, 0.013), (86, 0.032), (90, 0.01), (99, 0.339)]
simIndex simValue blogId blogTitle
same-blog 1 0.96823728 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?
2 0.95973068 2215 andrew gelman stats-2014-02-17-The Washington Post reprints university press releases without editing them
Introduction: Somebody points me to this horrifying exposé by Paul Raeburn on a new series by the Washington Post where they reprint press releases as if they are actual news. And the gimmick is, the reason why it’s appearing on this blog, is that these are university press releases on science stories. What could possibly go wrong there? After all, Steve Chaplin, a self-identified “science-writing PIO from an R1,” writes in a comment to Raeburn’s post: We write about peer-reviewed research accepted for publication or published by the world’s leading scientific journals after that research has been determined to be legitimate. Repeatability of new research is a publication requisite. I emphasized that last sentence myself because it was such a stunner. Do people really think that??? So I guess what he’s saying is, they don’t do press releases for articles from Psychological Science or the Journal of Personality and Social Psychology. But I wonder how the profs in the psych d
3 0.95802659 344 andrew gelman stats-2010-10-15-Story time
Introduction: This one belongs in the statistical lexicon. Kaiser Fung nails it: In reading [news] articles, we must look out for the moment(s) when the reporters announce story time. Much of the article is great propaganda for the statistics lobby, describing an attempt to use observational data to address a practical question, sort of a Freakonomics-style application. We have no problems when they say things like: “There is a substantial gap at year’s end between students whose teachers were in the top 10% in effectiveness and the bottom 10%. The fortunate students ranked 17 percentile points higher in English and 25 points higher in math.” Or this: “On average, Smith’s students slide under his instruction, losing 14 percentile points in math during the school year relative to their peers districtwide, The Times found. Overall, he ranked among the least effective of the district’s elementary school teachers.” Midway through the article (right before the section called “Study in contras
4 0.95660597 487 andrew gelman stats-2010-12-27-Alfred Kahn
Introduction: Appointed “inflation czar” in late 1970s, Alfred Kahn is most famous for deregulating the airline industry. At the time this seemed to make sense, although in retrospect I’m less a fan of consumer-driven policies than I used to be. When I was a kid we subscribed to Consumer Reports and so I just assumed that everything that was good for the consumer–lower prices, better products, etc.–was a good thing. Upon reflection, though, I think it’s a mistake to focus too narrowly on the interests of consumers. For example (from my Taleb review a couple years ago): The discussion on page 112 of how Ralph Nader saved lives (mostly via seat belts in cars) reminds me of his car-bumper campaign in the 1970s. My dad subscribed to Consumer Reports then (he still does, actually, and I think reads it for pleasure–it must be one of those Depression-mentality things), and at one point they were pushing heavily for the 5-mph bumpers. Apparently there was some federal regulation about how strong
5 0.95564461 2257 andrew gelman stats-2014-03-20-The candy weighing demonstration, or, the unwisdom of crowds
Introduction: From 2008: The candy weighing demonstration, or, the unwisdom of crowds My favorite statistics demonstration is the one with the bag of candies. I’ve elaborated upon it since including it in the Teaching Statistics book and I thought these tips might be useful to some of you. Preparation Buy 100 candies of different sizes and shapes and put them in a bag (the plastic bag from the store is fine). Get something like 20 large full-sized candy bars, 20 or 30 little things like mini Snickers bars and mini Peppermint Patties. And then 50 or 60 really little things like tiny Tootsie Rolls, lollipops, and individually-wrapped Life Savers. Count and make sure it’s exactly 100. You also need a digital kitchen scale that reads out in grams. Also bring a sealed envelope inside of which is a note (details below). When you get into the room, unobtrusively put the note somewhere, for example between two books on a shelf or behind a window shade. Setup Hold up the back of cand
7 0.95288193 2301 andrew gelman stats-2014-04-22-Ticket to Baaaaarf
8 0.95273662 78 andrew gelman stats-2010-06-10-Hey, where’s my kickback?
9 0.94954216 2107 andrew gelman stats-2013-11-20-NYT (non)-retraction watch
10 0.94755733 154 andrew gelman stats-2010-07-18-Predictive checks for hierarchical models
11 0.94707346 1163 andrew gelman stats-2012-02-12-Meta-analysis, game theory, and incentives to do replicable research
12 0.94691658 1912 andrew gelman stats-2013-06-24-Bayesian quality control?
14 0.94519168 430 andrew gelman stats-2010-11-25-The von Neumann paradox
15 0.94446993 1948 andrew gelman stats-2013-07-21-Bayes related
16 0.94391733 2158 andrew gelman stats-2014-01-03-Booze: Been There. Done That.
17 0.94314814 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
18 0.94312429 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients
19 0.94307244 1635 andrew gelman stats-2012-12-22-More Pinker Pinker Pinker
20 0.94301468 2114 andrew gelman stats-2013-11-26-“Please make fun of this claim”