andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-2007 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Deborah Mayo quotes me as saying, “Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive.” She then follows up with: Gelman employs significance test-type reasoning to reject a model when the data sufficiently disagree. Now, strictly speaking, a model falsification, even to inferring something as weak as “the model breaks down,” is not purely deductive, but Gelman is right to see it as about as close as one can get, in statistics, to a deductive falsification of a model. But where does that leave him as a Jaynesian?

My reply: I was influenced by reading a toy example from Jaynes’s book where he sets up a model (for the probability of a die landing on each of its six sides) based on first principles, then presents some data that contradict the model, then expands the model. I’d seen very little of this sort of reasoning before in statistics! In physics it’s the standard way to go: you set up a model based on physic… But in statistics we weren’t usually seeing this. Instead, model checking typically was placed in the category of “hypothesis testing,” where the rejection was the goal. Models to be tested were straw men, built up only to be rejected. You can see this, for example, in social science papers that list research hypotheses that are not the same as the statistical “hypotheses” being tested. A typical research hypothesis is “Y causes Z,” with the corresponding statistical hypothesis being “Y has no association with Z after controlling for X.”
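To make that mismatch concrete, here is a minimal simulated sketch (my own illustration, not from the post; the variable names, coefficients, and data are all made up, and the regression is run with statsmodels): the research hypothesis is the causal claim “Y causes Z,” while the statistical hypothesis actually tested is only the null of no association between Y and Z after adjusting for X.

# Hypothetical illustration of the research-vs-statistical-hypothesis gap.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=n)                        # background covariate
Y = 0.5 * X + rng.normal(size=n)              # "treatment," correlated with X
Z = 0.3 * Y + 0.8 * X + rng.normal(size=n)    # outcome

df = pd.DataFrame({"X": X, "Y": Y, "Z": Z})
fit = smf.ols("Z ~ Y + X", data=df).fit()

# The tested hypothesis is only "the coefficient on Y is zero"; rejecting it
# falsifies the no-association model, not every rival of "Y causes Z."
print(fit.params["Y"], fit.pvalues["Y"])

Rejecting the null here tells you the no-association model does not fit these data; it does not, by itself, establish the causal research hypothesis.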
Jaynes’s approach—or, at least, what I took away from Jaynes’s presentation—was more simpatico to my way of doing science. And I put a lot of effort into formalizing this idea, so that the kind of modeling I talk and write about can be the kind of modeling I actually do. I don’t want to overstate this—as I wrote earlier, Jaynes is no guru—but I do think this combination of model building and checking is important. Indeed, just as a chicken is said to be an egg’s way of making another egg, we can view inference as a way of sharpening the implications of an assumed model so that it can better be checked. …

In response to Larry’s post here, let me give a quick +1 to this comment and also refer to this post, which remains relevant 3 years later. … See here for years and years of Popper-blogging. And here’s my article with Shalizi and our rejoinder to the discussion.
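As a concrete sketch of the build-check-expand cycle described above: start with a fair-die model derived from first principles, reject it with a significance-test-type check when the counts disagree, then expand to a Jaynes-style maximum-entropy model that matches the observed mean, and check again. The counts below are invented, this is not Jaynes’s actual worked example, and the code (using scipy) is only a schematic of the workflow.

# Schematic of the build / check / expand cycle (hypothetical counts).
import numpy as np
from scipy import stats, optimize

counts = np.array([3, 7, 12, 18, 26, 34])   # invented rolls of one die
n = counts.sum()
faces = np.arange(1, 7)

# Step 1: model from first principles -- a fair die, p_k = 1/6.
expected_fair = np.full(6, n / 6)
chi2, pval = stats.chisquare(counts, expected_fair)
print(f"fair-die check: chi2 = {chi2:.1f}, p = {pval:.2g}")  # tiny p: the data contradict the model

# Step 2: expand the model rather than stop at rejection.  One Jaynes-style
# expansion: the maximum-entropy distribution with the observed mean,
# p_k proportional to exp(lam * k), with lam chosen to match the sample mean.
sample_mean = (faces * counts).sum() / n

def mean_gap(lam):
    w = np.exp(lam * faces)
    return (faces * w).sum() / w.sum() - sample_mean

lam = optimize.brentq(mean_gap, -5, 5)
p_expanded = np.exp(lam * faces) / np.exp(lam * faces).sum()

# Step 3: check the expanded model the same way (ddof=1 for the fitted tilt).
chi2_2, pval_2 = stats.chisquare(counts, n * p_expanded, ddof=1)
print(f"tilted-die check: chi2 = {chi2_2:.1f}, p = {pval_2:.2g}")

The point is not the particular test but the workflow: inference within the expanded model sharpens its implications (here, the fitted tilt), which can then be checked against the data in turn.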
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
2 0.21645908 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
Introduction: Konrad Scheffler writes: I was interested by your paper “Induction and deduction in Bayesian data analysis” and was wondering if you would entertain a few questions: – Under the banner of objective Bayesianism, I would posit something like this as a description of Bayesian inference: “Objective Bayesian probability is not a degree of belief (which would necessarily be subjective) but a measure of the plausibility of a hypothesis, conditional on a formally specified information state. One way of specifying a formal information state is to specify a model, which involves specifying both a prior distribution (typically for a set of unobserved variables) and a likelihood function (typically for a set of observed variables, conditioned on the values of the unobserved variables). Bayesian inference involves calculating the objective degree of plausibility of a hypothesis (typically the truth value of the hypothesis is a function of the variables mentioned above) given such a
3 0.21098587 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models
Introduction: Robert Bloomfield writes: Most of the people in my field (accounting, which is basically applied economics and finance, leavened with psychology and organizational behavior) use ‘positive research methods’, which are typically described as coming to the data with a predefined theory, and using hypothesis testing to accept or reject the theory’s predictions. But a substantial minority use ‘interpretive research methods’ (sometimes called qualitative methods, for those that call positive research ‘quantitative’). No one seems entirely happy with the definition of this method, but I’ve found it useful to think of it as an attempt to see the world through the eyes of your subjects, much as Jane Goodall lived with gorillas and tried to see the world through their eyes.) Interpretive researchers often criticize positive researchers by noting that the latter don’t make the best use of their data, because they come to the data with a predetermined theory, and only test a narrow set of h
Introduction: In response to my remarks on his online book, Think Bayes, Allen Downey wrote: I [Downey] have a question about one of your comments: My [Gelman's] main criticism with both books is that they talk a lot about inference but not so much about model building or model checking (recall the three steps of Bayesian data analysis). I think it’s ok for an introductory book to focus on inference, which of course is central to the data-analytic process—but I’d like them to at least mention that Bayesian ideas arise in model building and model checking as well. This sounds like something I agree with, and one of the things I tried to do in the book is to put modeling decisions front and center. But the word “modeling” is used in lots of ways, so I want to see if we are talking about the same thing. For example, in many chapters, I start with a simple model of the scenario, do some analysis, then check whether the model is good enough, and iterate. Here’s the discussion of modeling
5 0.19710188 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
Introduction: Jonathan Livengood writes: I have a couple of questions on your paper with Cosma Shalizi on “Philosophy and the practice of Bayesian statistics.” First, you distinguish between inductive approaches and hypothetico-deductive approaches to inference and locate statistical practice (at least, the practice of model building and checking) on the hypothetico-deductive side. Do you think that there are any interesting elements of statistical practice that are properly inductive? For example, suppose someone is playing around with a system that more or less resembles a toy model, like drawing balls from an urn or some such, and where the person has some well-defined priors. The person makes a number of draws from the urn and applies Bayes theorem to get a posterior. On your view, is that person making an induction? If so, how much space is there in statistical practice for genuine inductions like this? Second, I agree with you that one ought to distinguish induction from other kind
6 0.1932071 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo
7 0.16358022 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
8 0.16036259 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
9 0.15410462 811 andrew gelman stats-2011-07-20-Kind of Bayesian
10 0.14822216 998 andrew gelman stats-2011-11-08-Bayes-Godel
11 0.14786386 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)
12 0.14538622 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
14 0.13782594 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings
15 0.13776545 746 andrew gelman stats-2011-06-05-An unexpected benefit of Arrow’s other theorem
16 0.13594164 2133 andrew gelman stats-2013-12-13-Flexibility is good
17 0.13525848 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?
18 0.13094088 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations
19 0.13051002 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
20 0.12787212 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics
simIndex simValue blogId blogTitle
same-blog 1 0.96651167 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
2 0.83332521 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
Introduction: David Rohde writes: I have been thinking a lot lately about your Bayesian model checking approach. This is in part because I have been working on exploratory data analysis and wishing to avoid controversy and mathematical statistics we omitted model checking from our discussion. This is something that the refereeing process picked us up on and we ultimately added a critical discussion of null-hypothesis testing to our paper . The exploratory technique we discussed was essentially a 2D histogram approach, but we used Polya models as a formal model for the histogram. We are currently working on a new paper, and we are thinking through how or if we should do “confirmatory analysis” or model checking in the paper. What I find most admirable about your statistical work is that you clearly use the Bayesian approach to do useful applied statistical analysis. My own attempts at applied Bayesian analysis makes me greatly admire your applied successes. On the other hand it may be t
Introduction: Psychologists talk about “folk psychology”: ideas that make sense to us about how people think and behave, even if these ideas are not accurate descriptions of reality. And physicists talk about “folk physics” (for example, the idea that a thrown ball falls in a straight line and then suddenly drops, rather than following an approximate parabola). There’s also “folk statistics.” Some of the ideas of folk statistics are so strong that even educated people–even well-known researchers–can make these mistakes. One of the ideas of folk statistics that bothers me a lot is what might be called the “either/or fallacy”: the idea that if there are two possible stories, the truth has to be one or the other. I have often encountered the either/or fallacy in Bayesian statistics, for example the vast literature on “model selection” or “variable selection” or “model averaging” in which it is assumed that one of some pre-specified discrete set of models is the truth, and that this true model
6 0.79654574 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +-- 2
7 0.78782922 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
8 0.78436154 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids
9 0.78071994 1817 andrew gelman stats-2013-04-21-More on Bayesian model selection in high-dimensional settings
10 0.77501869 496 andrew gelman stats-2011-01-01-Tukey’s philosophy
11 0.7748031 811 andrew gelman stats-2011-07-20-Kind of Bayesian
12 0.77311021 2133 andrew gelman stats-2013-12-13-Flexibility is good
13 0.77122378 1392 andrew gelman stats-2012-06-26-Occam
14 0.77073175 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
15 0.76670933 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor
16 0.76611215 448 andrew gelman stats-2010-12-03-This is a footnote in one of my papers
17 0.76557523 1521 andrew gelman stats-2012-10-04-Columbo does posterior predictive checks
18 0.76302892 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences
19 0.76134068 72 andrew gelman stats-2010-06-07-Valencia: Summer of 1991
20 0.76074821 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics
simIndex simValue blogId blogTitle
same-blog 1 0.96761316 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
2 0.95482796 2227 andrew gelman stats-2014-02-27-“What Can we Learn from the Many Labs Replication Project?”
Introduction: Aki points us to this discussion from Rolf Zwaan: The first massive replication project in psychology has just reached completion (several others are to follow). . . . What can we learn from the ManyLabs project? The results here show the effect sizes for the replication efforts (in green and grey) as well as the original studies (in blue). The 99% confidence intervals are for the meta-analysis of the effect size (the green dots); the studies are ordered by effect size. Let’s first consider what we canNOT learn from these data. Of the 13 replication attempts (when the first four are taken together), 11 succeeded and 2 did not (in fact, at some point ManyLabs suggests that a third one, Imagined Contact also doesn’t really replicate). We cannot learn from this that the vast majority of psychological findings will replicate . . . But even if we had an accurate estimate of the percentage of findings that replicate, how useful would that be? Rather than trying to arrive at a mo
3 0.95223212 1878 andrew gelman stats-2013-05-31-How to fix the tabloids? Toward replicable social science research
Introduction: This seems to be the topic of the week. Yesterday I posted on the sister blog some further thoughts on those “Psychological Science” papers on menstrual cycles, biceps size, and political attitudes, tied to a horrible press release from the journal Psychological Science hyping the biceps and politics study. Then I was pointed to these suggestions from Richard Lucas and M. Brent Donnellan have on improving the replicability and reproducibility of research published in the Journal of Research in Personality: It goes without saying that editors of scientific journals strive to publish research that is not only theoretically interesting but also methodologically rigorous. The goal is to select papers that advance the field. Accordingly, editors want to publish findings that can be reproduced and replicated by other scientists. Unfortunately, there has been a recent “crisis in confidence” among psychologists about the quality of psychological research (Pashler & Wagenmakers, 2012)
4 0.95131695 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox
Introduction: Unfortunately, when we deal with scientists, statisticians are often put in a setting reminiscent of Arrow’s paradox, where we are asked to provide estimates that are informative and unbiased and confidence statements that are correct conditional on the data and also on the underlying true parameter. [It's not generally possible for an estimate to do all these things at the same time -- ed.] Larry Wasserman feels that scientists are truly frequentist, and Don Rubin has told me how he feels that scientists interpret all statistical estimates Bayesianly. I have no doubt that both Larry and Don are correct. Voters want lower taxes and more services, and scientists want both Bayesian and frequency coverage; as the saying goes, everybody wants to go to heaven but nobody wants to die.
5 0.94916713 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data
Introduction: Yair points us to this page full of wonderful graphs from the Stephen Wolfram blog. Here are a few: And some words: People talk less about video games as they get older, and more about politics and the weather. Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. . . . Some of this is rather depressingly stereotypical. And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life. Of course, the pictures above are all based on aggregate data, carefully anonymized. But if we start looking at individuals, we’ll s
6 0.94870126 2137 andrew gelman stats-2013-12-17-Replication backlash
7 0.94824606 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?
9 0.94706005 1568 andrew gelman stats-2012-11-07-That last satisfaction at the end of the career
10 0.9466992 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots
11 0.94548655 2353 andrew gelman stats-2014-05-30-I posted this as a comment on a sociology blog
13 0.94344145 505 andrew gelman stats-2011-01-05-Wacky interview questions: An exploration into the nature of evidence on the internet
14 0.94337428 2177 andrew gelman stats-2014-01-19-“The British amateur who debunked the mathematics of happiness”
15 0.94270688 1788 andrew gelman stats-2013-04-04-When is there “hidden structure in data” to be discovered?
17 0.94190902 481 andrew gelman stats-2010-12-22-The Jumpstart financial literacy survey and the different purposes of tests
18 0.94149321 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery
19 0.94121075 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science
20 0.94107383 1980 andrew gelman stats-2013-08-13-Test scores and grades predict job performance (but maybe not at Google)