andrew_gelman_stats-2011-614 knowledge-graph by maker-knowledge-mining
Source: html
Title: Induction within a model, deductive inference for model evaluation (andrew gelman stats, 2011-03-15)
Introduction: Jonathan Livengood writes:

I have a couple of questions on your paper with Cosma Shalizi on “Philosophy and the practice of Bayesian statistics.”

First, you distinguish between inductive approaches and hypothetico-deductive approaches to inference and locate statistical practice (at least, the practice of model building and checking) on the hypothetico-deductive side. Do you think that there are any interesting elements of statistical practice that are properly inductive? For example, suppose someone is playing around with a system that more or less resembles a toy model, like drawing balls from an urn or some such, and where the person has some well-defined priors. The person makes a number of draws from the urn and applies Bayes’ theorem to get a posterior. On your view, is that person making an induction? If so, how much space is there in statistical practice for genuine inductions like this?

Second, I agree with you that one ought to distinguish induction from other kinds of risky inference, but I’m not sure that I see a clear payoff from making the distinction. I’m worried because a lot of smart philosophers just don’t distinguish “inductive” inferences from “risky” inferences. […] But if that is right, then a model that survives attempts at falsification and then gets used to make predictions is still going to be open to a Humean attack. […] Rather, it’s a variety of induction and suffers all the same difficulties as simple enumerative induction. So, I guess what I’d like to know is in what ways you think the philosophers are misled here. What is the value/motivation for distinguishing induction from hypothetico-deductive inference?

I replied: My short answer is that inductive inference of the balls-in-urns variety takes place within a model, and the deductive Popperian reasoning takes place when evaluating a model. […] I think of “Popper” more as a totem than as an actual person or body of work. […] Crudely speaking, I think of models as a language, with models created in the same way that we create sentences, by working with recursive structures.

Livengood followed up: When you say that inductive inference takes place within a model, are you claiming that an inductive inference is justified just to the extent that the model within which the induction takes place is justified (or approximately correct or some such — I know you won’t say “true” here …)? If so, then under what conditions do you think a model is justified? That is, under what conditions do you think one is justified in making *predictions* on the basis of a model?

My reply: There will (almost) always be some assumptions required, some sense in which any prediction is conditional on something. Stepping back a bit, I’d say that scientists get experience with certain models; they work well for prediction until they don’t.
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
2 0.24544403 1652 andrew gelman stats-2013-01-03-“The Case for Inductive Theory Building”
Introduction: Professor of business management Edwin Locke sent me an article : This paper argues that theory building in the social sciences, management and psychology included, should be inductive. It begins by critiquing contemporary philosophy of science, e.g., Popper’s falsifiability theory, his stress on deduction, and the hypothetico-deductive method. Next I present some history of the concept of induction in philosophy and of inductive theory building in the hard sciences (e.g., Aristotle, Bacon, Newton). This is followed by three examples of successful theory building by induction in psychology and management (Beck’s theory, Bandura’s social-cognitive theory, goal setting theory). The paper concludes with some suggested guidelines for successful theory building through induction and some new policies that journal editors might encourage. Like most social scientists (but maybe not most Bayesians ), I’m pretty much a Popperian myself, so I was interested to see someone taking such a
3 0.19911288 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
Introduction: I’ve been writing a lot about my philosophy of Bayesian statistics and how it fits into Popper’s ideas about falsification and Kuhn’s ideas about scientific revolutions. Here’s my long, somewhat technical paper with Cosma Shalizi. Here’s our shorter overview for the volume on the philosophy of social science. Here’s my latest try (for an online symposium), focusing on the key issues. I’m pretty happy with my approach–the familiar idea that Bayesian data analysis iterates the three steps of model building, inference, and model checking–but it does have some unresolved (maybe unresolvable) problems. Here are a couple mentioned in the third of the above links. Consider a simple model with independent data y_1, y_2, .., y_10 ~ N(θ,σ^2), with a prior distribution θ ~ N(0,10^2) and σ known and taking on some value of approximately 10. Inference about θ is straightforward, as is model checking, whether based on graphs or numerical summaries such as the sample variance and skewn
4 0.19710188 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
Introduction: Deborah Mayo quotes me as saying, “Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive.” She then follows up with: Gelman employs significance test-type reasoning to reject a model when the data sufficiently disagree. Now, strictly speaking, a model falsification, even to inferring something as weak as “the model breaks down,” is not purely deductive, but Gelman is right to see it as about as close as one can get, in statistics, to a deductive falsification of a model. But where does that leave him as a Jaynesian? My reply: I was influenced by reading a toy example from Jaynes’s book where he sets up a model (for the probability of a die landing on each of its six sides) based on first principles, then presents some data that contradict the model, then expands the model. I’d seen very little of this sort of this reasoning before in statistics! In physics it’s the standard way to go: you set up a model based on physic
5 0.19543524 746 andrew gelman stats-2011-06-05-An unexpected benefit of Arrow’s other theorem
Introduction: In my remarks on Arrow’s theorem (the weak form of Arrow’s Theorem is that any result can be published no more than five times. The strong form is that every result will be published five times), I meant no criticism of Bruno Frey, the author of the articles in question: I agree that it can be a contribution to publish in multiple places. Regarding the evaluation of contributions, it should be possible to evaluate research contributions and also evaluate communication. One problem is that communication is both under- and over-counted. It’s undercounted in that we mostly get credit for original ideas not for exposition; it’s overcounted in that we need communication skills to publish in the top journals. But I don’t think these two biases cancel out. The real reason I’m bringing this up, though, is because Arrow’s theorem happened to me recently and in interesting way. Here’s the story. Two years ago I was contacted by Harold Kincaid to write a chapter on Bayesian statistics
6 0.19050278 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”
7 0.18700029 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo
8 0.17350926 1181 andrew gelman stats-2012-02-23-Philosophy: Pointer to Salmon
9 0.17281517 811 andrew gelman stats-2011-07-20-Kind of Bayesian
10 0.17183979 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics
11 0.17095265 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?
12 0.16495359 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)
13 0.1606828 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
14 0.15552467 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
15 0.15210719 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
17 0.14382195 998 andrew gelman stats-2011-11-08-Bayes-Godel
18 0.1421466 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
19 0.13956553 23 andrew gelman stats-2010-05-09-Popper’s great, but don’t bother with his theory of probability
20 0.13922963 556 andrew gelman stats-2011-02-04-Patterns
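Item 3 in the list above describes a concrete toy model: y_1, …, y_10 ~ N(θ, σ²) with prior θ ~ N(0, 10²) and σ known, roughly 10. The “straightforward” inference it mentions is the conjugate normal-normal update; a minimal sketch with made-up observations (the data values are illustrative assumptions, not from the post):

```python
from statistics import mean

# Ten illustrative observations standing in for y_1, ..., y_10.
y = [12.1, -3.4, 7.7, 15.0, 1.2, 9.8, -0.5, 4.4, 11.3, 6.0]
sigma, prior_mean, prior_sd = 10.0, 0.0, 10.0  # sigma known; theta ~ N(0, 10^2)

# Conjugate normal-normal update: precisions add, and the posterior mean is a
# precision-weighted average of the prior mean and the sample mean.
prior_prec = 1 / prior_sd ** 2    # 0.01
data_prec = len(y) / sigma ** 2   # 0.10
post_var = 1 / (prior_prec + data_prec)
post_mean = post_var * (prior_prec * prior_mean + data_prec * mean(y))

print(round(post_mean, 3), round(post_var ** 0.5, 3))
```

With these settings the data carry ten times the prior’s precision, so the posterior mean is the sample mean shrunk only slightly toward zero — the within-model inference is mechanical, which is exactly the post’s point that the hard part is evaluating the model itself.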
simIndex simValue blogId blogTitle
same-blog 1 0.96669835 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
Introduction: David Rohde writes: I have been thinking a lot lately about your Bayesian model checking approach. This is in part because I have been working on exploratory data analysis and wishing to avoid controversy and mathematical statistics we omitted model checking from our discussion. This is something that the refereeing process picked us up on and we ultimately added a critical discussion of null-hypothesis testing to our paper . The exploratory technique we discussed was essentially a 2D histogram approach, but we used Polya models as a formal model for the histogram. We are currently working on a new paper, and we are thinking through how or if we should do “confirmatory analysis” or model checking in the paper. What I find most admirable about your statistical work is that you clearly use the Bayesian approach to do useful applied statistical analysis. My own attempts at applied Bayesian analysis makes me greatly admire your applied successes. On the other hand it may be t
3 0.85626209 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
Introduction: In response to this article by Cosma Shalizi and myself on the philosophy of Bayesian statistics, David Hogg writes: I [Hogg] agree–even in physics and astronomy–that the models are not “True” in the God-like sense of being absolute reality (that is, I am not a realist); and I have argued (a philosophically very naive paper, but hey, I was new to all this) that for pretty fundamental reasons we could never arrive at the True (with a capital “T”) model of the Universe. The goal of inference is to find the “best” model, where “best” might have something to do with prediction, or explanation, or message length, or (horror!) our utility. Needless to say, most of my physics friends *are* realists, even in the face of “effective theories” as Newtonian mechanics is an effective theory of GR and GR is an effective theory of “quantum gravity” (this plays to your point, because if you think any theory is possibly an effective theory, how could you ever find Truth?). I also liked the i
4 0.8447699 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor
Introduction: In my comments on David MacKay’s 2003 book on Bayesian inference, I wrote that I hate all the Occam-factor stuff that MacKay talks about, and I linked to this quote from Radford Neal: Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well. MacKay replied as follows: When you said you disagree with me on Occam factors I think what you meant was that you agree with me on them. I’ve read your post on the topic and completely agreed with you (and Radford) that we should be using models the size of a house, models that we believe in, and that anyone who thinks it is a good idea to bias the model toward
5 0.8444708 811 andrew gelman stats-2011-07-20-Kind of Bayesian
Introduction: Astrophysicist Andrew Jaffe pointed me to this and discussion of my philosophy of statistics (which is, in turn, my rational reconstruction of the statistical practice of Bayesians such as Rubin and Jaynes). Jaffe’s summary is fair enough and I only disagree in a few points: 1. Jaffe writes: Subjective probability, at least the way it is actually used by practicing scientists, is a sort of “as-if” subjectivity — how would an agent reason if her beliefs were reflected in a certain set of probability distributions? This is why when I discuss probability I try to make the pedantic point that all probabilities are conditional, at least on some background prior information or context. I agree, and my problem with the usual procedures used for Bayesian model comparison and Bayesian model averaging is not that these approaches are subjective but that the particular models being considered don’t make sense. I’m thinking of the sorts of models that say the truth is either A or
6 0.83872688 72 andrew gelman stats-2010-06-07-Valencia: Summer of 1991
7 0.83427787 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
8 0.82662964 496 andrew gelman stats-2011-01-01-Tukey’s philosophy
9 0.81485367 1392 andrew gelman stats-2012-06-26-Occam
10 0.80688924 217 andrew gelman stats-2010-08-19-The “either-or” fallacy of believing in discrete models: an example of folk statistics
11 0.80026817 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
12 0.79344922 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo
13 0.79242051 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics
14 0.79102486 317 andrew gelman stats-2010-10-04-Rob Kass on statistical pragmatism, and my reactions
15 0.7851032 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
16 0.7840361 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis
17 0.78136951 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
18 0.77541649 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar
19 0.77419466 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics
20 0.77164829 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?
simIndex simValue blogId blogTitle
1 0.98583293 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot
Introduction: The most perfect pairing of author and subject since Nicholson Baker and John Updike. Here’s Wolfram on the great researcher of fractals : In his way, Mandelbrot paid me some great compliments. When I was in my 20s, and he in his 60s, he would ask about my scientific work: “How can so many people take someone so young so seriously?” In 2002, my book “A New Kind of Science”—in which I argued that many phenomena across science are the complex results of relatively simple, program-like rules—appeared. Mandelbrot seemed to see it as a direct threat, once declaring that “Wolfram’s ‘science’ is not new except when it is clearly wrong; it deserves to be completely disregarded.” In private, though, several mutual friends told me, he fretted that in the long view of history it would overwhelm his work. In retrospect, I don’t think Mandelbrot had much to worry about on this account. The link from the above review came from Peter Woit, who also points to a review by Brian Hayes wit
2 0.98344553 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly
Introduction: Denis Cote sends the following , under the heading, “Some bad graphs for your enjoyment”: To start with, they don’t know how to spell “color.” Seriously, though, the graph is a mess. The circular display implies a circular or periodic structure that isn’t actually in the data, the cramped display requires the use of an otherwise-unnecessary color code that makes it difficult to find or make sense of the information, the alphabetical ordering (without even supplying state names, only abbreviations) makes it further difficult to find any patterns. It would be so much better, and even easier, to just display a set of small maps shading states on whether they have different laws. But that’s part of the problem—the clearer graph would also be easier to make! To get a distinctive graph, there needs to be some degree of difficulty. The designers continue with these monstrosities: Here they decide to display only 5 states at a time so that it’s really hard to see any big pi
3 0.97982919 978 andrew gelman stats-2011-10-28-Cool job opening with brilliant researchers at Yahoo
Introduction: Duncan Watts writes: The Human Social Dynamics Group in Yahoo Research is seeking highly qualified candidates for a post-doctoral research scientist position. The Human and Social Dynamics group is devoted to understanding the interplay between individual-level behavior (e.g. how people make decisions about what music they like, which dates to go on, or which groups to join) and the social environment in which individual behavior necessarily plays itself out. In particular, we are interested in: * Structure and evolution of social groups and networks * Decision making, social influence, diffusion, and collective decisions * Networking and collaborative problem solving. The intrinsically multi-disciplinary and cross-cutting nature of the subject demands an eclectic range of researchers, both in terms of domain-expertise (e.g. decision sciences, social psychology, sociology) and technical skills (e.g. statistical analysis, mathematical modeling, computer simulations, design o
4 0.97898436 1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?
Introduction: David Sparks writes: I am experimenting with the mapping/visualization of survey response data, with a particular focus on using transparency to convey uncertainty. See some examples here . Do you think the examples are successful at communicating both local values of the variable of interest, as well as the lack of information in certain places? Also, do you have any general advice for choosing an approach to spatially smoothing the data in a way that preserves local features, but prevents individual respondents from standing out? I have experimented a lot with smoothing in these maps, and the cost of preventing the Midwest and West from looking “spotty” is the oversmoothing of the Northeast. My quick impression is that the graphs are more pretty than they are informative. But “pretty” is not such a bad thing! The conveying-information part is more difficult: to me, the graphs seem to be displaying a somewhat confusing mix of opinion level and population density. Consider
5 0.97727048 1604 andrew gelman stats-2012-12-04-An epithet I can live with
Introduction: Here . Indeed, I’d much rather be a legend than a myth. I just want to clarify one thing. Walter Hickey writes: [Antony Unwin and Andrew Gelman] collaborated on this presentation where they take a hard look at what’s wrong with the recent trends of data visualization and infographics. The takeaway is that while there have been great leaps in visualization technology, some of the visualizations that have garnered the highest praises have actually been lacking in a number of key areas. Specifically, the pair does a takedown of the top visualizations of 2008 as decided by the popular statistics blog Flowing Data. This is a fair summary, but I want to emphasize that, although our dislike of some award-winning visualizations is central to our argument, it is only the first part of our story. As Antony and I worked more on our paper, and especially after seeing the discussions by Robert Kosara, Stephen Few, Hadley Wickham, and Paul Murrell (all to appear in Journal of Computati
8 0.96785569 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery
9 0.95986879 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me
10 0.95889056 57 andrew gelman stats-2010-05-29-Roth and Amsterdam
same-blog 11 0.95773053 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
12 0.95770693 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer
13 0.95708394 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet
14 0.95174026 216 andrew gelman stats-2010-08-18-More forecasting competitions
15 0.94986033 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?
16 0.94964695 1948 andrew gelman stats-2013-07-21-Bayes related
17 0.94902951 1438 andrew gelman stats-2012-07-31-What is a Bayesian?
18 0.94794774 1788 andrew gelman stats-2013-04-04-When is there “hidden structure in data” to be discovered?
19 0.94737828 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
20 0.94703078 128 andrew gelman stats-2010-07-05-The greatest works of statistics never published