andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1742 knowledge-graph by maker-knowledge-mining

1742 andrew gelman stats-2013-02-27-What is “explanation”?


meta info for this blog

Source: html

Introduction: “Explanation” is this thing that social scientists (or people in their everyday lives, acting like social scientists) do, where some event X happens and we supply a coherent story that concludes with X. Sometimes we speak of an event as “overdetermined,” when we can think of many plausible stories that all lead to X. My question today is: what is explanation, in a statistical sense? To understand why this is a question worth asking at all, compare to prediction. Prediction is another thing that we all do, typically in a qualitative fashion: I think she’s gonna win this struggle, I think he’s probably gonna look for a new job, etc. It’s pretty clear how to map everyday prediction into a statistical framework, and we can think of informal qualitative predictions as approximations to the predictions that could be made by a statistical model (as in the classic work of Meehl and others on clinical vs. statistical prediction). Fitting “explanation” into a statistical framework i


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 “Explanation” is this thing that social scientists (or people in their everyday lives, acting like social scientists) do, where some event X happens and we supply a coherent story that concludes with X. [sent-1, score-0.387]

2 Prediction is another thing that we all do, typically in a qualitative fashion: I think she’s gonna win this struggle, I think he’s probably gonna look for a new job, etc. [sent-5, score-0.225]

3 It’s pretty clear how to map everyday prediction into a statistical framework, and we can think of informal qualitative predictions as approximations to the predictions that could be made by a statistical model (as in the classic work of Meehl and others on clinical vs. [sent-6, score-0.67]

4 Fitting “explanation” into a statistical framework is more of a challenge. [sent-8, score-0.148]

5 I was thinking about this the other day after reading a blog exchange that began with a post by sociologist Fabio Rojas entitled “the argo win easily explained”: The Academy loves well crafted films that are about actors or acting, especially when actors save the day. [sent-9, score-1.051]

6 Example: Shakespeare in Love beats Saving Private Ryan; the Kings Speech beats Black Swan, Inception and Social Network. [sent-11, score-0.186]

7 It’s not like you learned anything new about the nominated films over the past 48 hours besides who actually won. [sent-17, score-0.248]

8 I could see where Basbøll was coming from, but his comment seemed too strong to me, so I responded to the thread: To be fair, Fabio didn’t say “the argo win easily predicted,” he said “explained. [sent-21, score-0.551]

9 For a social scientist to make a prediction is clear enough, but we also spend a lot of time explaining. [sent-23, score-0.306]

10 ) Explanation is not the same as prediction but it’s not nothing. [sent-26, score-0.223]

11 The fact that Fabio could’ve explained a Lincoln win does not make his Argo explanation empty. [sent-28, score-0.703]

12 (Over the years I’ve spent a lot of time considering commonsense “practical” ideas such as mixing of Markov chains, checking of model fit, statistical graphics, boundary-avoiding estimates, and storytelling, and placing them in a formal statistical-modeling framework. [sent-30, score-0.198]

13 ) Explanation is not prediction (for the reason indicated by Basbøll above), but it’s something. [sent-32, score-0.223]

14 Rojas’s Argo explanation helps him elaborate his implicit theory of the Oscars, essentially constraining his theory as compared to where it was before the awards were announced. [sent-34, score-0.778]

15 What “explanation” does is to align the theory to fit the data, which is comparable to the statistical procedure of restricting the parameters to the zone of high likelihood for the observed data. [sent-36, score-0.168]

16 Without explanations (including after-the-fact explanations), it would be difficult to understand a model well enough to use it. [sent-38, score-0.258]

17 Another way of putting it is that explanation is a form of consistency check. [sent-39, score-0.5]

18 It would be relevant to social science if it helps us to formulate our explanations in terms of what we have learned from the data: in this case, how are Rojas’s post-Oscars views of the world different from his views last week. [sent-41, score-0.527]

19 If Basbøll is right and Rojas did not forecast the Argo win ahead of time, that’s fine; to that extent, his explanation will be more valuable to the extent that it articulates (even if only qualitatively) the role of the new information in refining his theories. [sent-42, score-0.757]

20 A purely predictive explanation would not really feel like an explanation at all. [sent-50, score-1.0]
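
Sentence 15 above compares explanation to restricting parameters to the zone of high likelihood for the observed data. As a rough illustration of that statistical procedure, here is a minimal sketch, assuming a binomial model with a single parameter theta; the function names, the grid, and the cutoff `drop` (how far below the maximum log-likelihood a value may fall) are illustrative choices and do not come from the post.

```python
import math


def log_likelihood(theta, successes, trials):
    """Binomial log-likelihood, dropping the constant binomial coefficient."""
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def high_likelihood_zone(successes, trials, drop=2.0, grid_size=999):
    """Return the grid values of theta whose log-likelihood is within
    `drop` units of the maximum over the grid (an assumed cutoff)."""
    # Grid strictly inside (0, 1) so that both logarithms are defined.
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    scores = [log_likelihood(t, successes, trials) for t in grid]
    best = max(scores)
    return [t for t, s in zip(grid, scores) if s >= best - drop]


# Before seeing data, theta could be anywhere in (0, 1);
# after observing 7 successes in 10 trials it is restricted to a sub-interval.
zone = high_likelihood_zone(successes=7, trials=10)
print(f"theta restricted to [{zone[0]:.3f}, {zone[-1]:.3f}]")
```

In this reading, an after-the-fact explanation plays the role of the observed data: it does not forecast anything new, but it shrinks the set of theories that remain compatible with what happened.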


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('explanation', 0.5), ('argo', 0.327), ('rojas', 0.259), ('prediction', 0.223), ('explanations', 0.207), ('basb', 0.176), ('fabio', 0.173), ('win', 0.146), ('films', 0.143), ('lincoln', 0.101), ('beats', 0.093), ('statistical', 0.089), ('actors', 0.088), ('saving', 0.088), ('social', 0.083), ('acting', 0.08), ('qualitative', 0.079), ('theory', 0.079), ('easily', 0.078), ('everyday', 0.078), ('sociologist', 0.074), ('gon', 0.069), ('helps', 0.068), ('ll', 0.066), ('na', 0.064), ('event', 0.063), ('framework', 0.059), ('meehl', 0.058), ('oscars', 0.058), ('refining', 0.058), ('skeptically', 0.058), ('views', 0.058), ('explained', 0.057), ('predictions', 0.056), ('crafted', 0.055), ('shakespeare', 0.055), ('inception', 0.055), ('commonsense', 0.055), ('predict', 0.054), ('checking', 0.054), ('learned', 0.053), ('role', 0.053), ('thinking', 0.052), ('nominated', 0.052), ('kings', 0.052), ('freudianism', 0.052), ('freudian', 0.052), ('constraining', 0.052), ('understand', 0.051), ('swan', 0.05)]
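
The page does not say how the word weights above were computed. The following is a minimal sketch of one common tf-idf variant (raw term count times a smoothed inverse document frequency, then L2-normalized per document); the helper names `tfidf_weights` and `top_words`, the smoothing formula, and the toy corpus are all assumptions made for illustration. Entries such as 'basb' and 'll' in the list suggest the original tokenizer split "Basbøll" at the non-ASCII letter, but that is a guess; the sketch simply splits on whitespace.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus_tokens):
    """Map each word in doc_tokens to an L2-normalized tf-idf weight."""
    n_docs = len(corpus_tokens)
    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))
    counts = Counter(doc_tokens)
    # Smoothed idf (an assumed variant): words found in every document get weight 0.
    raw = {
        word: count * math.log((1 + n_docs) / (1 + doc_freq[word]))
        for word, count in counts.items()
    }
    norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
    return {word: w / norm for word, w in raw.items()}


def top_words(weights, n=5):
    """Return the n highest-weighted (word, weight) pairs."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy corpus, made up for this example.
corpus = [
    "explanation is not prediction but explanation is not nothing".split(),
    "prediction maps into a statistical framework".split(),
    "a statistical model is a default".split(),
]
weights = tfidf_weights(corpus[0], corpus)
print(top_words(weights, n=3))
```

Words that are frequent in this post but rare elsewhere in the corpus (here "explanation") end up with the largest weights, which matches the shape of the list above.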

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 1742 andrew gelman stats-2013-02-27-What is “explanation”?


2 0.23670626 1282 andrew gelman stats-2012-04-26-Bad news about (some) statisticians

Introduction: Sociologist Fabio Rojas reports on “a conversation I [Rojas] have had a few times with statisticians”: Rojas: “What does your research tell us about a sample of, say, a few hundred cases?” Statistician: “That’s not important. My result works as n → ∞.” Rojas: “Sure, that’s a fine mathematical result, but I have to estimate the model with, like, totally finite data. I need inference, not limits. Maybe the estimate doesn’t work out so well for small n.” Statistician: “Sure, but if you have a few million cases, it’ll work in the limit.” Rojas: “Whoa. Have you ever collected, like, real world network data? A million cases is hard to get.” The conversation continues in this frustrating vein. Rojas writes: This illustrates a fundamental issue in statistics (and other sciences). Once you formalize a model and work mathematically, you are tempted to focus on what is mathematically interesting instead of the underlying problem motivating the science. . . . We have the sam

3 0.17020857 2142 andrew gelman stats-2013-12-21-Chasing the noise

Introduction: Fabio Rojas writes: After reading the Fowler/Christakis paper on networks and obesity, a student asked why it was that friends had a stronger influence on spouses. In other words, if we believe the F&C paper, they report that your friends (57%) are more likely to transmit obesity than your spouse (37%) (see page 370). This might be interpreted in two ways. First, it might be seen as a counter argument. This might really indicate that homophily is at work. We probably select spouses for some traits that are not self-similar. While we choose friends mainly on self-similarity of leisure and consumption (e.g., diet and exercise). Second, there might be an explanation based on transmission. We choose friends because we want them to influence us, while spouses are (supposed?) to accept us. Your thoughts? My thought: No. No no no no no. No no no. No. From the linked paper: A person’s chances of becoming obese increased by 57% (95% confidence interval [CI], 6 to 123) if h

4 0.13546561 1269 andrew gelman stats-2012-04-19-Believe your models (up to the point that you abandon them)

Introduction: In a discussion of his variant of the write-a-thousand-words-a-day strategy (as he puts it, “a system for the production of academic results in writing”), Thomas Basbøll writes: Believe the claims you are making. That is, confine yourself to making claims you believe. I always emphasize this when I [Basbøll] define knowledge as “justified, true belief”. . . . I think if there is one sure way to undermine your sense of your own genius it is to begin to say things you know to be publishable without being sure they are true. Or even things you know to be “true” but don’t understand well enough to believe. He points out that this is not so easy: In times when there are strong orthodoxies it can sometimes be difficult to know what to believe. Or, rather, it is all too easy to know what to believe (what the “right belief” is). It is therefore difficult to stick to statements of one’s own belief. I sometimes worry that our universities, which are systems of formal education and for

5 0.13503224 1564 andrew gelman stats-2012-11-06-Choose your default, or your default will choose you (election forecasting edition)

Introduction: Statistics is the science of defaults. One of the differences between statistics and other branches of engineering is that we have a special love for default procedures, perhaps because so many statistical problems are routine (or, at least, people would like them to be). We have standard estimates for all sorts of models, books of statistical tests, and default settings for everything. Recently I’ve been working on default weakly informative priors (which are not the same as the typically noninformative “reference priors” of the Bayesian literature). From a Bayesian point of view, the appropriate default procedure could be defined as that which is appropriate for the population of problems that one might be studying. More generally, much of our job as statisticians is to come up with methods that will be used by others in routine practice. (Much of the rest of our job is to come up with methods for evaluating new and existing statistical methods, and methods for coming up wi

6 0.12439866 1844 andrew gelman stats-2013-05-06-Against optimism about social science

7 0.1193516 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

8 0.11086328 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

9 0.10969777 2284 andrew gelman stats-2014-04-07-How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll on stories.

10 0.10793898 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum

11 0.10500629 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others

12 0.09960676 719 andrew gelman stats-2011-05-19-Everything is Obvious (once you know the answer)

13 0.097631887 901 andrew gelman stats-2011-09-12-Some thoughts on academic cheating, inspired by Frey, Wegman, Fischer, Hauser, Stapel

14 0.096283771 1351 andrew gelman stats-2012-05-29-A Ph.D. thesis is not really a marathon

15 0.095693506 1823 andrew gelman stats-2013-04-24-The Tweets-Votes Curve

16 0.095616683 2327 andrew gelman stats-2014-05-09-Nicholas Wade and the paradox of racism

17 0.095426157 1874 andrew gelman stats-2013-05-28-Nostalgia

18 0.095331319 563 andrew gelman stats-2011-02-07-Evaluating predictions of political events

19 0.094469242 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over

20 0.094158143 2245 andrew gelman stats-2014-03-12-More on publishing in journals
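
The page does not state how simValue is computed. A value of 1.0000002 for the post compared with itself would be consistent with cosine similarity under floating-point rounding, so the sketch below assumes cosine similarity over sparse {word: weight} vectors. The weights for post 1742 are the top four entries from the tfidf list above; the other two vectors and the helper names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query_id, vectors, top_n=20):
    """Return up to top_n (similarity, post_id) pairs, most similar first."""
    query = vectors[query_id]
    scored = [(cosine_similarity(query, vec), post_id) for post_id, vec in vectors.items()]
    return sorted(scored, reverse=True)[:top_n]


vectors = {
    # Top four weights for this post, copied from the tfidf list above.
    1742: {"explanation": 0.5, "argo": 0.327, "rojas": 0.259, "prediction": 0.223},
    # Hypothetical vectors for two other posts (weights invented for the example).
    1282: {"rojas": 0.6, "statistician": 0.4},
    563: {"prediction": 0.5, "forecast": 0.5},
}

for similarity, post_id in rank_similar(1742, vectors):
    print(post_id, round(similarity, 3))
```

As in the list above, the query post ranks first with similarity close to 1, and the remaining posts are ordered by how much weighted vocabulary they share with it.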


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.192), (1, -0.003), (2, -0.023), (3, 0.044), (4, -0.038), (5, -0.011), (6, -0.02), (7, -0.003), (8, 0.069), (9, -0.004), (10, -0.01), (11, 0.031), (12, -0.015), (13, -0.02), (14, -0.102), (15, -0.022), (16, 0.02), (17, -0.039), (18, 0.019), (19, 0.019), (20, -0.022), (21, -0.034), (22, -0.036), (23, 0.039), (24, -0.005), (25, 0.033), (26, 0.032), (27, 0.036), (28, -0.025), (29, 0.01), (30, 0.016), (31, 0.025), (32, -0.033), (33, -0.003), (34, 0.089), (35, 0.005), (36, -0.022), (37, -0.005), (38, -0.008), (39, -0.01), (40, -0.002), (41, -0.013), (42, 0.002), (43, -0.04), (44, -0.005), (45, -0.012), (46, -0.045), (47, -0.02), (48, -0.03), (49, -0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96402317 1742 andrew gelman stats-2013-02-27-What is “explanation”?


2 0.81547719 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”

Introduction: As a statistician I am particularly worried about the rhetorical power of anecdotes (even though I use them in my own reasoning; see discussion below). But much can be learned from a true anecdote. The rough edges—the places where the anecdote doesn’t fit your thesis—these are where you learn. We have recently had a discussion (here and here) of Karl Weick, a prominent scholar of business management who plagiarized a story and then went on to draw different lessons from the pilfered anecdote in several different publications published over many years. Setting aside any issues of plagiarism and rulebreaking, I argue that, by hiding the source of the story and changing its form, Weick and his management-science audience are losing their ability to get anything out of it beyond empty confirmation. A full discussion follows. 1. The lost Hungarian soldiers Thomas Basbøll (who has the unusual (to me) job of “writing consultant” at the Copenhagen Business School) has been

3 0.79476225 2284 andrew gelman stats-2014-04-07-How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll on stories.

Introduction: In “Story: A Definition,” visual analysis researcher Robert Kosara writes: A story ties facts together. There is a reason why this particular collection of facts is in this story, and the story gives you that reason. provides a narrative path through those facts. In other words, it guides the viewer/reader through the world, rather than just throwing them in there. presents a particular interpretation of those facts. A story is always a particular path through a world, so it favors one way of seeing things over all others. The relevance of these ideas to statistical graphics is apparent. From a completely different direction, in “When do stories work? Evidence and illustration in the social sciences,” Thomas Basbøll and I write: Storytelling has long been recognized as central to human cognition and communication. Here we explore a more active role of stories in social science research, not merely to illustrate concepts but also to develop new ideas and evalu

4 0.79155964 1269 andrew gelman stats-2012-04-19-Believe your models (up to the point that you abandon them)

Introduction: In a discussion of his variant of the write-a-thousand-words-a-day strategy (as he puts it, “a system for the production of academic results in writing”), Thomas Basbøll writes: Believe the claims you are making. That is, confine yourself to making claims you believe. I always emphasize this when I [Basbøll] define knowledge as “justified, true belief”. . . . I think if there is one sure way to undermine your sense of your own genius it is to begin to say things you know to be publishable without being sure they are true. Or even things you know to be “true” but don’t understand well enough to believe. He points out that this is not so easy: In times when there are strong orthodoxies it can sometimes be difficult to know what to believe. Or, rather, it is all too easy to know what to believe (what the “right belief” is). It is therefore difficult to stick to statements of one’s own belief. I sometimes worry that our universities, which are systems of formal education and for

5 0.78585923 471 andrew gelman stats-2010-12-17-Attractive models (and data) wanted for statistical art show.

Introduction: I have agreed to do a local art exhibition in February. An excuse to think about form, colour and style for plotting almost individual observation likelihoods – while invoking the artist’s privilege of refusing to give interpretations of their own work. In order to make it possibly less dry I’ll try to use intuitive suggestive captions like in this example TheTyranyof13.pdf thereby side stepping the technical discussions like here RadfordNealBlog Suggested models and data sets (or even submissions) would be most appreciated. I will likely be sticking to realism i.e. plots that represent ‘statistical reality’ faithfully. K?

6 0.7829591 1812 andrew gelman stats-2013-04-19-Chomsky chomsky chomsky chomsky furiously

7 0.74458581 1408 andrew gelman stats-2012-07-07-Not much difference between communicating to self and communicating to others

8 0.74012834 1750 andrew gelman stats-2013-03-05-Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

9 0.73660076 171 andrew gelman stats-2010-07-30-Silly baseball example illustrates a couple of key ideas they don’t usually teach you in statistics class

10 0.73293668 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

11 0.72708994 1292 andrew gelman stats-2012-05-01-Colorless green facts asserted resolutely

12 0.7244767 757 andrew gelman stats-2011-06-10-Controversy over the Christakis-Fowler findings on the contagion of obesity

13 0.72031933 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

14 0.71927035 1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.

15 0.71082503 391 andrew gelman stats-2010-11-03-Some thoughts on election forecasting

16 0.71044403 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation

17 0.70610064 1282 andrew gelman stats-2012-04-26-Bad news about (some) statisticians

18 0.7052905 1351 andrew gelman stats-2012-05-29-A Ph.D. thesis is not really a marathon

19 0.70494306 49 andrew gelman stats-2010-05-24-Blogging

20 0.70147359 1319 andrew gelman stats-2012-05-14-I hate to get all Gerd Gigerenzer on you here, but . . .


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.013), (12, 0.041), (15, 0.04), (16, 0.049), (21, 0.054), (22, 0.017), (24, 0.13), (29, 0.098), (42, 0.013), (45, 0.049), (76, 0.023), (80, 0.012), (86, 0.043), (95, 0.012), (99, 0.269)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95944107 1940 andrew gelman stats-2013-07-16-A poll that throws away data???

Introduction: Mark Blumenthal writes: What do you think about the “random rejection” method used by PPP that was attacked at some length today by a Republican pollster. Our just published post on the debate includes all the details as I know them. The Storify of Martino’s tweets has some additional data tables linked to toward the end. Also, more specifically, setting aside Martino’s suggestion of manipulation (which is also quite possible with post-stratification weights), would the PPP method introduce more potential random error than weighting? From Blumenthal’s blog: B.J. Martino, a senior vice president at the Republican polling firm The Tarrance Group, went on a 30-minute Twitter rant on Tuesday questioning the unorthodox method used by PPP [Public Policy Polling] to select samples and weight data: “Looking at @ppppolls new VA SW. Wondering how many interviews they discarded to get down to 601 completes? Because @ppppolls discards a LOT of interviews. Of 64,811 conducted

2 0.95552474 1539 andrew gelman stats-2012-10-18-IRB nightmares

Introduction: Andrew Perrin nails it: Twice a year, like clockwork, the ethics cops at the IRB [institutional review board, the group on campus that has to approve research involving human subjects] take a break from deciding whether or not radioactive isotopes can be administered to prison populations to cure restless-leg syndrome to dream up some fancy new way in which participating in an automated telephone poll might cause harm. Perrin adds: The list of exemptions to IRB review is too short and, more importantly, contains no guiding principle as to what makes exempt. . . . [and] Even exemptions require approval by the IRB. He also voices a thought I’ve had many times, which is that there are all sorts of things you or I or anyone else can do on the street (for example, go up to people and ask them personal questions, drop objects and see if people pick them up, stage fights with our friends to see the reactions of bystanders, etc etc etc) but for which we have to go through an IRB

3 0.95377493 651 andrew gelman stats-2011-04-06-My talk at Northwestern University tomorrow (Thursday)

Introduction: Of Beauty, Sex, and Power: Statistical Challenges in Estimating Small Effects. At the Institute of Policy Research, Thurs 7 Apr 2011, 3.30pm. Regular blog readers know all about this topic. (Here are the slides.) But, rest assured, I don’t just mock. I also offer constructive suggestions. My last talk at Northwestern was fifteen years ago. Actually, I gave two lectures then, in the process of being turned down for a job enjoying their chilly Midwestern hospitality. P.S. I searched on the web and also found this announcement which gives the wrong title.

4 0.95304275 1687 andrew gelman stats-2013-01-21-Workshop on science communication for graduate students

Introduction: Nathan Sanders writes: Applications are now open for the Communicating Science 2013 workshop (http://workshop.astrobites.com/), to be held in Cambridge, MA on June 13-15th, 2013. Graduate students at US institutions in all fields of science and engineering are encouraged to apply – funding is available for travel expenses and accommodations. The application can be found here: http://workshop.astrobites.org/application Participants will build the communication skills that technical professionals need to express complex ideas to their peers, experts in other fields, and the general public. There will be panel discussions on the following topics: * Engaging Non-Scientific Audiences * Science Writing for a Cause * Communicating Science Through Fiction * Sharing Science with Scientists * The World of Non-Academic Publishing * Communicating using Multimedia and the Web In addition to these discussions, ample time is allotted for interacting with the experts and with att

5 0.95291984 1491 andrew gelman stats-2012-09-10-Update on Levitt paper on child car seats

Introduction: A few years ago I noted the following quote from applied microeconomist Steven Levitt: Is it surprising that scientists would try to keep work that disagrees with their findings out of journals? When I told my father that I [Levitt] was sending my work saying car seats are not that effective to medical journals, he laughed and said they would never publish it because of the result, no matter how well done the analysis was. (As is so often the case, he was right, and I eventually published it in an economics journal.) Within the field of economics, academics work behind the scenes constantly trying to undermine each other. I’ve seen economists do far worse things than pulling tricks in figures. When economists get mixed up in public policy, things get messier. At the time, I expressed dismay about Levitt’s air of (as I read it) amused, world-weary tolerance of scientists behaving against the interest of science. But I took his story about the car seats at face value. But no

same-blog 6 0.95136982 1742 andrew gelman stats-2013-02-27-What is “explanation”?

7 0.94595098 1392 andrew gelman stats-2012-06-26-Occam

8 0.94318372 2057 andrew gelman stats-2013-10-10-Chris Chabris is irritated by Malcolm Gladwell

9 0.94040817 1024 andrew gelman stats-2011-11-23-Of hypothesis tests and Unitarians

10 0.93878186 2051 andrew gelman stats-2013-10-04-Scientific communication that accords you “the basic human dignity of allowing you to draw your own conclusions”

11 0.9368878 1344 andrew gelman stats-2012-05-25-Question 15 of my final exam for Design and Analysis of Sample Surveys

12 0.93587941 868 andrew gelman stats-2011-08-24-Blogs vs. real journalism

13 0.93416512 2133 andrew gelman stats-2013-12-13-Flexibility is good

14 0.93155432 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

15 0.9300195 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?

16 0.92765158 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

17 0.92657018 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

18 0.92654991 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

19 0.92649621 1915 andrew gelman stats-2013-06-27-Huh?

20 0.92619282 811 andrew gelman stats-2011-07-20-Kind of Bayesian