andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-552 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Emanuel Derman and Paul Wilmott wonder how to get their fellow modelers to give up their fantasy of perfection. In a Business Week article they proposed, not entirely in jest, a model makers’ Hippocratic Oath: I will remember that I didn’t make the world and that it doesn’t satisfy my equations. Though I will use models boldly to estimate value, I will not be overly impressed by mathematics. I will never sacrifice reality for elegance without explaining why I have done so. Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights. I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension. Found via Abductive Intelligence.
sentIndex sentText sentNum sentScore
1 Emanuel Derman and Paul Wilmott wonder how to get their fellow modelers to give up their fantasy of perfection. [sent-1, score-0.805]
2 In a Business Week article they proposed, not entirely in jest, a model makers’ Hippocratic Oath: I will remember that I didn’t make the world and that it doesn’t satisfy my equations. [sent-2, score-0.642]
3 Though I will use models boldly to estimate value, I will not be overly impressed by mathematics. [sent-3, score-0.722]
4 I will never sacrifice reality for elegance without explaining why I have done so. [sent-4, score-0.923]
5 Nor will I give the people who use my model false comfort about its accuracy. [sent-5, score-0.617]
6 Instead, I will make explicit its assumptions and oversights. [sent-6, score-0.348]
7 I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension. [sent-7, score-0.583]
wordName wordTfidf (topN-words)
[('elegance', 0.241), ('abductive', 0.241), ('derman', 0.241), ('hippocratic', 0.241), ('boldly', 0.227), ('fantasy', 0.227), ('sacrifice', 0.217), ('modelers', 0.204), ('makers', 0.198), ('comfort', 0.19), ('overly', 0.18), ('satisfy', 0.175), ('enormous', 0.166), ('explicit', 0.164), ('fellow', 0.156), ('intelligence', 0.156), ('impressed', 0.147), ('explaining', 0.134), ('proposed', 0.131), ('reality', 0.129), ('give', 0.128), ('entirely', 0.121), ('economy', 0.12), ('society', 0.118), ('paul', 0.116), ('false', 0.111), ('week', 0.108), ('via', 0.106), ('assumptions', 0.105), ('business', 0.102), ('remember', 0.099), ('model', 0.095), ('use', 0.093), ('wonder', 0.09), ('beyond', 0.089), ('value', 0.088), ('make', 0.079), ('instead', 0.078), ('estimate', 0.075), ('effects', 0.075), ('done', 0.073), ('world', 0.073), ('understand', 0.071), ('found', 0.066), ('never', 0.066), ('didn', 0.065), ('may', 0.064), ('without', 0.063), ('though', 0.062), ('doesn', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 552 andrew gelman stats-2011-02-03-Model Makers’ Hippocratic Oath
2 0.16937192 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath
Introduction: Yesterday Aleks posted a proposal for a model makers’ Hippocratic Oath. I’d like to add two more items: 1. From Mark Palko: “Our model only describes the data we used to build it; if you go outside of that range, you do so at your own risk.” 2. In case you like to think of your methods as nonparametric or non-model-based: “Our method, just like any model, relies on assumptions which we have the duty to state and to check.” (Observant readers will see that I use “we” rather than “I” in these two items. Modeling is an inherently collaborative endeavor.)
3 0.093497038 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models
Introduction: Interesting thoughts from Kaiser Fung. Derman seems to have a point in his criticisms of economic models—and things are just as bad in other social sciences. (I’ve criticized economists and political scientists for taking a crude, 80-year-old model of psychology as “foundational,” but even more sophisticated models in psychology and sociology have a lot of holes, if you go outside of certain clearly bounded areas such as psychometrics.) What can be done, then? One approach, which appeals to me as a statistician, is to more carefully define one’s range of inquiry. Even if we don’t have a great model of political bargaining, we can still use ideal-point models to capture a lot of the variation in legislative voting. And, in my blog post linked to above, I recommended that economists forget about coming up with the grand unified theory of human behavior (pretty impossible, given that they still don’t want to let go of much of their folk-psychology models) and put more effort i
Introduction: This material should be familiar to many of you but could be helpful to newcomers. Pearl writes: ALL causal conclusions in nonexperimental settings must be based on untested, judgmental assumptions that investigators are prepared to defend on scientific grounds. . . . To understand what the world should be like for a given procedure to work is of no lesser scientific value than seeking evidence for how the world works . . . Assumptions are self-destructive in their honesty. The more explicit the assumption, the more criticism it invites . . . causal diagrams invite the harshest criticism because they make assumptions more explicit and more transparent than other representation schemes. As regular readers know (for example, search this blog for “Pearl”), I have not got much out of the causal-diagrams approach myself, but in general I think that when there are multiple, mathematically equivalent methods of getting the same answer, we tend to go with the framework we are used
Introduction: Elias Bareinboim asked what I thought about his comment on selection bias in which he referred to a paper by himself and Judea Pearl, “Controlling Selection Bias in Causal Inference.” I replied that I have no problem with what he wrote, but that from my perspective I find it easier to conceptualize such problems in terms of multilevel models. I elaborated on that point in a recent post, “Hierarchical modeling as a framework for extrapolation,” which I think was read by only a few people (I say this because it received only two comments). I don’t think Bareinboim objected to anything I wrote, but like me he is comfortable working within his own framework. He wrote the following to me: In some sense, “not ad hoc” could mean logically consistent. In other words, if one agrees with the assumptions encoded in the model, one must also agree with the conclusions entailed by these assumptions. I am not aware of any other way of doing mathematics. As it turns out, to get causa
6 0.068519622 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +/- 2
7 0.068481922 1431 andrew gelman stats-2012-07-27-Overfitting
8 0.067817003 2093 andrew gelman stats-2013-11-07-I’m negative on the expression “false positives”
9 0.065764077 2039 andrew gelman stats-2013-09-25-Harmonic convergence
10 0.064997658 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
11 0.062539987 1392 andrew gelman stats-2012-06-26-Occam
12 0.062348764 472 andrew gelman stats-2010-12-17-So-called fixed and random effects
14 0.060154226 1004 andrew gelman stats-2011-11-11-Kaiser Fung on how not to critique models
15 0.059327781 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
16 0.058163811 1581 andrew gelman stats-2012-11-17-Horrible but harmless?
17 0.057921555 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?
19 0.056910679 780 andrew gelman stats-2011-06-27-Bridges between deterministic and probabilistic models for binary data
20 0.05676093 394 andrew gelman stats-2010-11-05-2010: What happened?
topicId topicWeight
[(0, 0.118), (1, 0.03), (2, 0.017), (3, 0.003), (4, -0.002), (5, 0.007), (6, 0.011), (7, -0.011), (8, 0.068), (9, 0.022), (10, -0.02), (11, 0.015), (12, -0.021), (13, -0.003), (14, -0.057), (15, -0.006), (16, 0.004), (17, -0.026), (18, -0.022), (19, 0.016), (20, -0.01), (21, -0.046), (22, 0.006), (23, -0.019), (24, -0.032), (25, 0.007), (26, -0.035), (27, 0.012), (28, -0.01), (29, 0.014), (30, -0.032), (31, 0.017), (32, -0.015), (33, -0.01), (34, 0.001), (35, -0.035), (36, -0.008), (37, 0.005), (38, 0.038), (39, -0.015), (40, 0.0), (41, 0.015), (42, 0.004), (43, 0.019), (44, 0.018), (45, -0.011), (46, -0.011), (47, 0.008), (48, 0.001), (49, -0.013)]
simIndex simValue blogId blogTitle
same-blog 1 0.96694529 552 andrew gelman stats-2011-02-03-Model Makers’ Hippocratic Oath
2 0.78490841 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)
Introduction: I think cross-validation is a good way to estimate a model’s forecasting error but I don’t think it’s always such a great tool for comparing models. I mean, sure, if the differences are dramatic, ok. But you can easily have a few candidate models, and one model makes a lot more sense than the others (even from a purely predictive sense, I’m not talking about causality here). The difference between the models doesn’t show up in a xval measure of total error but in the patterns of the predictions. For a simple example, imagine using a linear model with positive slope to model a function that is constrained to be increasing. If the constraint isn’t in the model, the predicted/imputed series will sometimes be nonmonotonic. The effect on the prediction error can be so tiny as to be undetectable (or it might even increase avg prediction error to include the constraint); nonetheless, the predictions will be clearly nonsensical. That’s an extreme example but I think the general point h
3 0.77581 1392 andrew gelman stats-2012-06-26-Occam
Introduction: Cosma Shalizi and Larry Wasserman discuss some papers from a conference on Ockham’s Razor. I don’t have anything new to add on this so let me link to past blog entries on the topic and repost the following from 2004: A lot has been written in statistics about “parsimony”—that is, the desire to explain phenomena using fewer parameters—but I’ve never seen any good general justification for parsimony. (I don’t count “Occam’s Razor,” or “Ockham’s Razor,” or whatever, as a justification. You gotta do better than digging up a 700-year-old quote.) Maybe it’s because I work in social science, but my feeling is: if you can approximate reality with just a few parameters, fine. If you can use more parameters to fold in more information, that’s even better. In practice, I often use simple models—because they are less effort to fit and, especially, to understand. But I don’t kid myself that they’re better than more complicated efforts! My favorite quote on this comes from Rad
4 0.77389443 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation
Introduction: Jonathan Livengood writes: I have a couple of questions on your paper with Cosma Shalizi on “Philosophy and the practice of Bayesian statistics.” First, you distinguish between inductive approaches and hypothetico-deductive approaches to inference and locate statistical practice (at least, the practice of model building and checking) on the hypothetico-deductive side. Do you think that there are any interesting elements of statistical practice that are properly inductive? For example, suppose someone is playing around with a system that more or less resembles a toy model, like drawing balls from an urn or some such, and where the person has some well-defined priors. The person makes a number of draws from the urn and applies Bayes theorem to get a posterior. On your view, is that person making an induction? If so, how much space is there in statistical practice for genuine inductions like this? Second, I agree with you that one ought to distinguish induction from other kind
5 0.76946259 819 andrew gelman stats-2011-07-24-Don’t idealize “risk aversion”
Introduction: Richard Thaler writes (click here and search on Thaler): Both risk and risk aversion are concepts that were once well defined, but are now in danger of becoming Aetherized [this is Thaler's term for adding free parameters to a model to make it work, thus destroying the purity and much of the value of the original model]. Stocks that earn surprisingly high returns are labeled as risky, because in the theory, excess returns must be accompanied by higher risk. If, inconveniently, the traditional measures of risk such as variance or covariance with the market are not high, then the Aetherists tell us there must be some other risk; we just don’t know what it is. Similarly, traditionally the concept of risk aversion was taken to be a primitive; each person had a parameter, gamma, that measured her degree of risk aversion. Now risk aversion is allowed to be time varying, and Aetherists can say with a straight face that the market crashes of 2001 and 2008 were caused by sudden increases
6 0.76748455 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?
7 0.74641979 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model
8 0.74633068 823 andrew gelman stats-2011-07-26-Including interactions or not
9 0.7364673 72 andrew gelman stats-2010-06-07-Valencia: Summer of 1991
10 0.72555208 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency
11 0.72545648 217 andrew gelman stats-2010-08-19-The “either-or” fallacy of believing in discrete models: an example of folk statistics
12 0.72522384 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar
13 0.72265792 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc
14 0.71252418 1841 andrew gelman stats-2013-05-04-The Folk Theorem of Statistical Computing
15 0.71149158 1468 andrew gelman stats-2012-08-24-Multilevel modeling and instrumental variables
16 0.71134734 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences
17 0.7100786 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging
19 0.70573932 1200 andrew gelman stats-2012-03-06-Some economists are skeptical about microfoundations
20 0.70430666 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”
topicId topicWeight
[(16, 0.021), (17, 0.028), (18, 0.02), (24, 0.14), (34, 0.022), (43, 0.022), (81, 0.371), (86, 0.016), (99, 0.242)]
simIndex simValue blogId blogTitle
Introduction: I just want to share with you the best comment we’ve ever had in the nearly ten-year history of this blog. Also it has statistical content! Here’s the story. After seeing an amusing article by Tom Scocca relating how reporter Jon Lee Anderson called someone a “little twerp” on Twitter, I conjectured that Anderson suffered from “tall person syndrome,” that problem that some people of above-average height have, that they think they’re more important than other people because they literally look down on them. But I had no idea of Anderson’s actual height. Commenter Gary responded with this impressive bit of investigative reporting: Based on this picture: he appears to be fairly tall. But the perspective makes it hard to judge. Based on this picture: he appears to be about 9-10 inches taller than Catalina Garcia. But how tall is Catalina Garcia? Not that tall – she’s shorter than the high-wire artist Philippe Petit: And he doesn’t appear
2 0.92788196 915 andrew gelman stats-2011-09-17-(Worst) graph of the year
Introduction: This (forwarded to me from Jeff, from a powerpoint by William Gawthrop) wins not on form but on content: Really this graph should stand alone but it’s so wonderful that I can’t resist pointing out a few things: - The gap between 610 and 622 A.D. seems to be about the same as the previous 600 years, and only a little less than the 1400 years before that. - “Pious and devout” Jews are portrayed as having steadily increased in nonviolence up to the present day. Been to Israel lately? - I assume the line labeled “Bible” is referring to Christians? I’m sort of amazed to see pious and devout Christians listed as being maximally violent at the beginning. Huh? I thought Christ was supposed to be a nonviolent, mellow dude. The line starts at 3 B.C., implying that baby Jesus was at the extreme of violence. Going forward, we can learn from the graph that pious and devout Christians in 1492 or 1618, say, were much more peaceful than Jesus and his crew. - Most amusingly g
3 0.91610956 2250 andrew gelman stats-2014-03-16-“I have no idea who Catalina Garcia is, but she makes a decent ruler”
Introduction: Best blog comment ever, following up on our post, How tall is Jon Lee Anderson?: Based on this picture: http://farm3.static.flickr.com/2235/1640569735_05337bb974.jpg he appears to be fairly tall. But the perspective makes it hard to judge. Based on this picture: http://www.catalinagarcia.com/cata/Libraries/BLOG_Images/Cata_w_Jon_Lee_Anderson.sflb.ashx he appears to be about 9-10 inches taller than Catalina Garcia. But how tall is Catalina Garcia? Not that tall – she’s shorter than the high-wire artist Philippe Petit http://www.catalinagarcia.com/cata/Libraries/BLOG_Images/Cata_w_Philippe_Petite.sflb.ashx. And he doesn’t appear to be that tall… about the same height as Claire Danes: http://cdn.theatermania.com/photo-gallery/Petit_Danes_Daldry_2421_4700.jpg – who according to Google is 5′ 6″. So if Jon Lee Anderson is 10″ taller than Catalina Garcia, who is 2″ shorter than Philippe Petit, who is the same height as Claire Danes, then he is 6′ 2″ tall. I have no idea who Catal
Introduction: Elsewhere: 1. They asked me to write about my “favorite election- or campaign-related movie, novel, or TV show” (Salon) 2. The shopping period is over; the time for buying has begun (NYT) 3. If anybody’s gonna be criticizing my tax plan, I want it to be this guy (Monkey Cage) 4. The 4 key qualifications to be a great president; unfortunately George W. Bush satisfies all four, and Ronald Reagan doesn’t match any of them (Monkey Cage) 5. The politics of eyeliner (Monkey Cage)
same-blog 5 0.89041924 552 andrew gelman stats-2011-02-03-Model Makers’ Hippocratic Oath
6 0.87867093 1057 andrew gelman stats-2011-12-14-Hey—I didn’t know that!
7 0.81361318 1632 andrew gelman stats-2012-12-20-Who exactly are those silly academics who aren’t as smart as a Vegas bookie?
10 0.78516769 1033 andrew gelman stats-2011-11-28-Greece to head statistician: Tell the truth, go to jail
11 0.76506591 1222 andrew gelman stats-2012-03-20-5 books book
12 0.76290357 1962 andrew gelman stats-2013-07-30-The Roy causal model?
13 0.7562722 556 andrew gelman stats-2011-02-04-Patterns
14 0.75503302 1705 andrew gelman stats-2013-02-04-Recently in the sister blog
15 0.74732411 858 andrew gelman stats-2011-08-17-Jumping off the edge of the world
16 0.7450968 1321 andrew gelman stats-2012-05-15-A statistical research project: Weeding out the fraudulent citations
17 0.70955038 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?
18 0.70170641 658 andrew gelman stats-2011-04-11-Statistics in high schools: Towards more accessible conceptions of statistical inference
19 0.66254205 2002 andrew gelman stats-2013-08-30-Blogging
20 0.66156781 2096 andrew gelman stats-2013-11-10-Schiminovich is on The Simpsons