andrew_gelman_stats andrew_gelman_stats-2014 andrew_gelman_stats-2014-2279 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: For background, you can start by reading my recent article, Is It Possible to Be an Ethicist Without Being Mean to People? and then a blog post, Quality over Quantity , by John Cook, who writes: At one point [Ed] Tufte spoke more generally and more personally about pursuing quality over quantity. He said most papers are not worth reading and that he learned early on to concentrate on the great papers, maybe one in 500, that are worth reading and rereading rather than trying to “keep up with the literature.” He also explained how over time he has concentrated more on showcasing excellent work than on criticizing bad work. You can see this in the progression from his first book to his latest. (Criticizing bad work is important too, but you’ll have to read his early books to find more of that. He won’t spend as much time talking about it in his course.) That reminded me of Jesse Robbins’ line: “Don’t fight stupid. You are better than that. Make more awesome.” This made me stop an
sentIndex sentText sentNum sentScore
1 For background, you can start by reading my recent article, Is It Possible to Be an Ethicist Without Being Mean to People? [sent-1, score-0.093]
2 and then a blog post, Quality over Quantity, by John Cook, who writes: At one point [Ed] Tufte spoke more generally and more personally about pursuing quality over quantity. [sent-2, score-0.171]
3 He said most papers are not worth reading and that he learned early on to concentrate on the great papers, maybe one in 500, that are worth reading and rereading rather than trying to “keep up with the literature.” [sent-3, score-0.66]
4 He also explained how over time he has concentrated more on showcasing excellent work than on criticizing bad work. [sent-4, score-0.415]
5 You can see this in the progression from his first book to his latest. [sent-5, score-0.084]
6 (Criticizing bad work is important too, but you’ll have to read his early books to find more of that. [sent-6, score-0.34]
7 This made me stop and think, given how much time I spend criticizing things. [sent-11, score-0.357]
8 Indeed, like Tufte I’ve spent a lot of time criticizing chartjunk! [sent-12, score-0.273]
9 I do think, though, that I and others have learned a lot from my criticisms. [sent-13, score-0.089]
10 There’s some way in which good examples, as well as bad examples, can be helpful in developing and understanding general principles. [sent-14, score-0.232]
11 The next phase of my writing on graphics accentuated the negative, with a series of blog posts over several years criticizing various published graphs. [sent-17, score-0.939]
12 This phase peaked with a post of mine from 2009 (with followup here), slamming some popular infographics. [sent-12, score-0.351]
13 Between the initial post and the final appearance of the paper, my thinking changed, and I became much more clear on the idea that graphical displays have different sorts of goals. [sent-21, score-0.472]
14 (Here’s a blog post from 2011 where I explain where I’m coming from on the graphics criticism. [sent-23, score-0.386]
15 See also here for a slightly broader discussion of the difficulties of communication across different research perspectives. [sent-24, score-0.095]
16 In this case, I’m part of an informal “club” of critics (Simonsohn, Francis, Ioannidis, Nosek, etc etc), but, again, it seems that criticism of bad work can be a helpful way of moving forward and thinking harder about how to do good work. [sent-26, score-0.496]
17 In my blog and in my talks, I talk about stuff I like and stuff I don’t like. [sent-28, score-0.085]
18 But in my books, just about all my examples are positive. [sent-29, score-0.164]
19 We have very few negative examples, really none at all that I can think of (except for some of the examples in the “lying with statistics” chapter in the Teaching Statistics book). [sent-30, score-0.289]
20 This suggests that I’m doing something different in my books than in my blogs and lectures. [sent-31, score-0.21]
wordName wordTfidf (topN-words)
[('criticizing', 0.273), ('graphics', 0.177), ('examples', 0.164), ('graphical', 0.153), ('phase', 0.146), ('bad', 0.142), ('tufte', 0.136), ('constructive', 0.136), ('papers', 0.131), ('negative', 0.125), ('post', 0.124), ('king', 0.117), ('books', 0.115), ('gary', 0.114), ('exploratory', 0.11), ('graphs', 0.106), ('eventually', 0.105), ('became', 0.1), ('robbins', 0.096), ('pencil', 0.096), ('cristian', 0.096), ('jesse', 0.096), ('moving', 0.096), ('different', 0.095), ('posts', 0.094), ('reading', 0.093), ('strangers', 0.091), ('ethicist', 0.091), ('helpful', 0.09), ('learned', 0.089), ('sparked', 0.087), ('rereading', 0.087), ('dodhia', 0.087), ('pasarica', 0.087), ('rahul', 0.087), ('quality', 0.086), ('criticism', 0.086), ('blog', 0.085), ('chartjunk', 0.084), ('progression', 0.084), ('concentrate', 0.084), ('spend', 0.084), ('early', 0.083), ('published', 0.083), ('etc', 0.082), ('series', 0.081), ('peaked', 0.081), ('paper', 0.08), ('preach', 0.077), ('nosek', 0.077)]
simIndex simValue blogId blogTitle
same-blog 1 0.9999997 2279 andrew gelman stats-2014-04-02-Am I too negative?
2 0.23071606 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are
Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004)–but I am concerned that our dialogue with the graphic
Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other
Introduction: I had a brief email exchange with Jeff Leek regarding our recent discussions of replication, criticism, and the self-correcting process of science. Jeff writes: (1) I can see the problem with serious, evidence-based criticisms not being published in the same journal (and linked to) studies that are shown to be incorrect. I have been mostly seeing these sorts of things show up in blogs. But I’m not sure that is a bad thing. I think people read blogs more than they read the literature. I wonder if this means that blogs will eventually be a sort of “shadow literature”? (2) I think there is a ton of bad literature out there, just like there is a ton of bad stuff on Google. If we focus too much on the bad stuff we will be paralyzed. I still manage to find good papers despite all the bad papers. (3) I think one positive solution to this problem is to incentivize/publish referee reports and give people credit for a good referee report just like they get credit for a good paper. T
6 0.19809638 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft
7 0.19505087 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
8 0.18422984 2245 andrew gelman stats-2014-03-12-More on publishing in journals
9 0.18195057 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
10 0.17227705 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
11 0.16653585 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup
12 0.16310084 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?
13 0.15844396 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!
14 0.15794191 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
16 0.14704944 1604 andrew gelman stats-2012-12-04-An epithet I can live with
17 0.14498691 546 andrew gelman stats-2011-01-31-Infovis vs. statistical graphics: My talk tomorrow (Tues) 1pm at Columbia
18 0.14392534 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.
20 0.14228398 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical
topicId topicWeight
[(0, 0.295), (1, -0.076), (2, -0.151), (3, 0.036), (4, 0.063), (5, -0.159), (6, -0.1), (7, 0.01), (8, 0.001), (9, -0.011), (10, 0.102), (11, 0.052), (12, -0.019), (13, -0.003), (14, 0.066), (15, -0.061), (16, -0.079), (17, 0.012), (18, -0.005), (19, 0.044), (20, 0.013), (21, -0.037), (22, 0.004), (23, 0.053), (24, -0.002), (25, 0.017), (26, -0.008), (27, 0.024), (28, 0.021), (29, 0.01), (30, -0.057), (31, 0.018), (32, -0.009), (33, 0.007), (34, -0.002), (35, 0.022), (36, 0.04), (37, 0.049), (38, 0.027), (39, -0.02), (40, -0.016), (41, -0.056), (42, -0.058), (43, -0.08), (44, 0.035), (45, -0.006), (46, 0.021), (47, -0.009), (48, -0.017), (49, 0.006)]
simIndex simValue blogId blogTitle
same-blog 1 0.96893495 2279 andrew gelman stats-2014-04-02-Am I too negative?
Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004)–but I am concerned that our dialogue with the graphic
3 0.83623475 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
Introduction: Over the years I’ve written a dozen or so journal articles that have appeared with discussions, and I’ve participated in many published discussions of others’ articles as well. I get a lot out of these article-discussion-rejoinder packages, in all three of my roles as reader, writer, and discussant. Part 1: The story of an unsuccessful discussion The first time I had a discussion article was the result of an unfortunate circumstance. I had a research idea that resulted in an article with Don Rubin on monitoring the mixing of Markov chain simulations. I knew the idea was great, but back then we worked pretty slowly so it was a while before we had a final version to submit to a journal. (In retrospect I wish I’d just submitted the draft version as it was.) In the meantime I presented the paper at a conference. Our idea was very well received (I had a sheet of paper so people could write their names and addresses to get preprints, and we got either 50 or 150 (I can’t remembe
4 0.82286477 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are
Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other
6 0.79674762 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”
7 0.79634088 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
8 0.79596651 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship
9 0.7939623 1775 andrew gelman stats-2013-03-23-In which I disagree with John Maynard Keynes
10 0.77658135 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data
11 0.76692891 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
12 0.76630861 1604 andrew gelman stats-2012-12-04-An epithet I can live with
13 0.76423395 2013 andrew gelman stats-2013-09-08-What we need here is some peer review for statistical graphics
14 0.74550289 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?
15 0.74440622 61 andrew gelman stats-2010-05-31-A data visualization manifesto
16 0.73721415 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
17 0.73672485 319 andrew gelman stats-2010-10-04-“Who owns Congress”
18 0.73630512 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles
19 0.73255819 1308 andrew gelman stats-2012-05-08-chartsnthings !
20 0.72758567 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
topicId topicWeight
[(6, 0.015), (15, 0.038), (16, 0.089), (21, 0.036), (24, 0.137), (29, 0.033), (40, 0.011), (51, 0.035), (63, 0.053), (81, 0.011), (86, 0.039), (95, 0.013), (96, 0.018), (99, 0.361)]
simIndex simValue blogId blogTitle
1 0.98252523 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo
Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2″ to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex
2 0.98205316 421 andrew gelman stats-2010-11-19-Just chaid
Introduction: Reading somebody else’s statistics rant made me realize the inherent contradictions in much of my own statistical advice. Jeff Lax sent along this article by Philip Schrodt, along with the cryptic comment: Perhaps of interest to you. perhaps not. Not meant to be an excuse for you to rant against hypothesis testing again. In his article, Schrodt makes a reasonable and entertaining argument against the overfitting of data and the overuse of linear models. He states that his article is motivated by the quantitative papers he has been sent to review for journals or conferences, and he explicitly excludes “studies of United States voting behavior,” so at least I think Mister P is off the hook. I notice a bit of incoherence in Schrodt’s position–on one hand, he criticizes “kitchen-sink models” for overfitting and he criticizes “using complex methods without understanding the underlying assumptions” . . . but then later on he suggests that political scientists in this countr
same-blog 3 0.98171556 2279 andrew gelman stats-2014-04-02-Am I too negative?
4 0.98077041 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models
Introduction: Robert Bloomfield writes: Most of the people in my field (accounting, which is basically applied economics and finance, leavened with psychology and organizational behavior) use ‘positive research methods’, which are typically described as coming to the data with a predefined theory, and using hypothesis testing to accept or reject the theory’s predictions. But a substantial minority use ‘interpretive research methods’ (sometimes called qualitative methods, for those that call positive research ‘quantitative’). No one seems entirely happy with the definition of this method, but I’ve found it useful to think of it as an attempt to see the world through the eyes of your subjects, much as Jane Goodall lived with gorillas and tried to see the world through their eyes.) Interpretive researchers often criticize positive researchers by noting that the latter don’t make the best use of their data, because they come to the data with a predetermined theory, and only test a narrow set of h
5 0.98057199 2057 andrew gelman stats-2013-10-10-Chris Chabris is irritated by Malcolm Gladwell
Introduction: Christopher Chabris reviewed the new book by Malcolm Gladwell: One thing “David and Goliath” shows is that Mr. Gladwell has not changed his own strategy, despite serious criticism of his prior work. What he presents are mostly just intriguing possibilities and musings about human behavior, but what his publisher sells them as, and what his readers may incorrectly take them for, are lawful, causal rules that explain how the world really works. Mr. Gladwell should acknowledge when he is speculating or working with thin evidentiary soup. Yet far from abandoning his hand or even standing pat, Mr. Gladwell has doubled down. This will surely bring more success to a Goliath of nonfiction writing, but not to his readers. Afterward he blogged some further thoughts about the popular popular science writer. Good stuff. Chabris has a thoughtful explanation of why the “Gladwell is just an entertainer” alibi doesn’t work for him (Chabris). Some of his discussion reminds me of my articl
6 0.97948891 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points
10 0.97808737 719 andrew gelman stats-2011-05-19-Everything is Obvious (once you know the answer)
11 0.97807038 315 andrew gelman stats-2010-10-03-He doesn’t trust the fit . . . r=.999
12 0.97798157 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?
13 0.97776836 690 andrew gelman stats-2011-05-01-Peter Huber’s reflections on data analysis
14 0.97758967 532 andrew gelman stats-2011-01-23-My Wall Street Journal story
15 0.97750384 2007 andrew gelman stats-2013-09-03-Popper and Jaynes
18 0.97740424 966 andrew gelman stats-2011-10-20-A qualified but incomplete thanks to Gregg Easterbrook’s editor at Reuters
20 0.97729158 1529 andrew gelman stats-2012-10-11-Bayesian brains?