andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-798 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I’ve been talking a lot about how different graphical presentations serve different goals, and how we should avoid being so judgmental about graphs. Instead of saying that a particular data visualization is bad, we should think about what goal it serves. That’s all well and good, but sometimes a graph really is bad. Let me draw an analogy to the popular media. Books and videogames serve different goals. I’m a reader and writer of books and have very little interest in videogames, but it would be silly for me to criticize a videogame on the grounds that it’s a bad book (or, for that matter, to criticize a book because it doesn’t yield a satisfying game-playing experience). But . . . there are bad books and there are bad videogames. To restrict our scope to books for a moment: you could argue that Mickey Spillane’s books are terrible or you could argue that, given that they sold tens of millions of copies, they must have had something going for them. I wouldn’t want to charac
sentIndex sentText sentNum sentScore
1 I’ve been talking a lot about how different graphical presentations serve different goals, and how we should avoid being so judgmental about graphs. [sent-1, score-0.474]
2 That’s all well and good, but sometimes a graph really is bad. [sent-3, score-0.135]
3 I’m a reader and writer of books and have very little interest in videogames, but it would be silly for me to criticize a videogame on the grounds that it’s a bad book (or, for that matter, to criticize a book because it doesn’t yield a satisfying game-playing experience). [sent-6, score-0.837]
4 there are bad books and there are bad videogames. [sent-10, score-0.521]
5 To restrict our scope to books for a moment: you could argue that Mickey Spillane’s books are terrible or you could argue that, given that they sold tens of millions of copies, they must have had something going for them. [sent-11, score-0.94]
6 think of all the crappy attempted Spillanes that were produced in the 1950s, all the ones that neither sold well nor had interesting content. [sent-16, score-0.274]
7 McDonald’s is fine but somewhere there’s a greasy spoon whose burgers are so barfable that even the locals don’t go there. [sent-19, score-0.386]
8 (I remember a place we went to in grad school once at 2 in the morning, a disgusting local restaurant that was open from about 11pm-5am every day and was always empty. [sent-20, score-0.181]
9 The rumor was that its sole function was as a mob hangout. [sent-21, score-0.391]
10 Similarly, some of those attempted Spillanes probably had the intended function of selling books but, at that they failed miserably. [sent-23, score-0.504]
11 It seems to us that the message could have been conveyed in a much clearer way. [sent-26, score-0.164]
12 I’ve been blogging long enough that I think faithful readers could easily reel off ten things I don’t like about the graph. [sent-28, score-0.17]
13 ); instead, I just want to make the point that, indeed, this graph could’ve been much better and, no, I don’t see that its ugly clutteredness serves any clear non-statistical goals either. [sent-30, score-0.28]
14 Just as a lot of writing is done by people without good command of the tools of the written language, so are many graphs made by people who can only clumsily handle the tools of graphics. [sent-31, score-0.415]
15 The problem is made worse, I believe, because I don’t think the creators of the graph thought hard about what their goals were. [sent-32, score-0.44]
16 That said, I applaud their decision to make the presentations graphically. [sent-33, score-0.227]
17 I think the best way to read the report is to read the text and then glance at the graphs to see that they provide evidence for the points made in the text. [sent-34, score-0.323]
18 And you’ll have to decide for yourself whether it’s actually bad news (as claimed on page 5 of the report) that “the United States’ and its allies’ share of world military spending . [sent-35, score-0.415]
19 ” I mean, sure, it would be great if other countries didn’t waste any money on expensive tanks, fighter jets, military retirement plans, etc. [sent-39, score-0.268]
20 It’s part of our ongoing exploration of criteria for understanding and evaluating statistical graphics and, by implication, statistical communication more generally. [sent-44, score-0.24]
wordName wordTfidf (topN-words)
[('books', 0.233), ('spillanes', 0.215), ('videogames', 0.215), ('military', 0.17), ('presentations', 0.154), ('goals', 0.145), ('bad', 0.144), ('attempted', 0.142), ('graph', 0.135), ('sold', 0.132), ('serve', 0.129), ('function', 0.129), ('charts', 0.125), ('criticize', 0.116), ('graphical', 0.103), ('report', 0.103), ('spending', 0.101), ('locals', 0.098), ('swartz', 0.098), ('mob', 0.098), ('disgusting', 0.098), ('barfable', 0.098), ('fighter', 0.098), ('greasy', 0.098), ('tanks', 0.098), ('tools', 0.097), ('rumor', 0.092), ('burgers', 0.092), ('graphics', 0.09), ('jets', 0.088), ('judgmental', 0.088), ('argue', 0.086), ('could', 0.085), ('faithful', 0.085), ('mcdonald', 0.085), ('creators', 0.085), ('restaurant', 0.083), ('mickey', 0.083), ('projected', 0.079), ('conveyed', 0.079), ('allies', 0.077), ('book', 0.077), ('statistical', 0.075), ('made', 0.075), ('satisfying', 0.074), ('command', 0.074), ('glance', 0.073), ('applaud', 0.073), ('graphs', 0.072), ('sole', 0.072)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
Introduction: I’ve been talking a lot about how different graphical presentations serve different goals, and how we should avoid being so judgmental about graphs. Instead of saying that a particular data visualization is bad, we should think about what goal it serves. That’s all well and good, but sometimes a graph really is bad. Let me draw an analogy to the popular media. Books and videogames serve different goals. I’m a reader and writer of books and have very little interest in videogames, but it would be silly for me to criticize a videogame on the grounds that it’s a bad book (or, for that matter, to criticize a book because it doesn’t yield a satisfying game-playing experience). But . . . there are bad books and there are bad videogames. To restrict our scope to books for a moment: you could argue that Mickey Spillane’s books are terrible or you could argue that, given that they sold tens of millions of copies, they must have had something going for them. I wouldn’t want to charac
Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other
3 0.17745359 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are
4 0.15794191 2279 andrew gelman stats-2014-04-02-Am I too negative?
Introduction: For background, you can start by reading my recent article, Is It Possible to Be an Ethicist Without Being Mean to People? and then a blog post, Quality over Quantity, by John Cook, who writes: At one point [Ed] Tufte spoke more generally and more personally about pursuing quality over quantity. He said most papers are not worth reading and that he learned early on to concentrate on the great papers, maybe one in 500, that are worth reading and rereading rather than trying to “keep up with the literature.” He also explained how over time he has concentrated more on showcasing excellent work than on criticizing bad work. You can see this in the progression from his first book to his latest. (Criticizing bad work is important too, but you’ll have to read his early books to find more of that. He won’t spend as much time talking about it in his course.) That reminded me of Jesse Robbins’ line: “Don’t fight stupid. You are better than that. Make more awesome.” This made me stop an
5 0.15295929 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at https://www.udacity.com/course/ud651 so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about EDA and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparisons (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what
6 0.15003304 61 andrew gelman stats-2010-05-31-A data visualization manifesto
7 0.14901498 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
8 0.14043471 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
9 0.13538948 499 andrew gelman stats-2011-01-03-5 books
10 0.131652 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft
12 0.13114947 2255 andrew gelman stats-2014-03-19-How Americans vote
14 0.11297131 1604 andrew gelman stats-2012-12-04-An epithet I can live with
15 0.11120924 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again
16 0.1074512 1661 andrew gelman stats-2013-01-08-Software is as software does
17 0.10713934 620 andrew gelman stats-2011-03-19-Online James?
19 0.10648923 8 andrew gelman stats-2010-04-28-Advice to help the rich get richer
20 0.10592926 2013 andrew gelman stats-2013-09-08-What we need here is some peer review for statistical graphics
topicId topicWeight
[(0, 0.23), (1, -0.104), (2, -0.052), (3, 0.102), (4, 0.096), (5, -0.138), (6, -0.034), (7, 0.04), (8, 0.02), (9, 0.003), (10, 0.002), (11, -0.027), (12, -0.011), (13, 0.012), (14, 0.067), (15, -0.039), (16, 0.002), (17, 0.002), (18, 0.04), (19, 0.004), (20, 0.003), (21, -0.022), (22, 0.015), (23, 0.041), (24, -0.003), (25, -0.001), (26, 0.036), (27, 0.015), (28, -0.032), (29, 0.032), (30, -0.084), (31, 0.013), (32, -0.01), (33, 0.015), (34, -0.027), (35, 0.027), (36, 0.011), (37, -0.014), (38, 0.027), (39, -0.01), (40, 0.008), (41, -0.03), (42, -0.018), (43, 0.032), (44, 0.013), (45, 0.003), (46, 0.014), (47, 0.001), (48, -0.028), (49, -0.006)]
simIndex simValue blogId blogTitle
same-blog 1 0.97421998 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
Introduction: I’ve been talking a lot about how different graphical presentations serve different goals, and how we should avoid being so judgmental about graphs. Instead of saying that a particular data visualization is bad, we should think about what goal it serves. That’s all well and good, but sometimes a graph really is bad. Let me draw an analogy to the popular media. Books and videogames serve different goals. I’m a reader and writer of books and have very little interest in videogames, but it would be silly for me to criticize a videogame on the grounds that it’s a bad book (or, for that matter, to criticize a book because it doesn’t yield a satisfying game-playing experience). But . . . there are bad books and there are bad videogames. To restrict our scope to books for a moment: you could argue that Mickey Spillane’s books are terrible or you could argue that, given that they sold tens of millions of copies, they must have had something going for them. I wouldn’t want to charac
Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other
3 0.83849734 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are
4 0.83782709 61 andrew gelman stats-2010-05-31-A data visualization manifesto
Introduction: Details matter (at least, they do for me), but we don’t yet have a systematic way of going back and forth between the structure of a graph, its details, and the underlying questions that motivate our visualizations. (Cleveland, Wilkinson, and others have written a bit on how to formalize these connections, and I’ve thought about it too, but we have a ways to go.) I was thinking about this difficulty after reading an article on graphics by some computer scientists that was well-written but to me lacked a feeling for the linkages between substantive/statistical goals and graphical details. I have problems with these issues too, and my point here is not to criticize but to move the discussion forward. When thinking about visualization, how important are the details? Aleks pointed me to this article by Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky, “A Tour through the Visualization Zoo: A survey of powerful visualization techniques, from the obvious to the obscure.” Th
5 0.82849532 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs
Introduction: Howard Friedman sent me a new book, The Measure of a Nation, subtitled How to Regain America’s Competitive Edge and Boost Our Global Standing. Without commenting on the substance of Friedman’s recommendations, I’d like to endorse his strategy of presentation, which is to display graph after graph after graph showing the same message over and over again, which is that the U.S. is outperformed by various other countries (mostly in Europe) on a variety of measures. These aren’t graphs I would ever make—they are scatterplots in which the x-axis conveys no information. But they have the advantage of repetition: once you figure out how to read one of the graphs, you can read the others easily. Here’s an example which I found from a quick Google: I can’t actually figure out what is happening on the x-axis, nor do I understand the “star, middle child, dog” thing. But I like the use of graphics. Lots more fun than bullet points. Seriously. P.S. Just to be clear: I am not trying
6 0.82364529 319 andrew gelman stats-2010-10-04-“Who owns Congress”
7 0.81203187 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles
8 0.80563241 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year
9 0.80475497 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data
10 0.8001411 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back
11 0.79733217 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
12 0.79698348 296 andrew gelman stats-2010-09-26-A simple semigraphic display
16 0.78306282 1661 andrew gelman stats-2013-01-08-Software is as software does
17 0.78096139 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals
18 0.78078717 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
19 0.78064275 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.
20 0.77242255 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect
topicId topicWeight
[(2, 0.022), (13, 0.012), (15, 0.041), (16, 0.069), (21, 0.024), (24, 0.122), (43, 0.016), (53, 0.028), (61, 0.023), (68, 0.012), (73, 0.171), (76, 0.03), (86, 0.012), (99, 0.308)]
simIndex simValue blogId blogTitle
1 0.97969288 1925 andrew gelman stats-2013-07-04-“Versatile, affordable chicken has grown in popularity”
Introduction: From two years ago: Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before, lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative exp
2 0.97661436 655 andrew gelman stats-2011-04-10-“Versatile, affordable chicken has grown in popularity”
Introduction: Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before, lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative experiment whose results stu
3 0.96391463 794 andrew gelman stats-2011-07-09-The quest for the holy graph
Introduction: Eytan Adar writes: I was just going through the latest draft of your paper with Antony Unwin. I heard part of it at the talk you gave (remotely) here at UMich. I’m curious about your discussion of the Baby Name Voyager. The tool in itself is simple, attractive, and useful. No argument from me there. It’s an awesome demonstration of how subtle interactions can be very helpful (click and it zooms, type and it filters… falls perfectly into the Shneiderman visualization mantra). It satisfies a very common use case: finding appropriate names for children. That said, I can’t help but feel that what you are really excited about is the very static analysis on last letters (you spend most of your time on this). This analysis, incidentally, is not possible to infer from the interactive application (which doesn’t support this type of filtering and pivoting). In a sense, the two visualizations don’t have anything to do with each other (other than a shared context/dataset).
4 0.96298021 497 andrew gelman stats-2011-01-02-Hipmunk update
Introduction: Florence from customer support at Hipmunk writes: Hipmunk now includes American Airlines in our search results. Please note that users will be taken directly to AA.com to complete the booking/transaction. . . . we are steadily increasing the number of flights that we offer on Hipmunk. As you may recall, Hipmunk is a really cool flight-finder that didn’t actually work (as of 16 Sept 2010). At the time, I was a bit annoyed at the NYT columnist who plugged Hipmunk without actually telling his readers that the site didn’t actually do the job. (I discovered the problem myself because I couldn’t believe that my flight options to Raleigh-Durham were really so meager, so I checked on Expedia and found a good flight.) I do think Hipmunk’s graphics are beautiful, though, so I’m rooting for them to catch up. P.S. Apparently they include Amtrak Northeast Corridor trains, so I’ll give them a try, next time I travel. The regular Amtrak website is about as horrible as you’d expect.
5 0.96066213 2238 andrew gelman stats-2014-03-09-Hipmunk worked
Introduction: In the past I’ve categorized Hipmunk as a really cool flight-finder that doesn’t actually work, as worse than Expedia, and as graphics without content. So, I thought it would be only fair to tell you that I bought a flight the other day using Hipmunk and it gave me the same flight as Expedia but at a lower cost (by linking to something called CheapOair, which I hope is legit). So score one for Hipmunk.
6 0.96061426 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?
7 0.95330846 1748 andrew gelman stats-2013-03-04-PyStan!
8 0.94310844 1099 andrew gelman stats-2012-01-05-Approaching harmonic convergence
9 0.94166613 917 andrew gelman stats-2011-09-20-Last post on Hipmunk
10 0.93971574 280 andrew gelman stats-2010-09-16-Meet Hipmunk, a really cool flight-finder that doesn’t actually work
same-blog 11 0.93892205 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
12 0.9361074 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT
13 0.93375528 161 andrew gelman stats-2010-07-24-Differences in color perception by sex, also the Bechdel test for women in movies
14 0.92713016 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?
15 0.92156619 1846 andrew gelman stats-2013-05-07-Like Casper the ghost, Niall Ferguson is not only white. He is also very, very adorable.
16 0.91554296 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?
17 0.91336536 496 andrew gelman stats-2011-01-01-Tukey’s philosophy
18 0.91057563 641 andrew gelman stats-2011-04-01-So many topics, so little time
19 0.90579677 2325 andrew gelman stats-2014-05-07-Stan users meetup next week
20 0.90201479 2327 andrew gelman stats-2014-05-09-Nicholas Wade and the paradox of racism