
37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.


meta info for this blog

Source: html

Introduction: Helen DeWitt links to this blog that reports on a study by Scott Bateman, Carl Gutwin, David McDine, Regan Mandryk, Aaron Genest, and Christopher Brooks that claims the following: Guidelines for designing information charts often state that the presentation should reduce ‘chart junk’–visual embellishments that are not essential to understanding the data. . . . we conducted an experiment that compared embellished charts with plain ones, and measured both interpretation accuracy and long-term recall. We found that people’s accuracy in describing the embellished charts was no worse than for plain charts, and that their recall after a two-to-three-week gap was significantly better. As the above-linked blogger puts it, “chartjunk is more useful than plain graphs. . . . Tufte is not going to like this.” I can’t speak for Ed Tufte, but I’m not gonna take this claim about chartjunk lying down. I have two points to make which I hope can stop the above-linked study from being sla


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 we conducted an experiment that compared embellished charts with plain ones, and measured both interpretation accuracy and long-term recall. [sent-5, score-0.761]

2 We found that people’s accuracy in describing the embellished charts was no worse than for plain charts, and that their recall after a two-to-three-week gap was significantly better. [sent-6, score-0.761]

3 As the above-linked blogger puts it, “chartjunk is more useful than plain graphs. [sent-7, score-0.249]

4 I can’t speak for Ed Tufte, but I’m not gonna take this claim about chartjunk lying down. [sent-12, score-0.679]

5 The non-chart-junk graphs in the paper are not so good. [sent-15, score-0.201]

6 Figure 1 is a time series of dollars that is unhelpfully presented as a bar chart and which is either unadjusted for inflation or, if adjusted, is not indicated as such. [sent-16, score-0.535]

7 Figure 2a is a lineplot whose y-axis should go down to 0, but doesn’t. [sent-17, score-0.075]

8 Both graphs also use the nonstandard strategy of labeling the y-axis on the right rather than the left. [sent-18, score-0.26]

9 Figure 2b is an impossible-to-read pie chart with one of the wedges popping out of the circle. [sent-19, score-0.428]

10 Regular readers of this blog will know what I think of that. [sent-20, score-0.061]

11 Figures 2c and 2d are blurry and have no axis labels. [sent-21, score-0.14]

12 Figure 2d is particularly bad because it’s a time-series graph in which time is presented on the y-axis; it also has the problem with inflation adjustment noted earlier. [sent-22, score-0.345]

13 Figures 4-9, presenting their own findings, are not particularly easy to read either. [sent-23, score-0.069]

14 And maybe they’re right that crappy chartjunk graphs are better than crappy non-chartjunk graphs. [sent-27, score-1.002]

15 But I don’t think it’s appropriate to generalize to the claim that chartjunk graphs are better than good graphs. [sent-28, score-0.819]

16 This brings me to my second point, which is that a huge, huge drawback of chartjunk is that it limits the amount of information you can display in a graph. [sent-30, score-0.87]

17 If all you want is to display a sequence of 5 numbers, then, sure, go for the chartjunk, I don’t really care. [sent-31, score-0.156]

18 But why limit yourself to only displaying 5 numbers? [sent-32, score-0.06]

19 Consider the graphs in Red State, Blue State (or in our other research publications, or on this blog). [sent-33, score-0.201]

20 Sure, you can do pretty instead of plain (see this discussion with examples), but here the graphics design is used to enhance the points made in the graph, not as a distraction. [sent-34, score-0.321]
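Sentences 6 and 12 above object to dollar time series that are not adjusted for inflation (or not labeled as adjusted). As a minimal sketch of what such an adjustment involves, the snippet below rescales nominal amounts into base-year dollars using a price index. The function name `to_real_dollars` and all numbers are hypothetical, chosen only for illustration; they are not taken from the study or the post.

```python
def to_real_dollars(nominal, price_index, base_year):
    """Convert nominal amounts to base_year dollars using a price index keyed by year."""
    base = price_index[base_year]
    return {year: amount * base / price_index[year] for year, amount in nominal.items()}


# Hypothetical values, for illustration only (not real data).
nominal = {2000: 100.0, 2005: 110.0, 2010: 120.0}
price_index = {2000: 80.0, 2005: 90.0, 2010: 100.0}

real = to_real_dollars(nominal, price_index, base_year=2010)
for year in sorted(real):
    print(year, round(real[year], 2))
```

With these made-up numbers the nominal series rises while the adjusted series falls, which is why a graph of dollars over time should say which of the two it shows.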


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('chartjunk', 0.555), ('plain', 0.249), ('charts', 0.232), ('graphs', 0.201), ('chart', 0.184), ('bateman', 0.182), ('embellished', 0.182), ('figure', 0.14), ('tufte', 0.129), ('inflation', 0.129), ('crappy', 0.123), ('figures', 0.098), ('accuracy', 0.098), ('display', 0.092), ('state', 0.091), ('wedges', 0.091), ('drawback', 0.086), ('popping', 0.086), ('regan', 0.086), ('presented', 0.084), ('blurry', 0.082), ('distraction', 0.079), ('huge', 0.078), ('unadjusted', 0.077), ('dewitt', 0.075), ('helen', 0.075), ('lineplot', 0.075), ('carl', 0.072), ('enhance', 0.072), ('junk', 0.07), ('particularly', 0.069), ('pie', 0.067), ('aaron', 0.065), ('designing', 0.065), ('christopher', 0.064), ('sequence', 0.064), ('claim', 0.063), ('numbers', 0.063), ('graph', 0.063), ('guidelines', 0.063), ('adjusted', 0.062), ('indicated', 0.061), ('lying', 0.061), ('blog', 0.061), ('displaying', 0.06), ('routine', 0.059), ('essential', 0.059), ('labeling', 0.059), ('limits', 0.059), ('axis', 0.058)]
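
The page does not say how the word weights above were computed. As a rough sketch only, assuming a standard smoothed tf-idf with L2 normalization (an assumption, not the page's documented pipeline), a top-N word list of this shape could be produced as follows. The function `tfidf_top_words` and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index], L2-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each word appears in.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    tf = Counter(tokenized[doc_index])
    # Smoothed idf, so a word found in every document keeps a positive weight.
    weights = {
        word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1)
        for word, count in tf.items()
    }
    norm = math.sqrt(sum(v * v for v in weights.values()))
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, round(value / norm, 3)) for word, value in ranked[:top_n]]


# Toy corpus, for illustration only.
docs = [
    "chartjunk graphs versus plain graphs",
    "plain graphs of survey data",
    "multilevel models for survey data",
]
top = tfidf_top_words(docs, 0, top_n=3)
print(top)
```

Words that are frequent in one document but rare across the corpus rank highest, which is consistent with 'chartjunk' leading the list above.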

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.


2 0.18164912 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

3 0.16414186 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

4 0.16043074 687 andrew gelman stats-2011-04-29-Zero is zero

Introduction: Nathan Roseberry writes: I thought I had read on your blog that bar charts should always include zero on the scale, but a search of your blog (or google) didn’t return what I was looking for. Is it considered a best practice to always include zero on the axis for bar charts? Has this been written in a book? My reply: The idea is that the area of the bar represents “how many” or “how much.” The bar has to go down to 0 for that to work. You don’t have to have your y-axis go to zero, but if you want the axis to go anywhere else, don’t use a bar graph, use a line graph. Usually line graphs are better anyway. I’m sure this is all in a book somewhere.

5 0.15182179 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

Introduction: Pointing to some horrible graphs, Kaiser writes, “The Earth Institute needs a graphics adviser.” I agree. The graphs are corporate standard, neither pretty nor innovative enough to qualify as infographics, not informational enough to be good statistical data displays. Some examples include the above exploding pie chart, which, as Kaiser notes, is not merely ugly and ridiculously difficult to read (given that it is conveying only nine data points) but also invites suspicion of its numbers, and pages and pages of graphs that could be better compressed into compact displays (see pages 25-65 of the report). Yes, this is all better than tables of numbers, but I don’t see that much thought went into displaying patterns of information or telling a story. It’s more graph-as-data-dump. To be fair, the report does have a clean scatterplot (on page 65). But, overall, the graphs are not well-integrated with the messages in the text. I feel a little bit bad about this, beca

6 0.14392534 2279 andrew gelman stats-2014-04-02-Am I too negative?

7 0.1416073 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”

8 0.13303244 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

9 0.13242926 61 andrew gelman stats-2010-05-31-A data visualization manifesto

10 0.13000409 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

11 0.12412566 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

12 0.11848823 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

13 0.1065549 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles

14 0.10604505 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

15 0.1059924 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

16 0.10344266 319 andrew gelman stats-2010-10-04-“Who owns Congress”

17 0.10098311 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

18 0.10064306 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly

19 0.1006092 927 andrew gelman stats-2011-09-26-R and Google Visualization

20 0.099705532 488 andrew gelman stats-2010-12-27-Graph of the year
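
The simValue column is not defined on the page. One plausible reading, stated here as an assumption, is cosine similarity between the tf-idf vectors of two posts. The sketch below computes that for sparse vectors stored as dictionaries. The three weights in `this_post` are copied from the word list above (a truncated vector), while `other_post` and `unrelated` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention: an empty vector is similar to nothing.
    return dot / (norm_a * norm_b)


this_post = {"chartjunk": 0.555, "plain": 0.249, "graphs": 0.201}
other_post = {"graphs": 0.3, "infovis": 0.5, "plain": 0.1}  # hypothetical weights
unrelated = {"regression": 0.7, "prior": 0.4}  # hypothetical weights

print(round(cosine_similarity(this_post, other_post), 3))
print(cosine_similarity(this_post, unrelated))
```

Under this reading, a post compared with itself scores 1.0, matching the same-blog row, and posts that share no weighted words score 0.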


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.143), (1, -0.065), (2, -0.007), (3, 0.049), (4, 0.111), (5, -0.168), (6, -0.078), (7, 0.042), (8, -0.053), (9, -0.003), (10, 0.026), (11, 0.005), (12, -0.03), (13, 0.006), (14, 0.026), (15, 0.016), (16, 0.003), (17, 0.01), (18, -0.012), (19, 0.007), (20, 0.009), (21, 0.005), (22, -0.014), (23, 0.024), (24, 0.03), (25, -0.003), (26, 0.014), (27, -0.003), (28, -0.006), (29, 0.023), (30, -0.0), (31, -0.004), (32, -0.02), (33, -0.005), (34, -0.023), (35, -0.009), (36, 0.026), (37, -0.015), (38, 0.01), (39, -0.039), (40, 0.056), (41, 0.022), (42, -0.017), (43, 0.019), (44, -0.019), (45, -0.028), (46, -0.038), (47, 0.042), (48, 0.016), (49, 0.027)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97062182 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.


2 0.87594944 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

Introduction: Howard Friedman sent me a new book, The Measure of a Nation, subtitled How to Regain America’s Competitive Edge and Boost Our Global Standing. Without commenting on the substance of Friedman’s recommendations, I’d like to endorse his strategy of presentation, which is to display graph after graph after graph showing the same message over and over again, which is that the U.S. is outperformed by various other countries (mostly in Europe) on a variety of measures. These aren’t graphs I would ever make—they are scatterplots in which the x-axis conveys no information. But they have the advantage of repetition: once you figure out how to read one of the graphs, you can read the others easily. Here’s an example which I found from a quick Google: I can’t actually figure out what is happening on the x-axis, nor do I understand the “star, middle child, dog” thing. But I like the use of graphics. Lots more fun than bullet points. Seriously. P.S. Just to be clear: I am not trying

3 0.86196071 61 andrew gelman stats-2010-05-31-A data visualization manifesto

Introduction: Details matter (at least, they do for me), but we don’t yet have a systematic way of going back and forth between the structure of a graph, its details, and the underlying questions that motivate our visualizations. (Cleveland, Wilkinson, and others have written a bit on how to formalize these connections, and I’ve thought about it too, but we have a ways to go.) I was thinking about this difficulty after reading an article on graphics by some computer scientists that was well-written but to me lacked a feeling for the linkages between substantive/statistical goals and graphical details. I have problems with these issues too, and my point here is not to criticize but to move the discussion forward. When thinking about visualization, how important are the details? Aleks pointed me to this article by Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky, “A Tour through the Visualization Zoo: A survey of powerful visualization techniques, from the obvious to the obscure.” Th

4 0.86037105 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

Introduction: Denis Cote sends the following, under the heading, “Some bad graphs for your enjoyment”: To start with, they don’t know how to spell “color.” Seriously, though, the graph is a mess. The circular display implies a circular or periodic structure that isn’t actually in the data, the cramped display requires the use of an otherwise-unnecessary color code that makes it difficult to find or make sense of the information, the alphabetical ordering (without even supplying state names, only abbreviations) makes it further difficult to find any patterns. It would be so much better, and even easier, to just display a set of small maps shading states on whether they have different laws. But that’s part of the problem—the clearer graph would also be easier to make! To get a distinctive graph, there needs to be some degree of difficulty. The designers continue with these monstrosities: Here they decide to display only 5 states at a time so that it’s really hard to see any big pi

5 0.85502934 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

Introduction: Wayne Folta writes: In keeping with your interest in graphs, this might interest or inspire you, if you haven’t seen it already, which features 20 scientific graphs that Wired likes, ranging from drawn illustrations to trajectory plots. My reaction: I looked at the first 10. I liked 1, 3, and 5, I didn’t like 2, 7, 8, 9, and 10. I have neutral feelings about 4 and 6. I won’t explain all these feelings, but, just for example, from my perspective, image 9 fails as a statistical graphic (although it might be fine as an infovis) by trying to cram too much into a single image. I don’t think it works to have all the colors on the single wheels; instead I’d prefer some sort of grid of images. Also, I don’t see the point of the circular display. That makes no sense at all; it’s a misleading feature. That said, the graphs I dislike can still be fine for their purpose. A graph in a journal such as Science or Nature is meant to grab the eye of a busy reader (or to go viral on

6 0.85216653 488 andrew gelman stats-2010-12-27-Graph of the year

7 0.8493917 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios

8 0.84254104 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

9 0.83540946 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

10 0.83169532 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

11 0.82504189 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

12 0.8158316 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

13 0.81402671 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

14 0.80948788 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

15 0.80737287 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

16 0.80735284 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

17 0.80533171 296 andrew gelman stats-2010-09-26-A simple semigraphic display

18 0.80482614 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

19 0.80083001 319 andrew gelman stats-2010-10-04-“Who owns Congress”

20 0.78971446 1764 andrew gelman stats-2013-03-15-How do I make my graphs?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.038), (9, 0.018), (10, 0.146), (15, 0.023), (16, 0.085), (24, 0.182), (44, 0.021), (53, 0.021), (76, 0.015), (95, 0.031), (99, 0.247)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95446825 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.


2 0.93303144 1059 andrew gelman stats-2011-12-14-Looking at many comparisons may increase the risk of finding something statistically significant by epidemiologists, a population with relatively low multilevel modeling consumption

Introduction: To understand the above title, see here. Masanao writes: This report claims that eating meat increases the risk of cancer. I’m sure you can’t read the page but you probably can understand the graphs. Different bars represent subdivision in the amount of the particular type of meat one consumes. And each chunk is different types of meat. Left is for male right is for female. They claim that the difference is significant, but they are clearly not!! I’m for not eating much meat but this is just way too much… Here’s the graph: I don’t know what to think. If you look carefully you can find one or two statistically significant differences but overall the pattern doesn’t look so compelling. I don’t know what the top and bottom rows are, though. Overall, the pattern in the top row looks like it could represent a real trend, while the graphs on the bottom row look like noise. This could be a good example for our multiple comparisons paper. If the researchers won’t

3 0.9290899 1810 andrew gelman stats-2013-04-17-Subway series

Introduction: Abby points us to a spare but cool visualization. I don’t like the curvy connect-the-dots line, but my main suggested improvement would be a closer link to the map. Showing median income on census tracts along subway lines is cool, but ultimately it’s a clever gimmick that pulls me in and makes me curious about what the map looks like. (And, thanks to google, the map was easy to find.)

4 0.92887962 1402 andrew gelman stats-2012-07-01-Ice cream! and temperature

Introduction: Just in time for the hot weather . . . Aleks points me to this link to a graph of % check-ins at NYC ice cream shops plotted against temperature in 2011. Aleks writes, “interesting how the ice cream response lags temperature in spring/fall but during the summer, the response is immediate.” This graph is a good starting point but I think more could be done, both in the analysis and purely in the graphics. Putting the two lines together like this with a fixed ratio is just too crude a tool. A series of graphs done just right could show a lot, I think!

5 0.9221797 2215 andrew gelman stats-2014-02-17-The Washington Post reprints university press releases without editing them

Introduction: Somebody points me to this horrifying exposé by Paul Raeburn on a new series by the Washington Post where they reprint press releases as if they are actual news. And the gimmick is, the reason why it’s appearing on this blog, is that these are university press releases on science stories. What could possibly go wrong there? After all, Steve Chaplin, a self-identified “science-writing PIO from an R1,” writes in a comment to Raeburn’s post: We write about peer-reviewed research accepted for publication or published by the world’s leading scientific journals after that research has been determined to be legitimate. Repeatability of new research is a publication requisite. I emphasized that last sentence myself because it was such a stunner. Do people really think that??? So I guess what he’s saying is, they don’t do press releases for articles from Psychological Science or the Journal of Personality and Social Psychology. But I wonder how the profs in the psych d

6 0.91496515 78 andrew gelman stats-2010-06-10-Hey, where’s my kickback?

7 0.90934622 1974 andrew gelman stats-2013-08-08-Statistical significance and the dangerous lure of certainty

8 0.90760005 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

9 0.9071843 487 andrew gelman stats-2010-12-27-Alfred Kahn

10 0.90490055 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects

11 0.90320462 2257 andrew gelman stats-2014-03-20-The candy weighing demonstration, or, the unwisdom of crowds

12 0.89859641 1363 andrew gelman stats-2012-06-03-Question about predictive checks

13 0.89701062 1122 andrew gelman stats-2012-01-16-“Groundbreaking or Definitive? Journals Need to Pick One”

14 0.89506352 687 andrew gelman stats-2011-04-29-Zero is zero

15 0.89160752 1713 andrew gelman stats-2013-02-08-P-values and statistical practice

16 0.89065111 357 andrew gelman stats-2010-10-20-Sas and R

17 0.88994867 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

18 0.8883037 488 andrew gelman stats-2010-12-27-Graph of the year

19 0.8869077 1240 andrew gelman stats-2012-04-02-Blogads update

20 0.88676804 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations