2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year


meta info for this blog

Source: html

Introduction: Under the subject line “Blog bait!”, Brendan Nyhan points me to this post at the Washington Post blog: For 2013, we asked some of the year’s most interesting, important and influential thinkers to name their favorite graph of the year — and why they chose it. Here’s Bill Gates’s. Infographic by Thomas Porostocky for WIRED. “I love this graph because it shows that while the number of people dying from communicable diseases is still far too high, those numbers continue to come down. . . .” As Brendan is aware, this is not my favorite sort of graph, it’s a bit of a puzzle to read and figure out where all the pieces fit in, also weird stuff going on like 3-D effects and the big space taken up by those yellow and green borders, as well as tricky things like understanding what some of those little blocks are, and perhaps the biggest question, what is the definition of an “untimely death.” But, as often is the case, the defects of the graph from a statistical perspective can


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 ”, Brendan Nyhan points me to this post at the Washington Post blog: For 2013, we asked some of the year’s most interesting, important and influential thinkers to name their favorite graph of the year — and why they chose it. [sent-2, score-0.46]

2 “I love this graph because it shows that while the number of people dying from communicable diseases is still far too high, those numbers continue to come down. [sent-5, score-1.279]

3 ” But, as often is the case, the defects of the graph from a statistical perspective can make it attractive to readers: the 3-D design is grabby, and all the puzzles give a Chris Rock effect. [sent-10, score-0.437]

4 And its use of the three colors is excellent—a simple but effective way of conveying the three groups (just don’t ask me to guess the relative sizes of the yellow and red parallelograms! [sent-12, score-0.826]

5 , China, India, Indonesia, and other large countries, also maybe some continents and other groupings such as the E. [sent-16, score-0.147]

6 ), then click through again to a spreadsheet with the numbers. [sent-18, score-0.08]

7 But remember the BD principle: a picture plus a thousand words is better than 2 pictures or 2000 words. [sent-23, score-0.087]

8 The hypertext could give lots of information, including their definition of “untimely death. [sent-24, score-0.252]

9 ” This graph does not say what you think it says Most interesting to me, though, was Gates’s claim that the graph shows that “while the number of people dying from communicable diseases is still far too high, those numbers continue to come down. [sent-25, score-1.592]

10 ” I guess I’ll buy that the graph shows that those yellow numbers are “far too high,” not really from the graphic itself but because we can see that big fat rectangle for Diarrhea. [sent-26, score-1.056]

11 It just doesn’t seem like so many people should be dying of that. [sent-27, score-0.175]

12 But I don’t think the graph does such a great job of showing the trend (“those numbers continue to come down”). [sent-28, score-0.695]

13 I mean, sure, after seeing Gates’s words, I can go back and say, yeah, lots of bright yellow there. [sent-29, score-0.519]

14 Also, that bright yellow thing is a bit of a cheat. [sent-31, score-0.519]

15 For the red and green sections of the graph, the sharpest declines are indicated by a very pale green and a very very pale pink. [sent-32, score-0.977]

16 But in the yellow box, the sharpest declines are screaming yellow. [sent-33, score-0.746]

17 It’s the role of the follow-up graphs in the click-through to provide some perspective and allow some comparison. [sent-36, score-0.058]

18 Both these goals are important, and there’s no reason whatsoever to expect that they can be most effectively achieved by a single display. [sent-39, score-0.057]

19 If there’s a key message you want to convey in your graph (for example, infectious diseases kill too many people but are in decline), I recommend putting that message front and center, in words, on your display. [sent-41, score-0.704]
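The sentence scores above can be checked by hand against the per-word tf-idf weights this page reports for the post. A minimal sketch follows, assuming (this is a guess, not the site's documented method) that a sentence's score is the sum of the weights of its whitespace-separated tokens, and that tokens with attached punctuation do not match. The function name `sentence_score` and the dictionary `TFIDF_WEIGHTS` are hypothetical; the weights are copied from the page's top-N word list.

```python
# Hypothetical reconstruction of the sentence scoring. The real pipeline is
# not documented on this page; this only reproduces some of the listed scores.

# A few weights copied from the "wordTfidf (topN-words)" list for this post.
TFIDF_WEIGHTS = {
    "yellow": 0.395,
    "bright": 0.124,
    "sharpest": 0.166,
    "declines": 0.119,
    "screaming": 0.066,
    "dying": 0.175,
}


def sentence_score(sentence, weights):
    """Sum the weights of whitespace-separated tokens found in `weights`.

    Tokens keep their punctuation, so "yellow." does not match "yellow".
    That is an assumption, chosen because it matches the listed scores.
    """
    total = sum(weights.get(token.lower(), 0.0) for token in sentence.split())
    return round(total, 3)


print(sentence_score(
    "It just doesn’t seem like so many people should be dying of that.",
    TFIDF_WEIGHTS))  # 0.175, as listed for sentence 11
print(sentence_score(
    "But in the yellow box, the sharpest declines are screaming yellow.",
    TFIDF_WEIGHTS))  # 0.746, as listed for sentence 16
```

Under this assumption, long sentences that happen to contain several high-weight words (for example "yellow" and "graph") rise to the top of the summary regardless of how central they are to the argument.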


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('yellow', 0.395), ('graph', 0.313), ('gates', 0.223), ('diseases', 0.175), ('dying', 0.175), ('communicable', 0.166), ('hypertext', 0.166), ('pale', 0.166), ('sharpest', 0.166), ('untimely', 0.166), ('green', 0.146), ('graphic', 0.145), ('bright', 0.124), ('declines', 0.119), ('brendan', 0.108), ('infographic', 0.108), ('continue', 0.108), ('numbers', 0.105), ('showing', 0.103), ('colors', 0.101), ('shows', 0.098), ('three', 0.098), ('words', 0.087), ('definition', 0.086), ('click', 0.08), ('favorite', 0.076), ('continents', 0.076), ('infectious', 0.076), ('far', 0.073), ('thinkers', 0.071), ('groupings', 0.071), ('bait', 0.071), ('grabs', 0.071), ('replica', 0.071), ('message', 0.07), ('bill', 0.07), ('red', 0.068), ('high', 0.067), ('screaming', 0.066), ('defects', 0.066), ('conveying', 0.066), ('come', 0.066), ('grabby', 0.064), ('dotplots', 0.064), ('borders', 0.064), ('caption', 0.062), ('dotplot', 0.061), ('perspective', 0.058), ('whatsoever', 0.057), ('china', 0.056)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

2 0.20464677 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

3 0.15776908 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

Introduction: Kaiser Fung discusses the following graph that is captioned, “A study of 54 nations–ranked below–found that those with more progressive tax rates had happier citizens, on average.” As Kaiser writes, “from a purely graphical perspective, the chart is well executed . . . they have 54 points, and the chart still doesn’t look too crammed . . .” But he also points out that the graph’s implicit claims (that tax rates can explain happiness or cause more happiness) are not supported. Kaiser and I are not being picky-picky-picky here. Taken literally, the graph title says nothing about causation, but I think the phrasing implies it. Also, from a purely descriptive perspective, the graph is somewhat at war with its caption. The caption announces a relationship, but in the graph, the x and y variables have only a very weak correlation. The caption says that happiness and progressive tax rates go together, but the graph uses the U.S. as a baseline, and when you move from the U.S

4 0.15760669 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos

Introduction: Paul Alper writes: For years I [Alper] have been obsessed by the color of the line which divides oncoming (i.e., opposing) traffic because I was firmly convinced that the color of the center line changed during my lifetime. Yet, I never could find anyone who had the same remembrance (or interest in the topic). The other day I found this explanation that vindicates my recollection (and I was continuously out of the U.S. from 1969 to 1973): The question of which color to use for highway center lines in the United States enjoyed considerable debate and changing standards over a period of several decades. By November 1954, 47 states had adopted white as their standard color for highway centerlines, with Oregon being the last holdout to use yellow. In 1958, the U.S. Bureau of Public Roads adopted white as the standard color for the new interstate highway system. The 1971 edition of the Manual on Uniform Traffic Control Devices, however, mandated yellow as the standard color o

5 0.15535803 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at https://www.udacity.com/course/ud651 so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about EDA and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparisons (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what

6 0.14899975 61 andrew gelman stats-2010-05-31-A data visualization manifesto

7 0.14809173 832 andrew gelman stats-2011-07-31-Even a good data display can sometimes be improved

8 0.142397 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

9 0.13937312 488 andrew gelman stats-2010-12-27-Graph of the year

10 0.13615167 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

11 0.13589525 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

12 0.1309059 2132 andrew gelman stats-2013-12-13-And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

13 0.12648515 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

14 0.12413373 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

15 0.1234505 583 andrew gelman stats-2011-02-21-An interesting assignment for statistical graphics

16 0.12047377 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

17 0.11696885 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

18 0.1143117 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

19 0.11425029 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

20 0.11241721 863 andrew gelman stats-2011-08-21-Bad graph
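For readers wondering how a ranking like the one above might be produced: a common approach is cosine similarity between per-post tf-idf vectors, and the self-match value of 1.0000002 is consistent with a floating-point cosine of a vector with itself. The sketch below assumes that approach; the page does not state which measure it uses. The names `cosine` and `rank_similar` are hypothetical, and apart from the two weights copied from this post's word list, the vectors are toy values, not data from the page.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query, corpus):
    """Return (similarity, doc_id) pairs, most similar first."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    return sorted(scored, reverse=True)


# "yellow" and "graph" weights are from this post's list; the rest are toy values.
query = {"yellow": 0.395, "graph": 0.313}
corpus = {
    "this_post": query,
    "other_graph_post": {"graph": 0.4, "display": 0.3},
    "unrelated_post": {"migration": 0.5},
}

for score, doc_id in rank_similar(query, corpus):
    print(f"{score:.4f} {doc_id}")
```

The same ranking function would work for the LSI and LDA lists further down if each post were represented by its topic-weight pairs instead of word weights, again assuming cosine similarity is the measure in use.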


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.18), (1, -0.076), (2, 0.014), (3, 0.069), (4, 0.14), (5, -0.174), (6, -0.061), (7, 0.072), (8, -0.018), (9, -0.011), (10, 0.007), (11, -0.008), (12, -0.02), (13, 0.027), (14, 0.015), (15, 0.007), (16, 0.042), (17, -0.002), (18, -0.009), (19, -0.006), (20, 0.013), (21, 0.005), (22, -0.031), (23, 0.002), (24, 0.045), (25, -0.009), (26, -0.013), (27, 0.042), (28, -0.033), (29, 0.005), (30, 0.015), (31, -0.019), (32, -0.065), (33, -0.054), (34, -0.039), (35, -0.014), (36, 0.002), (37, -0.06), (38, -0.006), (39, 0.016), (40, 0.004), (41, -0.027), (42, 0.014), (43, 0.04), (44, -0.027), (45, 0.022), (46, 0.016), (47, -0.012), (48, -0.055), (49, -0.002)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9745056 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

2 0.94085342 488 andrew gelman stats-2010-12-27-Graph of the year

Introduction: From blogger Matthew Yglesias: There are lots of great graphs all over the web (see, for example, here and here for some snappy pictures of unemployment trends from blogger “Geoff”). There’s nothing special about Yglesias’s graph. In fact, the reason I’m singling it out as “graph of the year” is because it’s not special. It’s a display of three numbers, with no subtlety or artistry in its presentation. True, it has some good features: - Clear title - Clearly labeled axes - Vertical axis goes to zero - The cities are in a sensible order (not, for example, alphabetical) - The graph is readable; none of that 3-D “data visualization” crap that looks cool but distances the reader from the numbers being displayed. What’s impressive about the above graph, what makes it a landmark to me, is that it was made at all. As noted in the text immediately below the image, it’s a display of exactly three numbers which can with little effort be completely presented and e

3 0.91319245 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

Introduction: I recently came across a data visualization that perfectly demonstrates the difference between the “infovis” and “statgraphics” perspectives. Here’s the image ( link from Tyler Cowen): That’s the infovis. The statgraphic version would simply be a dotplot, something like this: (I purposely used the default settings in R with only minor modifications here to demonstrate what happens if you just want to plot the data with minimal effort.) Let’s compare the two graphs: From a statistical graphics perspective, the second graph dominates. The countries are directly comparable and the numbers are indicated by positions rather than area. The first graph is full of distracting color and gives the misleading visual impression that the total GDP of countries 5-10 is about equal to that of countries 1-4. If the goal is to get attention , though, it’s another story. There’s nothing special about the top graph above except how it looks. It represents neither a dat

4 0.89706463 502 andrew gelman stats-2011-01-04-Cash in, cash out graph

Introduction: David Afshartous writes: I thought this graph [from Ed Easterling] might be good for your blog. The 71 outlined squares show the main story, and the regions of the graph present the information nicely. Looks like the bins for the color coding are not of equal size and of course the end bins are unbounded. Might be interesting to graph the distribution of the actual data for the 71 outlined squares. In addition, I assume that each period begins on Jan 1 so data size could be naturally increased by looking at intervals that start on June 1 as well (where the limit of this process would be to have it at the granularity of one day; while it most likely wouldn’t make much difference, I’ve seen some graphs before where 1 year returns can be quite sensitive to starting date, etc). I agree that (a) the graph could be improved in small ways–in particular, adding half-year data seems like a great idea–and (b) it’s a wonderful, wonderful graph as is. And the NYT graphics people ad

5 0.89561236 671 andrew gelman stats-2011-04-20-One more time-use graph

Introduction: Evan Hensleigh sends me this redesign of the cross-national time use graph: Here was my version: And here was the original: Compared to my graph, Evan’s has better fonts, and that’s important–good fonts can make a display look professional. But I’m not sure about his other innovations. To me, the different colors for the different time-use categories are more of a distraction than a visual aid, and I also don’t like how he made the bars fatter. As I noted in my earlier entry, to me this draws unwanted attention to the negative space between the bars. His country labels are slightly misaligned (particularly Japan and USA), and I really don’t like his horizontal axis at all! He removed the units of hours and put + and – on the edges so that the axes run into each other. What was the point of that? It’s bad news. Also I don’t see any advantage at all to the prehensile tick marks. On the other hand, if Evan and I were working together on such a graph, we w

6 0.89543068 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

7 0.88675147 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

8 0.87918431 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

9 0.87874943 1011 andrew gelman stats-2011-11-15-World record running times vs. distance

10 0.87024713 915 andrew gelman stats-2011-09-17-(Worst) graph of the year

11 0.85865796 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

12 0.85481215 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

13 0.85174656 443 andrew gelman stats-2010-12-02-Automating my graphics advice

14 0.8514666 1669 andrew gelman stats-2013-01-12-The power of the puzzlegraph

15 0.84682888 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

16 0.84499466 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

17 0.83575845 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

18 0.83443147 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

19 0.83394963 61 andrew gelman stats-2010-05-31-A data visualization manifesto

20 0.83150774 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.012), (6, 0.02), (16, 0.077), (21, 0.044), (24, 0.16), (25, 0.044), (38, 0.012), (45, 0.011), (55, 0.026), (66, 0.056), (77, 0.027), (95, 0.152), (96, 0.011), (99, 0.24)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95673406 404 andrew gelman stats-2010-11-09-“Much of the recent reported drop in interstate migration is a statistical artifact”

Introduction: Greg Kaplan writes: I noticed that you have blogged a little about interstate migration trends in the US, and thought that you might be interested in a new working paper of mine (joint with Sam Schulhofer-Wohl from the Minneapolis Fed) which I have attached. Briefly, we show that much of the recent reported drop in interstate migration is a statistical artifact: The Census Bureau made an undocumented change in its imputation procedures for missing data in 2006, and this change significantly reduced the number of imputed interstate moves. The change in imputation procedures — not any actual change in migration behavior — explains 90 percent of the reported decrease in interstate migration between the 2005 and 2006 Current Population Surveys, and 42 percent of the decrease between 2000 and 2010. I haven’t had a chance to give a serious look so could only make the quick suggestion to make the graphs smaller and put multiple graphs on a page. This would allow the reader to bett

2 0.94962621 12 andrew gelman stats-2010-04-30-More on problems with surveys estimating deaths in war zones

Introduction: Andrew Mack writes: There was a brief commentary from the Benetech folk on the Human Security Report Project’s, “The Shrinking Costs of War” report on your blog in January. But the report has since generated a lot of public controversy . Since the report–like the current discussion in your blog on Mike Spagat’s new paper on Iraq–deals with controversies generated by survey-based excess death estimates, we thought your readers might be interested. Our responses to the debate were posted on our website last week. “Shrinking Costs” had discussed the dramatic decline in death tolls from wartime violence since the end of World War II –and its causes. We also argued that deaths from war-exacerbated disease and malnutrition had declined. (The exec. summary is here .) One of the most striking findings was that mortality rates (we used under-five mortality data) decline during most wars. Indeed our latest research indicates that of the total number of years that countries w

3 0.94626176 1973 andrew gelman stats-2013-08-08-For chrissake, just make up an analysis already! We have a lab here to run, y’know?

Introduction: Ben Hyde sends along this : Stuck in the middle of the supplemental data, reporting the total workup for their compounds, was this gem: Emma, please insert NMR data here! where are they? and for this compound, just make up an elemental analysis . . . I’m reminded of our recent discussions of coauthorship, where I argued that I see real advantages to having multiple people taking responsibility for the result. Jay Verkuilen responded: “On the flipside of collaboration . . . is diffusion of responsibility, where everybody thinks someone else ‘has that problem’ and thus things don’t get solved.” That’s what seems to have happened (hilariously) here.

4 0.94571543 266 andrew gelman stats-2010-09-09-The future of R

Introduction: Some thoughts from Christian , including this bit: We need to consider separately 1. R’s brilliant library 2. R’s not-so-brilliant language and/or interpreter. I don’t know that R’s library is so brilliant as all that–if necessary, I don’t think it would be hard to reprogram the important packages in a new language. I would say, though, that the problems with R are not just in the technical details of the language. I think the culture of R has some problems too. As I’ve written before, R functions used to be lean and mean, and now they’re full of exception-handling and calls to other packages. R functions are spaghetti-like messes of connections in which I keep expecting to run into syntax like “GOTO 120.” I learned about these problems a couple years ago when writing bayesglm(), which is a simple adaptation of glm(). But glm(), and its workhorse, glm.fit(), are a mess: They’re about 10 lines of functioning code, plus about 20 lines of necessary front-end, plus a cou

same-blog 5 0.94246352 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

6 0.94189417 1164 andrew gelman stats-2012-02-13-Help with this problem, win valuable prizes

7 0.93274856 1086 andrew gelman stats-2011-12-27-The most dangerous jobs in America

8 0.92712104 1308 andrew gelman stats-2012-05-08-chartsnthings !

9 0.92665654 2135 andrew gelman stats-2013-12-15-The UN Plot to Force Bayesianism on Unsuspecting Americans (penalized B-Spline edition)

10 0.92663932 1862 andrew gelman stats-2013-05-18-uuuuuuuuuuuuugly

11 0.92227328 519 andrew gelman stats-2011-01-16-Update on the generalized method of moments

12 0.92126298 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

13 0.92030227 1820 andrew gelman stats-2013-04-23-Foundation for Open Access Statistics

14 0.91605771 1070 andrew gelman stats-2011-12-19-The scope for snooping

15 0.91569245 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

16 0.91379201 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?

17 0.9127053 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos

18 0.91237718 1758 andrew gelman stats-2013-03-11-Yes, the decision to try (or not) to have a child can be made rationally

19 0.90779912 1575 andrew gelman stats-2012-11-12-Thinking like a statistician (continuously) rather than like a civilian (discretely)

20 0.90251541 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?