671 andrew gelman stats-2011-04-20-One more time-use graph


meta information for this blog

Source: html

Introduction: Evan Hensleigh sends me this redesign of the cross-national time use graph: Here was my version: And here was the original: Compared to my graph, Evan’s has better fonts, and that’s important–good fonts can make a display look professional. But I’m not sure about his other innovations. To me, the different colors for the different time-use categories are more of a distraction than a visual aid, and I also don’t like how he made the bars fatter. As I noted in my earlier entry, to me this draws unwanted attention to the negative space between the bars. His country labels are slightly misaligned (particularly Japan and USA), and I really don’t like his horizontal axis at all! He removed the units of hours and put + and – on the edges so that the axes run into each other. What was the point of that? It’s bad news. Also I don’t see any advantage at all to the prehensile tick marks. On the other hand, if Evan and I were working together on such a graph, we w


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Evan Hensleigh sends me this redesign of the cross-national time use graph: Here was my version: And here was the original: Compared to my graph, Evan’s has better fonts, and that’s important–good fonts can make a display look professional. [sent-1, score-1.008]

2 To me, the different colors for the different time-use categories are more of a distraction than a visual aid, and I also don’t like how he made the bars fatter. [sent-3, score-0.834]

3 As I noted in my earlier entry, to me this draws unwanted attention to the negative space between the bars. [sent-4, score-0.743]

4 His country labels are slightly misaligned (particularly Japan and USA), and I really don’t like his horizontal axis at all! [sent-5, score-0.819]

5 He removed the units of hours and put + and – on the edges so that the axes run into each other. [sent-6, score-0.768]

6 Also I don’t see any advantage at all to the prehensile tick marks. [sent-9, score-0.283]

7 On the other hand, if Evan and I were working together on such a graph, we would probably come up with something better than either of us would make alone. [sent-10, score-0.486]
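
The page does not say how the sentence scores above are computed. As a rough illustration only, here is a minimal sketch that scores a sentence by summing per-word tf-idf weights. The function name `sentence_score` and the summing rule are assumptions made for this example; the weights are a small subset copied from this page's "tfidf for this blog" list.

```python
import re

# A few word weights copied from this page's "tfidf for this blog" list.
WORD_WEIGHTS = {"advantage": 0.101, "tick": 0.182, "fonts": 0.352, "graph": 0.209}


def sentence_score(sentence, weights):
    """Sum the weights of the words in a sentence (hypothetical scoring rule)."""
    # Lowercase and keep alphabetic runs only, so punctuation is ignored.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    # Words without a listed weight contribute nothing.
    return sum(weights.get(tok, 0.0) for tok in tokens)


score = sentence_score(
    "Also I don’t see any advantage at all to the prehensile tick marks.",
    WORD_WEIGHTS,
)
print(round(score, 3))  # prints 0.283
```

For sentence 6 this sum agrees with the listed score of 0.283, but the listed scores of other sentences do not all follow from such a sum, so this should not be read as the page's actual scoring rule.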


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('evan', 0.381), ('fonts', 0.352), ('graph', 0.209), ('misaligned', 0.202), ('redesign', 0.191), ('unwanted', 0.182), ('tick', 0.182), ('distraction', 0.176), ('horizontal', 0.163), ('axes', 0.163), ('japan', 0.159), ('edges', 0.156), ('usa', 0.143), ('removed', 0.136), ('bars', 0.136), ('labels', 0.135), ('colors', 0.135), ('aid', 0.135), ('units', 0.134), ('axis', 0.129), ('draws', 0.119), ('categories', 0.116), ('alone', 0.113), ('visual', 0.111), ('entry', 0.102), ('display', 0.102), ('hours', 0.102), ('advantage', 0.101), ('slightly', 0.097), ('space', 0.095), ('attention', 0.093), ('country', 0.093), ('noted', 0.089), ('version', 0.089), ('better', 0.088), ('negative', 0.088), ('together', 0.081), ('different', 0.08), ('compared', 0.079), ('particularly', 0.077), ('earlier', 0.077), ('run', 0.077), ('original', 0.073), ('hand', 0.071), ('either', 0.067), ('make', 0.066), ('probably', 0.064), ('working', 0.061), ('bad', 0.06), ('come', 0.059)]
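
How the weights in this list were produced is not documented on the page. The sketch below shows one common tf-idf variant (raw term count times log inverse document frequency, then L2 normalization) on a toy three-document corpus. The function name `tfidf_top_words`, the whitespace tokenizer, and the toy documents are all assumptions for illustration, and the output is not expected to reproduce the numbers above.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index]."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    # Term frequency: raw counts within the chosen document.
    tf = Counter(tokenized[doc_index])
    weights = {
        word: count * math.log(n_docs / df[word]) for word, count in tf.items()
    }
    # L2-normalize so weights are comparable across documents of different length.
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, round(w / norm, 3)) for word, w in ranked[:top_n]]


toy_docs = ["evan fonts graph", "graph table colors", "graph axis"]
print(tfidf_top_words(toy_docs, 0))
```

In the toy corpus, "graph" appears in every document and so gets weight zero, while words unique to one document rank highest. The same effect would explain why a name like "evan" can outrank "graph" in a list such as the one above.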

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 671 andrew gelman stats-2011-04-20-One more time-use graph

2 0.16732751 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

Introduction: Matthew Yglesias shares this graph from the Economist : I hate this graph. OK, sure, I don’t hate hate hate hate it: it’s not a 3-d exploding pie chart or anything. It’s not misleading, it’s just extremely difficult to read. Basically, you have to go back and forth between the colors and the labels and the countries and read it like a table. OK, so here’s the table: Average Hours Per Day Spent in Each Activity Work, Unpaid Eating, Personal Country study work sleeping care Leisure Other France 4 3 11 1 2 2 Germany 4 3 10 1 3 3 Japan 6 2 10 1 2 2 Britain 4 3 10 1 3 3 USA 5 3 10 1 3 2 Turkey 4 3 11 1 3 2 Hmm, that didn’t work too well. Let’s try subtracting the average from each column (for these six countries,

3 0.14082581 2132 andrew gelman stats-2013-12-13-And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

Introduction: We haven’t had one of these in awhile, having mostly switched to the “chess trivia” and “bad p-values” genres of blogging . . . But I had to come back to the topic after receiving this note from Raghuveer Parthasarathy: Here’s another bad graph you might like. It might (arguably) be even worse than the “worst graphs of the year” you’ve blogged about, since rather than being a poor representation of data, it is simply the plotting of a tautology that mistakenly gives the impression of being data. (And it’s in Nature.) Parthasarathy explains: On the vertical axis we have the probability of being Type 2 Diabetic (T2D). On the horizontal axis we have the probability of being normal. There’s a clear, important trend evident, right? No! The probability of being normal is trivially one minus the probability of being T2D! The graph could not possibly be anything other than a straight line of slope -1. (For the students out there: the complete lack of scatter in the graph is

4 0.11927432 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at  https://www.udacity.com/course/ud651  so anyone can take it for free. And Kaiser Fung has  reviewed it . So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course  here , including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about eda and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparison (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what

5 0.11684787 61 andrew gelman stats-2010-05-31-A data visualization manifesto

Introduction: Details matter (at least, they do for me), but we don’t yet have a systematic way of going back and forth between the structure of a graph, its details, and the underlying questions that motivate our visualizations. (Cleveland, Wilkinson, and others have written a bit on how to formalize these connections, and I’ve thought about it too, but we have a ways to go.) I was thinking about this difficulty after reading an article on graphics by some computer scientists that was well-written but to me lacked a feeling for the linkages between substantive/statistical goals and graphical details. I have problems with these issues too, and my point here is not to criticize but to move the discussion forward. When thinking about visualization, how important are the details? Aleks pointed me to this article by Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky, “A Tour through the Visualization Zoo: A survey of powerful visualization techniques, from the obvious to the obscure.” Th

6 0.11503217 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

7 0.11141466 488 andrew gelman stats-2010-12-27-Graph of the year

8 0.11139308 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

9 0.10441645 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

10 0.10377514 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

11 0.10206389 2091 andrew gelman stats-2013-11-06-“Marginally significant”

12 0.10199241 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

13 0.10058602 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

14 0.096287251 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

15 0.091634959 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

16 0.090234332 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

17 0.090012394 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

18 0.089347631 2043 andrew gelman stats-2013-09-29-The difficulties of measuring just about anything

19 0.086636193 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

20 0.086480089 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.114), (1, -0.042), (2, 0.002), (3, 0.059), (4, 0.111), (5, -0.135), (6, -0.036), (7, 0.035), (8, -0.004), (9, -0.007), (10, 0.006), (11, -0.002), (12, -0.03), (13, -0.009), (14, 0.01), (15, 0.006), (16, 0.036), (17, 0.017), (18, -0.041), (19, 0.008), (20, 0.025), (21, 0.03), (22, -0.047), (23, 0.0), (24, 0.03), (25, -0.015), (26, 0.018), (27, -0.017), (28, -0.037), (29, -0.005), (30, 0.03), (31, -0.01), (32, -0.081), (33, -0.021), (34, -0.028), (35, -0.032), (36, -0.046), (37, -0.04), (38, -0.021), (39, 0.015), (40, -0.006), (41, -0.014), (42, 0.002), (43, 0.024), (44, -0.035), (45, 0.012), (46, 0.024), (47, 0.033), (48, -0.013), (49, 0.0)]
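
The similarity values in the lists on this page are presumably derived from topic vectors like the one above, though the page does not state the exact measure. A minimal sketch of cosine similarity over `(topicId, topicWeight)` pairs follows; the function name `cosine_similarity` is an assumption for this example. Note that the page reports 0.965 for this post compared with itself under LSI, rather than exactly 1.0, so whatever the page computes is not identical to this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as (topicId, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db.get(topic, 0.0) for topic, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen here for all-zero vectors.
    return dot / (norm_a * norm_b)


# The first few entries of the LSI vector listed above.
this_post = [(0, 0.114), (1, -0.042), (2, 0.002), (3, 0.059), (4, 0.111)]
print(cosine_similarity(this_post, this_post))
```

Because the input is a list of pairs, the same function works for the shorter sparse vectors in the LDA section, where most topics are simply absent.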

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96531457 671 andrew gelman stats-2011-04-20-One more time-use graph

2 0.90633881 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

Introduction: Denis Cote sends the following , under the heading, “Some bad graphs for your enjoyment”: To start with, they don’t know how to spell “color.” Seriously, though, the graph is a mess. The circular display implies a circular or periodic structure that isn’t actually in the data, the cramped display requires the use of an otherwise-unnecessary color code that makes it difficult to find or make sense of the information, the alphabetical ordering (without even supplying state names, only abbreviations) makes it further difficult to find any patterns. It would be so much better, and even easier, to just display a set of small maps shading states on whether they have different laws. But that’s part of the problem—the clearer graph would also be easier to make! To get a distinctive graph, there needs to be some degree of difficulty. The designers continue with these monstrosities: Here they decide to display only 5 states at a time so that it’s really hard to see any big pi

3 0.88407904 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

Introduction: They didn’t have room for all four graphs of the time-series decomposition so they just displayed the date-of-year graph: They rotated so the graph fit better on the page. The rotation worked for me, but I was a bit bummed that that they put the title and heading of the graph (“The birthrate tends to drop on holidays . . .”) on the left in the Mar-Apr slot, leaving no room to label Leap Day and April Fool’s. I suggested to the graphics people that they put the label at the very top and just shrink the rest of the graph by 5 or 10% so as to not take up any more total space. Then there’d be plenty of space to label Leap Day and April Fool’s. But they didn’t do it, maybe they felt that it wouldn’t look good to have the label right at the top, I dunno.

4 0.8827076 502 andrew gelman stats-2011-01-04-Cash in, cash out graph

Introduction: David Afshartous writes: I thought this graph [from Ed Easterling] might be good for your blog. The 71 outlined squares show the main story, and the regions of the graph present the information nicely. Looks like the bins for the color coding are not of equal size and of course the end bins are unbounded. Might be interesting to graph the distribution of the actual data for the 71 outlined squares. In addition, I assume that each period begins on Jan 1 so data size could be naturally increased by looking at intervals that start on June 1 as well (where the limit of this process would be to have it at the granularity of one day; while it most likely wouldn’t make much difference, I’ve seen some graphs before where 1 year returns can be quite sensitive to starting date, etc). I agree that (a) the graph could be improved in small ways–in particular, adding half-year data seems like a great idea–and (b) it’s a wonderful, wonderful graph as is. And the NYT graphics people ad

5 0.87140232 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

Introduction: Under the subject line “Blog bait!”, Brendan Nyhan points me to this post at the Washington Post blog: For 2013, we asked some of the year’s most interesting, important and influential thinkers to name their favorite graph of the year — and why they chose it. Here’s Bill Gates’s. Infographic by Thomas Porostocky for WIRED. “I love this graph because it shows that while the number of people dying from communicable diseases is still far too high, those numbers continue to come down. . . .” As Brendan is aware, this is not my favorite sort of graph, it’s a bit of a puzzle to read and figure out where all the pieces fit in, also weird stuff going on like 3-D effects and the big space taken up by those yellow and green borders, as well as tricky things like understanding what some of those little blocks are, and perhaps the biggest question, what is the definition of an “untimely death.” But, as often is the case, the defects of the graph form a statistical perspective can

6 0.86879569 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

7 0.86803526 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

8 0.86761886 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

9 0.86747336 1011 andrew gelman stats-2011-11-15-World record running times vs. distance

10 0.85559142 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

11 0.85056728 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

12 0.84995061 2132 andrew gelman stats-2013-12-13-And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

13 0.84044993 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

14 0.83933401 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

15 0.83580583 488 andrew gelman stats-2010-12-27-Graph of the year

16 0.83383918 443 andrew gelman stats-2010-12-02-Automating my graphics advice

17 0.83103287 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

18 0.82940239 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

19 0.82271045 1253 andrew gelman stats-2012-04-08-Technology speedup graph

20 0.8165617 1498 andrew gelman stats-2012-09-16-Choices in graphing parallel time series


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.055), (14, 0.021), (15, 0.039), (16, 0.079), (21, 0.019), (24, 0.163), (65, 0.225), (76, 0.047), (86, 0.027), (89, 0.018), (99, 0.193)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.92846298 457 andrew gelman stats-2010-12-07-Whassup with phantom-limb treatment?

Introduction: OK, here’s something that is completely baffling me. I read this article by John Colapinto on the neuroscientist V. S. Ramachandran, who’s famous for his innovative treatment for “phantom limb” pain: His first subject was a young man who a decade earlier had crashed his motorcycle and torn from his spinal column the nerves supplying the left arm. After keeping the useless arm in a sling for a year, the man had the arm amputated above the elbow. Ever since, he had felt unremitting cramping in the phantom limb, as though it were immobilized in an awkward position. . . . Ramachandram positioned a twenty-inch-by-twenty-inch drugstore mirror . . . and told him to place his intact right arm on one side of the mirror and his stump on the other. He told the man to arrange the mirror so that the reflection created the illusion that his intact arm was the continuation of the amputated one. The Ramachandran asked the man to move his right and left arms . . . “Oh, my God!” the man began

2 0.91160148 1475 andrew gelman stats-2012-08-30-A Stan is Born

Introduction: Stan 1.0.0 and RStan 1.0.0 It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language for expressing models and a different sampler for sampling from their posteriors. RStan is the R interface to Stan. Stan Home Page Stan’s home page is: http://mc-stan.org/ It links everything you need to get started running Stan from the command line, from R, or from C++, including full step-by-step install instructions, a detailed user’s guide and reference manual for the modeling language, and tested ports of most of the BUGS examples. Peruse the Manual If you’d like to learn more, the Stan User’s Guide and Reference Manual is the place to start.

same-blog 3 0.88217056 671 andrew gelman stats-2011-04-20-One more time-use graph

4 0.8804664 1426 andrew gelman stats-2012-07-23-Special effects

Introduction: I just saw L’Age de Glace 4 and boy are my eyes tired. I’m just glad it wasn’t in 3-D or I probably would’ve thrown up. The special effects were amazing, way beyond George of the Jungle and that ilk. Which was good, as I could only understand about 10% of the dialogue. I’d heard about all this new animation technology but not actually seen it before.

5 0.87679517 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”

Introduction: The above is the title of a talk that Thad Tarpey gave at the Joint Statistical Meetings in 2009. Here’s the abstract: Students of statistics are often introduced to George Box’s famous quote: “all models are wrong, some are useful.” In this talk I [Tarpey] argue that this quote, although useful, is wrong. A different and more positive perspective is to acknowledge that a model is simply a means of extracting information of interest from data. The truth is infinitely complex and a model is merely an approximation to the truth. If the approximation is poor or misleading, then the model is useless. In this talk I give examples of correct models that are not true models. I illustrate how the notion of a “wrong” model can lead to wrong conclusions. I’m curious what he had to say—maybe he could post the slides? P.S. And here they are !

6 0.87380236 1993 andrew gelman stats-2013-08-22-Improvements to Kindle Version of BDA3

7 0.86712521 1021 andrew gelman stats-2011-11-21-Don’t judge a book by its title

8 0.86248922 1365 andrew gelman stats-2012-06-04-Question 25 of my final exam for Design and Analysis of Sample Surveys

9 0.86123371 2074 andrew gelman stats-2013-10-23-Can’t Stop Won’t Stop Mister P Beatdown

10 0.85737067 2062 andrew gelman stats-2013-10-15-Last word on Mister P (for now)

11 0.85465562 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

12 0.83285975 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

13 0.823053 1454 andrew gelman stats-2012-08-11-Weakly informative priors for Bayesian nonparametric models?

14 0.81985849 463 andrew gelman stats-2010-12-11-Compare p-values from privately funded medical trials to those in publicly funded research?

15 0.81237662 990 andrew gelman stats-2011-11-04-At the politics blogs . . .

16 0.8066076 1333 andrew gelman stats-2012-05-20-Question 10 of my final exam for Design and Analysis of Sample Surveys

17 0.8010754 758 andrew gelman stats-2011-06-11-Hey, good news! Your p-value just passed the 0.05 threshold!

18 0.7948935 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

19 0.79353023 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?

20 0.7814225 100 andrew gelman stats-2010-06-19-Unsurprisingly, people are more worried about the economy and jobs than about deficits