2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet


meta infos for this blog

Source: html

Introduction: Kaiser points to this infoviz from MIT’s Technology Review: Kaiser writes: What makes the designer want to tilt the reader’s head? This chart is unreadable. It also fails the self-sufficiency test. All 13 data points are printed onto the chart. You really don’t need the axis, and the gridlines. A further design flaw is the use of signposts. Our eyes are drawn to the hexagons containing the brand icons but the data is at the other end of the signpost, where it is planted on the surface! Here is a sketch of something not as cute: I [Kaiser] expressed time as years . . . The mobile-related entities are labelled red. The dots could be replaced by the hexagonal brand icons. I agree with all of Kaiser’s criticisms, and I agree that his graph is, from the statistical perspective, a zillion times better than what was published. On the other hand, unusual images can get attention. Recall the famous/notorious clock plot from Florence Nightingale . This is why I’ve move


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Kaiser points to this infoviz from MIT’s Technology Review: Kaiser writes: What makes the designer want to tilt the reader’s head? [sent-1, score-0.367]

2 Our eyes are drawn to the hexagons containing the brand icons but the data is at the other end of the signpost, where it is planted on the surface! [sent-7, score-0.659]

3 Here is a sketch of something not as cute: I [Kaiser] expressed time as years . [sent-8, score-0.112]

4 The dots could be replaced by the hexagonal brand icons. [sent-12, score-0.504]

5 I agree with all of Kaiser’s criticisms, and I agree that his graph is, from the statistical perspective, a zillion times better than what was published. [sent-13, score-0.232]

6 On the other hand, unusual images can get attention. [sent-14, score-0.084]

7 Recall the famous/notorious clock plot from Florence Nightingale . [sent-15, score-0.102]

8 This is why I’ve moved to the idea of accepting both styles. [sent-16, score-0.086]

9 Maybe Technology Review could feature their arty graph, but then when the reader clicks on it, they go straight to a statistical graph. [sent-17, score-0.529]

10 And then another click could go to a spreadsheet with the raw data (and as much metadata as needed). [sent-18, score-0.209]

11 Strictly speaking, the points convey all the information, but the lines indicate growth, which is what it’s all about. [sent-20, score-0.319]

12 On the details, I think Kaiser’s graph could be better. [sent-23, score-0.252]

13 I’d like to see labels on all the points; one option would be to put the y-axis on a log scale so it will all be more readable. [sent-24, score-0.358]

14 Also, then the straight lines on log scale will correspond to exponential growth, which might be more realistic than linear for most of the data. [sent-25, score-0.87]

15 He could also play around with having the x-axis be absolute time rather than relative time, so that we could see when each platform started. [sent-26, score-0.413]

16 (Recall also one of our core principles of graphics, which is that there’s no need to limit ourselves to a single display. [sent-27, score-0.081]
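The point in sentences 13 and 14, that a straight line on a log scale corresponds to exponential growth, can be checked numerically. The following is a minimal sketch using a made-up series that doubles every year; the numbers are hypothetical and are not the data behind the Technology Review chart.

```python
import math

# Hypothetical series (not the chart's data): a user count that doubles every year.
years = [0, 1, 2, 3, 4, 5]
users = [1_000_000 * 2 ** t for t in years]

# On the raw scale the year-to-year increments keep growing.
raw_steps = [b - a for a, b in zip(users, users[1:])]

# On the log10 scale every step has the same size, which is a straight line.
log_users = [math.log10(u) for u in users]
log_steps = [b - a for a, b in zip(log_users, log_users[1:])]

print(raw_steps)
print([round(s, 4) for s in log_steps])
```

Each log-scale step equals log10(2), the log of the yearly growth factor, so the slope of the line on a log axis reads off the growth rate directly.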


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('kaiser', 0.427), ('brand', 0.218), ('lines', 0.181), ('technology', 0.159), ('growth', 0.157), ('log', 0.156), ('straight', 0.145), ('points', 0.138), ('graph', 0.137), ('reader', 0.136), ('entities', 0.133), ('icons', 0.133), ('clicks', 0.133), ('labelled', 0.133), ('planted', 0.12), ('tilt', 0.12), ('recall', 0.119), ('could', 0.115), ('scale', 0.113), ('sketch', 0.112), ('florence', 0.112), ('printed', 0.112), ('nightingale', 0.112), ('designer', 0.109), ('exponential', 0.104), ('surface', 0.102), ('clock', 0.102), ('review', 0.098), ('containing', 0.097), ('flaw', 0.097), ('platform', 0.096), ('fails', 0.096), ('zillion', 0.095), ('spreadsheet', 0.094), ('eyes', 0.091), ('mit', 0.09), ('chart', 0.089), ('strictly', 0.089), ('cute', 0.089), ('labels', 0.089), ('dots', 0.088), ('onto', 0.088), ('absolute', 0.087), ('accepting', 0.086), ('correspond', 0.086), ('realistic', 0.085), ('axis', 0.085), ('images', 0.084), ('replaced', 0.083), ('core', 0.081)]
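A word-weight list of this kind can be produced by a tf-idf weighting. The page does not say which variant or tokeniser was used, so the sketch below is only an assumption: it uses whitespace tokens, a smoothed idf, and L2 normalisation, and the three toy documents are invented for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {word: weight} dict per document, L2-normalised.

    Assumed variant: tf = count / document length,
    idf = log((1 + N) / (1 + df)) + 1 (a common smoothed form).
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenised for word in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights = {
            w: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0)
            for w, c in counts.items()
        }
        norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
        vectors.append({w: v / norm for w, v in weights.items()})
    return vectors

def top_words(vector, n=3):
    """Return the n highest-weighted (word, weight) pairs."""
    return sorted(vector.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Toy corpus, invented for this example.
docs = [
    "kaiser graph log scale graph",
    "kaiser placebo effect treatment",
    "election districts republicans moderates",
]
vectors = tfidf(docs)
print(top_words(vectors[0]))
```

Words that are frequent in one post but rare across the corpus get the largest weights, which is why a name like "kaiser" can dominate a single post's list while scoring lower in a toy corpus where it appears in several documents.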

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999976 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet

2 0.22235776 648 andrew gelman stats-2011-04-04-The Case for More False Positives in Anti-doping Testing

Introduction: No joke. See here (from Kaiser Fung). At the Statistics Forum.

3 0.21971188 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

Introduction: Kaiser Fung shares this graph from Ritchie King: Kaiser writes: What they did right: - Did not put the data on a map - Ordered the countries by the most recent data point rather than alphabetically - Scale labels are found only on outer edge of the chart area, rather than one set per panel - Only used three labels for the 11 years on the plot - Did not overdo the vertical scale either The nicest feature was the XL scale applied only to South Korea. This destroys the small-multiples principle but draws attention to the top left corner, where the designer wants our eyes to go. I would have used smaller fonts throughout. I agree with all of Kaiser’s comments. I could even add a few more, like using light gray for the backgrounds and a bright blue for the lines, spacing the graphs well, using full country names rather than three-letter abbreviations. There are so many standard mistakes that go into default data displays that it is refreshing to see a simple graph done

4 0.21749753 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician

Introduction: Kaiser Fung tells what it’s really like . Here’s a sample: As soon as I [Kaiser] put the substring-concatenate expression together with two lines of code that generate data tables, it choked. Sorta like Dashiell Hammett without the broads and the heaters. And here’s another take, from a slightly different perspective.

5 0.2100497 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

Introduction: Bruce McCullough writes: The Sept 2009 issue of Wired had a big article on the increase in the placebo effect, and why it’s been getting bigger. Kaiser Fung has a synopsis . As if you don’t have enough to do, I thought you might be interested in blogging on this. My reply: I thought Kaiser’s discussion was good, especially this point: Effect on treatment group = Effect of the drug + effect of belief in being treated Effect on placebo group = Effect of belief in being treated Thus, the difference between the two groups = effect of the drug, since the effect of belief in being treated affects both groups of patients. Thus, as Kaiser puts it, if the treatment isn’t doing better than placebo, it doesn’t say that the placebo effect is big (let alone “too big”) but that the treatment isn’t showing any additional effect. It’s “treatment + placebo” vs. placebo, not treatment vs. placebo. That said, I’d prefer for Kaiser to make it clear that the additivity he’s assu

6 0.20376019 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests

7 0.19344941 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?

8 0.18949139 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic

9 0.16357651 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”

10 0.16209275 461 andrew gelman stats-2010-12-09-“‘Why work?’”

11 0.15873718 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

12 0.15537283 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

13 0.15028296 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

14 0.14555013 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

15 0.14529863 344 andrew gelman stats-2010-10-15-Story time

16 0.14352861 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles

17 0.13337904 61 andrew gelman stats-2010-05-31-A data visualization manifesto

18 0.13222513 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

19 0.12793663 502 andrew gelman stats-2011-01-04-Cash in, cash out graph

20 0.12749945 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.148), (1, -0.04), (2, -0.029), (3, 0.073), (4, 0.141), (5, -0.167), (6, -0.06), (7, 0.068), (8, -0.003), (9, 0.025), (10, -0.045), (11, -0.04), (12, -0.023), (13, 0.002), (14, -0.088), (15, -0.033), (16, -0.062), (17, 0.192), (18, 0.011), (19, 0.015), (20, 0.009), (21, 0.074), (22, -0.044), (23, -0.089), (24, 0.025), (25, 0.019), (26, 0.037), (27, -0.011), (28, -0.018), (29, 0.046), (30, -0.026), (31, -0.106), (32, 0.007), (33, 0.075), (34, 0.035), (35, -0.094), (36, -0.013), (37, -0.063), (38, -0.02), (39, -0.078), (40, -0.05), (41, 0.158), (42, 0.019), (43, 0.036), (44, -0.027), (45, 0.041), (46, 0.069), (47, 0.018), (48, -0.011), (49, -0.051)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94927609 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet

2 0.8848114 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician

Introduction: Kaiser Fung tells what it’s really like . Here’s a sample: As soon as I [Kaiser] put the substring-concatenate expression together with two lines of code that generate data tables, it choked. Sorta like Dashiell Hammett without the broads and the heaters. And here’s another take, from a slightly different perspective.

3 0.81445426 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?

Introduction: Answer here (courtesy of Kaiser Fung).

4 0.78276443 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests

Introduction: Kaiser nails it . The offending article , by John Tierney, somehow ended up in the Science section rather than the Opinion section. As an opinion piece (or, for that matter, a blog), Tierney’s article would be nothing special. But I agree with Kaiser that it doesn’t work as a newspaper article. As Kaiser notes, this story involves a bunch of statistical and empirical claims that are not well resolved by P.R. and rhetoric.

5 0.77859026 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic

Introduction: Kaiser Fung discusses . It’s a good sign when statistical graphics are so popular that people feel the need to fake them!

6 0.77295822 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”

7 0.7641409 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

8 0.7518276 1256 andrew gelman stats-2012-04-10-Our data visualization panel at the New York Public Library

9 0.70406598 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

10 0.69914669 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”

11 0.69165665 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

12 0.69030249 461 andrew gelman stats-2010-12-09-“‘Why work?’”

13 0.68174779 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

14 0.64966995 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

15 0.64356643 742 andrew gelman stats-2011-06-02-Grouponomics, counterfactuals, and opportunity cost

16 0.63859653 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

17 0.61779392 1011 andrew gelman stats-2011-11-15-World record running times vs. distance

18 0.60951936 1246 andrew gelman stats-2012-04-04-Data visualization panel at the New York Public Library this evening!

19 0.59595066 344 andrew gelman stats-2010-10-15-Story time

20 0.59496385 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.011), (5, 0.051), (16, 0.143), (21, 0.041), (24, 0.115), (34, 0.061), (41, 0.013), (48, 0.017), (52, 0.012), (54, 0.015), (55, 0.086), (59, 0.024), (63, 0.026), (64, 0.024), (77, 0.024), (82, 0.012), (96, 0.063), (99, 0.18)]
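Topic lists like the one above are sparse (topicId, topicWeight) pairs, and two posts can be compared directly in that form. The page does not state which measure produced its simValue column; cosine similarity is a common choice and is assumed here. The second post's weights are hypothetical.

```python
import math

def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors given as (index, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(w * db[i] for i, w in da.items() if i in db)
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# First five LDA weights listed for this post.
this_post = [(0, 0.011), (5, 0.051), (16, 0.143), (21, 0.041), (24, 0.115)]
# Hypothetical weights for another post, made up for illustration.
other_post = [(5, 0.02), (16, 0.2), (55, 0.1)]

print(cosine_sparse(this_post, other_post))
```

The same function applies to the lsi lists, since negative weights are handled by the dot product without any change.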

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91627014 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet

2 0.86568546 377 andrew gelman stats-2010-10-28-The incoming moderate Republican congressmembers

Introduction: Boris writes: By nearly all accounts, the Republicans look set to take over the US House of Representatives in next week’s November 2010 general election. . . . Republicans, in this wave election that recalls 1994, look set to win not just swing districts, but also those districts that have been traditionally Democratic, or those with strong or longtime Democratic incumbents. Naturally, just as in 2008, this has led to overclaiming by jubilant conservatives and distraught liberals (though the adjectives were then reversed) that this portends a realignment in American politics. . . . Republican moderates in Congress are often associated with two factors: 1) a liberal voting record earlier in their career, and 2) a liberal district. Of course, both are related, in the sense that ambitious moderates choose liberal districts to run in, and liberal districts weed out conservative candidates. . . . Given how competitive Republicans are in 2010, even in otherwise unfriendly territory,

3 0.86525738 411 andrew gelman stats-2010-11-13-Ethical concerns in medical trials

Introduction: I just read this article on the treatment of medical volunteers, written by doctor and bioethicist Carl Elliott. As a statistician who has done a small amount of consulting for pharmaceutical companies, I have a slightly different perspective. As a doctor, Elliott focuses on individual patients, whereas, as a statistician, I’ve been trained to focus on the goal of accurately estimating treatment effects. I’ll go through Elliott’s article and give my reactions. Elliott: In Miami, investigative reporters for Bloomberg Markets magazine discovered that a contract research organisation called SFBC International was testing drugs on undocumented immigrants in a rundown motel; since that report, the motel has been demolished for fire and safety violations. . . . SFBC had recently been named one of the best small businesses in America by Forbes magazine. The Holiday Inn testing facility was the largest in North America, and had been operating for nearly ten years before inspecto

4 0.86435795 2203 andrew gelman stats-2014-02-08-“Guys who do more housework get less sex”

Introduction: Sometimes I have a few minutes where I can work, but I don’t feel like working. So I follow the blogroll, this time from here to here: Sabino Kornrich, Julie Brines, Katrina Leupp. Egalitarianism, Housework, and Sexual Frequency in Marriage. American Sociological Review, February 2013, vol. 78, no. 1, 26-50. doi: 10.1177/0003122412472340. Data are from Wave II of the National Survey of Families and Households published in 1996, interviews from 1992-1994. The division of labor: Core tasks include preparing meals, washing dishes, cleaning house, shopping, and washing and ironing; non-core tasks include outdoor work, paying bills, auto maintenance, and driving. As you can see in the graph, the more of the “core” tasks a man completes, the less sex he gets. The covariates for overall marital happiness and specific happiness with spouses’ contribution to housework did not change this relationship. The covariate for gender-traditional ideology on household labor likewise

5 0.86326492 1118 andrew gelman stats-2012-01-14-A model rejection letter

Introduction: Howard Wainer sends in this rejection letter from Sir David Brewster of The Edinburgh Journal of Science to Charles Babbage: It is with no inconsiderable degree of reluctance that I decline the offer of any Paper from you. I think, however, you will upon reconsideration of the subject be of the opinion that I have no other alternative. The subjects you propose for a series of Mathematical and Metaphysical Essays are so profound, that there is perhaps not a single subscriber to our Journal who could follow them. Nowadays, he could just submit to Wiley Interdisciplinary Reviews . . .

6 0.86212134 177 andrew gelman stats-2010-08-02-Reintegrating rebels into civilian life: Quasi-experimental evidence from Burundi

7 0.86061668 135 andrew gelman stats-2010-07-09-Rasmussen sez: “108% of Respondents Say . . .”

8 0.85766679 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

9 0.8548106 1755 andrew gelman stats-2013-03-09-Plaig

10 0.85309321 1734 andrew gelman stats-2013-02-23-Life in the C-suite: A graph that is both ugly and bad, and an unrelated story

11 0.84965873 609 andrew gelman stats-2011-03-13-Coauthorship norms

12 0.84896529 503 andrew gelman stats-2011-01-04-Clarity on my email policy

13 0.84839338 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science

14 0.84830594 168 andrew gelman stats-2010-07-28-Colorless green, and clueless

15 0.8472923 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots

16 0.84703273 2095 andrew gelman stats-2013-11-09-Typo in Ghitza and Gelman MRP paper

17 0.84586072 159 andrew gelman stats-2010-07-23-Popular governor, small state

18 0.84551394 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

19 0.84536994 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

20 0.84447718 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)