1606 andrew gelman stats-2012-12-05-The Grinch Comes Back


meta info for this blog

Source: html

Introduction: Wayne Folta writes: In keeping with your interest in graphs, this might interest or inspire you, if you haven’t seen it already, which features 20 scientific graphs that Wired likes, ranging from drawn illustrations to trajectory plots. My reaction: I looked at the first 10. I liked 1, 3, and 5, I didn’t like 2, 7, 8, 9, and 10. I have neutral feelings about 4 and 6. I won’t explain all these feelings, but, just for example, from my perspective, image 9 fails as a statistical graphic (although it might be fine as an infovis) by trying to cram too much into a single image. I don’t think it works to have all the colors on the single wheels; instead I’d prefer some sort of grid of images. Also, I don’t see the point of the circular display. That makes no sense at all; it’s a misleading feature. That said, the graphs I dislike can still be fine for their purpose. A graph in a journal such as Science or Nature is meant to grab the eye of a busy reader (or to go viral on


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Wayne Folta writes: In keeping with your interest in graphs, this might interest or inspire you, if you haven’t seen it already, which features 20 scientific graphs that Wired likes, ranging from drawn illustrations to trajectory plots. [sent-1, score-1.49]

2 I liked 1, 3, and 5, I didn’t like 2, 7, 8, 9, and 10. [sent-3, score-0.115]

3 I won’t explain all these feelings, but, just for example, from my perspective, image 9 fails as a statistical graphic (although it might be fine as an infovis) by trying to cram too much into a single image. [sent-5, score-1.031]

4 I don’t think it works to have all the colors on the single wheels; instead I’d prefer some sort of grid of images. [sent-6, score-0.608]

5 Also, I don’t see the point of the circular display. [sent-7, score-0.188]

6 That makes no sense at all; it’s a misleading feature. [sent-8, score-0.111]

7 That said, the graphs I dislike can still be fine for their purpose. [sent-9, score-0.518]

8 A graph in a journal such as Science or Nature is meant to grab the eye of a busy reader (or to go viral on the web), not necessarily to allow data exploration. [sent-10, score-0.985]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('feelings', 0.261), ('graphs', 0.23), ('wheels', 0.196), ('circular', 0.188), ('trajectory', 0.181), ('cram', 0.176), ('wired', 0.176), ('folta', 0.171), ('wayne', 0.171), ('viral', 0.171), ('inspire', 0.164), ('neutral', 0.161), ('single', 0.153), ('fails', 0.151), ('grid', 0.147), ('grab', 0.145), ('dislike', 0.145), ('likes', 0.145), ('interest', 0.143), ('fine', 0.143), ('colors', 0.139), ('infovis', 0.138), ('ranging', 0.136), ('eye', 0.135), ('graphic', 0.133), ('exploration', 0.129), ('busy', 0.124), ('keeping', 0.12), ('drawn', 0.119), ('image', 0.118), ('liked', 0.115), ('misleading', 0.111), ('meant', 0.108), ('reader', 0.107), ('features', 0.107), ('reaction', 0.103), ('web', 0.1), ('necessarily', 0.098), ('allow', 0.097), ('nature', 0.09), ('looked', 0.085), ('prefer', 0.085), ('works', 0.084), ('explain', 0.083), ('haven', 0.083), ('perspective', 0.08), ('won', 0.076), ('might', 0.074), ('although', 0.074), ('seen', 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

2 0.22020181 1146 andrew gelman stats-2012-01-30-Convenient page of data sources from the Washington Post

Introduction: Wayne Folta points us to this list.

3 0.19039753 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

4 0.18260808 891 andrew gelman stats-2011-09-05-World Bank data now online

Introduction: Wayne Folta writes that the World Bank is opening up some of its data for researchers.

5 0.1698034 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

6 0.15379956 306 andrew gelman stats-2010-09-29-Statistics and the end of time

7 0.15184928 832 andrew gelman stats-2011-07-31-Even a good data display can sometimes be improved

8 0.14935081 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

9 0.13546507 319 andrew gelman stats-2010-10-04-“Who owns Congress”

10 0.13232151 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

11 0.12362066 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

12 0.10895632 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

13 0.10292331 1614 andrew gelman stats-2012-12-09-The pretty picture is just the beginning of the data exploration. But the pretty picture is a great way to get started. Another example of how a puzzle can make a graph appealing

14 0.099830672 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

15 0.09763734 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

16 0.095009148 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

17 0.094751537 863 andrew gelman stats-2011-08-21-Bad graph

18 0.091613889 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

19 0.091021135 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

20 0.090464585 61 andrew gelman stats-2010-05-31-A data visualization manifesto
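
The page does not say how simValue is computed. A common choice for comparing tfidf vectors is cosine similarity, which would be consistent with the same-blog entry scoring 1.0 against itself. The sketch below is a minimal illustration under that assumption, not the method actually used here; the function name cosine_similarity and the weights in other_post are hypothetical, and this_post uses only the first four weights from the tfidf list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# First four tfidf weights listed for this post (truncated for brevity).
this_post = {"feelings": 0.261, "graphs": 0.23, "wheels": 0.196, "circular": 0.188}

# Hypothetical weights for some other post, invented for illustration only.
other_post = {"graphs": 0.3, "infovis": 0.2, "data": 0.1}

self_sim = cosine_similarity(this_post, this_post)
cross_sim = cosine_similarity(this_post, other_post)
print(self_sim, cross_sim)
```

Under this assumption a post compared with itself scores 1 (up to floating-point rounding), and posts that share few weighted terms score close to 0, which matches the general shape of the list above.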


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.13), (1, -0.038), (2, -0.048), (3, 0.046), (4, 0.093), (5, -0.154), (6, -0.085), (7, 0.042), (8, -0.035), (9, 0.006), (10, 0.019), (11, -0.015), (12, -0.04), (13, -0.008), (14, 0.011), (15, -0.042), (16, -0.01), (17, -0.021), (18, -0.002), (19, 0.017), (20, 0.019), (21, 0.015), (22, -0.01), (23, 0.007), (24, 0.017), (25, 0.001), (26, 0.034), (27, 0.022), (28, -0.044), (29, 0.009), (30, -0.011), (31, 0.021), (32, -0.031), (33, 0.014), (34, -0.008), (35, 0.035), (36, 0.028), (37, 0.006), (38, -0.002), (39, 0.019), (40, 0.033), (41, 0.006), (42, 0.02), (43, 0.05), (44, -0.043), (45, -0.027), (46, -0.027), (47, 0.023), (48, -0.039), (49, 0.05)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97394049 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

2 0.84890062 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

3 0.83440363 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

Introduction: Howard Friedman sent me a new book, The Measure of a Nation, subtitled How to Regain America’s Competitive Edge and Boost Our Global Standing. Without commenting on the substance of Friedman’s recommendations, I’d like to endorse his strategy of presentation, which is to display graph after graph after graph showing the same message over and over again, which is that the U.S. is outperformed by various other countries (mostly in Europe) on a variety of measures. These aren’t graphs I would ever make—they are scatterplots in which the x-axis conveys no information. But they have the advantage of repetition: once you figure out how to read one of the graphs, you can read the others easily. Here’s an example which I found from a quick Google: I can’t actually figure out what is happening on the x-axis, nor do I understand the “star, middle child, dog” thing. But I like the use of graphics. Lots more fun than bullet points. Seriously. P.S. Just to be clear: I am not trying

4 0.83013409 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at https://www.udacity.com/course/ud651 so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about EDA and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparisons (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what

5 0.82504719 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.

Introduction: Helen DeWitt links to this blog that reports on a study by Scott Bateman, Carl Gutwin, David McDine, Regan Mandryk, Aaron Genest, and Christopher Brooks that claims the following: Guidelines for designing information charts often state that the presentation should reduce ‘chart junk’–visual embellishments that are not essential to understanding the data. . . . we conducted an experiment that compared embellished charts with plain ones, and measured both interpretation accuracy and long-term recall. We found that people’s accuracy in describing the embellished charts was no worse than for plain charts, and that their recall after a two-to-three-week gap was significantly better. As the above-linked blogger puts it, “chartjunk is more useful than plain graphs. . . . Tufte is not going to like this.” I can’t speak for Ed Tufte, but I’m not gonna take this claim about chartjunk lying down. I have two points to make which I hope can stop the above-linked study from being sla

6 0.82078558 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

7 0.81891459 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

8 0.81044412 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

9 0.80874956 319 andrew gelman stats-2010-10-04-“Who owns Congress”

10 0.80556709 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

11 0.80323458 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

12 0.79981887 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

13 0.79683149 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

14 0.79562342 61 andrew gelman stats-2010-05-31-A data visualization manifesto

15 0.79403329 488 andrew gelman stats-2010-12-27-Graph of the year

16 0.79392844 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios

17 0.79350293 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

18 0.78977615 1604 andrew gelman stats-2012-12-04-An epithet I can live with

19 0.78819478 1775 andrew gelman stats-2013-03-23-In which I disagree with John Maynard Keynes

20 0.78335857 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.24), (10, 0.034), (16, 0.082), (21, 0.028), (24, 0.143), (44, 0.021), (51, 0.017), (57, 0.071), (77, 0.051), (95, 0.012), (99, 0.199)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94176191 224 andrew gelman stats-2010-08-22-Mister P gets married

Introduction: Jeff, Justin, and I write: Gay marriage is not going away as a highly emotional, contested issue. Proposition 8, the California ballot measure that bans same-sex marriage, has seen to that, as it winds its way through the federal courts. But perhaps the public has reached a turning point. And check out the (mildly) dynamic graphics. The picture below is ok but for the full effect you have to click through and play the movie.

2 0.93613553 228 andrew gelman stats-2010-08-24-A new efficient lossless compression algorithm

Introduction: Frank Wood and Nick Bartlett write: Deplump works the same as all probabilistic lossless compressors. A datastream is fed one observation at a time into a predictor which emits both the data stream and predictions about what the next observation in the stream should be for every observation. An encoder takes this output and produces a compressed stream which can be piped over a network or to a file. A receiver then takes this stream and decompresses it by doing everything in reverse. In order to ensure that the decoder has the same information available to it that the encoder had when compressing the stream, the decoded datastream is both emitted and directed to another predictor. This second predictor’s job is to produce exactly the same predictions as the initial predictor so that the decoder has the same information at every step of the process as the encoder did. The difference between probabilistic lossless compressors is in the prediction engine, encoding and decoding bein

3 0.92968357 665 andrew gelman stats-2011-04-17-Yes, your wish shall be granted (in 25 years)

Introduction: This one was so beautiful I just had to repost it: From the New York Times, 9 Sept 1981: IF I COULD CHANGE PARK SLOPE If I could change Park Slope I would turn it into a palace with queens and kings and princesses to dance the night away at the ball. The trees would look like garden stalks. The lights would look like silver pearls and the dresses would look like soft silver silk. You should see the ball. It looks so luxurious to me. The Park Slope ball is great. Can you guess what street it’s on? “Yes. My street. That’s Carroll Street.” – Jennifer Chatmon, second grade, P.S. 321 This was a few years before my sister told me that she felt safer having a crack house down the block because the cops were surveilling it all the time.

same-blog 4 0.92723465 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

5 0.92623758 1250 andrew gelman stats-2012-04-07-Hangman tips

Introduction: Jeff pointed me to this article by Nick Berry. It’s kind of fun but of course if you know your opponent will be following this strategy you can figure out how to outwit it. Also, Berry writes that ETAOIN SHRDLU CMFWYP VBGKQJ XZ is the “ordering of letter frequency in English language.” Indeed this is the conventional ordering but nobody thinks it’s right anymore. See here (with further discussion here). I wonder what corpus he’s using. P.S. Klutz was my personal standby.

6 0.91703856 422 andrew gelman stats-2010-11-20-A Gapminder-like data visualization package

7 0.90953398 87 andrew gelman stats-2010-06-15-Statistical analysis and visualization of the drug war in Mexico

8 0.9055649 2005 andrew gelman stats-2013-09-02-“Il y a beaucoup de candidats démocrates, et leurs idéologies ne sont pas très différentes. Et la participation est imprévisible.”

9 0.89129823 513 andrew gelman stats-2011-01-12-“Tied for Warmest Year On Record”

10 0.85236943 1103 andrew gelman stats-2012-01-06-Unconvincing defense of the recent Russian elections, and a problem when an official organ of an academic society has low standards for publication

11 0.83744776 1512 andrew gelman stats-2012-09-27-A Non-random Walk Down Campaign Street

12 0.83400154 1841 andrew gelman stats-2013-05-04-The Folk Theorem of Statistical Computing

13 0.82759982 1286 andrew gelman stats-2012-04-28-Agreement Groups in US Senate and Dynamic Clustering

14 0.80650145 364 andrew gelman stats-2010-10-22-Politics is not a random walk: Momentum and mean reversion in polling

15 0.8063283 164 andrew gelman stats-2010-07-26-A very short story

16 0.80266887 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

17 0.79814458 1052 andrew gelman stats-2011-12-11-Rational Turbulence

18 0.79321182 123 andrew gelman stats-2010-07-01-Truth in headlines

19 0.77695847 951 andrew gelman stats-2011-10-11-Data mining efforts for Obama’s campaign

20 0.77240533 1914 andrew gelman stats-2013-06-25-Is there too much coauthorship in economics (and science more generally)? Or too little?