andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1125 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I stumbled across a chart that is, in my opinion, the best way to express a comparison of quantities through time: it compares the new PC companies, such as Apple, to traditional PC companies like IBM and Compaq, but on the same scale. If you’d like to see how iPads and other novelties compare, see here. I’ve tried to use the same type of visualization in my old work on legal data visualization. It comes from a new market research firm Asymco that also produced a very clean income vs expenses visualization (click to enlarge): While the first figure is pure perfection, Tufte purists might find the second one too colorful. But to a busy person, color helps tell things apart: when I know that pink means interest, it takes a fraction of a second to assess the situation. We live in 2012, not 1712, so we don’t have to think in black and white. Finally, they have a few other interesting uses of interactive visualization, such as cellular-broadband infrastructure around
sentIndex sentText sentNum sentScore
1 I stumbled across a chart that is, in my opinion, the best way to express a comparison of quantities through time: it compares the new PC companies, such as Apple, to traditional PC companies like IBM and Compaq, but on the same scale. [sent-1, score-1.064]
2 I’ve tried to use the same type of visualization in my old work on legal data visualization. [sent-3, score-1.105]
3 It comes from a new market research firm Asymco that also produced a very clean income vs expenses visualization (click to enlarge): While the first figure is pure perfection, Tufte purists might find the second one too colorful. [sent-4, score-1.647]
4 But to a busy person, color helps tell things apart: when I know that pink means interest, it takes a fraction of a second to assess the situation. [sent-5, score-0.874]
5 We live in 2012, not 1712, so we don’t have to think in black and white. [sent-6, score-0.185]
6 Finally, they have a few other interesting uses of interactive visualization, such as cellular-broadband infrastructure around the world through time – along with the underlying data. [sent-7, score-0.459]
7 It seems that the GapMinder tools are now out there. [sent-8, score-0.091]
wordName wordTfidf (topN-words)
[('visualization', 0.383), ('pc', 0.331), ('companies', 0.212), ('perfection', 0.184), ('gapminder', 0.184), ('purists', 0.184), ('infrastructure', 0.166), ('expenses', 0.16), ('enlarge', 0.155), ('ibm', 0.155), ('stumbled', 0.145), ('apple', 0.139), ('tufte', 0.13), ('vs', 0.13), ('pink', 0.13), ('interactive', 0.124), ('chart', 0.124), ('compares', 0.12), ('fraction', 0.119), ('quantities', 0.119), ('second', 0.115), ('firm', 0.114), ('legal', 0.112), ('assess', 0.11), ('busy', 0.109), ('apart', 0.109), ('helps', 0.108), ('produced', 0.106), ('color', 0.105), ('pure', 0.103), ('clean', 0.1), ('black', 0.099), ('market', 0.099), ('click', 0.098), ('express', 0.096), ('traditional', 0.093), ('tools', 0.091), ('uses', 0.087), ('live', 0.086), ('compare', 0.084), ('underlying', 0.082), ('income', 0.082), ('comparison', 0.081), ('tried', 0.079), ('takes', 0.078), ('type', 0.078), ('opinion', 0.074), ('finally', 0.074), ('figure', 0.071), ('old', 0.07)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 1125 andrew gelman stats-2012-01-18-Beautiful Line Charts
2 0.26139936 441 andrew gelman stats-2010-12-01-Mapmaking software
Introduction: I can’t use this on my PC, but the link comes from Aleks, so maybe it’s something good!
3 0.17929007 867 andrew gelman stats-2011-08-23-The economics of the mac? A paradox of competition
Introduction: I switched to the mac and it’s great. I’d like a bit more real estate on my laptop, but Malecki assures me that soon I’ll get used to jumping between windows. Anyway, my impression is that now the mac dominates the pc, but a few years ago it wasn’t so clear. The mac had some nice features but often ran slowly, and thinkpads could do a lot. At the time, the way I understood this was that only one company made macs but several made pc’s, thus there was a lot of competition, stimulating innovation in the pc market. But what’s the story now? The macbook air is awesome and a real advance on what came before, while the thinkpads and all the rest have stagnated. So what’s my new theory? There’s lots of competition on pc’s, but what they’re all competing on is price. Meanwhile, only one company makes the mac and so they have the freedom to make something good. But I’m just blathering here. I’m sure someone can offer some thick description to reveal the real story.
4 0.16887975 794 andrew gelman stats-2011-07-09-The quest for the holy graph
Introduction: Eytan Adar writes: I was just going through the latest draft of your paper with Anthony Unwin . I heard part of it at the talk you gave (remotely) here at UMich. I’m curious about your discussion of the Baby Name Voyager . The tool in itself is simple, attractive, and useful. No argument from me there. It’s an awesome demonstration of how subtle interactions can be very helpful (click and it zooms, type and it filters… falls perfectly into the Shneiderman visualization mantra). It satisfies a very common use case: finding appropriate names for children. That said, I can’t help but feeling that what you are really excited about is the very static analysis on last letters (you spend most of your time on this). This analysis, incidentally, is not possible to infer from the interactive application (which doesn’t support this type of filtering and pivoting). In a sense, the two visualizations don’t have anything to do with each other (other than a shared context/dataset).
5 0.15131962 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?
Introduction: Ricardo Pietrobon writes, regarding my post from last year on attitudes toward data graphics, Wouldn’t it be the case to start formally studying the usability of graphics from a cognitive perspective? with platforms such as the mechanical turk it should be fairly straightforward to test alternative methods and come to some conclusions about what might be more informative and what might better assist in supporting decisions. btw, my guess is that these two constructs might not necessarily agree with each other. And Jessica Hullman provides some background: Measuring success for the different goals that you hint at in your article is indeed challenging, and I don’t think that most visualization researchers would claim to have met this challenge (myself included). Visualization researchers may know the user psychology well when it comes to certain dimensions of a graph’s effectiveness (such as quick and accurate responses), but I wouldn’t agree with this statement as a gene
6 0.1510385 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel
7 0.14419147 194 andrew gelman stats-2010-08-09-Data Visualization
8 0.13496612 422 andrew gelman stats-2010-11-20-A Gapminder-like data visualization package
10 0.12557632 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”
11 0.12386165 304 andrew gelman stats-2010-09-29-Data visualization marathon
12 0.11814036 40 andrew gelman stats-2010-05-18-What visualization is best?
13 0.11715445 1187 andrew gelman stats-2012-02-27-“Apple confronts the law of large numbers” . . . huh?
14 0.11433092 764 andrew gelman stats-2011-06-14-Examining US Legislative process with “Many Bills”
15 0.10724477 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup
16 0.10472084 1014 andrew gelman stats-2011-11-16-Visualizations of NYPD stop-and-frisk data
17 0.10455137 927 andrew gelman stats-2011-09-26-R and Google Visualization
18 0.10432185 1689 andrew gelman stats-2013-01-23-MLB Hall of Fame Voting Trajectories
19 0.099958539 599 andrew gelman stats-2011-03-03-Two interesting posts elsewhere on graphics
20 0.099682301 61 andrew gelman stats-2010-05-31-A data visualization manifesto
topicId topicWeight
[(0, 0.118), (1, -0.043), (2, -0.015), (3, 0.037), (4, 0.088), (5, -0.065), (6, -0.052), (7, 0.02), (8, -0.024), (9, 0.007), (10, -0.032), (11, -0.043), (12, 0.016), (13, -0.006), (14, -0.024), (15, 0.035), (16, 0.035), (17, -0.021), (18, -0.028), (19, -0.027), (20, -0.037), (21, -0.029), (22, 0.016), (23, 0.025), (24, -0.027), (25, -0.02), (26, -0.042), (27, 0.025), (28, 0.013), (29, 0.007), (30, -0.03), (31, -0.014), (32, 0.031), (33, -0.025), (34, 0.021), (35, 0.028), (36, -0.01), (37, 0.018), (38, -0.006), (39, 0.053), (40, -0.036), (41, -0.01), (42, -0.002), (43, -0.016), (44, 0.019), (45, 0.03), (46, 0.031), (47, 0.051), (48, 0.018), (49, -0.003)]
simIndex simValue blogId blogTitle
same-blog 1 0.91827375 1125 andrew gelman stats-2012-01-18-Beautiful Line Charts
2 0.80329144 2065 andrew gelman stats-2013-10-17-Cool dynamic demographic maps provide beautiful illustration of Chris Rock effect
Introduction: Robert Gonzalez reports on some beautiful graphs from John Nelson. Here’s Nelson: The sexes start out homogenous, go super segregated in the teen years, segregate for business in the twenty-somethings, and re-couple for co-habitation years. Then the lights fade into faint pockets of pink. I [Nelson] am using simple tract-level population/gender counts from the US Census Bureau. Because their tract boundaries extend into the water and vacant area, I used NYC’s Bytes of the Big Apple zoning shapes to clip the census tracts to residentially zoned areas -giving me a more realistic (and more recognizable) definition of populated areas. The census breaks out their population counts by gender for five-year age spans ranging from teeny tiny infants through esteemed 85+ year-olds. And here’s Gonzalez: Between ages 0 and 14, the entire map is more or less an evenly mixed purple landscape; newborns, children and adolescents, after all, can’t really choose where the
3 0.77817065 1689 andrew gelman stats-2013-01-23-MLB Hall of Fame Voting Trajectories
Introduction: Kenny Shirley sends along this interactive data visualization : What I learned from this was that Jim Rice is in the Hall of Fame! I remember watching him play. Whenever he struck out with a man on first base, we were just so relieved that he hadn’t hit into a double play.
4 0.7529096 794 andrew gelman stats-2011-07-09-The quest for the holy graph
Introduction: Eytan Adar writes: I was just going through the latest draft of your paper with Anthony Unwin . I heard part of it at the talk you gave (remotely) here at UMich. I’m curious about your discussion of the Baby Name Voyager . The tool in itself is simple, attractive, and useful. No argument from me there. It’s an awesome demonstration of how subtle interactions can be very helpful (click and it zooms, type and it filters… falls perfectly into the Shneiderman visualization mantra). It satisfies a very common use case: finding appropriate names for children. That said, I can’t help but feeling that what you are really excited about is the very static analysis on last letters (you spend most of your time on this). This analysis, incidentally, is not possible to infer from the interactive application (which doesn’t support this type of filtering and pivoting). In a sense, the two visualizations don’t have anything to do with each other (other than a shared context/dataset).
5 0.75039208 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?
Introduction: Ricardo Pietrobon writes, regarding my post from last year on attitudes toward data graphics, Wouldn’t it be the case to start formally studying the usability of graphics from a cognitive perspective? with platforms such as the mechanical turk it should be fairly straightforward to test alternative methods and come to some conclusions about what might be more informative and what might better assist in supporting decisions. btw, my guess is that these two constructs might not necessarily agree with each other. And Jessica Hullman provides some background: Measuring success for the different goals that you hint at in your article is indeed challenging, and I don’t think that most visualization researchers would claim to have met this challenge (myself included). Visualization researchers may know the user psychology well when it comes to certain dimensions of a graph’s effectiveness (such as quick and accurate responses), but I wouldn’t agree with this statement as a gene
6 0.73402214 492 andrew gelman stats-2010-12-30-That puzzle-solving feeling
7 0.72159529 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel
8 0.70239151 422 andrew gelman stats-2010-11-20-A Gapminder-like data visualization package
9 0.70038486 1669 andrew gelman stats-2013-01-12-The power of the puzzlegraph
10 0.69719756 599 andrew gelman stats-2011-03-03-Two interesting posts elsewhere on graphics
11 0.69360089 275 andrew gelman stats-2010-09-14-Data visualization at the American Evaluation Association
12 0.69167149 40 andrew gelman stats-2010-05-18-What visualization is best?
14 0.68913954 304 andrew gelman stats-2010-09-29-Data visualization marathon
15 0.68287438 832 andrew gelman stats-2011-07-31-Even a good data display can sometimes be improved
16 0.68020773 1896 andrew gelman stats-2013-06-13-Against the myth of the heroic visualization
17 0.67935449 428 andrew gelman stats-2010-11-24-Flawed visualization of U.S. voting maybe has some good features
18 0.67734361 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks
19 0.67639089 764 andrew gelman stats-2011-06-14-Examining US Legislative process with “Many Bills”
20 0.67549974 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect
topicId topicWeight
[(0, 0.032), (5, 0.052), (16, 0.097), (21, 0.025), (24, 0.08), (29, 0.019), (36, 0.015), (39, 0.04), (44, 0.028), (51, 0.096), (64, 0.016), (79, 0.016), (81, 0.015), (86, 0.059), (89, 0.012), (93, 0.02), (95, 0.039), (99, 0.242)]
simIndex simValue blogId blogTitle
same-blog 1 0.93587714 1125 andrew gelman stats-2012-01-18-Beautiful Line Charts
2 0.90651202 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft
Introduction: Infovis and Statistical Graphics: Different Goals, Different Looks (and here’s the article) Speaker: Andrew Gelman, Columbia University Date: Thursday, November 29 2012 Time: 4:00PM to 5:00PM Location: 32-D463 (Star Conference Room) Host: Polina Golland, CSAIL Contact: Polina Golland, 6172538005, polina@csail.mit.edu The importance of graphical displays in statistical practice has been recognized sporadically in the statistical literature over the past century, with wider awareness following Tukey’s Exploratory Data Analysis (1977) and Tufte’s books in the succeeding decades. But statistical graphics still occupies an awkward in-between position: Within statistics, exploratory and graphical methods represent a minor subfield and are not well-integrated with larger themes of modeling and inference. Outside of statistics, infographics (also called information visualization or Infovis) is huge, but their purveyors and enthusiasts appear largely to be uninterested in statisti
3 0.90389967 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”
Introduction: By now you all must be tired of my one-sided presentations of the differences between infovis and statgraphics (for example, this article with Antony Unwin). Today is something different. Courtesy of Martin Theus, editor of the Statistical Computing and Graphics Newsletter, we have two short articles offering competing perspectives: Robert Kosara writes from an Infovis view: Information visualization is a field that has had trouble defining its boundaries, and that consequently is often misunderstood. It doesn’t help that InfoVis, as it is also known, produces pretty pictures that people like to look at and link to or send around. But InfoVis is more than pretty pictures, and it is more than statistical graphics. The key to understanding InfoVis is to ignore the images for a moment and focus on the part that is often lost: interaction. When we use visualization tools, we don’t just create one image or one kind of visualization. In fact, most people would argue that there is
4 0.89955318 966 andrew gelman stats-2011-10-20-A qualified but incomplete thanks to Gregg Easterbrook’s editor at Reuters
Introduction: Dear Reuters editor: Thanks for reading my blog and correcting the erroneous numbers in Easterbrook’s column from the other day. I’m pretty sure you got the corrections from my blog because in your corrections you used the exact same links that I posted. I think your readers will like that you gave links to the sources of your numbers. But I’d appreciate if you cite me! It’s considered polite to credit your sources rather than just copying over numbers and links with no mention of where they came from. Unlike Easterbrook, I’m not expecting to be paid for this material but I’d still like to be thanked. (See the last paragraph of this post by Felix Salmon for more on the desirability of linking to your sources.) Also, since you’re correcting the article anyway, maybe you could go back and change this sentence too: But don’t sell Huntsman short because he is low in the polls – Obama had been at that point, too. As I noted earlier, As of 14 Oct 2011, Gallup gi
5 0.89953482 1543 andrew gelman stats-2012-10-21-Model complexity as a function of sample size
Introduction: As we get more data, we can fit more model. But at some point we become so overwhelmed by data that, for computational reasons, we can barely do anything at all. Thus, the curve above could be thought of as the product of two curves: a steadily increasing curve showing the statistical ability to fit more complex models with more data, and a steadily decreasing curve showing the computational feasibility of doing so.
6 0.89445275 2182 andrew gelman stats-2014-01-22-Spell-checking example demonstrates key aspects of Bayesian data analysis
8 0.89275885 1641 andrew gelman stats-2012-12-27-The Möbius strip, or, marketing that is impervious to criticism
10 0.8885985 1199 andrew gelman stats-2012-03-05-Any available cookbooks on Bayesian designs?
11 0.88842463 571 andrew gelman stats-2011-02-13-A departmental wiki page?
12 0.88826817 722 andrew gelman stats-2011-05-20-Why no Wegmania?
13 0.88808751 2352 andrew gelman stats-2014-05-29-When you believe in things that you don’t understand
14 0.8880679 1917 andrew gelman stats-2013-06-28-Econ coauthorship update
15 0.88646275 935 andrew gelman stats-2011-10-01-When should you worry about imputed data?
16 0.88556546 2280 andrew gelman stats-2014-04-03-As the boldest experiment in journalism history, you admit you made a mistake
17 0.88525259 1914 andrew gelman stats-2013-06-25-Is there too much coauthorship in economics (and science more generally)? Or too little?
18 0.88492304 2057 andrew gelman stats-2013-10-10-Chris Chabris is irritated by Malcolm Gladwell
19 0.88455647 131 andrew gelman stats-2010-07-07-A note to John
20 0.88448262 1266 andrew gelman stats-2012-04-16-Another day, another plagiarist