andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1308 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Yair pointed me to this awesome blog of how the NYT people make their graphs. This blows away all other stat graphics blogs (including this one). Lots of examples from mockup to first tries to final version. I recognize a lot of what they’re doing from my own experience. Also from my experience it’s hard to get all these details down: once you have the final graph, it’s easy to forget how you got there.
sentIndex sentText sentNum sentScore
1 Yair pointed me to this awesome blog of how the NYT people make their graphs. [sent-1, score-0.633]
2 This blows away all other stat graphics blogs (including this one). [sent-2, score-1.125]
3 Lots of examples from mockup to first tries to final version. [sent-3, score-0.867]
4 I recognize a lot of what they’re doing from my own experience. [sent-4, score-0.275]
5 Also from my experience it’s hard to get all these details down: once you have the final graph, it’s easy to forget how you got there. [sent-5, score-1.318]
wordName wordTfidf (topN-words)
[('final', 0.417), ('blows', 0.362), ('awesome', 0.275), ('yair', 0.256), ('tries', 0.245), ('nyt', 0.232), ('stat', 0.23), ('blogs', 0.215), ('forget', 0.213), ('recognize', 0.19), ('graphics', 0.176), ('pointed', 0.157), ('experience', 0.151), ('details', 0.15), ('away', 0.142), ('easy', 0.135), ('graph', 0.133), ('examples', 0.131), ('including', 0.122), ('lots', 0.112), ('hard', 0.111), ('go', 0.087), ('blog', 0.085), ('lot', 0.085), ('first', 0.074), ('re', 0.069), ('make', 0.063), ('get', 0.054), ('people', 0.053), ('also', 0.05), ('one', 0.038)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 1308 andrew gelman stats-2012-05-08-chartsnthings !
Introduction: Yair pointed me to this awesome blog of how the NYT people make their graphs. This blows away all other stat graphics blogs (including this one). Lots of examples from mockup to first tries to final version. I recognize a lot of what they’re doing from my own experience. Also from my experience it’s hard to get all these details down: once you have the final graph, it’s easy to forget how you got there.
2 0.15045469 676 andrew gelman stats-2011-04-23-The payoff: $650. The odds: 1 in 500,000.
Introduction: Details here.
3 0.14828119 2175 andrew gelman stats-2014-01-18-A course in sample surveys for political science
Introduction: A colleague asked if I had any material for a course in sample surveys. And indeed I do. See here. It’s all the slides for a 14-week course, also the syllabus (“surveyscourse.pdf”), the final exam (“final2012.pdf”) and various misc files. Also more discussion of final exam questions here (keep scrolling thru the “previous entries” until you get to Question 1). Enjoy! This is in no way a self-contained teach-it-yourself course, but I do think it could be helpful for anyone who is trying to teach a class on this material.
4 0.14443624 1667 andrew gelman stats-2013-01-10-When you SHARE poorly researched infographics…
Introduction: Ironically, I can’t find the source for this awesome graphic that’s been making the rounds. -Phil
5 0.1395826 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
Introduction: Someone who wishes to remain anonymous writes: I’ve been following your blog a long time and enjoy your posts on visualization/statistical graphics matters. I don’t recall however you ever describing the details of your setup for plotting. I’m a new R user (convert from matplotlib) and would love to know your thoughts on the ideal setup: do you use mainly the R base? Do you use lattice? What do you think of ggplot2? etc. I found ggplot2 nearly indecipherable until a recent eureka moment, and I think its default theme is a tremendous waste of ink (all those silly grey backgrounds and grids are really unnecessary), but if you customize that away it can be made to look like ordinary, pretty statistical graphs. Feel free to respond on your blog, but if you do, please remove my name from the post (my colleagues already make fun of me for thinking about visualization too much.) I love that last bit! Anyway, my response is that I do everything in base graphics (using my
7 0.11052043 2279 andrew gelman stats-2014-04-02-Am I too negative?
9 0.10350856 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
10 0.098283984 61 andrew gelman stats-2010-05-31-A data visualization manifesto
11 0.093216136 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
12 0.091766566 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”
13 0.087607823 227 andrew gelman stats-2010-08-23-Visualization magazine
14 0.08755485 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
15 0.086390272 502 andrew gelman stats-2011-01-04-Cash in, cash out graph
16 0.08407937 1832 andrew gelman stats-2013-04-29-The blogroll
17 0.081922881 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box
18 0.080884069 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science
19 0.079717971 1831 andrew gelman stats-2013-04-29-The Great Race
20 0.076640628 1661 andrew gelman stats-2013-01-08-Software is as software does
topicId topicWeight
[(0, 0.101), (1, -0.05), (2, -0.045), (3, 0.052), (4, 0.092), (5, -0.077), (6, -0.049), (7, 0.026), (8, -0.024), (9, -0.011), (10, 0.016), (11, -0.002), (12, 0.003), (13, -0.013), (14, -0.017), (15, -0.013), (16, -0.027), (17, -0.013), (18, -0.034), (19, 0.012), (20, 0.031), (21, -0.024), (22, -0.035), (23, 0.026), (24, -0.014), (25, -0.023), (26, 0.009), (27, 0.004), (28, 0.001), (29, -0.032), (30, 0.012), (31, -0.007), (32, -0.009), (33, -0.003), (34, 0.008), (35, -0.002), (36, 0.018), (37, 0.004), (38, 0.003), (39, -0.013), (40, -0.033), (41, -0.008), (42, -0.001), (43, -0.027), (44, -0.019), (45, 0.008), (46, 0.023), (47, 0.018), (48, -0.021), (49, 0.04)]
simIndex simValue blogId blogTitle
same-blog 1 0.95781702 1308 andrew gelman stats-2012-05-08-chartsnthings !
Introduction: Yair pointed me to this awesome blog of how the NYT people make their graphs. This blows away all other stat graphics blogs (including this one). Lots of examples from mockup to first tries to final version. I recognize a lot of what they’re doing from my own experience. Also from my experience it’s hard to get all these details down: once you have the final graph, it’s easy to forget how you got there.
2 0.86407679 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
Introduction: Someone who wishes to remain anonymous writes: I’ve been following your blog a long time and enjoy your posts on visualization/statistical graphics matters. I don’t recall however you ever describing the details of your setup for plotting. I’m a new R user (convert from matplotlib) and would love to know your thoughts on the ideal setup: do you use mainly the R base? Do you use lattice? What do you think of ggplot2? etc. I found ggplot2 nearly indecipherable until a recent eureka moment, and I think its default theme is a tremendous waste of ink (all those silly grey backgrounds and grids are really unnecessary), but if you customize that away it can be made to look like ordinary, pretty statistical graphs. Feel free to respond on your blog, but if you do, please remove my name from the post (my colleagues already make fun of me for thinking about visualization too much.) I love that last bit! Anyway, my response is that I do everything in base graphics (using my
Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other
Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004)–but I am concerned that our dialogue with the graphic
5 0.78825754 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice
Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at https://www.udacity.com/course/ud651 so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about EDA and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparisons (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what
6 0.77332354 61 andrew gelman stats-2010-05-31-A data visualization manifesto
7 0.76869792 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
8 0.75709093 1604 andrew gelman stats-2012-12-04-An epithet I can live with
9 0.75028872 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back
10 0.74921131 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly
11 0.74730295 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.
12 0.74190098 1896 andrew gelman stats-2013-06-13-Against the myth of the heroic visualization
13 0.73434955 319 andrew gelman stats-2010-10-04-“Who owns Congress”
15 0.73169142 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
16 0.72677022 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?
17 0.72514045 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year
18 0.71579832 583 andrew gelman stats-2011-02-21-An interesting assignment for statistical graphics
19 0.70852995 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals
20 0.70821804 671 andrew gelman stats-2011-04-20-One more time-use graph
topicId topicWeight
[(24, 0.153), (81, 0.042), (95, 0.364), (99, 0.276)]
simIndex simValue blogId blogTitle
1 0.97537476 876 andrew gelman stats-2011-08-28-Vaguely related to the coke-dumping story
Introduction: Underground norms from Jay Livingston. P.S. The Coke story is here (and is followed up in the comments).
2 0.95666242 832 andrew gelman stats-2011-07-31-Even a good data display can sometimes be improved
Introduction: When I first saw this graphic, I thought “boy, that’s great, sometimes the graphic practically makes itself.” Normally it’s hard to use lots of different colors to differentiate items of interest, because there’s usually not an intuitive mapping between color and item (e.g. for countries, or states, or whatever). But the colors of crayons, what could be more perfect? So this graphic seemed awesome. But, as they discovered after some experimentation at datapointed.net there is an even BETTER possibility here. Click the link to see. Crayola Crayon colors by year
Introduction: Ben Hyde sends along this: Stuck in the middle of the supplemental data, reporting the total workup for their compounds, was this gem: Emma, please insert NMR data here! where are they? and for this compound, just make up an elemental analysis . . . I’m reminded of our recent discussions of coauthorship, where I argued that I see real advantages to having multiple people taking responsibility for the result. Jay Verkuilen responded: “On the flipside of collaboration . . . is diffusion of responsibility, where everybody thinks someone else ‘has that problem’ and thus things don’t get solved.” That’s what seems to have happened (hilariously) here.
4 0.94353104 1820 andrew gelman stats-2013-04-23-Foundation for Open Access Statistics
Introduction: Now here’s a foundation I (Bob) can get behind: Foundation for Open Access Statistics (FOAS) Their mission is to “promote free software, open access publishing, and reproducible research in statistics.” To me, that’s like supporting motherhood and apple pie! FOAS spun out of and is partially designed to support the Journal of Statistical Software (aka JSS, aka JStatSoft). I adore JSS because it (a) is open access, (b) publishes systems papers on statistical software, (c) has fast reviewing turnaround times, and (d) is free for authors and readers. One of the next items on my to-do list is to write up the Stan modeling language and submit it to JSS. As a not-for-profit with no visible source of income, they are quite sensibly asking for donations (don’t complain — it beats $3K author fees or not being able to read papers).
5 0.91698849 1862 andrew gelman stats-2013-05-18-uuuuuuuuuuuuugly
Introduction: Hamdan Azhar writes: I came across this graphic of vaccine-attributed decreases in mortality and was curious if you found it as unattractive and unintuitive as I did. Hope all is well with you! My reply: All’s well with me. And yes, that’s one horrible graph. It has all the problems with a bad infographic with none of the virtues. Compared to this monstrosity, the typical USA Today graph is a stunning, beautiful masterpiece. I don’t think I want to soil this webpage with the image. In fact, I don’t even want to link to it.
same-blog 7 0.90244865 1308 andrew gelman stats-2012-05-08-chartsnthings !
8 0.89520347 12 andrew gelman stats-2010-04-30-More on problems with surveys estimating deaths in war zones
9 0.89490509 520 andrew gelman stats-2011-01-17-R Advertised
10 0.88852227 519 andrew gelman stats-2011-01-16-Update on the generalized method of moments
11 0.88054287 1086 andrew gelman stats-2011-12-27-The most dangerous jobs in America
12 0.87716854 1164 andrew gelman stats-2012-02-13-Help with this problem, win valuable prizes
13 0.86853874 2101 andrew gelman stats-2013-11-15-BDA class 4 G+ hangout on air is on air
14 0.86226761 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?
15 0.84630901 1667 andrew gelman stats-2013-01-10-When you SHARE poorly researched infographics…
16 0.83490157 266 andrew gelman stats-2010-09-09-The future of R
17 0.81043464 944 andrew gelman stats-2011-10-05-How accurate is your gaydar?
18 0.8089065 1758 andrew gelman stats-2013-03-11-Yes, the decision to try (or not) to have a child can be made rationally
19 0.80465484 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?
20 0.79889458 1646 andrew gelman stats-2013-01-01-Back when fifty years was a long time ago