andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-319 knowledge-graph by maker-knowledge-mining

319 andrew gelman stats-2010-10-04-“Who owns Congress”


meta info for this blog

Source: html

Introduction: Curt Yeske pointed me to this. Wow–these graphs are really hard to read! The old me would’ve said that each of these graphs would be better replaced by a dotplot (or, better still, a series of lineplots showing time trends). The new me would still like the dotplots and lineplots, but I’d say it’s fine to have the eye-grabbing but hard-to-read graphs as is, and then to have the more informative statistical graphics underneath, as it were. The idea is, you’d click on the pretty but hard-to-read “infovis” graphs, and this would then reveal informative “full Cleveland” graphs. And then if you click again you’d get a spreadsheet with the raw numbers. That I’d like to see, as a new model for graphical presentation.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The old me would’ve said that each of these graphs would be better replaced by a dotplot (or, better still, a series of lineplots showing time trends). [sent-3, score-1.932]

2 The new me would still like the dotplots and lineplots, but I’d say it’s fine to have the eye-grabbing but hard-to-read graphs as is, and then to have the more informative statistical graphics underneath, as it were. [sent-4, score-1.497]

3 The idea is, you’d click on the pretty but hard-to-read “infovis” graphs, and this would then reveal informative “full Cleveland” graphs. [sent-5, score-0.88]

4 And then if you click again you’d get a spreadsheet with the raw numbers. [sent-6, score-0.625]

5 That I’d like to see, as a new model for graphical presentation. [sent-7, score-0.33]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('lineplots', 0.429), ('graphs', 0.374), ('click', 0.27), ('underneath', 0.239), ('informative', 0.232), ('dotplots', 0.215), ('dotplot', 0.204), ('cleveland', 0.184), ('spreadsheet', 0.179), ('infovis', 0.168), ('wow', 0.165), ('replaced', 0.159), ('reveal', 0.147), ('raw', 0.14), ('trends', 0.136), ('graphical', 0.134), ('presentation', 0.134), ('graphics', 0.117), ('still', 0.116), ('showing', 0.116), ('would', 0.112), ('better', 0.111), ('series', 0.107), ('pointed', 0.104), ('old', 0.096), ('full', 0.092), ('new', 0.091), ('fine', 0.087), ('hard', 0.073), ('said', 0.072), ('read', 0.066), ('pretty', 0.062), ('idea', 0.057), ('like', 0.055), ('model', 0.05), ('statistical', 0.049), ('say', 0.049), ('ve', 0.043), ('really', 0.043), ('time', 0.041), ('get', 0.036), ('see', 0.034)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 319 andrew gelman stats-2010-10-04-“Who owns Congress”

2 0.24396682 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

3 0.23682709 800 andrew gelman stats-2011-07-13-I like lineplots

Introduction: These particular lineplots are called parallel coordinate plots.

4 0.20387655 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

5 0.18219343 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

Introduction: The visual display of quantitative information (to use Edward Tufte’s wonderful term) is a diverse field or set of fields, and its practitioners have different goals. The goals of software designers, applied statisticians, biologists, graphic designers, and journalists (to list just a few of the important creators of data graphics) often overlap—but not completely. One of our aims in writing our article [on Infovis and Statistical Graphics] was to emphasize the diversity of graphical goals, as it seems to us that even experts tend to consider one aspect of a graph and not others. Our main practical suggestion was that, in the internet age, we should not have to choose between attractive graphs and informational graphs: it should be possible to display both, via interactive displays. But to follow this suggestion, one must first accept that not every beautiful graph is informative, and not every informative graph is beautiful. . . . Yes, it can sometimes be possible for a graph to

6 0.15969932 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

7 0.14858231 1604 andrew gelman stats-2012-12-04-An epithet I can live with

8 0.13933727 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

9 0.13831012 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

10 0.13775182 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

11 0.13546507 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

12 0.13377646 443 andrew gelman stats-2010-12-02-Automating my graphics advice

13 0.1321582 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

14 0.12915331 61 andrew gelman stats-2010-05-31-A data visualization manifesto

15 0.12582183 1298 andrew gelman stats-2012-05-03-News from the sister blog!

16 0.12548923 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

17 0.12391704 2038 andrew gelman stats-2013-09-25-Great graphs of names

18 0.12063409 210 andrew gelman stats-2010-08-16-What I learned from those tough 538 commenters

19 0.1200559 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

20 0.11681925 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?
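
The word weights and similarity values in this tfidf section are the kind of numbers a tf-idf model with cosine similarity produces. The following is a minimal sketch of that technique, not the code that generated this page: the whitespace tokenizer, the smoothed idf formula, and the three toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document, scaled to unit length."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (an assumed variant) keeps every weight positive.
        vec = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else vec)
    return vectors


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


# Hypothetical toy corpus; the real posts are much longer.
docs = [
    "graphs dotplot lineplots click spreadsheet",
    "infovis graphs statistical graphics",
    "lottery gambling tickets",
]
vecs = tfidf_vectors(docs)
sims = [cosine(vecs[0], v) for v in vecs]
print(sims)
```

In this sketch a document compared with itself scores 1 up to floating-point rounding, which is consistent with the same-blog row above, and documents that share no words score 0.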


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.126), (1, -0.02), (2, -0.049), (3, 0.107), (4, 0.15), (5, -0.201), (6, -0.108), (7, 0.054), (8, -0.07), (9, 0.006), (10, 0.047), (11, 0.011), (12, -0.007), (13, 0.029), (14, 0.016), (15, -0.042), (16, 0.004), (17, -0.039), (18, -0.002), (19, 0.016), (20, -0.022), (21, -0.024), (22, 0.016), (23, 0.032), (24, -0.025), (25, -0.015), (26, -0.02), (27, 0.015), (28, -0.014), (29, 0.017), (30, -0.04), (31, -0.02), (32, -0.004), (33, 0.036), (34, 0.019), (35, 0.017), (36, 0.013), (37, 0.025), (38, 0.063), (39, 0.017), (40, 0.011), (41, -0.013), (42, -0.011), (43, 0.047), (44, 0.02), (45, -0.023), (46, -0.022), (47, -0.013), (48, 0.02), (49, 0.02)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94969094 319 andrew gelman stats-2010-10-04-“Who owns Congress”

2 0.89239407 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

Introduction: The visual display of quantitative information (to use Edward Tufte’s wonderful term) is a diverse field or set of fields, and its practitioners have different goals. The goals of software designers, applied statisticians, biologists, graphic designers, and journalists (to list just a few of the important creators of data graphics) often overlap—but not completely. One of our aims in writing our article [on Infovis and Statistical Graphics] was to emphasize the diversity of graphical goals, as it seems to us that even experts tend to consider one aspect of a graph and not others. Our main practical suggestion was that, in the internet age, we should not have to choose between attractive graphs and informational graphs: it should be possible to display both, via interactive displays. But to follow this suggestion, one must first accept that not every beautiful graph is informative, and not every informative graph is beautiful. . . . Yes, it can sometimes be possible for a graph to

3 0.85448897 1604 andrew gelman stats-2012-12-04-An epithet I can live with

Introduction: Here. Indeed, I’d much rather be a legend than a myth. I just want to clarify one thing. Walter Hickey writes: [Antony Unwin and Andrew Gelman] collaborated on this presentation where they take a hard look at what’s wrong with the recent trends of data visualization and infographics. The takeaway is that while there have been great leaps in visualization technology, some of the visualizations that have garnered the highest praises have actually been lacking in a number of key areas. Specifically, the pair does a takedown of the top visualizations of 2008 as decided by the popular statistics blog Flowing Data. This is a fair summary, but I want to emphasize that, although our dislike of some award-winning visualizations is central to our argument, it is only the first part of our story. As Antony and I worked more on our paper, and especially after seeing the discussions by Robert Kosara, Stephen Few, Hadley Wickham, and Paul Murrell (all to appear in Journal of Computati

4 0.82107294 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

5 0.81893957 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

6 0.81729037 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

7 0.80566382 2038 andrew gelman stats-2013-09-25-Great graphs of names

8 0.79539871 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

9 0.78763205 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

10 0.78654003 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

11 0.7662828 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

12 0.75895005 372 andrew gelman stats-2010-10-27-A use for tables (really)

13 0.75786847 1775 andrew gelman stats-2013-03-23-In which I disagree with John Maynard Keynes

14 0.75644219 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

15 0.75622904 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.

16 0.75482744 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

17 0.75250065 1896 andrew gelman stats-2013-06-13-Against the myth of the heroic visualization

18 0.74306828 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

19 0.73456818 61 andrew gelman stats-2010-05-31-A data visualization manifesto

20 0.72007924 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.018), (6, 0.027), (13, 0.09), (16, 0.078), (24, 0.145), (51, 0.029), (77, 0.03), (96, 0.203), (99, 0.237)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94264323 1306 andrew gelman stats-2012-05-07-Lists of Note and Letters of Note

Introduction: These (from Shaun Usher) are surprisingly good, especially since he appears to come up with new lists and letters pretty regularly. I suppose a lot of them get sent in from readers, but still. Here’s my favorite recent item, a letter sent to the Seattle Bureau of Prohibition in 1931: Dear Sir: My husband is in the habit of buying a quart of wiskey every other day from a Chinese bootlegger named Chin Waugh living at 317-16th near Alder street. We need this money for household expenses. Will you please have his place raided? He keeps a supply planted in the garden and a smaller quantity under the back steps for quick delivery. If you make the raid at 9:30 any morning you will be sure to get the goods and Chin also as he leaves the house at 10 o’clock and may clean up before he goes. Thanking you in advance, I remain yours truly, Mrs. Hillyer

2 0.91151446 1731 andrew gelman stats-2013-02-21-If a lottery is encouraging addictive gambling, don’t expand it!

Introduction: This story from Vivian Yee seems just horrible to me. First the background: Pronto Lotto’s real business takes place in the carpeted, hushed area where its most devoted customers watch video screens from a scattering of tall silver tables, hour after hour, day after day. The players — mostly men, about a dozen at any given time — come on their lunch breaks or after work to study the screens, which are programmed with the Quick Draw lottery game, and flash a new set of winning numbers every four minutes. They have helped make Pronto Lotto the top Quick Draw vendor in the state, selling $3.3 million worth of tickets last year, more than $1 million more than the second busiest location, a World Books shop in Penn Station. Some stay for just a few minutes. Others play for the length of a workday, repeatedly traversing the few yards between their seats and the cash register as they hand the next wager to a clerk with a dollar bill or two, and return to wait. “It’s like my job, 24

3 0.90890872 410 andrew gelman stats-2010-11-12-“The Wald method has been the subject of extensive criticism by statisticians for exaggerating results”

Introduction: Paul Nee sends in this amusing item: MELA Sciences claimed success in a clinical trial of its experimental skin cancer detection device only by altering the statistical method used to analyze the data in violation of an agreement with U.S. regulators, charges an independent healthcare analyst in a report issued last week. . . The BER report, however, relies on its own analysis to suggest that MELA struck out with FDA because the agency’s medical device reviewers discovered the MELAFind pivotal study failed to reach statistical significance despite the company’s claims to the contrary. And now here’s where it gets interesting: MELA claims that a phase III study of MELAFind met its primary endpoint by detecting accurately 112 of 114 eligible melanomas for a “sensitivity” rate of 98%. The lower confidence bound of the sensitivity analysis was 95.1%, which met the FDA’s standard for statistical significance in the study spelled out in a binding agreement with MELA, the compa

same-blog 4 0.89340162 319 andrew gelman stats-2010-10-04-“Who owns Congress”

5 0.89078766 327 andrew gelman stats-2010-10-07-There are never 70 distinct parameters

Introduction: Sam Seaver writes: I’m a graduate student in computational biology, and I’m relatively new to advanced statistics, and am trying to teach myself how best to approach a problem I have. My dataset is a small sparse matrix of 150 cases and 70 predictors, it is sparse as in many zeros, not many ‘NA’s. Each case is a nutrient that is fed into an in silico organism, and its response is whether or not it stimulates growth, and each predictor is one of 70 different pathways that the nutrient may or may not belong to. Because all of the nutrients do not belong to all of the pathways, there are thus many zeros in my matrix. My goal is to be able to use the pathways themselves to predict whether or not a nutrient could stimulate growth, thus I wanted to compute regression coefficients for each pathway, with which I could apply to other nutrients for other species. There are quite a few singularities in the dataset (summary(glm) reports that 14 coefficients are not defined because of sin

6 0.88705534 169 andrew gelman stats-2010-07-29-Say again?

7 0.86997437 1023 andrew gelman stats-2011-11-22-Going Beyond the Book: Towards Critical Reading in Statistics Teaching

8 0.85406244 1118 andrew gelman stats-2012-01-14-A model rejection letter

9 0.84939241 1338 andrew gelman stats-2012-05-23-Advice on writing research articles

10 0.83841127 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

11 0.83601725 2023 andrew gelman stats-2013-09-14-On blogging

12 0.83175808 302 andrew gelman stats-2010-09-28-This is a link to a news article about a scientific paper

13 0.83141208 2065 andrew gelman stats-2013-10-17-Cool dynamic demographic maps provide beautiful illustration of Chris Rock effect

14 0.82443255 934 andrew gelman stats-2011-09-30-Nooooooooooooooooooo!

15 0.82244706 205 andrew gelman stats-2010-08-13-Arnold Zellner

16 0.81847197 99 andrew gelman stats-2010-06-19-Paired comparisons

17 0.81323397 2296 andrew gelman stats-2014-04-19-Index or indicator variables

18 0.81271386 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

19 0.81058455 2172 andrew gelman stats-2014-01-14-Advice on writing research articles

20 0.80880094 980 andrew gelman stats-2011-10-29-When people meet this guy, can they resist the temptation to ask him what he’s doing for breakfast??