
1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”


meta info for this blog

Source: html

Introduction: The Journal of the Royal Statistical Society publishes papers followed by discussions. Lots of discussions, each can be no more than 400 words. Here’s my most recent discussion: The authors are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the text. So, very quickly: I would prefer Figure 1 to be displayed in terms of standard deviations, not variances. I


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The Journal of the Royal Statistical Society publishes papers followed by discussions. [sent-1, score-0.103]

2 Here’s my most recent discussion: The authors are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. [sent-3, score-0.291]

3 An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. [sent-4, score-0.268]

4 I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). [sent-5, score-0.467]

5 My usual practice is to read the graphs and then go back and clarify any questions through the text. [sent-7, score-0.215]

6 So, very quickly: I would prefer Figure 1 to be displayed in terms of standard deviations, not variances. [sent-8, score-0.098]

7 I find variances difficult to interpret, and I’m always taking mental square roots (0. [sent-9, score-0.397]

8 Figure 3 is appealing but I don’t like the visual emphasis of the endpoints of the 95% intervals. [sent-12, score-0.214]

9 5th percentiles of the posterior distribution, and I think it goes against the spirit of the article to emphasize these arbitrary endpoints. [sent-15, score-0.295]

10 I also think that, with some care, the graphs in Figures 3, 4, and 5 could be compactly re-expressed to show comparisons more effectively (as in Gelman, Pasarica, and Dodhia, 2002). [sent-16, score-0.321]

11 Tables 2 and 3 I think are useless: why should a reader care that the 10th percentile point of the distribution for a particular probability is 0. [sent-17, score-0.435]

12 Again, this seems to me to contradict the decision-analytic focus of the applied research. [sent-19, score-0.187]

13 These brusque comments on display may seem peripheral but to me they are important. [sent-20, score-0.263]

14 Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match. [sent-21, score-0.187]

15 Can you really predict the success of a marriage in 15 minutes? [sent-23, score-0.218]

16 Let’s practice what we preach: turning tables into graphs. [sent-31, score-0.359]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('dodhia', 0.254), ('pasarica', 0.254), ('abraham', 0.245), ('tables', 0.155), ('brusque', 0.141), ('compactly', 0.133), ('predict', 0.131), ('divorces', 0.127), ('endpoints', 0.127), ('gelman', 0.124), ('laurie', 0.122), ('peripheral', 0.122), ('percentiles', 0.119), ('os', 0.119), ('roots', 0.113), ('preach', 0.113), ('practice', 0.111), ('diagnostic', 0.111), ('care', 0.11), ('percentile', 0.109), ('figure', 0.108), ('discredited', 0.107), ('graphs', 0.104), ('publishes', 0.103), ('squared', 0.103), ('mathematician', 0.102), ('royal', 0.102), ('displayed', 0.098), ('duty', 0.098), ('variances', 0.098), ('distribution', 0.097), ('square', 0.097), ('deviations', 0.097), ('ideally', 0.096), ('applied', 0.094), ('contradict', 0.093), ('turning', 0.093), ('slate', 0.092), ('displays', 0.091), ('standpoint', 0.09), ('mental', 0.089), ('spirit', 0.089), ('marriage', 0.087), ('appealing', 0.087), ('arbitrary', 0.087), ('assessment', 0.086), ('criteria', 0.086), ('useless', 0.086), ('convey', 0.084), ('effectively', 0.084)]
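The page does not say how the tfidf weights and the simValue column are computed. As a rough illustration only, the following sketch assumes the common recipe: weight each word by term frequency times a smoothed inverse document frequency, normalise each document vector to unit length, and score document pairs by cosine similarity. The function names, the smoothing variant, and the toy corpus are hypothetical and are not taken from the tool that produced this page.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {word: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter(w for tokens in tokenised for w in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed idf (an assumed variant): rarer words get larger weights.
        vec = {w: c * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0)
               for w, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else vec)
    return vectors


def cosine_similarity(a, b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


# Hypothetical toy corpus, not the blog posts themselves.
toy_docs = [
    "graphs beat tables for display",
    "tables beat graphs for display",
    "lottery luck in texas",
]
vecs = tfidf_vectors(toy_docs)
for i, v in enumerate(vecs):
    print(i, round(cosine_similarity(vecs[0], v), 3))
```

Under this recipe a document compared with itself scores 1 up to floating-point rounding, which would be consistent with the near-1 value on the same-blog row below, and documents that share no words score 0.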

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”


2 0.93586463 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

Introduction: I was given the opportunity to briefly comment on the paper, A Bayesian approach to complex clinical diagnoses: a case-study in child abuse, by Nicky Best, Deborah Ashby, Frank Dunstan, David Foreman, and Neil McIntosh, for the Journal of the Royal Statistical Society. Here is what I wrote: Best et al. are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the t

3 0.14668751 105 andrew gelman stats-2010-06-23-More on those divorce prediction statistics, including a discussion of the innumeracy of (some) mathematicians

Introduction: A few months ago, I blogged on John Gottman, a psychologist whose headline-grabbing research on marriages (he got himself featured in Blink with a claim that he could predict with 83 percent accuracy whether a couple would be divorced—after meeting with them for 15 minutes!) was recently debunked in a book by Laurie Abraham. The question I raised was: how could someone who was evidently so intelligent and accomplished—Gottman, that is—get things so wrong? My brief conclusion was that once you have some success, I guess there’s not much of a motivation to change your ways. Also, I could well believe that, for all its flaws, Gottman’s work is better than much of the other research out there on marriages. There’s still the question of how this stuff gets published in scientific journals. I haven’t looked at Gottman’s articles in detail and so don’t really have thoughts on that one. Anyway, I recently corresponded with a mathematician who had heard of Gottman’s research and wrote

4 0.14350829 2279 andrew gelman stats-2014-04-02-Am I too negative?

Introduction: For background, you can start by reading my recent article, Is It Possible to Be an Ethicist Without Being Mean to People? and then a blog post, Quality over Quantity, by John Cook, who writes: At one point [Ed] Tufte spoke more generally and more personally about pursuing quality over quantity. He said most papers are not worth reading and that he learned early on to concentrate on the great papers, maybe one in 500, that are worth reading and rereading rather than trying to “keep up with the literature.” He also explained how over time he has concentrated more on showcasing excellent work than on criticizing bad work. You can see this in the progression from his first book to his latest. (Criticizing bad work is important too, but you’ll have to read his early books to find more of that. He won’t spend as much time talking about it in his course.) That reminded me of Jesse Robbins’ line: “Don’t fight stupid. You are better than that. Make more awesome.” This made me stop an

5 0.12936287 372 andrew gelman stats-2010-10-27-A use for tables (really)

Introduction: After our recent discussion of semigraphic displays, Jay Ulfelder sent along a semigraphic table from his recent book. He notes, “When countries are the units of analysis, it’s nice that you can use three-letter codes, so all the proper names have the same visual weight.” Ultimately I think that graphs win over tables for display. However in our work we spend a lot of time looking at raw data, often simply to understand what data we have. This use of tables has, I think, been forgotten in the statistical graphics literature. So I’d like to refocus the eternal tables vs. graphs discussion. If the goal is to present information, comparisons, relationships, models, data, etc etc, graphs win. Forget about tables. But . . . when you’re looking at your data, it can often help to see the raw numbers. Once you’re looking at numbers, it makes sense to organize them. Even a displayed matrix in R is a form of table, after all. And once you’re making a table, it can be sensible to

6 0.12227853 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

7 0.12027581 2157 andrew gelman stats-2014-01-02-2013

8 0.11424707 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

9 0.11421241 61 andrew gelman stats-2010-05-31-A data visualization manifesto

10 0.11173394 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

11 0.10934594 2172 andrew gelman stats-2014-01-14-Advice on writing research articles

12 0.10219372 2013 andrew gelman stats-2013-09-08-What we need here is some peer review for statistical graphics

13 0.10196741 2034 andrew gelman stats-2013-09-23-My talk Tues 24 Sept at 12h30 at Université de Technologie de Compiègne

14 0.099224247 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship

15 0.098148763 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

16 0.097126022 302 andrew gelman stats-2010-09-28-This is a link to a news article about a scientific paper

17 0.094036862 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)

18 0.093898237 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

19 0.092152953 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

20 0.090163738 2081 andrew gelman stats-2013-10-29-My talk in Amsterdam tomorrow (Wed 29 Oct): Can we use Bayesian methods to resolve the current crisis of statistically-significant research findings that don’t hold up?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.186), (1, 0.024), (2, -0.047), (3, -0.001), (4, 0.011), (5, -0.102), (6, -0.077), (7, 0.033), (8, -0.043), (9, -0.005), (10, 0.041), (11, 0.009), (12, -0.037), (13, -0.009), (14, 0.024), (15, 0.003), (16, 0.001), (17, 0.007), (18, -0.025), (19, 0.013), (20, 0.007), (21, 0.025), (22, 0.051), (23, -0.002), (24, 0.051), (25, 0.035), (26, 0.02), (27, 0.09), (28, 0.041), (29, -0.0), (30, 0.038), (31, 0.073), (32, -0.072), (33, 0.068), (34, 0.003), (35, -0.042), (36, 0.008), (37, 0.035), (38, 0.024), (39, -0.149), (40, 0.081), (41, -0.067), (42, -0.075), (43, -0.02), (44, -0.059), (45, -0.023), (46, -0.062), (47, -0.01), (48, 0.103), (49, -0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96335185 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”


2 0.95129037 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

Introduction: I was given the opportunity to briefly comment on the paper, A Bayesian approach to complex clinical diagnoses: a case-study in child abuse, by Nicky Best, Deborah Ashby, Frank Dunstan, David Foreman, and Neil McIntosh, for the Journal of the Royal Statistical Society. Here is what I wrote: Best et al. are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010). Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the t

3 0.65659374 1775 andrew gelman stats-2013-03-23-In which I disagree with John Maynard Keynes

Introduction: In his review in 1938 of Historical Development of the Graphical Representation of Statistical Data, by H. Gray Funkhouser, for The Economic Journal, the great economist writes: Perhaps the most striking outcome of Mr. Funkhouser’s researches is the fact of the very slow progress which graphical methods made until quite recently. . . . In the first fifty volumes of the Statistical Journal, 1837-87, only fourteen graphs are printed altogether. It is surprising to be told that Laplace never drew a graph of the normal law of error . . . Edgeworth made no use of statistical charts as distinct from mathematical diagrams. Apart from Quetelet and Jevons, the most important influences were probably those of Galton and of Mulhall’s Dictionary, first published in 1884. Galton was indeed following his father and grandfather in this field, but his pioneer work was mainly restricted to meteorological maps, and he did not contribute to the development of the graphical representation of ec

4 0.62220061 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

Introduction: I was at a talk awhile ago where the speaker presented tables with 4, 5, 6, even 8 significant digits even though, as is usual, only the first or second digit of each number conveyed any useful information. A graph would be better, but even if you’re too lazy to make a plot, a bit of rounding would seem to be required. I mentioned this to a colleague, who responded: I don’t know how to stop this practice. Logic doesn’t work. Maybe ridicule? Best hope is the departure from field who do it. (Theories don’t die, but the people who follow those theories retire.) Another possibility, I think, is helpful software defaults. If we can get to the people who write the software, maybe we could have some impact. Once the software is written, however, it’s probably too late. I’m not far from the center of the R universe, but I don’t know if I’ll ever succeed in my goals of increasing the default number of histogram bars or reducing the default number of decimal places in regression

5 0.6171723 2157 andrew gelman stats-2014-01-02-2013

Introduction: There’s lots of overlap but I put each paper into only one category. Also, I’ve included work that has been published in 2013 as well as work that has been completed this year and might appear in 2014 or later. So you can think of this list as representing roughly two years’ work. Political science: [2014] The twentieth-century reversal: How did the Republican states switch to the Democrats and vice versa? Statistics and Public Policy. (Andrew Gelman) [2013] Hierarchical models for estimating state and demographic trends in U.S. death penalty public opinion. Journal of the Royal Statistical Society A. (Kenneth Shirley and Andrew Gelman) [2013] Deep interactions with MRP: Election turnout and voting patterns among small electoral subgroups. American Journal of Political Science. (Yair Ghitza and Andrew Gelman) [2013] Charles Murray’s Coming Apart and the measurement of social and political divisions. Statistics, Politics and Policy.

6 0.60789555 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship

7 0.60608339 2225 andrew gelman stats-2014-02-26-A good comment on one of my papers

8 0.59703934 1800 andrew gelman stats-2013-04-12-Too tired to mock

9 0.57763404 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.

10 0.5774774 2081 andrew gelman stats-2013-10-29-My talk in Amsterdam tomorrow (Wed 29 Oct): Can we use Bayesian methods to resolve the current crisis of statistically-significant research findings that don’t hold up?

11 0.57687807 372 andrew gelman stats-2010-10-27-A use for tables (really)

12 0.57578814 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

13 0.57521671 1078 andrew gelman stats-2011-12-22-Tables as graphs: The Ramanujan principle

14 0.5745542 1366 andrew gelman stats-2012-06-05-How do segregation measures change when you change the level of aggregation?

15 0.56602585 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

16 0.5658502 800 andrew gelman stats-2011-07-13-I like lineplots

17 0.56093627 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals

18 0.55403793 61 andrew gelman stats-2010-05-31-A data visualization manifesto

19 0.54820889 689 andrew gelman stats-2011-05-01-Is that what she said?

20 0.54708028 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.025), (21, 0.037), (24, 0.173), (55, 0.015), (86, 0.331), (95, 0.017), (98, 0.042), (99, 0.26)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97012222 1427 andrew gelman stats-2012-07-24-More from the sister blog

Introduction: Anthropologist Bruce Mannheim reports that a recent well-publicized study on the genetics of native Americans, which used genetic analysis to find “at least three streams of Asian gene flow,” is in fact a confirmation of a long-known fact. Mannheim writes: This three-way distinction was known linguistically since the 1920s (for example, Sapir 1921). Basically, it’s a division among the Eskimo-Aleut languages, which straddle the Bering Straits even today, the Athabaskan languages (which were discovered to be related to a small Siberian language family only within the last few years, not by Greenberg as Wade suggested), and everything else. This is not to say that the results from genetics are unimportant, but it’s good to see how it fits with other aspects of our understanding.

2 0.96667969 1530 andrew gelman stats-2012-10-11-Migrating your blog from Movable Type to WordPress

Introduction: Cord Blomquist, who did a great job moving us from horrible Movable Type to nice nice WordPress, writes: I [Cord] wanted to share a little news with you related to the original work we did for you last year. When ReadyMadeWeb converted your Movable Type blog to WordPress, we got a lot of other requests for the same service, so we started thinking about a bigger market for such a product. After a bit of research, we started work on automating the data conversion, writing rules, and exceptions to the rules, on how Movable Type and TypePad data could be translated to WordPress. After many months of work, we’re getting ready to announce TP2WP.com, a service that converts Movable Type and TypePad export files to WordPress import files, so anyone who wants to migrate to WordPress can do so easily and without losing permalinks, comments, images, or other files. By automating our service, we’ve been able to drop the price to just $99. I recommend it (and, no, Cord is not paying m

3 0.95914626 436 andrew gelman stats-2010-11-29-Quality control problems at the New York Times

Introduction: I guess there’s a reason they put this stuff in the Opinion section and not in the Science section, huh? P.S. More here.

4 0.94756246 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker

Introduction: I just happened to notice this from last year. Eric Loken writes: Steven Pinker reviewed Malcolm Gladwell’s latest book and criticized him rather harshly for several shortcomings. Gladwell appears to have made things worse for himself in a letter to the editor of the NYT by defending a manifestly weak claim from one of his essays – the claim that NFL quarterback performance is unrelated to the order they were drafted out of college. The reason we [Loken and his colleagues] are implicated is that Pinker identified an earlier blog post of ours as one of three sources he used to challenge Gladwell (yay us!). But Gladwell either misrepresented or misunderstood our post in his response, and admonishes Pinker by saying “we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google.” Well, here’s what you can find on Google. Follow this link to request the data for NFL quarterbacks drafted between 1980 and

5 0.94132435 873 andrew gelman stats-2011-08-26-Luck or knowledge?

Introduction: Joan Ginther has won the Texas lottery four times. First, she won $5.4 million, then a decade later, she won $2million, then two years later $3million and in the summer of 2010, she hit a $10million jackpot. The odds of this has been calculated at one in eighteen septillion and luck like this could only come once every quadrillion years. According to Forbes, the residents of Bishop, Texas, seem to believe God was behind it all. The Texas Lottery Commission told Mr Rich that Ms Ginther must have been ‘born under a lucky star’, and that they don’t suspect foul play. Harper’s reporter Nathaniel Rich recently wrote an article about Ms Ginther, which calls the validity of her ‘luck’ into question. First, he points out, Ms Ginther is a former math professor with a PhD from Stanford University specialising in statistics. More at Daily Mail. [Edited Saturday] In comments, C Ryan King points to the original article at Harper’s and Bill Jefferys to Wired.

6 0.9363873 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

same-blog 7 0.92955196 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”

8 0.92715025 76 andrew gelman stats-2010-06-09-Both R and Stata

9 0.92635298 904 andrew gelman stats-2011-09-13-My wikipedia edit

10 0.91747487 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

11 0.91468334 2219 andrew gelman stats-2014-02-21-The world’s most popular languages that the Mac documentation hasn’t been translated into

12 0.90731955 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel

13 0.90577006 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

14 0.8937096 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

15 0.89294082 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

16 0.87642384 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

17 0.87384999 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!

18 0.8564989 1971 andrew gelman stats-2013-08-07-I doubt they cheated

19 0.85227245 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”

20 0.8506816 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”