1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis


meta information for this blog

Source: html

Introduction: This is really fun. I love how Ripley thinks, with just about every concept considered in broad generality while being connected to real-data examples. He’s a great statistical storyteller as well. . . . and Wickham on exploratory model analysis I came across Ripley’s slides in a reference from Hadley Wickham’s article on exploratory model analysis. I’ve been interested for awhile in statistical graphics for understanding fitted models (which is different than the usual use of graphics to visualize data or to understand discrepancies of data from models). Recently I’ve started using the term “exploratory model analysis,” and it seemed like such a natural phrase that I thought I’d google it and see what’s up. I found the above-linked paper by Hadley, which in turn refers to a paper by Antony Unwin, Chris Volinsky, and Sylvia Winkler that defines “exploratory modelling analysis” as “the evaluation and comparison of many models simultaneously.” That’s not exactly what I h


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I love how Ripley thinks, with just about every concept considered in broad generality while being connected to real-data examples. [sent-2, score-0.383]

2 and Wickham on exploratory model analysis I came across Ripley’s slides in a reference from Hadley Wickham’s article on exploratory model analysis. [sent-7, score-1.302]

3 I’ve been interested for awhile in statistical graphics for understanding fitted models (which is different than the usual use of graphics to visualize data or to understand discrepancies of data from models). [sent-8, score-0.786]

4 Recently I’ve started using the term “exploratory model analysis,” and it seemed like such a natural phrase that I thought I’d google it and see what’s up. [sent-9, score-0.376]

5 I found the above-linked paper by Hadley, which in turn refers to a paper by Antony Unwin, Chris Volinsky, and Sylvia Winkler that defines “exploratory modelling analysis” as “the evaluation and comparison of many models simultaneously.” [sent-10, score-0.65]

6 I was curious to see what research Ripley’s been up to lately. [sent-14, score-0.076]

7 His webpage doesn’t seem to have been updated in many years, but a Google scholar search revealed this article on estimating disease prevalence. [sent-15, score-0.781]

8 I have no idea how he got involved in that project, but I hope he is getting deep enough into the problem to inspire further insights. [sent-16, score-0.267]

9 (The search also revealed a bunch of articles and patents on electric household appliances, but that seems to be a different Brian D. [sent-17, score-0.683]
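
The page does not document how sentScore is computed. A minimal sketch, assuming the score is the sum of the tfidf weights of the whitespace-separated tokens that exactly match a listed word (case-sensitive, punctuation not stripped), reproduces the scores of sentences 4, 6, and 8 above. The function name `sentence_score` and the dictionary name `TFIDF_TOP` are hypothetical, and only a subset of the top-N word weights from this page is included.

```python
# Subset of the top-N tfidf weights listed on this page for post 1066.
# The full vocabulary is not shown on the page, so this is illustrative only.
TFIDF_TOP = {
    "ripley": 0.396, "exploratory": 0.348, "google": 0.137, "analysis": 0.13,
    "inspire": 0.119, "model": 0.09, "phrase": 0.083, "deep": 0.082,
    "curious": 0.076, "involved": 0.066, "started": 0.066,
}


def sentence_score(sentence, weights):
    """Sum the weights of tokens that exactly match a known word.

    Assumption: tokens are split on whitespace only, so "analysis,”" and
    "Ripley’s" do not match "analysis" or "ripley".
    """
    return round(sum(weights.get(token, 0.0) for token in sentence.split()), 3)


sent_4 = ("Recently I’ve started using the term “exploratory model analysis,” "
          "and it seemed like such a natural phrase that I thought I’d google "
          "it and see what’s up.")
sent_6 = "I was curious to see what research Ripley’s been up to lately."
sent_8 = ("I have no idea how he got involved in that project, but I hope he "
          "is getting deep enough into the problem to inspire further insights.")

for label, text in (("4", sent_4), ("6", sent_6), ("8", sent_8)):
    print(label, sentence_score(text, TFIDF_TOP))
```

Under this assumption the sketch gives 0.376, 0.076, and 0.267, which match the listed scores for those three sentences. Words outside the top-N list may also carry weight in the real model, so agreement on other sentences is not guaranteed.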


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ripley', 0.396), ('exploratory', 0.348), ('wickham', 0.234), ('hadley', 0.223), ('revealed', 0.195), ('storyteller', 0.152), ('sylvia', 0.152), ('appliances', 0.143), ('graphics', 0.139), ('google', 0.137), ('search', 0.136), ('analysis', 0.13), ('patents', 0.128), ('electric', 0.122), ('visualize', 0.119), ('inspire', 0.119), ('generality', 0.119), ('discrepancies', 0.117), ('models', 0.114), ('modelling', 0.113), ('defines', 0.112), ('brian', 0.103), ('household', 0.102), ('antony', 0.102), ('updated', 0.099), ('unwin', 0.099), ('scholar', 0.095), ('webpage', 0.095), ('connected', 0.092), ('disease', 0.092), ('refers', 0.091), ('slides', 0.09), ('model', 0.09), ('broad', 0.087), ('fitted', 0.085), ('concept', 0.085), ('evaluation', 0.084), ('phrase', 0.083), ('deep', 0.082), ('thinks', 0.082), ('reference', 0.076), ('curious', 0.076), ('chris', 0.075), ('awhile', 0.073), ('turn', 0.069), ('estimating', 0.069), ('mind', 0.069), ('comparison', 0.067), ('involved', 0.066), ('started', 0.066)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis

2 0.16227615 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

Introduction: Dean Eckles writes: Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched at https://www.udacity.com/course/ud651 so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers. I didn’t have a chance to look at the course so instead I responded with some generic comments about EDA and visualization (in no particular order): - Think of a graph as a comparison. All graphs are comparisons (indeed, all statistical analyses are comparisons). If you already have the graph in mind, think of what comparisons it’s enabling. Or if you haven’t settled on the graph yet, think of what

3 0.15560976 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies

Introduction: Chris Chambers and I had an enlightening discussion the other day at the blog of Rolf Zwaan, regarding the Garden of Forking Paths (go here and scroll down through the comments). Chris sent me the following note: I’m writing a book at the moment about reforming practices in psychological research (focusing on various bad practices such as p-hacking, HARKing, low statistical power, publication bias, lack of data sharing etc. – and posing solutions such as pre-registration, Bayesian hypothesis testing, mandatory data archiving etc.) and I am arriving at a rather unsettling conclusion: that null hypothesis significance testing (NHST) simply isn’t valid for observational research. If this is true then most of the psychological literature is statistically flawed. I was wondering what your thoughts were on this, both from a statistical point of view and from your experience working in an observational field. We all know about the dangers of researcher degrees of freedom. We also know

4 0.14350016 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004)–but I am concerned that our dialogue with the graphic

5 0.13591123 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

6 0.12496747 225 andrew gelman stats-2010-08-23-Getting into hot water over hot graphics

7 0.12479968 822 andrew gelman stats-2011-07-26-Any good articles on the use of error bars?

8 0.12451944 2279 andrew gelman stats-2014-04-02-Am I too negative?

9 0.12235741 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

10 0.12108396 1806 andrew gelman stats-2013-04-16-My talk in Chicago this Thurs 6:30pm

11 0.12103326 1604 andrew gelman stats-2012-12-04-An epithet I can live with

12 0.120316 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

13 0.11969064 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

14 0.11033615 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

15 0.11016429 450 andrew gelman stats-2010-12-04-The Joy of Stats

16 0.10823748 265 andrew gelman stats-2010-09-09-Removing the blindfold: visualising statistical models

17 0.10653631 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

18 0.10594995 583 andrew gelman stats-2011-02-21-An interesting assignment for statistical graphics

19 0.10462727 1767 andrew gelman stats-2013-03-17-The disappearing or non-disappearing middle class

20 0.10434908 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft
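
The page does not say how simValue is computed. Cosine similarity between tfidf vectors is a common choice, and it would be consistent with the same-blog value of 1.0000001 (a vector compared with itself, up to floating-point rounding), but that is an assumption. The sketch below uses a hypothetical helper `cosine_similarity` on sparse word-weight dictionaries. The first vector takes three weights from this post's word list, and the second vector is invented purely for illustration.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * vec_b[term] for term, weight in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(weight * weight for weight in vec_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for an empty vector
    return dot / (norm_a * norm_b)


# Three weights taken from this post's top-N list (the full vector is longer).
this_post = {"ripley": 0.396, "exploratory": 0.348, "graphics": 0.139}
# Hypothetical weights for another post, made up for this example.
other_post = {"graphics": 0.5, "exploratory": 0.2, "course": 0.4}

print(cosine_similarity(this_post, this_post))   # a post compared with itself
print(cosine_similarity(this_post, other_post))  # a partially overlapping post
```

A post compared with itself scores 1 up to rounding, and posts that share no weighted terms score 0, so ranking by this value would produce a list shaped like the one above. The same helper would also work on the (topicId, topicWeight) pairs that the lsi and lda sections list, if those similarities are cosine-based as well.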


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.16), (1, 0.041), (2, -0.068), (3, 0.034), (4, 0.076), (5, -0.051), (6, -0.091), (7, -0.0), (8, 0.025), (9, 0.011), (10, 0.021), (11, 0.02), (12, -0.016), (13, -0.005), (14, 0.0), (15, -0.022), (16, 0.004), (17, -0.03), (18, 0.017), (19, 0.042), (20, -0.037), (21, -0.042), (22, 0.028), (23, -0.04), (24, -0.047), (25, -0.019), (26, -0.054), (27, -0.018), (28, 0.017), (29, -0.035), (30, -0.047), (31, 0.027), (32, 0.061), (33, 0.037), (34, -0.007), (35, 0.03), (36, 0.016), (37, -0.024), (38, 0.026), (39, 0.032), (40, -0.034), (41, 0.004), (42, 0.018), (43, 0.023), (44, 0.006), (45, -0.026), (46, 0.029), (47, -0.018), (48, -0.057), (49, -0.026)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9493258 1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis

2 0.72169775 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

Introduction: Our discussion on data visualization continues. On one side are three statisticians–Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs or how they are trained, and I don’t want to mislabel them) such as Robert Kosara and Jen Lowe, who seem a bit annoyed at how my colleagues and myself seem to follow the Tufte strategy of criticizing what we don’t understand. And on the third side are many (most?) academic statisticians, econometricians, etc., who don’t understand or respect graphs and seem to think of visualization as a toy that is unrelated to serious science or statistics. I’m not so interested in the third group right now–I tried to communicate with them in my big articles from 2003 and 2004)–but I am concerned that our dialogue with the graphic

3 0.72009689 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

Introduction: Somebody asked me to speak sometime at a data visualization meetup. I think I spoke there a year or two ago but I could do it again. Last time I spoke on Infovis vs Statistical Graphics, this time I could just go thru the choices involved in a few zillion graphs I’ve published over the years, to give a sense of the options and choices involved in graphical communication. For this talk there would be no single theme (except, perhaps, my usual “Graphs as comparisons,” “All of statistics as comparisons,” and “Exploratory data analysis as hypothesis testing”), just a bunch of open discussion about what I tried, why I tried it, what worked and what didn’t work, etc. I’ve discussed these sorts of decisions on occasion (and am now writing a paper with Yair about some of this for our voting models), but I’ve never tried to make a talk out of it before. Could be fun.

4 0.70183808 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

Introduction: To continue our discussion from last week, consider three positions regarding the display of information: (a) The traditional tabular approach. This is how most statisticians, econometricians, political scientists, sociologists, etc., seem to operate. They understand the appeal of a pretty graph, and they’re willing to plot some data as part of an exploratory data analysis, but they see their serious research as leading to numerical estimates, p-values, tables of numbers. These people might use a graph to illustrate their points but they don’t see them as necessary in their research. (b) Statistical graphics as performed by Howard Wainer, Bill Cleveland, Dianne Cook, etc. They–we–see graphics as central to the process of statistical modeling and data analysis and are interested in graphs (static and dynamic) that display every data point as transparently as possible. (c) Information visualization or infographics, as performed by graphics designers and statisticians who are

5 0.69827378 1811 andrew gelman stats-2013-04-18-Psychology experiments to understand what’s going on with data graphics?

Introduction: Ricardo Pietrobon writes, regarding my post from last year on attitudes toward data graphics, Wouldn’t it be the case to start formally studying the usability of graphics from a cognitive perspective? with platforms such as the mechanical turk it should be fairly straightforward to test alternative methods and come to some conclusions about what might be more informative and what might better assist in supporting decisions. btw, my guess is that these two constructs might not necessarily agree with each other. And Jessica Hullman provides some background: Measuring success for the different goals that you hint at in your article is indeed challenging, and I don’t think that most visualization researchers would claim to have met this challenge (myself included). Visualization researchers may know the user psychology well when it comes to certain dimensions of a graph’s effectiveness (such as quick and accurate responses), but I wouldn’t agree with this statement as a gene

6 0.69410497 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

7 0.69103122 794 andrew gelman stats-2011-07-09-The quest for the holy graph

8 0.68965763 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”

9 0.689345 265 andrew gelman stats-2010-09-09-Removing the blindfold: visualising statistical models

10 0.68776989 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

11 0.68499517 1806 andrew gelman stats-2013-04-16-My talk in Chicago this Thurs 6:30pm

12 0.68436968 546 andrew gelman stats-2011-01-31-Infovis vs. statistical graphics: My talk tomorrow (Tues) 1pm at Columbia

13 0.6797455 1604 andrew gelman stats-2012-12-04-An epithet I can live with

14 0.67516452 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft

15 0.67421877 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data

16 0.66924065 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

17 0.66867125 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons

18 0.66030049 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

19 0.65932393 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”

20 0.65721995 319 andrew gelman stats-2010-10-04-“Who owns Congress”


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.016), (9, 0.015), (16, 0.047), (21, 0.011), (24, 0.107), (27, 0.014), (28, 0.014), (29, 0.013), (43, 0.025), (49, 0.018), (57, 0.014), (63, 0.019), (66, 0.139), (77, 0.028), (84, 0.07), (85, 0.015), (87, 0.024), (95, 0.033), (99, 0.288)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94827819 1192 andrew gelman stats-2012-03-02-These people totally don’t know what Chance magazine is all about

Introduction: I received the following unsolicited email, subject line “Chance Magazine – Comedy Showcase”: Hi Andrew, Hope you’re doing well. I’m writing to let you know that we will be putting on an industry showcase at the brand new Laughing Devil Comedy Club (4738 Vernon Blvd. Long Island City) on Thursday, February 9th at 8:00 PM. If you’re unfamiliar, it’s one stop on the 7 train from Grand Central. Following the showcase, the club will stay open for an industry mingle/happy hour with drink specials and all the business card exchanging you can hope for. This showcase will feature 9 of our best: Steve Hofstetter’s latest album hit #1 in the world. He’ll be hosting Collin Moulton (Showtime Half Hour Special), Tony Deyo (Aspen Comedy Festival), Tom Simmons (Winner of the SF International Comedy Festival), Marc Ryan (Host of Mudslingers), Mike Trainor (TruTV), Jessi Campbell (CMT), Danny Browning (Bob & Tom), and Joe Zimmerman (Sirius/XM). I would love for you (and anyone you’d like to

2 0.94730723 536 andrew gelman stats-2011-01-24-Trends in partisanship by state

Introduction: Matthew Yglesias discusses how West Virginia used to be a Democratic state but is now solidly Republican. I thought it would be helpful to expand this to look at trends since 1948 (rather than just 1988) and all 50 states (rather than just one). This would represent a bit of work, except that I already did it a couple years ago, so here it is (right-click on the image to see the whole thing): I cheated a bit to get reasonable-looking groupings, for example putting Indiana in the Border South rather than Midwest, and putting Alaska in Mountain West and Hawaii in West Coast. Also, it would help to distinguish states by color (to be able to disentangle New Jersey and Delaware, for example) but we didn’t do this because the book is mostly black and white. In any case, the picture makes it clear that there have been strong regional trends all over during the past sixty years. P.S. My graph comes from Red State Blue State so no 2008 data, but 2008 was pretty much a shift

same-blog 3 0.94137204 1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis

4 0.93823409 204 andrew gelman stats-2010-08-12-Sloppily-written slam on moderately celebrated writers is amusing nonetheless

Introduction: Via J. Robert Lennon, I discovered this amusing blog by Anis Shivani on “The 15 Most Overrated Contemporary American Writers.” Lennon found it so annoying that he refused to even link to it, but I actually enjoyed Shivani’s bit of performance art. The literary criticism I see is so focused on individual books that it’s refreshing to see someone take on an entire author’s career in a single paragraph. I agree with Lennon that Shivani’s blog doesn’t have much content–it’s full of terms such as “vacuity” and “pap,” compared to which “trendy” and “fashionable” are precision instruments–but Shivani covers a lot of ground and it’s fun to see this all in one place. My main complaint with Shivani, beyond his sloppy writing (but, hey, it’s just a blog; I’m sure he saves the good stuff for his paid gigs) is his implicit assumption that everyone should agree with him. I’m as big a Kazin fan as anyone, but I still think he completely undervalued Marquand. The other thing I noticed

5 0.93629986 1200 andrew gelman stats-2012-03-06-Some economists are skeptical about microfoundations

Introduction: A few months ago, I wrote: Economists seem to rely heavily on a sort of folk psychology, a relic of the 1920s-1950s in which people calculate utilities (or act as if they are doing so) in order to make decisions. A central tenet of economics is that inference or policy recommendation be derived from first principles from this folk-psychology model. This just seems silly to me, as if astronomers justified all their calculations with an underlying appeal to Aristotle’s mechanics. Or maybe the better analogy is the Stalinist era in which everything had to be connected to Marxist principles (followed, perhaps, by an equationful explanation of how the world can be interpreted as if Marxism were valid). Mark Thoma and Paul Krugman seem to agree with me on this one (as does my Barnard colleague Rajiv Sethi). They don’t go so far as to identify utility etc as folk psychology, but maybe that will come next. P.S. Perhaps this will clarify: In a typical economics research pap

6 0.93466288 680 andrew gelman stats-2011-04-26-My talk at Berkeley on Wednesday

7 0.92814261 1010 andrew gelman stats-2011-11-14-“Free energy” and economic resources

8 0.92111921 2317 andrew gelman stats-2014-05-04-Honored oldsters write about statistics

9 0.9204424 1271 andrew gelman stats-2012-04-20-Education could use some systematic evaluation

10 0.92028016 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

11 0.91667557 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

12 0.91023999 98 andrew gelman stats-2010-06-19-Further thoughts on happiness and life satisfaction research

13 0.90818346 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

14 0.90249127 474 andrew gelman stats-2010-12-18-The kind of frustration we could all use more of

15 0.90038586 814 andrew gelman stats-2011-07-21-The powerful consumer?

16 0.90013438 1544 andrew gelman stats-2012-10-22-Is it meaningful to talk about a probability of “65.7%” that Obama will win the election?

17 0.89892 922 andrew gelman stats-2011-09-24-Economists don’t think like accountants—but maybe they should

18 0.8982513 2141 andrew gelman stats-2013-12-20-Don’t douthat, man! Please give this fallacy a name.

19 0.8978951 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

20 0.89681017 1931 andrew gelman stats-2013-07-09-“Frontiers in Massive Data Analysis”