andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1655 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and Sas is difficult. I once worked with some students who were running Sas and the output was unreadable! Pages and pages of numbers that made no sense. When it comes to ease or difficulty of use, I think it depends on what you’re used to! And I really don’t understand the bit about aesthetics. What about this? One reason I use R is to make pretty graphs. That said, if I’d never learned R, I’d just be making pretty graphs in Fortran or whatever. My guess is, the way I program, R is actually hindering rather than helping my ability to make attractive graphs. Half the time I’m scrambling around, writing custom code to get around R’s defaults.
sentIndex sentText sentNum sentScore
1 Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. [sent-1, score-0.62]
2 You do not care about aesthetics, only availability of packages and getting results quickly. [sent-2, score-0.527]
3 I once worked with some students who were running Sas and the output was unreadable! [sent-4, score-0.411]
4 When it comes to ease or difficulty of use, I think it depends on what you’re used to! [sent-6, score-0.5]
5 And I really don’t understand the bit about aesthetics. [sent-7, score-0.067]
6 That said, if I’d never learned R, I’d just be making pretty graphs in Fortran or whatever. [sent-10, score-0.37]
7 My guess is, the way I program, R is actually hindering rather than helping my ability to make attractive graphs. [sent-11, score-0.541]
8 Half the time I’m scrambling around, writing custom code to get around R’s defaults. [sent-12, score-0.725]
wordName wordTfidf (topN-words)
[('sas', 0.346), ('pages', 0.23), ('scrambling', 0.228), ('unreadable', 0.206), ('sean', 0.206), ('aesthetics', 0.206), ('custom', 0.199), ('ease', 0.199), ('fortran', 0.193), ('invest', 0.183), ('defaults', 0.183), ('taylor', 0.17), ('availability', 0.165), ('attractive', 0.147), ('output', 0.142), ('helping', 0.141), ('packages', 0.141), ('around', 0.129), ('users', 0.123), ('cowen', 0.119), ('depends', 0.117), ('tyler', 0.116), ('willing', 0.113), ('pretty', 0.112), ('difficulty', 0.109), ('ability', 0.106), ('learned', 0.105), ('links', 0.104), ('half', 0.102), ('running', 0.1), ('code', 0.1), ('program', 0.099), ('learning', 0.097), ('worked', 0.093), ('care', 0.089), ('use', 0.088), ('graphs', 0.084), ('easy', 0.08), ('numbers', 0.079), ('students', 0.076), ('comes', 0.075), ('make', 0.074), ('guess', 0.073), ('reason', 0.07), ('making', 0.069), ('getting', 0.069), ('writing', 0.069), ('understand', 0.067), ('said', 0.065), ('results', 0.063)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1655 andrew gelman stats-2013-01-05-The statistics software signal
2 0.32400784 1661 andrew gelman stats-2013-01-08-Software is as software does
Introduction: We had a recent discussion about statistics packages where people talked about the structure and capabilities of different computer languages. One thing I wanted to add to this discussion is some sociology. To me, a statistics package is not just its code, it’s also its community, it’s what people do with it. R, for example, is nothing special for graphics (again, I think in retrospect my graphs would be better if I’d been making them in Fortran all these years); what makes R graphics work so well is that there’s a clear path from the numbers to the graphs, there’s a tradition in R of postprocessing. In comparison, consider Sas. I’ve never directly used Sas but whenever I’ve seen it used, whether by people working for me or with me or just people down the hall who left Sas output sitting in the printer, in all these cases there’s no postprocessing. It doesn’t look interactive at all. The user runs some procedure and then there are pages and pages and pages of output. The po
3 0.15751781 83 andrew gelman stats-2010-06-13-Silly Sas lays out old-fashioned statistical thinking
Introduction: People keep telling me that Sas isn’t as bad as everybody says, but then I see (from Christian Robert) this listing from the Sas website of “disadvantages in using Bayesian analysis”: There is no correct way to choose a prior. Bayesian inferences require skills to translate prior beliefs into a mathematically formulated prior. If you do not proceed with caution, you can generate misleading results. . . . From a practical point of view, it might sometimes be difficult to convince subject matter experts who do not agree with the validity of the chosen prior. That is so tacky! As if least squares, logistic regressions, Cox models, and all those other likelihoods mentioned in the Sas documentation are so automatically convincing to subject matter experts. P.S. For some more serious objections to Bayesian statistics, see here and here. P.P.S. In case you’re wondering why I’m commenting on month-old blog entries . . . I have a monthlong backlog of entries, and I’m spooling
4 0.12308019 1736 andrew gelman stats-2013-02-24-Rcpp class in Sat 9 Mar in NYC
Introduction: Join Dirk Eddelbuettel for six hours of detailed and hands-on instructions and discussions around Rcpp, RInside, RcppArmadillo, RcppGSL and other packages . . . Rcpp has become the most widely-used language extension for R. Currently deployed by 103 CRAN packages and a further 10 BioConductor packages, it permits users and developers to pass “whole R objects” with ease between R and C++ . . . Morning session: “A Hands-on Introduction to R and C++” . . . Afternoon session: “Advanced R and C++ Topics” . . .
5 0.10908411 1871 andrew gelman stats-2013-05-27-Annals of spam
Introduction: I received the following email, subject line “Want to Buy Text Link from andrewgelman.com”: Dear, I am Mary Taylor. I have started a link building campaign for my growing websites. For this, I need your cooperation. The campaign is quite diverse and large scale and if you take some time to understand it – it will benefit us. First I want to clarify that I do not want “blogroll” ”footer” or any other type of “site wide links”. Secondly I want links from inner pages of site – with good page rank of course. Third links should be within text so that Google may not mark them as spam – not for you and not for me. Hence this link building will cause almost no harm to your site or me. Because content links are fine with Google. Now I should come to the requirements. I will accept links from Page Rank 3 to as high as you have got. Also kindly note that I can buy 1 to 50 links from one site – so you should understand the scale of the project. If you have multiple sites with co
6 0.10035403 1994 andrew gelman stats-2013-08-22-“The comment section is open, but I’m not going to read them”
7 0.09589155 1815 andrew gelman stats-2013-04-20-Displaying inferences from complex models
8 0.094261184 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again
9 0.090628363 1895 andrew gelman stats-2013-06-12-Peter Thiel is writing another book!
11 0.087966137 528 andrew gelman stats-2011-01-21-Elevator shame is a two-way street
12 0.08733774 1885 andrew gelman stats-2013-06-06-Leahy Versus Albedoman and the Moneygoround, Part One
13 0.083285831 608 andrew gelman stats-2011-03-12-Single or multiple imputation?
14 0.082189277 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys
15 0.080480859 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
16 0.080359057 530 andrew gelman stats-2011-01-22-MS-Bayes?
17 0.076091602 266 andrew gelman stats-2010-09-09-The future of R
18 0.074816719 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?
19 0.07450489 2124 andrew gelman stats-2013-12-05-Stan (quietly) passes 512 people on the users list
20 0.072315834 1243 andrew gelman stats-2012-04-03-Don’t do the King’s Gambit
topicId topicWeight
[(0, 0.131), (1, -0.046), (2, -0.035), (3, 0.057), (4, 0.088), (5, 0.001), (6, 0.018), (7, 0.005), (8, -0.006), (9, -0.029), (10, -0.004), (11, 0.004), (12, -0.022), (13, -0.016), (14, 0.006), (15, 0.022), (16, 0.006), (17, -0.029), (18, -0.012), (19, 0.035), (20, 0.015), (21, 0.022), (22, -0.024), (23, 0.043), (24, -0.053), (25, 0.0), (26, 0.018), (27, 0.022), (28, -0.008), (29, 0.017), (30, 0.015), (31, -0.049), (32, 0.009), (33, 0.003), (34, -0.008), (35, -0.03), (36, -0.009), (37, 0.056), (38, 0.027), (39, -0.017), (40, -0.0), (41, -0.02), (42, -0.019), (43, 0.055), (44, 0.007), (45, -0.004), (46, -0.019), (47, 0.034), (48, 0.032), (49, 0.005)]
simIndex simValue blogId blogTitle
same-blog 1 0.9592483 1655 andrew gelman stats-2013-01-05-The statistics software signal
2 0.79797018 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R
Introduction: The new R environment RStudio looks really great, especially for users new to R. In teaching, these are often people new to programming anything, much less statistical models. The R GUIs were different on each platform, with (sometimes modal) windows appearing and disappearing and no unified design. RStudio fixes that and has already found a happy home on my desktop. Initial impressions: I’ve been using it for the past couple of days. For me, it replaces the niche that R.app held: looking at help, quickly doing something I don’t want to pollute a project workspace with; sometimes data munging, merging, and transforming; and prototyping plots. RStudio is better than R.app at all of these things. For actual development and papers, though, I remain wedded to emacs+ess (good old C-x M-c M-Butterfly). Favorite features, in no particular order: plots seamlessly made in new graphics devices. This is huge: instead of one active plot window named something like quartz(1) t
3 0.76994884 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead
Introduction: Christian Robert posts these thoughts: I [Ross Ihaka] have been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R. One of the worst problems is scoping. Consider the following little gem. f = function() { if (runif(1) > .5) x = 10; x } The x being returned by this function is randomly local or global. There are other examples where variables alternate between local and non-local throughout the body of a function. No sensible language would allow this. It’s ugly and it makes optimisation really difficult. This isn’t the only problem, even weirder things happen because of interactions between scoping and lazy evaluation. In light of this, I [Ihaka] have come to the c
4 0.75062037 266 andrew gelman stats-2010-09-09-The future of R
Introduction: Some thoughts from Christian, including this bit: We need to consider separately 1. R’s brilliant library 2. R’s not-so-brilliant language and/or interpreter. I don’t know that R’s library is so brilliant as all that–if necessary, I don’t think it would be hard to reprogram the important packages in a new language. I would say, though, that the problems with R are not just in the technical details of the language. I think the culture of R has some problems too. As I’ve written before, R functions used to be lean and mean, and now they’re full of exception-handling and calls to other packages. R functions are spaghetti-like messes of connections in which I keep expecting to run into syntax like “GOTO 120.” I learned about these problems a couple years ago when writing bayesglm(), which is a simple adaptation of glm(). But glm(), and its workhorse, glm.fit(), are a mess: They’re about 10 lines of functioning code, plus about 20 lines of necessary front-end, plus a cou
5 0.74917525 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station
Introduction: In reaction to this news article by Michael Kimmelman, I’d like to repost this from four years ago: Walking through Penn Station in New York, I remembered how much I love its open structure. By “open,” I don’t mean bright and airy. I mean “open” in a topological sense. The station has three below-ground levels–the uppermost has ticket counters (and, what is more relevant nowadays, ticket machines), some crappy stores and restaurants, and a crappy waiting area. The middle level has Long Island Rail Road ticket counters, some more crappy stores and restaurants, and entrances to the 7th and 8th Avenue subway lines. The lower level has train tracks and platforms. There are stairs, escalators, and elevators going everywhere. As a result, it’s easy to get around, there are lots of shortcuts, and the train loads fast–some people come down the escalators and elevators from the top level, others take the stairs from the middle level. The powers-that-be keep threatening to spend a coupl
6 0.74408484 1536 andrew gelman stats-2012-10-16-Using economics to reduce bike theft
7 0.73151189 1154 andrew gelman stats-2012-02-04-“Turn a Boring Bar Graph into a 3D Masterpiece”
8 0.72936499 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns
9 0.72047746 324 andrew gelman stats-2010-10-07-Contest for developing an R package recommendation system
10 0.71462083 1716 andrew gelman stats-2013-02-09-iPython Notebook
11 0.7098999 1520 andrew gelman stats-2012-10-03-Advice that’s so eminently sensible but so difficult to follow
12 0.70675153 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
13 0.70419586 1661 andrew gelman stats-2013-01-08-Software is as software does
14 0.7027601 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?
15 0.69986886 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission
16 0.69934607 1596 andrew gelman stats-2012-11-29-More consulting experiences, this time in computational linguistics
17 0.69780821 395 andrew gelman stats-2010-11-05-Consulting: how do you figure out what to charge?
18 0.69036949 736 andrew gelman stats-2011-05-29-Response to “Why Tables Are Really Much Better Than Graphs”
19 0.68824625 793 andrew gelman stats-2011-07-09-R on the cloud
20 0.68704194 166 andrew gelman stats-2010-07-27-The Three Golden Rules for Successful Scientific Research
topicId topicWeight
[(16, 0.109), (24, 0.127), (35, 0.019), (73, 0.016), (77, 0.019), (80, 0.022), (86, 0.04), (90, 0.285), (99, 0.251)]
simIndex simValue blogId blogTitle
1 0.9730345 2259 andrew gelman stats-2014-03-22-Picking pennies in front of a steamroller: A parable comes to life
Introduction: From 2011: Chapter 1 On Sunday we were over on 125 St so I stopped by the Jamaican beef patties place but they were closed. Jesus Taco was next door so I went there instead. What a mistake! I don’t know what Masanao and Yu-Sung could’ve been thinking. Anyway, then I had Jamaican beef patties on the brain so I went by Monday afternoon and asked for 9: 3 spicy beef, 3 mild beef (for the kids), and 3 chicken (not the jerk chicken; Bob got those the other day and they didn’t impress me). I’m about to pay and then a bunch of people come in and start ordering. The woman behind the counter asks if I’m in a hurry, I ask why, she whispers, For the same price you can get a dozen. So I get two more spicy beef and a chicken. She whispers that I shouldn’t tell anyone. I can’t really figure out why I’m getting this special treatment. So I walk out of there with 12 patties. Total cost: $17.25. It’s a good deal: they’re small but not that small. Sure, I ate 6 of them, but I was h
2 0.96283317 512 andrew gelman stats-2011-01-12-Picking pennies in front of a steamroller: A parable comes to life
Introduction: Chapter 1 On Sunday we were over on 125 St so I stopped by the Jamaican beef patties place but they were closed. Jesus Taco was next door so I went there instead. What a mistake! I don’t know what Masanao and Yu-Sung could’ve been thinking. Anyway, then I had Jamaican beef patties on the brain so I went by Monday afternoon and asked for 9: 3 spicy beef, 3 mild beef (for the kids), and 3 chicken (not the jerk chicken; Bob got those the other day and they didn’t impress me). I’m about to pay and then a bunch of people come in and start ordering. The woman behind the counter asks if I’m in a hurry, I ask why, she whispers, For the same price you can get a dozen. So I get two more spicy beef and a chicken. She whispers that I shouldn’t tell anyone. I can’t really figure out why I’m getting this special treatment. So I walk out of there with 12 patties. Total cost: $17.25. It’s a good deal: they’re small but not that small. Sure, I ate 6 of them, but I was hungry. Chapt
3 0.94341171 475 andrew gelman stats-2010-12-19-All politics are local — not
Introduction: Mickey Kaus does a public service by trashing Tip O’Neill’s famous dictum that “all politics are local.” As Kaus points out, all the congressional elections in recent decades have been nationalized. I’d go one step further and say that, sure, all politics are local–if you’re Tip O’Neill and represent an ironclad Democratic seat in Congress. It’s easy to be smug about your political skills if you’re in a safe seat and have enough pull in state politics to avoid your district getting gerrymandered. Then you can sit there and sagely attribute your success to your continuing mastery of local politics rather than to whatever it took to get the seat in the first place.
4 0.88834882 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?
Introduction: Cassie Murdoch reports: A 47-year-old woman in Uxbridge, Massachusetts, got behind the wheel of her car after having a bit too much to drink, but instead of wreaking havoc on the road, she ended up lodged in a sand trap at a local golf course. Why? Because her GPS made her do it—obviously! She said the GPS told her to turn left, and she did, right into a cornfield. That didn’t faze her, and she just kept on going until she ended up on the golf course and got stuck in the sand. There were people on the course at the time, but thankfully nobody was injured. Police found a cup full of alcohol in her car and arrested her for driving drunk. Here’s the punchline: This is the fourth time she’s been arrested for a DUI. Assuming this story is accurate, I guess they don’t have one of those “three strikes” laws in Massachusetts? Personally, I’m a lot more afraid of a dangerous driver than of some drug dealer. I’d think a simple cost-benefit calculation would recommend taking away
same-blog 5 0.86833179 1655 andrew gelman stats-2013-01-05-The statistics software signal
7 0.84985936 1411 andrew gelman stats-2012-07-10-Defining ourselves arbitrarily
8 0.82764292 15 andrew gelman stats-2010-05-03-Public Opinion on Health Care Reform
9 0.81506455 1947 andrew gelman stats-2013-07-20-We are what we are studying
10 0.7991904 478 andrew gelman stats-2010-12-20-More on why “all politics is local” is an outdated slogan
11 0.78581059 766 andrew gelman stats-2011-06-14-Last Wegman post (for now)
14 0.76857233 1932 andrew gelman stats-2013-07-10-Don’t trust the Turk
15 0.76616728 1842 andrew gelman stats-2013-05-05-Cleaning up science
16 0.76253045 1163 andrew gelman stats-2012-02-12-Meta-analysis, game theory, and incentives to do replicable research
17 0.75863171 762 andrew gelman stats-2011-06-13-How should journals handle replication studies?
18 0.75637007 530 andrew gelman stats-2011-01-22-MS-Bayes?
19 0.75509351 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity
20 0.74489123 630 andrew gelman stats-2011-03-27-What is an economic “conspiracy theory”?