76 andrew gelman stats-2010-06-09-Both R and Stata


Introduction: A student I’m working with writes: I was planning on getting an applied stat text as a desk reference, and for that I’m assuming you’d recommend your own book. Also, being an economics student, I was initially planning on doing my analysis in STATA, but I noticed on your blog that you use R, and apparently so does the rest of the statistics profession. Would you rather I do my programming in R this summer, or does it not matter? It doesn’t look too hard to learn, so just let me know what’s most convenient for you. My reply: Yes, I recommend my book with Jennifer Hill. Also the book by John Fox, An R and S-plus Companion to Applied Regression, is a good way to get into R. I recommend you use both Stata and R. If you’re already familiar with Stata, then stick with it–it’s a great system for working with big datasets. You can grab your data in Stata, do some basic manipulations, then save a smaller dataset to read into R (using R’s read.dta() function). Once you want to make fu


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A student I’m working with writes: I was planning on getting an applied stat text as a desk reference, and for that I’m assuming you’d recommend your own book. [sent-1, score-1.481]

2 Also, being an economics student, I was initially planning on doing my analysis in STATA, but I noticed on your blog that you use R, and apparently so does the rest of the statistics profession. [sent-2, score-0.828]

3 Would you rather I do my programming in R this summer, or does it not matter? [sent-3, score-0.114]

4 It doesn’t look too hard to learn, so just let me know what’s most convenient for you. [sent-4, score-0.177]

5 My reply: Yes, I recommend my book with Jennifer Hill. [sent-5, score-0.375]

6 Also the book by John Fox, An R and S-plus Companion to Applied Regression, is a good way to get into R. [sent-6, score-0.239]

7 If you’re already familiar with Stata, then stick with it–it’s a great system for working with big datasets. [sent-8, score-0.555]

8 You can grab your data in Stata, do some basic manipulations, then save a smaller dataset to read into R (using R’s read.dta() function). [sent-9, score-0.553]

9 Once you want to make fun graphs, R is the way to go. [sent-11, score-0.153]

10 It’s good to have both systems at your disposal. [sent-12, score-0.182]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('stata', 0.519), ('recommend', 0.268), ('planning', 0.23), ('disposal', 0.205), ('companion', 0.185), ('student', 0.167), ('desk', 0.165), ('manipulations', 0.162), ('initially', 0.147), ('grab', 0.143), ('fox', 0.138), ('applied', 0.138), ('stick', 0.13), ('summer', 0.126), ('stat', 0.123), ('working', 0.123), ('convenient', 0.118), ('save', 0.115), ('systems', 0.114), ('programming', 0.114), ('jennifer', 0.109), ('book', 0.107), ('dataset', 0.107), ('text', 0.106), ('smaller', 0.105), ('reference', 0.103), ('familiar', 0.101), ('apparently', 0.099), ('assuming', 0.099), ('noticed', 0.096), ('rest', 0.095), ('function', 0.09), ('fun', 0.089), ('basic', 0.083), ('economics', 0.082), ('use', 0.079), ('system', 0.078), ('learn', 0.077), ('matter', 0.077), ('graphs', 0.076), ('john', 0.073), ('yes', 0.068), ('good', 0.068), ('regression', 0.066), ('already', 0.065), ('way', 0.064), ('getting', 0.062), ('reply', 0.06), ('hard', 0.059), ('great', 0.058)]
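
The page does not say how these word weights or the similarity values in the lists were computed. As a minimal sketch, assuming a standard smoothed tf-idf weighting with L2-normalised vectors and cosine similarity (the function names `tfidf_vectors` and `cosine` and the toy corpus are hypothetical, not taken from the site), the idea looks roughly like this:

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document, L2-normalised.

    Assumption: smoothed idf and plain whitespace tokenisation; the
    site's actual preprocessing is unknown.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenised for word in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Term count times smoothed inverse document frequency.
        weights = {
            w: c * (math.log((1 + n_docs) / (1 + df[w])) + 1)
            for w, c in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
        vectors.append({w: v / norm for w, v in weights.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())


# Hypothetical toy corpus, not the blog's actual text.
docs = [
    "use both stata and r",
    "mister p in stata",
    "a new book on missing data",
]
vecs = tfidf_vectors(docs)
# Similarity of the first document to every document, itself included.
sims = [cosine(vecs[0], v) for v in vecs]
```

Under these assumptions a post compared with itself scores approximately 1, posts that share a heavily weighted word such as "stata" score somewhere between 0 and 1, and posts with no words in common score 0.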

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 76 andrew gelman stats-2010-06-09-Both R and Stata

2 0.17503741 869 andrew gelman stats-2011-08-24-Mister P in Stata

Introduction: Maurizio Pisati sends along this presentation of work with Valeria Glorioso. He writes: “Our major problem, now, is uncertainty estimation — we’re still struggling to find a solution appropriate to the Stata environment.”

3 0.16424081 442 andrew gelman stats-2010-12-01-bayesglm in Stata?

Introduction: Is there an implementation of bayesglm in Stata? (That is, approximate maximum penalized likelihood estimation with specified normal or t prior distributions on the coefficients.)

4 0.14680743 1661 andrew gelman stats-2013-01-08-Software is as software does

Introduction: We had a recent discussion about statistics packages where people talked about the structure and capabilities of different computer languages. One thing I wanted to add to this discussion is some sociology. To me, a statistics package is not just its code, it’s also its community, it’s what people do with it. R, for example, is nothing special for graphics (again, I think in retrospect my graphs would be better if I’d been making them in Fortran all these years); what makes R graphics work so well is that there’s a clear path from the numbers to the graphs, there’s a tradition in R of postprocessing. In comparison, consider Sas. I’ve never directly used Sas but whenever I’ve seen it used, whether by people working for me or with me or just people down the hall who left Sas output sitting in the printer, in all these cases there’s no postprocessing. It doesn’t look interactive at all. The user runs some procedure and then there are pages and pages and pages of output. The po

5 0.13690022 611 andrew gelman stats-2011-03-14-As the saying goes, when they argue that you’re taking over, that’s when you know you’ve won

Introduction: Hey, here’s a book I’m not planning to read any time soon! As Bill James wrote, the alternative to good statistics is not “no statistics,” it’s bad statistics. (I wouldn’t have bothered to bring this one up, but I noticed it on one of our sister blogs.)

6 0.13600068 80 andrew gelman stats-2010-06-11-Free online course in multilevel modeling

7 0.13054068 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

8 0.11610427 198 andrew gelman stats-2010-08-11-Multilevel modeling in R on a Mac

9 0.11492539 749 andrew gelman stats-2011-06-06-“Sampling: Design and Analysis”: a course for political science graduate students

10 0.11103372 107 andrew gelman stats-2010-06-24-PPS in Georgia

11 0.10521677 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!

12 0.10503437 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

13 0.10395713 65 andrew gelman stats-2010-06-03-How best to learn R?

14 0.097324193 462 andrew gelman stats-2010-12-10-Who’s holding the pen?, The split screen, and other ideas for one-on-one instruction

15 0.094775744 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

16 0.09276849 546 andrew gelman stats-2011-01-31-Infovis vs. statistical graphics: My talk tomorrow (Tues) 1pm at Columbia

17 0.091708109 1135 andrew gelman stats-2012-01-22-Advice on do-it-yourself stats education?

18 0.090751767 1948 andrew gelman stats-2013-07-21-Bayes related

19 0.090602726 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

20 0.090408996 91 andrew gelman stats-2010-06-16-RSS mess


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.149), (1, -0.005), (2, -0.049), (3, 0.052), (4, 0.096), (5, 0.054), (6, -0.007), (7, 0.017), (8, 0.023), (9, 0.046), (10, 0.054), (11, -0.021), (12, 0.053), (13, -0.024), (14, 0.078), (15, -0.011), (16, -0.055), (17, 0.017), (18, 0.012), (19, -0.052), (20, 0.043), (21, 0.041), (22, 0.023), (23, 0.062), (24, -0.004), (25, -0.007), (26, 0.05), (27, 0.007), (28, 0.004), (29, 0.01), (30, 0.002), (31, 0.016), (32, 0.035), (33, 0.001), (34, -0.026), (35, 0.061), (36, -0.006), (37, -0.009), (38, -0.063), (39, 0.019), (40, -0.008), (41, -0.008), (42, -0.015), (43, 0.01), (44, 0.05), (45, 0.025), (46, 0.02), (47, 0.032), (48, 0.05), (49, 0.037)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9792136 76 andrew gelman stats-2010-06-09-Both R and Stata

2 0.7751283 1015 andrew gelman stats-2011-11-17-Good examples of lurking variables?

Introduction: Rama Ganesan writes: I have been using many of your demos from the Teaching Stats book . . . Do you by any chance have a nice easy dataset that I can use to show students how ‘lurking variables’ work using regression? For instance, in your book you talk about the relationship between height and salaries – where gender is the hidden variable. Any suggestions?

3 0.76058912 590 andrew gelman stats-2011-02-25-Good introductory book for statistical computation?

Introduction: Geen Tomko asks: Can you recommend a good introductory book for statistical computation? Mostly, something that would help make it easier in collecting and analyzing data from student test scores. I don’t know. Usually, when people ask for a starter statistics book, my recommendation (beyond my own books) is The Statistical Sleuth. But that’s not really a computation book. ARM isn’t really a statistical computation book either. But the statistical computation books that I’ve seen don’t seem so relevant for the analyses that Tomko is looking for. For example, the R book of Venables and Ripley focuses on nonparametric statistics, which is fine but seems a bit esoteric for these purposes. Does anyone have any suggestions?

4 0.75999707 65 andrew gelman stats-2010-06-03-How best to learn R?

Introduction: Alban Zeber writes: I am wondering whether there is a reference (online or book) that you would recommend to someone who is interested in learning how to program in R. Any thoughts? P.S. If I had a name like that, my books would be named, “Bayesian Statistics from A to Z,” “Teaching Statistics from A to Z,” “Regression and Multilevel Modeling from A to Z,” and so forth.

5 0.71635205 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

Introduction: Ben points us to a new book, Flexible Imputation of Missing Data. It’s excellent and I highly recommend it. Definitely worth the $89.95. Van Buuren’s book is great even if you don’t end up using the algorithm described in the book (I actually like their approach but I do think there are some limitations with their particular implementation, which is one reason we’re developing our own package); he supplies lots of intuition, examples, and graphs. P.S. Stef’s book features an introduction by Don Rubin, which gets me thinking: if Don can find the time to write an introduction to somebody else’s book, he surely should be willing to read and comment on the third edition of his own book, no?

6 0.7135641 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”

7 0.69315356 316 andrew gelman stats-2010-10-03-Suggested reading for a prospective statistician?

8 0.69188499 1382 andrew gelman stats-2012-06-17-How to make a good fig?

9 0.68326998 46 andrew gelman stats-2010-05-21-Careers, one-hit wonders, and an offer of a free book

10 0.68302852 1726 andrew gelman stats-2013-02-18-What to read to catch up on multivariate statistics?

11 0.6810652 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book

12 0.68042326 1260 andrew gelman stats-2012-04-11-Hunger Games survival analysis

13 0.67970353 1135 andrew gelman stats-2012-01-22-Advice on do-it-yourself stats education?

14 0.66904277 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?

15 0.66568381 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis

16 0.66480517 1188 andrew gelman stats-2012-02-28-Reference on longitudinal models?

17 0.66344225 8 andrew gelman stats-2010-04-28-Advice to help the rich get richer

18 0.66213572 115 andrew gelman stats-2010-06-28-Whassup with those crappy thrillers?

19 0.66110861 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

20 0.65859127 986 andrew gelman stats-2011-11-01-MacKay update: where 12 comes from


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.036), (24, 0.104), (86, 0.444), (99, 0.292)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98223388 1427 andrew gelman stats-2012-07-24-More from the sister blog

Introduction: Anthropologist Bruce Mannheim reports that a recent well-publicized study on the genetics of native Americans, which used genetic analysis to find “at least three streams of Asian gene flow,” is in fact a confirmation of a long-known fact. Mannheim writes: This three-way distinction was known linguistically since the 1920s (for example, Sapir 1921). Basically, it’s a division among the Eskimo-Aleut languages, which straddle the Bering Straits even today, the Athabaskan languages (which were discovered to be related to a small Siberian language family only within the last few years, not by Greenberg as Wade suggested), and everything else. This is not to say that the results from genetics are unimportant, but it’s good to see how it fits with other aspects of our understanding.

2 0.96562678 1530 andrew gelman stats-2012-10-11-Migrating your blog from Movable Type to WordPress

Introduction: Cord Blomquist, who did a great job moving us from horrible Movable Type to nice nice WordPress, writes: I [Cord] wanted to share a little news with you related to the original work we did for you last year. When ReadyMadeWeb converted your Movable Type blog to WordPress, we got a lot of other requests for the same service, so we started thinking about a bigger market for such a product. After a bit of research, we started work on automating the data conversion, writing rules, and exceptions to the rules, on how Movable Type and TypePad data could be translated to WordPress. After many months of work, we’re getting ready to announce TP2WP.com, a service that converts Movable Type and TypePad export files to WordPress import files, so anyone who wants to migrate to WordPress can do so easily and without losing permalinks, comments, images, or other files. By automating our service, we’ve been able to drop the price to just $99. I recommend it (and, no, Cord is not paying m

3 0.95716691 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel

Introduction: In the spirit of Gapminder, the Washington Post created an interactive scatterplot viewer that uses the alpha channel to tell apart overlapping fat dots, which works better than the sorting by circle size that Gapminder uses: Good news: the rate of fattening of the USA appears to be slowing down. Maybe because of high gas prices? But what’s happening with Oceania?

4 0.93511868 2219 andrew gelman stats-2014-02-21-The world’s most popular languages that the Mac documentation hasn’t been translated into

Introduction: I was updating my Mac and noticed the following: Lots of obscure European languages there. That got me wondering: what’s the least obscure language not on the above list? Igbo? Swahili? Or maybe Tagalog? I did a quick google and found this list of languages by number of native speakers. Once you see the list, the answer is obvious: Hindi, first language of 295 million people, is not on Apple’s list. The next most popular languages not included: Bengali, Punjabi, Javanese, Wu, Telugu, Marathi, Tamil, Urdu. Wow: most of these are Indian! Then comes Persian and a bunch of others. It turns out that Tagalog, Igbo, and Swahili are way down on this list with 28 million, 24 million, and 26 million native speakers, respectively. Only 26 million for Swahili? This made me want to check the list of languages by total number of speakers. The ranking of most of the languages isn’t much different, but Swahili is now #10, at 140 million. Hindi and Bengali are still th

5 0.92578542 873 andrew gelman stats-2011-08-26-Luck or knowledge?

Introduction: Joan Ginther has won the Texas lottery four times. First, she won $5.4 million, then a decade later, she won $2 million, then two years later $3 million and in the summer of 2010, she hit a $10 million jackpot. The odds of this have been calculated at one in eighteen septillion and luck like this could only come once every quadrillion years. According to Forbes, the residents of Bishop, Texas, seem to believe God was behind it all. The Texas Lottery Commission told Mr Rich that Ms Ginther must have been ‘born under a lucky star’, and that they don’t suspect foul play. Harper’s reporter Nathaniel Rich recently wrote an article about Ms Ginther, which calls the validity of her ‘luck’ into question. First, he points out, Ms Ginther is a former math professor with a PhD from Stanford University specialising in statistics. More at Daily Mail. [Edited Saturday] In comments, C Ryan King points to the original article at Harper’s and Bill Jefferys to Wired.

same-blog 6 0.91331375 76 andrew gelman stats-2010-06-09-Both R and Stata

7 0.90783834 904 andrew gelman stats-2011-09-13-My wikipedia edit

8 0.9064045 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker

9 0.9005754 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

10 0.89318848 436 andrew gelman stats-2010-11-29-Quality control problems at the New York Times

11 0.85765868 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”

12 0.85181355 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

13 0.84055072 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

14 0.8340137 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

15 0.82478082 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

16 0.82283562 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!

17 0.81139332 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

18 0.79663312 515 andrew gelman stats-2011-01-13-The Road to a B

19 0.78699243 1586 andrew gelman stats-2012-11-21-Readings for a two-week segment on Bayesian modeling?

20 0.78480554 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!