
1009 andrew gelman stats-2011-11-14-Wickham R short course


meta information for this blog

Source: html

Introduction: Hadley writes: I [Hadley] am going to be teaching an R development master class in New York City on Dec 12-13. The basic idea of the class is to help you write better code, focused on the mantra of “do not repeat yourself”. In day one you will learn powerful new tools of abstraction, allowing you to solve a wider range of problems with fewer lines of code. Day two will teach you how to make packages, the fundamental unit of code distribution in R, allowing others to save time by allowing them to use your code. To get the most out of this course, you should have some experience programming in R already: you should be familiar with writing functions, and the basic data structures of R: vectors, matrices, arrays, lists and data frames. You will find the course particularly useful if you’re an experienced R user looking to take the next step, or if you’re moving to R from other programming languages and you want to quickly get up to speed with R’s unique features. A coupl


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Hadley writes: I [Hadley] am going to be teaching an R development master class in New York City on Dec 12-13. [sent-1, score-0.231]

2 The basic idea of the class is to help you write better code, focused on the mantra of “do not repeat yourself”. [sent-2, score-0.583]

3 In day one you will learn powerful new tools of abstraction, allowing you to solve a wider range of problems with fewer lines of code. [sent-3, score-0.841]

4 Day two will teach you how to make packages, the fundamental unit of code distribution in R, allowing others to save time by allowing them to use your code. [sent-4, score-0.96]

5 To get the most out of this course, you should have some experience programming in R already: you should be familiar with writing functions, and the basic data structures of R: vectors, matrices, arrays, lists and data frames. [sent-5, score-0.585]

6 You will find the course particularly useful if you’re an experienced R user looking to take the next step, or if you’re moving to R from other programming languages and you want to quickly get up to speed with R’s unique features. [sent-6, score-1.037]

7 Both days will incorporate a mix of lectures and hands-on learning. [sent-8, score-0.206]

8 Expect to learn about a topic and then immediately put it into practice with a small example. [sent-9, score-0.257]

9 Plenty of help will be available if you get stuck. [sent-10, score-0.224]

10 You’ll receive a printed copy of all slides, as well as electronic access to the slides, code and data. [sent-11, score-0.796]

11 The material covered in the course is currently being turned into a book. [sent-12, score-0.226]

12 This looks great, and I imagine that what you’d learn there would be useful for other data programming, not just for R. [sent-14, score-0.272]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('allowing', 0.289), ('programming', 0.255), ('hadley', 0.226), ('code', 0.201), ('slides', 0.183), ('learn', 0.174), ('access', 0.171), ('mantra', 0.154), ('abstraction', 0.154), ('course', 0.131), ('arrays', 0.13), ('printed', 0.13), ('dec', 0.127), ('vectors', 0.127), ('basic', 0.124), ('available', 0.124), ('outline', 0.121), ('electronic', 0.117), ('master', 0.117), ('class', 0.114), ('languages', 0.11), ('structures', 0.106), ('session', 0.106), ('matrices', 0.106), ('plenty', 0.104), ('draft', 0.104), ('wider', 0.103), ('incorporate', 0.103), ('lectures', 0.103), ('day', 0.101), ('help', 0.1), ('lists', 0.1), ('useful', 0.098), ('experienced', 0.096), ('covered', 0.095), ('unit', 0.095), ('packages', 0.095), ('speed', 0.094), ('receive', 0.092), ('repeat', 0.091), ('powerful', 0.089), ('user', 0.088), ('save', 0.086), ('copy', 0.085), ('fewer', 0.085), ('functions', 0.085), ('immediately', 0.083), ('quickly', 0.083), ('city', 0.082), ('unique', 0.082)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1009 andrew gelman stats-2011-11-14-Wickham R short course


2 0.17561428 1965 andrew gelman stats-2013-08-02-My course this fall on l’analyse bayésienne de données

Introduction: X marks the spot. I’ll post the slides soon (not just for the students in my class; these should be helpful for anyone teaching Bayesian data analysis from our book). But I don’t think you’ll get much from reading the slides alone; you’ll get more out of the book (or, of course, from taking the class).

3 0.16095455 1611 andrew gelman stats-2012-12-07-Feedback on my Bayesian Data Analysis class at Columbia

Introduction: In one of the final Jitts, we asked the students how the course could be improved. Some of their suggestions would work, some would not. I’m putting all the suggestions below, interpolating my responses. (Overall, I think the course went well. Please remember that the remarks below are not course evaluations; they are answers to my specific question of how the course could be better. If we’d had a Jitt asking all the ways the course was good, you’d be seeing lots of positive remarks. But that wouldn’t be particularly useful or interesting.) The best thing about the course is that the kids worked hard each week on their homeworks. OK, here are the comments and my replies: Could have been better if we did less amount but more in detail. I don’t know if this would’ve been possible. I wanted to get to the harder stuff (HMC, VB, nonparametric models) which required a certain amount of preparation. And, even so, there was not time for everything. And also, needs solut

4 0.12657066 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead

Introduction: Christian Robert posts these thoughts: I [Ross Ihaka] have been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R. One of the worst problems is scoping. Consider the following little gem. f = function() { if (runif(1) > .5) x = 10; x } The x being returned by this function is randomly local or global. There are other examples where variables alternate between local and non-local throughout the body of a function. No sensible language would allow this. It’s ugly and it makes optimisation really difficult. This isn’t the only problem, even weirder things happen because of interactions between scoping and lazy evaluation. In light of this, I [Ihaka] have come to the c

5 0.12600677 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?

Introduction: David Karger writes: Your recent post on sharing data was of great interest to me, as my own research in computer science asks how to incentivize and lower barriers to data sharing. I was particularly curious about your highlighting of effort as the major disincentive to sharing. I would love to hear more, as this question of effort is one we specifically target in our development of tools for data authoring and publishing. As a straw man, let me point out that sharing data technically requires no more than posting an Excel spreadsheet online. And that you likely already produced that spreadsheet during your own analytic work. So, in what way does such low-tech publishing fail to meet your data sharing objectives? Our own hypothesis has been that the effort is really quite low, with the problem being a lack of *immediate/tangible* benefits (as opposed to the long-term values you accurately describe). To attack this problem, we’re developing tools (and, since it appear

6 0.12562361 2066 andrew gelman stats-2013-10-17-G+ hangout for test run of BDA course

7 0.11905903 1752 andrew gelman stats-2013-03-06-Online Education and Jazz

8 0.11899808 1948 andrew gelman stats-2013-07-21-Bayes related

9 0.11870791 1799 andrew gelman stats-2013-04-12-Stan 1.3.0 and RStan 1.3.0 Ready for Action

10 0.11844348 2175 andrew gelman stats-2014-01-18-A course in sample surveys for political science

11 0.10625071 727 andrew gelman stats-2011-05-23-My new writing strategy

12 0.10613224 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?

13 0.10398574 1722 andrew gelman stats-2013-02-14-Statistics for firefighters: update

14 0.10230814 266 andrew gelman stats-2010-09-09-The future of R

15 0.10205745 1382 andrew gelman stats-2012-06-17-How to make a good fig?

16 0.099918678 1673 andrew gelman stats-2013-01-15-My talk last night at the visualization meetup

17 0.099311113 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

18 0.095654875 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

19 0.093909554 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys

20 0.091447249 1710 andrew gelman stats-2013-02-06-The new Stan 1.1.1, featuring Gaussian processes!


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.155), (1, -0.016), (2, -0.079), (3, 0.058), (4, 0.114), (5, 0.101), (6, -0.004), (7, -0.003), (8, -0.064), (9, -0.054), (10, 0.007), (11, 0.02), (12, 0.028), (13, -0.045), (14, 0.035), (15, -0.003), (16, -0.003), (17, -0.038), (18, -0.034), (19, -0.005), (20, 0.037), (21, 0.019), (22, -0.022), (23, 0.015), (24, -0.087), (25, 0.059), (26, 0.044), (27, -0.012), (28, 0.15), (29, 0.002), (30, 0.031), (31, -0.055), (32, -0.039), (33, -0.024), (34, 0.068), (35, -0.021), (36, -0.069), (37, 0.005), (38, -0.013), (39, 0.0), (40, 0.008), (41, 0.006), (42, 0.0), (43, -0.008), (44, 0.012), (45, 0.103), (46, 0.001), (47, -0.019), (48, 0.01), (49, -0.05)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97938508 1009 andrew gelman stats-2011-11-14-Wickham R short course


2 0.71583015 325 andrew gelman stats-2010-10-07-Fitting discrete-data regression models in social science

Introduction: My lecture for Greg’s class today (taken from chapters 5-6 of ARM). Also, after class we talked a bit more about formal modeling. If I have time I’ll post some of that discussion here.

3 0.70636767 1722 andrew gelman stats-2013-02-14-Statistics for firefighters: update

Introduction: Following up on our earlier discussion, Daniel Rubenson from Ryerson University in Toronto writes: The course went really well (it was a couple of years ago now). The course was run through a partnership my department has with the Ontario Fire College. Basically, firefighters can do a certificate and sometimes a degree in public administration and part of that is a course on methods. It was a small group — about 8 or so — very motivated guys (all guys). Some of them were chiefs or deputy chiefs from small towns, others captains who were doing the certificate in order to improve their chances for promotion or as a step into a broader public admin career. I had asked them ahead of time to bring with them whatever data they could get their hands on and that they thought would be interesting. This included response times, data on professional v voluntary firefighters, some insurance data and the like. I should mention that it was an intensive mode course. So we had 4.5 days toge

4 0.69140053 1965 andrew gelman stats-2013-08-02-My course this fall on l’analyse bayésienne de données

Introduction: X marks the spot. I’ll post the slides soon (not just for the students in my class; these should be helpful for anyone teaching Bayesian data analysis from our book). But I don’t think you’ll get much from reading the slides alone; you’ll get more out of the book (or, of course, from taking the class).

5 0.68782133 402 andrew gelman stats-2010-11-09-Kaggle: forecasting competitions in the classroom

Introduction: Anthony Goldbloom writes: For those who haven’t come across Kaggle, we are a new platform for data prediction competitions. Companies and researchers put up a dataset and a problem and data scientists compete to produce the best solutions. We’ve just launched a new initiative called Kaggle in Class, allowing instructors to host competitions for their students. Competitions are a neat way to engage students, giving them the opportunity to put into practice what they learn. The platform offers live leaderboards, so students get instant feedback on the accuracy of their work. And since competitions are judged on objective criteria (predictions are compared with outcomes), the platform offers unique assessment opportunities. The first Kaggle in Class competition is being hosted by Stanford University’s Stats 202 class and requires students to predict the price of different wines based on vintage, country, ratings and other information. Those interested in hosting a competition f

6 0.68685955 1956 andrew gelman stats-2013-07-25-What should be in a machine learning course?

7 0.68548781 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)

8 0.68516672 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

9 0.68498439 1736 andrew gelman stats-2013-02-24-Rcpp class in Sat 9 Mar in NYC

10 0.6834166 2100 andrew gelman stats-2013-11-14-BDA class G+ hangout another try

11 0.68335652 1960 andrew gelman stats-2013-07-28-More on that machine learning course

12 0.68100542 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?

13 0.67947227 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

14 0.67867786 198 andrew gelman stats-2010-08-11-Multilevel modeling in R on a Mac

15 0.67701966 1611 andrew gelman stats-2012-12-07-Feedback on my Bayesian Data Analysis class at Columbia

16 0.67547244 96 andrew gelman stats-2010-06-18-Course proposal: Bayesian and advanced likelihood statistical methods for zombies.

17 0.67300951 1311 andrew gelman stats-2012-05-10-My final exam for Design and Analysis of Sample Surveys

18 0.66256762 2175 andrew gelman stats-2014-01-18-A course in sample surveys for political science

19 0.65707827 2064 andrew gelman stats-2013-10-16-Test run for G+ hangout for my Bayesian Data Analysis class

20 0.65554506 1517 andrew gelman stats-2012-10-01-“On Inspiring Students and Being Human”


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.026), (9, 0.095), (16, 0.049), (21, 0.03), (24, 0.145), (30, 0.012), (47, 0.011), (56, 0.023), (65, 0.022), (71, 0.011), (72, 0.013), (84, 0.024), (86, 0.066), (89, 0.091), (99, 0.299)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96883804 1009 andrew gelman stats-2011-11-14-Wickham R short course


2 0.95737618 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity

Introduction: Steve Cohen writes: My [Cohen's] firm is looking for strong candidates to help us in developing software and analyzing data using Bayesian methods. We have been developing a suite of programs in C++ which allow us to do Bayesian hierarchical regression and logit/probit models on marketing data. These efforts have included the use of high performance computing tools like nVidia’s CUDA and the new OpenCL standard, which allow parallel processing of Bayesian models. Our software is very, very fast – even on databases that are ½ terabyte in size. The software still needs many additions and improvements and a person with the right skill set will have the chance to make a significant contribution. Here’s the job description he sent: Bayesian statistician and C++ programmer The company In4mation Insights is a marketing research, analytics, and consulting firm which operates on the leading-edge of our industry. Our clients are Fortune 500 companies and major management consul

3 0.95039433 1903 andrew gelman stats-2013-06-17-Weak identification provides partial information

Introduction: Matt Selove writes: My question is about Bayesian analysis of the linear regression model. It seems to me that in some cases this approach throws out useful information. As an example, imagine you have two basketball players randomly drawn from the pool of NBA players (which provides the prior). You’d like to estimate how many free throws each can make out of 100. You have two pieces of information: - Session 1: Each player shoots 100 shots, and you learn player A’s total minus player B’s total - Session 2: Player A does another session where he shoots 100 shots alone, and you learn his total If we take the regression approach: y_i = number of shots made beta_A = player A’s expected number out of 100 beta_B = player B’s expected number out of 100 x_i = vector of zeros and ones showing which player took shots In the above example, our datapoints are: y_1 (first number reported) = beta_A * 1 + beta_B * (-1) + epsilon_1 y_2 (second number reported) = beta_A * 1 +

4 0.94977856 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

Introduction: David Blei writes: I have two postdoc openings for basic research in probabilistic modeling. The thrusts are (a) scalable inference and (b) model checking. We will be developing new methods and implementing them in probabilistic programming systems. I am open to applicants interested in many kinds of applications and from any field. “Scalable inference” means black-box VB and related ideas, and “probabilistic programming systems” means Stan! (You might be familiar with Stan as an implementation of Nuts for posterior sampling, but Stan is also an efficient program for computing probability densities and their gradients, and as such is an ideal platform for developing scalable implementations of variational inference and related algorithms.) And you know I like model checking. Here’s the full ad: ===== POSTDOC POSITIONS IN PROBABILISTIC MODELING ===== We expect to have two postdoctoral positions available for January 2014 (or later). These positions are in D

5 0.94949555 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

Introduction: Someone just stopped by and dropped off a copy of the book Wizardry: Baseball’s All-time Greatest Fielders Revealed, by Michael Humphreys. I don’t have much to say about the topic–I did see Brooks Robinson play, but I don’t remember any fancy plays. I must have seen Mark Belanger but I don’t really recall. Ozzie Smith was cool but I saw only him on TV. The most impressive thing I ever saw live was Rickey Henderson stealing a base. The best thing about that was that everyone was expecting him to steal the base, and he still was able to do it. But that wasn’t fielding either. Anyway, Humphreys was nice enough to give me a copy of his book, and since I can’t say much (I didn’t have it in me to study the formulas in detail, nor do I know enough to be able to evaluate them), I might as well say what I can say right away. (Note: Humphreys replies to some of these questions in a comment.) 1. Near the beginning, Humphreys says that 10 runs are worth about 1 win. I’ve always b

6 0.94776392 2267 andrew gelman stats-2014-03-26-Is a steal really worth 9 points?

7 0.94720399 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

8 0.94577605 1142 andrew gelman stats-2012-01-29-Difficulties with the 1-4-power transformation

9 0.94316834 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

10 0.94259149 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)

11 0.94072556 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

12 0.94054854 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon

13 0.93989581 560 andrew gelman stats-2011-02-06-Education and Poverty

14 0.93922091 1565 andrew gelman stats-2012-11-06-Why it can be rational to vote

15 0.93921703 389 andrew gelman stats-2010-11-01-Why it can be rational to vote

16 0.93919712 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

17 0.93734694 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

18 0.93697381 1702 andrew gelman stats-2013-02-01-Don’t let your standard errors drive your research agenda

19 0.93618852 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

20 0.93598258 566 andrew gelman stats-2011-02-09-The boxer, the wrestler, and the coin flip, again