2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course


meta infos for this blog

Source: html

Introduction: Rajit Dasgupta writes: I have been working on a website, SlideRule, that in its present state is a catalog of online courses aggregated from over 35 providers. One of the products we are building on top of this is something called Learning Paths, which are essentially a sequence of Online Courses designed to help learners gain mastery over a certain subject. We have recently released a Learning Path on Data Analysis, contributed by Claudia Gold, an early data scientist at Airbnb. We’d love it if you could look at it and tell us what you think. We are always looking for constructive feedback. I clicked through and took a look. It’s pretty cool. I haven’t tried to assess the actual teaching materials (they’re mostly about programming, not statistics) but I like how it’s structured based on pointers to existing resources, which seems like an excellent compromise between (a) someone trying to write the material all himself or herself (which would require either limiting the sco


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Rajit Dasgupta writes: I have been working on a website, SlideRule, that in its present state is a catalog of online courses aggregated from over 35 providers. [sent-1, score-0.719]

2 One of the products we are building on top of this is something called Learning Paths, which are essentially a sequence of Online Courses designed to help learners gain mastery over a certain subject. [sent-2, score-1.053]

3 We have recently released a Learning Path on Data Analysis, contributed by Claudia Gold, an early data scientist at Airbnb. [sent-3, score-0.479]

4 The material here is well structured into sections, but each section then has a link. [sent-9, score-0.469]

5 My only suggestion is that this particular learning path be called Data Programming, which seems like a more accurate title than Data Analysis. [sent-10, score-0.826]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('structured', 0.23), ('learning', 0.226), ('courses', 0.217), ('path', 0.208), ('programming', 0.196), ('claudia', 0.177), ('aggregated', 0.177), ('catalog', 0.167), ('material', 0.161), ('learners', 0.16), ('mastery', 0.16), ('competence', 0.16), ('areas', 0.158), ('online', 0.158), ('pointers', 0.154), ('pile', 0.14), ('compromise', 0.132), ('gold', 0.127), ('limiting', 0.127), ('called', 0.126), ('scope', 0.125), ('constructive', 0.125), ('clicked', 0.125), ('sequence', 0.124), ('contributed', 0.122), ('paths', 0.12), ('sections', 0.115), ('materials', 0.112), ('products', 0.112), ('released', 0.108), ('assess', 0.106), ('designed', 0.104), ('gain', 0.102), ('resources', 0.1), ('suggestion', 0.098), ('data', 0.096), ('require', 0.09), ('accurate', 0.09), ('existing', 0.089), ('building', 0.087), ('website', 0.085), ('outside', 0.083), ('mostly', 0.081), ('excellent', 0.081), ('section', 0.078), ('essentially', 0.078), ('title', 0.078), ('early', 0.077), ('scientist', 0.076), ('teaching', 0.076)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

2 0.16019167 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?

Introduction: Eoin Lawless wrote me: I’ve been reading your blog (and John Kruschke’s) for several months now, as a result of starting to learn Bayesian methods from Doing Bayesian Data Analysis [I love the title of that book! --- ed.]. More recently I completed a Coursera course on Data Science. I found learning through the medium of an online course to be an amazing experience. It does not replace books, but learning new material at the same time as other people and discussing it in the forums is very motivational. Additionally it is much easier to work through exercises and projects when there is a deadline and some element of competition than to plow through the end of chapter exercises in a book. This is especially true, I believe, when the learning is for a long term goal, rather than to be used immediately in work, for example. My question: you are obviously evangelical about the benefits that Bayesian statistics brings, have you ever considered producing a Coursera (or similar) cour

3 0.11148881 223 andrew gelman stats-2010-08-21-Statoverflow

Introduction: Skirant Vadali writes: I am writing to seek your help in building a community driven Q&A website tentatively called ‘Statistics Analysis’. I am neither a founder of this website nor do I have any financial stake in its success. By way of background to this website, please see Stackoverflow (http://stackoverflow.com/) and Mathoverflow (http://mathoverflow.net/). Stackoverflow is a Q&A website targeted at software developers and is designed to help them ask questions and get answers from other developers. Mathoverflow is a Q&A website targeted at research mathematicians and is designed to help them ask and answer questions from other mathematicians across the world. The success of both these sites in helping their respective communities is a strong indicator that sites designed along these lines are very useful. The company that runs Stackoverflow (who also host Mathoverflow.net) has recently decided to develop other community driven websites for various other topic are

4 0.10763625 1740 andrew gelman stats-2013-02-26-“Is machine learning a subset of statistics?”

Introduction: Following up on our previous post, Andrew Wilson writes: I agree we are in a really exciting time for statistics and machine learning. There has been a lot of talk lately comparing machine learning with statistics. I am curious whether you think there are many fundamental differences between the fields, or just superficial differences — different popular approximate inference methods, slightly different popular application areas, etc. Is machine learning a subset of statistics? In the paper we discuss how we think machine learning is fundamentally about pattern discovery, and ultimately, fully automating the learning and decision making process. In other words, whatever a human does when he or she uses tools to analyze data, can be written down algorithmically and automated on a computer. I am not sure if the ambitions are similar in statistics — and I don’t have any conventional statistics background, which makes it harder to tell. I think it’s an interesting discussion.

5 0.10434968 515 andrew gelman stats-2011-01-13-The Road to a B

Introduction: A student in my intro class came by the other day with a lot of questions. It soon became clear that he was confused about a lot of things, going back several weeks in the course. What this means is that we did not do a good job of monitoring his performance earlier during the semester. But the question now is: what to do next? I’ll sign the drop form any time during the semester, but he didn’t want to drop the class (the usual scheduling issues). And he doesn’t want to get a C or a D. He’s in big trouble and at this point is basically rolling the dice that he’ll do well enough on the final to eke out a B in the course. (Yes, he goes to section meetings and office hours, and he even tried hiring a tutor. But it’s tough–if you’ve already been going to class and still don’t know what’s going on, it’s not so easy to pull yourself out of the hole, even if you have a big pile of practice problems ahead of you.) What we really need for this student, and others like him, is a road

6 0.10427269 1948 andrew gelman stats-2013-07-21-Bayes related

7 0.10375214 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

8 0.10310107 65 andrew gelman stats-2010-06-03-How best to learn R?

9 0.095654875 1009 andrew gelman stats-2011-11-14-Wickham R short course

10 0.09504132 1056 andrew gelman stats-2011-12-13-Drawing to Learn in Science

11 0.091535002 277 andrew gelman stats-2010-09-14-In an introductory course, when does learning occur?

12 0.089899801 1956 andrew gelman stats-2013-07-25-What should be in a machine learning course?

13 0.086988479 596 andrew gelman stats-2011-03-01-Looking for a textbook for a two-semester course in probability and (theoretical) statistics

14 0.08596278 192 andrew gelman stats-2010-08-08-Turning pages into data

15 0.083358467 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

16 0.082897678 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?

17 0.081883945 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis

18 0.079553731 1538 andrew gelman stats-2012-10-17-Rust

19 0.079405807 1771 andrew gelman stats-2013-03-19-“Ronald Reagan is a Statistician and Other Examples of Learning From Diverse Sources of Information”

20 0.076831862 1431 andrew gelman stats-2012-07-27-Overfitting
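
The page does not state how the simValue column is computed. A common choice for a tfidf model is cosine similarity between TF-IDF vectors, and the following is a minimal sketch under that assumption. The function names and the toy corpus are hypothetical and are not taken from the site's pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency times log inverse document frequency.
        vectors.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, made up for illustration (not the blog's data).
docs = [
    "online courses learning path data programming".split(),
    "online course bayesian data analysis learning".split(),
    "polls public opinion afghanistan graph".split(),
]
vectors = tfidf_vectors(docs)
# Similarity of the first document to every document, itself included.
sims = [cosine(vectors[0], v) for v in vectors]
```

Under this scheme a post compared with itself scores essentially 1, which matches the same-blog row above, and posts that share no weighted words score 0.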


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.14), (1, -0.02), (2, -0.071), (3, 0.026), (4, 0.041), (5, 0.076), (6, -0.035), (7, 0.014), (8, -0.034), (9, 0.018), (10, 0.032), (11, 0.011), (12, 0.015), (13, -0.025), (14, 0.007), (15, -0.014), (16, -0.006), (17, -0.027), (18, 0.019), (19, -0.024), (20, 0.022), (21, 0.024), (22, -0.022), (23, 0.013), (24, -0.044), (25, 0.021), (26, 0.023), (27, -0.034), (28, 0.048), (29, 0.039), (30, 0.029), (31, -0.033), (32, -0.013), (33, -0.011), (34, -0.003), (35, 0.078), (36, -0.007), (37, -0.004), (38, -0.028), (39, -0.007), (40, -0.028), (41, -0.029), (42, -0.02), (43, 0.036), (44, -0.029), (45, 0.046), (46, 0.037), (47, 0.035), (48, 0.05), (49, -0.044)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96465921 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

2 0.8037684 1777 andrew gelman stats-2013-03-26-Data Science for Social Good summer fellowship program

Introduction: Juan-Pablo Velez writes: I’m helping with a Data Science for Social Good summer fellowship program at the University of Chicago. The goal is to train data scientists that can tackle problems in education, healthcare, energy, transportation, and more. Working with full-time mentors from academia, industry, and the Obama campaign, fellows will build high-impact analytics projects using statistics, machine learning, data mining, and big data technologies. For fellows, we’re looking for grad students, advanced undergrads, and professionals in computer science, machine learning, statistics, and the computational and quantitative sciences. For mentors, we’re looking for folks with practical data science experience. Fellows and mentors will be paid competitively and housed in Chicago for the duration of the program, from early June to late August. Rayid Ghani, former Chief Scientist of the Obama 2012 campaign, is leading the program. Eric Sch

3 0.79999995 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

Introduction: Katie Kent writes: I’m with Zipfian Academy – we’re launching next week as the first 12-week immersive program to teach data science. The program combines the hard and soft skills of data science with introductions to the data science community out here in San Francisco. The launch will be covered by a couple big tech blogs, but we’d love to offer the opportunity to blog about it to some smaller and well-respected data science blogs like yours. I don’t know anything about this but I took a look at the website and it looks pretty cool. Maybe in a future iteration of their course, they can teach Stan, once it has a few more useful features such as VB and EP.

4 0.7646898 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

Introduction: David Shor sends along a job announcement for Civis Analytics, which he describes as “basically Obama’s Analytics team reconstituted as a company”: Data Scientist Position Overview Data Scientists are responsible for providing the fundamental data science that powers our work – including predictive analytics, data mining, experimental design and ad-hoc statistical analysis. As a Data Scientist, you will join our Chicago-based data science team, working closely and collaboratively with analysts and engineers to identify, quantify and solve big, meaningful problems. Data Scientists will have the opportunity to dive deeply into big problems and work in a variety of areas. Civis Analytics has opportunities for applicants who are seasoned professionals, brilliant newcomers, and anywhere in between. Qualifications · Master’s degree in statistics, machine learning, computer science with heavy quant focus, a related subject, or a Bachelor’s degree and significant work ex

5 0.75762141 2106 andrew gelman stats-2013-11-19-More on “data science” and “statistics”

Introduction: After reading Rachel and Cathy’s book, I wrote that “Statistics is the least important part of data science . . . I think it would be fair to consider statistics as a subset of data science. . . . it’s not the most important part of data science, or even close.” But then I received “Data Science for Business,” by Foster Provost and Tom Fawcett, in the mail. I might not have opened the book at all (as I’m hardly in the target audience) but for seeing a blurb by Chris Volinsky, a statistician whom I respect a lot. So I flipped through the book and it indeed looked pretty good. It moves slowly but that’s appropriate for an intro book. But what surprised me, given the book’s title and our recent discussion on the nature of data science, was that the book was 100% statistics! It had some math (for example, definitions of various distance measures), some simple algebra, some conceptual graphs such as ROC curve, some tables and graphs of low-dimensional data summaries—but almost

6 0.75392824 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!

7 0.7505427 1276 andrew gelman stats-2012-04-22-“Gross misuse of statistics” can be a good thing, if it indicates the acceptance of the importance of statistical reasoning

8 0.74738759 2084 andrew gelman stats-2013-11-01-Doing Data Science: What’s it all about?

9 0.74725336 1297 andrew gelman stats-2012-05-03-New New York data research organizations

10 0.7470265 1009 andrew gelman stats-2011-11-14-Wickham R short course

11 0.73135912 1517 andrew gelman stats-2012-10-01-“On Inspiring Students and Being Human”

12 0.72942322 1956 andrew gelman stats-2013-07-25-What should be in a machine learning course?

13 0.72445279 1722 andrew gelman stats-2013-02-14-Statistics for firefighters: update

14 0.72235626 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?

15 0.7221691 118 andrew gelman stats-2010-06-30-Question & Answer Communities

16 0.71269757 1837 andrew gelman stats-2013-05-03-NYC Data Skeptics Meetup

17 0.70994461 1960 andrew gelman stats-2013-07-28-More on that machine learning course

18 0.70594645 999 andrew gelman stats-2011-11-09-I was at a meeting a couple months ago . . .

19 0.70447332 1920 andrew gelman stats-2013-06-30-“Non-statistical” statistics tools

20 0.70291603 2307 andrew gelman stats-2014-04-27-Big Data…Big Deal? Maybe, if Used with Caution.


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.038), (5, 0.014), (8, 0.018), (15, 0.015), (17, 0.018), (24, 0.034), (33, 0.037), (47, 0.018), (72, 0.047), (79, 0.048), (86, 0.169), (99, 0.449)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98379254 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

2 0.97316182 515 andrew gelman stats-2011-01-13-The Road to a B

3 0.95914561 1586 andrew gelman stats-2012-11-21-Readings for a two-week segment on Bayesian modeling?

Introduction: Michael Landy writes: I’m in Psych and Center for Neural Science and I’m teaching a doctoral course this term in methods in psychophysics (never mind the details) at the tail end of which I’m planning on at least 2 lectures on Bayesian parameter estimation and Bayesian model comparison. So far, all the readings I have are a bit too obscure and either glancing (bits of machine-learning books: Bishop, MacKay) or too low-level. The only useful reference I’ve got is an application of these methods (a methods article of mine in a Neuroscience Methods journal). The idea is to give them a decent idea of both estimation (Jeffreys priors, marginals of the posterior over the parameters) and model comparison (cross-validation, AIC, BIC, full-blown Bayesian model posterior comparisons, Bayes factor, Occam factor, blah blah blah). So: have you any suggestions for articles or chapters that might be suitable (yes, I’m aware you have an entire book that’s obviously relevant)? In the class topic

4 0.95833814 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

Introduction: Brendan Nyhan points me to this from Don Taylor: Can national data be used to estimate state-level results? . . . A challenge is the fact that the sample size in many states is very small . . . Richard [Gonzales] used a regression approach to extrapolate this information to provide a state-level support for health reform: To get around the challenge presented by small sample sizes, the model presented here combines the benefits of incorporating auxiliary demographic information about the states with the hierarchical modeling approach commonly used in small area estimation. The model is designed to “shrink” estimates toward the average level of support in the region when there are few observations available, while simultaneously adjusting for the demographics and political ideology in the state. This approach therefore takes fuller advantage of all information available in the data to estimate state-level public opinion. This is a great idea, and it is already being used al

5 0.94824654 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!

Introduction: Here’s a good one if you want to tell your students about question wording bias. It’s fun because the data are all on the web–the research is something that students could do on their own–if they know what to look for. Another win for Google. Here’s the story. I found the following graph on the front page of the American Enterprise Institute, a well-known D.C. think tank: My first thought was that they should replace this graph by a time series, which would show so much more information. I did a web search and, indeed, looking at a broad range of poll questions over time gives us a much richer perspective on public opinion about Afghanistan than is revealed in the above graph. I did a quick google search (“polling report afghanistan”) and found this. The quick summary is that roughly 40% of Americans favor the Afghan war (down from about 50% from 2006 through early 2009). The Polling Report page also features the Quinnipiac poll featured in the above graph; here it r

6 0.94605041 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”

7 0.94582701 76 andrew gelman stats-2010-06-09-Both R and Stata

8 0.94396627 2343 andrew gelman stats-2014-05-22-Big Data needs Big Model

9 0.94166362 904 andrew gelman stats-2011-09-13-My wikipedia edit

10 0.93687385 866 andrew gelman stats-2011-08-23-Participate in a research project on combining information for prediction

11 0.93619615 856 andrew gelman stats-2011-08-16-Our new improved blog! Thanks to Cord Blomquist

12 0.93570513 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

13 0.93567014 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

14 0.93461633 2260 andrew gelman stats-2014-03-22-Postdoc at Rennes on multilevel missing data imputation

15 0.93431681 1202 andrew gelman stats-2012-03-08-Between and within-Krugman correlation

16 0.93395841 2127 andrew gelman stats-2013-12-08-The never-ending (and often productive) race between theory and practice

17 0.9328531 767 andrew gelman stats-2011-06-15-Error in an attribution of an error

18 0.932486 579 andrew gelman stats-2011-02-18-What is this, a statistics class or a dentist’s office??

19 0.93217111 462 andrew gelman stats-2010-12-10-Who’s holding the pen?, The split screen, and other ideas for one-on-one instruction

20 0.93204683 2366 andrew gelman stats-2014-06-09-On deck this week