2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science



Introduction: Katie Kent writes: I’m with Zipfian Academy – we’re launching next week as the first 12-week immersive program to teach data science. The program combines the hard and soft skills of data science with introductions to the data science community out here in San Francisco. The launch will be covered by a couple big tech blogs, but we’d love to offer the opportunity to blog about it to some smaller and well-respected data science blogs like yours. I don’t know anything about this but I took a look at the website and it looks pretty cool. Maybe in a future iteration of their course, they can teach Stan, once it has a few more useful features such as VB and EP.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Katie Kent writes: I’m with Zipfian Academy – we’re launching next week as the first 12-week immersive program to teach data science. [sent-1, score-1.126]

2 The program combines the hard and soft skills of data science with introductions to the data science community out here in San Francisco. [sent-2, score-1.581]

3 The launch will be covered by a couple big tech blogs, but we’d love to offer the opportunity to blog about it to some smaller and well-respected data science blogs like yours. [sent-3, score-1.864]

4 I don’t know anything about this but I took a look at the website and it looks pretty cool. [sent-4, score-0.557]

5 Maybe in a future iteration of their course, they can teach Stan, once it has a few more useful features such as VB and EP. [sent-5, score-0.797]
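The page does not say how sentScore is computed. A minimal sketch, assuming each score is the sum of the tfidf weights of the distinct words in the sentence: under that assumption the weights listed for this post reproduce the reported scores for sentences 3 and 4 exactly, but not for every sentence, so this is an approximation of whatever the generator actually does. The names `WEIGHTS` and `sentence_score` are hypothetical.

```python
import re

# Subset of the tfidf weights listed for this post (copied from the word list);
# only the words that occur in sentences 3 and 4 are included here.
WEIGHTS = {
    "blogs": 0.285, "launch": 0.209, "tech": 0.187, "science": 0.184,
    "covered": 0.157, "data": 0.138, "smaller": 0.13, "offer": 0.117,
    "opportunity": 0.114, "love": 0.104, "couple": 0.087, "big": 0.068,
    "blog": 0.056, "like": 0.028,
    "website": 0.122, "took": 0.101, "looks": 0.093, "anything": 0.072,
    "look": 0.066, "pretty": 0.062, "know": 0.041,
}

def sentence_score(sentence, weights):
    """Sum the weights of the distinct known words in a sentence (assumed rule)."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return round(sum(weights.get(w, 0.0) for w in words), 3)

SENT_3 = ("The launch will be covered by a couple big tech blogs, but we'd love "
          "to offer the opportunity to blog about it to some smaller and "
          "well-respected data science blogs like yours.")
SENT_4 = ("I don't know anything about this but I took a look at the website "
          "and it looks pretty cool.")

print(sentence_score(SENT_3, WEIGHTS))  # 1.864, matching sent-3 above
print(sentence_score(SENT_4, WEIGHTS))  # 0.557, matching sent-4 above
```

Counting each word once matters here: "blogs" appears twice in sentence 3, and the reported score only matches when it is counted a single time.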


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('blogs', 0.285), ('teach', 0.266), ('launching', 0.254), ('vb', 0.24), ('ep', 0.221), ('program', 0.22), ('iteration', 0.215), ('launch', 0.209), ('kent', 0.209), ('combines', 0.2), ('academy', 0.19), ('tech', 0.187), ('science', 0.184), ('san', 0.184), ('soft', 0.177), ('covered', 0.157), ('data', 0.138), ('skills', 0.138), ('features', 0.13), ('smaller', 0.13), ('community', 0.129), ('website', 0.122), ('offer', 0.117), ('opportunity', 0.114), ('week', 0.114), ('stan', 0.111), ('future', 0.105), ('love', 0.104), ('took', 0.101), ('looks', 0.093), ('couple', 0.087), ('next', 0.085), ('useful', 0.081), ('hard', 0.073), ('course', 0.072), ('anything', 0.072), ('big', 0.068), ('look', 0.066), ('pretty', 0.062), ('blog', 0.056), ('maybe', 0.056), ('first', 0.049), ('re', 0.045), ('know', 0.041), ('writes', 0.038), ('like', 0.028)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

2 0.13824648 591 andrew gelman stats-2011-02-25-Quantitative Methods in the Social Sciences M.A.: Innovative, interdisciplinary social science research program for a data-rich world

Introduction: About 12 years ago Greg Wawro, Sy Spilerman, and I started an M.A. program here in Quantitative Methods in Social Sciences, jointly between the departments of history, economics, political science, sociology, psychology, and statistics. We created a bunch of new features for the program, including an interdisciplinary course based on this book. And here’s their new logo: Don’t blame me for the pie-chart motif! Seriously, though, the program is great. I’m proud to have gotten it started, and I’m impressed by the progress that Chris Weiss and others have made in expanding the program during the past decade.

3 0.12251987 1864 andrew gelman stats-2013-05-20-Evaluating Columbia University’s Frontiers of Science course

Introduction: Frontiers of Science is a course offered as part of Columbia University’s Core Curriculum. The course is controversial, with some people praising its overview of several areas of science, and others feeling that a more traditional set of introductory science courses would do the job better. Last month, the faculty in charge of the course wrote the following public letter : The United States is in the midst of a debate over the value of a traditional college education. Why enroll in a place like Columbia College when you can obtain an undergraduate degree for $10,000 or learn everything from Massive Open Online Courses? In more parochial terms, what is the value added by approaches such as Columbia’s Core Curriculum? Recently students in our Core Course, Frontiers of Science (FoS), provided a partial answer. The FoS faculty designed a survey to gauge the scientific skills and knowledge of the Class of 2016 both before and after taking FoS. In an assembly held during orientati

4 0.10349081 2084 andrew gelman stats-2013-11-01-Doing Data Science: What’s it all about?

Introduction: Rachel Schutt and Cathy O’Neil just came out with a wonderfully readable book on doing data science, based on a course Rachel taught last year at Columbia. Rachel is a former Ph.D. student of mine and so I’m inclined to have a positive view of her work; on the other hand, I did actually look at the book and I did find it readable! What do I claim is the least important part of data science? Here’s what Schutt and O’Neil say regarding the title: “Data science is not just a rebranding of statistics or machine learning but rather a field unto itself.” I agree. There’s so much that goes on with data that is about computing, not statistics. I do think it would be fair to consider statistics (which includes sampling, experimental design, and data collection as well as data analysis (which itself includes model building, visualization, and model checking as well as inference)) as a subset of data science. The question then arises: why do descriptions of data science focus so

5 0.10100476 1611 andrew gelman stats-2012-12-07-Feedback on my Bayesian Data Analysis class at Columbia

Introduction: In one of the final Jitts, we asked the students how the course could be improved. Some of their suggestions would work, some would not. I’m putting all the suggestions below, interpolating my responses. (Overall, I think the course went well. Please remember that the remarks below are not course evaluations; they are answers to my specific question of how the course could be better. If we’d had a Jitt asking all the ways the course was good, you’d be seeing lots of positive remarks. But that wouldn’t be particularly useful or interesting.) The best thing about the course is that the kids worked hard each week on their homeworks. OK, here are the comments and my replies: Could have been better if we did less amount but more in detail. I don’t know if this would’ve been possible. I wanted to get to the harder stuff (HMC, VB, nonparametric models) which required a certain amount of preparation. And, even so, there was not time for everything. And also, needs solut

6 0.1009758 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box

7 0.10088357 1298 andrew gelman stats-2012-05-03-News from the sister blog!

8 0.099923819 1832 andrew gelman stats-2013-04-29-The blogroll

9 0.099887297 1853 andrew gelman stats-2013-05-12-OpenData Latinoamerica

10 0.099735513 2282 andrew gelman stats-2014-04-05-Bizarre academic spam

11 0.09922199 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

12 0.096572846 1622 andrew gelman stats-2012-12-14-Can gambling addicts be identified in gambling venues?

13 0.096279174 1217 andrew gelman stats-2012-03-17-NSF program “to support analytic and methodological research in support of its surveys”

14 0.088005751 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

15 0.086221695 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program

16 0.081200674 2106 andrew gelman stats-2013-11-19-More on “data science” and “statistics”

17 0.08113198 1476 andrew gelman stats-2012-08-30-Stan is fast

18 0.080917194 1856 andrew gelman stats-2013-05-14-GPstuff: Bayesian Modeling with Gaussian Processes

19 0.080884069 1308 andrew gelman stats-2012-05-08-chartsnthings !

20 0.07984601 1009 andrew gelman stats-2011-11-14-Wickham R short course
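The simValue column is not defined on the page. A minimal sketch, assuming it is the cosine similarity between the tfidf vectors of two posts: that reading is at least consistent with the same-blog row scoring 1.0 up to rounding error. The function names and the toy vectors are hypothetical and are not taken from the generator.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

def rank_similar(query, corpus, top_n=20):
    """Return (similarity, doc_id) pairs, most similar first."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    return sorted(scored, reverse=True)[:top_n]

# Toy vectors with made-up weights, for illustration only.
corpus = {
    "post-a": {"blogs": 0.3, "teach": 0.3, "science": 0.2},
    "post-b": {"science": 0.4, "program": 0.3},
    "post-c": {"gambling": 0.5, "venues": 0.4},
}

for sim, doc_id in rank_similar(corpus["post-a"], corpus):
    print(f"{sim:.4f} {doc_id}")
```

The same function would apply to the lsi and lda lists, since a list of (topicId, topicWeight) pairs converts to the expected mapping with `dict(pairs)`.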


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.113), (1, -0.033), (2, -0.082), (3, 0.027), (4, 0.049), (5, 0.087), (6, -0.041), (7, -0.053), (8, -0.078), (9, -0.014), (10, -0.03), (11, -0.007), (12, 0.007), (13, -0.042), (14, -0.01), (15, 0.01), (16, -0.029), (17, -0.016), (18, -0.011), (19, -0.006), (20, 0.022), (21, -0.007), (22, -0.065), (23, 0.006), (24, -0.033), (25, -0.006), (26, 0.021), (27, -0.032), (28, 0.01), (29, 0.007), (30, 0.034), (31, -0.051), (32, -0.027), (33, -0.041), (34, -0.029), (35, 0.073), (36, -0.03), (37, 0.028), (38, -0.016), (39, -0.016), (40, -0.018), (41, 0.01), (42, -0.015), (43, -0.024), (44, -0.008), (45, 0.057), (46, 0.003), (47, -0.017), (48, 0.008), (49, 0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95457697 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

2 0.73946869 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!

Introduction: I was told about an organization called Reproducibility Initiative. They tell me they are trying to make what was described in our “50 shades of gray” post standard across all of science, particularly areas like cancer research. I don’t know anything else about them, but that sounds like a good start! Here’s the ad: Data Scientist: Science Exchange, Palo Alto, CA Science Exchange is an innovative start-up with a mission to improve the efficiency and quality of scientific research. This Data Science position is critical to our mission. Our ideal candidate has the ability to collect and normalize data from multiple sources. This information will be used to drive marketing and product decisions, as well as fuel many of the features of Science Exchange. Desired Skills & Experience Experience with text mining, entity extraction and natural language processing is essential Experience scripting with either Python or R Experience running complex statistical analyses on l

3 0.73647743 2173 andrew gelman stats-2014-01-15-Postdoc involving pathbreaking work in MRP, Stan, and the 2014 election!

Introduction: We’re working with polling company YouGov to track public opinion, state-by-state and district-by-district, during the 2014 campaign. We’ll be using multilevel regression and poststratification, and implementing it in Stan, and developing the necessary new parts of Stan to get this running scalably and efficiently. And we’ll be making the most detailed, up-to-date election forecasts. What you’ll be doing if you join us as a postdoc: - You’ll be in the midst of the most advanced polling team anywhere; - You’ll be doing cutting-edge statistical research on MRP with deep interactions; - You’ll be doing basic research in statistical computing, developing fast and scalable deterministic and stochastic algorithms for fitting multilevel models; - You’ll be working inside Stan, the most advanced general computational framework for Bayesian analysis. We’re doing research, not just implementing existing methods. What we need: - Stats knowledge. You should know your way around Ba

4 0.72578228 1777 andrew gelman stats-2013-03-26-Data Science for Social Good summer fellowship program

Introduction: Juan-Pablo Velez writes: I’m helping with a Data Science for Social Good summer fellowship program at the University of Chicago. The goal is to train data scientists that can tackle problems in education, healthcare, energy, transportation, and more. Working with full-time mentors from academia, industry, and the Obama campaign, fellows will build high-impact analytics projects using statistics, machine learning, data mining, and big data technologies. For fellows, we’re looking for grad students, advanced undergrads, and professionals in computer science, machine learning, statistics, and the computational and quantitative sciences. For mentors, we’re looking for folks with practical data science experience. Fellows and mentors will be paid competitively and housed in Chicago for the duration of the program, from early June to late August. Rayid Ghani, former Chief Scientist of the Obama 2012 campaign, is leading the program. Eric Sch

5 0.70997244 2221 andrew gelman stats-2014-02-23-Postdoc with Huffpost Pollster to do Bayesian poll tracking

Introduction: Mark Blumenthal writes: HuffPost Pollster has an immediate opening for a social and data scientist to join us full time, preferably in our Washington D.C. bureau, to work on development and improvement of our poll tracking models and political forecasts. You are someone who has: * A passion for electoral politics, * Advanced training in statistics and dynamic Bayesian data analysis, * A Ph.D. in statistics, political science, economics or the social sciences or comparable high level training or experience, * A desire to make a lasting contribution in the way the news media cover polls and elections. We are: * The award-winning website formerly known as Pollster.com, which joined the Huffington Post in 2010 and remains the internet’s premier source for uniquely interactive polling charts and electorate forecasts and a running daily commentary that explains, demystifies and critiques political polling. * Home to the open source Pollster API, which provides academic

6 0.70993954 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

7 0.7079007 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program

8 0.69679779 1009 andrew gelman stats-2011-11-14-Wickham R short course

9 0.69291705 1711 andrew gelman stats-2013-02-07-How Open Should Academic Papers Be?

10 0.68033922 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

11 0.67120147 1217 andrew gelman stats-2012-03-17-NSF program “to support analytic and methodological research in support of its surveys”

12 0.66413879 714 andrew gelman stats-2011-05-16-NYT Labs releases Openpaths, a utility for saving your iphone data

13 0.652722 2325 andrew gelman stats-2014-05-07-Stan users meetup next week

14 0.65108401 1909 andrew gelman stats-2013-06-21-Job openings at conservative political analytics firm!

15 0.6479739 1276 andrew gelman stats-2012-04-22-“Gross misuse of statistics” can be a good thing, if it indicates the acceptance of the importance of statistical reasoning

16 0.64572698 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

17 0.64301258 1923 andrew gelman stats-2013-07-03-Bayes pays!

18 0.63967574 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

19 0.63065982 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

20 0.62791926 1687 andrew gelman stats-2013-01-21-Workshop on science communication for graduate students


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.028), (24, 0.097), (27, 0.042), (28, 0.024), (36, 0.032), (40, 0.031), (47, 0.031), (48, 0.048), (52, 0.052), (53, 0.045), (98, 0.054), (99, 0.391)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9817757 2016 andrew gelman stats-2013-09-11-Zipfian Academy, A School for Data Science

2 0.96189761 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +-- 2

Introduction: Warning – this blog post is meant to encourage some loose, fuzzy and possibly distracting thoughts about the practice of statistics in research endeavours. There maybe spelling and grammatical errors as well as a lack of proper sentence structure. It may not be understandable to many or even possibly any readers. But somewhat more seriously, its better that “ConUnMax” So far I have five maxims 1. Explicit models of uncertanty are useful but – always wrong and can always be made less wrong 2. If the model is formally a probability model – always use probability calculus (Bayes) 3. Always useful to make the model a formal probability model – no matter what (Bayesianisn) 4. Never use a model that is not empirically motivated and strongly empirically testable (Frequentist – of the anti-Bayesian flavour) 5. Quantitative tools are always just a means to grasp and manipulate models – never an end in itself (i.e. don’t obsess over “baby” mathematics) 6. If one really understood st

3 0.9603712 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)

Introduction: Here’s the link. When you’re on the hangout, please mute your own microphone! I’ll have the computer point at the blackboard. You can follow along with the slides: for the first hour for the second hour P.S. Apparently there is some limit on number of hangout participants (see comments). I didn’t know about that! Maybe next time will try “on air” hangout, I will have to learn more about this. Next week the teaching asst will do the course so no hangout, then in two weeks there is no class because it’s the day after Halloween and that’s a holiday around here. So we’ll resume this on Fri 8 Nov. See you then! P.P.S. Those of you who were able to join the hangout: Could you please let me know how the visual and sound quality were? Thanks.

4 0.96024603 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

Introduction: 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution to question 2 From yesterday : 2. Which of the following are useful goals in a pilot study? (Indicate all that apply.) (a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset. (b) You can see if you find statistical significance in a pre-chosen comparison of interest. (c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest. (d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a po

5 0.95816875 1965 andrew gelman stats-2013-08-02-My course this fall on l’analyse bayésienne de données

Introduction: X marks the spot. I’ll post the slides soon (not just for the students in my class; these should be helpful for anyone teaching Bayesian data analysis from our book). But I don’t think you’ll get much from reading the slides alone; you’ll get more out of the book (or, of course, from taking the class).

6 0.95542008 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

7 0.95330536 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?

8 0.95217162 607 andrew gelman stats-2011-03-11-Rajiv Sethi on the interpretation of prediction market data

9 0.95202106 88 andrew gelman stats-2010-06-15-What people do vs. what they want to do

10 0.95179605 1585 andrew gelman stats-2012-11-20-“I know you aren’t the plagiarism police, but . . .”

11 0.95089793 751 andrew gelman stats-2011-06-08-Another Wegman plagiarism

12 0.95047879 318 andrew gelman stats-2010-10-04-U-Haul statistics

13 0.94963658 1670 andrew gelman stats-2013-01-13-More Bell Labs happy talk

14 0.94962537 638 andrew gelman stats-2011-03-30-More on the correlation between statistical and political ideology

15 0.94933474 153 andrew gelman stats-2010-07-17-Tenure-track position at U. North Carolina in survey methods and social statistics

16 0.94932461 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship

17 0.94928408 1703 andrew gelman stats-2013-02-02-Interaction-based feature selection and classification for high-dimensional biological data

18 0.94921356 1536 andrew gelman stats-2012-10-16-Using economics to reduce bike theft

19 0.94899035 1095 andrew gelman stats-2012-01-01-Martin and Liu: Probabilistic inference based on consistency of model with data

20 0.94881797 1982 andrew gelman stats-2013-08-15-Blaming scientific fraud on the Kuhnians