1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup


meta info for this blog

Source: html

Introduction: Statistical Methods and Data Skepticism Data analysis today is dominated by three paradigms: null hypothesis significance testing, Bayesian inference, and exploratory data analysis. There is concern that all these methods lead to overconfidence on the part of researchers and the general public, and this concern has led to the new “data skepticism” movement. But the history of statistics is already in some sense a history of data skepticism. Concepts of bias, variance, sampling and measurement error, least-squares regression, and statistical significance can all be viewed as formalizations of data skepticism. All these methods address the concern that patterns in observed data might not generalize to the population of interest. We discuss the challenge of attaining data skepticism while avoiding data nihilism, and consider some proposed future directions. Stan Stan (mc-stan.org) is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Statistical Methods and Data Skepticism Data analysis today is dominated by three paradigms: null hypothesis significance testing, Bayesian inference, and exploratory data analysis. [sent-1, score-0.627]

2 There is concern that all these methods lead to overconfidence on the part of researchers and the general public, and this concern has led to the new “data skepticism” movement. [sent-2, score-0.806]

3 But the history of statistics is already in some sense a history of data skepticism. [sent-3, score-0.477]

4 Concepts of bias, variance, sampling and measurement error, least-squares regression, and statistical significance can all be viewed as formalizations of data skepticism. [sent-4, score-0.651]

5 All these methods address the concern that patterns in observed data might not generalize to the population of interest. [sent-5, score-0.741]

6 We discuss the challenge of attaining data skepticism while avoiding data nihilism, and consider some proposed future directions. [sent-6, score-1.053]

7 Stan (mc-stan.org) is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. [sent-8, score-0.448]

8 We are also developing Stan as a more general statistical modeling and computing platform that will be able to do optimization, variational inference, and expectation propagation, as well as full Bayes. [sent-9, score-0.762]

9 Unfortunately something came up and I won’t be able to do either of those talks. [sent-11, score-0.106]

10 An old version of the Stan talk is here but I was planning to present some new material too. [sent-14, score-0.084]
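The page does not say how the sentence scores above are computed. As a rough sketch only, one common approach is to score each sentence by the sum of the tf-idf weights of its words, so that sentences built from rarer words rank higher. The function names (tokenize, idf_table, score_sentences), the smoothed idf formula, and the summing rule are all assumptions made for illustration, not the generator's actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def idf_table(documents):
    """Smoothed inverse document frequency for every word in the corpus.

    Assumed formula: log((1 + N) / (1 + df)) + 1, which keeps weights positive.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document.
        doc_freq.update(set(tokenize(doc)))
    return {
        word: math.log((1 + n_docs) / (1 + df)) + 1.0
        for word, df in doc_freq.items()
    }


def score_sentences(sentences, idf):
    """Return (index, score, sentence) tuples; score = sum of tf * idf."""
    scored = []
    for index, sentence in enumerate(sentences, start=1):
        counts = Counter(tokenize(sentence))
        # Words missing from the idf table contribute nothing.
        score = sum(tf * idf.get(word, 0.0) for word, tf in counts.items())
        scored.append((index, round(score, 3), sentence))
    return scored
```

To use it, build idf_table once over the whole blog corpus and pass one post's sentences to score_sentences. The absolute values will not match the scores listed above, since the original weighting and normalization are unknown.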


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('stan', 0.394), ('skepticism', 0.259), ('concern', 0.231), ('data', 0.205), ('inference', 0.144), ('usability', 0.142), ('overconfidence', 0.136), ('history', 0.136), ('significance', 0.134), ('methods', 0.131), ('sampling', 0.13), ('paradigms', 0.127), ('variational', 0.121), ('propagation', 0.119), ('variant', 0.119), ('fuller', 0.119), ('generality', 0.119), ('bayesian', 0.117), ('dominated', 0.117), ('discuss', 0.115), ('platform', 0.109), ('avoiding', 0.108), ('obtaining', 0.106), ('able', 0.106), ('sampler', 0.103), ('planned', 0.099), ('integration', 0.099), ('hamiltonian', 0.099), ('expectation', 0.098), ('generalize', 0.096), ('viewed', 0.095), ('optimization', 0.095), ('concepts', 0.094), ('algorithms', 0.09), ('monte', 0.089), ('model', 0.089), ('statistical', 0.087), ('exploratory', 0.086), ('efficient', 0.085), ('null', 0.085), ('challenges', 0.084), ('planning', 0.084), ('improved', 0.084), ('developing', 0.083), ('proposed', 0.082), ('computing', 0.081), ('package', 0.079), ('challenge', 0.079), ('address', 0.078), ('general', 0.077)]
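A top-N word list like the one above can be produced by computing a tf-idf vector for the post and sorting by weight. The sketch below assumes a smoothed idf and L2 normalization, which would explain why every listed weight is below 1, but the page does not confirm either choice. The helper names tfidf_vector and top_words are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vector(doc, documents):
    """L2-normalized tf-idf weights for one document, as a word -> weight dict."""
    n_docs = len(documents)
    doc_freq = Counter()
    for other in documents:
        doc_freq.update(set(tokenize(other)))
    counts = Counter(tokenize(doc))
    raw = {
        word: tf * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
        for word, tf in counts.items()
    }
    # Guard against an empty document, whose norm would be zero.
    norm = math.sqrt(sum(value * value for value in raw.values())) or 1.0
    return {word: value / norm for word, value in raw.items()}


def top_words(vector, top_n=50):
    """Return the top_n (word, weight) pairs, highest weight first."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [(word, round(weight, 3)) for word, weight in ranked[:top_n]]
```

A word that occurs often in this post but in few other posts, such as 'stan' or 'skepticism' here, ends up near the top of the list.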

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

2 0.43181705 1528 andrew gelman stats-2012-10-10-My talk at MIT on Thurs 11 Oct

Introduction: Stan: open-source Bayesian inference Speaker: Andrew Gelman, Columbia University Date: Thursday, October 11 2012 Time: 4:00PM to 5:00PM Location: 32-D507 Host: Polina Golland, CSAIL Contact: Polina Golland, 6172538005, polina@csail.mit.edu Stan ( mc-stan.org ) is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. We discuss how Stan works and what it can do, the problems that motivated us to write Stan, current challenges, and areas of planned development, including tools for improved generality and usability, more efficient sampling algorithms, and fuller integration of model building, model checking, and model understanding in Bayesian data analysis. P.S. Here’s the talk .

3 0.33784032 1475 andrew gelman stats-2012-08-30-A Stan is Born

Introduction: Stan 1.0.0 and RStan 1.0.0 It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language for expressing models and a different sampler for sampling from their posteriors. RStan is the R interface to Stan. Stan Home Page Stan’s home page is: http://mc-stan.org/ It links everything you need to get started running Stan from the command line, from R, or from C++, including full step-by-step install instructions, a detailed user’s guide and reference manual for the modeling language, and tested ports of most of the BUGS examples. Peruse the Manual If you’d like to learn more, the Stan User’s Guide and Reference Manual is the place to start.

4 0.2368692 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

Introduction: Kevin Cartier writes: I’ve been happily using R for a number of years now and recently came across Stan. Looks big and powerful, so I’d like to pick an appropriate project and try it out. I wondered if you could point me to a link or document that goes into the motivation for this tool (aside from the Stan user doc)? What I’d like to understand is, at what point might you look at an emergent R project and advise, “You know, that thing you’re trying to do would be a whole lot easier/simpler/more straightforward to implement with Stan.” (or words to that effect). My reply: For my collaborators in political science, Stan has been most useful for models where the data set is not huge (e.g., we might have 10,000 data points or 50,000 data points but not 10 million) but where the model is somewhat complex (for example, a model with latent time series structure). The point is that the model has enough parameters and uncertainty that you’ll want to do full Bayes (rather than some sort

5 0.22519818 1748 andrew gelman stats-2013-03-04-PyStan!

Introduction: Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely free and open-source. P.S. Because Stan is open-source, it of course would also be possible for people to translate Stan into Python, or to take whatever features they like from Stan and incorporate them into a Python package. That’s fine too. But we think it would make sense in addition for users to be able to run Stan directly from Python, in the same way that it can be run from R.

6 0.2181382 1580 andrew gelman stats-2012-11-16-Stantastic!

7 0.19937719 2161 andrew gelman stats-2014-01-07-My recent debugging experience

8 0.17837821 1948 andrew gelman stats-2013-07-21-Bayes related

9 0.17803352 1627 andrew gelman stats-2012-12-17-Stan and RStan 1.1.0

10 0.17615043 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

11 0.16932298 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

12 0.16089816 2150 andrew gelman stats-2013-12-27-(R-Py-Cmd)Stan 2.1.0

13 0.15917888 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas

14 0.15765534 2173 andrew gelman stats-2014-01-15-Postdoc involving pathbreaking work in MRP, Stan, and the 2014 election!

15 0.15340619 2209 andrew gelman stats-2014-02-13-CmdStan, RStan, PyStan v2.2.0

16 0.153322 1855 andrew gelman stats-2013-05-13-Stan!

17 0.15155125 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

18 0.15093283 1772 andrew gelman stats-2013-03-20-Stan at Google this Thurs and at Berkeley this Fri noon

19 0.14639039 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

20 0.14567088 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)
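A similarity list of this shape is what a cosine-similarity ranking over tf-idf vectors would produce, with the post itself at the top with value 1.0. The sketch below assumes that measure; the page does not name it. The functions cosine and rank_similar are hypothetical names, and the vectors passed in are plain word -> weight dicts.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[key] * v[key] for key in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query_id, vectors, top_n=20):
    """Return (similarity, blog_id) pairs, most similar first.

    The query post is included, so it ranks first with similarity 1.0,
    like the 'same-blog' row in the listing.
    """
    query = vectors[query_id]
    scored = [(cosine(query, vec), blog_id) for blog_id, vec in vectors.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]
```

Posts that share heavily weighted words with the query, for example other Stan announcements, score high. Posts with no words in common score 0.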


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.218), (1, 0.189), (2, -0.102), (3, 0.045), (4, 0.044), (5, 0.125), (6, -0.157), (7, -0.199), (8, -0.045), (9, -0.17), (10, -0.183), (11, -0.015), (12, -0.107), (13, -0.04), (14, 0.093), (15, -0.054), (16, -0.051), (17, 0.027), (18, -0.002), (19, 0.005), (20, -0.049), (21, -0.031), (22, -0.055), (23, 0.034), (24, -0.019), (25, -0.007), (26, 0.013), (27, -0.046), (28, -0.011), (29, -0.003), (30, 0.025), (31, 0.018), (32, 0.006), (33, 0.049), (34, -0.015), (35, 0.122), (36, 0.005), (37, -0.011), (38, 0.023), (39, -0.004), (40, -0.028), (41, 0.047), (42, -0.038), (43, -0.013), (44, 0.03), (45, 0.01), (46, 0.003), (47, -0.055), (48, -0.022), (49, -0.015)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.93997514 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

2 0.87316829 1528 andrew gelman stats-2012-10-10-My talk at MIT on Thurs 11 Oct

Introduction: Stan: open-source Bayesian inference Speaker: Andrew Gelman, Columbia University Date: Thursday, October 11 2012 Time: 4:00PM to 5:00PM Location: 32-D507 Host: Polina Golland, CSAIL Contact: Polina Golland, 6172538005, polina@csail.mit.edu Stan ( mc-stan.org ) is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. We discuss how Stan works and what it can do, the problems that motivated us to write Stan, current challenges, and areas of planned development, including tools for improved generality and usability, more efficient sampling algorithms, and fuller integration of model building, model checking, and model understanding in Bayesian data analysis. P.S. Here’s the talk .

3 0.8234871 1475 andrew gelman stats-2012-08-30-A Stan is Born

Introduction: Stan 1.0.0 and RStan 1.0.0 It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language for expressing models and a different sampler for sampling from their posteriors. RStan is the R interface to Stan. Stan Home Page Stan’s home page is: http://mc-stan.org/ It links everything you need to get started running Stan from the command line, from R, or from C++, including full step-by-step install instructions, a detailed user’s guide and reference manual for the modeling language, and tested ports of most of the BUGS examples. Peruse the Manual If you’d like to learn more, the Stan User’s Guide and Reference Manual is the place to start.

4 0.82075405 1580 andrew gelman stats-2012-11-16-Stantastic!

Introduction: Richard McElreath writes: I’ve been translating a few ongoing data analysis projects into Stan code, mostly with success. The most important for me right now has been a hierarchical zero-inflated gamma problem. This a “hurdle” model, in which a bernoulli GLM produces zeros/nonzeros, and then a gamma GLM produces the nonzero values, using varying effects correlated with those in the bernoulli process. The data are 20 years of human foraging returns from a subsistence hunting population in Paraguay (the Ache), comprising about 15k hunts in total (Hill & Kintigh. 2009. Current Anthropology 50:369-377). Observed values are kilograms of meat returned to camp. The more complex models contain a 147-by-9 matrix of varying effects (147 unique hunters), as well as imputation of missing values. Originally, I had written the sampler myself in raw R code. It was very slow, but I knew what it was doing at least. Just before Stan version 1.0 was released, I had managed to get JAGS to do it a

5 0.79931015 2150 andrew gelman stats-2013-12-27-(R-Py-Cmd)Stan 2.1.0

Introduction: We’re happy to announce the release of Stan C++, CmdStan, RStan, and PyStan 2.1.0.  This is a minor feature release, but it is also an important bug fix release.  As always, the place to start is the (all new) Stan web pages: http://mc-stan.org   Major Bug in 2.0.0, 2.0.1 Stan 2.0.0 and Stan 2.0.1 introduced a bug in the implementation of the NUTS criterion that led to poor tail exploration and thus biased the posterior uncertainty downward.  There was no bug in NUTS in Stan 1.3 or earlier, and 2.1 has been extensively tested and tests put in place so this problem will not recur. If you are using Stan 2.0.0 or 2.0.1, you should switch to 2.1.0 as soon as possible and rerun any models you care about.   New Target Acceptance Rate Default for Stan 2.1.0 Another big change aimed at reducing posterior estimation bias was an increase in the target acceptance rate during adaptation from 0.65 to 0.80.  The bad news is that iterations will take around 50% longer

6 0.79121298 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

7 0.78228718 1627 andrew gelman stats-2012-12-17-Stan and RStan 1.1.0

8 0.77286929 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

9 0.75992566 1749 andrew gelman stats-2013-03-04-Stan in L.A. this Wed 3:30pm

10 0.75365245 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

11 0.74875021 2318 andrew gelman stats-2014-05-04-Stan (& JAGS) Tutorial on Linear Mixed Models

12 0.74239385 712 andrew gelman stats-2011-05-14-The joys of working in the public domain

13 0.74133456 1748 andrew gelman stats-2013-03-04-PyStan!

14 0.73401785 2209 andrew gelman stats-2014-02-13-CmdStan, RStan, PyStan v2.2.0

15 0.73341978 2003 andrew gelman stats-2013-08-30-Stan Project: Continuous Relaxations for Discrete MRFs

16 0.72813833 2349 andrew gelman stats-2014-05-26-WAIC and cross-validation in Stan!

17 0.72352391 2242 andrew gelman stats-2014-03-10-Stan Model of the Week: PK Calculation of IV and Oral Dosing

18 0.72333223 1772 andrew gelman stats-2013-03-20-Stan at Google this Thurs and at Berkeley this Fri noon

19 0.72100717 2161 andrew gelman stats-2014-01-07-My recent debugging experience

20 0.71042323 2360 andrew gelman stats-2014-06-05-Identifying pathways for managing multiple disturbances to limit plant invasions


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.041), (15, 0.011), (16, 0.027), (21, 0.062), (24, 0.193), (43, 0.011), (45, 0.012), (66, 0.093), (77, 0.022), (82, 0.019), (86, 0.07), (91, 0.026), (93, 0.012), (99, 0.305)]
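The topic listing above is a list of (topicId, topicWeight) pairs, and only some topic ids appear in it. A small sketch for reading such a line and picking out the heaviest topics follows. The helper names parse_topic_weights and dominant_topics are hypothetical, and treating omitted topics as zero weight is an assumption.

```python
import ast


def parse_topic_weights(line):
    """Parse a '[(topicId, weight), ...]' line into a topicId -> weight dict."""
    pairs = ast.literal_eval(line)
    return {int(topic): float(weight) for topic, weight in pairs}


def dominant_topics(weights, k=3):
    """Return the k (topicId, weight) pairs with the largest absolute weight."""
    return sorted(weights.items(), key=lambda item: -abs(item[1]))[:k]


# The lda line for this post, copied from the listing above.
LDA_LINE = (
    "[(9, 0.041), (15, 0.011), (16, 0.027), (21, 0.062), (24, 0.193), "
    "(43, 0.011), (45, 0.012), (66, 0.093), (77, 0.022), (82, 0.019), "
    "(86, 0.07), (91, 0.026), (93, 0.012), (99, 0.305)]"
)

weights = parse_topic_weights(LDA_LINE)
top_three = dominant_topics(weights, k=3)
```

For this post, topics 99 and 24 carry the most weight. Sorting by absolute value means the same helper also works for the lsi weights, which can be negative.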

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97348142 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

2 0.95196277 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

Introduction: Howard Friedman sent me a new book, The Measure of a Nation, subtitled How to Regain America’s Competitive Edge and Boost Our Global Standing. Without commenting on the substance of Friedman’s recommendations, I’d like to endorse his strategy of presentation, which is to display graph after graph after graph showing the same message over and over again, which is that the U.S. is outperformed by various other countries (mostly in Europe) on a variety of measures. These aren’t graphs I would ever make—they are scatterplots in which the x-axis conveys no information. But they have the advantage of repetition: once you figure out how to read one of the graphs, you can read the others easily. Here’s an example which I found from a quick Google: I can’t actually figure out what is happening on the x-axis, nor do I understand the “star, middle child, dog” thing. But I like the use of graphics. Lots more fun than bullet points. Seriously. P.S. Just to be clear: I am not trying

3 0.95179868 1322 andrew gelman stats-2012-05-15-Question 5 of my final exam for Design and Analysis of Sample Surveys

Introduction: 5. Which of the following better describes changes in public opinion on most issues? (Choose only one.) (a) Dynamic stability: On any given issue, average opinion remains stable but liberals and conservatives move back and forth in opposite directions (the “accordion model”) (b) Uniform swing: Average opinion on an issue can move but the liberals and conservatives don’t move much relative to each other (the distribution of opinions is a “solid block of wood”) (c) Compensating tradeoffs: When considering multiple survey questions on the same general topic, average opinion can move sharply to the left or right on individual questions while the average over all the questions remains stable (the “rubber band model”) Solution to question 4 From yesterday : 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimate

4 0.95141995 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

Introduction: A student writes: I have a question about an earlier recommendation of yours on the election of the prior distribution for the precision hyperparameter of a normal distribution, and a reference for the recommendation. If I recall correctly I have read that you have suggested to use Gamma(1.4, 0.4) instead of Gamma(0.01,0.01) for the prior distribution of the precision hyper parameter of a normal distribution. I would very much appreciate if you would have the time to point me to this publication of yours. The reason is that I have used the prior distribution (Gamma(1.4, 0.4)) in a study which we now revise for publication, and where a reviewer question the choice of the distribution (claiming that it is too informative!). I am well aware of that you in recent publications (Prior distributions for variance parameters in hierarchical models. Bayesian Analysis; Data Analysis using regression and multilevel/hierarchical models) suggest to model the precision as pow(standard deviatio

5 0.94856608 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

Introduction: Sharad Goel sends this in: Microsoft Research NYC [ http://research.microsoft.com/newyork/ ] seeks outstanding applicants for 2-year postdoctoral researcher positions. We welcome applicants with a strong academic record in one of the following areas: * Computational social science: http://research.microsoft.com/cssnyc * Online experimental social science: http://research.microsoft.com/oess_nyc * Algorithmic economics and market design: http://research.microsoft.com/algorithmic-economics/ * Machine learning: http://research.microsoft.com/mlnyc/ We will also consider applicants in other focus areas of the lab, including information retrieval, and behavioral & empirical economics. Additional information about these areas is included below. Please submit all application materials by January 11, 2013. ———- COMPUTATIONAL SOCIAL SCIENCE http://research.microsoft.com/cssnyc With an increasing amount of data on every aspect of our daily activities — from what we buy, to wh

6 0.94660783 1191 andrew gelman stats-2012-03-01-Hoe noem je?

7 0.94589829 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

8 0.94538265 1502 andrew gelman stats-2012-09-19-Scalability in education

9 0.94479352 1792 andrew gelman stats-2013-04-07-X on JLP

10 0.94475031 1459 andrew gelman stats-2012-08-15-How I think about mixture models

11 0.94473737 204 andrew gelman stats-2010-08-12-Sloppily-written slam on moderately celebrated writers is amusing nonetheless

12 0.94384408 2027 andrew gelman stats-2013-09-17-Christian Robert on the Jeffreys-Lindley paradox; more generally, it’s good news when philosophical arguments can be transformed into technical modeling issues

13 0.94306022 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

14 0.94241023 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

15 0.94217604 1465 andrew gelman stats-2012-08-21-D. Buggin

16 0.941998 897 andrew gelman stats-2011-09-09-The difference between significant and not significant…

17 0.94193268 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

18 0.9411658 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

19 0.94109213 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis

20 0.94108272 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture