knowledge-graph by maker-knowledge-mining (corpus: andrew_gelman_stats, document: andrew_gelman_stats-2011-575)

575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?


meta info for this blog

Source: html

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 John Salvatier writes: What do you and your readers think are the trickiest models to fit? [sent-1, score-0.892]

2 If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? [sent-2, score-1.458]

3 I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. [sent-3, score-1.172]

4 I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. [sent-4, score-0.404]

5 This reminds me that we should finish our Bayesian Benchmarks paper already. [sent-5, score-0.381]
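The per-sentence scores above are consistent with ranking sentences by their summed tf-idf term weights, although the exact weighting scheme of the mining pipeline is not documented here. A minimal sketch of that kind of extractive scoring, assuming scikit-learn's TfidfVectorizer as a stand-in for whatever the real pipeline uses:

```python
# Minimal sketch of tf-idf sentence scoring for extractive summarization.
# Assumes scikit-learn; the real pipeline's weighting scheme is unknown.
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "What do you and your readers think are the trickiest models to fit?",
    "If I had an algorithm that I claimed could fit many models with little "
    "fuss, what kinds of models would really impress you?",
    "I am interested in testing different MCMC sampling methods to evaluate "
    "their performance and I want to stretch the bounds of their abilities.",
    "I don't know what's the trickiest, but just about anything I work on in "
    "a serious way gives me some troubles.",
    "This reminds me that we should finish our Bayesian Benchmarks paper already.",
]

# In the real pipeline the idf statistics would come from the whole blog corpus,
# not from one post; fitting on five sentences is only for illustration.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(sentences)

# Score each sentence by the sum of its term weights, then print them ranked.
scores = X.sum(axis=1).A1
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"{rank}. [score={scores[idx]:.3f}] {sentences[idx]}")
```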


similar blogs computed by the tf-idf model

tf-idf weights for this blog:

wordName wordTfidf (topN-words)

[('trickiest', 0.574), ('salvatier', 0.246), ('fuss', 0.236), ('impress', 0.228), ('stretch', 0.228), ('benchmarks', 0.215), ('finish', 0.198), ('models', 0.196), ('bounds', 0.195), ('fit', 0.169), ('mcmc', 0.157), ('kinds', 0.154), ('claimed', 0.141), ('evaluate', 0.14), ('algorithm', 0.138), ('reminds', 0.129), ('testing', 0.119), ('performance', 0.118), ('sampling', 0.113), ('gives', 0.102), ('serious', 0.1), ('readers', 0.094), ('john', 0.092), ('already', 0.083), ('interested', 0.077), ('little', 0.077), ('methods', 0.076), ('anything', 0.074), ('bayesian', 0.068), ('paper', 0.054), ('want', 0.054), ('different', 0.052), ('many', 0.048), ('work', 0.045), ('really', 0.044), ('know', 0.042), ('way', 0.041), ('writes', 0.039), ('could', 0.038), ('would', 0.029), ('think', 0.028)]
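The simValue column in the list that follows is consistent with cosine similarity between whole-document tf-idf vectors built from weights like those above; the query post matches itself with similarity 1.0, which is the "same-blog" row. A hedged sketch, again assuming scikit-learn and using abbreviated placeholder texts rather than the real corpus:

```python
# Sketch: ranking posts by cosine similarity of their tf-idf vectors.
# The post texts here are abbreviated placeholders, not the real corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {
    "575": "What are the trickiest models to fit? MCMC sampling methods ...",
    "419": "Derivative-based MCMC as a breakthrough technique for Bayesian statistics ...",
    "181": "MCMC in Python: a paper on PyMC in the Journal of Statistical Software ...",
}
ids = list(posts)

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(posts.values())

# Similarity of post 575 (row 0) to every post, highest first; the query post
# itself scores 1.0, which corresponds to the "same-blog" row in the listing.
sims = cosine_similarity(X[0], X).ravel()
for idx in sims.argsort()[::-1]:
    print(f"{sims[idx]:.8f}  {ids[idx]}")
```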

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.

2 0.20028323 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

Introduction: John Salvatier pointed me to this blog on derivative based MCMC algorithms (also sometimes called “hybrid” or “Hamiltonian” Monte Carlo) and automatic differentiation as the future of MCMC. This all makes sense to me and is consistent both with my mathematical intuition from studying Metropolis algorithms and my experience with Matt using hybrid MCMC when fitting hierarchical spline models. In particular, I agree with Salvatier’s point about the potential for computation of analytic derivatives of the log-density function. As long as we’re mostly snapping together our models using analytically-simple pieces, the same part of the program that handles the computation of log-posterior densities should also be able to compute derivatives analytically. I’ve been a big fan of automatic derivative-based MCMC methods since I started hearing about them a couple years ago (I’m thinking of the DREAM project and of Mark Girolami’s paper), and I too wonder why they haven’t been used more. I

3 0.15812516 181 andrew gelman stats-2010-08-03-MCMC in Python

Introduction: John Salvatier forwards a note from Anand Patil that a paper on PyMC has appeared in the Journal of Statistical Software. We’ll have to check this out.

4 0.12434521 1735 andrew gelman stats-2013-02-24-F-f-f-fake data

Introduction: Tiago Fragoso writes: Suppose I fit a two-stage regression model Y = a + bx + e; a = cw + d + e1. I could fit it all in one step by using MCMC for example (my model is more complicated than that, so I’ll have to do it by MCMC). However, I could fit the first regression only using MCMC because those estimates are hard to obtain and perform the second regression using least squares or a separate MCMC. So there’s a ‘one step’ inference based on doing it all at the same time and a ‘two step’ inference by fitting one and using the estimates on the further steps. What is gained or lost between both? Is anything done in this question? My response: Rather than answering your particular question, I’ll give you my generic answer, which is to simulate fake data from your model, then fit your model both ways and see how the results differ. Repeat the simulation a few thousand times and you can make all the statistical comparisons you like.

5 0.10039257 1443 andrew gelman stats-2012-08-04-Bayesian Learning via Stochastic Gradient Langevin Dynamics

Introduction: Burak Bayramli writes: In this paper by Sungjin Ahn, Anoop Korattikara, and Max Welling and this paper by Welling and Yee Whye Teh, there are some arguments on big data and the use of MCMC. Both papers have suggested improvements to speed up MCMC computations. I was wondering what your thoughts were, especially on this paragraph: When a dataset has a billion data-cases (as is not uncommon these days) MCMC algorithms will not even have generated a single (burn-in) sample when a clever learning algorithm based on stochastic gradients may already be making fairly good predictions. In fact, the intriguing results of Bottou and Bousquet (2008) seem to indicate that in terms of “number of bits learned per unit of computation”, an algorithm as simple as stochastic gradient descent is almost optimally efficient. We therefore argue that for Bayesian methods to remain useful in an age when the datasets grow at an exponential rate, they need to embrace the ideas of the stochastic optimiz

6 0.098412983 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?

7 0.094980158 2011 andrew gelman stats-2013-09-07-Here’s what happened when I finished my PhD thesis

8 0.091123201 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

9 0.089724645 1469 andrew gelman stats-2012-08-25-Ways of knowing

10 0.082560413 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity

11 0.081560411 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

12 0.080928802 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

13 0.078805387 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

14 0.078561008 1886 andrew gelman stats-2013-06-07-Robust logistic regression

15 0.07604751 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism

16 0.075453006 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

17 0.074413255 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

18 0.07393378 1459 andrew gelman stats-2012-08-15-How I think about mixture models

19 0.07357581 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?

20 0.071923912 1392 andrew gelman stats-2012-06-26-Occam


similar blogs computed by the LSI model

LSI topic weights for this blog:

topicId topicWeight

[(0, 0.111), (1, 0.079), (2, -0.041), (3, 0.026), (4, -0.003), (5, 0.032), (6, -0.037), (7, -0.034), (8, 0.053), (9, -0.005), (10, 0.026), (11, -0.012), (12, -0.041), (13, -0.02), (14, 0.01), (15, -0.019), (16, -0.0), (17, 0.003), (18, -0.026), (19, 0.008), (20, 0.007), (21, 0.002), (22, -0.014), (23, -0.007), (24, 0.007), (25, -0.035), (26, -0.054), (27, 0.016), (28, 0.024), (29, 0.014), (30, -0.0), (31, 0.018), (32, 0.027), (33, -0.023), (34, -0.021), (35, -0.044), (36, -0.005), (37, -0.018), (38, -0.03), (39, 0.021), (40, -0.002), (41, 0.042), (42, 0.01), (43, 0.0), (44, 0.001), (45, -0.058), (46, -0.056), (47, 0.038), (48, 0.037), (49, -0.014)]
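Those (topicId, topicWeight) pairs have the shape of what gensim's LsiModel returns for a document, and the simValue ranking below is what its MatrixSimilarity index would produce; this is an assumption about the pipeline, sketched on a toy three-post corpus:

```python
# Sketch: LSI topic weights and similarity via gensim (assumed, not confirmed).
# corpus_of_posts is a toy stand-in for the full set of blog posts.
from gensim import corpora, models, similarities

corpus_of_posts = [
    "what are the trickiest models to fit mcmc sampling methods benchmarks",
    "derivative based mcmc hamiltonian monte carlo automatic differentiation",
    "mcmc in python pymc journal of statistical software",
]
texts = [post.split() for post in corpus_of_posts]
dictionary = corpora.Dictionary(texts)
bow = [dictionary.doc2bow(t) for t in texts]

# 50 latent dimensions, matching topic ids 0-49 in the listing above.
lsi = models.LsiModel(bow, id2word=dictionary, num_topics=50)
print(lsi[bow[0]])  # [(topicId, topicWeight), ...]

# Cosine similarity in LSI space yields a simValue-style ranking.
index = similarities.MatrixSimilarity(lsi[bow])
for doc_id, score in sorted(enumerate(index[lsi[bow[0]]]), key=lambda x: -x[1]):
    print(f"{score:.8f}  post {doc_id}")
```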

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95994896 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.

2 0.80388188 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

Introduction: John Salvatier pointed me to this blog on derivative based MCMC algorithms (also sometimes called “hybrid” or “Hamiltonian” Monte Carlo) and automatic differentiation as the future of MCMC. This all makes sense to me and is consistent both with my mathematical intuition from studying Metropolis algorithms and my experience with Matt using hybrid MCMC when fitting hierarchical spline models. In particular, I agree with Salvatier’s point about the potential for computation of analytic derivatives of the log-density function. As long as we’re mostly snapping together our models using analytically-simple pieces, the same part of the program that handles the computation of log-posterior densities should also be able to compute derivatives analytically. I’ve been a big fan of automatic derivative-based MCMC methods since I started hearing about them a couple years ago (I’m thinking of the DREAM project and of Mark Girolami’s paper), and I too wonder why they haven’t been used more. I

3 0.76807779 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

Introduction: David Duvenaud writes: I’ve been following your recent discussions about how an AI could do statistics [see also here]. I was especially excited about your suggestion for new statistical methods using “a language-like approach to recursively creating new models from a specified list of distributions and transformations, and an automatic approach to checking model fit.” Your discussion of these ideas was exciting to me and my colleagues because we recently did some work taking a step in this direction, automatically searching through a grammar over Gaussian process regression models. Roger Grosse previously did the same thing, but over matrix decomposition models using held-out predictive likelihood to check model fit. These are both examples of automatic Bayesian model-building by a search over more and more complex models, as you suggested. One nice thing is that both grammars include lots of standard models for free, and they seem to work pretty well, although the

4 0.75045878 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

Introduction: Richard Morey writes: You and your blog readers may be interested to know that we’ve released a major new version of the BayesFactor package to CRAN. The package computes Bayes factors for linear mixed models and regression models. Of course, I’m aware you don’t like point-null model comparisons, but the package does more than that; it also allows sampling from posterior distributions of the compared models, in much the same way that your arm package does with lmer objects. The sampling (both for the Bayes factors and posteriors) is quite fast, since the back end is written in C. Some basic examples using the package can be found here, and the CRAN page is here. Indeed I don’t like point-null model comparisons . . . but maybe this will be useful to some of you!

5 0.74802393 1443 andrew gelman stats-2012-08-04-Bayesian Learning via Stochastic Gradient Langevin Dynamics

Introduction: Burak Bayramli writes: In this paper by Sungjin Ahn, Anoop Korattikara, and Max Welling and this paper by Welling and Yee Whye Teh, there are some arguments on big data and the use of MCMC. Both papers have suggested improvements to speed up MCMC computations. I was wondering what your thoughts were, especially on this paragraph: When a dataset has a billion data-cases (as is not uncommon these days) MCMC algorithms will not even have generated a single (burn-in) sample when a clever learning algorithm based on stochastic gradients may already be making fairly good predictions. In fact, the intriguing results of Bottou and Bousquet (2008) seem to indicate that in terms of “number of bits learned per unit of computation”, an algorithm as simple as stochastic gradient descent is almost optimally efficient. We therefore argue that for Bayesian methods to remain useful in an age when the datasets grow at an exponential rate, they need to embrace the ideas of the stochastic optimiz

6 0.74138391 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency

7 0.73292285 421 andrew gelman stats-2010-11-19-Just chaid

8 0.71869767 1856 andrew gelman stats-2013-05-14-GPstuff: Bayesian Modeling with Gaussian Processes

9 0.69711483 1374 andrew gelman stats-2012-06-11-Convergence Monitoring for Non-Identifiable and Non-Parametric Models

10 0.6931144 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?

11 0.6896373 193 andrew gelman stats-2010-08-09-Besag

12 0.68752807 288 andrew gelman stats-2010-09-21-Discussion of the paper by Girolami and Calderhead on Bayesian computation

13 0.68262249 72 andrew gelman stats-2010-06-07-Valencia: Summer of 1991

14 0.68212622 1469 andrew gelman stats-2012-08-25-Ways of knowing

15 0.68170416 1497 andrew gelman stats-2012-09-15-Our blog makes connections!

16 0.67981195 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic

17 0.66838044 1270 andrew gelman stats-2012-04-19-Demystifying Blup

18 0.65936929 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

19 0.65814763 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over

20 0.6569469 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor


similar blogs computed by the LDA model

LDA topic weights for this blog:

topicId topicWeight

[(8, 0.258), (15, 0.027), (16, 0.018), (24, 0.133), (48, 0.043), (86, 0.06), (87, 0.025), (90, 0.039), (99, 0.246)]
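Similarly, a sparse (topicId, topicWeight) list like the one above is the typical shape of gensim LdaModel output, where topics below a probability cutoff are dropped; another hedged sketch on a toy corpus (num_topics=100 is chosen only because topic ids up to 99 appear above):

```python
# Sketch: per-document LDA topic weights with gensim (an assumption about the
# pipeline, not a description of the actual mining code).
from gensim import corpora, models

corpus_of_posts = [
    "trickiest models to fit mcmc sampling methods bayesian benchmarks",
    "samurai sword wielding mormon bishop pharmaceutical statistician",
    "multimodality in hierarchical models bimodal posteriors mcmc draws",
]
texts = [post.split() for post in corpus_of_posts]
dictionary = corpora.Dictionary(texts)
bow = [dictionary.doc2bow(t) for t in texts]

lda = models.LdaModel(bow, id2word=dictionary, num_topics=100,
                      passes=10, random_state=0)

# Topics below the cutoff are omitted, which is why the list above is sparse;
# post-to-post similarity is then computed between these topic distributions.
print(lda.get_document_topics(bow[0], minimum_probability=0.01))
```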

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94422483 1822 andrew gelman stats-2013-04-24-Samurai sword-wielding Mormon bishop pharmaceutical statistician stops mugger

Introduction: Brett Keller points us to this feel-good story of the day: A Samurai sword-wielding Mormon bishop helped a neighbor woman escape a Tuesday morning attack by a man who had been stalking her. Kent Hendrix woke up Tuesday to his teenage son pounding on his bedroom door and telling him somebody was being mugged in front of their house. The 47-year-old father of six rushed out the door and grabbed the weapon closest to him — a 29-inch high carbon steel Samurai sword. . . . Hendrix, a pharmaceutical statistician, was one of several neighbors who came to the woman’s aid after she began yelling for help . . . Too bad the whole “statistician” thing got buried in the middle of the article. Fair enough, though: I don’t know what it takes to become a Mormon bishop, but I assume it’s more effort than what it takes to learn statistics.

2 0.88179839 916 andrew gelman stats-2011-09-18-Multimodality in hierarchical models

Introduction: Jim Hodges posted a note to the Bugs mailing list that I thought could be of more general interest: Is multi-modality a common experience? I [Hodges] think the answer is “nobody knows in any generality”. Here are some examples of bimodality that certainly do *not* involve the kind of labeling problems that arise in mixture models. The only systematic study of multimodality I know of is Liu J, Hodges JS (2003). Posterior bimodality in the balanced one-way random effects model. J. Royal Stat. Soc., Ser. B, 65:247-255. The surprise of this paper is that in the simplest possible hierarchical model (analyzed using the standard inverse-gamma priors for the two variances), bimodality occurs quite readily, although it is much less common to have two modes that are big enough so that you’d actually get a noticeable fraction of MCMC draws from both of them. Because the restricted likelihood (= the marginal posterior for the two variances, if you’ve put flat priors on them) is

same-blog 3 0.87180734 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.

4 0.86059821 1128 andrew gelman stats-2012-01-19-Sharon Begley: Worse than Stephen Jay Gould?

Introduction: Commenter Tggp links to a criticism of science journalist Sharon Begley by science journalist Matthew Hutson. I learned of this dispute after reporting that Begley had received the American Statistical Association’s Excellence in Statistical Reporting Award, a completely undeserved honor, if Hutson is to be believed. The two journalists have somewhat similar profiles: Begley was science editor at Newsweek (she’s now at Reuters) and author of “Train Your Mind, Change Your Brain: How a New Science Reveals Our Extraordinary Potential to Transform Ourselves,” and Hutson was news editor at Psychology Today and wrote the similarly self-helpy-titled, “The 7 Laws of Magical Thinking: How Irrational Beliefs Keep Us Happy, Healthy, and Sane.” Hutson writes: Psychological Science recently published a fascinating new study on jealousy. I was interested to read Newsweek’s 1300-word article covering the research by their science editor, Sharon Begley. But part-way through the article, I

5 0.83904016 1133 andrew gelman stats-2012-01-21-Judea Pearl on why he is “only a half-Bayesian”

Introduction: In an article published in 2001, Pearl wrote: I [Pearl] turned Bayesian in 1971, as soon as I began reading Savage’s monograph The Foundations of Statistical Inference [Savage, 1962]. The arguments were unassailable: (i) It is plain silly to ignore what we know, (ii) It is natural and useful to cast what we know in the language of probabilities, and (iii) If our subjective probabilities are erroneous, their impact will get washed out in due time, as the number of observations increases. Thirty years later, I [Pearl] am still a devout Bayesian in the sense of (i), but I now doubt the wisdom of (ii) and I know that, in general, (iii) is false. He elaborates: The bulk of human knowledge is organized around causal, not probabilistic relationships, and the grammar of probability calculus is insufficient for capturing those relationships. Specifically, the building blocks of our scientific and everyday knowledge are elementary facts such as “mud does not cause rain” and “symptom

6 0.83140212 1378 andrew gelman stats-2012-06-13-Economists . . .

7 0.82298052 1056 andrew gelman stats-2011-12-13-Drawing to Learn in Science

8 0.8200081 647 andrew gelman stats-2011-04-04-Irritating pseudo-populism, backed up by false statistics and implausible speculations

9 0.81665981 317 andrew gelman stats-2010-10-04-Rob Kass on statistical pragmatism, and my reactions

10 0.79153883 198 andrew gelman stats-2010-08-11-Multilevel modeling in R on a Mac

11 0.78976309 85 andrew gelman stats-2010-06-14-Prior distribution for design effects

12 0.78567415 1355 andrew gelman stats-2012-05-31-Lindley’s paradox

13 0.78231448 994 andrew gelman stats-2011-11-06-Josh Tenenbaum presents . . . a model of folk physics!

14 0.77414346 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism

15 0.77250648 478 andrew gelman stats-2010-12-20-More on why “all politics is local” is an outdated slogan

16 0.77130282 220 andrew gelman stats-2010-08-20-Why I blog?

17 0.74842405 2043 andrew gelman stats-2013-09-29-The difficulties of measuring just about anything

18 0.74033391 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?

19 0.73196626 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

20 0.73193705 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies