
2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas


meta info for this blog

Source: html

Introduction: The Stan Model of the Week showcases research using Stan to push the limits of applied statistics. If you have a model that you would like to submit for a future post, then send us an email. Our inaugural post comes from Nathan Sanders, a graduate student finishing up his thesis on astrophysics at Harvard. Nathan writes, “Core-collapse supernovae, the luminous explosions of massive stars, exhibit an expansive and meaningful diversity of behavior in their brightness evolution over time (their “light curves”). Our group discovers and monitors these events using the Pan-STARRS1 telescope in Hawaii, and we’ve collected a dataset of about 20,000 individual photometric observations of about 80 Type IIP supernovae, the class my work has focused on. While this dataset provides one of the best available tools to infer the explosion properties of these supernovae, due to the nature of extragalactic astronomy (observing from distances of 1 billion light years), these light curves typically …


Summary: the most important sentences generated by the tfidf model

sentIndex sentText [sent-sentNum, score-sentScore]

1 If you have a model that you would like to submit for a future post, then send us an email. [sent-2, score-0.22]

2 Our group discovers and monitors these events using the Pan-STARRS1 telescope in Hawaii, and we’ve collected a dataset of about 20,000 individual photometric observations of about 80 Type IIP supernovae, the class my work has focused on. [sent-5, score-0.252]

3 My goal has been to develop a light curve model, with a physically interpretable parameterization, robust enough to fit the diversity of observed behavior and to extract the most information possible from every light curve in the sample, regardless of data quality or completeness. [sent-7, score-1.499]

4 Because light curve parameters of individual objects are often not identified by the data, we have adopted a hierarchical model structure. [sent-8, score-1.177]

5 The intention is to capitalize on partial pooling of information to simultaneously regularize the fits of individual light curves and constrain the population level properties of the light curve sample. [sent-9, score-1.466]
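
To make the partial-pooling idea concrete, here is a minimal Stan sketch of the structure Nathan describes (a hypothetical stand-in with a trivial likelihood and made-up names; the actual light curve model is in the appendix of his paper): each supernova gets its own parameter, and a population-level distribution ties the parameters together.

// Minimal partial-pooling sketch (hypothetical; the real model replaces
// the constant level theta[n] with a physically parameterized light
// curve function of time).
data {
  int<lower=1> N_obs;                        // ~20,000 photometric observations
  int<lower=1> N_sn;                         // ~80 supernovae
  array[N_obs] int<lower=1, upper=N_sn> sn;  // supernova index per observation
  vector[N_obs] y;                           // observed magnitudes
  vector<lower=0>[N_obs] sigma_y;            // photometric measurement errors
}
parameters {
  real mu;                                   // population-level mean
  real<lower=0> tau;                         // population-level scale
  vector[N_sn] theta;                        // per-supernova parameter
}
model {
  mu ~ normal(0, 5);
  tau ~ cauchy(0, 2.5);
  theta ~ normal(mu, tau);                   // partial pooling across objects
  y ~ normal(theta[sn], sigma_y);            // stand-in likelihood
}

Poorly sampled light curves then borrow strength from well-observed ones through mu and tau, which is exactly the simultaneous regularization and population-level inference described above.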

6 The highly non-linear character of the light curves motivates a full Bayes approach to explore the complex joint structure of the posterior. [sent-10, score-0.713]

7 Sampling from a ~ dimensional, highly correlated joint posterior seemed intimidating to me, but I’m fortunate to have been empowered by having taken Andrew’s course at Harvard, by befriending expert practitioners in this field like Kaisey Mandel and Michael Betancourt, and by using Stan! [sent-11, score-0.208]

8 It has allowed us to rapidly develop and test a variety of functional forms for the light curve model and strategies for optimization and regularization of the hierarchical structure. [sent-13, score-1.113]

9 Over the course of the project, I learned to pay increasingly close attention to the stepsize, n_treedepth and n_divergent NUTS parameters, and other diagnostic information provided by Stan in order to help debug sampling issues. [sent-15, score-0.306]

10 Encountering saturation of the treedepth and/or extremely small stepsizes often motivated simplifications of the hierarchical structure in order to reduce the curvature in the posterior. [sent-16, score-0.518]

11 Divergences during sampling led us to apply stronger prior information on key parameters (particularly those that are exponentiated in the light curve model) in order to avoid numerical overflow on samples drawn from the tails. [sent-17, score-1.342]
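
To illustrate the overflow problem Nathan mentions (a sketch with hypothetical parameter names, not the paper's actual parameterization): exp() overflows double precision once its argument exceeds roughly 709, so a log-scale parameter with a very diffuse prior lets the sampler wander into territory where the exponentiated value becomes infinite.

// Hypothetical log-scale parameter that is exponentiated in the
// light curve model. exp(x) overflows IEEE doubles for x > ~709.78,
// and warmup excursions under a diffuse prior can reach that region.
parameters {
  real log_rate;
}
model {
  // A diffuse choice such as normal(0, 100) invites tail draws that
  // overflow exp(); stronger prior information keeps samples in a
  // numerically safe range.
  log_rate ~ normal(0, 2);
}
generated quantities {
  real rate = exp(log_rate);                 // finite under this prior
}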

” By modeling the hierarchical structure of the supernova measurements, Nathan was able to significantly improve the utilization of the data. [sent-19, score-0.341]

13 Building and fitting this model proved to be a tremendous learning experience for both Nathan and myself. [sent-21, score-0.236]

14 We haven’t really seen Stan applied to such deep hierarchical models before, and our first naive implementations proved to be vulnerable to all kinds of pathologies. [sent-22, score-0.339]

15 A problem early on came in how to model hierarchical dependencies between constrained parameters. [sent-23, score-0.472]

16 As has become a common theme, the most successful computational strategy is to model the hierarchical dependencies on the unconstrained latent space and transform to the constrained space only when necessary. [sent-24, score-0.588]
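
A sketch of that strategy, in what is now usually called a non-centered parameterization (hypothetical names again): sample standardized latent values, express the hierarchy on the unconstrained scale, and apply the constraining transform deterministically at the end.

// Hierarchical dependencies live on the unconstrained scale;
// the transform to the constrained (positive) scale comes last.
data {
  int<lower=1> N_sn;                         // number of groups (supernovae)
}
parameters {
  real mu_log;                               // population location, unconstrained
  real<lower=0> tau_log;                     // population scale
  vector[N_sn] theta_raw;                    // standardized latent parameters
}
transformed parameters {
  // Constrained per-group parameters, produced by a deterministic
  // transform rather than sampled directly.
  vector<lower=0>[N_sn] theta = exp(mu_log + tau_log * theta_raw);
}
model {
  mu_log ~ normal(0, 1);
  tau_log ~ normal(0, 1);                    // half-normal via the lower bound
  theta_raw ~ normal(0, 1);                  // hierarchy on the unconstrained scale
  // ... likelihood written in terms of theta ...
}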

17 With multiple layers, the parameter variances increase exponentially, and the naive generalization of a one-layer prior induces huge variances on the top-level parameters. [sent-26, score-0.426]

18 This became especially pathological when those top-level parameters were constrained: the exponential function is very easy to overflow in floating point. [sent-27, score-0.376]

19 Ultimately we established the desired variance on the top-level parameters and worked backwards, scaling the deeper priors by the number of groups in the next layer to ensure the desired behavior. [sent-28, score-0.338]
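
The arithmetic behind the variance blow-up, as a simplified sketch (independent Gaussian layers; the paper's actual scaling rule may differ in detail): marginal variances add as you descend the hierarchy,

\begin{aligned}
\theta^{(1)} &\sim \mathrm{N}(0, \sigma_1^2), \\
\theta^{(2)}_j \mid \theta^{(1)} &\sim \mathrm{N}(\theta^{(1)}, \sigma_2^2), \\
\theta^{(3)}_{jk} \mid \theta^{(2)}_j &\sim \mathrm{N}(\theta^{(2)}_j, \sigma_3^2), \\
\operatorname{Var}\big(\theta^{(3)}_{jk}\big) &= \sigma_1^2 + \sigma_2^2 + \sigma_3^2,
\end{aligned}

so naively reusing a one-layer prior scale at every level compounds, and the compounded variance is what overflows once constrained parameters are exponentiated. Fixing the desired variance at the top and solving backwards for the deeper scales (shrinking them by the group counts of the next layer, as described above) restores the intended behavior.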

20 Nathan was able to include the full model as an appendix to his paper, which you can find on the arXiv. [sent-30, score-0.158]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('light', 0.38), ('curve', 0.261), ('stan', 0.215), ('nathan', 0.211), ('supernovae', 0.203), ('hierarchical', 0.186), ('curves', 0.172), ('model', 0.158), ('constrained', 0.128), ('parameters', 0.124), ('overflow', 0.124), ('saturation', 0.111), ('sampling', 0.107), ('structure', 0.091), ('diversity', 0.086), ('variances', 0.086), ('nuts', 0.085), ('numerical', 0.084), ('desired', 0.078), ('proved', 0.078), ('properties', 0.078), ('posterior', 0.076), ('naive', 0.075), ('failure', 0.073), ('order', 0.072), ('joint', 0.07), ('individual', 0.068), ('feature', 0.066), ('develop', 0.066), ('information', 0.065), ('modeling', 0.064), ('dataset', 0.064), ('prior', 0.063), ('us', 0.062), ('debug', 0.062), ('mandel', 0.062), ('discovers', 0.062), ('empowered', 0.062), ('inaugural', 0.062), ('capitalize', 0.062), ('pathologies', 0.062), ('induces', 0.058), ('layers', 0.058), ('telescope', 0.058), ('explosions', 0.058), ('showcases', 0.058), ('inadequate', 0.058), ('treedepth', 0.058), ('space', 0.058), ('priors', 0.058)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas


2 0.18932134 1543 andrew gelman stats-2012-10-21-Model complexity as a function of sample size

Introduction: As we get more data, we can fit more model. But at some point we become so overwhelmed by data that, for computational reasons, we can barely do anything at all. Thus, the curve above could be thought of as the product of two curves: a steadily increasing curve showing the statistical ability to fit more complex models with more data, and a steadily decreasing curve showing the computational feasibility of doing so.

3 0.17947638 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

Introduction: Kevin Cartier writes: I’ve been happily using R for a number of years now and recently came across Stan. Looks big and powerful, so I’d like to pick an appropriate project and try it out. I wondered if you could point me to a link or document that goes into the motivation for this tool (aside from the Stan user doc)? What I’d like to understand is, at what point might you look at an emergent R project and advise, “You know, that thing you’re trying to do would be a whole lot easier/simpler/more straightforward to implement with Stan.” (or words to that effect). My reply: For my collaborators in political science, Stan has been most useful for models where the data set is not huge (e.g., we might have 10,000 data points or 50,000 data points but not 10 million) but where the model is somewhat complex (for example, a model with latent time series structure). The point is that the model has enough parameters and uncertainty that you’ll want to do full Bayes (rather than some sort …

4 0.15917888 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

Introduction: Statistical Methods and Data Skepticism: Data analysis today is dominated by three paradigms: null hypothesis significance testing, Bayesian inference, and exploratory data analysis. There is concern that all these methods lead to overconfidence on the part of researchers and the general public, and this concern has led to the new “data skepticism” movement. But the history of statistics is already in some sense a history of data skepticism. Concepts of bias, variance, sampling and measurement error, least-squares regression, and statistical significance can all be viewed as formalizations of data skepticism. All these methods address the concern that patterns in observed data might not generalize to the population of interest. We discuss the challenge of attaining data skepticism while avoiding data nihilism, and consider some proposed future directions. Stan: Stan (mc-stan.org) is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a …

5 0.15843734 1886 andrew gelman stats-2013-06-07-Robust logistic regression

Introduction: Corey Yanofsky writes: In your work, you’ve robustificated logistic regression by having the logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with proportion of outliers expected in the data (assuming a reasonable model fit). It would be desirable to have them fit in the model, but my intuition is that integrability of the posterior distribution might become an issue. My reply: it should be no problem to put these saturation values in the model, I bet it would work fine in Stan if you give them uniform (0,.1) priors or something like that. Or you could just fit the robit model. And this reminds me . . . I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm or bayesglm in R. This suggests to me that we should have some precompiled regression models in Stan, …

6 0.15241009 1946 andrew gelman stats-2013-07-19-Prior distributions on derived quantities rather than on parameters themselves

7 0.14952128 1475 andrew gelman stats-2012-08-30-A Stan is Born

8 0.1426509 2161 andrew gelman stats-2014-01-07-My recent debugging experience

9 0.14237456 1580 andrew gelman stats-2012-11-16-Stantastic!

10 0.14136562 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

11 0.13841948 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

12 0.13721505 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

13 0.13716824 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

14 0.13225941 2150 andrew gelman stats-2013-12-27-(R-Py-Cmd)Stan 2.1.0

15 0.12760609 1748 andrew gelman stats-2013-03-04-PyStan!

16 0.12647304 1941 andrew gelman stats-2013-07-16-Priors

17 0.12450086 2035 andrew gelman stats-2013-09-23-Scalable Stan

18 0.12343058 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

19 0.12265173 1144 andrew gelman stats-2012-01-29-How many parameters are in a multilevel model?

20 0.1215027 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.198), (1, 0.178), (2, -0.006), (3, 0.069), (4, 0.086), (5, 0.072), (6, 0.019), (7, -0.152), (8, -0.104), (9, 0.008), (10, -0.058), (11, 0.018), (12, -0.083), (13, -0.022), (14, -0.006), (15, -0.037), (16, 0.002), (17, 0.006), (18, 0.006), (19, 0.001), (20, -0.016), (21, -0.05), (22, -0.056), (23, -0.023), (24, -0.003), (25, 0.012), (26, -0.006), (27, -0.007), (28, 0.012), (29, 0.003), (30, -0.012), (31, -0.033), (32, -0.019), (33, -0.001), (34, -0.033), (35, 0.044), (36, 0.032), (37, -0.014), (38, 0.02), (39, -0.012), (40, 0.009), (41, -0.007), (42, 0.004), (43, -0.033), (44, 0.005), (45, -0.048), (46, -0.015), (47, -0.016), (48, 0.005), (49, -0.012)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96078908 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas


2 0.88565576 2242 andrew gelman stats-2014-03-10-Stan Model of the Week: PK Calculation of IV and Oral Dosing

Introduction: [Update: Revised given comments from Wingfeet, Andrew and germo. Thanks! I'd mistakenly translated the dlnorm priors in the first version --- amazing what a difference the priors make. I also escaped the less-than and greater-than signs in the constraints in the model so they're visible. I also updated to match the thin=2 output of JAGS.] We’re going to be starting a Stan “model of the P” (for some time period P) column, so I thought I’d kick things off with one of my own. I’ve been following the Wingvoet blog, the author of which is identified only by the Blogger handle Wingfeet; a couple of days ago this lovely post came out: PK calculation of IV and oral dosing in JAGS. Wingfeet’s post implemented an answer to question 6 from chapter 6 of Rowland and Tozer’s 2010 book, Clinical Pharmacokinetics and Pharmacodynamics, Fourth edition, Lippincott, Williams & Wilkins. So in the grand tradition of using this blog to procrastinate, I thought I’d t…

3 0.83703244 2003 andrew gelman stats-2013-08-30-Stan Project: Continuous Relaxations for Discrete MRFs

Introduction: Hamiltonian Monte Carlo (HMC), as used by Stan, is only defined for continuous parameters. We’d love to be able to do discrete sampling. So I was excited when I saw this: Yichuan Zhang, Charles Sutton, Amos J Storkey, and Zoubin Ghahramani. 2012. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. NIPS 25. Abstract: Continuous relaxations play an important role in discrete optimization, but have not seen much use in approximate probabilistic inference. Here we show that a general form of the Gaussian Integral Trick makes it possible to transform a wide class of discrete variable undirected models into fully continuous systems. The continuous representation allows the use of gradient-based Hamiltonian Monte Carlo for inference, results in new ways of estimating normalization constants (partition functions), and in general opens up a number of new avenues for inference in difficult discrete systems. We demonstrate some of these continuous relaxation inference a…

4 0.83491629 1580 andrew gelman stats-2012-11-16-Stantastic!

Introduction: Richard McElreath writes: I’ve been translating a few ongoing data analysis projects into Stan code, mostly with success. The most important for me right now has been a hierarchical zero-inflated gamma problem. This is a “hurdle” model, in which a bernoulli GLM produces zeros/nonzeros, and then a gamma GLM produces the nonzero values, using varying effects correlated with those in the bernoulli process. The data are 20 years of human foraging returns from a subsistence hunting population in Paraguay (the Ache), comprising about 15k hunts in total (Hill & Kintigh. 2009. Current Anthropology 50:369-377). Observed values are kilograms of meat returned to camp. The more complex models contain a 147-by-9 matrix of varying effects (147 unique hunters), as well as imputation of missing values. Originally, I had written the sampler myself in raw R code. It was very slow, but I knew what it was doing at least. Just before Stan version 1.0 was released, I had managed to get JAGS to do it a…

5 0.83046108 2291 andrew gelman stats-2014-04-14-Transitioning to Stan


6 0.8085115 1886 andrew gelman stats-2013-06-07-Robust logistic regression

7 0.7990548 2161 andrew gelman stats-2014-01-07-My recent debugging experience

8 0.7858777 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

9 0.78450525 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

10 0.77386296 1753 andrew gelman stats-2013-03-06-Stan 1.2.0 and RStan 1.2.0

11 0.77023518 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

12 0.76775479 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

13 0.75071359 2035 andrew gelman stats-2013-09-23-Scalable Stan

14 0.73968947 1476 andrew gelman stats-2012-08-30-Stan is fast

15 0.72997272 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!

16 0.72934014 1710 andrew gelman stats-2013-02-06-The new Stan 1.1.1, featuring Gaussian processes!

17 0.72718525 2318 andrew gelman stats-2014-05-04-Stan (& JAGS) Tutorial on Linear Mixed Models

18 0.72404599 2150 andrew gelman stats-2013-12-27-(R-Py-Cmd)Stan 2.1.0

19 0.71429282 2178 andrew gelman stats-2014-01-20-Mailing List Degree-of-Difficulty Difficulty

20 0.71160215 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.019), (6, 0.029), (9, 0.018), (15, 0.047), (16, 0.079), (21, 0.016), (24, 0.162), (41, 0.02), (43, 0.011), (51, 0.023), (52, 0.014), (57, 0.031), (58, 0.012), (66, 0.011), (77, 0.02), (82, 0.031), (84, 0.052), (86, 0.066), (98, 0.018), (99, 0.217)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97518051 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas


2 0.94547784 1883 andrew gelman stats-2013-06-04-Interrogating p-values

Introduction: This article is a discussion of a paper by Greg Francis for a special issue, edited by E. J. Wagenmakers, of the Journal of Mathematical Psychology. Here’s what I wrote: Much of statistical practice is an effort to reduce or deny variation and uncertainty. The reduction is done through standardization, replication, and other practices of experimental design, with the idea being to isolate and stabilize the quantity being estimated and then average over many cases. Even so, however, uncertainty persists, and statistical hypothesis testing is in many ways an endeavor to deny this, by reporting binary accept/reject decisions. Classical statistical methods produce binary statements, but there is no reason to assume that the world works that way. Expressions such as Type 1 error, Type 2 error, false positive, and so on, are based on a model in which the world is divided into real and non-real effects. To put it another way, I understand the general scientific distinction of real vs …

3 0.94228154 2004 andrew gelman stats-2013-09-01-Post-publication peer review: How it (sometimes) really works

Introduction: In an ideal world, research articles would be open to criticism and discussion in the same place where they are published, in a sort of non-corrupt version of Yelp. What is happening now is that the occasional paper or research area gets lots of press coverage, and this inspires reactions on science-focused blogs. The trouble here is that it’s easier to give off-the-cuff comments than detailed criticisms. Here’s an example. It starts a couple years ago with this article by Ryota Kanai, Tom Feilden, Colin Firth, and Geraint Rees, on brain size and political orientation: In a large sample of young adults, we related self-reported political attitudes to gray matter volume using structural MRI. We found that greater liberalism was associated with increased gray matter volume in the anterior cingulate cortex, whereas greater conservatism was associated with increased volume of the right amygdala. These results were replicated in an independent sample of additional participants. Ou…

4 0.94183058 42 andrew gelman stats-2010-05-19-Updated solutions to Bayesian Data Analysis homeworks

Introduction: Here are solutions to about 50 of the exercises from Bayesian Data Analysis. The solutions themselves haven’t been updated; I just cleaned up the file: some change in Latex had resulted in much of the computer code running off the page, so I went in and cleaned up the files. I wrote most of these in 1996, and I like them a lot. I think several of them would’ve made good journal articles, and in retrospect I wish I’d published them as such. Original material that appears first in a book (or, even worse, in homework solutions) can easily be overlooked.

5 0.93827665 2224 andrew gelman stats-2014-02-25-Basketball Stats: Don’t model the probability of win, model the expected score differential.

Introduction: Someone who wants to remain anonymous writes: I am working to create a more accurate in-game win probability model for basketball games. My idea is for each timestep in a game (a second, 5 seconds, etc), use the Vegas line, the current score differential, who has the ball, and the number of possessions played already (to account for differences in pace) to create a point estimate probability of the home team winning. This problem would seem to fit a multi-level model structure well. It seems silly to estimate 2,000 regressions (one for each timestep), but the coefficients should vary at each timestep. Do you have suggestions for what type of model this could/would be? Additionally, I believe this needs to be some form of logit/probit given the binary dependent variable (win or loss). Finally, do you have suggestions for what package could accomplish this in Stata or R? To answer the questions in reverse order: 3. I’d hope this could be done in Stan (which can be run from R)

6 0.93818951 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

7 0.93693185 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

8 0.93589157 1980 andrew gelman stats-2013-08-13-Test scores and grades predict job performance (but maybe not at Google)

9 0.93587804 494 andrew gelman stats-2010-12-31-Type S error rates for classical and Bayesian single and multiple comparison procedures

10 0.93438792 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?

11 0.93396449 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

12 0.93386602 994 andrew gelman stats-2011-11-06-Josh Tenenbaum presents . . . a model of folk physics!

13 0.93351769 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles

14 0.93308204 2149 andrew gelman stats-2013-12-26-Statistical evidence for revised standards

15 0.93307906 899 andrew gelman stats-2011-09-10-The statistical significance filter

16 0.93304467 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

17 0.93274355 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity

18 0.9326247 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

19 0.93247843 2055 andrew gelman stats-2013-10-08-A Bayesian approach for peer-review panels? and a speculation about Bruno Frey

20 0.93135792 1438 andrew gelman stats-2012-07-31-What is a Bayesian?