
2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!


meta information for this blog post

Source: html

Introduction: Following up on yesterday’s post, here’s David Chudzicki’s story (with graphs and Stan/R code!) of how he fit a model for an increasing function (“isotonic regression”). Chudzicki writes: This post will describe a way I came up with of fitting a function that’s constrained to be increasing, using Stan. If you want practical help, standard statistical approaches, or expert research, this isn’t the place for you (look up “isotonic regression” or “Bayesian isotonic regression” or David Dunson). This is the place for you if you want to read about how I thought about setting up a model, implemented the model in Stan, and created graphics to understand what was going on. The background is that a simple, natural-seeming uniform prior on the function values does not work so well—it’s a much stronger prior distribution than one might naively think, just one of those unexpected aspects of high-dimensional probability distributions. So Chudzicki sets up a more general family with a hyperparameter.
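As a concrete illustration of the setup described above (a minimal sketch under stated assumptions, not Chudzicki’s actual code, which is at the link), one way to encode an increasing function in Stan is to build it from cumulative sums of nonnegative increments, with a symmetric Dirichlet prior on the normalized increments. The concentration parameter alpha is the hyperparameter that generalizes the uniform prior on the function values, which corresponds to alpha = 1; the grid, the hyperprior on alpha, and the Gaussian noise model are all assumptions made for this sketch.

data {
  int<lower=2> K;                     // number of grid points
  int<lower=1> N;                     // number of observations
  array[N] int<lower=1, upper=K> x;   // grid index of each observation
  vector[N] y;                        // noisy observations of the function
}
parameters {
  simplex[K - 1] increments;   // normalized jumps between grid points
  real<lower=0> total;         // overall rise of the function
  real<lower=0> alpha;         // Dirichlet concentration: the hyperparameter
  real<lower=0> sigma;         // observation noise
}
transformed parameters {
  vector[K] f;   // increasing function values on the grid
  f[1] = 0;      // left endpoint pinned at zero, an assumption
  f[2:K] = total * cumulative_sum(increments);
}
model {
  increments ~ dirichlet(rep_vector(alpha, K - 1));   // alpha = 1 recovers the uniform prior
  alpha ~ gamma(2, 2);      // weak hyperprior, an assumption
  total ~ normal(0, 10);    // half-normal via the lower bound
  sigma ~ normal(0, 1);     // half-normal via the lower bound
  y ~ normal(f[x], sigma);  // monotonicity is enforced by construction
}

Letting alpha be a free parameter means the data can pull the increments toward evenness (large alpha) or allow a few large jumps (small alpha), which is the extra flexibility the plain uniform prior lacks.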


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText [sentNum, sentScore]

1 Following up on yesterday’s post, here’s David Chudzicki’s story (with graphs and Stan/R code!) [sent-1, score-0.057]

2 of how he fit a model for an increasing function (“isotonic regression”). [sent-2, score-0.542]

3 Chudzicki writes: This post will describe a way I came up with of fitting a function that’s constrained to be increasing, using Stan. [sent-3, score-0.587]

4 If you want practical help, standard statistical approaches, or expert research, this isn’t the place for you (look up “isotonic regression” or “Bayesian isotonic regression” or David Dunson). [sent-4, score-0.885]

5 This is the place for you if you want to read about how I thought about setting up a model, implemented the model in Stan, and created graphics to understand what was going on. [sent-5, score-0.622]

6 The background is that a simple, natural-seeming uniform prior on the function values does not work so well—it’s a much stronger prior distribution than one might naively think, just one of those unexpected aspects of high-dimensional probability distributions. [sent-6, score-1.179] (a prior-only simulation sketch after this list illustrates the point)

7 So Chudzicki sets up a more general family with a hyperparameter. [sent-7, score-0.309]

8 One thing I like about this example is that it’s not the latest research; it has a charming DIY flavor that might make you feel that you too can patch together a model in Stan to do what you need. [sent-8, score-0.698]
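Sentence 6 is the technical heart of the post, and a prior-only Stan sketch (an illustration constructed for this summary, not code from the post) makes it concrete: a uniform prior on K increasing values in (0, 1) is equivalent to sorting K independent uniform(0, 1) draws, and sampling the model with no data shows how rigid that prior really is.

data {
  int<lower=2> K;   // number of grid points
}
parameters {
  vector<lower=0, upper=1>[K] u;   // independent uniform(0, 1) values
}
transformed parameters {
  vector[K] f = sort_asc(u);   // the implied uniform prior on K increasing values
}
model {
  // no data and no sampling statements: the bounds alone give the uniform prior
}
generated quantities {
  // successive gaps between the ordered values: each has prior mean 1 / (K + 1)
  // with shrinking variance as K grows, so prior draws of f look nearly linear
  // rather than generically increasing
  vector[K - 1] gaps = f[2:K] - f[1:(K - 1)];
}

Plotting a few hundred prior draws of f against the grid index makes the surprise visible immediately.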


similar blog posts computed by the tf-idf model

tf-idf term weights for this blog post:

wordName wordTfidf (topN-words)

[('chudzicki', 0.513), ('isotonic', 0.513), ('function', 0.205), ('increasing', 0.164), ('hyper', 0.156), ('regression', 0.15), ('patch', 0.141), ('stan', 0.136), ('dunson', 0.128), ('model', 0.123), ('charming', 0.123), ('naively', 0.115), ('flavor', 0.115), ('david', 0.114), ('place', 0.113), ('constrained', 0.107), ('prior', 0.106), ('unexpected', 0.101), ('implemented', 0.098), ('uniform', 0.092), ('stronger', 0.088), ('created', 0.084), ('post', 0.08), ('latest', 0.079), ('family', 0.077), ('approaches', 0.077), ('sets', 0.076), ('aspects', 0.076), ('yesterday', 0.076), ('expert', 0.074), ('describe', 0.073), ('fitting', 0.073), ('practical', 0.072), ('graphics', 0.072), ('code', 0.068), ('setting', 0.068), ('background', 0.067), ('values', 0.065), ('want', 0.064), ('together', 0.062), ('research', 0.061), ('graphs', 0.057), ('might', 0.055), ('distribution', 0.054), ('isn', 0.052), ('help', 0.051), ('fit', 0.05), ('probability', 0.049), ('standard', 0.049), ('came', 0.049)]

similar blog posts:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!


2 0.1476185 1886 andrew gelman stats-2013-06-07-Robust logistic regression

Introduction: Corey Yanofsky writes: In your work, you’ve robustificated logistic regression by having the logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with the proportion of outliers expected in the data (assuming a reasonable model fit). It would be desirable to have them fit in the model, but my intuition is that integrability of the posterior distribution might become an issue. My reply: it should be no problem to put these saturation values in the model, I bet it would work fine in Stan if you give them uniform (0,.1) priors or something like that. Or you could just fit the robit model. And this reminds me . . . I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm or bayesglm in R. This suggests to me that we should have some precompiled regression models in Stan, …
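A hedged sketch of the reply (an illustration of the idea, not code from the post): in Stan the saturation values can be given uniform(0, 0.1) priors simply by declaring bounded parameters, since declared bounds imply a uniform prior when no other prior is stated.

data {
  int<lower=1> N;
  int<lower=1> J;
  matrix[N, J] X;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  vector[J] beta;
  real<lower=0, upper=0.1> eps0;   // lower saturation; the bounds give it a uniform(0, 0.1) prior
  real<lower=0, upper=0.1> eps1;   // upper saturation, likewise uniform(0, 0.1)
}
model {
  beta ~ normal(0, 2.5);   // weakly informative prior, an assumption
  // the success probability saturates at eps0 and 1 - eps1 instead of 0 and 1
  y ~ bernoulli(eps0 + (1 - eps0 - eps1) * inv_logit(X * beta));
}

Fixing eps0 = eps1 = 0.01 recovers the 0.01 and 0.99 saturation values in Yanofsky’s question.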

3 0.1321024 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

Introduction: I received the following email: I have an interesting thought on a prior for a logistic regression, and would love your input on how to make it “work.” Some of my research, two published papers, are on mathematical models of **. Along those lines, I’m interested in developing more models for **. . . . Empirical studies show that the public is rather smart and that the wisdom-of-the-crowd is fairly accurate. So, my thought would be to treat the public’s probability of the event as a prior, and then see how adding data, through a model, would change or perturb our inferred probability of **. (Similarly, I could envision using previously published epidemiological research as a prior probability of a disease, and then seeing how the addition of new testing protocols would update that belief.) However, everything I learned about hierarchical Bayesian models has a prior as a distribution on the coefficients. I don’t know how to start with a prior point estimate for the probability …
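One minimal way to make the email’s idea concrete (a sketch of the idea, with the names and priors below being assumptions, not the correspondent’s model) is to encode the crowd’s probability as the mean of a Beta prior on the event probability and let the new data update it:

data {
  int<lower=0> N;                    // new observations
  array[N] int<lower=0, upper=1> y;
  real<lower=0, upper=1> p_crowd;    // the public’s probability, used as the prior mean
  real<lower=0> kappa;               // how many observations the crowd’s belief is worth
}
parameters {
  real<lower=0, upper=1> theta;      // inferred probability of the event
}
model {
  theta ~ beta(p_crowd * kappa, (1 - p_crowd) * kappa);   // crowd belief as the prior
  y ~ bernoulli(theta);
}

For a regression rather than a single probability, the same idea becomes a constraint on the prior predictive distribution: choose priors on the coefficients so that the implied distribution of inv_logit(X * beta) is centered near p_crowd, which connects the email’s point estimate to the coefficient-level priors it asks about.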

4 0.12045957 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

Introduction: Following up on Christian’s post [link fixed] on the topic, I’d like to offer a few thoughts of my own. In BDA, we express the idea that a noninformative prior is a placeholder: you can use the noninformative prior to get the analysis started, then if your posterior distribution is less informative than you would like, or if it does not make sense, you can go back and add prior information. Same thing for the data model (the “likelihood”), for that matter: it often makes sense to start with something simple and conventional and then go from there. So, in that sense, noninformative priors are no big deal, they’re just a way to get started. Just don’t take them too seriously. Traditionally in statistics we’ve worked with the paradigm of a single highly informative dataset with only weak external information. But if the data are sparse and prior information is strong, we have to think differently. And, when you increase the dimensionality of a problem, both these things happen …

5 0.11018896 1941 andrew gelman stats-2013-07-16-Priors

Introduction: Nick Firoozye writes: While I am absolutely sympathetic to the Bayesian agenda I am often troubled by the requirement of having priors. We must have priors on the parameters of an infinite number of models we have never seen before and I find this troubling. There is a similarly troubling problem in economics of utility theory. Utility is on consumables. To be complete a consumer must assign utility to all sorts of things they never would have encountered. More recent versions of utility theory instead make consumption goods a portfolio of attributes. Cadillacs are x many units of luxury, y of transport, etc. And we can automatically have personal utilities to all these attributes. I don’t ever see parameters. Some models have few and some have hundreds. Instead, I see data. So I don’t know how to have an opinion on parameters themselves. Rather I think it far more natural to have opinions on the behavior of models. The prior predictive density is a good and sensible notion. Also …

6 0.10831576 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

7 0.10788775 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

8 0.10482296 1753 andrew gelman stats-2013-03-06-Stan 1.2.0 and RStan 1.2.0

9 0.10143528 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

10 0.10087126 247 andrew gelman stats-2010-09-01-How does Bayes do it?

11 0.096519411 2161 andrew gelman stats-2014-01-07-My recent debugging experience

12 0.094411395 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

13 0.093766429 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

14 0.092419684 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas

15 0.092415594 1735 andrew gelman stats-2013-02-24-F-f-f-fake data

16 0.092098385 1431 andrew gelman stats-2012-07-27-Overfitting

17 0.091735229 2017 andrew gelman stats-2013-09-11-“Informative g-Priors for Logistic Regression”

18 0.091149382 1475 andrew gelman stats-2012-08-30-A Stan is Born

19 0.089587867 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

20 0.088112712 1580 andrew gelman stats-2012-11-16-Stantastic!


similar blog posts computed by the LSI model

LSI topic weights for this blog post:

topicId topicWeight

[(0, 0.153), (1, 0.126), (2, -0.008), (3, 0.076), (4, 0.061), (5, -0.008), (6, 0.019), (7, -0.076), (8, -0.04), (9, -0.002), (10, -0.024), (11, 0.021), (12, -0.034), (13, 0.007), (14, -0.006), (15, -0.008), (16, -0.014), (17, -0.005), (18, 0.009), (19, 0.0), (20, 0.008), (21, -0.023), (22, -0.018), (23, -0.04), (24, -0.0), (25, 0.005), (26, 0.045), (27, -0.066), (28, -0.048), (29, -0.003), (30, 0.025), (31, 0.023), (32, -0.015), (33, 0.038), (34, -0.001), (35, -0.024), (36, 0.003), (37, 0.013), (38, -0.016), (39, -0.029), (40, 0.013), (41, 0.014), (42, -0.015), (43, -0.035), (44, 0.067), (45, 0.031), (46, 0.013), (47, 0.0), (48, -0.002), (49, 0.029)]

similar blog posts:

simIndex simValue blogId blogTitle

same-blog 1 0.96115756 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!


2 0.87482482 1886 andrew gelman stats-2013-06-07-Robust logistic regression


3 0.79870379 2161 andrew gelman stats-2014-01-07-My recent debugging experience

Introduction: OK, so this sort of thing happens sometimes. I was working on a new idea (still working on it; if it ultimately works out—or if it doesn’t—I’ll let you know) and as part of it I was fitting little models in Stan, in a loop. I thought it would make sense to start with linear regression with normal priors and known data variance, because then the exact solution is Gaussian and I can also work with the problem analytically. So I programmed up the algorithm and, no surprise, it didn’t work. I went through my R code, put in print statements here and there, and cleared out bug after bug until at least it stopped crashing. But the algorithm still wasn’t doing what it was supposed to do. So I decided to do something simpler, and just check that the Stan linear regression gave the same answer as the analytic posterior distribution: I ran Stan for tons of iterations, then computed the sample mean and variance of the simulations. It was an example with two coefficients—I’d originally chosen …
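A minimal sketch of the check described above (assuming a unit prior scale and a known noise standard deviation passed as data; the post does not give its code): conjugate normal linear regression, whose exact posterior is Gaussian, so long-run sample means and variances of the Stan draws can be compared against the analytic answer.

data {
  int<lower=1> N;
  int<lower=1> J;        // the post’s example had two coefficients: J = 2
  matrix[N, J] X;
  vector[N] y;
  real<lower=0> sigma;   // known data standard deviation
}
parameters {
  vector[J] beta;
}
model {
  beta ~ normal(0, 1);           // normal prior with known scale
  y ~ normal(X * beta, sigma);   // known-variance likelihood
}
// Analytic check: the posterior is multivariate normal with
// covariance inverse(X'X / sigma^2 + I) and mean covariance * X'y / sigma^2,
// which the empirical mean and variance of the draws should match.

This exact-posterior comparison is a useful debugging pattern in general: reduce the model until the answer is known in closed form, then test the sampler against it.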

4 0.74546665 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas

Introduction: The Stan Model of the Week showcases research using Stan to push the limits of applied statistics. If you have a model that you would like to submit for a future post then send us an email. Our inaugural post comes from Nathan Sanders, a graduate student finishing up his thesis on astrophysics at Harvard. Nathan writes, “Core-collapse supernovae, the luminous explosions of massive stars, exhibit an expansive and meaningful diversity of behavior in their brightness evolution over time (their “light curves”). Our group discovers and monitors these events using the Pan-STARRS1 telescope in Hawaii, and we’ve collected a dataset of about 20,000 individual photometric observations of about 80 Type IIP supernovae, the class my work has focused on. While this dataset provides one of the best available tools to infer the explosion properties of these supernovae, due to the nature of extragalactic astronomy (observing from distances of about 1 billion light years), these light curves typically …

5 0.72480685 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

Introduction: My (coauthored) books on Bayesian data analysis and applied regression are like almost all the other statistics textbooks out there, in that we spend most of our time on the basic distributions such as normal and logistic and then, only as an aside, discuss robust models such as t and robit. Why aren’t the t and robit front and center? Sure, I can see starting with the normal (at least in the Bayesian book, where we actually work out all the algebra), but then why don’t we move on immediately to the real stuff? This isn’t just (or mainly) a question of textbooks or teaching; I’m really thinking here about statistical practice. My statistical practice. Should t and robit be the default? If not, why not? Some possible answers: 10. Estimating the degrees of freedom in the error distribution isn’t so easy, and throwing this extra parameter into the model could make inference unstable. 9. Real data usually don’t have outliers. In practice, fitting a robust model costs you …
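For concreteness, here is a hedged sketch of the alternative under discussion (an illustration, not code from the books): a regression with t-distributed errors, with the degrees of freedom nu as the extra parameter that answer 10 worries about.

data {
  int<lower=1> N;
  int<lower=1> J;
  matrix[N, J] X;
  vector[N] y;
}
parameters {
  vector[J] beta;
  real<lower=0> sigma;
  real<lower=1> nu;   // degrees of freedom: the extra parameter of answer 10
}
model {
  beta ~ normal(0, 2.5);   // weakly informative prior, an assumption
  sigma ~ normal(0, 1);    // half-normal via the lower bound
  nu ~ gamma(2, 0.1);      // a common weak prior that keeps nu away from the unstable region near zero
  y ~ student_t(nu, X * beta, sigma);   // heavy tails; nu -> infinity recovers the normal model
}

The sampling statement is a one-line change from the normal model, which is part of the argument for making t the default; the cost is the extra, weakly identified parameter nu.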

6 0.72476739 1486 andrew gelman stats-2012-09-07-Prior distributions for regression coefficients

7 0.72356564 2242 andrew gelman stats-2014-03-10-Stan Model of the Week: PK Calculation of IV and Oral Dosing

8 0.71583951 2291 andrew gelman stats-2014-04-14-Transitioning to Stan

9 0.7150228 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

10 0.70675755 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

11 0.70435959 1849 andrew gelman stats-2013-05-09-Same old same old

12 0.70304626 2342 andrew gelman stats-2014-05-21-Models with constraints

13 0.6932463 1460 andrew gelman stats-2012-08-16-“Real data can be a pain”

14 0.68691802 327 andrew gelman stats-2010-10-07-There are never 70 distinct parameters

15 0.68676138 39 andrew gelman stats-2010-05-18-The 1.6 rule

16 0.683568 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

17 0.68265778 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.

18 0.68168986 2003 andrew gelman stats-2013-08-30-Stan Project: Continuous Relaxations for Discrete MRFs

19 0.68071657 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

20 0.6767056 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?


similar blog posts computed by the LDA model

LDA topic weights for this blog post:

topicId topicWeight

[(15, 0.016), (24, 0.18), (40, 0.046), (44, 0.029), (48, 0.012), (63, 0.188), (71, 0.019), (73, 0.015), (86, 0.022), (99, 0.347)]

similar blog posts:

simIndex simValue blogId blogTitle

1 0.9885971 1078 andrew gelman stats-2011-12-22-Tables as graphs: The Ramanujan principle

Introduction: Tables are commonly read as crude graphs: what you notice in a table of numbers is (a) the minus signs, and thus which values are positive and which are negative, and (b) the length of each number, that is, its order of magnitude. The most famous example of such a read might be when the mathematician Srinivasa Ramanujan supposedly conjectured the asymptotic form of the partition function based on a look at a table of the first several partition numbers: he was essentially looking at a graph on the logarithmic scale. I discuss some modern-day statistical examples in this article for Significance magazine. I had a lot of fun creating the “calculator font” for the above graph in R and then writing the article. I hope you enjoy it too! P.S. Also check out this short note by Marcin Kozak and Wojtek Krzanowski on effective presentation of data. P.P.S. I wrote this blog entry a month ago and had it in storage. Then my issue of Significance came in the mail—with my …

2 0.978181 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

Introduction: When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit(0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc.), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real-time constraints at each step. So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth …
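The ordered version of the kluge can be written directly in Stan as continuation-ratio logits (a sketch under assumptions about details the excerpt leaves open, such as priors and pooling): each observation with outcome y >= c contributes a Bernoulli term for “exactly c” versus “beyond c”.

data {
  int<lower=2> C;                     // number of ordered categories
  int<lower=1> N;
  int<lower=1> J;
  matrix[N, J] X;
  array[N] int<lower=1, upper=C> y;
}
parameters {
  vector[C - 1] alpha;     // intercept for each split
  matrix[C - 1, J] beta;   // one coefficient vector per split
}
model {
  alpha ~ normal(0, 5);                // an assumption
  to_vector(beta) ~ normal(0, 2.5);    // regularization in the bayesglm spirit
  for (n in 1:N) {
    // split c asks: is the outcome exactly c, given that it is at least c?
    for (c in 1:min(y[n], C - 1)) {
      target += bernoulli_logit_lpmf(y[n] == c | alpha[c] + X[n] * beta[c]');
    }
  }
}

Fitting C - 1 separate logits this way avoids the multinomial likelihood, at the cost of sharing no information across the splits; a multilevel prior on the rows of beta would restore some pooling in the spirit of the post.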

3 0.97507846 313 andrew gelman stats-2010-10-03-A question for psychometricians

Introduction: Don Coffin writes: A colleague of mine and I are doing a presentation for new faculty on a number of topics related to teaching. Our charge is to identify interesting issues and to find research-based information for them about how to approach things. So, what I wondered is, do you know of any published research dealing with the sort of issues about structuring a course and final exam in the ways you talk about in this blog post ? Some poking around in the usual places hasn’t turned anything up yet. I don’t really know the psychometrics literature but I imagine that some good stuff has been written on principles of test design. There are probably some good papers from back in the 1920s. Can anyone supply some references?

4 0.97208667 102 andrew gelman stats-2010-06-21-Why modern art is all in the mind

Introduction: This looks cool: Ten years ago researchers in America took two groups of three-year-olds and showed them a blob of paint on a canvas. Children who were told that the marks were the result of an accidental spillage showed little interest. The others, who had been told that the splodge of colour had been carefully created for them, started to refer to it as “a painting”. Now that experiment . . . has gone on to form part of the foundation of an influential new book that questions the way in which we respond to art. . . . The book, which is subtitled The New Science of Why We Like What We Like, is not an attack on modern or contemporary art and Bloom says fans of more traditional art are not capable of making purely aesthetic judgments either. “I don’t have a strong position about the art itself,” he said this weekend. “But I do have a strong position about why we actually like it.” This sounds fascinating. But I’m skeptical about this part: Humans are incapable of just getting …

5 0.96162039 1621 andrew gelman stats-2012-12-13-Puzzles of criminal justice

Introduction: Four recent news stories about crime and punishment made me realize, yet again, how little I understand all this. 1. “HSBC to Pay $1.92 Billion to Settle Charges of Money Laundering”: State and federal authorities decided against indicting HSBC in a money-laundering case over concerns that criminal charges could jeopardize one of the world’s largest banks and ultimately destabilize the global financial system. Instead, HSBC announced on Tuesday that it had agreed to a record $1.92 billion settlement with authorities. . . . I don’t understand this idea of punishing the institution. I have the same problem when the NCAA punishes a college football program. These are individual people breaking the law (or the rules), right? So why not punish them directly? Giving 40 lashes to a bunch of HSBC executives and garnisheeing their salaries for life, say, that wouldn’t destabilize the global financial system, would it? From the article: “A money-laundering indictment, or a guilty …

6 0.95535886 293 andrew gelman stats-2010-09-23-Lowess is great

7 0.95068437 745 andrew gelman stats-2011-06-04-High-level intellectual discussions in the Columbia statistics department

8 0.95036143 1484 andrew gelman stats-2012-09-05-Two exciting movie ideas: “Second Chance U” and “The New Dirty Dozen”

same-blog 9 0.95019507 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!

10 0.94911671 1480 andrew gelman stats-2012-09-02-“If our product is harmful . . . we’ll stop making it.”

11 0.94313872 33 andrew gelman stats-2010-05-14-Felix Salmon wins the American Statistical Association’s Excellence in Statistical Reporting Award

12 0.93742645 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

13 0.93211198 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points

14 0.93079817 286 andrew gelman stats-2010-09-20-Are the Democrats avoiding a national campaign?

15 0.93036413 544 andrew gelman stats-2011-01-29-Splitting the data

16 0.92865419 2103 andrew gelman stats-2013-11-16-Objects of the class “Objects of the class”

17 0.92229736 628 andrew gelman stats-2011-03-25-100-year floods

18 0.92107975 2178 andrew gelman stats-2014-01-20-Mailing List Degree-of-Difficulty Difficulty

19 0.91873783 1201 andrew gelman stats-2012-03-07-Inference = data + model

20 0.9181565 421 andrew gelman stats-2010-11-19-Just chaid