andrew_gelman_stats andrew_gelman_stats-2014 andrew_gelman_stats-2014-2342 knowledge-graph by maker-knowledge-mining

2342 andrew gelman stats-2014-05-21-Models with constraints


Meta info for this blog

Source: html

Introduction: I had an interesting conversation with Aki about monotonicity constraints. We were discussing a particular set of Gaussian processes that we were fitting to the arsenic well-switching data (the example from the logistic regression chapter in my book with Jennifer) but some more general issues arose that I thought might interest you. The idea was to fit a model where the response (the logit probability of switching wells) was constrained to be monotonically increasing in your current arsenic level and monotonically decreasing in your current distance to the closest safe well. These constraints seem reasonable enough, but when we actually fit the model we found that doing Bayesian inference with the constraint pulled the estimate, not just toward monotonicity, but to a strong increase (for the increasing relation) or a strong decrease (for the decreasing relation). This makes sense from a statistical standpoint because if you restrict a parameter to be nonnegative, any posterior distribution will end up on the positive half of the line.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 I had an interesting conversation with Aki about monotonicity constraints. [sent-1, score-0.299]

2 We were discussing a particular set of Gaussian processes that we were fitting to the arsenic well-switching data (the example from the logistic regression chapter in my book with Jennifer) but some more general issues arose that I thought might interest you. [sent-2, score-0.67]

3 The idea was to fit a model where the response (the logit probability of switching wells) was constrained to be monotonically increasing in your current arsenic level and monotonically decreasing in your current distance to the closest safe well. [sent-3, score-2.132]

4 These constraints seem reasonable enough, but when we actually fit the model we found that doing Bayesian inference with the constraint pulled the estimate, not just toward monotonicity, but to a strong increase (for the increasing relation) or a strong decrease (for the decreasing relation). [sent-4, score-1.304]

5 This makes sense from a statistical standpoint because if you restrict a parameter to be nonnegative, any posterior distribution will end up on the positive half of the line. [sent-5, score-0.139]

6 Thinking about it more, I’m not always comfortable with any strict constraint unless there is a clear physical reason. [sent-7, score-0.499]

7 For example, yes, it seems logical that increasing arsenic would increase the probability of switching, but I could imagine that in any particular dataset there could be areas of negative slope. [sent-8, score-1.273]

8 After all, it is observational data and, for example, there could be a village that happens to have arsenic in a particularly high range but where, for cultural reasons, there would be less switching. [sent-9, score-0.983]

9 Here there’s an omitted variable (“culture” or a village indicator) but the point is that these (hypothetical) data would not really support a strictly monotonic model, and including that restriction could distort things in other ways. [sent-10, score-0.887]

10 It does not mean we should ignore prior information, of course, but it’s a reason that I prefer soft rather than hard constraints. [sent-12, score-0.218]

11 Alternatively, in this example one could put a hard constraint on the monotonicity and then add a latent omitted variable, which would have the effect of turning it into a soft constraint, but I don’t usually see people do this. [sent-13, score-1.424]
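
To make the soft-versus-hard distinction concrete, here is a small R sketch (a toy illustration of the general point, not the arsenic Gaussian-process model from the post): a grid approximation to the posterior of a single regression slope whose true value is barely positive, computed once under a hard nonnegativity constraint and once under a soft prior that merely favors positive values. All numbers are made up.

set.seed(1)
n <- 50
x <- rnorm(n)
y <- rnorm(n, mean = 0.05 * x, sd = 1)   # true slope is barely positive

slope <- seq(-1, 1, by = 0.005)
loglik <- sapply(slope, function(b) sum(dnorm(y, b * x, 1, log = TRUE)))
lik <- exp(loglik - max(loglik))

## Hard constraint: flat prior on [0, Inf); all mass below zero is excluded
post_hard <- lik * (slope >= 0)
post_hard <- post_hard / sum(post_hard)

## Soft constraint: a prior that favors positive slopes but still allows
## small negative ones
post_soft <- lik * dnorm(slope, mean = 0.2, sd = 0.3)
post_soft <- post_soft / sum(post_soft)

## Posterior means under the two versions: the hard constraint forces all
## mass above zero, which typically pulls the estimate further from zero
c(hard = sum(slope * post_hard), soft = sum(slope * post_soft))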


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('arsenic', 0.442), ('constraint', 0.342), ('monotonicity', 0.299), ('monotonically', 0.221), ('omitted', 0.182), ('switching', 0.178), ('increasing', 0.175), ('village', 0.174), ('decreasing', 0.168), ('soft', 0.154), ('relation', 0.123), ('monotonic', 0.104), ('wells', 0.104), ('increase', 0.095), ('restriction', 0.093), ('variable', 0.09), ('distort', 0.089), ('strict', 0.087), ('particular', 0.087), ('alternatively', 0.085), ('strong', 0.084), ('pulled', 0.083), ('current', 0.082), ('could', 0.08), ('closest', 0.08), ('aki', 0.077), ('indicator', 0.076), ('constrained', 0.076), ('strictly', 0.075), ('example', 0.073), ('turning', 0.073), ('fit', 0.071), ('decrease', 0.071), ('standpoint', 0.07), ('probability', 0.07), ('comfortable', 0.07), ('logit', 0.07), ('restrict', 0.069), ('arose', 0.068), ('gaussian', 0.067), ('latent', 0.067), ('constraints', 0.066), ('logical', 0.066), ('distance', 0.066), ('safe', 0.065), ('model', 0.065), ('hypothetical', 0.065), ('cultural', 0.064), ('hard', 0.064), ('observational', 0.063)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 2342 andrew gelman stats-2014-05-21-Models with constraints


2 0.13449201 234 andrew gelman stats-2010-08-25-Modeling constrained parameters

Introduction: Mike McLaughlin writes: In general, is there any way to do MCMC with a fixed constraint? E.g., suppose I measure the three internal angles of a triangle with errors ~dnorm(0, tau) where tau might be different for the three measurements. This would be an easy BUGS/WinBUGS/JAGS exercise but suppose, in addition, I wanted to include prior information to the effect that the three angles had to total 180 degrees exactly. Is this feasible? Could you point me to any BUGS model in which a constraint of this type is implemented? Note: Even in my own (non-hierarchical) code which tends to be component-wise, random-walk Metropolis with tuned Laplacian proposals, I cannot see how I could incorporate such a constraint. My reply: See page 508 of Bayesian Data Analysis (2nd edition). We have an example of such a model there (from this paper with Bois and Jiang).
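
The reply above points to a worked example in BDA; the sketch below is not that example, just a rough R illustration of one standard way to impose an exact sum constraint: parameterize two of the angles freely, define the third as 180 minus the other two, and run a simple random-walk Metropolis on the two free angles. The measurements and measurement sds are invented.

set.seed(2)
y   <- c(61.2, 58.5, 59.8)    # measured angles (made-up data)
sds <- c(0.5, 0.8, 0.6)       # assumed measurement sds

log_post <- function(a12) {
  a <- c(a12, 180 - sum(a12))          # third angle is fixed by the constraint
  if (any(a <= 0)) return(-Inf)        # all angles must be positive
  sum(dnorm(y, a, sds, log = TRUE))    # flat prior over the feasible region
}

draws <- matrix(NA, 5000, 2)
cur <- c(60, 60)
for (i in seq_len(nrow(draws))) {      # random-walk Metropolis on the free angles
  prop <- cur + rnorm(2, 0, 0.3)
  if (log(runif(1)) < log_post(prop) - log_post(cur)) cur <- prop
  draws[i, ] <- cur
}
angles <- cbind(draws, 180 - rowSums(draws))   # every draw sums to exactly 180
colMeans(angles)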

3 0.12035774 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

Introduction: Following up on Christian’s post [link fixed] on the topic, I’d like to offer a few thoughts of my own. In BDA, we express the idea that a noninformative prior is a placeholder: you can use the noninformative prior to get the analysis started, then if your posterior distribution is less informative than you would like, or if it does not make sense, you can go back and add prior information. Same thing for the data model (the “likelihood”), for that matter: it often makes sense to start with something simple and conventional and then go from there. So, in that sense, noninformative priors are no big deal, they’re just a way to get started. Just don’t take them too seriously. Traditionally in statistics we’ve worked with the paradigm of a single highly informative dataset with only weak external information. But if the data are sparse and prior information is strong, we have to think differently. And, when you increase the dimensionality of a problem, both these things hap

4 0.10898779 1017 andrew gelman stats-2011-11-18-Lack of complete overlap

Introduction: Evens Salies writes: I have a question regarding a randomizing constraint in my current funded electricity experiment. After elimination of missing data we have 110 voluntary households from a larger population (resource constraints do not allow us to have more households!). I randomly assign them to treated and non-treated, where the treatment variable is some ICT that allows the treated to track their electricity consumption in real time. The ICT is made of two devices, one that is plugged into the household’s modem and the other on the electric meter. A necessary condition for being treated is that the distance between the box and the meter be below some threshold (d), the value of which is 20 meters approximately. 50 ICTs can be installed. 60 households will be in the control group. But, I can only assign 6 households in the control group for whom d is less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the group of treated.

5 0.10876469 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

Introduction: When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real-time constraints at each step. So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth
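
As a minimal illustration of the series-of-logits idea in that excerpt, here is a sketch on simulated data with an ordered three-category outcome. It uses plain glm() to stay self-contained; the excerpt itself suggests bayesglm (from the arm package) when separation or regularization is a concern.

set.seed(3)
n <- 500
x <- rnorm(n)
z <- 0.8 * x + rlogis(n)           # latent variable
y <- 1 + (z > -0.5) + (z > 0.5)    # ordered outcome in {1, 2, 3}

## Logit 1: category 1 vs. categories 2-3
fit1 <- glm(I(y > 1) ~ x, family = binomial)
## Logit 2: category 2 vs. category 3, restricted to observations with y >= 2
fit2 <- glm(I(y > 2) ~ x, family = binomial, subset = y >= 2)
rbind(split1 = coef(fit1), split2 = coef(fit2))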

6 0.10396129 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

7 0.10222405 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences

8 0.10193302 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

9 0.097513646 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

10 0.096178904 1941 andrew gelman stats-2013-07-16-Priors

11 0.093851067 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

12 0.091720991 1543 andrew gelman stats-2012-10-21-Model complexity as a function of sample size

13 0.085266747 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

14 0.084149815 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

15 0.083592378 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

16 0.083011553 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”

17 0.080166399 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

18 0.07949286 1716 andrew gelman stats-2013-02-09-iPython Notebook

19 0.077412248 1735 andrew gelman stats-2013-02-24-F-f-f-fake data

20 0.076725177 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.158), (1, 0.096), (2, 0.033), (3, 0.017), (4, 0.003), (5, -0.018), (6, 0.034), (7, 0.001), (8, 0.014), (9, 0.029), (10, -0.003), (11, 0.023), (12, -0.024), (13, -0.02), (14, 0.009), (15, 0.011), (16, 0.015), (17, 0.001), (18, 0.0), (19, -0.02), (20, 0.011), (21, 0.008), (22, -0.008), (23, -0.039), (24, -0.008), (25, 0.034), (26, 0.013), (27, -0.017), (28, 0.0), (29, -0.012), (30, -0.011), (31, 0.012), (32, -0.041), (33, 0.02), (34, -0.017), (35, -0.039), (36, -0.001), (37, -0.017), (38, -0.025), (39, 0.015), (40, 0.001), (41, -0.034), (42, -0.031), (43, -0.041), (44, 0.029), (45, 0.041), (46, 0.031), (47, 0.024), (48, -0.023), (49, 0.032)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9578737 2342 andrew gelman stats-2014-05-21-Models with constraints


2 0.79464626 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits


3 0.79073489 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!

Introduction: Following up on yesterday’s post, here’s David Chudzicki’s story (with graphs and Stan/R code!) of how he fit a model for an increasing function (“isotonic regression”). Chudzicki writes: This post will describe a way I came up with of fitting a function that’s constrained to be increasing, using Stan. If you want practical help, standard statistical approaches, or expert research, this isn’t the place for you (look up “isotonic regression” or “Bayesian isotonic regression” or David Dunson). This is the place for you if you want to read about how I thought about setting up a model, implemented the model in Stan, and created graphics to understand what was going on. The background is that a simple, natural-seeming uniform prior on the function values does not work so well—it’s a much stronger prior distribution than one might naively think, just one of those unexpected aspects of high-dimensional probability distributions. So Chudzicki sets up a more general family with a hype
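
The sketch below is not Chudzicki’s Stan model; it is a rough R-only illustration of the same core trick for an increasing function: write it as a starting level plus a cumulative sum of exponentiated (hence nonnegative) increments, so monotonicity holds by construction, and fit it here by penalized least squares to simulated data.

set.seed(4)
x <- sort(runif(80))
y <- sqrt(x) + rnorm(80, 0, 0.1)    # true curve is increasing

knots <- seq(0, 1, length.out = 21)
bin <- findInterval(x, knots, all.inside = TRUE)   # 20 bins

obj <- function(par) {
  ## par[1] is the starting level; exp(par[-1]) are 19 nonnegative jumps
  level <- par[1] + cumsum(c(0, exp(par[-1])))
  sum((y - level[bin])^2) + 0.01 * sum(exp(par[-1]))   # small penalty on total rise
}
fit <- optim(c(0, rep(log(0.05), 19)), obj, method = "BFGS",
             control = list(maxit = 500))
level_hat <- fit$par[1] + cumsum(c(0, exp(fit$par[-1])))

plot(x, y)
lines(knots[-length(knots)], level_hat, type = "s", col = 2)   # increasing by construction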

4 0.78315765 1284 andrew gelman stats-2012-04-26-Modeling probability data

Introduction: Rafael Huber writes: I conducted an experiment in which subjects were asked to estimate the probability of a certain event given a number of pieces of information (like a weather forecaster or a stock-market trader). These probability estimates are the dependent variable of my experiment. My goal is to model the data with a (hierarchical) Bayesian regression. A linear equation with all the presented information (quantified as log odds) defines the mu of a normal likelihood. The tau as precision is another free parameter. y[r] ~ dnorm( mu[r] , tau[ subj[r] ] ) mu[r] <- b0[ subj[r] ] + b1[ subj[r] ] * x1[r] + b2[ subj[r] ] * x2[r] + b3[ subj[r] ] * x3[r] My problem is that I do not believe that the normal is the correct probability distribution to model probability data (because the error is limited). However, until now nobody was able to tell me how I can correctly model probability data. My reply: You can take the logit of the data before analyzing them. That is assuming there
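
A minimal sketch of the suggestion in the reply: map the reported probabilities to the logit scale before fitting a normal regression, so the errors are no longer squeezed against the 0 and 1 bounds. The data and predictor names below are simulated placeholders, not Huber’s.

set.seed(5)
n <- 200
x1 <- rnorm(n); x2 <- rnorm(n)
p_true <- plogis(0.3 + 0.8 * x1 - 0.5 * x2)
p_obs <- plogis(qlogis(p_true) + rnorm(n, 0, 0.4))   # noisy reported probabilities in (0, 1)

fit <- lm(qlogis(p_obs) ~ x1 + x2)    # normal regression on the logit scale
coef(fit)
## Fitted values map back to probabilities with plogis(); reported values of
## exactly 0 or 1 would need a small adjustment before taking the logit.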

5 0.78300691 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?

Introduction: Jean Richardson writes: Do you know what might lead to a large negative cross-correlation (-0.95) between deviance and one of the model parameters? Here’s the (brief) background: I [Richardson] have written a Bayesian hierarchical site occupancy model for presence of disease on individual amphibians. The response variable is therefore binary (disease present/absent) and the probability of disease being present in an individual (psi) depends on various covariates (species of amphibian, location sampled, etc.) parameterized using a logit link function. Replicates are individuals sampled (tested for presence of disease) together. The possibility of imperfect detection is included as p = (prob. disease detected given disease is present). Posterior distributions were estimated using WinBUGS via R2WinBUGS. Simulated data from the model fit the real data very well and posterior distribution densities seem robust to any changes in the model (different priors, etc.) All autocor
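
As a stripped-down sketch of the data structure described in that excerpt (not Richardson’s hierarchical model, and with invented numbers), one can simulate occupancy with imperfect detection like this: a latent disease state drawn with probability psi on the logit scale, and observed positives that can only occur when disease is truly present.

set.seed(6)
n_ind <- 300
x <- rnorm(n_ind)                        # a covariate (e.g., standardized habitat measure)
psi <- plogis(-0.5 + 1.0 * x)            # prob. disease is present
z <- rbinom(n_ind, 1, psi)               # true (latent) disease state
p_detect <- 0.8                          # prob. of detecting disease given present
n_rep <- 3                               # replicate tests per individual
y <- rbinom(n_ind, n_rep, z * p_detect)  # positives observed out of n_rep tests
table(true = z, any_detected = y > 0)    # false negatives show up as z = 1, detected FALSE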

6 0.77284724 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

7 0.76912087 1723 andrew gelman stats-2013-02-15-Wacky priors can work well?

8 0.76005435 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

9 0.75829905 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

10 0.75778574 1017 andrew gelman stats-2011-11-18-Lack of complete overlap

11 0.73953503 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

12 0.73866183 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

13 0.73549426 234 andrew gelman stats-2010-08-25-Modeling constrained parameters

14 0.73407745 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things

15 0.72950441 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model

16 0.72719526 996 andrew gelman stats-2011-11-07-Chi-square FAIL when many cells have small expected values

17 0.72378898 547 andrew gelman stats-2011-01-31-Using sample size in the prior distribution

18 0.71431082 2364 andrew gelman stats-2014-06-08-Regression and causality and variable ordering

19 0.71410769 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!

20 0.71230108 56 andrew gelman stats-2010-05-28-Another argument in favor of expressing conditional probability statements using the population distribution


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.052), (6, 0.028), (13, 0.044), (15, 0.042), (16, 0.057), (24, 0.117), (42, 0.01), (51, 0.037), (53, 0.058), (56, 0.028), (57, 0.02), (59, 0.062), (67, 0.012), (73, 0.044), (86, 0.013), (87, 0.032), (99, 0.247)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95609879 2342 andrew gelman stats-2014-05-21-Models with constraints


2 0.91519332 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

Introduction: John Kastellec points me to this blog by Ezra Klein criticizing the following graph from a recent Republican Party report: Klein (following Alexander Hart) slams the graph for not going all the way to zero on the y-axis, thus making the projected change seem bigger than it really is. I agree with Klein and Hart that, if you’re gonna do a bar chart, you want the bars to go down to 0. On the other hand, a projected change from 19% to 23% is actually pretty big, and I don’t see the point of using a graphical display that hides it. The solution: Ditch the bar graph entirely and replace it by a lineplot, in particular, a time series with year-by-year data. The time series would have several advantages: 1. Data are placed in context. You’d see every year, instead of discrete averages, and you’d get to see the changes in the context of year-to-year variation. 2. With the time series, you can use whatever y-axis works with the data. No need to go to zero. P.S. I l
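
A minimal sketch in base R of the suggested fix: show the quantity as a year-by-year line rather than two bars. The numbers below are invented placeholders, not the figures from the report Klein discusses.

years <- 2000:2020
pct <- c(17.8, 18.0, 18.3, 18.1, 18.4, 18.6, 18.5, 18.9, 19.2, 19.0, 19.1,
         19.4, 19.8, 20.1, 20.6, 21.0, 21.5, 21.9, 22.3, 22.7, 23.0)
plot(years, pct, type = "l", xlab = "Year", ylab = "Percent")
points(years, pct, pch = 16)
## With the full series there is no need to force the y-axis down to zero;
## year-to-year variation provides the context for the projected change.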

3 0.91511589 1764 andrew gelman stats-2013-03-15-How do I make my graphs?

Introduction: Someone who wishes to remain anonymous writes: I’ve been following your blog a long time and enjoy your posts on visualization/statistical graphics matters. I don’t recall, however, you ever describing the details of your setup for plotting. I’m a new R user (convert from matplotlib) and would love to know your thoughts on the ideal setup: do you use mainly the R base? Do you use lattice? What do you think of ggplot2? etc. I found ggplot2 nearly indecipherable until a recent eureka moment, and I think its default theme is a tremendous waste of ink (all those silly grey backgrounds and grids are really unnecessary), but if you customize that away it can be made to look like ordinary, pretty statistical graphs. Feel free to respond on your blog, but if you do, please remove my name from the post (my colleagues already make fun of me for thinking about visualization too much.) I love that last bit! Anyway, my response is that I do everything in base graphics (using my

4 0.91498852 2013 andrew gelman stats-2013-09-08-What we need here is some peer review for statistical graphics

Introduction: Under the heading, “Bad graph candidate,” Kevin Wright points to this article [link fixed], writing: Some of the figures use the same line type for two different series. More egregious are the confidence intervals that are constant width instead of increasing in width into the future. Indeed. What’s even more embarrassing is that these graphs appeared in an article in the magazine Significance, sponsored by the American Statistical Association and the Royal Statistical Society. Perhaps every scientific journal could have a graphics editor whose job is to point out really horrible problems and require authors to make improvements. The difficulty, as always, is that scientists write these articles for free and as a public service (publishing in Significance doesn’t pay, nor does it count as a publication in an academic record), so it might be difficult to get authors to fix their graphs. On the other hand, if an article is worth writing at all, it’s worth trying to conv

5 0.91382593 517 andrew gelman stats-2011-01-14-Bayes in China update

Introduction: Some clarification on the Bayes-in-China issue raised last week: 1. We heard that the Chinese publisher cited the following pages that might contain politically objectionable materials: 3, 5, 21, 73, 112, 201. 2. It appears that, as some commenters suggested, the objection was to some of the applications, not to the Bayesian methods. 3. Our book is not censored in China. In fact, as some commenters mentioned, it is possible to buy it there, and it is also available in university libraries there. The edition of the book which was canceled was intended to be a low-cost reprint of the book. The original book is still available. I used the phrase “Banned in China” as a joke and I apologize if it was misinterpreted. 4. I have no quarrel with the Chinese government or with any Chinese publishers. They can publish whatever books they would like. I found this episode amusing only because I do not think my book on regression and multilevel models has any strong political co

6 0.91174436 446 andrew gelman stats-2010-12-03-Is 0.05 too strict as a p-value threshold?

7 0.91145486 1380 andrew gelman stats-2012-06-15-Coaching, teaching, and writing

8 0.91036111 687 andrew gelman stats-2011-04-29-Zero is zero

9 0.90906215 1960 andrew gelman stats-2013-07-28-More on that machine learning course

10 0.90829211 1453 andrew gelman stats-2012-08-10-Quotes from me!

11 0.90791243 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly

12 0.90749806 2313 andrew gelman stats-2014-04-30-Seth Roberts

13 0.90699065 350 andrew gelman stats-2010-10-18-Subtle statistical issues to be debated on TV.

14 0.9066655 248 andrew gelman stats-2010-09-01-Ratios where the numerator and denominator both change signs

15 0.9066022 1914 andrew gelman stats-2013-06-25-Is there too much coauthorship in economics (and science more generally)? Or too little?

16 0.90655696 1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model

17 0.90622211 1555 andrew gelman stats-2012-10-31-Social scientists who use medical analogies to explain causal inference are, I think, implicitly trying to borrow some of the scientific and cultural authority of that field for our own purposes

18 0.90547663 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics

19 0.90517128 61 andrew gelman stats-2010-05-31-A data visualization manifesto

20 0.90474993 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical