andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-696 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: We’re having a problem with starting values in glm(). A very simple logistic regression with just an intercept and a very simple starting value (beta=5) blows up. Here’s the R code:

> y <- rep (c(1,0), c(10,5))
> glm (y ~ 1, family=binomial(link="logit"))

Call:  glm(formula = y ~ 1, family = binomial(link = "logit"))

Coefficients:
(Intercept)
     0.6931

Degrees of Freedom: 14 Total (i.e. Null);  14 Residual
Null Deviance:     19.1
Residual Deviance: 19.1    AIC: 21.1

> glm (y ~ 1, family=binomial(link="logit"), start=2)

Call:  glm(formula = y ~ 1, family = binomial(link = "logit"), start = 2)

Coefficients:
(Intercept)
     0.6931

Degrees of Freedom: 14 Total (i.e. Null);  14 Residual
Null Deviance:     19.1
Residual Deviance: 19.1    AIC: 21.1

> glm (y ~ 1, family=binomial(link="logit"), start=5)

Call:  glm(formula = y ~ 1, family = binomial(link = "logit"), start = 5)

Coefficients:
(Intercept)
  1.501e+15

Degrees of Freedom: 14 Total (i.e. Null);  14 Residual
...
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

What’s going on? Just to be clear: my problem is not with the “fitted probabilities numerically 0 or 1 occurred” thing. My problem is that when I start with a not-ridiculous starting value of 5, glm does not converge to the correct estimate of 0.69.
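For comparison, here is a minimal sketch (my addition, not something from the post) showing that the likelihood itself is perfectly well behaved: maximizing the Bernoulli log-likelihood directly with optim(), from the same starting value of 5, lands on 0.69 with no drama. The function name negloglik is just an illustrative choice.

y <- rep(c(1, 0), c(10, 5))
# negative log-likelihood of the intercept-only logistic regression
negloglik <- function(beta) -sum(y * beta - log1p(exp(beta)))
fit <- optim(par = 5, fn = negloglik, method = "BFGS")
fit$par   # about 0.6931 = log(10/5), the estimate glm() returns from its default start

So the blow-up is a property of the iteratively reweighted least squares iterations inside glm(), not of the likelihood; that point is taken up in the follow-up post listed below ("Computational problems with glm etc.").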
wordName wordTfidf (topN-words)
[('glm', 0.436), ('deviance', 0.345), ('residual', 0.332), ('binomial', 0.309), ('logit', 0.277), ('null', 0.247), ('family', 0.216), ('intercept', 0.215), ('aic', 0.172), ('formula', 0.143), ('link', 0.14), ('blows', 0.138), ('numerically', 0.138), ('freedom', 0.128), ('degrees', 0.127), ('start', 0.121), ('coefficients', 0.111), ('starting', 0.103), ('total', 0.101), ('warning', 0.099), ('occurred', 0.097), ('call', 0.082), ('fitted', 0.082), ('probabilities', 0.073), ('rep', 0.06), ('converge', 0.055), ('blow', 0.054), ('value', 0.053), ('beta', 0.049), ('problem', 0.048), ('simple', 0.045), ('logistic', 0.037), ('message', 0.034), ('shouldn', 0.034), ('code', 0.032), ('values', 0.03), ('correct', 0.029), ('care', 0.029), ('instead', 0.024), ('regression', 0.023), ('estimate', 0.023), ('clear', 0.022), ('going', 0.019), ('re', 0.013)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 696 andrew gelman stats-2011-05-04-Whassup with glm()?
2 0.62378514 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.
Introduction: John Mount provides some useful background and follow-up on our discussion from last year on computational instability of the usual logistic regression solver. Just to refresh your memory, here’s a simple logistic regression with only a constant term and no separation, nothing pathological at all:

> y <- rep (c(1,0), c(10,5))
> display (glm (y ~ 1, family=binomial(link="logit")))
glm(formula = y ~ 1, family = binomial(link = "logit"))
            coef.est coef.se
(Intercept) 0.69     0.55
---
  n = 15, k = 1
  residual deviance = 19.1, null deviance = 19.1 (difference = 0.0)

And here’s what happens when we give it the not-outrageous starting value of -2:

> display (glm (y ~ 1, family=binomial(link="logit"), start=-2))
glm(formula = y ~ 1, family = binomial(link = "logit"), start = -2)
            coef.est coef.se
(Intercept) 71.97    17327434.18
---
  n = 15, k = 1
  residual deviance = 360.4, null deviance = 19.1 (difference = -341.3)

Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
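To make the mechanism concrete, here is a hand-rolled sketch (mine, not from either post, and simplified relative to what glm.fit actually does) of a single iteratively-reweighted-least-squares/Newton update for the intercept-only model. From a sensible start the update is harmless; from a start like 5 the fitted probability is already nearly 1, the weight mu*(1 - mu) is tiny, and one step overshoots to roughly -44, after which the next step catapults the estimate to the astronomical values printed above.

y <- rep(c(1, 0), c(10, 5))
irls_step <- function(beta) {
  mu <- plogis(beta)          # fitted probability at the current estimate
  w  <- mu * (1 - mu)         # IRLS weight; nearly zero once mu is numerically 0 or 1
  beta + mean(y - mu) / w     # Newton/IRLS update for the intercept
}
irls_step(0.69)               # stays near 0.69: fine from a reasonable start
b1 <- irls_step(5); b1        # huge overshoot, to roughly -44
b2 <- irls_step(b1); b2       # and from there the estimate explodes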
3 0.30265656 729 andrew gelman stats-2011-05-24-Deviance as a difference
Introduction: Peng Yu writes: On page 180 of BDA2, deviance is defined as D(y,\theta) = -2 log p(y|\theta). However, according to GLM 2/e by McCullagh and Nelder, deviance is the difference between the log-likelihoods of the full model and the base model (times 2) (see the equation on the wiki webpage). The English word ‘deviance’ implies the difference from a standard (in this case, the base model). I’m wondering what the rationale is for your definition of deviance, which consists of only one term rather than two. My reply: Deviance is typically computed as a relative quantity; that is, people look at the difference in deviance. So the two definitions are equivalent.
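A quick numerical illustration of that reply (my own toy example, not from the post): in R, deviance() on a glm fit reports the McCullagh-Nelder quantity, defined relative to the saturated model, while -2*logLik() is the single-term, BDA-style quantity. The two differ only by a constant, so differences between models come out the same either way.

set.seed(1)
x  <- rnorm(50)
y  <- rbinom(50, 1, plogis(0.5 + 0.8 * x))
m0 <- glm(y ~ 1, family = binomial)   # base model
m1 <- glm(y ~ x, family = binomial)   # fuller model

D_bda <- function(m) -2 * as.numeric(logLik(m))   # D = -2 log p(y | theta-hat)
deviance(m0) - deviance(m1)   # difference in deviance, McCullagh-Nelder definition
D_bda(m0) - D_bda(m1)         # the same number under the one-term definition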
Introduction: Jean Richardson writes: Do you know what might lead to a large negative cross-correlation (-0.95) between deviance and one of the model parameters? Here’s the (brief) background: I [Richardson] have written a Bayesian hierarchical site occupancy model for presence of disease on individual amphibians. The response variable is therefore binary (disease present/absent) and the probability of disease being present in an individual (psi) depends on various covariates (species of amphibian, location sampled, etc.) parameterized using a logit link function. Replicates are individuals sampled (tested for presence of disease) together. The possibility of imperfect detection is included as p = (prob. disease detected given disease is present). Posterior distributions were estimated using WinBUGS via R2WinBUGS. Simulated data from the model fit the real data very well and posterior distribution densities seem robust to any changes in the model (different priors, etc.) All autocor
5 0.18168378 547 andrew gelman stats-2011-01-31-Using sample size in the prior distribution
Introduction: Mike McLaughlin writes: Consider the Seeds example in vol. 1 of the BUGS examples. There, a binomial likelihood has a p parameter constructed, via logit, from two covariates. What I am wondering is: Would it be legitimate, in a binomial + logit problem like this, to allow binomial p[i] to be a function of the corresponding n[i] or would that amount to using the data in the prior? In other words, in the context of the Seeds example, is r[] the only data or is n[] data as well and therefore not permissible in a prior formulation? I [McLaughlin] currently have a model with a common beta prior for all p[i] but would like to mitigate this commonality (a kind of James-Stein effect) when there are lots of observations for some i. But this seems to feed the data back into the prior. Does it really? It also occurs to me [McLaughlin] that, perhaps, a binomial likelihood is not the one to use here (not flexible enough). My reply: Strictly speaking, “n” is data, and so what you wa
6 0.17375265 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit
7 0.14092852 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity
8 0.12198761 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits
9 0.11992633 266 andrew gelman stats-2010-09-09-The future of R
10 0.11735387 1886 andrew gelman stats-2013-06-07-Robust logistic regression
11 0.11398923 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
13 0.096873604 1377 andrew gelman stats-2012-06-13-A question about AIC
15 0.095172584 833 andrew gelman stats-2011-07-31-Untunable Metropolis
16 0.09383674 1975 andrew gelman stats-2013-08-09-Understanding predictive information criteria for Bayesian models
17 0.091202274 417 andrew gelman stats-2010-11-17-Clutering and variance components
18 0.08979056 246 andrew gelman stats-2010-08-31-Somewhat Bayesian multilevel modeling
19 0.087344252 1869 andrew gelman stats-2013-05-24-In which I side with Neyman over Fisher
20 0.085479163 1607 andrew gelman stats-2012-12-05-The p-value is not . . .
topicId topicWeight
[(0, 0.056), (1, 0.052), (2, 0.03), (3, -0.009), (4, 0.046), (5, -0.026), (6, 0.038), (7, -0.029), (8, 0.012), (9, -0.033), (10, -0.023), (11, 0.002), (12, 0.003), (13, -0.058), (14, -0.026), (15, 0.048), (16, -0.018), (17, -0.053), (18, -0.032), (19, -0.057), (20, 0.073), (21, 0.033), (22, 0.058), (23, -0.069), (24, -0.014), (25, -0.015), (26, 0.021), (27, -0.007), (28, 0.017), (29, -0.036), (30, -0.008), (31, 0.064), (32, 0.053), (33, 0.034), (34, -0.032), (35, -0.057), (36, -0.008), (37, -0.039), (38, 0.008), (39, 0.077), (40, -0.023), (41, 0.026), (42, -0.038), (43, -0.029), (44, 0.057), (45, 0.059), (46, 0.047), (47, 0.032), (48, 0.015), (49, 0.132)]
simIndex simValue blogId blogTitle
same-blog 1 0.98681355 696 andrew gelman stats-2011-05-04-Whassup with glm()?
2 0.7632885 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.
3 0.62798214 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit
Introduction: Jeff writes: How far off is bglmer and can it handle ordered logit or multinom logit? My reply: bglmer is very close. No ordered logit but I was just talking about it with Sophia today. My guess is that the easiest way to fit a hierarchical ordered logit or multinom logit will be to use stan. For right now I’d recommend using glmer/bglmer to fit the ordered logits in order (e.g., 1 vs. 2,3,4, then 2 vs. 3,4, then 3 vs. 4). Or maybe there’s already a hierarchical multinomial logit in mcmcpack or somewhere?
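Here is a sketch of what "fitting the ordered logits in order" looks like with glmer, using invented variable names (a data frame d with an ordered outcome y coded 1 to 4, a predictor x, and a grouping factor g); this shows the shape of the workaround rather than code from the post, and bglmer from the blme package could be dropped in the same way.

library(lme4)
# fake data, only so the sketch runs end to end
set.seed(1)
d <- data.frame(x = rnorm(200), g = factor(sample(1:10, 200, replace = TRUE)))
d$y  <- pmin(4, pmax(1, round(2.5 + d$x + rnorm(200))))   # ordered outcome in 1:4
d$y2 <- as.integer(d$y >= 2)
d$y3 <- as.integer(d$y >= 3)
d$y4 <- as.integer(d$y >= 4)

# "in order": a sequence of binary mixed-effects logits
f1 <- glmer(y2 ~ x + (1 | g), family = binomial, data = d)                  # 1 vs. 2,3,4
f2 <- glmer(y3 ~ x + (1 | g), family = binomial, data = subset(d, y >= 2))  # 2 vs. 3,4
f3 <- glmer(y4 ~ x + (1 | g), family = binomial, data = subset(d, y >= 3))  # 3 vs. 4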
4 0.57722163 729 andrew gelman stats-2011-05-24-Deviance as a difference
5 0.54921472 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope
Introduction: Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ode design doc; comments welcome, of course.
About Scope
I mentioned to Andrew I would prefer pure lexical, static scoping, as found in languages like C++ and Java. If you’re not familiar with the alternatives, there’s a nice overview in the Wikipedia article on scope. Let me call out a few passages that will help set the context. A fundamental distinction in scoping is what “context” means – whether name resolution depends on the location in the source code (lexical scope, static scope, which depends on the lexical context) or depends on the program state when the name is encountered (dynamic scope, which depends on the execution context or calling context). Lexical resolution can be determined at compile time, and is also known as early binding, while dynamic resolution can in general only be determined at run time, and thus
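For readers who want the lexical-versus-dynamic distinction in one concrete example, here is a tiny R illustration (mine, not from the post): R is lexically scoped, so a free variable inside a function is looked up in the environment where the function was defined, not in the caller's environment as it would be under dynamic scope.

a <- 1
f <- function() a                    # `a` is a free variable, bound where f is defined
g <- function() { a <- 100; f() }    # the caller's local `a` does not matter
g()                                  # returns 1 under lexical scope; dynamic scope would give 100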
7 0.51625174 1761 andrew gelman stats-2013-03-13-Lame Statistics Patents
8 0.50586712 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits
9 0.45846239 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs
10 0.45798376 357 andrew gelman stats-2010-10-20-Sas and R
11 0.4551785 39 andrew gelman stats-2010-05-18-The 1.6 rule
13 0.43831068 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions
14 0.42934987 1080 andrew gelman stats-2011-12-24-Latest in blog advertising
15 0.42126486 14 andrew gelman stats-2010-05-01-Imputing count data
16 0.42088783 290 andrew gelman stats-2010-09-22-Data Thief
17 0.40815774 2342 andrew gelman stats-2014-05-21-Models with constraints
18 0.400392 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine
19 0.39893138 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!
20 0.39816096 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?
topicId topicWeight
[(7, 0.038), (16, 0.051), (21, 0.043), (24, 0.122), (35, 0.025), (53, 0.055), (54, 0.075), (61, 0.127), (63, 0.064), (73, 0.012), (80, 0.011), (82, 0.026), (90, 0.067), (95, 0.025), (99, 0.118)]
simIndex simValue blogId blogTitle
same-blog 1 0.9557336 696 andrew gelman stats-2011-05-04-Whassup with glm()?
2 0.77968955 1558 andrew gelman stats-2012-11-02-Not so fast on levees and seawalls for NY harbor?
Introduction: I was talking with June Williamson and mentioned offhand that I’d seen something in the paper saying that if only we’d invested a few billion dollars in levees we would’ve saved zillions in economic damage from the flood. (A quick search also revealed this eerily prescient article from last month and, more recently, this online discussion.) June said, No, no, no: levees are not the way to go: Here and here are the articles on “soft infrastructure” for the New York-New Jersey Harbor I was mentioning, summarizing work that is more extensively published in two books, “Rising Currents” and “On the Water: Palisade Bay”: The hazards posed by climate change, sea level rise, and severe storm surges make this the time to transform our coastal cities through adaptive design. The conventional response to flooding, in recent history, has been hard engineering — fortifying the coastal infrastructure with seawalls and bulkheads to protect real estate at the expense of natural t
Introduction: Commenter Rahul asked what I thought of this note by Scott Firestone (link from Tyler Cowen) criticizing a recent discussion by Kevin Drum suggesting that lead exposure causes violent crime. Firestone writes: It turns out there was in fact a prospective study done—but its implications for Drum’s argument are mixed. The study was a cohort study done by researchers at the University of Cincinnati. Between 1979 and 1984, 376 infants were recruited. Their parents consented to have lead levels in their blood tested over time; this was matched with records over subsequent decades of the individuals’ arrest records, and specifically arrest for violent crime. Ultimately, some of these individuals were dropped from the study; by the end, 250 were selected for the results. The researchers found that for each increase of 5 micrograms of lead per deciliter of blood, there was a higher risk for being arrested for a violent crime, but a further look at the numbers shows a more mixe
4 0.76038432 16 andrew gelman stats-2010-05-04-Burgess on Kipling
Introduction: This is my last entry derived from Anthony Burgess’s book reviews, and it’ll be short. His review of Angus Wilson’s “The Strange Ride of Rudyard Kipling: His Life and Works” is a wonderfully balanced little thing. Nothing incredibly deep–like most items in the collection, the review is only two pages long–but I give it credit for being a rare piece of Kipling criticism I’ve seen that (a) seriously engages with the politics, without (b) congratulating itself on bravely going against the fashions of the politically incorrect chattering classes by celebrating Kipling’s magnificent achievement blah blah blah. Instead, Burgess shows respect for Kipling’s work and puts it in historical, biographical, and literary context. Burgess concludes that Wilson’s book “reminds us, in John Gross’s words, that Kipling ‘remains a haunting, unsettling presence, with whom we still have to come to terms.’ Still.” Well put, and generous of Burgess to end his review with another’s quote. Other cri
5 0.74498796 1370 andrew gelman stats-2012-06-07-Duncan Watts and the Titanic
Introduction: Daniel Mendelsohn recently asked, “Why do we love the Titanic?”, seeking to understand how it has happened that: It may not be true that ‘the three most written-about subjects of all time are Jesus, the Civil War, and the Titanic,’ as one historian has put it, but it’s not much of an exaggeration. . . . The inexhaustible interest suggests that the Titanic’s story taps a vein much deeper than the morbid fascination that has attached to other disasters. The explosion of the Hindenburg, for instance, and even the torpedoing, just three years after the Titanic sank, of the Lusitania, another great liner whose passenger list boasted the rich and the famous, were calamities that shocked the world but have failed to generate an obsessive preoccupation. . . . If the Titanic has gripped our imagination so forcefully for the past century, it must be because of something bigger than any fact of social or political or cultural history. To get to the bottom of why we can’t forget it, yo
6 0.7437982 1028 andrew gelman stats-2011-11-26-Tenure lets you handle students who cheat
7 0.74370342 729 andrew gelman stats-2011-05-24-Deviance as a difference
8 0.74253041 1433 andrew gelman stats-2012-07-28-LOL without the CATS
9 0.7385062 1975 andrew gelman stats-2013-08-09-Understanding predictive information criteria for Bayesian models
10 0.72429657 2349 andrew gelman stats-2014-05-26-WAIC and cross-validation in Stan!
11 0.71626729 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?
12 0.7125783 322 andrew gelman stats-2010-10-06-More on the differences between drugs and medical devices
13 0.70745337 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.
14 0.70401871 827 andrew gelman stats-2011-07-28-Amusing case of self-defeating science writing
15 0.70384246 102 andrew gelman stats-2010-06-21-Why modern art is all in the mind
16 0.70348549 21 andrew gelman stats-2010-05-07-Environmentally induced cancer “grossly underestimated”? Doubtful.
17 0.70102012 714 andrew gelman stats-2011-05-16-NYT Labs releases Openpaths, a utility for saving your iphone data
18 0.69926888 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?
19 0.69684803 547 andrew gelman stats-2011-01-31-Using sample size in the prior distribution
20 0.69600433 1607 andrew gelman stats-2012-12-05-The p-value is not . . .