andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-39 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: In ARM we discuss how you can go back and forth between logit and probit models by dividing by 1.6. Or, to put it another way, logistic regression corresponds to a latent-variable model with errors that are approximately normally distributed with mean 0 and standard deviation 1.6. (This is well known, it’s nothing original with our book.) Anyway, John Cook discusses the approximation here.
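As a quick numerical check (my own sketch, not code from ARM or from Cook's post), here is what the approximation looks like in R: the logistic distribution function compared with a normal distribution function whose standard deviation is 1.6.

x <- seq(-4, 4, by = 0.5)
logit_cdf  <- plogis(x)        # logistic cdf (the inverse-logit curve)
probit_cdf <- pnorm(x / 1.6)   # normal cdf with mean 0 and sd 1.6
round(cbind(x, logit_cdf, probit_cdf, diff = logit_cdf - probit_cdf), 3)
# the two columns agree to within roughly 0.02 everywhere, which is the sense
# in which dividing logit coefficients by 1.6 gives approximate probit coefficients

Some treatments use a scaling constant closer to 1.7 instead of 1.6; either way the discrepancy between the two curves stays on the order of 0.01 to 0.02 in probability.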
sentIndex sentText sentNum sentScore
1 In ARM we discuss how you can go back and forth between logit and probit models by dividing by 1.6. [sent-1, score-1.459]
2 Or, to put it another way, logistic regression corresponds to a latent-variable model with errors that are approximately normally distributed with mean 0 and standard deviation 1.6. [sent-3, score-2.064]
3 (This is well known, it’s nothing original with our book. [sent-5, score-0.329]
4 ) Anyway, John Cook discusses the approximation here . [sent-6, score-0.436]
wordName wordTfidf (topN-words)
[('dividing', 0.298), ('probit', 0.292), ('normally', 0.245), ('approximation', 0.236), ('cook', 0.236), ('distributed', 0.234), ('logit', 0.234), ('corresponds', 0.229), ('deviation', 0.221), ('forth', 0.219), ('arm', 0.216), ('approximately', 0.203), ('discusses', 0.2), ('logistic', 0.19), ('anyway', 0.151), ('known', 0.15), ('errors', 0.148), ('discuss', 0.141), ('original', 0.134), ('john', 0.131), ('regression', 0.119), ('nothing', 0.118), ('standard', 0.116), ('mean', 0.105), ('back', 0.098), ('put', 0.093), ('models', 0.093), ('another', 0.088), ('go', 0.084), ('well', 0.077), ('model', 0.073), ('way', 0.058)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 39 andrew gelman stats-2010-05-18-The 1.6 rule
2 0.14858583 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit
Introduction: Jeff writes: How far off is bglmer and can it handle ordered logit or multinom logit? My reply: bglmer is very close. No ordered logit but I was just talking about it with Sophia today. My guess is that the easiest way to fit a hierarchical ordered logit or multinom logit will be to use stan. For right now I’d recommend using glmer/bglmer to fit the ordered logits in order (e.g., 1 vs. 2,3,4, then 2 vs. 3,4, then 3 vs. 4). Or maybe there’s already a hierarchical multinomial logit in mcmcpack or somewhere?
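As a rough illustration of that recommendation (my own sketch, not code from the post; the data frame d, the outcome y in 1–4, the predictor x, and the grouping factor grp are all hypothetical, and I am reading "2 vs. 3,4" as a fit restricted to observations in categories 2 through 4), the sequence of binary fits could look like this, with bglmer() substituted for glmer() where available:

library(lme4)   # glmer(); the bglmer() mentioned above is meant as a drop-in replacement

# assume d has columns y (integer 1..4), x (a predictor), grp (a grouping factor)
fits <- lapply(1:3, function(k) {
  dk <- subset(d, y >= k)        # keep categories k, k+1, ..., 4
  dk$z <- as.numeric(dk$y > k)   # binary outcome: category k vs. everything above k
  glmer(z ~ x + (1 | grp), data = dk, family = binomial(link = "logit"))
})
# fits[[1]] is 1 vs. 2,3,4; fits[[2]] is 2 vs. 3,4; fits[[3]] is 3 vs. 4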
3 0.14827694 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits
Introduction: When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc.), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real time constraints at each step. So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth
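For the unordered case, one way to read the kluge (my own sketch, not the post's code; the data frame d with columns flavor and x is hypothetical) is as a pair of conditional logits, each regularized with bayesglm() as suggested above: first vanilla vs. the rest, then chocolate vs. strawberry among the non-vanilla observations.

library(arm)   # bayesglm(), used here to regularize and avoid separation

# assume d has columns flavor (factor: vanilla, chocolate, strawberry) and x
d$is_vanilla <- as.numeric(d$flavor == "vanilla")
fit1 <- bayesglm(is_vanilla ~ x, data = d, family = binomial(link = "logit"))

d2 <- subset(d, flavor != "vanilla")                  # condition on "not vanilla"
d2$is_chocolate <- as.numeric(d2$flavor == "chocolate")
fit2 <- bayesglm(is_chocolate ~ x, data = d2, family = binomial(link = "logit"))

# Pr(vanilla) comes from fit1; Pr(chocolate) = Pr(not vanilla) * Pr(chocolate | not vanilla);
# Pr(strawberry) is whatever probability is left over.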
4 0.14232753 738 andrew gelman stats-2011-05-30-Works well versus well understood
Introduction: John Cook discusses the John Tukey quote, “The test of a good procedure is how well it works, not how well it is understood.” Cook writes: At some level, it’s hard to argue against this. Statistical procedures operate on empirical data, so it makes sense that the procedures themselves be evaluated empirically. But I [Cook] question whether we really know that a statistical procedure works well if it isn’t well understood. Specifically, I’m skeptical of complex statistical methods whose only credentials are a handful of simulations. “We don’t have any theoretical results, but hey, it works well in practice. Just look at the simulations.” Every method works well on the scenarios its author publishes, almost by definition. If the method didn’t handle a scenario well, the author would publish a different scenario. I agree with Cook but would give a slightly different emphasis. I’d say that a lot of methods can work when they are done well. See the second meta-principle liste
5 0.13946322 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.
Introduction: John Mount provides some useful background and follow-up on our discussion from last year on computational instability of the usual logistic regression solver. Just to refresh your memory, here’s a simple logistic regression with only a constant term and no separation, nothing pathological at all:

> y <- rep (c(1,0),c(10,5))
> display (glm (y ~ 1, family=binomial(link="logit")))
glm(formula = y ~ 1, family = binomial(link = "logit"))
            coef.est coef.se
(Intercept) 0.69     0.55
---
n = 15, k = 1
residual deviance = 19.1, null deviance = 19.1 (difference = 0.0)

And here’s what happens when we give it the not-outrageous starting value of -2:

> display (glm (y ~ 1, family=binomial(link="logit"), start=-2))
glm(formula = y ~ 1, family = binomial(link = "logit"), start = -2)
            coef.est coef.se
(Intercept) 71.97    17327434.18
---
n = 15, k = 1
residual deviance = 360.4, null deviance = 19.1 (difference = -341.3)
Warning message:
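A small sanity check of my own, separate from the quoted output above: with only an intercept, the maximum-likelihood estimate is just the logit of the sample proportion, so the first fit's 0.69 is exactly right and the blow-up in the second fit comes entirely from the starting value, not from anything in the data.

y <- rep(c(1, 0), c(10, 5))
qlogis(mean(y))                 # logit of 10/15, i.e. log(10/5) = 0.693...
log(sum(y == 1) / sum(y == 0))  # same thing written as the log odds of a 1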
6 0.13764398 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?
7 0.12827636 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?
9 0.11631264 2046 andrew gelman stats-2013-10-01-I’ll say it again
10 0.11267287 417 andrew gelman stats-2010-11-17-Clutering and variance components
11 0.10707954 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model
12 0.10197946 2364 andrew gelman stats-2014-06-08-Regression and causality and variable ordering
13 0.09722849 534 andrew gelman stats-2011-01-24-Bayes at the end
14 0.097126588 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression
15 0.094343826 1967 andrew gelman stats-2013-08-04-What are the key assumptions of linear regression?
16 0.093124062 1337 andrew gelman stats-2012-05-22-Question 12 of my final exam for Design and Analysis of Sample Surveys
17 0.092120841 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability
18 0.090086065 1886 andrew gelman stats-2013-06-07-Robust logistic regression
19 0.089728057 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”
20 0.089103222 1251 andrew gelman stats-2012-04-07-Mathematical model of vote operations
topicId topicWeight
[(0, 0.104), (1, 0.085), (2, 0.034), (3, 0.017), (4, 0.047), (5, -0.003), (6, 0.034), (7, -0.033), (8, 0.053), (9, 0.031), (10, 0.04), (11, 0.017), (12, -0.005), (13, -0.003), (14, -0.026), (15, -0.007), (16, -0.039), (17, -0.01), (18, 0.009), (19, -0.024), (20, 0.036), (21, -0.001), (22, 0.021), (23, -0.041), (24, 0.005), (25, -0.022), (26, 0.015), (27, -0.069), (28, -0.045), (29, -0.02), (30, -0.018), (31, 0.065), (32, 0.015), (33, -0.008), (34, 0.017), (35, -0.038), (36, -0.024), (37, -0.002), (38, -0.02), (39, -0.004), (40, -0.028), (41, -0.019), (42, -0.002), (43, -0.006), (44, 0.057), (45, 0.064), (46, -0.039), (47, 0.019), (48, 0.044), (49, 0.029)]
simIndex simValue blogId blogTitle
same-blog 1 0.96740866 39 andrew gelman stats-2010-05-18-The 1.6 rule
2 0.75940979 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.
3 0.7123521 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit
4 0.71058434 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits
5 0.71040934 796 andrew gelman stats-2011-07-10-Matching and regression: two great tastes etc etc
Introduction: Matthew Bogard writes: Regarding the book Mostly Harmless Econometrics, you state : A casual reader of the book might be left with the unfortunate impression that matching is a competitor to regression rather than a tool for making regression more effective. But in fact isn’t that what they are arguing, that, in a ‘mostly harmless way’ regression is in fact a matching estimator itself? “Our view is that regression can be motivated as a particular sort of weighted matching estimator, and therefore the differences between regression and matching estimates are unlikely to be of major empirical importance” (Chapter 3 p. 70) They seem to be distinguishing regression (without prior matching) from all other types of matching techniques, and therefore implying that regression can be a ‘mostly harmless’ substitute or competitor to matching. My previous understanding, before starting this book was as you say, that matching is a tool that makes regression more effective. I have n
6 0.70402294 1967 andrew gelman stats-2013-08-04-What are the key assumptions of linear regression?
8 0.69962704 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making
9 0.69539601 2357 andrew gelman stats-2014-06-02-Why we hate stepwise regression
10 0.69173688 375 andrew gelman stats-2010-10-28-Matching for preprocessing data for causal inference
11 0.68440688 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!
12 0.68400961 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?
13 0.68253666 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?
14 0.67122799 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression
15 0.66512811 1849 andrew gelman stats-2013-05-09-Same old same old
16 0.66421545 1761 andrew gelman stats-2013-03-13-Lame Statistics Patents
17 0.66175139 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs
18 0.65656435 1735 andrew gelman stats-2013-02-24-F-f-f-fake data
19 0.6485579 1886 andrew gelman stats-2013-06-07-Robust logistic regression
20 0.64730215 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health
topicId topicWeight
[(16, 0.023), (24, 0.104), (29, 0.04), (31, 0.084), (63, 0.086), (84, 0.04), (95, 0.078), (99, 0.399)]
simIndex simValue blogId blogTitle
same-blog 1 0.986476 39 andrew gelman stats-2010-05-18-The 1.6 rule
2 0.95592564 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks
Introduction: Aleks pointed me to this recent article by Pablo Mateos, Paul Longley, and David O’Sullivan on one of my favorite topics. The authors produced a potentially cool naming network of the city of Auckland, New Zealand. I say “potentially cool” because I have such difficulty reading the article–I speak English, statistics, and a bit of political science and economics, but this one is written in heavy sociologese–that I can’t quite be sure what they’re doing. However, despite my (perhaps unfair) disdain for the particulars of their method, it’s probably good that they’re jumping in with this analysis. Others can take their data (and similar datasets from elsewhere) and do better. Ya gotta start somewhere, and the basic idea (to cluster first names that are associated with the same last names, and to cluster last names that are associated with the same first names) seems good. I have to admit, though, that I was amused by the following line, which, amazingly, led off the paper:
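The clustering idea in that parenthetical can be sketched in a few lines of R (my own illustration, with no claim to match the paper's actual method; nm is a hypothetical data frame with one row per person and columns first and last):

# cross-tabulate first names against surnames
A <- as.matrix(table(nm$first, nm$last))

# two first names are similar if they co-occur with similar sets of surnames
first_sim <- cor(t(A))
hc_first <- hclust(as.dist(1 - first_sim))   # cluster first names

# the symmetric step: cluster surnames by the first names they co-occur with
last_sim <- cor(A)
hc_last <- hclust(as.dist(1 - last_sim))
plot(hc_first)   # dendrogram of first-name clusters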
Introduction: Xian pointed me to this recycling of a classic probability error. It’s too bad it was in the New York Times, but at least it was in the Opinion Pages, so I guess that’s not so bad. And, on the plus side, several of the blog commenters got the point. What I was wondering, though, was who was this “Yitzhak Melechson, a statistics professor at the University of Tel Aviv”? This is such a standard problem, I’m surprised to find a statistics professor making this mistake. I was curious what his area of research is and where he was trained. I started by googling Yitzhak Melechson but all I could find was this news story, over and over and over and over again. Then I found Tel Aviv University and navigated to its statistics department but couldn’t find any Melechson in the faculty list. Next stop: entering Melechson in the search engine at the Tel Aviv University website. It came up blank. One last try: I entered the Yitzhak Melechson into Google Scholar. Here’s what came up:
Introduction: What follows is a long response to a comment on someone else’s blog . The quote is, “Thinking like an economist simply means that you scientifically approach human social behavior. . . .” I’ll give the context in a bit, but first let me say that I thought this topic might be worth one more discussion because I suspect that the sort of economics exceptionalism that I will discuss is widely disseminated in college econ courses as well as in books such as the Freakonomics series. It’s great to have pride in human achievements but at some point too much group self-regard can be distorting. My best analogy to economics exceptionalism is Freudianism in the 1950s: Back then, Freudian psychiatrists were on the top of the world. Not only were they well paid, well respected, and secure in their theoretical foundations, they were also at the center of many important conversations. Even those people who disagreed with them felt the need to explain why the Freudians were wrong. Freudian
5 0.95302004 1646 andrew gelman stats-2013-01-01-Back when fifty years was a long time ago
Introduction: New Year’s Day is an excellent time to look back at changes, not just in the past year, but in the past half-century. Mark Palko has an interesting post on the pace of changes in everyday life. We’ve been hearing a lot in the past few decades about how things are changing faster and faster. But, as Palko points out, the difference between life in 1962 and life today does not seem so different, at least for many people in the United States. Sure, there are some big changes: nonwhites get more respect, people mostly live longer, many cancers can be cured, fewer people are really really poor but it’s harder to hold down a job, cars are more reliable, you can get fresh fish in the suburbs, containers are lighter and stronger, vacations in the Caribbean instead of the Catskills, people have a lot more stuff and a lot more place to put it, etc etc etc. But life in the 1950s or 1960s just doesn’t seem so different from how we live today. In contrast, Palko writes, “You can also get
6 0.95072949 544 andrew gelman stats-2011-01-29-Splitting the data
7 0.95040816 508 andrew gelman stats-2011-01-08-More evidence of growing nationalization of congressional elections
8 0.9488495 682 andrew gelman stats-2011-04-27-“The ultimate left-wing novel”
9 0.94833559 1995 andrew gelman stats-2013-08-23-“I mean, what exact buttons do I have to hit?”
10 0.94752204 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?
11 0.94740283 1536 andrew gelman stats-2012-10-16-Using economics to reduce bike theft
12 0.94683945 1201 andrew gelman stats-2012-03-07-Inference = data + model
13 0.94643617 992 andrew gelman stats-2011-11-05-Deadwood in the math curriculum
14 0.94545567 2279 andrew gelman stats-2014-04-02-Am I too negative?
15 0.9450385 460 andrew gelman stats-2010-12-09-Statistics gifts?
16 0.9450295 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points
17 0.94441712 421 andrew gelman stats-2010-11-19-Just chaid
18 0.94423944 2141 andrew gelman stats-2013-12-20-Don’t douthat, man! Please give this fallacy a name.