
782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits


meta info for this blog

Source: html

Introduction: When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc.), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real-time constraints at each step. So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth
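As a concrete illustration, here is a minimal sketch in R of that series-of-logits construction, using bayesglm() from the arm package as the post suggests; the data frame d, the outcome y in {1, 2, 3}, and the predictor x are hypothetical stand-ins, not from the post.

```r
library(arm)  # bayesglm() regularizes each logit and helps avoid separation

# Hypothetical data frame d with an ordered outcome y in {1, 2, 3} and a
# predictor x.  One logit per cutpoint: 1 vs. {2,3}, then 2 vs. {3}.
fits <- lapply(1:2, function(k) {
  dk <- d[d$y >= k, ]  # condition on not having fallen in an earlier category
  bayesglm(I(y == k) ~ x, family = binomial(link = "logit"), data = dk)
})
```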


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). [sent-1, score-0.964]

2 Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. [sent-4, score-0.536]

3 If they’re ordered (1, 2, 3, etc.), we can do ordered logit (and use bayespolr() to avoid separation). [sent-6, score-1.253]

4 If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. [sent-7, score-1.526]

5 But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real-time constraints at each step. [sent-8, score-1.115]

6 So this got me wondering whether we could kluge it with logits. [sent-9, score-0.171]

7 Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth. [sent-10, score-0.798]

8 The usual ordered logit is a special case of this model in which the coefficients (except for the constant term) are the same for each model. [sent-15, score-1.017]
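In symbols (my notation, not the post's), the patchwork is a sequential model, one logit per cutpoint:

```latex
% Sequential logits over K categories:
\Pr(y = k \mid y \ge k) \;=\; \operatorname{logit}^{-1}(\alpha_k + X\beta_k),
\qquad k = 1, \dots, K-1
% The common-slope constraint \beta_1 = \dots = \beta_{K-1} (only the
% constants \alpha_k varying) is the special case the post identifies
% with the usual ordered logit.
```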

9 Intermediate versions could link the models with soft constraints, some prior distribution on the coefficients. [sent-18, score-0.276]

10 (This would need some hyperparameters, but these could be estimated too if the whole model is being fit in some iterative context.) [sent-19, score-0.537]
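A minimal sketch of what such a soft constraint might look like, reusing the hypothetical fits from the block above. This pooling scheme is my assumption, not the post's implementation, and the pooling scale 0.5 stands in for a hyperparameter that a full iterative fit would estimate.

```r
# Shrink each cutpoint's slope toward the average slope across cutpoints
# (illustrative soft constraint, not the post's method).
b_common <- mean(sapply(fits, function(f) coef(f)["x"]))
fits_pooled <- lapply(1:2, function(k) {
  dk <- d[d$y >= k, ]
  bayesglm(I(y == k) ~ x, family = binomial(link = "logit"), data = dk,
           prior.mean  = b_common,  # center the slope's prior at the common value
           prior.scale = 0.5,       # hypothetical pooling scale (a hyperparameter)
           prior.df    = Inf)       # Inf gives a normal rather than Cauchy prior
})
```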

11 If you have a vanilla-chocolate-strawberry variable, do the same thing; just order the categories first, either in some reasonable way based on substantive information or else using some automatic rule such as putting the categories in decreasing order of frequency in the data. [sent-20, score-0.944]
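For the automatic rule, one line of R suffices (flavor is a hypothetical factor, not from the post):

```r
# Reorder an unordered factor by decreasing frequency in the data.
flavor <- factor(flavor, levels = names(sort(table(flavor), decreasing = TRUE)))
```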

12 In any case, you’d first predict the probability of being in category 1, then the probability of being in 2 (given that you’re not in 1), then the probability of 3 (given not 1 or 2), and so forth. [sent-21, score-0.306]
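Continuing the hypothetical sketch from above, the unconditional category probabilities come from chaining the conditional ones (xnew is an assumed data frame of new predictor values):

```r
# q_k = Pr(y = k | y >= k) from the k-th logit; here K = 3 categories.
q1 <- predict(fits[[1]], newdata = xnew, type = "response")
q2 <- predict(fits[[2]], newdata = xnew, type = "response")
p1 <- q1
p2 <- (1 - q1) * q2
p3 <- (1 - q1) * (1 - q2)  # the three probabilities sum to 1 by construction
```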

13 Depending on your control over the problem, you could choose how to model the variables. [sent-22, score-0.19]

14 For example, in a political survey with some missing ethnicity responses, you might model that variable as ordered: white/other/hispanic/black. [sent-23, score-0.325]

15 I recognize that my patchwork of logits is a bit of a hack, but I like its flexibility, as well as the computational simplicity of building it out of simple logits. [sent-25, score-0.548]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ordered', 0.376), ('logit', 0.34), ('unordered', 0.291), ('categories', 0.247), ('logits', 0.203), ('multinomial', 0.187), ('bayesglm', 0.177), ('probit', 0.17), ('iterative', 0.152), ('variable', 0.131), ('constraints', 0.129), ('model', 0.127), ('coefficients', 0.109), ('kluge', 0.108), ('fit', 0.104), ('probability', 0.102), ('invlogit', 0.102), ('robit', 0.102), ('chocolate', 0.102), ('patchwork', 0.102), ('simple', 0.101), ('avoid', 0.099), ('regularize', 0.097), ('hyperparameters', 0.091), ('embedded', 0.089), ('mi', 0.087), ('order', 0.083), ('decreasing', 0.082), ('hack', 0.08), ('flexibility', 0.079), ('contexts', 0.079), ('separation', 0.077), ('simplicity', 0.077), ('soft', 0.075), ('automatic', 0.073), ('beta', 0.073), ('intermediate', 0.071), ('routine', 0.07), ('depending', 0.068), ('frequency', 0.068), ('ethnicity', 0.067), ('versions', 0.067), ('constant', 0.065), ('recognized', 0.065), ('could', 0.063), ('use', 0.062), ('forms', 0.062), ('robust', 0.061), ('computation', 0.061), ('substantive', 0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999964 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits


2 0.46639845 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit

Introduction: Jeff writes: How far off is bglmer and can it handle ordered logit or multinom logit? My reply: bglmer is very close. No ordered logit but I was just talking about it with Sophia today. My guess is that the easiest way to fit a hierarchical ordered logit or multinom logit will be to use stan. For right now I’d recommend using glmer/bglmer to fit the ordered logits in order (e.g., 1 vs. 2,3,4, then 2 vs. 3,4, then 3 vs. 4). Or maybe there’s already a hierarchical multinomial logit in mcmcpack or somewhere?

3 0.30946189 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?

Introduction: Adriana Lins de Albuquerque writes: Do you have any suggestions for the best way to represent multinominal logit results graphically? I am using stata. My reply: I don’t know from Stata, but here are my suggestions: 1. If the categories are unordered, break them up into a series of binary choices in a tree structure (for example, non-voter or voter, then voting for left or right, then voting for left party A or B, then voting for right party C or D). Each of these is a binary split and so can be displayed using the usual techniques for logit (as in chapters 3 and 4 of ARM). 2. If the categories are ordered, see Figure 6.4 of ARM for an example (from our analysis of storable votes).

4 0.20010771 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

Introduction: A research psychologist writes in with a question that’s so long that I’ll put my answer first, then put the question itself below the fold. Here’s my reply: As I wrote in my Anova paper and in my book with Jennifer Hill, I do think that multilevel models can completely replace Anova. At the same time, I think the central idea of Anova should persist in our understanding of these models. To me the central idea of Anova is not F-tests or p-values or sums of squares, but rather the idea of predicting an outcome based on factors with discrete levels, and understanding these factors using variance components. The continuous or categorical response thing doesn’t really matter so much to me. I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. I don’t want to throw away interactions just because they’re not statistically significant. I’d rather partially pool them toward zero using an inform

5 0.17001744 328 andrew gelman stats-2010-10-08-Displaying a fitted multilevel model

Introduction: Elissa Brown writes: I’m working on some data using a multinomial model (3 categories for the response & 2 predictors-1 continuous and 1 binary), and I’ve been looking and looking for some sort of nice graphical way to show my model at work. Something like a predicted probabilities plot. I know you can do this for the levels of Y with just one covariate, but is this still a valid way to describe the multinomial model (just doing a pred plot for each covariate)? What’s the deal, is there really no way to graphically represent a successful multinomial model? Also, is it unreasonable to break down your model into a binary response just to get some ROC curves? This seems like cheating. From what I’ve found so far, it seems that people just avoid graphical support when discussing their fitted multinomial models. My reply: It’s hard for me to think about this sort of thing in the abstract with no context. We do have one example in chapter 6 of ARM where we display data and fitted m

6 0.16366842 1886 andrew gelman stats-2013-06-07-Robust logistic regression

7 0.16260427 154 andrew gelman stats-2010-07-18-Predictive checks for hierarchical models

8 0.14827694 39 andrew gelman stats-2010-05-18-The 1.6 rule

9 0.1451304 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

10 0.14293373 417 andrew gelman stats-2010-11-17-Clutering and variance components

11 0.13628624 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

12 0.12198761 696 andrew gelman stats-2011-05-04-Whassup with glm()?

13 0.12107798 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

14 0.1130449 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.

15 0.11054035 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

16 0.10876469 2342 andrew gelman stats-2014-05-21-Models with constraints

17 0.10713611 2224 andrew gelman stats-2014-02-25-Basketball Stats: Don’t model the probability of win, model the expected score differential.

18 0.10508959 547 andrew gelman stats-2011-01-31-Using sample size in the prior distribution

19 0.10257782 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

20 0.10097396 1392 andrew gelman stats-2012-06-26-Occam


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.159), (1, 0.128), (2, 0.058), (3, 0.042), (4, 0.069), (5, 0.003), (6, 0.043), (7, -0.053), (8, 0.073), (9, 0.046), (10, 0.041), (11, 0.041), (12, -0.029), (13, -0.031), (14, -0.057), (15, 0.011), (16, 0.027), (17, -0.038), (18, -0.021), (19, -0.042), (20, 0.016), (21, -0.013), (22, -0.018), (23, -0.053), (24, -0.043), (25, -0.025), (26, 0.019), (27, -0.039), (28, -0.019), (29, -0.035), (30, -0.037), (31, 0.043), (32, 0.009), (33, 0.044), (34, -0.054), (35, -0.068), (36, -0.017), (37, 0.031), (38, -0.036), (39, 0.044), (40, 0.019), (41, -0.032), (42, 0.026), (43, -0.052), (44, 0.02), (45, 0.04), (46, -0.018), (47, 0.025), (48, 0.005), (49, 0.109)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9643321 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits


2 0.88442159 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit


3 0.79687113 1875 andrew gelman stats-2013-05-28-Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from

Introduction: I received the following email: I am trying to develop a Bayesian model to represent the process through which individual consumers make online product rating decisions. In my model each individual faces total J product options and for each product option (j) each individual (i) needs to make three sequential decisions: - First he decides whether to consume a specific product option (j) or not (choice decision) - If he decides to consume a product option j, then after consumption he decides whether to rate it or not (incidence decision) - If he decides to rate product j then what finally he decides what rating (k) to assign to it (evaluation decision) We model this decision sequence in terms of three equations. A binary response variable in the first equation represents the choice decision. Another binary response variable in the second equation represents the incidence decision that is observable only when first selection decision is 1. Finally, an ordered response v

4 0.78322679 39 andrew gelman stats-2010-05-18-The 1.6 rule

Introduction: In ARM we discuss how you can go back and forth between logit and probit models by dividing by 1.6. Or, to put it another way, logistic regression corresponds to a latent-variable model with errors that are approximately normally distributed with mean 0 and standard deviation 1.6. (This is well known, it’s nothing original with our book.) Anyway, John Cook discusses the approximation here .

5 0.78117317 328 andrew gelman stats-2010-10-08-Displaying a fitted multilevel model


6 0.75854379 852 andrew gelman stats-2011-08-13-Checking your model using fake data

7 0.74385691 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

8 0.74374944 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!

9 0.74250942 1886 andrew gelman stats-2013-06-07-Robust logistic regression

10 0.72843409 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs

11 0.72830695 2342 andrew gelman stats-2014-05-21-Models with constraints

12 0.72503155 251 andrew gelman stats-2010-09-02-Interactions of predictors in a causal model

13 0.72091657 1516 andrew gelman stats-2012-09-30-Computational problems with glm etc.

14 0.71397948 823 andrew gelman stats-2011-07-26-Including interactions or not

15 0.71278477 375 andrew gelman stats-2010-10-28-Matching for preprocessing data for causal inference

16 0.71012688 1468 andrew gelman stats-2012-08-24-Multilevel modeling and instrumental variables

17 0.70627493 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

18 0.70415986 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

19 0.70350736 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?

20 0.70290279 1817 andrew gelman stats-2013-04-21-More on Bayesian model selection in high-dimensional settings


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.017), (8, 0.011), (16, 0.061), (21, 0.029), (24, 0.143), (50, 0.021), (54, 0.019), (59, 0.01), (63, 0.312), (86, 0.021), (94, 0.01), (99, 0.207)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9437983 684 andrew gelman stats-2011-04-28-Hierarchical ordered logit or probit


2 0.93031973 568 andrew gelman stats-2011-02-11-Calibration in chess

Introduction: Has anybody done this study yet? I’m curious about the results. Perhaps there’s some chess-playing cognitive psychologist who’d like to collaborate on this?

3 0.92015231 739 andrew gelman stats-2011-05-31-When Did Girls Start Wearing Pink?

Introduction: That cute picture is of toddler FDR in a dress, from 1884. Jeanne Maglaty writes : A Ladies’ Home Journal article [or maybe from a different source, according to a commenter] in June 1918 said, “The generally accepted rule is pink for the boys, and blue for the girls. The reason is that pink, being a more decided and stronger color, is more suitable for the boy, while blue, which is more delicate and dainty, is prettier for the girl.” Other sources said blue was flattering for blonds, pink for brunettes; or blue was for blue-eyed babies, pink for brown-eyed babies, according to Paoletti. In 1927, Time magazine printed a chart showing sex-appropriate colors for girls and boys according to leading U.S. stores. In Boston, Filene’s told parents to dress boys in pink. So did Best & Co. in New York City, Halle’s in Cleveland and Marshall Field in Chicago. Today’s color dictate wasn’t established until the 1940s . . . When the women’s liberation movement arrived in the mid-1960s, w

4 0.9159525 313 andrew gelman stats-2010-10-03-A question for psychometricians

Introduction: Don Coffin writes: A colleague of mine and I are doing a presentation for new faculty on a number of topics related to teaching. Our charge is to identify interesting issues and to find research-based information for them about how to approach things. So, what I wondered is, do you know of any published research dealing with the sort of issues about structuring a course and final exam in the ways you talk about in this blog post ? Some poking around in the usual places hasn’t turned anything up yet. I don’t really know the psychometrics literature but I imagine that some good stuff has been written on principles of test design. There are probably some good papers from back in the 1920s. Can anyone supply some references?

5 0.90894532 628 andrew gelman stats-2011-03-25-100-year floods

Introduction: According to the National Weather Service : What is a 100 year flood? A 100 year flood is an event that statistically has a 1% chance of occurring in any given year. A 500 year flood has a .2% chance of occurring and a 1000 year flood has a .1% chance of occurring. The accompanying map shows a part of Tennessee that in May 2010 had 1000-year levels of flooding. At first, it seems hard to believe that a 1000-year flood would have just happened to occur last year. But then, this is just a 1000-year flood for that particular place. I don’t really have a sense of the statistics of these events. How many 100-year, 500-year, and 1000-year flood events have been recorded by the Weather Service, and when have they occurred?

same-blog 6 0.89768887 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

7 0.88867927 1078 andrew gelman stats-2011-12-22-Tables as graphs: The Ramanujan principle

8 0.88237953 745 andrew gelman stats-2011-06-04-High-level intellectual discussions in the Columbia statistics department

9 0.86478204 33 andrew gelman stats-2010-05-14-Felix Salmon wins the American Statistical Association’s Excellence in Statistical Reporting Award

10 0.85954154 1621 andrew gelman stats-2012-12-13-Puzzles of criminal justice

11 0.85944307 1316 andrew gelman stats-2012-05-12-black and Black, white and White

12 0.85539651 102 andrew gelman stats-2010-06-21-Why modern art is all in the mind

13 0.82991654 293 andrew gelman stats-2010-09-23-Lowess is great

14 0.82929307 1484 andrew gelman stats-2012-09-05-Two exciting movie ideas: “Second Chance U” and “The New Dirty Dozen”

15 0.82460862 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios

16 0.81906509 1480 andrew gelman stats-2012-09-02-“If our product is harmful . . . we’ll stop making it.”

17 0.78408074 2249 andrew gelman stats-2014-03-15-Recently in the sister blog

18 0.77079558 1005 andrew gelman stats-2011-11-11-Robert H. Frank and P. J. O’Rourke present . . .

19 0.75855494 286 andrew gelman stats-2010-09-20-Are the Democrats avoiding a national campaign?

20 0.75764012 2103 andrew gelman stats-2013-11-16-Objects of the class “Objects of the class”