andrew gelman stats, 2010-11-17, post 417: Clustering and variance components
Source: html
Introduction: Raymond Lim writes:

Do you have any recommendations on clustering and binary models? My particular problem is that I'm running a firm fixed-effects logit and want to cluster by industry-year (every combination of industry and year). My control variable of interest is measured by industry-year, and when I cluster by industry-year, the standard errors are 300x larger than when I don't cluster. Strangely, this problem only occurs with logit and not with OLS (the linear probability model). Also, clustering just by field doesn't blow up the errors. My hunch is that it has something to do with the non-nested structure of year, but I don't understand why this is only problematic under logit and not OLS.

My reply: I'd recommend including four multilevel variance parameters: one for firm, one for industry, one for year, and one for industry-year. (In lmer, that's (1 | firm) + (1 | industry) + (1 | year) + (1 | industry.year).) No need to include (1 | firm.year), since in your data this is the error term. Try . . .
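The recommended specification can be sketched in R. This is a minimal sketch, assuming a hypothetical data frame `d` with a binary outcome `y` and grouping columns `firm`, `industry`, and `year`; for a binary outcome, `glmer` with a logit link is the lme4 analogue of the `lmer` call above:

```r
library(lme4)

# Hypothetical data frame `d` with binary outcome y and grouping
# columns firm, industry, year. Build the industry-year grouping:
d$industry.year <- interaction(d$industry, d$year)

# Multilevel logit with the four recommended variance components.
# Note: no (1 | firm.year) term, since in these data that would
# duplicate the observation-level error term.
fit <- glmer(y ~ 1 + (1 | firm) + (1 | industry) + (1 | year) + (1 | industry.year),
             data = d, family = binomial(link = "logit"))
summary(fit)
```

The variance-component estimates in `summary(fit)` then show directly how much of the variation sits at the firm, industry, year, and industry-year levels, rather than pushing it all into clustered standard errors.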
If you have a lot of firms, you might first try the secret weapon, fitting a model separately for each year. Or, if you have a lot of years, break the data up into 5-year periods and do the above analysis separately for each half-decade. Things change over time, and I'm always wary of models with long time periods (decades or more). I see this a lot in political science, where people naively think that they can just solve all their problems with so-called "state fixed effects," as if Vermont in 1952 is anything like Vermont in 2008. My other recommendation is to build up your model from simple parts and try to identify exactly where your procedure is blowing up. (Masanao and Yu-Sung know what graph I'm talking about.)
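The per-year "secret weapon" can be sketched as follows. This is a hypothetical illustration on simulated data, fitting a plain logistic regression separately for each year by Newton-Raphson; it is not the poster's actual model or dataset:

```python
import numpy as np

def fit_logit(x, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; returns (intercept, slope)."""
    X = np.column_stack([np.ones(len(y)), x])  # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                    # score (gradient of log-lik)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(0)

# Simulate a panel where the effect of x drifts across years; the
# separate per-year fits (the "secret weapon") can reveal the drift.
per_year_slopes = {}
for i, year in enumerate(range(2000, 2010)):
    x = rng.normal(size=500)
    true_slope = 0.5 + 0.1 * i              # effect grows over time
    p = 1.0 / (1.0 + np.exp(-(0.2 + true_slope * x)))
    y = rng.binomial(1, p)
    per_year_slopes[year] = fit_logit(x, y)[1]

# Plotting slope vs. year (with uncertainty bands) is the usual next step.
print(per_year_slopes)
```

The same loop, run over 5-year blocks instead of single years, gives the half-decade version suggested above when individual years are too thin.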