andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1241 knowledge-graph by maker-knowledge-mining

1241 andrew gelman stats-2012-04-02-Fixed effects and identification


meta info for this blog

Source: html

Introduction: Tom Clark writes: Drew Linzer and I [Tom] have been working on a paper about the use of modeled (“random”) and unmodeled (“fixed”) effects. Not directly in response to the paper, but in conversations about the topic over the past few months, several people have said to us things to the effect of “I prefer fixed effects over random effects because I care about identification.” Neither Drew nor I has any idea what this comment is supposed to mean. Have you come across someone saying something like this? Do you have any thoughts about what these people could possibly mean? I want to respond to this concern when people raise it, but I have failed thus far to inquire what is meant and so do not know what to say. My reply: I have a “cultural” reply, which is that so-called fixed effects are thought to make fewer assumptions, and making fewer assumptions is considered a generally good thing that serious people do, and identification is considered a concern of serious people, so they go together.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Tom Clark writes: Drew Linzer and I [Tom] have been working on a paper about the use of modeled (“random”) and unmodeled (“fixed”) effects. [sent-1, score-0.407]

2 Not directly in response to the paper, but in conversations about the topic over the past few months, several people have said to us things to the effect of “I prefer fixed effects over random effects because I care about identification. [sent-2, score-1.784]

3 ” Neither Drew nor I has any idea what this comment is supposed to mean. [sent-3, score-0.155]

4 Have you come across someone saying something like this? [sent-4, score-0.066]

5 Do you have any thoughts about what these people could possibly mean? [sent-5, score-0.293]

6 I want to respond to this concern when people raise it, but I have failed thus far to inquire what is meant and so do not know what to say. [sent-6, score-1.041]

7 My reply: I have a “cultural” reply, which is that so-called fixed effects are thought to make fewer assumptions, and making fewer assumptions is considered a generally good thing that serious people do, and identification is considered a concern of serious people, so they go together. [sent-7, score-2.323]

8 I generally prefer for my varying coefficients to be modeled. [sent-13, score-0.483]

9 I’m no fan of so-called fixed effects identification. [sent-14, score-0.694]

10 It’s just another model, just not as flexible as the multilevel version. [sent-15, score-0.2]
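The contrast the reply draws — unpooled (“fixed”) estimates versus the more flexible multilevel model — can be sketched numerically. Below is a minimal numpy sketch with hypothetical numbers, treating the variance components as known for simplicity; a real multilevel fit would estimate them from the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 8 groups with true effects drawn from N(0, tau^2),
# and noisy within-group observations.
tau, sigma, n_per_group = 1.0, 2.0, 5
true_effects = rng.normal(0.0, tau, size=8)
y = true_effects[:, None] + rng.normal(0.0, sigma, size=(8, n_per_group))

# "Fixed" (unpooled) estimates: one free mean per group.
unpooled = y.mean(axis=1)

# Multilevel (partial-pooling) estimates: each group mean is pulled
# toward the grand mean by a factor depending on how noisy it is.
se2 = sigma**2 / n_per_group          # sampling variance of a group mean
shrink = tau**2 / (tau**2 + se2)      # weight on the group's own data
grand_mean = unpooled.mean()
partial = grand_mean + shrink * (unpooled - grand_mean)

# The partially pooled estimates are less spread out than the unpooled ones.
print(unpooled.std(), partial.std())
```

The point of the sketch is that the unpooled model is the limiting case shrink = 1 of the multilevel model — "just another model, just not as flexible."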


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('fixed', 0.354), ('drew', 0.273), ('effects', 0.234), ('tom', 0.22), ('fewer', 0.207), ('concern', 0.192), ('inquire', 0.188), ('linzer', 0.188), ('unmodeled', 0.17), ('assumptions', 0.164), ('considered', 0.153), ('prefer', 0.153), ('clark', 0.148), ('serious', 0.144), ('random', 0.143), ('conversations', 0.14), ('generally', 0.131), ('people', 0.129), ('joe', 0.127), ('flexible', 0.122), ('modeled', 0.12), ('paper', 0.117), ('identification', 0.111), ('raise', 0.111), ('reply', 0.109), ('cultural', 0.108), ('failed', 0.108), ('fan', 0.106), ('varying', 0.104), ('meant', 0.097), ('neither', 0.096), ('coefficients', 0.095), ('respond', 0.092), ('supposed', 0.091), ('possibly', 0.091), ('commenters', 0.088), ('months', 0.083), ('version', 0.083), ('multilevel', 0.078), ('care', 0.074), ('thoughts', 0.073), ('directly', 0.071), ('past', 0.067), ('across', 0.066), ('thus', 0.064), ('comment', 0.064), ('response', 0.063), ('topic', 0.063), ('far', 0.06), ('several', 0.059)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1241 andrew gelman stats-2012-04-02-Fixed effects and identification


2 0.29945976 472 andrew gelman stats-2010-12-17-So-called fixed and random effects

Introduction: Someone writes: I am hoping you can give me some advice about when to use fixed and random effects model. I am currently working on a paper that examines the effect of . . . by comparing states . . . It got reviewed . . . by three economists and all suggest that we run a fixed effects model. We ran a hierarchical model in the paper that allows the intercept and slope to vary before and after . . . My question is which is correct? We have run it both ways and really it makes no difference which model you run, the results are very similar. But for my own learning, I would really like to understand which to use under what circumstances. Is the fact that we use the whole population reason enough to just run a fixed effect model? Perhaps you can suggest a good reference to this question of when to run a fixed vs. random effects model. I’m not always sure what is meant by a “fixed effects model”; see my paper on Anova for discussion of the problems with this terminology: http://w

3 0.27641758 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

Introduction: Stuart Buck writes: I have a question about fixed effects vs. random effects . Amongst economists who study teacher value-added, it has become common to see people saying that they estimated teacher fixed effects (via least squares dummy variables, so that there is a parameter for each teacher), but that they then applied empirical Bayes shrinkage so that the teacher effects are brought closer to the mean. (See this paper by Jacob and Lefgren, for example.) Can that really be what they are doing? Why wouldn’t they just run random (modeled) effects in the first place? I feel like there’s something I’m missing. My reply: I don’t know the full story here, but I’m thinking there are two goals, first to get an unbiased estimate of an overall treatment effect (and there the econometricians prefer so-called fixed effects; I disagree with them on this but I know where they’re coming from) and second to estimate individual teacher effects (and there it makes sense to use so-called
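The two-step procedure Stuart Buck describes — least-squares dummy variables followed by empirical Bayes shrinkage toward the mean — can be sketched as follows. All the numbers and names here are hypothetical, and the variance components are estimated by simple method-of-moments rather than by whatever Jacob and Lefgren actually do.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: test-score residuals for 20 teachers, 30 students each.
n_teachers, n_students = 20, 30
true_fx = rng.normal(0.0, 0.5, size=n_teachers)
scores = true_fx[:, None] + rng.normal(0.0, 2.0, size=(n_teachers, n_students))

# Step 1: "fixed effects" via least-squares dummy variables. With one dummy
# per teacher and no other covariates, this reduces to the teacher means.
fe = scores.mean(axis=1)

# Step 2: empirical Bayes shrinkage. Estimate each estimate's sampling
# variance and the between-teacher variance, then pull toward the mean.
samp_var = scores.var(axis=1, ddof=1) / n_students
between_var = max(fe.var(ddof=1) - samp_var.mean(), 0.0)
weight = between_var / (between_var + samp_var)
eb = fe.mean() + weight * (fe - fe.mean())
```

The shrunken estimates `eb` are essentially what a random-effects (modeled) fit would return directly — which is the point of the question: the two-step route lands in the same place as modeling the effects in the first place.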

4 0.23466675 1203 andrew gelman stats-2012-03-08-John Dalton’s Stroop test

Introduction: Drew Linzer showed me this hilarious monochrome image:

5 0.21034141 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

Introduction: Fred Wu writes: I work at National Prescribing Services in Australia. I have a database representing say, antidiabetic drug utilisation for the entire Australia in the past few years. I planned to do a longitudinal analysis across GP Division Network (112 divisions in AUS) using mixed-effects models (or as you called in your book varying intercept and varying slope) on this data. The problem here is: as data actually represent the population who use antidiabetic drugs in AUS, should I use 112 fixed dummy variables to capture the random variations or use varying intercept and varying slope for the model? Because someone may argue that divisions in AUS or states in the USA can hardly be considered from a “superpopulation”, so fixed dummies should be used. What I think is the population are those who use the drugs, what will happen when the rest need to use them? In terms of exchangeability, using varying intercept and varying slopes can be justified. Also you provided in y

6 0.18045217 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

7 0.16369544 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

8 0.1548638 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2

9 0.14830117 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

10 0.14131975 342 andrew gelman stats-2010-10-14-Trying to be precise about vagueness

11 0.12981707 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

12 0.12587753 1234 andrew gelman stats-2012-03-28-The Supreme Court’s Many Median Justices

13 0.12256307 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

14 0.11433298 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

15 0.11255177 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample

16 0.11254817 77 andrew gelman stats-2010-06-09-Sof[t]

17 0.11056186 433 andrew gelman stats-2010-11-27-One way that psychology research is different than medical research

18 0.10931571 2170 andrew gelman stats-2014-01-13-Judea Pearl overview on causal inference, and more general thoughts on the reexpression of existing methods by considering their implicit assumptions

19 0.1089467 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

20 0.10869864 1120 andrew gelman stats-2012-01-15-Fun fight over the Grover search algorithm


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.181), (1, 0.033), (2, 0.052), (3, -0.066), (4, 0.042), (5, -0.022), (6, 0.039), (7, -0.068), (8, 0.079), (9, 0.04), (10, 0.004), (11, 0.001), (12, 0.05), (13, -0.036), (14, 0.075), (15, 0.04), (16, -0.077), (17, 0.046), (18, -0.062), (19, 0.075), (20, -0.037), (21, -0.008), (22, -0.018), (23, -0.003), (24, -0.057), (25, -0.073), (26, -0.099), (27, 0.113), (28, -0.042), (29, 0.039), (30, -0.044), (31, -0.01), (32, -0.055), (33, -0.028), (34, 0.008), (35, -0.053), (36, -0.027), (37, -0.026), (38, -0.007), (39, 0.02), (40, 0.075), (41, -0.04), (42, 0.021), (43, 0.013), (44, -0.07), (45, 0.05), (46, 0.022), (47, -0.066), (48, -0.02), (49, -0.005)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97304559 1241 andrew gelman stats-2012-04-02-Fixed effects and identification


2 0.88256949 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?


3 0.86372405 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups


4 0.8481198 472 andrew gelman stats-2010-12-17-So-called fixed and random effects


5 0.79323345 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

Introduction: Karri Seppa writes: My topic is regional variation in the cause-specific survival of breast cancer patients across the 21 hospital districts in Finland, this component being modeled by random effects. I am interested mainly in the district-specific effects, and with a hierarchical model I can get reasonable estimates also for sparsely populated districts. Based on the recommendation given in the book by yourself and Dr. Hill (2007) I tend to think that the finite-population variance would be an appropriate measure to summarize the overall variation across the 21 districts. However, I feel it is somewhat incoherent first to assume a Normal distribution for the district effects, involving a “superpopulation” variance parameter, and then to compute the finite-population variance from the estimated district-specific parameters. I wonder whether the finite-population variance were more appropriate in the context of a model with fixed district effects? My reply: I agree that th
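Karri Seppa's question turns on the distinction between the superpopulation variance parameter and the finite-population variance of the realized district effects. A minimal sketch of that distinction, with hypothetical numbers (the true effects would in practice be estimated, not observed):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical draw: 21 district effects from a N(0, tau^2) superpopulation.
tau = 0.3
district_effects = rng.normal(0.0, tau, size=21)

# Superpopulation sd: the parameter tau of the assumed normal distribution,
# describing hypothetical new districts.
# Finite-population sd: the sd of the 21 realized effects themselves,
# summarizing variation among these particular districts.
finite_pop_sd = district_effects.std(ddof=1)

print(tau, finite_pop_sd)
```

The two quantities answer different questions, which is why it is coherent to assume a normal superpopulation model and still report the finite-population summary for the 21 districts at hand.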

6 0.7899279 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

7 0.75817794 1891 andrew gelman stats-2013-06-09-“Heterogeneity of variance in experimental studies: A challenge to conventional interpretations”

8 0.73283076 1744 andrew gelman stats-2013-03-01-Why big effects are more important than small effects

9 0.72584313 433 andrew gelman stats-2010-11-27-One way that psychology research is different than medical research

10 0.72210354 1513 andrew gelman stats-2012-09-27-Estimating seasonality with a data set that’s just 52 weeks long

11 0.72146022 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

12 0.71814418 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

13 0.71810323 1310 andrew gelman stats-2012-05-09-Varying treatment effects, again

14 0.7148605 1686 andrew gelman stats-2013-01-21-Finite-population Anova calculations for models with interactions

15 0.71442914 1186 andrew gelman stats-2012-02-27-Confusion from illusory precision

16 0.71322244 2165 andrew gelman stats-2014-01-09-San Fernando Valley cityscapes: An example of the benefits of fractal devastation?

17 0.69677675 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

18 0.68609291 417 andrew gelman stats-2010-11-17-Clutering and variance components

19 0.68587834 560 andrew gelman stats-2011-02-06-Education and Poverty

20 0.68436998 963 andrew gelman stats-2011-10-18-Question on Type M errors


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.012), (13, 0.027), (15, 0.038), (16, 0.056), (24, 0.215), (26, 0.023), (45, 0.033), (62, 0.06), (72, 0.024), (88, 0.033), (89, 0.015), (99, 0.36)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98378491 1414 andrew gelman stats-2012-07-12-Steven Pinker’s unconvincing debunking of group selection

Introduction: Steven Pinker writes : Human beings live in groups, are affected by the fortunes of their groups, and sometimes make sacrifices that benefit their groups. Does this mean that the human brain has been shaped by natural selection to promote the welfare of the group in competition with other groups, even when it damages the welfare of the person and his or her kin? . . . Several scientists whom I [Pinker] greatly respect have said so in prominent places. And they have gone on to use the theory of group selection to make eye-opening claims about the human condition. They have claimed that human morality, particularly our willingness to engage in acts of altruism, can be explained as an adaptation to group-against-group competition. As E. O. Wilson explains, “In a group, selfish individuals beat altruistic individuals. But, groups of altruistic individuals beat groups of selfish individuals.” . . . I [Pinker] am often asked whether I agree with the new group selectionists, and the q

same-blog 2 0.9825514 1241 andrew gelman stats-2012-04-02-Fixed effects and identification


3 0.97569215 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

Introduction: Benedict Carey writes a follow-up article on ESP studies and Bayesian statistics. (See here for my previous thoughts on the topic.) Everything Carey writes is fine, and he even uses an example I recommended: The statistical approach that has dominated the social sciences for almost a century is called significance testing. The idea is straightforward. A finding from any well-designed study — say, a correlation between a personality trait and the risk of depression — is considered “significant” if its probability of occurring by chance is less than 5 percent. This arbitrary cutoff makes sense when the effect being studied is a large one — for example, when measuring the so-called Stroop effect. This effect predicts that naming the color of a word is faster and more accurate when the word and color match (“red” in red letters) than when they do not (“red” in blue letters), and is very strong in almost everyone. “But if the true effect of what you are measuring is small,” sai

4 0.97509629 247 andrew gelman stats-2010-09-01-How does Bayes do it?

Introduction: I received the following message from a statistician working in industry: I am studying your paper, A Weakly Informative Default Prior Distribution for Logistic and Other Regression Models . I am not clear why the Bayesian approaches with some priors can usually handle the issue of nonidentifiability or can get stable estimates of parameters in model fit, while the frequentist approaches cannot. My reply: 1. The term “frequentist approach” is pretty general. “Frequentist” refers to an approach for evaluating inferences, not a method for creating estimates. In particular, any Bayes estimate can be viewed as a frequentist inference if you feel like evaluating its frequency properties. In logistic regression, maximum likelihood has some big problems that are solved with penalized likelihood–equivalently, Bayesian inference. A frequentist can feel free to consider the prior as a penalty function rather than a probability distribution of parameters. 2. The reason our approa

5 0.97508758 970 andrew gelman stats-2011-10-24-Bell Labs

Introduction: Sining Chen told me they’re hiring in the statistics group at Bell Labs . I’ll do my bit for economic stimulus by announcing this job (see below). I love Bell Labs. I worked there for three summers, in a physics lab in 1985-86 under the supervision of Loren Pfeiffer, and by myself in the statistics group in 1990. I learned a lot working for Loren. He was a really smart and driven guy. His lab was a small set of rooms—in Bell Labs, everything’s in a small room, as they value the positive externality of close physical proximity of different labs, which you get by making each lab compact—and it was Loren, his assistant (a guy named Ken West who kept everything running in the lab), and three summer students: me, Gowton Achaibar, and a girl whose name I’ve forgotten. Gowton and I had a lot of fun chatting in the lab. One day I made a silly comment about Gowton’s accent—he was from Guyana and pronounced “three” as “tree”—and then I apologized and said: Hey, here I am making fun o

6 0.97499007 2080 andrew gelman stats-2013-10-28-Writing for free

7 0.97498941 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2

8 0.97463942 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

9 0.9741509 466 andrew gelman stats-2010-12-13-“The truth wears off: Is there something wrong with the scientific method?”

10 0.97397971 1941 andrew gelman stats-2013-07-16-Priors

11 0.97346449 107 andrew gelman stats-2010-06-24-PPS in Georgia

12 0.97331953 427 andrew gelman stats-2010-11-23-Bayesian adaptive methods for clinical trials

13 0.97315955 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

14 0.97314143 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

15 0.97288322 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

16 0.97204971 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

17 0.97163677 22 andrew gelman stats-2010-05-07-Jenny Davidson wins Mark Van Doren Award, also some reflections on the continuity of work within literary criticism or statistics

18 0.97140884 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

19 0.9709965 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

20 0.97085369 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability