andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-368 knowledge-graph by maker-knowledge-mining

368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?


meta info for this blog

Source: html

Introduction: Hendrik Juerges writes: I am an applied econometrician. The reason I am writing is that I am pondering a question for some time now and I am curious whether you have any views on it. One problem the practitioner of instrumental variables estimation faces is large standard errors even with very large samples. Part of the problem is of course that one estimates a ratio. Anyhow, more often than not, I and many other researchers I know end up with large point estimates and standard errors when trying IV on a problem. Sometimes some of us are lucky and get a statistically significant result. Those estimates that make it beyond the 2 standard error threshold are often ridiculously large (one famous example in my line of research being Lleras-Muney’s estimates of the 10% effect of one year of schooling on mortality). The standard defense here is that IV estimates the complier-specific causal effect (which is mathematically correct). But still, I find many of the IV results (including my
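The Type M concern raised in the letter can be checked with a small simulation. The sketch below is illustrative only (the numbers are assumptions, not from the post): a true effect of 0.1, a deliberately weak first stage, and a textbook homoskedastic IV standard error. It repeatedly draws data, keeps the estimates that clear the 2-standard-error threshold, and shows that those "significant" estimates are far larger than the truth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numbers: true effect 0.1, weak first stage (coef 0.1).
# No confounding is simulated; the point is only the selection on
# statistical significance.
n, beta, n_sims = 500, 0.1, 2000
significant = []

for _ in range(n_sims):
    z = rng.normal(size=n)             # instrument
    x = 0.1 * z + rng.normal(size=n)   # weak first stage
    y = beta * x + rng.normal(size=n)  # outcome

    zx, zy = z @ x, z @ y
    est = zy / zx                      # Wald/IV estimate: a ratio
    resid = y - est * x
    se = np.std(resid) * np.sqrt(z @ z) / abs(zx)  # homoskedastic IV s.e.
    if abs(est) > 2 * se:
        significant.append(est)

# Estimates that make it past the 2-s.e. threshold overstate the true
# effect: a Type M (magnitude) error.
print("true effect:", beta)
print("significant in", len(significant), "of", n_sims, "simulations")
print("mean |significant estimate|:", np.mean(np.abs(significant)))
```

The filtering step is the whole story: across all simulations the estimator is centered near the truth, but conditioning on clearing the significance threshold selects exactly the draws where noise inflated the estimate.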


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The reason I am writing is that I am pondering a question for some time now and I am curious whether you have any views on it. [sent-2, score-0.321]

2 One problem the practitioner of instrumental variables estimation faces is large standard errors even with very large samples. [sent-3, score-1.51]

3 Part of the problem is of course that one estimates a ratio. [sent-4, score-0.37]

4 Anyhow, more often than not, I and many other researchers I know end up with large point estimates and standard errors when trying IV on a problem. [sent-5, score-1.004]

5 Sometimes some of us are lucky and get a statistically significant result. [sent-6, score-0.094]

6 Those estimates that make it beyond the 2 standard error threshold are often ridiculously large (one famous example in my line of research being Lleras-Muney’s estimates of the 10% effect of one year of schooling on mortality). [sent-7, score-1.599]

7 The standard defense here is that IV estimates the complier-specific causal effect (which is mathematically correct). [sent-8, score-0.759]

8 Now comes my question: Could it be that IV is particularly prone to “type M” errors? [sent-10, score-0.11]

9 My reply: I’ve never actually done any instrumental variables analysis, Bayesian or otherwise. [sent-14, score-0.37]

10 But I do recall that Imbens and Rubin discuss Bayesian solutions in one of their articles, and I think they made the point that the inclusion of a little bit of prior information can help a lot. [sent-15, score-0.425]

11 In any case, I agree that if standard errors are large, then you’ll be subject to Type M errors. [sent-16, score-0.444]

12 My own way of understanding IV is to think of the instrument as having a joint effect on the intermediate and final outcomes. [sent-18, score-0.471]

13 Often this can be clear enough, and you don’t need to actually divide the coefficients. [sent-19, score-0.091]

14 And here are my more general thoughts on the difficulty of estimating ratios. [sent-20, score-0.13]
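The point about the difficulty of estimating ratios can be made directly. In this hypothetical sketch (all numbers are assumptions for illustration), the numerator and denominator estimates are both well-behaved normals; the only difference between the two scenarios is how clearly the denominator is separated from zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims = 100_000

# Numerator estimate: the same in both scenarios.
num = rng.normal(1.0, 0.5, n_sims)
# Denominator estimate: 10 s.e. from zero (strong) vs. 2 s.e. (weak).
den_strong = rng.normal(1.0, 0.1, n_sims)
den_weak = rng.normal(1.0, 0.5, n_sims)

q99_strong = np.percentile(np.abs(num / den_strong), 99)
q99_weak = np.percentile(np.abs(num / den_weak), 99)

# Occasional near-zero denominators in the weak case blow the ratio up.
print("99th percentile of |ratio|, strong denominator:", q99_strong)
print("99th percentile of |ratio|, weak denominator:", q99_weak)
```

When the denominator could plausibly be near zero or change sign, the ratio's sampling distribution has heavy tails (in the limit, no finite moments), which is one way to read the "ridiculously large" IV estimates described above.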


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('iv', 0.514), ('estimates', 0.24), ('errors', 0.224), ('standard', 0.22), ('large', 0.209), ('instrumental', 0.183), ('anyhow', 0.132), ('practitioner', 0.132), ('effect', 0.124), ('ironclad', 0.122), ('type', 0.119), ('ridiculously', 0.118), ('schooling', 0.118), ('pondering', 0.115), ('imbens', 0.112), ('often', 0.111), ('prone', 0.11), ('inclusion', 0.11), ('bayesian', 0.109), ('faces', 0.104), ('variables', 0.102), ('instrument', 0.1), ('mortality', 0.099), ('lucky', 0.094), ('solutions', 0.093), ('beauty', 0.093), ('intermediate', 0.092), ('help', 0.091), ('divide', 0.091), ('mathematically', 0.091), ('ratios', 0.09), ('threshold', 0.086), ('done', 0.085), ('defense', 0.084), ('joint', 0.079), ('final', 0.076), ('rubin', 0.074), ('sex', 0.071), ('views', 0.07), ('curious', 0.07), ('one', 0.069), ('difficulty', 0.067), ('question', 0.066), ('estimation', 0.066), ('rule', 0.065), ('basically', 0.065), ('famous', 0.064), ('estimating', 0.063), ('recall', 0.062), ('problem', 0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?


2 0.18403108 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

Introduction: I’m involved (with Irv Garfinkel and others) in a planned survey of New York City residents. It’s hard to reach people in the city–not everyone will answer their mail or phone, and you can’t send an interviewer door-to-door in a locked apartment building. (I think it violates IRB to have a plan of pushing all the buzzers by the entrance and hoping someone will let you in.) So the plan is to use multiple modes, including phone, in person household, random street intercepts and mail. The question then is how to combine these samples. My suggested approach is to divide the population into poststrata based on various factors (age, ethnicity, family type, housing type, etc), then to pool responses within each poststratum, then to runs some regressions including postratsta and also indicators for mode, to understand how respondents from different modes differ, after controlling for the demographic/geographic adjustments. Maybe this has already been done and written up somewhere? P.

3 0.18210606 550 andrew gelman stats-2011-02-02-An IV won’t save your life if the line is tangled

Introduction: Alex Tabarrok quotes Randall Morck and Bernard Yeung on difficulties with instrumental variables. This reminded me of some related things I’ve written. In the official story the causal question comes first and then the clever researcher comes up with an IV. I suspect that often it’s the other way around: you find a natural experiment and look at the consequences that flow from it. And maybe that’s not such a bad thing. See section 4 of this article . More generally, I think economists and political scientists are currently a bit overinvested in identification strategies. I agree with Heckman’s point (as I understand it) that ultimately we should be building models that work for us rather than always thinking we can get causal inference on the cheap, as it were, by some trick or another. (This is a point I briefly discuss in a couple places here and also in my recent paper for the causality volume that Don Green etc are involved with.) I recently had this discussion wi

4 0.14835873 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

Introduction: David Hsu writes: I have a (perhaps) simple question about uncertainty in parameter estimates using multilevel models — what is an appropriate threshold for measure parameter uncertainty in a multilevel model? The reason why I ask is that I set out to do a crossed two-way model with two varying intercepts, similar to your flight simulator example in your 2007 book. The difference is that I have a lot of predictors specific to each cell (I think equivalent to airport and pilot in your example), and I find after modeling this in JAGS, I happily find that the predictors are much less important than the variability by cell (airport and pilot effects). Happily because this is what I am writing a paper about. However, I then went to check subsets of predictors using lm() and lmer(). I understand that they all use different estimation methods, but what I can’t figure out is why the errors on all of the coefficient estimates are *so* different. For example, using JAGS, and th

5 0.13777621 1941 andrew gelman stats-2013-07-16-Priors

Introduction: Nick Firoozye writes: While I am absolutely sympathetic to the Bayesian agenda I am often troubled by the requirement of having priors. We must have priors on the parameter of an infinite number of model we have never seen before and I find this troubling. There is a similarly troubling problem in economics of utility theory. Utility is on consumables. To be complete a consumer must assign utility to all sorts of things they never would have encountered. More recent versions of utility theory instead make consumption goods a portfolio of attributes. Cadillacs are x many units of luxury y of transport etc etc. And we can automatically have personal utilities to all these attributes. I don’t ever see parameters. Some model have few and some have hundreds. Instead, I see data. So I don’t know how to have an opinion on parameters themselves. Rather I think it far more natural to have opinions on the behavior of models. The prior predictive density is a good and sensible notion. Also

6 0.13399239 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

7 0.12916201 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

8 0.12510408 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

9 0.12132566 1605 andrew gelman stats-2012-12-04-Write This Book

10 0.11756741 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

11 0.1115354 534 andrew gelman stats-2011-01-24-Bayes at the end

12 0.11011319 716 andrew gelman stats-2011-05-17-Is the internet causing half the rapes in Norway? I wanna see the scatterplot.

13 0.10654887 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

14 0.1054461 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools

15 0.10473032 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

16 0.1035753 2135 andrew gelman stats-2013-12-15-The UN Plot to Force Bayesianism on Unsuspecting Americans (penalized B-Spline edition)

17 0.10299367 1149 andrew gelman stats-2012-02-01-Philosophy of Bayesian statistics: my reactions to Cox and Mayo

18 0.098503068 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

19 0.098317072 896 andrew gelman stats-2011-09-09-My homework success

20 0.098107524 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.214), (1, 0.088), (2, 0.04), (3, -0.094), (4, 0.004), (5, -0.012), (6, 0.014), (7, 0.039), (8, 0.027), (9, -0.023), (10, 0.01), (11, -0.037), (12, 0.054), (13, 0.011), (14, 0.05), (15, 0.003), (16, -0.035), (17, 0.016), (18, -0.013), (19, 0.059), (20, -0.037), (21, 0.018), (22, 0.033), (23, 0.021), (24, 0.025), (25, 0.001), (26, 0.027), (27, -0.053), (28, -0.025), (29, -0.021), (30, 0.065), (31, -0.005), (32, -0.026), (33, -0.007), (34, 0.035), (35, -0.0), (36, -0.014), (37, -0.01), (38, 0.002), (39, -0.014), (40, -0.013), (41, -0.029), (42, -0.063), (43, 0.024), (44, -0.003), (45, -0.022), (46, 0.018), (47, 0.025), (48, 0.017), (49, 0.034)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98200679 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?


2 0.81405449 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

Introduction: Joshua Vogelstein asks for my thoughts as a Bayesian on the above topic. So here they are (briefly): The concept of the bias-variance tradeoff can be useful if you don’t take it too seriously. The basic idea is as follows: if you’re estimating something, you can slice your data finer and finer, or perform more and more adjustments, each time getting a purer—and less biased—estimate. But each subdivision or each adjustment reduces your sample size or increases potential estimation error, hence the variance of your estimate goes up. That story is real. In lots and lots of examples, there’s a continuum between a completely unadjusted general estimate (high bias, low variance) and a specific, focused, adjusted estimate (low bias, high variance). Suppose, for example, you’re using data from a large experiment to estimate the effect of a treatment on a fairly narrow group, say, white men between the ages of 45 and 50. At one extreme, you could just take the estimated treatment e

3 0.79882944 2180 andrew gelman stats-2014-01-21-Everything I need to know about Bayesian statistics, I learned in eight schools.

Introduction: This post is by Phil. I’m aware that there  are  some people who use a Bayesian approach largely because it allows them to provide a highly informative prior distribution based subjective judgment, but that is not the appeal of Bayesian methods for a lot of us practitioners. It’s disappointing and surprising, twenty years after my initial experiences, to still hear highly informed professional statisticians who think that what distinguishes Bayesian statistics from Frequentist statistics is “subjectivity” ( as seen in  a recent blog post and its comments ). My first encounter with Bayesian statistics was just over 20 years ago. I was a postdoc at Lawrence Berkeley National Laboratory, with a new PhD in theoretical atomic physics but working on various problems related to the geographical and statistical distribution of indoor radon (a naturally occurring radioactive gas that can be dangerous if present at high concentrations). One of the issues I ran into right at the start was th

4 0.79035288 1409 andrew gelman stats-2012-07-08-Is linear regression unethical in that it gives more weight to cases that are far from the average?

Introduction: I received the following note from someone who’d like to remain anonymous: I read your post on ethics and statistics, and the comments therein, with much interest. I did notice, however, that most of the dialogue was about ethical behavior of scientists. Herein I’d like to suggest a different take, one that focuses on the statistical methods of scientists. For example, fitting a line to a scatter plot of data using OLS [linear regression] gives more weight to outliers. If each data point represents a person we are weighting people differently. And surely the ethical implications are different if we use a least absolute deviation estimator. Recently I reviewed a paper where the authors claimed one advantage of non-parametric rank-based tests is their robustness to outliers. Again, maybe that outlier is the 10th person who dies from an otherwise beneficial medicine. Should we ignore him in assessing the effect of the medicine? I guess this gets me partly into loss f

5 0.77771062 248 andrew gelman stats-2010-09-01-Ratios where the numerator and denominator both change signs

Introduction: A couple years ago, I used a question by Benjamin Kay as an excuse to write that it’s usually a bad idea to study a ratio whose denominator has uncertain sign. As I wrote then: Similar problems arise with marginal cost-benefit ratios, LD50 in logistic regression (see chapter 3 of Bayesian Data Analysis for an example), instrumental variables, and the Fieller-Creasy problem in theoretical statistics. . . . In general, the story is that the ratio completely changes in interpretation when the denominator changes sign. More recently, Kay sent in a related question: I [Kay] wondered if you have any advice on handling ratios when the signs change as a result of a parameter. I have three functions, one C * x^a, another D * x^a, and a third f(x,a) in my paper such that: C * x^a < f(x,a) < D * x^a. C, D, and a all have the same signs. We can divide through by C * x^a, but the results depend on the sign of C: either 1 < f(x,a) / C * x^a < D * x^a / C * x^a, or 1 / f(x,a

6 0.77497762 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing

7 0.76443148 2099 andrew gelman stats-2013-11-13-“What are some situations in which the classical approach (or a naive implementation of it, based on cookbook recipes) gives worse results than a Bayesian approach, results that actually impeded the science?”

8 0.75678784 775 andrew gelman stats-2011-06-21-Fundamental difficulty of inference for a ratio when the denominator could be positive or negative

9 0.75223279 2223 andrew gelman stats-2014-02-24-“Edlin’s rule” for routinely scaling down published estimates

10 0.75199395 553 andrew gelman stats-2011-02-03-is it possible to “overstratify” when assigning a treatment in a randomized control trial?

11 0.75031388 466 andrew gelman stats-2010-12-13-“The truth wears off: Is there something wrong with the scientific method?”

12 0.7492488 518 andrew gelman stats-2011-01-15-Regression discontinuity designs: looking for the keys under the lamppost?

13 0.74778873 310 andrew gelman stats-2010-10-02-The winner’s curse

14 0.73279548 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

15 0.73132676 1971 andrew gelman stats-2013-08-07-I doubt they cheated

16 0.73022729 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

17 0.72753108 1441 andrew gelman stats-2012-08-02-“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”

18 0.72529536 106 andrew gelman stats-2010-06-23-Scientists can read your mind . . . as long as they’re allowed to look at more than one place in your brain and then make a prediction after seeing what you actually did

19 0.72497916 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things

20 0.72476083 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.014), (16, 0.038), (24, 0.142), (53, 0.02), (64, 0.013), (72, 0.021), (76, 0.205), (86, 0.05), (95, 0.029), (99, 0.362)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97976625 300 andrew gelman stats-2010-09-28-A calibrated Cook gives Dems the edge in Nov, sez Sandy

Introduction: Sandy Gordon sends along this fun little paper forecasting the 2010 midterm election using expert predictions (the Cook and Rothenberg Political Reports). Gordon’s gimmick is that he uses past performance to calibrate the reports’ judgments based on “solid,” “likely,” “leaning,” and “toss-up” categories, and then he uses the calibrated versions of the current predictions to make his forecast. As I wrote a few weeks ago in response to Nate’s forecasts, I think the right way to go, if you really want to forecast the election outcome, is to use national information to predict the national swing and then do regional, state, and district-level adjustments using whatever local information is available. I don’t see the point of using only the expert forecasts and no other data. Still, Gordon is bringing new information (his calibrations) to the table, so I wanted to share it with you. Ultimately I like the throw-in-everything approach that Nate uses (although I think Nate’s descr

2 0.97474623 988 andrew gelman stats-2011-11-02-Roads, traffic, and the importance in decision analysis of carefully examining your goals

Introduction: Sandeep Baliga writes : [In a recent study , Gilles Duranton and Matthew Turner write:] For interstate highways in metropolitan areas we [Duranton and Turner] find that VKT (vehicle kilometers traveled) increases one for one with interstate highways, confirming the fundamental law of highway congestion.’ Provision of public transit also simply leads to the people taking public transport being replaced by drivers on the road. Therefore: These findings suggest that both road capacity expansions and extensions to public transit are not appropriate policies with which to combat traffic congestion. This leaves congestion pricing as the main candidate tool to curb traffic congestion. To which I reply: Sure, if your goal is to curb traffic congestion . But what sort of goal is that? Thinking like a microeconomist, my policy goal is to increase people’s utility. Sure, traffic congestion is annoying, but there must be some advantages to driving on that crowded road or pe

3 0.972408 1551 andrew gelman stats-2012-10-28-A convenience sample and selected treatments

Introduction: Charlie Saunders writes: A study has recently been published in the New England Journal of Medicine (NEJM) which uses survival analysis to examine long-acting reversible contraception (e.g. intrauterine devices [IUDs]) vs. short-term commonly prescribed methods of contraception (e.g. oral contraceptive pills) on unintended pregnancies. The authors use a convenience sample of over 7,000 women. I am not well versed-enough in sampling theory to determine the appropriateness of this but it would seem that the use of a non-probability sampling would be a significant drawback. If you could give me your opinion on this, I would appreciate it. The NEJM is one of the top medical journals in the country. Could this type of sampling method coupled with this method of analysis be published in a journal like JASA? My reply: There are two concerns, first that it is a convenience sample and thus not representative of the population, and second that the treatments are chosen rather tha

4 0.96250588 1351 andrew gelman stats-2012-05-29-A Ph.D. thesis is not really a marathon

Introduction: Thomas Basbøll writes : A blog called The Thesis Whisperer was recently pointed out to me. I [Basbøll] haven’t looked at it closely, but I’ll be reading it regularly for a while before I recommend it. I’m sure it’s a good place to go to discover that you’re not alone, especially when you’re struggling with your dissertation. One post caught my eye immediately. It suggested that writing a thesis is not a sprint, it’s a marathon. As a metaphorical adjustment to a particular attitude about writing, it’s probably going to help some people. But if we think it through, it’s not really a very good analogy. No one is really a “sprinter”; and writing a dissertation is nothing like running a marathon. . . . Here’s Ben’s explication of the analogy at the Thesis Whisperer, which seems initially plausible. …writing a dissertation is a lot like running a marathon. They are both endurance events, they last a long time and they require a consistent and carefully calculated amount of effor

same-blog 5 0.95213151 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?


6 0.95034558 283 andrew gelman stats-2010-09-17-Vote Buying: Evidence from a List Experiment in Lebanon

7 0.95015568 51 andrew gelman stats-2010-05-26-If statistics is so significantly great, why don’t statisticians use statistics?

8 0.94826984 1850 andrew gelman stats-2013-05-10-The recursion of pop-econ

9 0.94403362 1835 andrew gelman stats-2013-05-02-7 ways to separate errors from statistics

10 0.94339216 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

11 0.9432615 337 andrew gelman stats-2010-10-12-Election symposium at Columbia Journalism School

12 0.94220877 257 andrew gelman stats-2010-09-04-Question about standard range for social science correlations

13 0.92762148 1600 andrew gelman stats-2012-12-01-$241,364.83 – $13,000 = $228,364.83

14 0.92404902 32 andrew gelman stats-2010-05-14-Causal inference in economics

15 0.92226291 1818 andrew gelman stats-2013-04-22-Goal: Rules for Turing chess

16 0.91721642 1105 andrew gelman stats-2012-01-08-Econ debate about prices at a fancy restaurant

17 0.91144997 922 andrew gelman stats-2011-09-24-Economists don’t think like accountants—but maybe they should

18 0.90892839 1084 andrew gelman stats-2011-12-26-Tweeting the Hits?

19 0.90865022 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

20 0.90772855 467 andrew gelman stats-2010-12-14-Do we need an integrated Bayesian-likelihood inference?