andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-1047 knowledge-graph by maker-knowledge-mining

1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model


meta info for this blog

Source: html

Introduction: Soren Lorensen wrote: I’m working on a project that uses a binary choice model on panel data. Since I have panel data and am using MLE, I’m concerned about heteroskedasticity making my estimates inconsistent and biased. Are you familiar with any statistical packages with pre-built tests for heteroskedasticity in binary choice ML models? If not, is there value in cutting my data into groups over which I guess the error variance might vary and eyeballing residual plots? Have you other suggestions about how I might resolve this concern? I replied that I wouldn’t worry so much about heteroskedasticity. Breaking up the data into pieces might make sense, but for the purpose of estimating how the coefficients might vary—that is, nonlinearity and interactions. Soren shot back: I’m somewhat puzzled however: homoskedasticity is an identifying assumption in estimating a probit model: if we don’t have it all sorts of bad things can happen to our parameter estimates. Do you suggest not worrying about it because the means of dealing with it are so noisy?


Summary: the most important sentences, generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Soren Lorensen wrote: I’m working on a project that uses a binary choice model on panel data. [sent-1, score-0.637]

2 Since I have panel data and am using MLE, I’m concerned about heteroskedasticity making my estimates inconsistent and biased. [sent-2, score-1.023]

3 Are you familiar with any statistical packages with pre-built tests for heteroskedasticity in binary choice ML models? [sent-3, score-0.972]

4 If not, is there value in cutting my data into groups over which I guess the error variance might vary and eyeballing residual plots? [sent-4, score-0.451]

5 Have you other suggestions about how I might resolve this concern? [sent-5, score-0.203]

6 I replied that I wouldn’t worry so much about heteroskedasticity. [sent-6, score-0.123]

7 Breaking up the data into pieces might make sense, but for the purpose of estimating how the coefficients might vary—that is, nonlinearity and interactions. [sent-7, score-0.685]

8 Soren shot back: I’m somewhat puzzled however: homoskedasticity is an identifying assumption in estimating a probit model: if we don’t have it all sorts of bad things can happen to our parameter estimates. [sent-8, score-0.563]

9 Do you suggest not worrying about it because the means of dealing with it are so noisy? [sent-9, score-0.179]

10 [I had hoped to test for it using the algorithm suggested by Davidson & MacKinnon (1993) and to correct for it using a multiplicative heteroskedasticity model. [sent-10, score-1.091]

11 To which I replied: If you’re worried you can always check your model fitting using some simulated data. [sent-12, score-0.418]
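The reply in sentence 11 is the fake-data check: simulate data from a probit with known coefficients, refit, and confirm the estimates land near the truth. Below is a minimal, stdlib-only Python sketch of that check; the gradient-ascent fitter and every name in it are illustrative, not code from the post.

```python
import math
import random

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def simulate(n, b0, b1, rng):
    """Draw (x, y) pairs from a probit model with known coefficients."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = Phi(b0 + b1 * x)
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys

def fit_probit(xs, ys, steps=300, lr=0.5):
    """Probit MLE by gradient ascent on the (concave) log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            z = b0 + b1 * x
            p = min(max(Phi(z), 1e-10), 1.0 - 1e-10)
            w = phi(z) * (y - p) / (p * (1.0 - p))  # probit score weight
            g0 += w
            g1 += w * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

rng = random.Random(42)
xs, ys = simulate(500, -0.5, 1.0, rng)   # true coefficients: (-0.5, 1.0)
b0_hat, b1_hat = fit_probit(xs, ys)
print(round(b0_hat, 2), round(b1_hat, 2))
```

If the recovered coefficients landed far from (-0.5, 1.0), that would flag a bug in the fitting procedure before any heteroskedasticity worry even arises.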


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('heteroskedasticity', 0.498), ('soren', 0.29), ('binary', 0.222), ('choice', 0.17), ('panel', 0.167), ('concerned', 0.149), ('vary', 0.142), ('concern', 0.135), ('scoffed', 0.132), ('stem', 0.125), ('hoped', 0.125), ('replied', 0.123), ('estimating', 0.12), ('mle', 0.119), ('nonlinearity', 0.119), ('might', 0.117), ('undergrad', 0.115), ('ml', 0.115), ('multiplicative', 0.112), ('using', 0.11), ('davidson', 0.106), ('graduated', 0.106), ('probit', 0.104), ('puzzled', 0.104), ('residual', 0.1), ('inconsistent', 0.099), ('worrying', 0.097), ('cutting', 0.092), ('simulated', 0.088), ('breaking', 0.087), ('resolve', 0.086), ('identifying', 0.085), ('shot', 0.084), ('packages', 0.082), ('dealing', 0.082), ('econometrics', 0.081), ('model', 0.078), ('pieces', 0.077), ('noisy', 0.076), ('professors', 0.076), ('maximum', 0.075), ('worried', 0.074), ('plots', 0.073), ('concerns', 0.07), ('algorithm', 0.07), ('always', 0.068), ('purpose', 0.068), ('coefficients', 0.067), ('suggested', 0.066), ('assumption', 0.066)]
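The wordName/wordTfidf pairs above come from tf-idf weighting: rare, post-specific words like “heteroskedasticity” score high, while words common across the corpus score near zero. A stdlib sketch of the textbook computation on a toy corpus; the mining pipeline’s exact weighting and normalization are not documented here, so treat this only as the standard variant.

```python
import math
from collections import Counter

def tfidf(docs):
    """Textbook tf-idf: term frequency times log inverse document frequency."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency per term
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

# Toy corpus echoing the post's vocabulary.
docs = [
    "heteroskedasticity in a probit model on panel data".split(),
    "a probit model for binary choice data".split(),
    "panel data and maximum likelihood estimates".split(),
]
w = tfidf(docs)
best = max(w[0], key=w[0].get)
print(best, round(w[0][best], 3))
```

Note that “data”, which appears in every toy document, gets weight exactly zero, while the corpus-unique “heteroskedasticity” tops the first document, mirroring the 0.498 at the head of the list above.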

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model


2 0.12436163 2277 andrew gelman stats-2014-03-31-The most-cited statistics papers ever

Introduction: Robert Grant has a list . I’ll just give the ones with more than 10,000 Google Scholar cites: Cox (1972) Regression and life tables: 35,512 citations. Dempster, Laird, Rubin (1977) Maximum likelihood from incomplete data via the EM algorithm: 34,988 Bland & Altman (1986) Statistical methods for assessing agreement between two methods of clinical measurement: 27,181 Geman & Geman (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images: 15,106 We can find some more via searching Google scholar for familiar names and topics; thus: Metropolis et al. (1953) Equation of state calculations by fast computing machines: 26,000 Benjamini and Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing: 21,000 White (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity: 18,000 Heckman (1977) Sample selection bias as a specification error:

3 0.10905684 906 andrew gelman stats-2011-09-14-Another day, another stats postdoc

Introduction: This post is from Phil Price.  I work in the Environmental Energy Technologies Division at Lawrence Berkeley National Laboratory, and I am looking for a postdoc who knows substantially more than I do about time-series modeling; in practice this probably means someone whose dissertation work involved that sort of thing.  The work involves developing models to predict and/or forecast the time-dependent energy use in buildings, given historical data and some covariates such as outdoor temperature.  Simple regression approaches (e.g. using time-of-week indicator variables, plus outdoor temperature) work fine for a lot of things, but we still have a variety of problems.  To give one example, sometimes building behavior changes — due to retrofits, or a change in occupant behavior — so that a single model won’t fit well over a long time period. We want to recognize these changes automatically .  We have many other issues besides: heteroskedasticity, need for good uncertainty estimates, abilit

4 0.10559429 1960 andrew gelman stats-2013-07-28-More on that machine learning course

Introduction: Following up on our discussion the other day, Andrew Ng writes: Looking at the “typical” ML syllabus, I think most classes do a great job teaching the core ideas, but that there’re two recent trends in ML that are usually not yet reflected. First, unlike 10 years ago, a lot of our students are now taking ML not to do ML research, but to apply it in other research areas or in industry. I’d like to serve these students as well. While many ML classes do a nice job teaching the theory and core algorithms, I’ve seen very few that teach the “hands-on” tactics for how to actually build a high-performance ML system, or on how to think about piecing together a complex ML architecture. For example, what sorts of diagnostics do you run to figure out why your algorithm isn’t giving reasonable accuracy? How much do you invest in collecting additional training data? How do you structure your org chart and metrics if you think there’re 3 components that need to be built and plugged

5 0.10089489 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

Introduction: Chris Che-Castaldo writes: I am trying to compute variance components for a hierarchical model where the group level has two binary predictors and their interaction. When I model each of these three predictors as N(0, tau) the model will not converge, perhaps because the number of coefficients in each batch is so small (2 for the main effects and 4 for the interaction). Although I could simply leave all these as predictors as unmodeled fixed effects, the last sentence of section 21.2 on page 462 of Gelman and Hill (2007) suggests this would not be a wise course of action: For example, it is not clear how to define the (finite) standard deviation of variables that are included in interactions. I am curious – is there still no clear cut way to directly compute the finite standard deviation for binary unmodeled variables that are also part of an interaction as well as the interaction itself? My reply: I’d recommend including these in your model (it’s probably easiest to do so

6 0.10040827 328 andrew gelman stats-2010-10-08-Displaying a fitted multilevel model

7 0.10026214 1252 andrew gelman stats-2012-04-08-Jagdish Bhagwati’s definition of feminist sincerity

8 0.099506795 1032 andrew gelman stats-2011-11-28-Does Avastin work on breast cancer? Should Medicare be paying for it?

9 0.092124738 1956 andrew gelman stats-2013-07-25-What should be in a machine learning course?

10 0.091473296 780 andrew gelman stats-2011-06-27-Bridges between deterministic and probabilistic models for binary data

11 0.086421706 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs

12 0.085730545 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

13 0.08474353 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

14 0.084616527 2294 andrew gelman stats-2014-04-17-If you get to the point of asking, just do it. But some difficulties do arise . . .

15 0.083019562 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

16 0.082744442 2163 andrew gelman stats-2014-01-08-How to display multinominal logit results graphically?

17 0.082238317 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

18 0.081729986 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

19 0.081395909 1392 andrew gelman stats-2012-06-26-Occam

20 0.079877734 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.139), (1, 0.092), (2, 0.028), (3, -0.007), (4, 0.044), (5, 0.001), (6, 0.009), (7, -0.034), (8, 0.045), (9, 0.041), (10, 0.011), (11, 0.024), (12, -0.014), (13, -0.027), (14, -0.035), (15, -0.001), (16, -0.014), (17, -0.014), (18, -0.02), (19, -0.017), (20, 0.013), (21, -0.005), (22, 0.015), (23, -0.03), (24, -0.029), (25, -0.014), (26, -0.028), (27, 0.0), (28, 0.034), (29, 0.002), (30, -0.017), (31, 0.024), (32, 0.018), (33, -0.008), (34, 0.03), (35, 0.004), (36, -0.019), (37, -0.008), (38, -0.014), (39, 0.017), (40, 0.012), (41, -0.014), (42, -0.022), (43, 0.039), (44, -0.034), (45, -0.007), (46, -0.01), (47, 0.015), (48, 0.039), (49, 0.027)]
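The simValue column in these similarity lists is consistent with cosine similarity between per-blog vectors such as the LSI topic weights above; that is an assumption, since the pipeline does not say. A stdlib sketch for dense vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

# A vector is always maximally similar to itself...
v = [0.139, 0.092, 0.028, -0.007, 0.044]   # first few LSI weights from above
assert abs(cosine(v, v) - 1.0) < 1e-12
# ...and similarity falls as the topic profiles diverge.
w = [0.139, -0.092, 0.028, 0.007, -0.044]  # hypothetical second blog
print(round(cosine(v, w), 3))
```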

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9637922 1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model


2 0.82528806 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

Introduction: I think cross-validation is a good way to estimate a model’s forecasting error but I don’t think it’s always such a great tool for comparing models. I mean, sure, if the differences are dramatic, ok. But you can easily have a few candidate models, and one model makes a lot more sense than the others (even from a purely predictive sense, I’m not talking about causality here). The difference between the model doesn’t show up in a xval measure of total error but in the patterns of the predictions. For a simple example, imagine using a linear model with positive slope to model a function that is constrained to be increasing. If the constraint isn’t in the model, the predicted/imputed series will sometimes be nonmonotonic. The effect on the prediction error can be so tiny as to be undetectable (or it might even increase avg prediction error to include the constraint); nonetheless, the predictions will be clearly nonsensical. That’s an extreme example but I think the general point h

3 0.81633961 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

Introduction: Dean Eckles writes: I make extensive use of random effects models in my academic and industry research, as they are very often appropriate. However, with very large data sets, I am not sure what to do. Say I have thousands of levels of a grouping factor, and the number of observations totals in the billions. Despite having lots of observations, I am often either dealing with (a) small effects or (b) trying to fit models with many predictors. So I would really like to use a random effects model to borrow strength across the levels of the grouping factor, but I am not sure how to practically do this. Are you aware of any approaches to fitting random effects models (including approximations) that work for very large data sets? For example, applying a procedure to each group, and then using the results of this to shrink each fit in some appropriate way. Just to clarify, here I am only worried about the non-crossed and in fact single-level case. I don’t see any easy route for cross

4 0.81472784 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

Introduction: A research psychologist writes in with a question that’s so long that I’ll put my answer first, then put the question itself below the fold. Here’s my reply: As I wrote in my Anova paper and in my book with Jennifer Hill, I do think that multilevel models can completely replace Anova. At the same time, I think the central idea of Anova should persist in our understanding of these models. To me the central idea of Anova is not F-tests or p-values or sums of squares, but rather the idea of predicting an outcome based on factors with discrete levels, and understanding these factors using variance components. The continuous or categorical response thing doesn’t really matter so much to me. I have no problem using a normal linear model for continuous outcomes (perhaps suitably transformed) and a logistic model for binary outcomes. I don’t want to throw away interactions just because they’re not statistically significant. I’d rather partially pool them toward zero using an inform

5 0.81142873 1374 andrew gelman stats-2012-06-11-Convergence Monitoring for Non-Identifiable and Non-Parametric Models

Introduction: Becky Passonneau and colleagues at the Center for Computational Learning Systems (CCLS) at Columbia have been working on a project for ConEd (New York’s major electric utility) to rank structures based on vulnerability to secondary events (e.g., transformer explosions, cable meltdowns, electrical fires). They’ve been using the R implementation BayesTree of Chipman, George and McCulloch’s Bayesian Additive Regression Trees (BART). BART is a Bayesian non-parametric method that is non-identifiable in two ways. Firstly, it is an additive tree model with a fixed number of trees, the indexes of which aren’t identified (you get the same predictions in a model swapping the order of the trees). This is the same kind of non-identifiability you get with any mixture model (additive or interpolated) with an exchangeable prior on the mixture components. Secondly, the trees themselves have varying structure over samples in terms of number of nodes and their topology (depth, branching, etc

6 0.80661452 250 andrew gelman stats-2010-09-02-Blending results from two relatively independent multi-level models

7 0.79402232 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

8 0.79307789 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

9 0.79246068 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

10 0.79145205 1346 andrew gelman stats-2012-05-27-Average predictive comparisons when changing a pair of variables

11 0.77849126 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

12 0.77737385 1392 andrew gelman stats-2012-06-26-Occam

13 0.77459037 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model

14 0.77413762 1363 andrew gelman stats-2012-06-03-Question about predictive checks

15 0.77295315 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

16 0.7716642 1431 andrew gelman stats-2012-07-27-Overfitting

17 0.76767761 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

18 0.76749849 938 andrew gelman stats-2011-10-03-Comparing prediction errors

19 0.76645631 1875 andrew gelman stats-2013-05-28-Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from

20 0.76382309 1981 andrew gelman stats-2013-08-14-The robust beauty of improper linear models in decision making


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.027), (16, 0.082), (21, 0.014), (24, 0.162), (47, 0.011), (53, 0.247), (63, 0.047), (74, 0.013), (83, 0.013), (86, 0.017), (90, 0.012), (99, 0.251)]
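Unlike the dense LSI vector, the lda weights are listed sparsely as (topicId, topicWeight) pairs, so a similarity computation would naturally work on dicts. Again assuming cosine similarity (the pipeline does not specify), and with a made-up second blog for comparison:

```python
import math

def sparse_cosine(u, v):
    """Cosine similarity for sparse {topicId: weight} vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

# LDA weights for this blog, as listed above (topicId: topicWeight).
this_blog = {15: 0.027, 16: 0.082, 21: 0.014, 24: 0.162, 47: 0.011,
             53: 0.247, 63: 0.047, 74: 0.013, 83: 0.013, 86: 0.017,
             90: 0.012, 99: 0.251}
# A hypothetical other blog sharing only topics 24 and 99.
other = {24: 0.2, 40: 0.3, 99: 0.25}
sim = sparse_cosine(this_blog, other)
print(round(sim, 3))
```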

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97165871 1589 andrew gelman stats-2012-11-25-Life as a blogger: the emails just get weirder and weirder

Introduction: In the email the other day, subject line “Casting blogger, writer, journalist to host cable series”: Hi there Andrew, I’m casting a male journalist, writer, blogger, documentary filmmaker or comedian with a certain type personality for a television pilot along with production company, Pipeline39. See below: A certain type of character – no cockiness, no ego, a person who is smart, savvy, dry humor, but someone who isn’t imposing, who can infiltrate these organizations. This person will be hosting his own show and covering alternative lifestyles and secret societies around the world. If you’re interested in hearing more or would like to be considered for this project, please email me a photo and a bio of yourself, along with contact information. I’ll respond to you ASAP. I’m looking forward to hearing from you. *** Casting Producer (646) ***.**** ***@gmail.com I was with them until I got to the “no ego” part. . . . Also, I don’t think I could infiltrate any org

2 0.96157813 1856 andrew gelman stats-2013-05-14-GPstuff: Bayesian Modeling with Gaussian Processes

Introduction: I think it’s part of my duty as a blogger to intersperse, along with the steady flow of jokes, rants, and literary criticism, some material that will actually be useful to you. So here goes. Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, and Aki Vehtari write : The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods. We can actually now fit Gaussian processes in Stan . But for big problems (or even moderately-sized problems), full Bayes can be slow. GPstuff uses EP, which is faster. At some point we’d like to implement EP in Stan. (Right now we’re working with Dave Blei to implement VB.) GPstuff really works. I saw Aki use it to fit a nonparametric version of the Bangladesh well-switching example in ARM. He was sitting in his office and just whip

3 0.94642031 298 andrew gelman stats-2010-09-27-Who is that masked person: The use of face masks on Mexico City public transportation during the Influenza A (H1N1) outbreak

Introduction: Tapen Sinha writes: Living in Mexico, I have been witness to many strange (and beautiful) things. Perhaps the strangest happened during the first outbreak of A(H1N1) in Mexico City. We had our university closed, football (soccer) was played in empty stadiums (or should it be stadia) because the government feared a spread of the virus. The Metro was operating and so were the private/public buses and taxis. Since the university was closed, we took the opportunity to collect data on facemask use in the public transport systems. It was a simple (but potentially deadly!) exercise in first hand statistical data collection that we teach our students (Although I must admit that I did not dare sending my research assistant to collect data – what if she contracted the virus?). I believe it was a unique experiment never to be repeated. The paper appeared in the journal Health Policy. From the abstract: At the height of the influenza epidemic in Mexico City in the spring of 2009, the f

4 0.93237329 1677 andrew gelman stats-2013-01-16-Greenland is one tough town

Introduction: Americans (including me) don’t know much about other countries. Jeff Lax sent me to this blog post by Myrddin pointing out that Belgium has a higher murder rate than the rest of Western Europe. I have no particular take on this, but it’s a good reminder that other countries differ from each other. Here in the U.S., we tend to think all western European countries are the same, all eastern European countries are the same, etc. In reality, Sweden is not Finland . P.S. According to the Wiki , Greenland is one tough town. I guess there’s nothing much to do out there but watch satellite TV, chew the blubber, and kill people.

5 0.92843944 1905 andrew gelman stats-2013-06-18-There are no fat sprinters

Introduction: This post is by Phil. A little over three years ago I wrote a post about exercise and weight loss in which I described losing a fair amount of weight due to (I believe) an exercise regime, with no effort to change my diet; this contradicted the prediction of studies that had recently been released. The comment thread on that post is quite interesting: a lot of people had had similar experiences — losing weight, or keeping it off, with an exercise program that includes very short periods of exercise at maximal intensity — while other people expressed some skepticism about my claims. Some commenters said that I risked injury; others said it was too early to judge anything because my weight loss might not last. The people who predicted injury were right: running the curve during a 200m sprint a month or two after that post, I strained my Achilles tendon. Nothing really serious, but it did keep me off the track for a couple of months, and rather than go back to sprinting I switched t

6 0.92784786 1468 andrew gelman stats-2012-08-24-Multilevel modeling and instrumental variables

same-blog 7 0.92578667 1047 andrew gelman stats-2011-12-08-I Am Too Absolutely Heteroskedastic for This Probit Model

8 0.92376351 991 andrew gelman stats-2011-11-04-Insecure researchers aren’t sharing their data

9 0.92025185 46 andrew gelman stats-2010-05-21-Careers, one-hit wonders, and an offer of a free book

10 0.91158342 1555 andrew gelman stats-2012-10-31-Social scientists who use medical analogies to explain causal inference are, I think, implicitly trying to borrow some of the scientific and cultural authority of that field for our own purposes

11 0.91044092 1802 andrew gelman stats-2013-04-14-Detecting predictability in complex ecosystems

12 0.90011704 733 andrew gelman stats-2011-05-27-Another silly graph

13 0.89962262 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

14 0.89691859 2022 andrew gelman stats-2013-09-13-You heard it here first: Intense exercise can suppress appetite

15 0.89539576 547 andrew gelman stats-2011-01-31-Using sample size in the prior distribution

16 0.89271897 495 andrew gelman stats-2010-12-31-“Threshold earners” and economic inequality

17 0.87730223 446 andrew gelman stats-2010-12-03-Is 0.05 too strict as a p-value threshold?

18 0.87661481 413 andrew gelman stats-2010-11-14-Statistics of food consumption

19 0.87619042 880 andrew gelman stats-2011-08-30-Annals of spam

20 0.86304462 354 andrew gelman stats-2010-10-19-There’s only one Amtrak