andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1270 knowledge-graph by maker-knowledge-mining

1270 andrew gelman stats-2012-04-19-Demystifying Blup


meta info for this blog

Source: html

Introduction: In our recent thread on computing hierarchical models with big datasets, someone brought up Blup. I thought it might be worth explaining what Blup is and how it relates to hierarchical models. Blup stands for Best Linear Unbiased Prediction, but in my terminology it’s just hierarchical modeling. Let me break it down:

- “Best” doesn’t really matter. What’s important is that our estimates and predictions make sense and are as accurate as possible.
- “Linear” isn’t so important. Statistical predictions are linear for Gaussian linear models, otherwise not. We can and do perform hierarchical generalized linear models all the time.
- “Unbiased” doesn’t really matter (see discussion of “Best,” above).
- “Prediction” is the key word for relating Blup and hierarchical modeling to classical statistical terminology. In classical statistics, “estimation” of a parameter theta is evaluated conditional on the true value of theta, whereas “prediction” of a predictive quantity phi is evaluated unconditional on phi, but conditional on theta.
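To make the claim that Blup and hierarchical modeling give the same answers concrete, here is a minimal sketch for the simplest normal-normal setting with known variances; the simulated data and variable names are illustrative assumptions, not anything from the post. The BLUP of each group effect is exactly the hierarchical estimate (the conditional posterior mean): a precision-weighted compromise between that group’s data and the overall mean.

# Illustrative sketch only. Model: y_j ~ N(theta_j, sigma^2), theta_j ~ N(mu, tau^2),
# with mu, sigma, tau treated as known. The BLUP of theta_j and the hierarchical
# posterior mean are the same precision-weighted average of y_j and mu.
import numpy as np

def blup(y, sigma, mu, tau):
    w = (1 / sigma**2) / (1 / sigma**2 + 1 / tau**2)  # weight on the group's own data
    return w * y + (1 - w) * mu                       # shrink toward the overall mean

rng = np.random.default_rng(1)
mu, tau, sigma = 0.0, 1.0, 2.0
theta = rng.normal(mu, tau, size=8)   # true group effects
y = rng.normal(theta, sigma)          # one noisy observation per group
print(blup(y, sigma, mu, tau))        # partially pooled estimates

The “best linear unbiased” language describes the frequentist optimality of this same estimator among linear unbiased predictors; computationally nothing changes when you call it a posterior mean instead.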


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In our recent thread on computing hierarchical models with big datasets, someone brought up Blup. [sent-1, score-0.567]

2 I thought it might be worth explaining what Blup is and how it relates to hierarchical models. [sent-2, score-0.389]

3 Blup stands for Best Linear Unbiased Prediction, but in my terminology it’s just hierarchical modeling. [sent-3, score-0.409]

4 Let me break it down: - “Best” doesn’t really matter. [sent-4, score-0.047]

5 What’s important is that our estimates and predictions make sense and are as accurate as possible. [sent-5, score-0.124]

6 Statistical predictions are linear for Gaussian linear models, otherwise not. [sent-7, score-0.453]

7 We can and do perform hierarchical generalized linear models all the time. [sent-8, score-0.654]

8 - “Prediction” is the key word for relating Blup and hierarchical modeling to classical statistical terminology. [sent-10, score-0.588]

9 In classical statistics, “estimation” of a parameter theta is evaluated conditional on the true value of theta, whereas “prediction” of a predictive quantity phi is evaluated unconditional on phi, but conditional on theta. [sent-11, score-0.967]

10 “Prediction” is a way to do Bayesian inference in a classical setting. [sent-12, score-0.156]

11 In the classical “empirical Bayes” framework, some of the unknowns are called “parameters” and some are called “predictive quantities” or missing data. [sent-13, score-0.355]

12 We discuss this briefly in BDA (maybe in a footnote somewhere). [sent-15, score-0.107]

13 For the purposes of modeling and data analysis, Blup is hierarchical regression. [sent-16, score-0.429]

14 Any computational method for Blup can be ported directly into computation for hierarchical modeling, and vice-versa. [sent-18, score-0.529]
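The last summary sentence, about porting computation between Blup and hierarchical modeling, can be illustrated directly. Below is a minimal sketch (simulated data, and variance components treated as known, both my assumptions rather than anything in the post) in which the same random-intercept fit is computed two ways: once by solving Henderson’s mixed-model equations, the classical Blup route, and once as generalized least squares for the overall mean plus conditional posterior means for the group effects, the hierarchical-regression route. The two routes return identical numbers.

# Illustrative sketch only: one random-intercept model, two computational routes.
import numpy as np

rng = np.random.default_rng(2)
J, n_per = 5, 4                         # groups and observations per group
sigma, tau = 1.0, 0.7                   # within- and between-group sd (treated as known)
g = np.repeat(np.arange(J), n_per)      # group labels
y = rng.normal(0.5 + rng.normal(0, tau, J)[g], sigma)

# Route 1: Henderson's mixed-model equations (the classical Blup computation).
X = np.ones((len(y), 1))                # fixed effect: overall intercept
Z = np.eye(J)[g]                        # random effects: group indicators
A = np.block([[X.T @ X / sigma**2, X.T @ Z / sigma**2],
              [Z.T @ X / sigma**2, Z.T @ Z / sigma**2 + np.eye(J) / tau**2]])
b = np.concatenate([X.T @ y, Z.T @ y]) / sigma**2
sol = np.linalg.solve(A, b)
mu_blup, u_blup = sol[0], sol[1:]

# Route 2: the hierarchical-regression view, written as GLS plus shrinkage.
ybar = np.array([y[g == j].mean() for j in range(J)])
v = tau**2 + sigma**2 / n_per           # variance of each group mean
mu_hat = np.sum(ybar / v) / np.sum(1 / v)   # GLS estimate of the overall mean
u_hat = (tau**2 / v) * (ybar - mu_hat)      # conditional posterior means of group effects
print(np.allclose(mu_blup, mu_hat), np.allclose(u_blup, u_hat))   # True True

In practice the variance components are not known: the Blup tradition plugs in REML or similar estimates, while a fully Bayesian hierarchical model puts a prior on them, and that is where the two literatures differ in emphasis rather than in the core computation.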


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('blup', 0.737), ('hierarchical', 0.289), ('unbiased', 0.218), ('linear', 0.186), ('prediction', 0.161), ('classical', 0.156), ('phi', 0.146), ('evaluated', 0.107), ('theta', 0.102), ('datasets', 0.099), ('computation', 0.095), ('modeling', 0.087), ('models', 0.084), ('ported', 0.084), ('predictions', 0.081), ('unknowns', 0.079), ('estimation', 0.079), ('conditional', 0.079), ('predictive', 0.076), ('best', 0.068), ('terminology', 0.063), ('unconditional', 0.063), ('method', 0.061), ('called', 0.06), ('footnote', 0.058), ('stands', 0.057), ('relating', 0.056), ('mainstream', 0.055), ('quantities', 0.054), ('relates', 0.054), ('purposes', 0.053), ('brought', 0.052), ('quantity', 0.052), ('bda', 0.052), ('thread', 0.052), ('generalized', 0.051), ('gaussian', 0.051), ('briefly', 0.049), ('relevance', 0.047), ('break', 0.047), ('explaining', 0.046), ('users', 0.045), ('computing', 0.045), ('machine', 0.045), ('big', 0.045), ('developed', 0.044), ('perform', 0.044), ('accurate', 0.043), ('framework', 0.043), ('somewhere', 0.043)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1270 andrew gelman stats-2012-04-19-Demystifying Blup

2 0.15798391 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

Introduction: In a link to our back-and-forth on causal inference and the use of hierarchical models to bridge between different inferential settings, Elias Bareinboim (a computer scientist who is working with Judea Pearl) writes : In the past week, I have been engaged in a discussion with Andrew Gelman and his blog readers regarding causal inference, selection bias, confounding, and generalizability. I was trying to understand how his method which he calls “hierarchical modeling” would handle these issues and what guarantees it provides. . . . If anyone understands how “hierarchical modeling” can solve a simple toy problem (e.g., M-bias, control of confounding, mediation, generalizability), please share with us. In his post, Bareinboim raises a direct question about hierarchical modeling and also indirectly brings up larger questions about what is convincing evidence when evaluating a statistical method. As I wrote earlier, Bareinboim believes that “The only way investigators can decide w

3 0.14347577 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

Introduction: Nick Firoozye writes: I had a question about BMA [Bayesian model averaging] and model combinations in general, and direct it to you since they are a basic form of hierarchical model, albeit in the simplest of forms. I wanted to ask what the underlying assumptions are that could lead to BMA improving on a larger model. I know model combination is a topic of interest in the (frequentist) econometrics community (e.g., Bates & Granger, http://www.jstor.org/discover/10.2307/3008764?uid=3738032&uid=2&uid=4&sid=21101948653381) but at the time it was considered a bit of a puzzle. Perhaps small models combined outperform a big model due to standard errors, insufficient data, etc. But I haven’t seen much in the way of Bayesian justification. In simplest terms, you might have a joint density P(Y,theta_1,theta_2) from which you could use the two marginals P(Y,theta_1) and P(Y,theta_2) to derive two separate forecasts. A BMA-er would do a weighted average of the two forecast densities, having p

(A small numerical sketch of this weighted-average idea appears at the end of this list.)

4 0.13639006 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

Introduction: A student writes: I am new to Bayesian methods. While I am reading your book, I have some questions for you. I am interested in doing Bayesian hierarchical (multi-level) linear regression (e.g., random-intercept model) and Bayesian structural equation modeling (SEM)—for causality. Do you happen to know if I could find some articles, where authors could provide data w/ R and/or BUGS codes that I could replicate them? My reply: For Bayesian hierarchical (multi-level) linear regression and causal inference, see my book with Jennifer Hill. For Bayesian structural equation modeling, try google and you’ll find some good stuff. Also, I recommend Stan (http://mc-stan.org/) rather than Bugs.

5 0.12874353 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou

6 0.11656293 1383 andrew gelman stats-2012-06-18-Hierarchical modeling as a framework for extrapolation

7 0.10562662 846 andrew gelman stats-2011-08-09-Default priors update?

8 0.095654309 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

9 0.094870158 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

10 0.094544142 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

11 0.09276282 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

12 0.092441924 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter

13 0.089740932 62 andrew gelman stats-2010-06-01-Two Postdoc Positions Available on Bayesian Hierarchical Modeling

14 0.089527339 1469 andrew gelman stats-2012-08-25-Ways of knowing

15 0.086089157 899 andrew gelman stats-2011-09-10-The statistical significance filter

16 0.084803037 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

17 0.08462131 2035 andrew gelman stats-2013-09-23-Scalable Stan

18 0.08302062 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics

19 0.08138974 961 andrew gelman stats-2011-10-16-The “Washington read” and the algebra of conditional distributions

20 0.077652067 1880 andrew gelman stats-2013-06-02-Flame bait
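The Bayesian-model-averaging excerpt above (entry 3, blog 1999) describes taking a weighted average of two forecast densities. Here is a minimal numerical sketch of that idea; the two toy models, the data, and the equal prior model probabilities are all invented for the example, not taken from that post.

# Illustrative sketch only: BMA as a posterior-probability-weighted mix of forecasts.
import numpy as np
from scipy.stats import norm

y = np.array([0.8, 1.1, 0.4, 0.9])     # observed data (made up)
means = [0.0, 1.0]                     # two candidate models: y ~ N(0,1) vs y ~ N(1,1)
prior = np.array([0.5, 0.5])           # prior model probabilities

marglik = np.array([norm(m, 1).pdf(y).prod() for m in means])   # marginal likelihoods
post = prior * marglik / np.sum(prior * marglik)                # posterior model probabilities

y_new = 0.5
forecasts = np.array([norm(m, 1).pdf(y_new) for m in means])
print(post, np.dot(post, forecasts))   # BMA forecast density at y_new

The alternative named in that post’s title, fitting a larger model, would instead embed both candidates in one continuous model and let the data move a shared parameter, rather than averaging over a discrete model indicator.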


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.113), (1, 0.123), (2, -0.016), (3, 0.025), (4, -0.002), (5, 0.023), (6, -0.023), (7, -0.02), (8, 0.014), (9, 0.024), (10, -0.016), (11, 0.001), (12, 0.001), (13, 0.001), (14, -0.02), (15, 0.008), (16, -0.038), (17, -0.001), (18, 0.022), (19, -0.029), (20, 0.015), (21, 0.004), (22, 0.03), (23, 0.026), (24, 0.02), (25, -0.017), (26, -0.029), (27, 0.029), (28, 0.006), (29, 0.012), (30, 0.009), (31, 0.028), (32, 0.003), (33, -0.032), (34, -0.018), (35, -0.001), (36, -0.014), (37, -0.01), (38, -0.037), (39, 0.022), (40, -0.015), (41, 0.024), (42, -0.025), (43, -0.008), (44, -0.037), (45, -0.041), (46, 0.026), (47, 0.014), (48, -0.02), (49, -0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97355658 1270 andrew gelman stats-2012-04-19-Demystifying Blup

2 0.6983602 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?

Introduction: Alex Hoffman points me to this interview by Dylan Matthews of education researcher Thomas Kane, who at one point says, Once you corrected for measurement error, a teacher’s score on their chosen videos and on their unchosen videos were correlated at 1. They were perfectly correlated. Hoffman asks, “What do you think? Do you think that just maybe, perhaps, it’s possible we ought to consider, I’m just throwing out the possibility that it might be that the procedure for correcting measurement error might, you know, be a little too strong?” I don’t know exactly what’s happening here, but it might be something that I’ve seen on occasion when fitting multilevel models using a point estimate for the group-level variance. It goes like this: measurement-error models are multilevel models, they involve the estimation of a distribution of a latent variable. When fitting multilevel models, it is possible to estimate the group-level variance to be zero, even though the group-level varia

(A small numerical sketch of this zero-variance phenomenon appears at the end of this list.)

3 0.69486374 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

Introduction: Ilya Lipkovich writes: I read with great interest your 2008 paper [with Aleks Jakulin, Grazia Pittau, and Yu-Sung Su] on weakly informative priors for logistic regression and also followed an interesting discussion on your blog. This discussion was within Bayesian community in relation to the validity of priors. However i would like to approach it rather from a more broad perspective on predictive modeling bringing in the ideas from machine/statistical learning approach”. Actually you were the first to bring it up by mentioning in your paper “borrowing ideas from computer science” on cross-validation when comparing predictive ability of your proposed priors with other choices. However, using cross-validation for comparing method performance is not the only or primary use of CV in machine-learning. Most of machine learning methods have some “meta” or complexity parameters and use cross-validation to tune them up. For example, one of your comparison methods is BBR which actually

4 0.69486332 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

Introduction: In a link to our back-and-forth on causal inference and the use of hierarchical models to bridge between different inferential settings, Elias Bareinboim (a computer scientist who is working with Judea Pearl) writes : In the past week, I have been engaged in a discussion with Andrew Gelman and his blog readers regarding causal inference, selection bias, confounding, and generalizability. I was trying to understand how his method which he calls “hierarchical modeling” would handle these issues and what guarantees it provides. . . . If anyone understands how “hierarchical modeling” can solve a simple toy problem (e.g., M-bias, control of confounding, mediation, generalizability), please share with us. In his post, Bareinboim raises a direct question about hierarchical modeling and also indirectly brings up larger questions about what is convincing evidence when evaluating a statistical method. As I wrote earlier, Bareinboim believes that “The only way investigators can decide w

5 0.69396186 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou

6 0.68304616 1165 andrew gelman stats-2012-02-13-Philosophy of Bayesian statistics: my reactions to Wasserman

7 0.68145502 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

8 0.67550886 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

9 0.67516297 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

10 0.66729885 1374 andrew gelman stats-2012-06-11-Convergence Monitoring for Non-Identifiable and Non-Parametric Models

11 0.66408342 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

12 0.66367972 246 andrew gelman stats-2010-08-31-Somewhat Bayesian multilevel modeling

13 0.65934962 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

14 0.6589942 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons

15 0.65770876 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences

16 0.65764046 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

17 0.65512323 2349 andrew gelman stats-2014-05-26-WAIC and cross-validation in Stan!

18 0.65021026 2033 andrew gelman stats-2013-09-23-More on Bayesian methods and multilevel modeling

19 0.64893991 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

20 0.64636719 2258 andrew gelman stats-2014-03-21-Random matrices in the news
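The “Correlation of 1” excerpt above (entry 2, blog 1737) describes how, when the group-level variance is summarized by a point estimate, that estimate can come out exactly zero, which forces complete shrinkage of the group effects. A minimal numerical sketch follows; the data are invented to make the boundary estimate obvious and are not from that post.

# Illustrative sketch only: the marginal MLE of the group-level sd can be exactly 0.
import numpy as np

ybar = np.array([0.10, -0.10, 0.05, -0.05])   # observed group means (made up)
se = 1.0                                      # known standard error of each group mean

def profile_loglik(tau):
    # marginal model: ybar_j ~ N(mu, tau^2 + se^2); with equal variances the
    # MLE of mu is just the simple mean, so it can be profiled out directly
    v = tau**2 + se**2
    mu = ybar.mean()
    return -0.5 * np.sum(np.log(2 * np.pi * v) + (ybar - mu)**2 / v)

taus = np.linspace(0, 2, 2001)
tau_hat = taus[np.argmax([profile_loglik(t) for t in taus])]
print(tau_hat)   # 0.0 here: the point estimate says the groups do not differ at all

With tau estimated as zero, every group effect is shrunk all the way to the common mean, which is one way an apparently perfect correlation between two sets of shrunken scores could arise.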


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.01), (16, 0.082), (20, 0.152), (21, 0.04), (24, 0.162), (28, 0.013), (84, 0.016), (86, 0.071), (99, 0.286)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97903323 479 andrew gelman stats-2010-12-20-WWJD? U can find out!

Introduction: Two positions open in the statistics group at the NYU education school. If you get the job, you get to work with Jennifer Hill! One position is a postdoctoral fellowship, and the other is a visiting professorship. The latter position requires “the demonstrated ability to develop a nationally recognized research program,” which seems like a lot to ask for a visiting professor. Do they expect the visiting prof to develop a nationally recognized research program and then leave it there at NYU after the visit is over? In any case, Jennifer and her colleagues are doing excellent work, both applied and methodological, and this seems like a great opportunity.

same-blog 2 0.95426184 1270 andrew gelman stats-2012-04-19-Demystifying Blup

3 0.93977249 1420 andrew gelman stats-2012-07-18-The treatment, the intermediate outcome, and the ultimate outcome: Leverage and the financial crisis

Introduction: Gur Huberman points to an article on the financial crisis by Bethany McLean, who writes: Although our understanding of what instigated the 2008 global financial crisis remains at best incomplete, there are a few widely agreed upon contributing factors. One of them is a 2004 rule change by the U.S. Securities and Exchange Commission that allowed investment banks to load up on leverage. This disastrous decision has been cited by a host of prominent economists, including Princeton professor and former Federal Reserve Vice-Chairman Alan Blinder and Nobel laureate Joseph Stiglitz. It has even been immortalized in Hollywood, figuring into the dark financial narrative that propelled the Academy Award-winning film Inside Job. . . . Here’s just one problem with this story line: It’s not true. Nor is it hard to prove that. Look at the historical leverage of the big five investment banks — Bear Stearns, Lehman Brothers, Merrill Lynch, Goldman Sachs and Morgan Stanley. The Government Accou

4 0.93584895 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe

Introduction: Arnaud Trolle (no relation) writes: I have a question about the interpretation of (non-)overlapping of 95% credibility intervals. In a Bayesian ANOVA (a within-subjects one), I computed 95% credibility intervals about the main effects of a factor. I’d like to compare two by two the main effects across the different conditions of the factor. Can I directly interpret the (non-)overlapping of these credibility intervals and make the following statements: “As the 95% credibility intervals do not overlap, both conditions have significantly different main effects” or conversely “As the 95% credibility intervals overlap, the main effects of both conditions are not significantly different, i.e. equivalent”? I heard that, in the case of classical confidence intervals, the second statement is false, but what happens when working within a Bayesian framework? My reply: I think it makes more sense to directly look at inference for the difference. Also, your statements about equivalence

(A small numerical sketch of comparing the difference directly appears at the end of this list.)

5 0.93146515 1287 andrew gelman stats-2012-04-28-Understanding simulations in terms of predictive inference?

Introduction: David Hogg writes: My (now deceased) collaborator and guru in all things inference, Sam Roweis, used to emphasize to me that we should evaluate models in the data space — not the parameter space — because models are always effectively “effective” and not really, fundamentally true. Or, in other words, models should be compared in the space of their predictions, not in the space of their parameters (the parameters didn’t really “exist” at all for Sam). In that spirit, when we estimate the effectiveness of a MCMC method or tuning — by autocorrelation time or ESJD or anything else — shouldn’t we be looking at the changes in the model predictions over time, rather than the changes in the parameters over time? That is, the autocorrelation time should be the autocorrelation time in what the model (at the walker position) predicts for the data, and the ESJD should be the expected squared jump distance in what the model predicts for the data? This might resolve the concern I expressed a

6 0.92766291 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one

7 0.92661703 480 andrew gelman stats-2010-12-21-Instead of “confidence interval,” let’s say “uncertainty interval”

8 0.92540634 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

9 0.9165554 1881 andrew gelman stats-2013-06-03-Boot

10 0.91454339 1913 andrew gelman stats-2013-06-24-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

11 0.91316867 870 andrew gelman stats-2011-08-25-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

12 0.91313136 2248 andrew gelman stats-2014-03-15-Problematic interpretations of confidence intervals

13 0.91310018 2365 andrew gelman stats-2014-06-09-I hate polynomials

14 0.91166931 900 andrew gelman stats-2011-09-11-Symptomatic innumeracy

15 0.91140598 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

16 0.9089216 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism

17 0.90870386 910 andrew gelman stats-2011-09-15-Google Refine

18 0.90690124 899 andrew gelman stats-2011-09-10-The statistical significance filter

19 0.90678453 391 andrew gelman stats-2010-11-03-Some thoughts on election forecasting

20 0.90669632 1400 andrew gelman stats-2012-06-29-Decline Effect in Linguistics?
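The credibility-interval excerpt above (entry 4, blog 1206) ends with the advice to look directly at inference for the difference. A minimal numerical sketch of why that matters follows; the draws are simulated stand-ins for posterior simulations from a fitted model, not output from any analysis in that post.

# Illustrative sketch only: overlapping marginal intervals do not imply that the
# difference is uncertain about its sign; summarize the difference directly.
import numpy as np

rng = np.random.default_rng(3)
effect_a = rng.normal(0.0, 1.0, 10000)   # posterior draws for condition A's effect
effect_b = rng.normal(3.0, 1.0, 10000)   # posterior draws for condition B's effect

print(np.percentile(effect_a, [2.5, 97.5]))             # roughly [-2, 2]
print(np.percentile(effect_b, [2.5, 97.5]))             # roughly [1, 5]: overlaps A's interval
print(np.percentile(effect_b - effect_a, [2.5, 97.5]))  # roughly [0.2, 5.8]: excludes 0

Here the two 95% intervals overlap, yet the 95% interval for the difference stays clearly away from zero, so interval overlap alone would have given the wrong impression of equivalence.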