andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1406 knowledge-graph by maker-knowledge-mining

1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics


meta info for this blog

Source: html

Introduction: In an article catchily entitled, “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?”, Meng and Xie write: Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model. Using the example of estimating the autocorrelation ρ under a stationary AR(1) model, we also demonstrate the interaction between model assumptions and observation structures in seeking additional information, as the sampling interval s increases. Furthermore, for a given sample size, the optimal s for minimizing the asymptotic variance of ρ̂_MLE is s = 1 if and only if ρ² ≤ 1/3; beyond that region the optimal s increases at the rate of log⁻¹(ρ⁻²) as ρ approaches a unit root, as does the gain in efficiency relative to using s = 1. A practical implication of this result is that the so-called “non-informative” Jeffreys prior can be far from non-informative even for stationary time series models, because here it converges rapidly to a point mass at a unit root as s increases. Our overall emphasis is that intuition and conventional wisdom need to be examined via critical thinking and theoretical verification before they can be trusted fully.
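The headline phenomenon in the abstract (more data can make an estimator worse when the estimator is not probabilistically principled) is easy to reproduce in a toy setting. The Python sketch below is a minimal illustration of my own, not an example from Meng and Xie's paper, and all names in it (theta, sigma2, mse, w) are mine: it pools an original measurement X ~ N(θ, 1) with an extra, noisier measurement Y ~ N(θ, σ²) by an unweighted average, which solves a perfectly valid unbiased estimating equation, and compares that with the precision-weighted MLE. For σ² > 3 the unweighted estimator that "uses more data" has a larger mean squared error than using X alone, while the MLE can only improve.

# Toy illustration (not the paper's example) of "more data, worse estimator".
# X ~ N(theta, 1) is the original datum; Y ~ N(theta, sigma2) is the extra datum.
# The unweighted average (X + Y)/2 solves a valid unbiased estimating equation,
# yet for sigma2 > 3 it has a larger mean squared error than using X alone.
# The precision-weighted MLE always improves when Y is added.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma2, n_sims = 0.0, 10.0, 200_000

x = rng.normal(theta, 1.0, n_sims)
y = rng.normal(theta, np.sqrt(sigma2), n_sims)

def mse(est):
    return float(np.mean((est - theta) ** 2))

w = 1.0 / (1.0 + 1.0 / sigma2)  # precision weight on x: (1/1) / (1/1 + 1/sigma2)

print("X alone:            ", mse(x))                      # about 1.00
print("unweighted (X+Y)/2: ", mse((x + y) / 2.0))          # about (1 + sigma2)/4 = 2.75
print("MLE, weighted:      ", mse(w * x + (1.0 - w) * y))  # about sigma2/(1 + sigma2) = 0.91

The weighted estimator is what following the model buys you; the unweighted one is a perfectly consistent estimating-equation recipe that the extra data actively hurt.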


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In an article catchily entitled, “I got more data, my model is more refined, but my estimator is getting worse! [sent-1, score-0.189]

2 Meng and Xie write: Possibly, but more likely you are merely a victim of conventional wisdom. [sent-3, score-0.206]

3 These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). [sent-8, score-0.234]

4 Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model. [sent-9, score-0.242]

5 Using the example of estimating the autocorrelation ρ under a stationary AR(1) model, we also demonstrate the interaction between model assumptions and observation structures in seeking additional information, as the sampling interval s increases. [sent-10, score-0.524]

6 Furthermore, for a given sample size, the optimal s for minimizing the asymptotic variance of ρ̂_MLE is s = 1 if and only if ρ² ≤ 1/3. [sent-11, score-0.181]

7 Beyond that region the optimal s increases at the rate of log⁻¹(ρ⁻²) as ρ approaches a unit root, as does the gain in efficiency relative to using s = 1. [sent-13, score-0.218] (a numerical check of this threshold follows the summary list)

8 A practical implication of this result is that the so-called “non-informative” Jeffreys prior can be far from non-informative even for stationary time series models, because here it converges rapidly to a point mass at a unit root as s increases. [sent-14, score-0.645]

9 Our overall emphasis is that intuition and conventional wisdom need to be examined via critical thinking and theoretical verification before they can be trusted fully. [sent-15, score-0.92]

10 I’m very sympathetic to the argument that we have to be careful when imputing general statistical properties of a method based on past successes. [sent-16, score-0.167]

11 I’m reminded of my (friendly) disputes with Adrian Raftery on Bayesian model selection. [sent-17, score-0.189]

12 As Don Rubin and I wrote, “Raftery implies that the model with higher BIC will be expected to yield better out-of-sample predictions than any other model being compared. [sent-18, score-0.477]

13 This implication is not generally true; there is no general result, either applied or theoretical, that implies this.” [sent-19, score-0.21]

14 But, as Meng and Xie say, “intuition and conventional wisdom need to be examined via critical thinking and theoretical verification.” [sent-21, score-0.738]

15 In addition, the world of time series is full of models that don’t make sense but are considered to be standard and acceptable. [sent-23, score-0.205]

16 They distrust the idea of statistical data-based model-building (instead of what they prefer, which is a priori model building based on economic theory, or else fully nonparametric non-theoretically-based models). [sent-25, score-0.447]

17 The statistical tradition of building a model using data with some theoretical support is not so popular in econometrics, as they worry that data-based modeling will violate statistical principles. [sent-26, score-0.621]

18 Similarly, they like AR or ARMA models with automatically-chosen lags because such models are objective and require no human input. [sent-29, score-0.34]

19 Here’s an example of a simple theoretically-based Bayesian model outperforming a default AR model. [sent-30, score-0.189]

20 My impression was that they felt that the AR model was the game to be played and that it was cheating for a model to be built based on the structure of the problem. [sent-32, score-0.499]
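Sentences 5 through 7 above state the paper's optimal-sampling-interval result for the stationary AR(1) model. Here is a minimal numerical check; it is a back-of-envelope sketch rather than the paper's derivation, and it rests on two standard facts plus the delta method: (i) a Gaussian AR(1) series with coefficient ρ observed every s steps is again AR(1), with coefficient φ = ρ^s; (ii) the MLE of φ from n observations has asymptotic variance (1 − φ²)/n. Since dφ/dρ = s·ρ^(s−1), the delta method gives avar(ρ̂) ≈ (1 − ρ^(2s)) / (n·s²·ρ^(2(s−1))), and minimizing this over s numerically reproduces the ρ² ≤ 1/3 threshold. The function name avar_rho_hat is mine, for illustration.

# Numerical check of the optimal sampling interval s for estimating rho
# in a stationary AR(1) model, under the delta-method approximation
#   avar(rho_hat) ~ (1 - rho**(2s)) / (n * s**2 * rho**(2(s-1))).
# A sketch of the claim's flavor, not the paper's own derivation.

def avar_rho_hat(rho, s, n=1):
    # phi = rho**s is the AR(1) coefficient of the series subsampled at interval s;
    # avar(phi_hat) = (1 - phi**2)/n, and d(phi)/d(rho) = s * rho**(s-1).
    phi = rho ** s
    return (1.0 - phi ** 2) / (n * (s * rho ** (s - 1)) ** 2)

for rho in [0.3, 0.5, 0.577, 0.6, 0.8, 0.95, 0.99]:
    s_opt = min(range(1, 201), key=lambda s: avar_rho_hat(rho, s))
    print(f"rho = {rho:5.3f}   rho^2 = {rho ** 2:5.3f}   optimal s = {s_opt}")

# The printed optimal s is 1 exactly when rho^2 <= 1/3 (rho up to about 0.577)
# and grows roughly like 1/log(rho^(-2)) as rho approaches the unit root.

Consistent with sentence 8, this also suggests informally why the Jeffreys prior degenerates here: as s grows, any prior that is roughly flat in φ = ρ^s piles its mass up near |ρ| = 1.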


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ar', 0.314), ('raftery', 0.215), ('model', 0.189), ('meng', 0.184), ('xie', 0.174), ('theoretical', 0.147), ('stationary', 0.143), ('conventional', 0.138), ('models', 0.131), ('root', 0.128), ('cheating', 0.121), ('estimating', 0.119), ('examined', 0.117), ('wisdom', 0.115), ('via', 0.115), ('optimal', 0.111), ('implication', 0.111), ('unit', 0.107), ('critical', 0.106), ('statistical', 0.1), ('intuition', 0.1), ('implies', 0.099), ('arma', 0.087), ('building', 0.085), ('adopting', 0.082), ('refined', 0.082), ('verification', 0.082), ('arch', 0.082), ('searches', 0.082), ('converges', 0.082), ('guiding', 0.082), ('probabilistically', 0.078), ('ecologists', 0.078), ('lags', 0.078), ('cavan', 0.078), ('pitfalls', 0.078), ('mle', 0.078), ('jeffreys', 0.078), ('principled', 0.076), ('bic', 0.076), ('series', 0.074), ('distrust', 0.073), ('autocorrelation', 0.073), ('vulnerable', 0.073), ('minimizing', 0.07), ('adrian', 0.068), ('victim', 0.068), ('bayesian', 0.068), ('transformations', 0.067), ('imputing', 0.067)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000004 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

Introduction: In an article catchily entitled, “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?”, Meng and Xie write: Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model. Using the example of estimating the autocorrelation ρ under a stationary …

2 0.17006879 1392 andrew gelman stats-2012-06-26-Occam

Introduction: Cosma Shalizi and Larry Wasserman discuss some papers from a conference on Ockham’s Razor. I don’t have anything new to add on this so let me link to past blog entries on the topic and repost the following from 2004: A lot has been written in statistics about “parsimony”—that is, the desire to explain phenomena using fewer parameters—but I’ve never seen any good general justification for parsimony. (I don’t count “Occam’s Razor,” or “Ockham’s Razor,” or whatever, as a justification. You gotta do better than digging up a 700-year-old quote.) Maybe it’s because I work in social science, but my feeling is: if you can approximate reality with just a few parameters, fine. If you can use more parameters to fold in more information, that’s even better. In practice, I often use simple models—because they are less effort to fit and, especially, to understand. But I don’t kid myself that they’re better than more complicated efforts! My favorite quote on this comes from Rad…

3 0.15570645 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

Introduction: In response to my remarks on his online book, Think Bayes, Allen Downey wrote: I [Downey] have a question about one of your comments: My [Gelman's] main criticism with both books is that they talk a lot about inference but not so much about model building or model checking (recall the three steps of Bayesian data analysis). I think it’s ok for an introductory book to focus on inference, which of course is central to the data-analytic process—but I’d like them to at least mention that Bayesian ideas arise in model building and model checking as well. This sounds like something I agree with, and one of the things I tried to do in the book is to put modeling decisions front and center. But the word “modeling” is used in lots of ways, so I want to see if we are talking about the same thing. For example, in many chapters, I start with a simple model of the scenario, do some analysis, then check whether the model is good enough, and iterate. Here’s the discussion of modeling…

4 0.14981504 1469 andrew gelman stats-2012-08-25-Ways of knowing

Introduction: In this discussion from last month, computer science student and Judea Pearl collaborator Elias Barenboim expressed an attitude that hierarchical Bayesian methods might be fine in practice but that they lack theory, that Bayesians can’t succeed in toy problems. I posted a P.S. there which might not have been noticed so I will put it here: I now realize that there is some disagreement about what constitutes a “guarantee.” In one of his comments, Barenboim writes, “the assurance we have that the result must hold as long as the assumptions in the model are correct should be regarded as a guarantee.” In that sense, yes, we have guarantees! It is fundamental to Bayesian inference that the result must hold if the assumptions in the model are correct. We have lots of that in Bayesian Data Analysis (particularly in the first four chapters but implicitly elsewhere as well), and this is also covered in the classic books by Lindley, Jaynes, and others. This sort of guarantee is indeed p…

5 0.14776357 1141 andrew gelman stats-2012-01-28-Using predator-prey models on the Canadian lynx series

Introduction: The “Canadian lynx data” is one of the famous examples used in time series analysis. And the usual models that are fit to these data in the statistics time-series literature don’t work well. Cavan Reilly and Angelique Zeringue write: Reilly and Zeringue then present their analysis. Their simple little predator-prey model with a weakly informative prior way outperforms the standard big-ass autoregression models. Check this out: Or, to put it into numbers, when they fit their model to the first 80 years and predict to the next 34, their root mean square out-of-sample error is 1480 (see scale of data above). In contrast, the standard model fit to these data (the SETAR model of Tong, 1990) has more than twice as many parameters but gets a worse-performing root mean square error of 1600, even when that model is fit to the entire dataset. (If you fit the SETAR or any similar autoregressive model to the first 80 years and use it to predict the next 34, the predictions…

6 0.14145286 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

7 0.1371631 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

8 0.13433139 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

9 0.13179442 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

10 0.1299998 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?

11 0.12925375 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

12 0.1266174 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

13 0.126279 1188 andrew gelman stats-2012-02-28-Reference on longitudinal models?

14 0.12570213 1287 andrew gelman stats-2012-04-28-Understanding simulations in terms of predictive inference?

15 0.12518893 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences

16 0.12284923 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

17 0.11936094 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

18 0.11890192 1431 andrew gelman stats-2012-07-27-Overfitting

19 0.11724537 1817 andrew gelman stats-2013-04-21-More on Bayesian model selection in high-dimensional settings

20 0.11672378 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.221), (1, 0.185), (2, 0.002), (3, 0.028), (4, -0.016), (5, 0.015), (6, -0.051), (7, -0.018), (8, 0.082), (9, 0.046), (10, 0.004), (11, 0.01), (12, -0.056), (13, 0.008), (14, -0.055), (15, -0.021), (16, 0.034), (17, -0.011), (18, -0.009), (19, -0.022), (20, 0.005), (21, -0.037), (22, -0.001), (23, 0.007), (24, 0.015), (25, 0.011), (26, -0.011), (27, 0.001), (28, 0.013), (29, -0.021), (30, -0.007), (31, 0.004), (32, 0.012), (33, 0.005), (34, 0.018), (35, 0.01), (36, -0.025), (37, -0.007), (38, 0.007), (39, 0.006), (40, -0.006), (41, 0.03), (42, -0.032), (43, 0.033), (44, 0.023), (45, -0.014), (46, -0.004), (47, -0.043), (48, 0.007), (49, -0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97890097 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

Introduction: In an article catchily entitled, “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?”, Meng and Xie write: Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model. Using the example of estimating the autocorrelation ρ under a stationary …

2 0.89810723 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

Introduction: David Duvenaud writes: I’ve been following your recent discussions about how an AI could do statistics [see also here]. I was especially excited about your suggestion for new statistical methods using “a language-like approach to recursively creating new models from a specified list of distributions and transformations, and an automatic approach to checking model fit.” Your discussion of these ideas was exciting to me and my colleagues because we recently did some work taking a step in this direction, automatically searching through a grammar over Gaussian process regression models. Roger Grosse previously did the same thing, but over matrix decomposition models using held-out predictive likelihood to check model fit. These are both examples of automatic Bayesian model-building by a search over more and more complex models, as you suggested. One nice thing is that both grammars include lots of standard models for free, and they seem to work pretty well, although the…

3 0.89524168 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency

Introduction: Yaming Yu and Xiao-Li Meng write in with a cool new idea for improving the efficiency of Gibbs and Metropolis in multilevel models: For a broad class of multilevel models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to the questions of when to use which and how to compromise between them via partial CP/NCP. This article introduces an alternative strategy for boosting MCMC efficiency via simply interweaving—but not alternating—the two parameterizations. This strategy has the surprising property that failure of both the CP and NCP chains to converge geometrically does not prevent the interweaving algorithm from doing so. It achieves this seemingly magical property by taking advantage of the discordance of the two parameterizations, namely, the sufficiency of CP and the ancillarity of NCP, to substantially reduce the Markovian…

4 0.87813175 320 andrew gelman stats-2010-10-05-Does posterior predictive model checking fit with the operational subjective approach?

Introduction: David Rohde writes: I have been thinking a lot lately about your Bayesian model checking approach. This is in part because I have been working on exploratory data analysis and, wishing to avoid controversy and mathematical statistics, we omitted model checking from our discussion. This is something that the refereeing process picked us up on and we ultimately added a critical discussion of null-hypothesis testing to our paper. The exploratory technique we discussed was essentially a 2D histogram approach, but we used Polya models as a formal model for the histogram. We are currently working on a new paper, and we are thinking through how or if we should do “confirmatory analysis” or model checking in the paper. What I find most admirable about your statistical work is that you clearly use the Bayesian approach to do useful applied statistical analysis. My own attempts at applied Bayesian analysis make me greatly admire your applied successes. On the other hand it may be t…

5 0.87689507 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

Introduction: In my comments on David MacKay’s 2003 book on Bayesian inference, I wrote that I hate all the Occam-factor stuff that MacKay talks about, and I linked to this quote from Radford Neal: Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well. MacKay replied as follows: When you said you disagree with me on Occam factors I think what you meant was that you agree with me on them. I’ve read your post on the topic and completely agreed with you (and Radford) that we should be using models the size of a house, models that we believe in, and that anyone who thinks it is a good idea to bias the model toward…

6 0.87529325 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?

7 0.87368333 1392 andrew gelman stats-2012-06-26-Occam

8 0.86997235 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

9 0.86978608 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

10 0.8650623 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

11 0.86362666 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

12 0.86192745 1431 andrew gelman stats-2012-07-27-Overfitting

13 0.8590169 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

14 0.85751569 1459 andrew gelman stats-2012-08-15-How I think about mixture models

15 0.84821564 496 andrew gelman stats-2011-01-01-Tukey’s philosophy

16 0.84006417 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

17 0.83698887 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis

18 0.83651835 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

19 0.83639115 1374 andrew gelman stats-2012-06-11-Convergence Monitoring for Non-Identifiable and Non-Parametric Models

20 0.83141232 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.023), (16, 0.06), (21, 0.042), (24, 0.136), (30, 0.012), (45, 0.017), (47, 0.017), (53, 0.01), (55, 0.2), (86, 0.064), (99, 0.27)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96328986 1896 andrew gelman stats-2013-06-13-Against the myth of the heroic visualization

Introduction: Alberto Cairo tells a fascinating story about John Snow, H. W. Acland, and the Mythmaking Problem: Every human community—nations, ethnic and cultural groups, professional guilds—inevitably raises a few of its members to the status of heroes and weaves myths around them. . . . The visual display of information is no stranger to heroes and myth. In fact, being a set of disciplines with a relatively small amount of practitioners and researchers, it has generated a staggering number of heroes, perhaps as a morale-enhancing mechanism. Most of us have heard of the wonders of William Playfair’s Commercial and Political Atlas, Florence Nightingale’s coxcomb charts, Charles Joseph Minard’s Napoleon’s march diagram, and Henry Beck’s 1933 redesign of the London Underground map. . . . Cairo’s goal, I think, is not to disparage these great pioneers of graphics but rather to put their work in perspective, recognizing the work of their excellent contemporaries. I would like to echo Cairo’…

2 0.95895326 1463 andrew gelman stats-2012-08-19-It is difficult to convey intonation in typed speech

Introduction: I just wanted to add the above comment to Bob’s notes on language. Spoken (and, to some extent, handwritten) language can be much more expressive than the typed version. I’m not just talking about slang or words such as baaaaad; I’m also talking about pauses that give logical structure to a sentence. For example, sentences such as “The girl who hit the ball where the dog used to be was the one who was climbing the tree when the dog came by” are effortless to understand in speech but can be difficult for a reader to follow. Often when I write, I need to untangle my sentences to keep them readable.

3 0.95293725 1299 andrew gelman stats-2012-05-04-Models, assumptions, and data summaries

Introduction: I saw an analysis recently that I didn’t like. I won’t go into the details, but basically it was a dose-response inference, where a continuous exposure was binned into three broad categories (terciles of the data) and the probability of an adverse event was computed for each tercile. The effect and the sample size were large enough that the terciles were statistically-significantly different from each other in probability of adverse event, with the probabilities increasing from low to mid to high exposure, as one would predict. I didn’t like this analysis because it is equivalent to fitting a step function. There is a tendency for people to interpret the (arbitrary) tercile boundaries as being meaningful thresholds even though the underlying dose-response relation has to be continuous. I’d prefer to start with a linear model and then add nonlinearity from there with a spline or whatever. At this point I stepped back and thought: Hey, the divide-into-three analysis does not lite…

4 0.94971544 1617 andrew gelman stats-2012-12-11-Math Talks :: Action Movies

Introduction: Jonathan Goodman gave the departmental seminar yesterday (10 Dec 2012) and I was amused by an extended analogy he made. After his (very clear) intro, he said that math talks were like action movies. The overall theorem and its applications provide the plot, and the proofs provide the action scenes.

5 0.94415581 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

Introduction: Cyrus writes: I [Cyrus] was teaching a class on multilevel modeling, and we were playing around with different methods to fit a random effects logit model with 2 random intercepts—one corresponding to “family” and another corresponding to “community” (labeled “mom” and “cluster” in the data, respectively). There are also a few regressors at the individual, family, and community level. We were replicating in part some of the results from the following paper: Improved estimation procedures for multilevel models with binary response: a case-study, by G Rodriguez, N Goldman. (I say “replicating in part” because we didn’t include all the regressors that they use, only a subset.) We were looking at the performance of estimation via glmer in R’s lme4 package, glmmPQL in R’s MASS package, and Stata’s xtmelogit. We wanted to study the performance of various estimation methods, including adaptive quadrature methods and penalized quasi-likelihood. I was shocked to discover that glmer…

6 0.94006526 333 andrew gelman stats-2010-10-10-Psychiatric drugs and the reduction in crime

7 0.93310738 168 andrew gelman stats-2010-07-28-Colorless green, and clueless

8 0.93257695 688 andrew gelman stats-2011-04-30-Why it’s so relaxing to think about social issues

9 0.92900681 2019 andrew gelman stats-2013-09-12-Recently in the sister blog

10 0.92778152 620 andrew gelman stats-2011-03-19-Online James?

same-blog 11 0.92473543 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

12 0.91786045 50 andrew gelman stats-2010-05-25-Looking for Sister Right

13 0.91684496 874 andrew gelman stats-2011-08-27-What’s “the definition of a professional career”?

14 0.91476041 997 andrew gelman stats-2011-11-07-My contribution to the discussion on “Should voting be mandatory?”

15 0.89557099 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

16 0.89307547 201 andrew gelman stats-2010-08-12-Are all rich people now liberals?

17 0.89175963 1520 andrew gelman stats-2012-10-03-Advice that’s so eminently sensible but so difficult to follow

18 0.88943017 13 andrew gelman stats-2010-04-30-Things I learned from the Mickey Kaus for Senate campaign

19 0.87570828 1243 andrew gelman stats-2012-04-03-Don’t do the King’s Gambit

20 0.87174505 2035 andrew gelman stats-2013-09-23-Scalable Stan