andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-554 knowledge-graph by maker-knowledge-mining

554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath

meta info for this blog

Source: html

Introduction: Yesterday Aleks posted a proposal for a model makers’ Hippocratic Oath. I’d like to add two more items: 1. From Mark Palko: “Our model only describes the data we used to build it; if you go outside of that range, you do so at your own risk.” 2. In case you like to think of your methods as nonparametric or non-model-based: “Our method, just like any model, relies on assumptions which we have the duty to state and to check.” (Observant readers will see that I use “we” rather than “I” in these two items. Modeling is an inherently collaborative endeavor.)

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Yesterday Aleks posted a proposal for a model makers’ Hippocratic Oath. [sent-1, score-0.528]

2 From Mark Palko : “Our model only describes the data we used to build it; if you go outside of that range, you do so at your own risk. [sent-3, score-0.886]

3 In case you like to think of your methods as nonparametric or non-model-based: “Our method, just like any model, relies on assumptions which we have the duty to state and to check. [sent-5, score-1.319]

4 ” (Observant readers will see that I use “we” rather than “I” in these two items. [sent-6, score-0.408]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('hippocratic', 0.323), ('makers', 0.266), ('endeavor', 0.266), ('collaborative', 0.25), ('relies', 0.231), ('duty', 0.225), ('inherently', 0.214), ('nonparametric', 0.206), ('aleks', 0.198), ('proposal', 0.196), ('model', 0.191), ('palko', 0.184), ('build', 0.18), ('items', 0.175), ('describes', 0.169), ('yesterday', 0.157), ('range', 0.151), ('outside', 0.151), ('assumptions', 0.141), ('posted', 0.141), ('mark', 0.138), ('add', 0.125), ('two', 0.121), ('method', 0.118), ('readers', 0.116), ('modeling', 0.112), ('state', 0.108), ('like', 0.105), ('methods', 0.094), ('used', 0.078), ('go', 0.073), ('case', 0.069), ('rather', 0.066), ('use', 0.062), ('data', 0.044), ('see', 0.043), ('think', 0.035)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath

2 0.16937192 552 andrew gelman stats-2011-02-03-Model Makers’ Hippocratic Oath

Introduction: Emanuel Derman and Paul Wilmott wonder how to get their fellow modelers to give up their fantasy of perfection. In a Business Week article they proposed, not entirely in jest, a model makers’ Hippocratic Oath: I will remember that I didn’t make the world and that it doesn’t satisfy my equations. Though I will use models boldly to estimate value, I will not be overly impressed by mathematics. I will never sacrifice reality for elegance without explaining why I have done so. Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights. I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension. Found via Abductive Intelligence.

3 0.16444728 227 andrew gelman stats-2010-08-23-Visualization magazine

Introduction: Aleks pointed me to this.

4 0.16410065 1318 andrew gelman stats-2012-05-13-Stolen jokes

Introduction: Fun stories here (from Kliph Nesteroff, link from Mark Palko).

5 0.15873945 1063 andrew gelman stats-2011-12-16-Suspicious histogram bars

Introduction: Aleks sent me this (I’m not sure from where):

6 0.14993902 602 andrew gelman stats-2011-03-06-Assumptions vs. conditions

7 0.13158354 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

8 0.12176157 281 andrew gelman stats-2010-09-16-NSF crowdsourcing

9 0.11602065 533 andrew gelman stats-2011-01-23-The scalarization of America

10 0.1154122 1302 andrew gelman stats-2012-05-06-Fun with google autocomplete

11 0.10984483 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

12 0.10866502 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable

13 0.10704986 441 andrew gelman stats-2010-12-01-Mapmaking software

14 0.099684224 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

15 0.098045096 99 andrew gelman stats-2010-06-19-Paired comparisons

16 0.094385393 858 andrew gelman stats-2011-08-17-Jumping off the edge of the world

17 0.08853671 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

18 0.087760225 1723 andrew gelman stats-2013-02-15-Wacky priors can work well?

19 0.085594609 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

20 0.085424662 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses
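
The page does not say how the simValue scores above were computed. A common approach, assumed here, is cosine similarity between tf-idf word vectors. The sketch below is only an illustration of that idea: the function names, the smoothed idf formula, and the toy corpus are all hypothetical and are not taken from the site.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document (term frequency * smoothed idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed idf (an assumption): log((1 + N) / (1 + df)) + 1.
            word: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[word])) + 1)
            for word, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus standing in for blog posts.
docs = [
    "model makers hippocratic oath",
    "model makers oath assumptions",
    "suspicious histogram bars",
]
vecs = tfidf_vectors(docs)
# Similarity of the first post to every post, itself included.
sims = [cosine(vecs[0], v) for v in vecs]
```

Under these assumptions a post compared with itself scores 1 (the "same-blog" row), posts sharing several words score somewhere between 0 and 1, and posts with no words in common score 0.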

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.126), (1, 0.085), (2, 0.011), (3, 0.043), (4, 0.018), (5, 0.027), (6, -0.026), (7, 0.007), (8, 0.069), (9, 0.02), (10, 0.039), (11, 0.008), (12, -0.024), (13, 0.008), (14, -0.097), (15, 0.065), (16, 0.028), (17, 0.044), (18, -0.129), (19, -0.109), (20, -0.026), (21, -0.081), (22, 0.003), (23, -0.003), (24, -0.02), (25, -0.01), (26, -0.017), (27, -0.002), (28, -0.011), (29, 0.03), (30, -0.011), (31, -0.022), (32, 0.019), (33, 0.054), (34, 0.01), (35, 0.042), (36, -0.022), (37, 0.022), (38, 0.013), (39, 0.018), (40, -0.017), (41, 0.051), (42, 0.008), (43, -0.001), (44, 0.003), (45, 0.003), (46, -0.049), (47, -0.034), (48, -0.044), (49, 0.027)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.90376222 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath

2 0.66824371 441 andrew gelman stats-2010-12-01-Mapmaking software

Introduction: I can’t use this on my PC, but the link comes from Aleks, so maybe it’s something good!

3 0.64400476 1302 andrew gelman stats-2012-05-06-Fun with google autocomplete

Introduction: Aleks points us to this idea of labeling for news.

4 0.63921684 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency

Introduction: Yaming Yu and Xiao-Li Meng write in with a cool new idea for improving the efficiency of Gibbs and Metropolis in multilevel models: For a broad class of multilevel models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to the questions of when to use which and how to compromise between them via partial CP/NCP. This article introduces an alternative strategy for boosting MCMC efficiency via simply interweaving—but not alternating—the two parameterizations. This strategy has the surprising property that failure of both the CP and NCP chains to converge geometrically does not prevent the interweaving algorithm from doing so. It achieves this seemingly magical property by taking advantage of the discordance of the two parameterizations, namely, the sufficiency of CP and the ancillarity of NCP, to substantially reduce the Markovian

5 0.63526618 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

Introduction: In an article catchily entitled, “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?”, Meng and Xie write: Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model. Using the example of estimating the autocorrelation ρ under a statio

6 0.63020372 1064 andrew gelman stats-2011-12-16-The benefit of the continuous color scale

7 0.62963867 1063 andrew gelman stats-2011-12-16-Suspicious histogram bars

8 0.62332612 909 andrew gelman stats-2011-09-15-7 steps to successful infographics

9 0.61867225 281 andrew gelman stats-2010-09-16-NSF crowdsourcing

10 0.61633629 496 andrew gelman stats-2011-01-01-Tukey’s philosophy

11 0.61560923 1004 andrew gelman stats-2011-11-11-Kaiser Fung on how not to critique models

12 0.61421382 227 andrew gelman stats-2010-08-23-Visualization magazine

13 0.61347395 1958 andrew gelman stats-2013-07-27-Teaching is hard

14 0.60645527 1141 andrew gelman stats-2012-01-28-Using predator-prey models on the Canadian lynx series

15 0.60115993 2133 andrew gelman stats-2013-12-13-Flexibility is good

16 0.60092545 1392 andrew gelman stats-2012-06-26-Occam

17 0.60033357 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

18 0.59576505 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

19 0.59380591 448 andrew gelman stats-2010-12-03-This is a footnote in one of my papers

20 0.59360641 328 andrew gelman stats-2010-10-08-Displaying a fitted multilevel model

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.064), (16, 0.081), (53, 0.066), (81, 0.042), (83, 0.193), (85, 0.03), (86, 0.029), (99, 0.353)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.90337545 926 andrew gelman stats-2011-09-26-NYC

Introduction: Our downstairs neighbor hates us. She looks away from us when we see them on the street, if we’re coming into the building at the same time she doesn’t hold open the door, and if we’re in the elevator when it stops on her floor, she refuses to get on. On the other hand, if you’re a sociology professor in Chicago, one of your colleagues might try to run you over in a parking lot. So I guess I’m getting off easy.

2 0.90155917 1312 andrew gelman stats-2012-05-11-Are our referencing errors undermining our scholarship and credibility? The case of expatriate failure rates

Introduction: Thomas Basbøll points to this ten-year-old article from Anne-Wil Harzing on the consequences of sloppy citations. Harzing tells the story of an unsupported claim that is contradicted by published data but has been presented as fact in a particular area of the academic literature. She writes that “high expatriate failure rates [with "expatriate failure" defined as "the expatriate returning home before his/her contractual period of employment abroad expires"] were in fact a myth created by massive misquotations and careless copying of references.” Many papers claimed an expatriate failure rate of 25-40% (according to Harzing, this is much higher than the actual rate as estimated from empirical data), with this overly-high rate supported by a complicated link of references leading to . . . no real data. Harzing reports the following published claims: Harvey (1996: 103): `The rate of failure of expatriate managers relocating overseas from United States based MNCs has been estima

3 0.89523339 1307 andrew gelman stats-2012-05-07-The hare, the pineapple, and Ed Wegman

Introduction: Commenters here are occasionally bothered that I spend so much time attacking frauds and plagiarists. See, for example, here and here. Why go on and on about these losers, given that there are more important problems in the world such as war, pestilence, hunger, and graphs where the y-axis doesn’t go all the way down to zero? Part of the story is that I do research for a living so I resent people who devalue research through misattribution or fraud, in the same way that rich people don’t like counterfeiters. What really bugs me, though, is when cheaters get caught and still don’t admit it. People like Hauser, Wegman, Fischer, and Weick get under my skin because they have the chutzpah to just deny deny deny. The grainy time-stamped videotape with their hand in the cookie jar is right there, and they’ll still talk around the problem. Makes me want to scream. This happens all the time. All. Over. The. Place. Everybody makes mistakes, and just about everybody does thing

same-blog 4 0.89419079 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath

5 0.8928681 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c

Introduction: Brandon Behlendorf writes: I [Behlendorf] am replicating some previous research using OLS [he's talking about what we call "linear regression"---ed.] to regress a logged rate (to reduce skew) of Y on a number of predictors (Xs). Y is the count of a phenomena divided by the population of the unit of the analysis. The problem that I am encountering is that Y is composite count of a number of distinct phenomena [A+B+C], and these phenomena are not uniformly distributed across the sample. Most of the research in this area has conducted regressions either with Y or with individual phenomena [A or B or C] as the dependent variable. Yet it seems that if [A, B, C] are not uniformly distributed across the sample of units in the same proportion, then the use of Y would be biased, since as a count of [A+B+C] divided by the population, it would treat as equivalent units both [2+0.5+1.5] and [4+0+0]. My goal is trying to find a methodology which allows a researcher to regress Y on a

6 0.89044738 1456 andrew gelman stats-2012-08-13-Macro, micro, and conflicts of interest

7 0.8782354 649 andrew gelman stats-2011-04-05-Internal and external forecasting

8 0.86815822 645 andrew gelman stats-2011-04-04-Do you have any idea what you’re talking about?

9 0.86474055 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys

10 0.86166453 282 andrew gelman stats-2010-09-17-I can’t escape it

11 0.85799378 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

12 0.85770494 1389 andrew gelman stats-2012-06-23-Larry Wasserman’s statistics blog

13 0.85739195 1704 andrew gelman stats-2013-02-03-Heuristics for identifying ecological fallacies?

14 0.8560037 1861 andrew gelman stats-2013-05-17-Where do theories come from?

15 0.85331863 1681 andrew gelman stats-2013-01-19-Participate in a short survey about the weight of evidence provided by statistics

16 0.85323811 108 andrew gelman stats-2010-06-24-Sometimes the raw numbers are better than a percentage

17 0.85293186 2329 andrew gelman stats-2014-05-11-“What should you talk about?”

18 0.85283589 1042 andrew gelman stats-2011-12-05-Timing is everything!

19 0.85082954 339 andrew gelman stats-2010-10-13-Battle of the NYT opinion-page economists

20 0.85020304 2070 andrew gelman stats-2013-10-20-The institution of tenure