
1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency


meta information for this blog

Source: html

Introduction: Austin Kelly writes: While reading your postings [or here ] on the subject of testing your model by running fake data I was reminded of the fact that I got one of these kinds of tests actually published in a GAO report back in the day. Reading your posts on Unz and political vs. economic discourse made me think of that work again. I thought I’d actually drop you a line on the subject. Back in 2003 GAO was asked to look at Farmer Mac, including a look at the Farm Credit Agency’s regulation of Farmer Mac. As the resident mortgage econometrician back then I was asked to look at FCA’s risk based capital stress test for Farmer Mac. The work was pretty easy. I found a lot of oddities, but the biggest one was that they were using a discrete choice set up (loan goes bad or doesn’t) instead of a hazard model (loan goes bad this period or survives to the next). Not necessarily a problem – lots of mortgage models run that way. But you have to be really careful with your independe


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Austin Kelly writes: While reading your postings [or here ] on the subject of testing your model by running fake data I was reminded of the fact that I got one of these kinds of tests actually published in a GAO report back in the day. [sent-1, score-0.377]

2 Reading your posts on Unz and political vs. economic discourse made me think of that work again. [sent-3, score-0.158]

3 I thought I’d actually drop you a line on the subject. [sent-4, score-0.083]

4 As the resident mortgage econometrician back then I was asked to look at FCA’s risk based capital stress test for Farmer Mac. [sent-6, score-0.417]

5 I found a lot of oddities, but the biggest one was that they were using a discrete choice set up (loan goes bad or doesn’t) instead of a hazard model (loan goes bad this period or survives to the next). [sent-8, score-0.34]

6 Not necessarily a problem – lots of mortgage models run that way. [sent-9, score-0.211]

7 But you have to be really careful with your independent variables. [sent-10, score-0.102]

8 They defined as an “independent” variable the largest drop in farmland prices from mortgage origination to now, or to the date the mortgage went bad, whichever came first. [sent-12, score-1.17]

9 I always get a little suspicious when the event you are trying to predict gets incorporated as part of the definition of the variable that’s supposed to explain the event. [sent-13, score-0.232]

10 I searched through Heckman’s old reading list, JSTOR, etc. [sent-17, score-0.091]

11 but couldn’t come up with a proof of why that doesn’t work. [sent-18, score-0.092]

12 Generating some fake data was a lot easier, and apparently more persuasive to my non-quant colleagues. [sent-24, score-0.089]

13 Reading your post on academic vs political, my first thought was that just about every time I’ve engaged an academic in a “political” sphere they’ve adopted “political” discourse. [sent-26, score-0.502]

14 Over my career I could point to many cases where the response from an academic was political discourse. [sent-30, score-0.321]

15 It’s just that the political responses are the ones that stick in the craw and are most easily remembered. [sent-32, score-0.152]

16 Regarding academic vs political discourse, I agree completely that academics often seem to care more about short-term winning than about getting things right. [sent-33, score-0.549]

17 My point in that blog post was not that academics are better or more honorable than politicians, but rather that the rules are different. [sent-34, score-0.128]

18 We would like an academic to engage in open discourse and not use the truth as negotiation chits, and when they behave in political ways we are unhappy. [sent-35, score-0.55]

19 In contrast, a politician is supposed to negotiate. [sent-36, score-0.176]

20 If a politician makes concessions without getting anything in return, we respect him less. [sent-37, score-0.178]
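The fake-data check described in sentences 8 to 12 can be illustrated with a small simulation. This is a minimal sketch and not Kelly's actual GAO analysis: the random-walk price process, the per-period default probability, and the names `simulate_loans` and `mean_drop_by_outcome` are all assumptions made for illustration. Defaults are drawn independently of prices, so any relationship between the "largest drop" variable and default is an artifact of how the variable is defined.

```python
import random
from statistics import mean


def simulate_loans(n_loans=5000, n_periods=40, default_prob=0.03,
                   price_sd=0.05, seed=1):
    """Generate fake loans whose defaults are independent of prices by construction.

    Each loan gets its own random-walk log price (an assumption; real loans
    in one county would share a price path).
    """
    rng = random.Random(seed)
    loans = []
    for _loan in range(n_loans):
        log_price = 0.0   # log price relative to the price at origination
        max_drop = 0.0    # largest decline below the origination price so far
        defaulted = False
        for _period in range(n_periods):
            log_price += rng.gauss(0.0, price_sd)
            max_drop = max(max_drop, -log_price)
            # Default is a coin flip that never looks at the price.
            if rng.random() < default_prob:
                defaulted = True
                break  # the measurement window stops when the loan goes bad
        loans.append((defaulted, max_drop))
    return loans


def mean_drop_by_outcome(loans):
    """Return (mean largest drop for bad loans, mean largest drop for survivors)."""
    bad = [drop for defaulted, drop in loans if defaulted]
    good = [drop for defaulted, drop in loans if not defaulted]
    return mean(bad), mean(good)


if __name__ == "__main__":
    bad_mean, good_mean = mean_drop_by_outcome(simulate_loans())
    print(f"mean largest drop, loans that went bad: {bad_mean:.3f}")
    print(f"mean largest drop, loans that survived: {good_mean:.3f}")
```

Because the largest drop can only grow as the measurement window lengthens, loans that survive to the end are measured over a longer window and tend to show larger drops than loans that went bad early. A discrete choice model fit to such data would pick up an association between the variable and default even though none was built in.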


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('farmland', 0.31), ('fca', 0.233), ('gao', 0.212), ('mortgage', 0.211), ('farmer', 0.2), ('county', 0.193), ('loan', 0.191), ('academic', 0.169), ('discourse', 0.158), ('jstor', 0.155), ('political', 0.152), ('origination', 0.141), ('academics', 0.128), ('whichever', 0.128), ('heckman', 0.111), ('price', 0.111), ('politician', 0.107), ('independent', 0.102), ('vs', 0.1), ('proof', 0.092), ('reading', 0.091), ('fake', 0.089), ('variable', 0.086), ('drop', 0.083), ('discrete', 0.078), ('event', 0.077), ('back', 0.075), ('change', 0.074), ('period', 0.072), ('oddities', 0.071), ('concessions', 0.071), ('negotiation', 0.071), ('chits', 0.071), ('supposed', 0.069), ('farm', 0.067), ('resident', 0.067), ('kelly', 0.067), ('eda', 0.067), ('couldn', 0.064), ('sphere', 0.064), ('survives', 0.064), ('proving', 0.064), ('econometrician', 0.064), ('bad', 0.063), ('got', 0.062), ('consultants', 0.062), ('postings', 0.06), ('permanent', 0.06), ('austin', 0.06), ('grants', 0.057)]
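A similarity score between two posts can be computed from word weights like the ones above. The sketch below assumes cosine similarity over tfidf weights, which is a common choice but is not confirmed by this page; the `other_post` weights are hypothetical, and only a handful of this post's top words are used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# A few of the top tfidf words listed for this post.
this_post = {
    "farmland": 0.31, "fca": 0.233, "gao": 0.212, "mortgage": 0.211,
    "academic": 0.169, "discourse": 0.158, "political": 0.152,
}

# Hypothetical weights for another post, made up for illustration.
other_post = {
    "negotiation": 0.4, "discourse": 0.3, "political": 0.25, "concessions": 0.2,
}

if __name__ == "__main__":
    print(cosine_similarity(this_post, this_post))
    print(cosine_similarity(this_post, other_post))
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, and posts that share only a few weighted words score much lower.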

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency


2 0.14253883 1743 andrew gelman stats-2013-02-28-Different modes of discourse

Introduction: Political/business negotiation vs. scholarly communication. In a negotiation you hold back, you only make concessions if you have to or in exchange for something else. In scholarly communication you look for your own mistakes, you volunteer information to others, and if someone points out a mistake, you learn from it. (Just a couple days ago, in fact, someone sent me an email showing a problem with bayesglm. I ran and altered his code, and it turned out we had a problem. Based on this information, Yu-Sung found and fixed the code. I was grateful to be informed of the problem.) Not all scholarly exchange goes like this, but that’s the ideal. In contrast, openness and transparency are not ideals in politics and business; in many cases they’re not even desired. If Barack Obama and John Boehner are negotiating on the budget, would it be appropriate for one of them to just start off the negotiations by making a bunch of concessions for free? No, of course not. Negotiation doesn

3 0.10155889 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

Introduction: John Pugliese writes: I was recently in a conversation with some colleagues regarding the evaluation of recent welfare reform in California. The discussion centered around what types of design might allow us to understand the impact of the changes. Experimental designs were out, as random assignment is not feasible. Our data is pre/post, and some of my colleagues believed that the best we can do under these circumstances was a descriptive study; i.e. no causal inference. All of us were concerned with economic and population changes over the pre-to-post period; i.e. over-estimating the effects in an improving economy. I thought a quasi-experimental design was possible using MLM. Briefly, my suggestion was the following: Match our post-participants to a set of pre-participants on relevant person level factors, and treat the pre/post differences as a random effect at the county level. Next, we would adjust the pre/post differences by changes in economic and populati

4 0.10081938 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

Introduction: When it rains it pours . . . John Transue writes: I saw a post on Andrew Sullivan’s blog today about life expectancy in different US counties. With a bunch of the worst counties being in Mississippi, I thought that it might be another case of analysts getting extreme values from small counties. However, the paper (see here ) includes a pretty interesting methods section. This is from page 5, “Specifically, we used a mixed-effects Poisson regression with time, geospatial, and covariate components. Poisson regression fits count outcome variables, e.g., death counts, and is preferable to a logistic model because the latter is biased when an outcome is rare (occurring in less than 1% of observations).” They have downloadable data. I believe that the data are predicted values from the model. A web appendix also gives 90% CIs for their estimates. Do you think they solved the small county problem and that the worst counties really are where their spreadsheet suggests? My re

5 0.094159037 2180 andrew gelman stats-2014-01-21-Everything I need to know about Bayesian statistics, I learned in eight schools.

Introduction: This post is by Phil. I’m aware that there are some people who use a Bayesian approach largely because it allows them to provide a highly informative prior distribution based on subjective judgment, but that is not the appeal of Bayesian methods for a lot of us practitioners. It’s disappointing and surprising, twenty years after my initial experiences, to still hear highly informed professional statisticians who think that what distinguishes Bayesian statistics from Frequentist statistics is “subjectivity” (as seen in a recent blog post and its comments). My first encounter with Bayesian statistics was just over 20 years ago. I was a postdoc at Lawrence Berkeley National Laboratory, with a new PhD in theoretical atomic physics but working on various problems related to the geographical and statistical distribution of indoor radon (a naturally occurring radioactive gas that can be dangerous if present at high concentrations). One of the issues I ran into right at the start was th

6 0.093290165 2255 andrew gelman stats-2014-03-19-How Americans vote

7 0.090374142 1962 andrew gelman stats-2013-07-30-The Roy causal model?

8 0.087914698 604 andrew gelman stats-2011-03-08-More on the missing conservative psychology researchers

9 0.08783821 182 andrew gelman stats-2010-08-03-Nebraska never looked so appealing: anatomy of a zombie attack. Oops, I mean a recession.

10 0.086428076 1148 andrew gelman stats-2012-01-31-“the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature”

11 0.085019611 1605 andrew gelman stats-2012-12-04-Write This Book

12 0.08497899 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things

13 0.084193081 1844 andrew gelman stats-2013-05-06-Against optimism about social science

14 0.082599461 2235 andrew gelman stats-2014-03-06-How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?

15 0.081340738 805 andrew gelman stats-2011-07-16-Hey–here’s what you missed in the past 30 days!

16 0.079335935 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

17 0.077299267 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

18 0.076094121 1656 andrew gelman stats-2013-01-05-Understanding regression models and regression coefficients

19 0.07546109 1632 andrew gelman stats-2012-12-20-Who exactly are those silly academics who aren’t as smart as a Vegas bookie?

20 0.073137112 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.196), (1, -0.04), (2, 0.018), (3, 0.013), (4, 0.005), (5, 0.007), (6, 0.048), (7, -0.019), (8, 0.046), (9, 0.015), (10, -0.016), (11, 0.042), (12, -0.046), (13, -0.019), (14, -0.002), (15, 0.039), (16, -0.023), (17, -0.031), (18, 0.009), (19, 0.002), (20, -0.006), (21, 0.017), (22, -0.015), (23, -0.013), (24, -0.003), (25, -0.003), (26, 0.026), (27, -0.039), (28, 0.012), (29, -0.01), (30, 0.006), (31, 0.002), (32, 0.01), (33, 0.005), (34, 0.014), (35, -0.026), (36, -0.028), (37, 0.01), (38, 0.036), (39, 0.017), (40, 0.021), (41, -0.053), (42, -0.019), (43, -0.012), (44, -0.019), (45, 0.02), (46, 0.03), (47, 0.037), (48, 0.031), (49, 0.01)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96692961 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency


2 0.83558577 2337 andrew gelman stats-2014-05-18-Never back down: The culture of poverty and the culture of journalism

Introduction: Ta-Nehisi Coates recently published a fascinating column on the “culture of poverty,” in particular focusing on the idea that behavior that is rational and adaptive in some settings is not so appropriate in others: The set of practices required for a young man to secure his safety on the streets of his troubled neighborhood are not the same as those required to place him on an honor roll . . . The way to guide him through this transition is not to insult his native language. . . . For black men like us, the feeling of having something to lose, beyond honor and face, is foreign. We grew up in communities—New York, Baltimore, Chicago—where the Code of the Streets was the first code we learned. Respect and reputation are everything there. These values are often denigrated by people who have never been punched in the face. But when you live around violence there is no opting out. A reputation for meeting violence with violence is a shield. That protection increases when you are part

3 0.81027532 189 andrew gelman stats-2010-08-06-Proposal for a moratorium on the use of the words “fashionable” and “trendy”

Introduction: Tyler Cowen links to an interesting article by Terry Teachout on David Mamet’s political conservatism. I don’t think of playwrights as gurus, but I do find it interesting to consider the political orientations of authors and celebrities . I have only one problem with Teachout’s thought-provoking article. He writes: As early as 2002 . . . Arguing that “the Western press [had] embraced antisemitism as the new black,” Mamet drew a sharp contrast between that trendy distaste for Jews and the harsh realities of daily life in Israel . . . In 2006, Mamet published a collection of essays called The Wicked Son: Anti-Semitism, Jewish Self-Hatred and the Jews that made the point even more bluntly. “The Jewish State,” he wrote, “has offered the Arab world peace since 1948; it has received war, and slaughter, and the rhetoric of annihilation.” He went on to argue that secularized Jews who “reject their birthright of ‘connection to the Divine’” succumb in time to a self-hatred tha

4 0.78959727 1369 andrew gelman stats-2012-06-06-Your conclusion is only as good as your data

Introduction: Jay Livingston points to an excellent rant from Peter Moskos, trashing a study about “food deserts” (which I kept reading as “food desserts”) in inner-city neighborhoods. Here’s Moskos: From the Times: There is no relationship between the type of food being sold in a neighborhood and obesity among its children and adolescents. Within a couple of miles of almost any urban neighborhood, “you can get basically any type of food,” said Roland Sturm of the RAND Corporation, lead author of one of the studies. “Maybe we should call it a food swamp rather than a desert,” he said. Sure thing, Sturm. But I suspect you wouldn’t think certain neighborhoods are swamped with good food if you actually got out of your office and went to one of the neighborhoods. After all, what are you going to believe: A nice data set or your lying eyes? “Food outlet data … are classified using the North American Industry Classification System (NAICS)” (p. 130). Assuming validity and reliability of NAICS

5 0.78105587 139 andrew gelman stats-2010-07-10-Life in New York, Then and Now

Introduction: Interesting mini-memoir from John Podhoretz about the Upper West Side, in his words, “the most affluent shtetl the world has ever seen.” The only part I can’t quite follow is his offhand remark, “It is an expensive place to live, but then it always was.” I always thought that, before 1985 or so, the Upper West Side wasn’t so upscale. People at Columbia tell all sorts of stories about how things used to be in the bad old days. I have one other comment. Before giving it, let me emphasize that I enjoyed reading Podhoretz’s article and, by making the comment below, I’m not trying to shoot Podhoretz down; rather, I’m trying to help out by pointing out a habit in his writing that might be getting in the way of his larger messages. Podhoretz writes the following about slum clearance: Over the course of the next four years, 20 houses on the block would be demolished and replaced with a high school named for Louis Brandeis and a relocated elementary school. Of the 35 brownstones t

6 0.77968162 105 andrew gelman stats-2010-06-23-More on those divorce prediction statistics, including a discussion of the innumeracy of (some) mathematicians

7 0.76993352 470 andrew gelman stats-2010-12-16-“For individuals with wine training, however, we find indications of a positive relationship between price and enjoyment”

8 0.76940101 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

9 0.76869935 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope

10 0.76609379 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

11 0.76500255 2270 andrew gelman stats-2014-03-28-Creating a Lenin-style democracy

12 0.75644338 229 andrew gelman stats-2010-08-24-Bizarre twisty argument about medical diagnostic tests

13 0.7525655 1042 andrew gelman stats-2011-12-05-Timing is everything!

14 0.75238371 335 andrew gelman stats-2010-10-11-How to think about Lou Dobbs

15 0.75191146 1148 andrew gelman stats-2012-01-31-“the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature”

16 0.74851376 458 andrew gelman stats-2010-12-08-Blogging: Is it “fair use”?

17 0.74807435 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?

18 0.74702108 321 andrew gelman stats-2010-10-05-Racism!

19 0.74441093 1525 andrew gelman stats-2012-10-08-Ethical standards in different data communities

20 0.74137038 1479 andrew gelman stats-2012-09-01-Mothers and Moms


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.01), (16, 0.082), (21, 0.024), (23, 0.011), (24, 0.113), (34, 0.012), (42, 0.025), (53, 0.011), (79, 0.162), (81, 0.032), (82, 0.012), (84, 0.019), (86, 0.056), (94, 0.014), (98, 0.018), (99, 0.266)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95830953 1515 andrew gelman stats-2012-09-29-Jost Haidt

Introduction: Research psychologist John Jost reviews the recent book, “The Righteous Mind,” by research psychologist Jonathan Haidt. Some of my thoughts on Haidt’s book are here . And here’s some of Jost’s review: Haidt’s book is creative, interesting, and provocative. . . . The book shines a new light on moral psychology and presents a bold, confrontational message. From a scientific perspective, however, I worry that his theory raises more questions than it answers. Why do some individuals feel that it is morally good (or necessary) to obey authority, favor the ingroup, and maintain purity, whereas others are skeptical? (Perhaps parenting style is relevant after all.) Why do some people think that it is morally acceptable to judge or even mistreat others such as gay or lesbian couples or, only a generation ago, interracial couples because they dislike or feel disgusted by them, whereas others do not? Why does the present generation “care about violence toward many more classes of victims

2 0.95779514 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analyses indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r

3 0.9467411 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used a Gaussian process with two slowly varying components and a periodic component with decay, so that the periodic form can change in time. I used Student’s t-distribution as the observation model to allow exceptional dates to be outliers. I guess that the periodic component due to the week effect is still in the data because there is data only from twenty years. Naturally it would be better to model the whole time series, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.

4 0.94393373 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

Introduction: Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension of the so-called Weighted Histogram Analysis Method (UWHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the Multistate Bennett Acceptance Ratio (MBAR) method. But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm.

same-blog 5 0.9399569 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency


6 0.93946785 1126 andrew gelman stats-2012-01-18-Bob on Stan

7 0.93835139 1538 andrew gelman stats-2012-10-17-Rust

8 0.9348315 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

9 0.93390417 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence

10 0.93298435 863 andrew gelman stats-2011-08-21-Bad graph

11 0.92227489 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

12 0.92001891 939 andrew gelman stats-2011-10-03-DBQQ rounding for labeling charts and communicating tolerances

13 0.90160871 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

14 0.90026093 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!

15 0.88347393 1229 andrew gelman stats-2012-03-25-Same old story

16 0.88185149 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

17 0.88102144 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

18 0.88067889 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

19 0.8747316 1323 andrew gelman stats-2012-05-16-Question 6 of my final exam for Design and Analysis of Sample Surveys

20 0.87377948 636 andrew gelman stats-2011-03-29-The Conservative States of America