
1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?


meta infos for this blog

Source: html

Introduction: Jacob Hartog writes the following in reaction to my post on the use of value-added modeling for teacher assessment: What I [Hartog] think has been inadequately discussed is the use of individual model specifications to assign these teacher ratings, rather than the zone of agreement across a broad swath of model specifications. For example, the model used by NYCDOE doesn’t just control for a student’s prior year test score (as I think everyone can agree is a good idea). It also assumes that different demographic groups will learn different amounts in a given year, and assigns a school-level random effect. The result is that, as was much ballyhooed at the time of the release of the data, the average teacher rating for a given school is roughly the same, no matter whether the school is performing great or terribly. The headline from this was “excellent teachers spread evenly across the city’s schools,” rather than “the specification of these models assumes that excellent teachers are


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 For example, the model used by NYCDOE doesn’t just control for a student’s prior year test score (as I think everyone can agree is a good idea). [sent-2, score-0.31]

2 It also assumes that different demographic groups will learn different amounts in a given year, and assigns a school-level random effect. [sent-3, score-0.307]

3 The result is that, as was much ballyhooed at the time of the release of the data, the average teacher rating for a given school is roughly the same, no matter whether the school is performing great or terribly. [sent-4, score-0.725]

4 The headline from this was “excellent teachers spread evenly across the city’s schools,” rather than “the specification of these models assumes that excellent teachers are spread evenly across the city’s schools.” [sent-5, score-1.134]

5 I also think that our baseline assumption should be that if demographic groups are learning different amounts in a given year, it is because they are getting different qualities of education, not because their intrinsic coefficients are different from one another. [sent-7, score-0.438]

6 Jesse Rothstein had a good paper a couple years back that showed that due to dynamic tracking of students, a 5th grade teacher’s value-added score is often correlated with their students’ 4th grade progress, which it shouldn’t be if these models are unbiased. [sent-8, score-0.442]

7 If I was more sophisticated, I’d try to extend Rothstein’s paper to show that dynamic sorting of teachers into high and low-functioning schools (as anyone can tell you happens in NYC) messes up the models just as badly as dynamic sorting of students does. [sent-9, score-1.181]

8 I should add that as someone who taught in NYC schools for 8 years, there’s nothing wrong with measurement. [sent-10, score-0.167]

9 The pre-existing principal-based observation and evaluation system was completely terrible, and it is totally reasonable to combine evaluations with test-based measurement in making decisions. [sent-11, score-0.281]

10 Analytically privileging the results of tests over other forms of evaluation, let alone assigning a percentile to the coefficient on a single (remarkably complex) model and publishing it in the newspapers, is absolutely bonkers. [sent-12, score-0.193]

11 Here’s what I think is the actual model NYCDOE uses. [sent-21, score-0.091]

12 My impression was that they were making a lot of compromises in order to create assessments which in their view were simple and transparent. [sent-23, score-0.268]

13 I’ve also had several conversations with Jonah Rockoff, an economist who’s done some studies of teacher effects in NYC schools. [sent-25, score-0.343]

14 I have not looked at his analysis in detail but it all looked very impressive to me. [sent-26, score-0.162]

15 I recall Rockoff telling me that there was not a lot of evidence for good teachers sorting into good schools. [sent-27, score-0.625]

16 This is not to say that average teacher effects are zero within each school; rather, I think he looked for aggregate differences between schools (after controlling for student and teacher differences) and didn’t find much. [sent-28, score-1.046]

17 At one point our efforts on the NYC schools were financially supported by a wealthy public-spirited friend of mine; our plan was to increase citizen involvement by opening up the process as much as possible and making data available to parents and teachers as well as to administrators. [sent-30, score-0.63]

18 In any case, as noted in my earlier post I’ve been surprised that these sorts of quantitative teacher assessments have been such a flop. [sent-32, score-0.491]

19 It makes sense that many teachers unions have opposed them, but even the supporters of these assessments don’t seem to be out there defending them. [sent-33, score-0.42]

20 I’d be curious if NYC had just posted average pretest and posttest scores by teacher, instead of a VA percentile, if it would have made any difference to its political palatability. [sent-36, score-0.174]
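Hartog's argument in sentences 2 to 4 above can be made concrete with a small simulation. The sketch below is an illustration only, not the NYCDOE model: all parameter values and function names are hypothetical, student gains stand in for scores already adjusted for the prior year, and subtracting each school's mean is used as a simple stand-in for a school-level effect (a random effect would typically remove most, though not all, of each school's average).

```python
import random
import statistics


def simulate(n_schools=20, teachers_per_school=10, students_per_teacher=25, seed=0):
    """Return (school_id, mean student gain) for every simulated teacher."""
    rng = random.Random(seed)
    rows = []
    for school in range(n_schools):
        # Assumption: good teachers are sorted into some schools, so every
        # teacher in a school shares a common quality component.
        school_quality = rng.gauss(0.0, 0.5)
        for _ in range(teachers_per_school):
            teacher_effect = school_quality + rng.gauss(0.0, 0.5)
            gains = [teacher_effect + rng.gauss(0.0, 1.0)
                     for _ in range(students_per_teacher)]
            rows.append((school, statistics.fmean(gains)))
    return rows


def school_average_ratings(rows, center_within_school):
    """Average teacher rating per school under one of two specifications."""
    grand_mean = statistics.fmean([gain for _, gain in rows])
    by_school = {}
    for school, gain in rows:
        by_school.setdefault(school, []).append(gain)
    averages = []
    for gains in by_school.values():
        # With a school effect, teachers are rated relative to their own school;
        # without it, they are rated relative to the whole city.
        baseline = statistics.fmean(gains) if center_within_school else grand_mean
        ratings = [gain - baseline for gain in gains]
        averages.append(statistics.fmean(ratings))
    return averages


rows = simulate()
spread_pooled = statistics.pstdev(school_average_ratings(rows, center_within_school=False))
spread_within = statistics.pstdev(school_average_ratings(rows, center_within_school=True))
print(f"SD of school-average rating, no school effect:   {spread_pooled:.3f}")
print(f"SD of school-average rating, with school effect: {spread_within:.3f}")
```

Under the pooled specification the school averages differ, reflecting the simulated sorting. Once ratings are centered within each school, every school's average rating is zero by construction, however the teachers were actually sorted, which is the sense in which an even spread of ratings can be an assumption of the specification rather than a finding.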


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('teacher', 0.343), ('teachers', 0.272), ('hartog', 0.218), ('nyc', 0.211), ('sorting', 0.179), ('schools', 0.167), ('assessments', 0.148), ('nycdoe', 0.145), ('rothstein', 0.145), ('dynamic', 0.129), ('player', 0.127), ('evenly', 0.119), ('rockoff', 0.119), ('average', 0.112), ('percentile', 0.102), ('school', 0.102), ('amounts', 0.095), ('model', 0.091), ('good', 0.087), ('assign', 0.084), ('totally', 0.083), ('demographic', 0.082), ('looked', 0.081), ('assessment', 0.081), ('grade', 0.078), ('players', 0.077), ('spread', 0.076), ('evaluation', 0.073), ('parents', 0.071), ('city', 0.071), ('score', 0.07), ('across', 0.07), ('liebman', 0.066), ('ballyhooed', 0.066), ('fiddaman', 0.066), ('intrinsic', 0.066), ('jesse', 0.066), ('swath', 0.066), ('va', 0.066), ('students', 0.066), ('different', 0.065), ('measurement', 0.065), ('wan', 0.062), ('posttest', 0.062), ('year', 0.062), ('excellent', 0.06), ('making', 0.06), ('compromises', 0.06), ('messes', 0.06), ('financially', 0.06)]
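As a rough guide to how word weights and similarity scores of this kind can be produced, here is a minimal sketch in plain Python. It is not the pipeline that generated this page: the whitespace tokenization, the smoothed idf formula, the unit-length normalization, and the three toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length {word: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed idf, so a word found in every document keeps a positive weight.
        vec = {word: count * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
               for word, count in counts.items()}
        norm = math.sqrt(sum(value * value for value in vec.values()))
        vectors.append({word: value / norm for word, value in vec.items()})
    return vectors


def top_words(vec, n=3):
    """The n highest-weighted words, analogous to the topN-words list."""
    return sorted(vec.items(), key=lambda item: item[1], reverse=True)[:n]


def cosine(a, b):
    """Cosine similarity; the vectors are unit length, so a dot product suffices."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


# Hypothetical toy corpus, not taken from the blog.
docs = [
    "teacher ratings depend on the model and the school",
    "teacher effects differ across school and city",
    "bayesian sampling with derivative based mcmc",
]
vectors = tfidf_vectors(docs)
print(top_words(vectors[0]))
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

With unit-length vectors, a document compared with itself scores 1 up to floating-point error, which would be consistent with the same-blog row in the list that follows, and documents that share no words score 0.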

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

2 0.30778763 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness

Introduction: In discussing the ongoing Los Angeles Times series on teacher effectiveness, Alex Tabarrok and I both were impressed that the newspaper was reporting results on individual teachers, moving beyond the general research findings (“teachers matter,” “KIPP really works, but it requires several extra hours in the school day,” and so forth) that we usually see from value-added analyses in education. My first reaction was that the L.A. Times could get away with this because, unlike academic researchers, they can do whatever they want as long as they don’t break the law. They don’t have to answer to an Institutional Review Board. (By referring to this study by its publication outlet rather than its authors, I’m violating my usual rule (see the last paragraph here). In this case, I think it’s ok to refer to the “L.A. Times study” because what’s notable is not the analysis (thorough as it may be) but how it is being reported.) Here I’d like to highlight a few other things that came up in our

3 0.27043465 1620 andrew gelman stats-2012-12-12-“Teaching effectiveness” as another dimension in cognitive ability

Introduction: I’m not a great teacher. I can get by because I work hard and I know a lot, and for some students my classes are just great, but it’s not a natural talent of mine. I know people who are amazing teachers, and they have something that I just don’t have. I wrote that book, Teaching Statistics: A Bag of Tricks (with Deb Nolan) because I’m not a good teacher and hence need to develop all sorts of techniques to be able to do what good teachers can do without even trying. I’m not proud of being mediocre at teaching. I don’t think that low teaching skill is some sort of indicator that I’m a great researcher. The other thing about teaching ability is that I think it’s hard to detect without actually seeing someone teach a class. If you see me give a seminar presentation or even a guest lecture, you’d think I’m an awesome teacher. But, actually, no. I’m an excellent speaker, not such a great teacher. This all came to mind when I received the following email from anthropologist Hen

4 0.23555094 1274 andrew gelman stats-2012-04-21-Value-added assessment political FAIL

Introduction: Jimmy points me to a sequence of posts (Analyzing Released NYC Value-Added Data Parts 1, 2, 3, 4) by Gary Rubinstein slamming value-added assessment of teachers. A skeptical consensus seems to have arisen on this issue. The teachers groups don’t like the numbers and it seems like none of the reformers trust the numbers enough to defend them. Lots of people like the idea of evaluating teacher performance, but I don’t see anybody out there wanting to seriously defend the numbers that are being pushed out here. P.S. Just to be clear, I’m specifically addressing the problems arising in value assessment of individual teachers. I’m not criticizing the interesting research by Jonah Rockoff and others on the distribution of teacher effects. It’s a lot easier to estimate the distribution of a set of parameters than to estimate the parameters individually.

5 0.2312676 222 andrew gelman stats-2010-08-21-Estimating and reporting teacher effectivenss: Newspaper researchers do things that academic researchers never could

Introduction: Alex Tabarrok reports on an analysis from the Los Angeles Times of teacher performance (as measured by so-called value-added analysis, which basically compares teachers based on their students’ average test scores at the end of the year, after controlling for pre-test scores). It’s well known that some teachers are much better than others, but, as Alex points out, what’s striking about the L.A. Times study is that they are publishing the estimates for individual teachers. For example, this: Nice graphics, too. To me, this illustrates one of the big advantages of research in a non-academic environment. If you’re writing an article for the L.A. Times, you can do what you want (within the limits of the law). If you’re doing the same research study at a university, there are a million restrictions. For example, from an official document, “The primary purpose of an Institutional Review Board (IRB) is to protect the rights and welfare of human subjects participati

6 0.2291704 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

7 0.22488239 606 andrew gelman stats-2011-03-10-It’s no fun being graded on a curve

8 0.20251517 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”

9 0.17278804 1803 andrew gelman stats-2013-04-14-Why girls do better in school

10 0.14657639 344 andrew gelman stats-2010-10-15-Story time

11 0.14165375 968 andrew gelman stats-2011-10-21-Could I use a statistics coach?

12 0.13699496 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools

13 0.12794478 261 andrew gelman stats-2010-09-07-The $900 kindergarten teacher

14 0.1263299 1517 andrew gelman stats-2012-10-01-“On Inspiring Students and Being Human”

15 0.12361017 1353 andrew gelman stats-2012-05-30-Question 20 of my final exam for Design and Analysis of Sample Surveys

16 0.11802949 1903 andrew gelman stats-2013-06-17-Weak identification provides partial information

17 0.11583018 2083 andrew gelman stats-2013-10-31-Value-added modeling in education: Gaming the system by sending kids on a field trip at test time

18 0.11471098 936 andrew gelman stats-2011-10-02-Covariate Adjustment in RCT - Model Overfitting in Multilevel Regression

19 0.11405128 542 andrew gelman stats-2011-01-28-Homework and treatment levels

20 0.11201049 361 andrew gelman stats-2010-10-21-Tenure-track statistics job at Teachers College, here at Columbia!


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.234), (1, -0.008), (2, 0.033), (3, -0.028), (4, 0.071), (5, 0.102), (6, 0.055), (7, 0.115), (8, 0.002), (9, 0.044), (10, 0.032), (11, 0.117), (12, -0.054), (13, -0.076), (14, -0.0), (15, -0.026), (16, 0.049), (17, 0.057), (18, -0.08), (19, 0.055), (20, -0.057), (21, 0.024), (22, -0.024), (23, -0.024), (24, 0.078), (25, -0.038), (26, -0.054), (27, 0.09), (28, -0.034), (29, 0.002), (30, -0.022), (31, -0.004), (32, 0.077), (33, -0.012), (34, 0.017), (35, 0.045), (36, 0.047), (37, 0.016), (38, 0.084), (39, 0.008), (40, 0.046), (41, 0.032), (42, 0.012), (43, 0.022), (44, -0.052), (45, 0.027), (46, -0.015), (47, -0.037), (48, 0.002), (49, -0.036)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94167942 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

2 0.8767572 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness

Introduction: In discussing the ongoing Los Angeles Times series on teacher effectiveness, Alex Tabarrok and I both were impressed that the newspaper was reporting results on individual teachers, moving beyond the general research findings (“teachers matter,” “KIPP really works, but it requires several extra hours in the school day,” and so forth) that we usually see from value-added analyses in education. My first reaction was that the L.A. Times could get away with this because, unlike academic researchers, they can do whatever they want as long as they don’t break the law. They don’t have to answer to an Institutional Review Board. (By referring to this study by its publication outlet rather than its authors, I’m violating my usual rule (see the last paragraph here). In this case, I think it’s ok to refer to the “L.A. Times study” because what’s notable is not the analysis (thorough as it may be) but how it is being reported.) Here I’d like to highlight a few other things that came up in our

3 0.86253351 1265 andrew gelman stats-2012-04-15-Progress in U.S. education; also, a discussion of what it takes to hit the op-ed pages

Introduction: Howard Wainer writes: When we focus only on the differences between groups, we too easily lose track of the big picture. Nowhere is this more obvious than in the current public discussions of the size of the gap in test scores that is observed between racial groups. It has been noted that in New Jersey the gap between the average scores of white and black students on the well-developed scale of the National Assessment of Educational Progress (NAEP) has shrunk by only about 25 percent over the past two decades. The conclusion drawn was that even though the change is in the right direction, it is far too slow. But focusing on the difference blinds us to what has been a remarkable success in education over the past 20 years. Although the direction and size of student improvements are considered across many subject areas and many age groups, I will describe just one — 4th grade mathematics. . . . there have been steep gains for both racial groups over this period (somewhat steeper g

4 0.85200882 1803 andrew gelman stats-2013-04-14-Why girls do better in school

Introduction: Wayne Folta writes, “In light of your recent blog post on women in higher education, here’s one I just read about on a techie website regarding elementary education”: Why do girls get better grades in elementary school than boys—even when they perform worse on standardized tests? New research . . . suggests that it’s because of their classroom behavior, which may lead teachers to assign girls higher grades than their male counterparts. . . . The study, co-authored by [Christopher] Cornwell and David Mustard at UGA and Jessica Van Parys at Columbia, analyzed data on more than 5,800 students from kindergarten through fifth grade. It examined students’ performance on standardized tests in three categories (reading, math and science), linking test scores to teachers’ assessments of their students’ progress, both academically and more broadly. The data show, for the first time, that gender disparities in teacher grades start early and uniformly favor girls. In every subject area, bo

5 0.85002339 606 andrew gelman stats-2011-03-10-It’s no fun being graded on a curve

Introduction: Mark Palko points to a news article by Michael Winerip on teacher assessment: No one at the Lab Middle School for Collaborative Studies works harder than Stacey Isaacson, a seventh-grade English and social studies teacher. She is out the door of her Queens home by 6:15 a.m., takes the E train into Manhattan and is standing out front when the school doors are unlocked, at 7. Nights, she leaves her classroom at 5:30. . . . Her principal, Megan Adams, has given her terrific reviews during the two and a half years Ms. Isaacson has been a teacher. . . . The Lab School has selective admissions, and Ms. Isaacson’s students have excelled. Her first year teaching, 65 of 66 scored proficient on the state language arts test, meaning they got 3’s or 4’s; only one scored below grade level with a 2. More than two dozen students from her first two years teaching have gone on to . . . the city’s most competitive high schools. . . . You would think the Department of Education would want to r

6 0.84319657 542 andrew gelman stats-2011-01-28-Homework and treatment levels

7 0.83717597 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer

8 0.82721746 1620 andrew gelman stats-2012-12-12-“Teaching effectiveness” as another dimension in cognitive ability

9 0.81690532 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”

10 0.81223917 484 andrew gelman stats-2010-12-24-Foreign language skills as an intrinsic good; also, beware the tyranny of measurement

11 0.80852479 825 andrew gelman stats-2011-07-27-Grade inflation: why weren’t the instructors all giving all A’s already??

12 0.79359823 1507 andrew gelman stats-2012-09-22-Grade inflation: why weren’t the instructors all giving all A’s already??

13 0.79257071 1688 andrew gelman stats-2013-01-22-That claim that students whose parents pay for more of college get worse grades

14 0.77433431 326 andrew gelman stats-2010-10-07-Peer pressure, selection, and educational reform

15 0.77339339 315 andrew gelman stats-2010-10-03-He doesn’t trust the fit . . . r=.999

16 0.75812089 95 andrew gelman stats-2010-06-17-“Rewarding Strivers: Helping Low-Income Students Succeed in College”

17 0.7538048 452 andrew gelman stats-2010-12-06-Followup questions

18 0.75047934 222 andrew gelman stats-2010-08-21-Estimating and reporting teacher effectivenss: Newspaper researchers do things that academic researchers never could

19 0.74995309 71 andrew gelman stats-2010-06-07-Pay for an A?

20 0.73939598 617 andrew gelman stats-2011-03-17-“Why Preschool Shouldn’t Be Like School”?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.027), (4, 0.089), (15, 0.047), (16, 0.087), (21, 0.018), (24, 0.128), (28, 0.018), (31, 0.011), (43, 0.015), (53, 0.017), (60, 0.011), (65, 0.013), (66, 0.016), (86, 0.037), (89, 0.033), (90, 0.014), (98, 0.016), (99, 0.285)]
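The list above is a sparse representation: only some topic ids appear, each paired with a weight. The short sketch below copies those pairs verbatim and shows one way to read them. That topics below some weight threshold were omitted is an assumption; the page does not say how the list was truncated.

```python
# (topicId, topicWeight) pairs copied from the lda list for this post.
lda_weights = [
    (2, 0.027), (4, 0.089), (15, 0.047), (16, 0.087), (21, 0.018), (24, 0.128),
    (28, 0.018), (31, 0.011), (43, 0.015), (53, 0.017), (60, 0.011), (65, 0.013),
    (66, 0.016), (86, 0.037), (89, 0.033), (90, 0.014), (98, 0.016), (99, 0.285),
]

# Total weight carried by the listed topics; any remainder presumably belongs
# to topics that were left out of the list.
listed_mass = sum(weight for _, weight in lda_weights)

# The three topics with the largest weight for this post.
dominant = sorted(lda_weights, key=lambda pair: pair[1], reverse=True)[:3]

print(f"listed topics: {len(lda_weights)}, total weight: {listed_mass:.3f}")
print("largest topics:", dominant)
```

Read this way, a single topic (id 99) carries roughly a third of the listed weight, so similarity scores built on these vectors may be driven largely by whether two posts share that one topic.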

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97615862 947 andrew gelman stats-2011-10-08-GiveWell sez: Cost-effectiveness of de-worming was overstated by a factor of 100 (!) due to a series of sloppy calculations

Introduction: Alexander at GiveWell writes: The Disease Control Priorities in Developing Countries (DCP2), a major report funded by the Gates Foundation . . . provides an estimate of $3.41 per disability-adjusted life-year (DALY) for the cost-effectiveness of soil-transmitted-helminth (STH) treatment, implying that STH treatment is one of the most cost-effective interventions for global health. In investigating this figure, we have corresponded, over a period of months, with six scholars who had been directly or indirectly involved in the production of the estimate. Eventually, we were able to obtain the spreadsheet that was used to generate the $3.41/DALY estimate. That spreadsheet contains five separate errors that, when corrected, shift the estimated cost effectiveness of deworming from $3.41 to $326.43. [I think they mean to say $300 -- ed.] We came to this conclusion a year after learning that the DCP2’s published cost-effectiveness estimate for schistosomiasis treatment – another kind of

2 0.97338557 1801 andrew gelman stats-2013-04-13-Can you write a program to determine the causal order?

Introduction: Mike Zyphur writes: Kaggle.com has launched a competition to determine what’s an effect and what’s a cause. They’ve got correlated variables, they’re deprived of context, and you’re asked to determine the causal order. $5,000 prizes. I followed the link and the example they gave didn’t make much sense to me (the two variables were temperature and altitude of cities in Germany, and they said that altitude causes temperature). It has the feeling to me of one of those weird standardized tests we used to see sometimes in school, where there’s no real correct answer so the goal is to figure out what the test-writer wanted you to say. Nonetheless, this might be of interest, so I’m passing it along to you.

3 0.9718076 1618 andrew gelman stats-2012-12-11-The consulting biz

Introduction: I received the following (unsolicited) email: Hello, *** LLC, a ***-based market research company, has a financial client who is interested in speaking with a statistician who has done research in the field of Alzheimer’s Disease and preferably familiar with the SOLA and BAPI trials. We offer an honorarium of $200 for a 30 minute telephone interview. Please advise us if you have an employment or consulting agreement with any organization or operate professionally pursuant to an organization’s code of conduct or employee manual that may control activities by you outside of your regular present and former employment, such as participating in this consulting project for MedPanel. If there are such contracts or other documents that do apply to you, please forward MedPanel a copy of each such document asap as we are obligated to review such documents to determine if you are permitted to participate as a consultant for MedPanel on a project with this particular client. If you are

4 0.97068346 113 andrew gelman stats-2010-06-28-Advocacy in the form of a “deliberative forum”

Introduction: John Sides reports on a paper by Benjamin Page and Lawrence Jacobs about so-called deliberative forums, in particular a set of meetings called America Speaks that have been organized and conducted by the Peter G. Peterson Foundation, an organization formed by the former advertising executive, Secretary of Commerce, and investment banker to focus attention on the national debt. Sides, Page, and Jacobs discuss three key points: 1. Any poll or focus group is only as good as its sample, and there is no evidence that the participants in the America Speaks forums were selected in a way to be representative of the nation. Page and Jacobs write: Deliberative forums often fail to get a representative sample of Americans to participate, even when they try hard to do so. Worse, some deliberative forums make little or no serious effort to achieve representativeness. They throw open the doors to self-selected political activists with extreme opinions, or they compile a secret list

same-blog 5 0.96541595 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

6 0.96417582 238 andrew gelman stats-2010-08-27-No radon lobby

7 0.96225989 1829 andrew gelman stats-2013-04-28-Plain old everyday Bayesianism!

8 0.96170741 907 andrew gelman stats-2011-09-14-Reproducibility in Practice

9 0.96034044 1918 andrew gelman stats-2013-06-29-Going negative

10 0.9555254 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

11 0.9535054 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

12 0.94694537 1435 andrew gelman stats-2012-07-30-Retracted articles and unethical behavior in economics journals?

13 0.9466368 1878 andrew gelman stats-2013-05-31-How to fix the tabloids? Toward replicable social science research

14 0.94532526 1163 andrew gelman stats-2012-02-12-Meta-analysis, game theory, and incentives to do replicable research

15 0.94511765 2000 andrew gelman stats-2013-08-28-Why during the 1950-1960′s did Jerry Cornfield become a Bayesian?

16 0.94410586 1605 andrew gelman stats-2012-12-04-Write This Book

17 0.94389343 2212 andrew gelman stats-2014-02-15-Mary, Mary, why ya buggin

18 0.943829 2137 andrew gelman stats-2013-12-17-Replication backlash

19 0.94338566 2244 andrew gelman stats-2014-03-11-What if I were to stop publishing in journals?

20 0.94275618 2297 andrew gelman stats-2014-04-20-Fooled by randomness