andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1274 knowledge-graph by maker-knowledge-mining

1274 andrew gelman stats-2012-04-21-Value-added assessment political FAIL


meta information for this blog

Source: html

Introduction: Jimmy points me to a sequence of posts (Analyzing Released NYC Value-Added Data Parts 1, 2, 3, 4) by Gary Rubinstein slamming value-added assessment of teachers. A skeptical consensus seems to have arisen on this issue. The teachers groups don’t like the numbers and it seems like none of the reformers trust the numbers enough to defend them. Lots of people like the idea of evaluating teacher performance, but I don’t see anybody out there wanting to seriously defend the numbers that are being pushed out here. P.S. Just to be clear, I’m specifically addressing the problems arising in value assessment of individual teachers. I’m not criticizing the interesting research by Jonah Rockoff and others on the distribution of teacher effects. It’s a lot easier to estimate the distribution of a set of parameters than to estimate the parameters individually.
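The closing point of the post, that the distribution of a set of parameters is easier to estimate than the parameters themselves, can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the function name `simulate`, the class size, and the standard deviations are hypothetical and are not taken from the NYC data or from any of the cited research. Each teacher's raw estimate is a noisy class average, and the spread of true effects is recovered by subtracting the known sampling variance from the variance of the raw estimates.

```python
import random
import statistics


def simulate(n_teachers=2000, n_students=25, sd_teacher=0.15, sd_noise=1.0, seed=0):
    """Compare error in individual teacher estimates with error in the estimated spread.

    All numeric defaults are illustrative assumptions, not values from real data.
    """
    rng = random.Random(seed)

    # True teacher effects, drawn from a normal distribution.
    true_effects = [rng.gauss(0.0, sd_teacher) for _ in range(n_teachers)]

    # A class average of n_students noisy outcomes has this standard error.
    se = sd_noise / n_students ** 0.5

    # Raw estimate per teacher: true effect plus sampling noise of the class mean
    # (simulated directly rather than student by student).
    estimates = [t + rng.gauss(0.0, se) for t in true_effects]

    # Individual level: root mean squared error of the raw estimates.
    rmse = (sum((e - t) ** 2 for e, t in zip(estimates, true_effects)) / n_teachers) ** 0.5

    # Distribution level: method-of-moments estimate of the teacher-effect SD,
    # i.e. variance of raw estimates minus the sampling variance, floored at zero.
    var_hat = max(statistics.pvariance(estimates) - se ** 2, 0.0)

    return {
        "sd_teacher": sd_teacher,
        "sd_hat": var_hat ** 0.5,
        "rmse_individual": rmse,
        "se": se,
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key}: {value:.3f}")
```

Under these assumptions the sampling error of a single teacher's estimate is larger than the true spread of teacher effects, so individual rankings are mostly noise, while the spread itself is recovered fairly closely because it pools information across all teachers.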


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Jimmy points me to a sequence of posts (Analyzing Released NYC Value-Added Data Parts 1, 2, 3, 4) by Gary Rubinstein slamming value-added assessment of teachers. [sent-1, score-0.729]

2 A skeptical consensus seems to have arisen on this issue. [sent-2, score-0.52]

3 The teachers groups don’t like the numbers and it seems like none of the reformers trust the numbers enough to defend them. [sent-3, score-1.573]

4 Lots of people like the idea of evaluating teacher performance, but I don’t see anybody out there wanting to seriously defend the numbers that are being pushed out here. [sent-4, score-1.458]

5 Just to be clear, I’m specifically addressing the problems arising in value assessment of individual teachers. [sent-7, score-0.898]

6 I’m not criticizing the interesting research by Jonah Rockoff and others on the distribution of teacher effects. [sent-8, score-0.632]

7 It’s a lot easier to estimate the distribution of a set of parameters than to estimate the parameters individually. [sent-9, score-0.924]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('assessment', 0.263), ('defend', 0.259), ('teacher', 0.247), ('numbers', 0.222), ('reformers', 0.214), ('rockoff', 0.193), ('arisen', 0.181), ('rubinstein', 0.177), ('parameters', 0.172), ('pushed', 0.166), ('individually', 0.163), ('addressing', 0.16), ('jonah', 0.16), ('jimmy', 0.158), ('slamming', 0.156), ('sequence', 0.15), ('arising', 0.15), ('distribution', 0.148), ('nyc', 0.137), ('estimate', 0.133), ('consensus', 0.133), ('anybody', 0.132), ('evaluating', 0.131), ('wanting', 0.13), ('released', 0.13), ('gary', 0.127), ('teachers', 0.126), ('criticizing', 0.121), ('skeptical', 0.118), ('analyzing', 0.116), ('specifically', 0.109), ('parts', 0.109), ('easier', 0.105), ('trust', 0.105), ('none', 0.104), ('posts', 0.104), ('seriously', 0.101), ('performance', 0.097), ('groups', 0.093), ('seems', 0.088), ('individual', 0.079), ('value', 0.078), ('like', 0.07), ('clear', 0.064), ('lots', 0.063), ('others', 0.061), ('set', 0.061), ('problems', 0.059), ('points', 0.056), ('interesting', 0.055)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1274 andrew gelman stats-2012-04-21-Value-added assessment political FAIL

2 0.23555094 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

Introduction: Jacob Hartog writes the following in reaction to my post on the use of value-added modeling for teacher assessment: What I [Hartog] think has been inadequately discussed is the use of individual model specifications to assign these teacher ratings, rather than the zone of agreement across a broad swath of model specifications. For example, the model used by NYCDOE doesn’t just control for a student’s prior year test score (as I think everyone can agree is a good idea.) It also assumes that different demographic groups will learn different amounts in a given year, and assigns a school-level random effect. The result is that, as was much ballyhooed at the time of the release of the data, the average teacher rating for a given school is roughly the same, no matter whether the school is performing great or terribly. The headline from this was “excellent teachers spread evenly across the city’s schools,” rather than “the specification of these models assume that excellent teachers are

3 0.16862799 1620 andrew gelman stats-2012-12-12-“Teaching effectiveness” as another dimension in cognitive ability

Introduction: I’m not a great teacher. I can get by because I work hard and I know a lot, and for some students my classes are just great, but it’s not a natural talent of mine. I know people who are amazing teachers, and they have something that I just don’t have. I wrote that book, Teaching Statistics: A Bag of Tricks (with Deb Nolan) because I’m not a good teacher and hence need to develop all sorts of techniques to be able to do what good teachers can do without even trying. I’m not proud of being mediocre at teaching. I don’t think that low teaching skill is some sort of indicator that I’m a great researcher. The other thing about teaching ability is that I think it’s hard to detect without actually seeing someone teach a class. If you see me give a seminar presentation or even a guest lecture, you’d think I’m an awesome teacher. But, actually, no. I’m an excellent speaker, not such a great teacher. This all came to mind when I received the following email from anthropologist Hen

4 0.15950033 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness

Introduction: In discussing the ongoing Los Angeles Times series on teacher effectiveness, Alex Tabarrok and I both were impressed that the newspaper was reporting results on individual teachers, moving beyond the general research findings (“teachers matter,” “KIPP really works, but it requires several extra hours in the school day,” and so forth) that we usually see from value-added analyses in education. My first reaction was that the L.A. Times could get away with this because, unlike academic researchers, they can do whatever they want as long as they don’t break the law. They don’t have to answer to an Institutional Review Board. (By referring to this study by its publication outlet rather than its authors, I’m violating my usual rule (see the last paragraph here). In this case, I think it’s ok to refer to the “L.A. Times study” because what’s notable is not the analysis (thorough as it may be) but how it is being reported.) Here I’d like to highlight a few other things that came up in our

5 0.15028784 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

Introduction: Stuart Buck writes: I have a question about fixed effects vs. random effects. Amongst economists who study teacher value-added, it has become common to see people saying that they estimated teacher fixed effects (via least squares dummy variables, so that there is a parameter for each teacher), but that they then applied empirical Bayes shrinkage so that the teacher effects are brought closer to the mean. (See this paper by Jacob and Lefgren, for example.) Can that really be what they are doing? Why wouldn’t they just run random (modeled) effects in the first place? I feel like there’s something I’m missing. My reply: I don’t know the full story here, but I’m thinking there are two goals, first to get an unbiased estimate of an overall treatment effect (and there the econometricians prefer so-called fixed effects; I disagree with them on this but I know where they’re coming from) and second to estimate individual teacher effects (and there it makes sense to use so-called

6 0.13137458 750 andrew gelman stats-2011-06-07-Looking for a purpose in life: Update on that underworked and overpaid sociologist whose “main task as a university professor was self-cultivation”

7 0.13122989 222 andrew gelman stats-2010-08-21-Estimating and reporting teacher effectivenss: Newspaper researchers do things that academic researchers never could

8 0.10475883 740 andrew gelman stats-2011-06-01-The “cushy life” of a University of Illinois sociology professor

9 0.10133618 606 andrew gelman stats-2011-03-10-It’s no fun being graded on a curve

10 0.091503628 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”

11 0.091307364 2279 andrew gelman stats-2014-04-02-Am I too negative?

12 0.08858116 247 andrew gelman stats-2010-09-01-How does Bayes do it?

13 0.083267458 195 andrew gelman stats-2010-08-09-President Carter

14 0.081785388 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

15 0.080621742 745 andrew gelman stats-2011-06-04-High-level intellectual discussions in the Columbia statistics department

16 0.077601805 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

17 0.075639285 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools

18 0.075238422 858 andrew gelman stats-2011-08-17-Jumping off the edge of the world

19 0.07232593 1958 andrew gelman stats-2013-07-27-Teaching is hard

20 0.071775638 209 andrew gelman stats-2010-08-16-EdLab at Columbia’s Teachers’ College
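The page does not say how the simValue column above was computed. A common choice for tfidf-based similarity is the cosine between tfidf vectors, and the following is a minimal sketch of that idea. Whitespace tokenization, the smoothed idf formula, and the function names `tfidf_vectors` and `cosine` are all assumptions made for illustration; they are not the pipeline that generated this page.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized tfidf vector (a dict of word -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    vectors = []
    for tokens in tokenized:
        if not tokens:
            vectors.append({})
            continue
        counts = Counter(tokens)
        # Term frequency times a smoothed inverse document frequency (assumed form).
        vec = {
            word: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in counts.items()
        }
        norm = math.sqrt(sum(weight * weight for weight in vec.values()))
        vectors.append({word: weight / norm for word, weight in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two vectors that are already L2-normalized."""
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())


if __name__ == "__main__":
    # Hypothetical toy documents, not the actual blog texts.
    docs = [
        "value added assessment of teachers",
        "teacher value added estimates",
        "amtrak train delays",
    ]
    vecs = tfidf_vectors(docs)
    for i in range(len(docs)):
        print(i, round(cosine(vecs[0], vecs[i]), 3))
```

With this definition a document compared with itself scores 1.0, which matches the same-blog row in the tfidf list, and documents with no words in common score 0.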


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.111), (1, 0.018), (2, 0.028), (3, -0.005), (4, 0.036), (5, 0.009), (6, 0.061), (7, 0.055), (8, -0.054), (9, 0.019), (10, 0.014), (11, 0.015), (12, -0.044), (13, -0.016), (14, -0.023), (15, 0.009), (16, -0.017), (17, 0.045), (18, -0.015), (19, 0.0), (20, -0.013), (21, 0.003), (22, -0.001), (23, 0.003), (24, 0.043), (25, -0.012), (26, -0.034), (27, 0.081), (28, -0.014), (29, 0.037), (30, 0.002), (31, 0.019), (32, 0.025), (33, -0.028), (34, 0.007), (35, -0.006), (36, 0.009), (37, 0.003), (38, 0.037), (39, -0.033), (40, 0.052), (41, 0.02), (42, -0.024), (43, 0.009), (44, -0.034), (45, 0.019), (46, 0.02), (47, 0.014), (48, 0.01), (49, -0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94285095 1274 andrew gelman stats-2012-04-21-Value-added assessment political FAIL

2 0.72353923 226 andrew gelman stats-2010-08-23-More on those L.A. Times estimates of teacher effectiveness

Introduction: In discussing the ongoing Los Angeles Times series on teacher effectiveness, Alex Tabarrok and I both were impressed that the newspaper was reporting results on individual teachers, moving beyond the general research findings (“teachers matter,” “KIPP really works, but it requires several extra hours in the school day,” and so forth) that we usually see from value-added analyses in education. My first reaction was that the L.A. Times could get away with this because, unlike academic researchers, they can do whatever they want as long as they don’t break the law. They don’t have to answer to an Institutional Review Board. (By referring to this study by its publication outlet rather than its authors, I’m violating my usual rule (see the last paragraph here). In this case, I think it’s ok to refer to the “L.A. Times study” because what’s notable is not the analysis (thorough as it may be) but how it is being reported.) Here I’d like to highlight a few other things that came up in our

3 0.6889962 1350 andrew gelman stats-2012-05-28-Value-added assessment: What went wrong?

Introduction: Jacob Hartog writes the following in reaction to my post on the use of value-added modeling for teacher assessment: What I [Hartog] think has been inadequately discussed is the use of individual model specifications to assign these teacher ratings, rather than the zone of agreement across a broad swath of model specifications. For example, the model used by NYCDOE doesn’t just control for a student’s prior year test score (as I think everyone can agree is a good idea.) It also assumes that different demographic groups will learn different amounts in a given year, and assigns a school-level random effect. The result is that, as was much ballyhooed at the time of the release of the data, the average teacher rating for a given school is roughly the same, no matter whether the school is performing great or terribly. The headline from this was “excellent teachers spread evenly across the city’s schools,” rather than “the specification of these models assume that excellent teachers are

4 0.67722625 1620 andrew gelman stats-2012-12-12-“Teaching effectiveness” as another dimension in cognitive ability

Introduction: I’m not a great teacher. I can get by because I work hard and I know a lot, and for some students my classes are just great, but it’s not a natural talent of mine. I know people who are amazing teachers, and they have something that I just don’t have. I wrote that book, Teaching Statistics: A Bag of Tricks (with Deb Nolan) because I’m not a good teacher and hence need to develop all sorts of techniques to be able to do what good teachers can do without even trying. I’m not proud of being mediocre at teaching. I don’t think that low teaching skill is some sort of indicator that I’m a great researcher. The other thing about teaching ability is that I think it’s hard to detect without actually seeing someone teach a class. If you see me give a seminar presentation or even a guest lecture, you’d think I’m an awesome teacher. But, actually, no. I’m an excellent speaker, not such a great teacher. This all came to mind when I received the following email from anthropologist Hen

5 0.64603472 222 andrew gelman stats-2010-08-21-Estimating and reporting teacher effectivenss: Newspaper researchers do things that academic researchers never could

Introduction: Alex Tabarrok reports on an analysis from the Los Angeles Times of teacher performance (as measured by so-called value-added analysis, which basically compares teachers based on their students’ average test scores at the end of the year, after controlling for pre-test scores). It’s well known that some teachers are much better than others, but, as Alex points out, what’s striking about the L.A. Times study is that they are publishing the estimates for individual teachers. For example, this: Nice graphics, too. To me, this illustrates one of the big advantages of research in a non-academic environment. If you’re writing an article for the L.A. Times, you can do what you want (within the limits of the law). If you’re doing the same research study at a university, there are a million restrictions. For example, from an official document, “The primary purpose of an Institutional Review Board (IRB) is to protect the rights and welfare of human subjects participati

6 0.62118691 606 andrew gelman stats-2011-03-10-It’s no fun being graded on a curve

7 0.61327749 542 andrew gelman stats-2011-01-28-Homework and treatment levels

8 0.60197622 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”

9 0.5985198 874 andrew gelman stats-2011-08-27-What’s “the definition of a professional career”?

10 0.5924896 484 andrew gelman stats-2010-12-24-Foreign language skills as an intrinsic good; also, beware the tyranny of measurement

11 0.58464342 344 andrew gelman stats-2010-10-15-Story time

12 0.58416986 452 andrew gelman stats-2010-12-06-Followup questions

13 0.58359981 2257 andrew gelman stats-2014-03-20-The candy weighing demonstration, or, the unwisdom of crowds

14 0.57778895 2202 andrew gelman stats-2014-02-07-Outrage of the week

15 0.57522982 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

16 0.57378054 1424 andrew gelman stats-2012-07-22-Extreme events as evidence for differences in distributions

17 0.57211214 2271 andrew gelman stats-2014-03-28-What happened to the world we knew?

18 0.56948602 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer

19 0.56739444 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

20 0.56635535 1980 andrew gelman stats-2013-08-13-Test scores and grades predict job performance (but maybe not at Google)


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.039), (5, 0.02), (9, 0.018), (16, 0.107), (24, 0.138), (28, 0.241), (63, 0.071), (99, 0.247)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94215769 1901 andrew gelman stats-2013-06-16-Evilicious: Why We Evolved a Taste for Being Bad

Introduction: The other day, a friend told me that when he saw me blogging on Noam Chomsky, he was surprised not to see any mention of disgraced primatologist Marc Hauser. I was like, whaaaaaa? I had no idea these two had any connection. In fact, though, they wrote papers together. This made me wonder what Chomsky thought of Hauser’s data scandal. I googled *marc hauser noam chomsky* and the first item that came up was this, from July 2011, reported by Tom Bartlett: I [Bartlett] asked Chomsky for his comment on the Hauser resignation and he e-mailed the following: Mark Hauser is a fine scientist with an outstanding record of accomplishment. His resignation is a serious loss for Harvard, and given the nature of the attack on him, for science generally. Chomsky is a mentor of Hauser so I can’t fault Chomsky for defending the guy. But why couldn’t he have stuck with something more general, something like, “I respect and admire Mark Hauser and am not aware of any improprieties in his w

same-blog 2 0.92587006 1274 andrew gelman stats-2012-04-21-Value-added assessment political FAIL

3 0.86779785 166 andrew gelman stats-2010-07-27-The Three Golden Rules for Successful Scientific Research

Introduction: A famous computer scientist, Edsger W. Dijkstra, was writing short memos on a daily basis for most of his life. His memo archive contains a little over 1300 memos. I guess today he would be writing a blog, although his memos do tend to be slightly more profound than what I post. Here are the rules (follow link for commentary), which I tried to summarize: Pursue quality and challenge, avoid routine. (“Raise your quality standards as high as you can live with, avoid wasting your time on routine problems, and always try to work as closely as possible at the boundary of your abilities. Do this, because it is the only way of discovering how that boundary should be moved forward.”) When pursuing social relevance, never compromise on scientific soundness. (“We all like our work to be socially relevant and scientifically sound. If we can find a topic satisfying both desires, we are lucky; if the two targets are in conflict with each other, let the requirement of scientific sou

4 0.82801646 747 andrew gelman stats-2011-06-06-Research Directions for Machine Learning and Algorithms

Introduction: After reading this from John Langford: The Deep Learning problem remains interesting. How do you effectively learn complex nonlinearities capable of better performance than a basic linear predictor? An effective solution avoids feature engineering. Right now, this is almost entirely dealt with empirically, but theory could easily have a role to play in phrasing appropriate optimization algorithms, for example. Jimmy asks: Does this sound related to modeling the deep interactions you often talk about? (I [Jimmy] never understand the stuff on hunch, but thought that might be so?) My reply: I don’t understand that stuff on hunch so well either–he uses slightly different jargon than I do! That said, it looks interesting and important so I’m pointing you all to it.

5 0.82719672 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!

Introduction: I was told about an organization called Reproducibility Initiative. They tell me they are trying to make what was described in our “50 shades of gray” post standard across all of science, particularly areas like cancer research. I don’t know anything else about them, but that sounds like a good start! Here’s the ad: Data Scientist: Science Exchange, Palo Alto, CA Science Exchange is an innovative start-up with a mission to improve the efficiency and quality of scientific research. This Data Science position is critical to our mission. Our ideal candidate has the ability to collect and normalize data from multiple sources. This information will be used to drive marketing and product decisions, as well as fuel many of the features of Science Exchange. Desired Skills & Experience Experience with text mining, entity extraction and natural language processing is essential Experience scripting with either Python or R Experience running complex statistical analyses on l

6 0.81841528 835 andrew gelman stats-2011-08-02-“The sky is the limit” isn’t such a good thing

7 0.80839467 2119 andrew gelman stats-2013-12-01-Separated by a common blah blah blah

8 0.796121 351 andrew gelman stats-2010-10-18-“I was finding the test so irritating and boring that I just started to click through as fast as I could”

9 0.79596835 2354 andrew gelman stats-2014-05-30-Mmm, statistical significance . . . Evilicious!

10 0.79205382 505 andrew gelman stats-2011-01-05-Wacky interview questions: An exploration into the nature of evidence on the internet

11 0.79018259 1255 andrew gelman stats-2012-04-10-Amtrak sucks

12 0.78653216 1812 andrew gelman stats-2013-04-19-Chomsky chomsky chomsky chomsky furiously

13 0.78570443 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

14 0.78032494 1521 andrew gelman stats-2012-10-04-Columbo does posterior predictive checks

15 0.76769674 1650 andrew gelman stats-2013-01-03-Did Steven Levitt really believe in 2008 that Obama “would be the greatest president in history”?

16 0.76358181 2026 andrew gelman stats-2013-09-16-He’s adult entertainer, Child educator, King of the crossfader, He’s the greatest of the greater, He’s a big bad wolf in your neighborhood, Not bad meaning bad but bad meaning good

17 0.76263547 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns

18 0.76251292 1484 andrew gelman stats-2012-09-05-Two exciting movie ideas: “Second Chance U” and “The New Dirty Dozen”

19 0.76231474 1621 andrew gelman stats-2012-12-13-Puzzles of criminal justice

20 0.76173246 503 andrew gelman stats-2011-01-04-Clarity on my email policy