Schooled

A Controversial Teacher-Evaluation Method Is Heading to Court. Here’s Why That’s a Huge Deal.

New York Gov. Andrew Cuomo is a big fan of using test scores to evaluate teachers. 

Photo by Spencer Platt/Getty Images

Sheri Lederman has taught elementary school in Great Neck, New York, for 18 years. She had always been “highly regarded as an educator” (her superintendent’s words), with more than two-thirds of her students scoring as proficient or advanced in 2012–13. But then, the next year, although roughly the same percentage of her students met or exceeded state standards, Lederman was slapped with an “ineffective” rating and given one out of 20 points. What happened? Understanding that requires a quick briefing on the labyrinthine, data-driven method used to evaluate her. It is known as student-growth percentile, or SGP, and it measures a student’s growth from one year to the next.* It is similar to another evaluation method beloved by education reformers that’s known as value-added modeling, or VAM. If Lederman’s lawsuit succeeds, both of these score-based evaluation systems could be under serious threat.

Proponents tout SGP and VAM as a means of assessing a teacher’s effectiveness that’s far more accurate and objective than previous systems that scored teachers exclusively on principal or student evaluations, or on standardized test scores in a vacuum. Critics, including Lederman, believe that these scoring methods are irrational and potentially injurious to even the best teachers.

These systems work in part by assessing students’ year-to-year progress: by measuring the same cohort of students’ scores as they move up a grade as well as comparing that cohort with the previous year’s scores. In deeming whether a teacher should be rated as “highly effective,” “effective,” “developing,” or “ineffective,” the models also attempt to control for factors outside a teacher’s control, like demographics (not that Great Neck is exactly rife with disadvantaged youth) and past test scores. The goal of these models is to isolate how much value each teacher adds to the equation.
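For readers who want to see the basic idea in miniature, here is a rough sketch of a growth-percentile calculation. It is a toy illustration, not New York's actual model: the function names, the five-point "similar prior score" window, and the use of a median to summarize a class are all assumptions made for this example, and real systems rely on far more elaborate statistics and adjust for more factors.

```python
from bisect import bisect_left
from statistics import median


def growth_percentile(student, statewide, window=5):
    """Rank a student's current score among peers with similar prior scores.

    `student` and each entry of `statewide` are (prior_score, current_score)
    pairs. The `window` rule for picking peers is a hypothetical simplification.
    Returns a value from 0 to 100, or None if no comparable peers exist.
    """
    prior, current = student
    peer_scores = sorted(c for p, c in statewide if abs(p - prior) <= window)
    if not peer_scores:
        return None
    # Count peers who scored strictly lower than this student this year.
    below = bisect_left(peer_scores, current)
    return 100.0 * below / len(peer_scores)


def teacher_growth_score(class_students, statewide):
    """Summarize a class by the median of its students' growth percentiles.

    Using the median here is an assumption for illustration only.
    """
    percentiles = [growth_percentile(s, statewide) for s in class_students]
    percentiles = [p for p in percentiles if p is not None]
    return median(percentiles) if percentiles else None


# Made-up (prior, current) scores, purely for demonstration.
statewide = [(80, 78), (80, 82), (82, 85), (78, 80), (81, 90), (50, 55), (52, 60)]
classroom = [(80, 85), (50, 60)]
print(teacher_growth_score(classroom, statewide))
```

In this sketch, a student is judged not on the raw score but on how that score compares with those of students who started from a similar place, which is why two classes with the same proficiency rate can end up with very different ratings.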

But is it fair? Lederman doesn’t think so, and she’s suing the New York State Education Department to scrap her score. In the opening paragraph of her lawsuit, she claims that the score she received was “arbitrary and capricious and an abuse of discretion,” and one that could damage her future career prospects. (In New York, a teacher who receives two consecutive “ineffective” ratings could be placed on the fast track to dismissal.) Oral arguments begin Wednesday in a case that education reformers and their opponents are watching closely.

These approaches to teacher assessment are relatively new (major school systems like New York, Chicago, and the District of Columbia started deploying VAM at the beginning of the decade, and now more than half the states use some form of it), and they have been rife with controversy from the beginning. While classroom observations do play a role in a teacher’s overall score, one recurring complaint is that, because these metrics hinge so much on test-score growth, students who score high one year may have little room for improvement the next year, which could hurt their teachers’ ratings. These evaluations often measure students’ predicted test scores against their actual scores, which can lead to some lopsided results, like the much-discussed predicament of one Florida teacher. And, as a Schooled story illustrated last month, because the testing is generally limited to math and English, teachers of, say, art or science could be punished for students’ performance in subjects they don’t even teach.

So the future of using student test scores to evaluate teacher performance—which the Gates Foundation has found to be effective, and which education reformers, including Education Secretary Arne Duncan and New York’s own Gov. Andrew Cuomo, have doggedly embraced despite concerns—may hang in the balance with this lawsuit. If Lederman prevails, New York may have to start from scratch and come up with an entirely new teacher evaluation system. And, to adapt Metternich’s line about France and Europe: When New York sneezes, the rest of the United States catches a cold.

*Correction, Aug. 12, 2015: This post originally misstated that New York uses value-added modeling. It uses student-growth percentiles. The post has been revised to clarify the difference between student-growth percentiles and value-added modeling.