As a nation, we are changing the way we evaluate teachers, moving from a patchwork of weak and haphazard approaches to whole data-driven systems with dramatically high stakes. From Memphis to Chicago to Baltimore, districts and states are working to develop these systems, acknowledging the crucial role played by teachers and pushed toward greater accountability by competitive federal grant programs. At least 40 states have applied to the U.S. Department of Education for waivers that, in exchange for more flexibility on No Child Left Behind provisions, require comprehensive teacher evaluation systems (U.S. Department of Education, n.d.).
As they embark on this challenging task, districts can take some lessons from the successes and shortcomings of an evaluation system that is often held up as a model: the IMPACT system in Washington, D.C. The controversial legacy of former schools chancellor Michelle Rhee, IMPACT sets clear expectations for instruction and holds teachers to well-defined standards of performance. Now into its third year, the program appears to be meeting its goals of rewarding effective teachers and eliminating educators it considers incompetent. And it has given the public reason to have more faith in its school system. But IMPACT, which ranks teachers on several measures, has earned plenty of criticism from teachers who say it is rigid and punitive and forces them to teach in an overly prescriptive way. More important, teachers say that in its rush to strengthen accountability, IMPACT misses what they say they need most — greater support and more meaningful professional development.
Like most public school systems, the District of Columbia Public Schools was badly in need of a new way to ensure that it was putting a good teacher in every classroom. In 2007, when then-mayor Adrian Fenty took control of the city schools, the district’s scores on the National Assessment of Educational Progress were among the lowest in the nation, and its black-white achievement gap was the largest of 11 urban districts that reported their results (National Center for Education Statistics, n.d.). And yet the district’s evaluation system, which called for observations just once a year and graded teachers on a short checklist, rated 95% of teachers satisfactory or above.
A NEW SYSTEM
The architects of the new system started with the basics: defining good teaching. The Teaching and Learning Framework, as the resulting document is called, was a way for principals, teachers, and administrators to work together to improve instruction. Instead of focusing on what to teach, they concentrated on how to teach, with specific directions that spanned subject areas. (See “Elements of good teaching” in box at right.) “We focused first on pedagogy, whereas most other reforms focused on curriculum,” said Scott Thompson, director of teacher effectiveness strategy for DCPS. “You could have the greatest curriculum in the world, but if the teachers are ineffective in conveying it, then it’s not going to matter.”
Defining good teaching is one thing. Implementing an evaluation system around it, as D.C. and other school systems have found, is a far more complicated task. With input from teachers, administrators, and policy experts, D.C. produced a system that rates teachers on a combination of factors, some weighted more heavily than others. For teachers in grades that take standardized tests, classroom performance, as judged by the teaching and learning rubric, counts for 35%; student test scores (value-added data) count for 50%; commitment to the school community gets 10%; and school value-added data, a measure of the school’s overall impact on student learning, is worth another 5%.
Because teachers in nontesting grades do not receive value-added data, their classroom performance counts for 75% of their score. A component called “teacher-assessed student achievement data” counts for 10%, and the other factors count the same as they do for the other teachers. For both categories of teachers, the final score is then adjusted up or down based on a factor called “core professionalism,” which covers things like coming to work on time.
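To make the weighting concrete, the arithmetic can be sketched in a few lines of Python. This is an illustration only, not the district's actual scoring procedure. It assumes the classroom-performance weight for tested grades is 35% (the value that makes the four weights sum to 100%), that every component is scored on the same 1-to-4 scale used for classroom observations, and that the core professionalism adjustment is a simple number added to or subtracted from the weighted total. The function and variable names are hypothetical.

```python
# Component weights as described in the article, expressed as fractions.
# Assumption: classroom performance is 35% for teachers in tested grades.
TESTED_WEIGHTS = {
    "classroom_performance": 0.35,
    "individual_value_added": 0.50,
    "commitment_to_school_community": 0.10,
    "school_value_added": 0.05,
}

NONTESTED_WEIGHTS = {
    "classroom_performance": 0.75,
    "teacher_assessed_student_achievement": 0.10,
    "commitment_to_school_community": 0.10,
    "school_value_added": 0.05,
}


def composite_score(component_scores, weights, professionalism_adjustment=0.0):
    """Return the weighted average of component scores plus an adjustment.

    component_scores: dict mapping component name to a score (assumed 1-4).
    weights: dict mapping component name to its fraction of the total.
    professionalism_adjustment: hypothetical additive adjustment (may be negative).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    weighted = sum(weights[name] * component_scores[name] for name in weights)
    return weighted + professionalism_adjustment


# Hypothetical teacher in a tested grade: strong observations, weak test growth.
example_scores = {
    "classroom_performance": 3.0,
    "individual_value_added": 2.0,
    "commitment_to_school_community": 4.0,
    "school_value_added": 3.0,
}
print(round(composite_score(example_scores, TESTED_WEIGHTS), 2))
```

Because value-added data carry half the weight for teachers in tested grades, the low test-growth score in this made-up example pulls the composite well below the observation score, which is the dynamic teachers in the article object to.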
The value-added measure has been as polarizing in D.C. as it has been elsewhere because it ties teachers’ ratings to factors that teachers say they often can’t control. And the scores have been undermined by reports of cheating by teachers and administrators on the tests on which the measure is substantially based (Gillum & Bello, 2011). But the classroom observations are just as controversial. Under IMPACT, every teacher in the district is observed five times a year: three times by a school administrator (usually the principal) and twice by a “master educator,” an outside teacher trained in the same discipline.
The observations take 30 minutes, and all but one of the administrator visits are unannounced. Based on them, teachers are ranked from 1 to 4. Critics say that 30 minutes is too short a time for an evaluator to assess performance, and that the assessment of that performance is subjective. The evaluations allow for virtually no input from teachers and provide no way for the instructor to put the lesson or her students in context.
Combined with other factors, all these ratings produce an overall IMPACT score that translates into highly effective, effective, minimally effective, or ineffective. A rating of ineffective means the teacher is immediately subject to dismissal; a rating of minimally effective gives the teacher one year to improve or be fired; effective wins the teacher a standard contract raise; a highly effective rating qualifies the teacher for a bonus of up to $25,000.
At the end of IMPACT’s second year, roughly 17% of teachers were eligible for bonuses ranging from $3,000 to $25,000. With a second consecutive year of highly effective ratings, 7% were eligible to have a base salary increase of up to $27,000. Six percent of teachers were fired: 2% who were rated ineffective and 4% who received minimally effective ratings for the second year in a row. Thus it can be said that IMPACT has served a purpose as a sorter, separating the good from the bad...