On June 23rd, New York City Mayor Michael Bloomberg announced an educational victory. Test scores in the city had risen dramatically, he declared. But, rather than celebrate, skeptics, such as education historian Diane Ravitch, challenged the accuracy of the results. That day's story in the New York Sun, "Mayor Sees a Test Scores Triumph; Or Is It a Case of Inflation of Results?" captured a debate that’s raging in New York City and across the country: What do annual state tests, mandated by the federal No Child Left Behind Act (NCLB), really tell us about student achievement and how should these scores be used to make decisions about teacher and school performance?
Enter Measuring Up, a new book on educational testing by Harvard professor and Education Sector research advisory board member Daniel Koretz. Koretz details how the test scores that we use to judge schools' performance—and that we believe to be precise, objective, and scientific—are in reality only incomplete representations of students' knowledge and are subject to a number of assumptions and a great deal of human judgment. With each chapter of the book, the seemingly simple metric that undergirds our current school accountability system—a standardized test score—becomes more and more complex, casting doubt on policies or parental choices that use a single score as the sole determinant of school quality.
Measuring Up, however, is not an anti-testing screed, although those who reject any type of standardized testing will likely use the book's arguments about the complexity and imprecise nature of testing to support their views. As a testing expert and one of the field’s leading researchers, Koretz clearly doesn't want to eliminate standardized testing. His main purpose is to help readers understand how best to think about the evidence that test scores provide. Thus, even as he affirms that testing can provide valuable evidence to help gauge student learning, he attempts to objectively describe the limitations of large-scale standardized testing. Overall, Koretz succeeds. The book is accessible and well-written—especially for a subject that is usually addressed in impenetrable technical terms.
Using simple and illuminating analogies, Koretz explains a number of complex testing issues. He refers to a bathroom scale, for instance, to discuss measurement error, which can include inconsistency among items on different forms of the same test, fluctuations in student performance, and inconsistency in scoring. Fluctuations in student performance are analogous to fluctuations in weight t
He goes on to show why such measurement errors matter to our understanding of student scores. Just as any one weigh-in on an inaccurate scale may be off by a pound or two, measurement error requires that testing officials report scores with a range of uncertainty. A reported score of 247, for example, may actually reflect a possible range of scores from 240 to 255. But if there is a fixed or "cut" score that determines pass/fail, say in this case 249, then the student with a score of 247 will fail, even if statistically there is a reasonable chance that the student's true ability lies above the pass/fail line.
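To make the cut-score arithmetic concrete, here is a minimal sketch in Python using the numbers from the example above (reported score 247, range 240 to 255, cut score 249). It rests on two assumptions that the review does not state: that the reported range is roughly a 95 percent band, and that measurement error is normally distributed around the reported score. The variable names are illustrative only.

```python
from statistics import NormalDist

reported = 247                    # reported score from the example
band_low, band_high = 240, 255    # reported range of uncertainty
cut = 249                         # pass/fail cut score

# Assumption: the band is an approximate 95% interval, so its half-width
# is about 1.96 standard errors of measurement (SEM).
sem = (band_high - band_low) / (2 * 1.96)

# Assumption: the student's true score is normally distributed around
# the reported score with standard deviation equal to the SEM.
true_score = NormalDist(mu=reported, sigma=sem)

# Probability that the true score is at or above the cut score.
p_above_cut = 1 - true_score.cdf(cut)

print(f"Estimated SEM: {sem:.2f}")
print(f"Chance the true score clears the cut: {p_above_cut:.0%}")
```

Under these assumptions the probability comes out at roughly three in ten: the student is recorded as failing although there is a substantial chance that his or her true ability lies above the line. A different error model would give a different figure.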
Measuring Up also helps explain an interesting paradox in
While the average scaled scores on this past year's ELA test remained the same, the percentage of students reported as "proficient" on the state's standards increased from 68 percent to 71 percent. Thus, depending on the reporting method (average scaled scores or proficiency levels),
Koretz also shows how proficiency-based reporting can distort comparisons between groups of students. Each group could have identical gains, but because of the distribution of gains within the group and the placement of the proficiency cut scores, proficiency-based reporting could make it appear that one group did much better than another. Koretz recommends that states report both scaled scores and proficiency levels and that policymakers review both to fully understand test results.
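A small numerical sketch shows how this distortion can arise. All of the scores, gains, and the cut score below are hypothetical, invented for illustration and not taken from the book. Two groups start with the same scores and post the same average gain, but in one group the gains fall on students just below the cut score.

```python
CUT = 250  # hypothetical proficiency cut score


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pct_proficient(scores, cut=CUT):
    """Percentage of scores at or above the cut score."""
    return 100 * sum(s >= cut for s in scores) / len(scores)


# Hypothetical data: both groups start from the same scaled scores.
start = [230, 240, 245, 248, 260]

# Same average gain (2 points), distributed differently within each group.
gains_a = [0, 0, 5, 5, 0]   # gains go to students just below the cut
gains_b = [5, 5, 0, 0, 0]   # gains go to the lowest-scoring students

end_a = [s + g for s, g in zip(start, gains_a)]
end_b = [s + g for s, g in zip(start, gains_b)]

for name, gains, end in (("A", gains_a, end_a), ("B", gains_b, end_b)):
    print(
        f"Group {name}: mean gain {mean(gains):.1f}, "
        f"proficient {pct_proficient(start):.0f}% -> {pct_proficient(end):.0f}%"
    )
```

Measured by average scaled score, the two groups improved equally. Measured by percent proficient, group A jumps from 20 percent to 60 percent while group B stays at 20 percent, which is why reviewing both measures gives a fuller picture.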
With such an emphasis on the complexity and limitations of standardized testing, it is no surprise that Koretz strongly condemns NCLB's test-based accountability system. He argues in favor of an "effective system of accountability, one that maximizes real gains and minimizes bogus gains and other negative side effects … analogous to an FDA-approved drug: both effective and safe." But, while he clearly describes the limitations of testing, he doesn’t suggest how to create the effective accountability system he desires. The book literally ends with his advice to "Let the buyer beware."
Still, Measuring Up is an excellent primer on large-scale, standardized testing. Supporters of NCLB may resist the book because of Koretz's conclusions about the law, but that would be a mistake. The book is a serious contribution to the debate, eschewing the simplistic, polarized rhetoric around standardized testing that creates a false dichotomy between testing and no testing. Serious reformers, those who want to understand the complexities of testing and consider how best to use large-scale standardized tests as a component of an effective accountability system for education, should read Measuring Up.