Down the Memory Hole: Evidence on Educational Testing

Richard Phelps:

What happens to the research evidence in a scientific field when the professionals in that field do not like it?

Some naively believe, as I once did, that all scientific research is somehow accumulated and preserved. Some of it is, even if its preservation may be obscure. Many scholarly journal indexes, for example, date back to the early twentieth century, and their earliest journal contents can still be found in some dusty academic libraries or on microfiche. Other scientific research is not deliberately preserved, or even indexed, and can more easily be forsaken and forgotten.1

Research on educational testing, its uses and effects, should greatly interest the American public. A standardized test, when administered by objective third parties, is one of the few instruments for measuring what happens inside our schools that those who run the schools do not control. For several decades, most U.S. states have incorporated systemwide testing in their education programs. Then, starting in the early 2000s, the federal government intervened with systemwide testing requirements covering seven grade levels in most states. Those requirements continue today. To many, testing seems omnipresent in our public schools.

It is no secret, however, that education professors tend to be less enthusiastic than the general public about testing mandates or externally administered standardized tests.2 Nonetheless, by default our graduate schools of education, their libraries, and the scholarly journals they manage serve as the primary repositories of research on the uses and effects of educational testing.

In my “spare” time, I read research on the effect of testing on student learning. Over the years, I have reviewed thousands of studies and found several hundred that fit the requirements for a statistical meta-analysis, including hundreds of randomized controlled experiments—the “gold standard” in social science research—dating back to the 1910s. Among the many sources I found helpful were a 233-page Bibliography of Educational and Psychological Tests and Measurement from 1923 and a 1942 book by C. C. Ross, Measurement in Today’s Schools—a source that led me to many other sources.

The “scientific” study of school testing (that is, the statistical analysis of test use and its effects) dates back to the 1890s. In 1923, standardized educational tests were still relatively new but had already proliferated. The Bibliography, compiled for the U.S. Department of the Interior, lists several hundred different tests and cites several hundred more reports of their implementation.