Fast forward: after attending a presentation at this year’s ASA in New York last week, I’ve come to question my assessment, and theirs. At the time, I was looking at percentage-point gains over time, and we know these are a poor way to assess effect sizes, since they do not take into account the amount of variation in the sample. Once the gains are standardized, Arum and Roksa find that students tested twice, four years apart, improve their scores on the Collegiate Learning Assessment by an average of 0.46 standard deviations. Now that’s a number we can begin to seriously consider.
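The difference between a raw gain and a standardized one can be sketched with a few lines of Python. The numbers below are entirely made up for illustration (they are not CLA data); the point is only the arithmetic: divide the mean gain by the spread of the baseline scores to get a gain in standard-deviation units.

```python
# Hypothetical illustration (NOT actual CLA data): the same raw point gain
# can look large or small depending on how much scores vary in the sample.
import statistics

pre = [1050, 1120, 980, 1200, 1100, 1010, 1150, 1080]   # made-up entry scores
post = [1110, 1170, 1040, 1230, 1160, 1080, 1190, 1130] # made-up senior scores

# Raw gain: mean difference in test points.
gains = [b - a for a, b in zip(pre, post)]
mean_gain = statistics.mean(gains)

# Standardized gain: the same difference expressed in units of the
# baseline standard deviation (one common convention for effect sizes).
sd_pre = statistics.stdev(pre)
effect_size = mean_gain / sd_pre

print(f"raw mean gain: {mean_gain:.1f} points")
print(f"standardized gain: {effect_size:.2f} SD")
```

Dividing by the baseline standard deviation is one convention among several (pooled or gain-score SDs are also used), and which one a study picks changes the number it reports.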
Is a gain of 0.46 SD evidence of “limited learning” and something to sniff at? As I said back in 2009, we need a frame of reference in order to assess this. In the abstract, an effect size means little, if anything at all.
For their part, the authors point to a review of research by Ernie Pascarella and Pat Terenzini indicating that on tests given at the time, students in the 1980s gained about one standard deviation. Doesn’t that mean students learn less today than they once did, and isn’t that a problem? Actually, no.
Scores cannot simply be compared across different tests. The scales of tests differ and can be linked only by administering the same test to comparable groups. Clearly, the CLA was not administered to students attending college in the 1980s. Nor, for that matter, were students then demographically comparable to today’s students, nor were the conditions of testing the same.