Recently we discussed a debate about how much of the improvement in test scores of students in Mississippi can be attributed to a policy of holding back more students–in particular, having kids repeat third grade would be expected to improve average scores for fourth graders. Education researchers Howard Wainer, Irina Grabovsky, and Daniel Robinson expressed skepticism about claimed dramatic benefits from the Mississippi plan, but then there were good arguments on the other side. One thing is that a lot of the discussion was about what happened right after the new plan was implemented in the mid-2010s, but there have been longer-term trends in Mississippi and other states. Changes in averages are always hard to interpret because of possible compositional effects, including decisions about the age at which children start first grade, classification of students as disabled, and who’s taking the test in any given year. Also, all these comparisons are observational: as Wainer puts it, there’s no control group. On the other other hand, decisions need to be made in the absence of ironclad evidence. So I was left in a state of uncertainty.
A couple days later we learned that Wainer et al. had garbled some statistics, entirely misreporting Mississippi’s fourth and eighth grade math scores. Wainer et al. were making a general point about testing and selection, something they’d seen in various forms many times in their careers, but they were evidently not close to the data from Mississippi, even to the aggregate data that are easily available. As I discussed, I should’ve earlier been more suspicious of their claims about the math scores, given that in my earlier post I’d noticed a discrepancy between those and others’ claims.
After all this, I remain unsure what to think about Mississippi. It’s an observational comparison, there’s selection, there’s variation between states in how much they teach to the test, and at the individual level there are the spillover effects on the kids who are not held back . . . all sorts of things. On the other hand there are these long-term trends. Selection has to be explaining some of what is happening in Mississippi–if you hold kids back and give them the test later or manage to exclude them from the tested population entirely, the average scores of the remaining students should rise–but it’s hard to say how much, and at some point you have to go with the data in front of you. As is often the case, we’re not just arguing about causal effects; we’re also trying to pin down what exactly is happening.
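To make the selection arithmetic concrete, here is a minimal sketch in Python. The ten scores are made up for illustration and are not Mississippi data, and the function name `mean_after_excluding_lowest` is hypothetical. It shows only the mechanical part of the story: drop the lowest scorers from the tested pool and the average of those who remain goes up, even though no individual student's score has changed.

```python
def mean(xs):
    """Plain arithmetic mean."""
    return sum(xs) / len(xs)


def mean_after_excluding_lowest(scores, n_excluded):
    """Average score of the students who remain in the tested pool
    after the n_excluded lowest scorers are removed (for example,
    because they were held back and will take the test a year later)."""
    remaining = sorted(scores)[n_excluded:]
    return mean(remaining)


# Hypothetical scores for ten students; illustrative numbers only.
scores = [180, 190, 200, 210, 220, 230, 240, 250, 260, 270]

baseline = mean(scores)                          # all ten students tested
after = mean_after_excluding_lowest(scores, 2)   # two lowest scorers not tested

print(baseline, after)
```

With these numbers the average moves from 225 to 235 purely through composition. How large this effect is in any real state depends on who is excluded and what happens to them later, which is exactly the part that is hard to pin down from aggregate data.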
In the meantime, I received an email from another education researcher, Doug Harris, who writes:
Wainer et al. also got it wrong on the other cities like New Orleans. To quote them: “We have seen several previous K–12 education ‘miracles’ that turned out to be hoaxes. Five of them were in Houston, Atlanta, the District of Columbia, El Paso, and New Orleans . . . The New Orleans miracle was caused by a natural disaster. Hurricane Katrina tragically relocated about a third of the students who came from the poorest areas. Removing thousands of low scorers immediately raised the average test scores of the students who remained.”
Several people pointed this out to me [Harris], especially because I have been studying the New Orleans school reforms for more than 10 years. My center, the Education Research Alliance for New Orleans, has published more than 50 articles about it. Our Advisory Board includes both supporters and critics of the reforms.
When I first came to New Orleans the sharp upward trend in outcomes gave me and others good reason to think this fit the first rule. The school reforms were sparked by Hurricane Katrina, which changed the city in many ways. Many families never returned, at least not to their original homes and neighborhoods. The whole city was hit hard, but low-income neighborhoods were hit a bit harder. Given the correlation between demographics and education outcomes, it was reasonable to be concerned that changes in the population, not the school reforms, drove the change in outcomes. Recognizing the problem, I spent years trying to disentangle this.