The Unexamined Model Is Not Worth Trusting (We know best…)

Chris von Csefalvay:

In early March, British leaders planned to take a laissez-faire approach to the spread of the coronavirus. Officials would pursue “herd immunity,” allowing as many people in non-vulnerable categories as possible to catch the virus in the hope that eventually it would stop spreading. But on March 16, a report from the Imperial College Covid-19 Response Team, led by noted epidemiologist Neil Ferguson, shocked the Cabinet of the United Kingdom into a complete reversal of its plans. Report 9, titled “Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand,” used computational models to predict that, absent social distancing and other mitigation measures, Britain would suffer 500,000 deaths from the coronavirus. Even with mitigation measures in place, the report said, the epidemic “would still likely result in hundreds of thousands of deaths and health systems (most notably intensive care units) being overwhelmed many times over.” The conclusions so alarmed Prime Minister Boris Johnson that he imposed a national quarantine.

Subsequent publication of the details of the computer model that the Imperial College team used to reach its conclusions raised eyebrows among epidemiologists and specialists in computational biology and presented some uncomfortable questions about model-driven decision-making. The Imperial College model itself appeared solid. As a spatial model, it divides the area of the U.K. into small cells, then simulates various processes of transmission, incubation, and recovery over each cell. It factors in a good deal of randomness. The model is typically run tens of thousands of times, and results are averaged—a technique commonly referred to as an ensemble model.
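The ensemble approach described above — a grid of cells, stochastic transmission and recovery within and between cells, results averaged over many runs — can be sketched in miniature. Everything below (the 3×3 grid, the parameter values, the crude neighbour coupling and edge handling) is an illustrative assumption, not the actual structure of the Imperial College model:

```python
import random

def run_once(n_cells=9, pop_per_cell=100, beta=0.3, gamma=0.1,
             steps=60, rng=None):
    """One stochastic run of a toy spatial SIR model: each cell on a
    3x3 grid tracks (S, I, R) counts, and infection pressure spills
    over weakly to neighbouring cells. Returns the peak number of
    simultaneous infections. (Edge handling is kept crude for brevity.)
    """
    rng = rng or random.Random()
    side = int(n_cells ** 0.5)
    # Seed a single infection in cell 0.
    S = [pop_per_cell - (1 if c == 0 else 0) for c in range(n_cells)]
    I = [1 if c == 0 else 0 for c in range(n_cells)]
    R = [0] * n_cells
    peak = sum(I)
    for _ in range(steps):
        new_inf = [0] * n_cells
        new_rec = [0] * n_cells
        for c in range(n_cells):
            # Local infectives plus a small contribution from neighbours.
            neigh = [c - 1, c + 1, c - side, c + side]
            pressure = I[c] + 0.1 * sum(I[n] for n in neigh
                                        if 0 <= n < n_cells)
            # Per-susceptible infection probability this step.
            p_inf = 1 - (1 - beta / pop_per_cell) ** pressure
            # Per-individual Bernoulli draws make each run stochastic.
            new_inf[c] = sum(rng.random() < p_inf for _ in range(S[c]))
            new_rec[c] = sum(rng.random() < gamma for _ in range(I[c]))
        for c in range(n_cells):
            S[c] -= new_inf[c]
            I[c] += new_inf[c] - new_rec[c]
            R[c] += new_rec[c]
        peak = max(peak, sum(I))
    return peak

def ensemble_peak(runs=50, seed=42):
    """Ensemble step: average the peak infection count over many
    independent stochastic runs, as the article describes."""
    master = random.Random(seed)
    return sum(run_once(rng=random.Random(master.random()))
               for _ in range(runs)) / runs
```

The point of the ensemble average is that any single stochastic run can be an outlier (an outbreak can fizzle or explode by chance); averaging tens of thousands of runs, as the article notes the Imperial College team did, smooths that randomness into a stable estimate.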

In a tweet sent in late March, Ferguson—then still one of the leading voices within the U.K.’s Scientific Advisory Group for Emergencies (SAGE), tasked with handling the coronavirus crisis—stated that the model was implemented in “thousands of lines of undocumented” code written in C, a widely used, high-performance programming language. He refused to publish the original source code, and Imperial College has refused a Freedom of Information Act request for the original source, arguing that the public interest is not sufficiently compelling.

As Ferguson himself admits, the code was written 13 years ago, to model an influenza pandemic. This raises multiple questions: Other than Ferguson’s reputation, what did the British government have at its disposal to assess the model and its implementation? How was the model validated, and what safeguards were implemented to ensure that it was correctly applied? The recent release of an improved version of the source code does not paint a favorable picture. The code is a tangled mess of undocumented steps, with no discernible overall structure. Even experienced developers would have to make a serious effort to understand it.

I’m a virologist, and modeling complex processes is part of my day-to-day work. It’s not uncommon to see long and complex code for predicting the movement of an infection in a population, but tools exist to structure and document code properly. The Imperial College effort suggests an incumbency effect: with their outstanding reputations, the college and Ferguson were granted an authority that rested on reputation alone, not on scrutiny of their work. The code on which they based their predictions would not pass a cursory review by a Ph.D. committee in computational epidemiology.

Related: Madison K-12 experiments:

English 10

Small Learning Communities

Reading Recovery

Connected Math

Discovery Math

Madison’s taxpayer-supported K-12 school district, despite spending far more than most districts, has long tolerated disastrous reading results.

My Question to Wisconsin Governor Tony Evers on Teacher Mulligans and our Disastrous Reading Results