In March 2007, Nick Pearce was running the thinktank the Institute for Public Policy Research. That month, one of his young interns, Amelia Zollner, was killed by a lorry while cycling in London. Amelia was a bright, energetic Cambridge graduate, who worked at University College London. She was waiting at traffic lights when a lorry crushed her against a fence and dragged her under its wheels.
Two years later, in March 2009, Pearce was head of prime minister Gordon Brown's Number 10 policy unit. He had not forgotten Amelia, and he wondered aloud to a colleague whether publishing raw data on bicycle accidents would help. Perhaps someone might then build a website that would help cyclists stay safe?
The first dataset was put up on 10 March. Events then moved quickly. The file was promptly translated by helpful web users who came across it online, making it compatible with mapping applications.
A day later, a developer emailed to say that he had "mashed up" the data on Google Maps. (A mashup combines two or more sets of data.) The resulting website allowed anyone to look up a journey and instantly see any accident spots along the way.
Within 48 hours, the data had been turned from a pile of figures into a resource that could save lives and that could help people to pressure government to deal with black spots.
Now, imagine if the government had produced a bicycle accident website in the conventional way. Progress would have been glacial. The government would have drawn up requirements, put it out to tender and eventually gone for the lowest bidder. Instead, within two days, raw data had been transformed into a powerful public service.
Politicians, entrepreneurs, academics, even bureaucrats spend an awful lot of time these days lecturing each other about data. There is big data, personal data, open data, aggregate data and anonymised data. Each variety has issues: where does it originate? Who owns it? What is it worth?