What Statistics Can and Can’t Tell Us About Ourselves

Hannah Fry:

The dangers of making individual predictions from our collective characteristics were aptly demonstrated in a deal struck by the French lawyer André-François Raffray in 1965. He agreed to pay a ninety-year-old woman twenty-five hundred francs every month until her death, whereupon he would take possession of her apartment in Arles.

At the time, the average life expectancy of French women was 74.5 years, and Raffray, then forty-seven, no doubt thought he’d negotiated himself an auspicious contract. Unluckily for him, as Bill Bryson recounts in his new book, “The Body,” the woman was Jeanne Calment, who went on to become the oldest person on record. She survived for thirty-two years after their deal was signed, outliving Raffray, who died at seventy-seven. By then, he had paid more than twice the market value for an apartment he would never live in.

Raffray learned the hard way that people are not well represented by the average. As the mathematician Ian Stewart points out in “Do Dice Play God?” (Basic), the average person has one breast and one testicle. In large groups, the natural variability among human beings cancels out, the random zig being countered by the random zag; but that variability means that we can’t speak with certainty about the individual—a fact with wide-ranging consequences.

Every day, millions of people, David Spiegelhalter included, swallow a small white statin pill to reduce the risk of heart attack and stroke. If you are one of those people, and go on to live a long and happy life without ever suffering a heart attack, you have no way of knowing whether your daily statin was responsible or whether you were never going to have a heart attack in the first place. Of a thousand people who take statins for five years, the drugs will help only eighteen to avoid a major heart attack or stroke. And if you do find yourself having a heart attack, you’ll never know whether it was delayed by taking the statin. “All I can ever know,” Spiegelhalter writes, “is that on average it benefits a large group of people like me.”

That’s the rule with preventive drugs: for most individuals, most of those drugs won’t do anything. The fact that they produce a collective benefit makes them worth taking. But it’s a pharmaceutical form of Pascal’s wager: you may as well act as though God were real (and believe that the drugs will work for you), because the consequences otherwise outweigh the inconvenience.

There is so much that, on an individual level, we don’t know: why some people can smoke and avoid lung cancer; why one identical twin will remain healthy while the other develops a disease like A.L.S.; why some otherwise similar children flourish at school while others flounder. Despite the grand promises of Big Data, uncertainty remains so abundant that specific human lives remain boundlessly unpredictable. Perhaps the most successful prediction engine of the Big Data era, at least in financial terms, is the Amazon recommendation algorithm. It’s a gigantic statistical machine worth a huge sum to the company. Also, it’s wrong most of the time.

“There is nothing of chance or doubt in the course before my son,” Dickens’s Mr. Dombey says, already imagining the business career that young Paul will enjoy. “His way in life was clear and prepared, and marked out before he existed.” Paul, alas, dies at age six.