Normal Ain't What It Used to Be
Seth Godin made an interesting point: we tend to assume that all kinds of data have a “normal distribution”, the famous bell curve shape, even when they don’t. It has become the default option in most people’s heads: the one thing that sticks from the statistics taught at school.
The problem, of course, is that lots of things don’t follow a bell curve. As Steven Strogatz writes in The Joy of x:
“Curiously, these types of distributions are barely mentioned in the elementary statistics textbooks, and when they are, they’re usually trotted out as pathological specimens. It’s outrageous.”
That bias explains why the bell curve is called the “normal” curve.
But the real world is different. A flood or earthquake can cause a huge spike in damage in a particular area, raising the cost to insurance companies. Stock markets can make massive moves on a single day rather than moving steadily. The number of deaths in wars follows the same pattern: an outlier can be off the charts (like the World Wars). Epidemics are the same: the Black Death, the Spanish flu or COVID-19 can rip apart any patterns you thought existed for diseases.
Relying on the wrong curve to estimate or plan can be deadly, pun intended.
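To get a feel for how differently the two kinds of curve behave, here is a minimal sketch in Python. It is only an illustration: the sample size, the seed, the bell curve's mean and spread, and the choice of a Pareto distribution with shape 1.5 as the heavy-tailed example are all assumptions made for this demo, not figures from Godin or Strogatz. It draws the same number of values from each distribution and asks how far the single largest value sits from the typical (median) one.

```python
import random
import statistics

# Assumed, illustrative parameters (not taken from the post or the book).
SAMPLE_SIZE = 100_000
rng = random.Random(42)  # fixed seed so repeated runs give the same numbers

# Bell curve: values cluster around 100 with a standard deviation of 15.
bell = [rng.gauss(100, 15) for _ in range(SAMPLE_SIZE)]

# Heavy tail: a Pareto distribution with shape 1.5 (an assumed example).
heavy = [rng.paretovariate(1.5) for _ in range(SAMPLE_SIZE)]


def max_to_median(values):
    """Return how many times larger the biggest value is than the median."""
    return max(values) / statistics.median(values)


bell_ratio = max_to_median(bell)
heavy_ratio = max_to_median(heavy)

print(f"bell curve: largest value is {bell_ratio:.1f}x the median")
print(f"heavy tail: largest value is {heavy_ratio:.1f}x the median")
```

Under these assumptions, the largest bell-curve value should stay within a small multiple of the median, while the largest heavy-tailed value should dwarf it. The exact figures depend on the seed and the parameters chosen, but the gap is the point: plan for the bell curve and the heavy tail's outlier arrives as a surprise.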
In the age of the Internet, it’s easier than ever to collect data and analyse it. The reason we find more and more content meant for niche groups on Amazon and Netflix is that the number of people with esoteric tastes, preferences and needs is not tiny. That is called the “fat tail” of the curve, to differentiate it from the bell curve, which dies down at the ends. Other common tails include the “long tail” (an endless number of items, even if most of them have tiny quantities) and the “heavy tail”. All of which is why Strogatz calls the chapter on these distribution curves “The New Normal”, and ends it by saying:
“Fat, heavy and long? Yeah, that’s right. Now who’s normal?”