Social & Technology Network Topology Researcher; Adjunct Professor, NYU Graduate School of Interactive Telecommunications Program (ITP); Author, Cognitive Surplus
Pareto Principle

You see the pattern everywhere: the top 1% of the population controls 35% of the wealth. On Twitter, the top 2% of users send 60% of the messages. In the health care system, the treatment for the most expensive fifth of patients creates four-fifths of the overall cost. These figures are always reported as shocking, as if the normal order of things has been disrupted, as if the appearance of anything other than a completely linear distribution of money, or messages, or effort, is a surprise of the highest order.

It's not. Or rather, it shouldn't be.

The Italian economist Vilfredo Pareto undertook a study of market economies a century ago, and discovered that no matter what the country, the richest quintile of the population controlled most of the wealth. The effects of this Pareto Distribution go by many names (the 80/20 Rule, Zipf's Law, the Power Law distribution, Winner-Take-All), but the basic shape of the underlying distribution is always the same: the richest or busiest or most connected participants in a system will account for much, much more wealth, or activity, or connectedness than average.

Furthermore, this pattern is recursive. Within the top 20% of a system that exhibits a Pareto distribution, the top 20% of that slice will also account for disproportionately more of whatever is being measured, and so on. The most highly ranked element of such a system will be much more highly weighted than even the #2 item in the same chart. (The word "the" is not only the commonest word in English, it appears twice as often as the second most common, "of".)
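To make the recursion concrete, here is a small sketch of the arithmetic. It assumes an idealized system in which the split is exactly 80/20 at every level of nesting; real systems are only roughly self-similar, and the function name and parameters are illustrative rather than anything Pareto defined.

```python
def recursive_shares(top_fraction=0.2, top_share=0.8, levels=3):
    """Return (fraction of participants, share of total) for each nested slice.

    Assumption: the same split (top_fraction holds top_share) repeats
    exactly inside every top slice. This is an idealization.
    """
    rows = []
    fraction, share = 1.0, 1.0
    for _ in range(levels):
        fraction *= top_fraction  # the top 20% of the previous slice
        share *= top_share        # holds 80% of what that slice held
        rows.append((fraction, share))
    return rows


for fraction, share in recursive_shares():
    print(f"top {fraction:.1%} of participants hold {share:.1%} of the total")
```

Under that assumption, the top 20% hold 80%, the top 4% hold 64%, and the top 0.8% still hold just over half of everything.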

This pattern was so common that Pareto called it a "predictable imbalance"; despite this bit of century-old optimism, however, we are still failing to predict it, even though it is everywhere.

Part of our failure to expect the expected is that we have been taught that the paradigmatic distribution of large systems is the Gaussian distribution, commonly known as the bell curve. In a bell curve distribution like height, say, the average and the median (the middle point in the system) are the same — the average height of a hundred American women selected at random will be about 5'4", and the height of the 50th woman, ranked in height order, will also be 5'4".

Pareto distributions are nothing like that — the recursive 80/20 weighting means that the average is far from the middle. This in turn means that in such systems most people (or whatever is being measured) are below average, a pattern encapsulated in the old economics joke: "Bill Gates walks into a bar and makes everybody a millionaire, on average."
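The gap between average and middle is easy to see in a simulation. The sketch below draws a sample from a Pareto distribution using Python's standard library and compares the mean with the median. The shape parameter `alpha=1.16` is an assumption, chosen because it corresponds roughly to an 80/20 split; the sample size and seed are arbitrary.

```python
import random
import statistics


def summarize_pareto(n=100_000, alpha=1.16, seed=0):
    """Draw n Pareto samples; return (mean, median, fraction below the mean).

    Assumption: alpha=1.16 approximates an 80/20 distribution.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = [rng.paretovariate(alpha) for _ in range(n)]
    mean = statistics.fmean(sample)
    median = statistics.median(sample)
    below_mean = sum(x < mean for x in sample) / n
    return mean, median, below_mean


mean, median, below_mean = summarize_pareto()
print(f"mean:   {mean:.2f}")
print(f"median: {median:.2f}")
print(f"share of sample below the mean: {below_mean:.1%}")
```

In a bell curve the last figure would sit near 50%. Here the mean lands well above the median, and the large majority of the sample falls below the average, which is the Bill Gates joke in numerical form.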

The Pareto distribution shows up in a remarkably wide array of complex systems. Together, "the" and "of" account for 10% of all words used in English. The most volatile day in the history of a stock market will typically be twice as volatile as the second-most volatile day, and ten times as volatile as the tenth-most. Tag frequency on Flickr photos obeys a Pareto distribution, as does the magnitude of earthquakes, the popularity of books, the size of asteroids, and the social connectedness of your friends. The Pareto Principle is so basic to the sciences that special graph paper that shows Pareto distributions as straight lines rather than as steep curves is manufactured by the ream.
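The "twice the second, ten times the tenth" pattern and the straight-line graph paper are two views of the same thing. As a minimal sketch, assume an idealized rank-size rule in which the item at rank r has a value of the top value divided by r (the simplest form of Zipf's Law; real data only approximates it). Plotted with logarithmic scales on both axes, those values fall on a straight line with a slope of about -1. The function names here are illustrative.

```python
import math


def zipf_values(top_value, n):
    """Idealized rank-size rule (an assumption): value at rank r is top_value / r."""
    return [top_value / rank for rank in range(1, n + 1)]


def loglog_slope(values):
    """Least-squares slope of log(value) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


values = zipf_values(top_value=100.0, n=10)
print(f"rank 1 / rank 2:  {values[0] / values[1]:.1f}")
print(f"rank 1 / rank 10: {values[0] / values[9]:.1f}")
print(f"slope on log-log axes: {loglog_slope(values):.2f}")
```

On ordinary graph paper the same ten values trace a steep, sagging curve; taking logarithms of both rank and value is what straightens it out.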

And yet, despite a century of scientific familiarity, samples drawn from Pareto distributions are routinely presented to the public as anomalies, which prevents us from thinking clearly about the world. We should stop thinking that average family income and the income of the median family have anything to do with one another, or that enthusiastic and normal users of communications tools are doing similar things, or that extroverts should be only moderately more connected than normal people. We should stop thinking that the largest future earthquake or market panic will be as large as the largest historical one; the longer a system persists, the likelier it is that an event twice as large as all previous ones is coming.

This doesn't mean that such distributions are beyond our ability to affect them. A Pareto curve's decline from head to tail can be more or less dramatic, and in some cases, political or social intervention can affect that slope — tax policy can raise or lower the share of income of the top 1% of a population, just as there are ways to constrain the overall volatility of markets, or to reduce the band in which health care costs can fluctuate.

However, until we assume such systems are Pareto distributions, and will remain so even after any such intervention, we haven't even started thinking about them in the right way; in all likelihood, we're trying to put a Pareto peg in a Gaussian hole. A hundred years after the discovery of this predictable imbalance, we should finish the job and actually start expecting it.