Information Flow

The concept of cause and effect is better understood as the flow of information between two connected events, from the earlier event to the later one. Saying "A causes B" sounds precise, but is actually very vague. I convey much more by saying "with the information that A has happened, I can compute with almost total confidence* that B will happen." The latter rules out the possibility that other factors could prevent B even if A does happen, while still allowing the possibility that other factors could cause B even if A doesn't happen.
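
To make the asymmetry concrete, here is a minimal sketch in Python. The probabilities are invented purely for illustration: P(B|A) is pinned near 1, while P(B|not-A) is left free, exactly the two clauses of the sentence above.

```python
import random

random.seed(0)

def sample_world():
    a = random.random() < 0.3            # whether A happens
    if a:
        b = random.random() < 0.999      # A almost always brings about B
    else:
        b = random.random() < 0.2        # other factors sometimes cause B anyway
    return a, b

trials = [sample_world() for _ in range(100_000)]
p_b_given_a = sum(b for a, b in trials if a) / sum(a for a, _ in trials)
p_b_given_not_a = sum(b for a, b in trials if not a) / sum(not a for a, _ in trials)
print(f"P(B|A)  = {p_b_given_a:.3f}")    # close to 1: near-total confidence
print(f"P(B|~A) = {p_b_given_not_a:.3f}")  # nonzero: B can happen without A
```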

As shorthand, we can say that one set of information "specifies" another if the latter can be deduced or computed from the former. Note that this applies not only to one-bit pieces of information, like the occurrence of a specific event. It can also apply to symbolic variables (given the state of the Web, the results you get from a search engine are specified by your query), numeric variables (the number read off a precise thermometer is specified by the temperature of the sensor), and even behavioral variables (the behavior of a computer is specified by the bits loaded in its memory).
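
In code, "specifies" is simply what it means for one thing to be a deterministic function of another: fix the input and the output can always be deduced. A toy sketch of the thermometer case (the function and numbers here are invented for illustration, not a real instrument API):

```python
def thermometer_reading(sensor_temp_c: float) -> float:
    """The displayed number is computed from the sensor temperature alone."""
    return round(sensor_temp_c, 1)  # a precise instrument: output deduced from input

# The same input always yields the same output, so the temperature of the
# sensor specifies the reading.
assert thermometer_reading(21.07) == thermometer_reading(21.07) == 21.1
```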

But let's take a closer look at the assumptions we're making. Astute readers may have noticed that in one of my examples, I assumed that the entire state of the Web was a constant. How ridiculous! In statistical parlance, assumptions are known as "priors," and in a certain widespread school of statistical thought (the Bayesian school), they are considered the most important aspect of any process involving information. What we really want to know is whether, given a set of existing priors, adding one piece of information (A) would allow us to update our estimate of the likelihood of another piece of information (B). Of course, this depends on the priors — for instance, if our priors include absolute knowledge of B, then an update will not be possible.
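
A one-line Bayes update makes the last point explicit. The numbers below are illustrative, but the degenerate cases are not: a prior of exactly 1 or 0 (absolute knowledge of B or of not-B) cannot be moved by any information about A.

```python
def update(prior_b: float, p_a_given_b: float, p_a_given_not_b: float) -> float:
    """Posterior P(B | A) computed from the prior P(B) via Bayes' rule."""
    p_a = p_a_given_b * prior_b + p_a_given_not_b * (1 - prior_b)
    return p_a_given_b * prior_b / p_a

print(update(0.5, 0.9, 0.1))  # an open mind: the estimate moves from 0.5 to 0.9
print(update(1.0, 0.9, 0.1))  # absolute knowledge of B: stays exactly 1.0
print(update(0.0, 0.9, 0.1))  # absolute knowledge of not-B: stays exactly 0.0
```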

If, for most reasonable sets of priors, information about A would allow us to update our estimate of B, then it would seem there is some sort of causal connection between the two. But the form of the causal connection is unspecified — a principle often summarized as "correlation does not imply causation." The reason is that the essence of causation as a concept rests on our tendency to have information about earlier events before we have information about later events. (The full implications of this concept for human consciousness, the second law of thermodynamics, and the nature of time are interesting, but sadly outside the scope of this essay.)

If information about all events always came in the order they occurred, then correlation would indeed imply causation. But, in the real world, not only are we limited to observing events in the past, but we may also discover information about those events out of order. Thus, the correlations we observe could reflect reverse causation (information about A allows us to update our estimate of B, although B happened first and was in fact the cause of A) or even more complex situations (e.g., information about A allows us to update our estimate of B only because it also gives us information about C, which happened before either A or B and caused both).
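
The common-cause situation is easy to simulate. In this sketch (with probabilities invented for illustration), a hidden earlier event C causes both A and B; neither causes the other, yet learning A still shifts our estimate of B.

```python
import random

random.seed(0)

def sample():
    c = random.random() < 0.5                    # the hidden earlier event
    a = random.random() < (0.9 if c else 0.1)    # A depends only on C
    b = random.random() < (0.9 if c else 0.1)    # B depends only on C
    return a, b

trials = [sample() for _ in range(100_000)]
p_b = sum(b for _, b in trials) / len(trials)
p_b_given_a = sum(b for a, b in trials if a) / sum(a for a, _ in trials)
print(f"P(B)   = {p_b:.2f}")          # about 0.50
print(f"P(B|A) = {p_b_given_a:.2f}")  # about 0.82: A informs us about C, and C about B
```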

Information flow is symmetric: if information about A were to allow us to update our estimate of B, then information about B would allow us to update our estimate of A. But since we cannot change the past or know the future, these relationships are only useful to us when placed in temporal context and arranged in order of occurrence. Information flow is always from the past to the future, but in our minds, some of the arrows may be reversed. Resolving this ambiguity is essentially the problem that science was designed to solve. If you can master the technique of visualizing all information flow and keeping track of your priors, then the full power of the scientific method — and more — is yours to wield from your personal cognitive toolkit.
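
The symmetry can be checked on any joint distribution with a dependence in it; the table below is an illustrative example, not anything canonical. If learning A shifts our estimate of B, learning B shifts our estimate of A by the same token.

```python
joint = {  # P(A=a, B=b), illustrative numbers that sum to 1
    (True, True): 0.40, (True, False): 0.10,
    (False, True): 0.10, (False, False): 0.40,
}

p_a = sum(p for (a, _), p in joint.items() if a)  # marginal P(A) = 0.50
p_b = sum(p for (_, b), p in joint.items() if b)  # marginal P(B) = 0.50
p_b_given_a = joint[(True, True)] / p_a           # 0.80
p_a_given_b = joint[(True, True)] / p_b           # 0.80

print(p_b_given_a > p_b)  # True: A updates our estimate of B
print(p_a_given_b > p_a)  # True: B updates our estimate of A, symmetrically
```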

* In our universe, too many things are interconnected for absolute statements of any kind, so we usually relax our criteria; for instance, "total confidence" might be relaxed from a 0% chance of being wrong to, say, a 1 in 3 quadrillion chance of being wrong — about the chance that, as you finish this sentence, all of humanity will be wiped out by a meteor.