Did you know that consuming large amounts of olive oil can reduce your mortality risk by 41 percent? Did you know that if you have cataracts and get them operated on your mortality risk is lowered by 40 percent over the next 15 years compared to people with cataracts who don't get them operated on? Did you know that deafness causes dementia?
Those claims, and scores of others like them, appear every day in the media.
They are usually based on studies employing multiple regression analysis (MRA), in which a number of independent variables are correlated simultaneously with some dependent variable.
The goal is typically to show that variable A influences variable B "net of" the effects of all the other variables. To put that a little differently, the goal is to show that, at every level of variables C, D and E, an association between A and B is found. For example, drinking wine is correlated with low incidence of cardiovascular disease, controlling for (net of) the contributions to cardiovascular disease of social class, excess weight, age, and so on.
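The logic of "controlling for" a variable can be made concrete with a small simulation. The sketch below uses entirely made-up numbers and a hypothetical confounder (think of it as health-consciousness) that drives both wine consumption and cardiovascular risk; wine itself is given no direct effect. A simple regression of risk on wine alone shows a spurious association, while a multiple regression that includes the confounder makes it vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical confounder (e.g., health-consciousness) drives both
# wine consumption and cardiovascular risk; coefficients are invented.
confounder = rng.normal(size=n)
wine = 0.5 * confounder + rng.normal(size=n)
risk = -0.3 * confounder + rng.normal(size=n)  # wine has NO direct effect here

# Simple regression of risk on wine alone: a spurious association appears.
slope_naive = np.polyfit(wine, risk, 1)[0]

# Multiple regression "controlling for" the confounder: the association vanishes.
X = np.column_stack([np.ones(n), wine, confounder])
beta, *_ = np.linalg.lstsq(X, risk, rcond=None)
slope_adjusted = beta[1]

print(f"naive slope:    {slope_naive:.3f}")     # clearly negative
print(f"adjusted slope: {slope_adjusted:.3f}")  # near zero
```

Of course, this only works because the simulation lets us observe the confounder perfectly; the essay's point is that in real MRA studies we rarely can.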
Epidemiologists, medical researchers, sociologists, psychologists and economists are particularly likely to use this technique, though it can be used in almost any scientific field.
The claims, always at least implicit and often explicit, that MRA can reveal causality are simply mistaken. We know that the target independent variable (consumption of olive oil, for example) brings along its correlations with many other variables, each measured in some inevitably imperfect way or not at all. And the level of each of these variables is "self-selected." Any one of them could be driving the effects on the dependent variable.
Would you think the number of children in a classroom matters for how well schoolchildren learn? It seems reasonable that it would. But a number of MRA studies tell us that, net of the average income of families in the school district, size of the school, IQ test performance, city size, geographic location, etc., average class size is uncorrelated with student performance. The implication: We now know we needn't waste money on decreasing the size of classes.
But researchers have assigned kindergartners through third graders, by the flip of a coin, to either small classes (13 to 17 students) or larger classes (22 to 25). The smaller classes showed more improvement in standardized test performance, and the effect on minority children was greater than the effect on white children. This is not merely another study on the effects of class size. It replaces all the multiple regression studies on class size.
This is the case because in an experiment it is the experimenter who selects the level of the target independent variable. This means that the experimental classrooms have equally good teachers on average, equally able students, equal social class of students, and so on. Thus the only thing that differs between experimental and control classrooms is the independent variable of interest, namely class size.
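The balancing power of the coin flip can itself be shown with a toy simulation. The sketch below invents a single confounder (call it district wealth) and compares two ways of forming a "small class" group: self-selection, where wealthier districts are more likely to choose small classes, and pure random assignment. Under self-selection the two groups differ sharply on wealth; under randomization the difference is negligible, and the same would hold for every confounder, measured or not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10000

# Hypothetical confounder: district wealth, which in reality would affect
# both class size and test scores. Numbers are invented for illustration.
wealth = rng.normal(size=n)

# Self-selection: wealthier districts tend to end up with small classes.
self_selected_small = (wealth + rng.normal(size=n)) > 0

# Randomized assignment: a coin flip, independent of wealth by construction.
randomized_small = rng.random(n) > 0.5

# Compare mean wealth of the "small class" and "large class" groups.
gap_selfselect = wealth[self_selected_small].mean() - wealth[~self_selected_small].mean()
gap_random = wealth[randomized_small].mean() - wealth[~randomized_small].mean()

print(f"wealth gap, self-selected groups: {gap_selfselect:.3f}")  # large
print(f"wealth gap, randomized groups:    {gap_random:.3f}")      # near zero
```

Any later difference in test scores between the randomized groups can therefore be attributed to class size itself, which is exactly what the self-selected comparison cannot deliver.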
MRA studies that attempt to "control" for other factors such as social class, age, prior state of health, etc. can't get around the self-selection problem. The sorts of people who get treatment differ from those who don't get it in goodness knows how many ways.
Consider social class. If an investigator wishes to see whether social class is associated with some outcome, anything correlated with social class might be producing or suppressing the effects of class per se. We can be fairly sure that the people consuming all that olive oil are richer, better educated, more knowledgeable about health and more concerned about health (with spouses also more concerned about their health, and so on). They are almost surely less likely to smoke or to drink to excess, and they probably live in less toxic environments than people who use corn oil. They are also more likely to be of Italian descent (Italians are relatively long-lived) than African descent (blacks have generally high mortality rates). All of these variables are candidates for being the true cause of the association between olive-oil consumption and mortality, rather than the olive oil per se.
Even when there is an attempt to control for all possible variables, they are not necessarily well-measured, which means that their contribution to the dependent variable will be underestimated. For example, there is no unique correct way to measure social class. Education level, income, wealth, and occupational level are all pieces of the pie, and there is no canonical way to weight them to come up with the same social-class value that God has in mind.
A New York Times Op-Ed writer, with a PhD from Harvard, recently expressed the opinion that MRA studies are superior to experiments because MRA studies based on Big Data can have many more subjects.
The error here is the assumption that a relatively small number of subjects is likely to mislead. It is true that larger N is better than smaller N, because larger samples are more likely to detect even small effects. But our confidence in a study rests not on the number of cases but on whether the estimates of effects are unbiased and whether the effects are statistically significant. In fact, if you get a statistically significant effect with a relatively small number of subjects, that means, other things equal, that your effect is bigger than one that required a larger sample to reach the same level of significance.
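That last point is simple arithmetic. For a standard two-sided test at the 5 percent level, the test statistic is roughly the effect divided by sd/√n, so the smallest effect that can just reach significance shrinks in proportion to 1/√n. The sketch below, using the conventional critical value of 1.96 and illustrative sample sizes, shows that a just-significant effect at n = 50 must be ten times larger than a just-significant effect at n = 5000.

```python
import math

def min_significant_effect(n, sd=1.0, crit=1.96):
    """Smallest mean difference that just reaches the critical value with n subjects.

    Follows from requiring effect / (sd / sqrt(n)) >= crit.
    """
    return crit * sd / math.sqrt(n)

small_n_effect = min_significant_effect(50)    # a small study
large_n_effect = min_significant_effect(5000)  # a "big data" study

print(f"just-significant effect, n=50:   {small_n_effect:.3f}")
print(f"just-significant effect, n=5000: {large_n_effect:.3f}")
print(f"ratio: {small_n_effect / large_n_effect:.1f}")  # sqrt(5000/50) = 10
```

So a significant result from a small sample is, if anything, evidence of a larger effect; what a big sample cannot do is remove bias from self-selected data.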
Big data is going to be useful for all kinds of purposes, including generating MRA findings that suggest randomized-design experiments, which can provide definitive evidence about whether an apparent effect is real. A lovely example of this kind of sequence comes from the 2011 finding in MRA research by Beccuti and Pannain that low levels of sleep are associated with obesity. That finding taken by itself is next to meaningless. Bad health outcomes are almost all correlated with each other: overweight people have worse cardiovascular health, worse psychological health, use more drugs, get less exercise, etc. But following the MRA research, experimenters have done the requisite experiments. They deprived people of sleep and found that they did in fact gain weight. Not only that, but researchers found hormonal and endocrine consequences of sleep disturbances that mediated the weight gain.
Multiple regression, like all statistical techniques based on correlation, suffers from a severe limitation: correlation does not prove causation. And no amount of measuring of "control" variables can untangle the web of causality. What nature hath joined together, multiple regression cannot put asunder.