A defining feature of science is its capacity to evolve in response to new developments. Historically—changes in technological capacities, quantitative procedures, and scientific understanding have all contributed to large-scale revisions in the conduct of scientific investigations. Pressure is mounting for further improvements. In disciplines such as medicine, psychology, genetics, and biology researchers have been confronting findings that are not as robust as they initially appeared. Such shrinking effects raise questions not only about the specific findings they challenge, but more generally about the confidence that we can have in published results that have yet to be re-evaluated.
In attempting to understand its own limitations, science is fueling the consolidation of an emerging new discipline: meta-science. Meta-science, the science of science, attempts to use quantifiable scientific methodologies to elucidate how current scientific practices influence the veracity of scientific conclusions. This nascent endeavor is joining the agendas of a variety of fields including medicine, biology, and psychology—each seeking to understand why some initial findings fail to fully replicate. Meta-science has its roots in the philosophy of science and the study of scientific methods, but is distinguished from the former by its reliance on quantitative analysis and from the latter by its broad focus on the general factors that contribute to the limitations and successes of scientific investigations.
This year the most ambitious meta-scientific study to date was published in Science by Brian Nosek and the Open Science Collaboration. A large-scale effort in psychology sought to replicate 100 “quasi-randomly” selected studies from three premier journals and found that less than half (39 percent) of the studies reached traditional levels of significance when replicated. This study is noteworthy because it directed the lens of science not at any particular phenomena but rather at the process of science itself. In this sense, it represents one of the first major implementations of evidence-based meta-science. Although it is certain to have a major impact on science, only time will tell how it will be remembered.
Although I am enthusiastic about the meta-scientific goals that this study exemplifies, I worry that major limitations in its design and implementation may have produced a misleadingly pessimistic assessment of the health of the field of psychology. Numerous factors may have contributed to an underestimation of the reliability of the findings, including: variations in the skills and motivations of the replicating scientists, limitations in the statistical power of the replications, and perhaps most importantly, questions regarding the fidelity with which the original methods were reproduced. Although the authors attempted to vet their replication procedure with the originating lab, many of the replicated studies were conducted without the originating lab’s endorsement, and these unapproved efforts disproportionately contributed to the low replication estimate.
Even the studies that used procedures that were approved by the originating laboratories still may have been lacking in fidelity. For example, one of the more well-known findings that failed to replicate involved the observation that exposing people to an anti free will message can increase cheating. I am particularly familiar with this example, (and perhaps biased to defend it) as I was a co-author of the original study. Although we signed off on the replication protocol, we subsequently discovered a small but important detail that was left out of the replicating procedure. In the original study, but not the replication, the anti free will message was framed as part of an entirely different study. We have recently found that people are less likely to change their beliefs about free will when the anti free will message is introduced as part of the same study. Apparently people are reluctant to change their mind on this important topic if they feel coerced to do so. In this context it is notable that in the replication study, the anti free will message failed to significantly discourage participants from believing in free will in the first place, and thus could hardly have been expected to produce the further ramification of increased cheating. I suspect that a big portion of failures to replicate may involve the omission of similar small but important methodological details.
As the emerging field of meta-science moves forward, it will be important to refine techniques for understanding how disparities between original studies and replications may contribute to difficulties in reproducing results. Increasing the transparency of originally conducted studies, through methods such as detailed pre-registration, is likely to make it easier for replication teams to understand precisely how the project was originally implemented. However, it will also be important to develop methods for evaluating the fidelity of the reproductions themselves.
Another important next step for meta-science is the implementation of prospective replication experiments that systematically investigate how new hypotheses fair when tested repeatedly across laboratories. Prospective replication experiments will help to overcome potential biases inherent in selecting which published studies to replicate while simultaneously illuminating various factors that may govern the replicability of scientific findings, including variations in population sample, researcher investment and reproduction fidelity.
More generally, as we adopt a more meta-scientific perspective, researchers will hopefully increasingly appreciate that just as a single study cannot irrefutably demonstrate the existence of a phenomenon, neither can a single failure to replicate disprove it. Over time, scientists will likely become increasingly comfortable with meticulously documenting and (ideally) pre-registering all aspects of their research. They will see the replication of their work not as a threat to their integrity but rather as testament to their work’s importance. They will recognize that replicating other findings is an important component of their scientific responsibilities. They will refine replication procedures to not only discern the robustness of findings, but to understand their boundary conditions, and the reasons why they sometimes (often?) decline in magnitude. Even if history discerns that the original foray into meta-science was significantly lacking, ultimately meta-science will surely offer deep insights into the nature of the scientific method itself.