A scientific look at science

What can bioscience learn from the reproducibility crisis?

The Biologist 64(5) p6

Since 2005, when Stanford statistician John Ioannidis cast doubt on much of the methodology used in published research, a growing body of empirical evidence has confirmed that many published research findings cannot be replicated. For example, a recent attempt to replicate 100 studies published across three key psychology journals found that only around 40% could be replicated. There's no reason to believe that these issues are unique to any one field – similar concerns have been raised across the biomedical sciences and beyond.

This interest in research reproducibility has stimulated a new discipline – meta-science, or the use of scientific methods to understand science itself. For example, we published an analysis suggesting that many studies in the neuroscience literature may be too small[1] to provide reliable results (a finding we recently replicated[2] across a wider range of biomedical disciplines). In collaboration with economists, we also showed that the current peer review system may lead to 'herd behaviour'[3], which increases the risk that scientists will converge on an incorrect answer.
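
To make concrete what 'too small' means here, consider a minimal sketch (in Python, using the statsmodels library; the 'medium' effect size of d = 0.5 and the group size of 20 are my illustrative assumptions, not figures from the studies cited above): a study of that size would detect such an effect only about a third of the time, whereas roughly 64 participants per group are needed to reach the conventional 80% power.

```python
# Illustrative sketch (assumed numbers, not from the article): power of a
# two-group comparison with 20 participants per group, for a "medium"
# standardised effect size of d = 0.5.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

power = analysis.power(effect_size=0.5, nobs1=20, alpha=0.05)
print(f"Power with n = 20 per group: {power:.2f}")   # roughly 0.34

# Sample size per group needed to reach the conventional 80% power
n_needed = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"n per group for 80% power: {n_needed:.0f}")  # roughly 64
```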

There is also growing interest in reproducibility from funders and journals, and increasing acceptance of the possibility that science may not be functioning optimally. What is at the heart of the problem?
Part of the issue may be that scientists are human and, however well trained and well motivated, are subject to cognitive biases such as confirmation bias and hindsight bias.

We also want our work to be recognised, through publication and the award of grants, for career advancement and esteem.

These pressures, and the incentive structures we work within, may shape our behaviour unconsciously. This is perhaps why psychologists are at the forefront of the reproducibility debate – the issue is ultimately one of human behaviour.

There is growing evidence that incentive structures in science do play a part. For example, we have found that studies conducted in the US, where academic salaries often depend on generating grant income, tend to overestimate effects[4] compared with studies done outside the US (a result later replicated, at least for the 'softer' biomedical sciences[5]). Even more worryingly, studies published in journals with a high impact factor (a widely used metric often mistaken for a measure of quality) also seem more likely to overestimate effects[6].

With colleagues in the biological sciences, we modelled the current scientific 'ecosystem'[7] and showed that incentives that prioritise a small number of novel, eye-catching findings (that is, those published in journals with a high impact factor) are likely to result in studies that are too small and give unreliable findings. This strategy is optimal for career advancement under the current system, but not optimal for science.
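
The logic can be illustrated with a toy calculation (my own simplification, with assumed numbers, not the published model): give two labs the same recruitment budget, let one run many small studies and the other fewer, adequately powered ones, and count the expected 'positive' findings each produces.

```python
# Toy illustration (assumed numbers, not the published model): with a fixed
# participant budget, a lab running many small, underpowered studies yields
# more "positive" findings than a lab running fewer, well-powered studies,
# even though a larger share of its positives are false.

ALPHA = 0.05    # conventional false-positive rate
PRIOR = 0.1     # assumed fraction of tested hypotheses that are true
BUDGET = 1000   # total participants available to each lab

def positives_per_budget(n_per_group, power):
    """Expected positive findings, and the share of them that are false."""
    studies = BUDGET / (2 * n_per_group)
    p_positive = PRIOR * power + (1 - PRIOR) * ALPHA
    false_share = (1 - PRIOR) * ALPHA / p_positive
    return studies * p_positive, false_share

# Power values for d = 0.5, as in the sketch above
small = positives_per_budget(n_per_group=20, power=0.34)
large = positives_per_budget(n_per_group=64, power=0.80)
print(f"Small studies: {small[0]:.1f} positives, {small[1]:.0%} false")
print(f"Large studies: {large[0]:.1f} positives, {large[1]:.0%} false")
```

Under these assumptions the small-study lab produces roughly twice as many positive findings, but over half of them are false positives; an incentive structure that rewards eye-catching positives therefore favours exactly the strategy that degrades reliability.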

What can we do? One solution is to offer better training – in particular, making researchers aware of the kinds of bias that can unconsciously distort their interpretation of their data. However, real change will require engagement from multiple stakeholders (institutions, funders and journals) working together.

We published a manifesto for reproducible science[10] in which we argue for the adoption of a range of measures to optimise key elements of the scientific process: methods, reporting and dissemination, reproducibility, evaluation and incentives. While many of these can be adopted by individual researchers or research groups, others will require the engagement of key stakeholders – funders, journals and institutions. There may also be discipline-specific measures that can be taken – for example, a recent review outlined measures specifically intended to improve the reliability of findings generated by functional neuroimaging research[8].

Fortunately, there are positive signs that stakeholders are taking reproducibility seriously. In 2015 the Academy of Medical Sciences, together with the Medical Research Council, the BBSRC and the Wellcome Trust, produced a report on how to improve the reproducibility and reliability of biomedical research in the UK[9], summarising potential causes and strategies for countering poor practice. The Medical Research Council now recommends that applications for funding include a "reproducibility and experimental design" annexe to "provide important additional information on reproducibility, and to explain the steps taken to ensure the reliability and robustness of the chosen methodology".

Ultimately, the issue is one of quality control – at a workshop convened by a major funder of Huntington's disease research to discuss these issues, it was pointed out that most scientific findings are produced on the assumption that they will be 'fixed' (confirmed or disconfirmed) later.

An analogy was made with the US automobile industry in the 1970s, in which productivity was high but quality control was poor. The Japanese automobile industry, by contrast, had taken advice from the US statistician William Edwards Deming and introduced quality control measures at every stage of the production pipeline. Japanese cars still have a reputation for reliability today. Perhaps we need to do something similar with biomedical science.

Marcus Munafò is a professor of biological psychology at the University of Bristol