Opinion Article

A failure to reproduce: How bad biomedical science is holding us back

[version 1; peer review: 1 approved with reservations, 2 not approved]
PUBLISHED 30 Mar 2016

Abstract

Irreproducibility is a common problem in the biomedical sciences. Numerous studies have revealed the systemic and chronic nature of the problem, yet not enough is being done to combat it. The financial cost is estimated at 28 billion dollars a year in the United States alone. Combine this financial cost with the time spent on irreproducible studies and the net effect is staggering. The factors behind this lack of reproducibility are, however, identifiable, and concrete steps can be taken to improve the situation. This article describes some of the factors leading to irreproducibility in the biomedical sciences and how stakeholders at every level of the field can act to reverse them.

Keywords

Reproducibility, Biomedical Science, Irreproducibility, publishing, funding, standards and practices, institutions

A tale of two papers

In 2005 Dr. John Ioannidis published a paper entitled “Why most published research findings are false” (Ioannidis, 2005b). In it he claimed that “false findings may be the majority or even the vast majority of published research claims”. The strange thing about the paper was that it wasn’t exaggerating.

In his paper Dr. Ioannidis used a mathematical model that assumed modest levels of researcher bias. This bias could stem from human error, bad methodology or any number of other factors. He argued that a sufficiently motivated researcher who wishes to prove a theory correct can do so most of the time, regardless of whether the theory is actually correct. The rates of “wrongness” his model predicted in various fields of medical research corresponded to the observed rates at which findings were later refuted, and these rates were far from insignificant: 80 percent of non-randomized studies, 25 percent of “gold-standard” randomized trials and even as much as 10 percent of “platinum-standard” large randomized trials turn out to be irreproducible.
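
For illustration, here is a minimal sketch of the bias-adjusted positive predictive value (PPV) formula at the heart of that argument, following Ioannidis (2005b); the parameter values below are our own illustrative choices, not figures from the paper:

```python
def ppv(R, alpha=0.05, beta=0.20, u=0.0):
    """Positive predictive value of a claimed research finding.

    Follows the bias-adjusted formula in Ioannidis (2005b):
      R     -- pre-study odds that a probed relationship is true
      alpha -- type I error rate (significance threshold)
      beta  -- type II error rate (power = 1 - beta)
      u     -- bias: the proportion of analyses that would not otherwise
               have been "findings" but are reported as such anyway
    """
    true_positives = (1 - beta) * R + u * beta * R
    all_positives = R + alpha - beta * R + u * (1 - alpha) + u * beta * R
    return true_positives / all_positives

# Illustrative scenario: 1-in-5 odds the hypothesis is true, 80% power.
print(f"no bias:     PPV = {ppv(R=0.25):.2f}")          # 0.80
print(f"modest bias: PPV = {ppv(R=0.25, u=0.10):.2f}")  # ~0.59
print(f"heavy bias:  PPV = {ppv(R=0.25, u=0.30):.2f}")  # ~0.39
```

Even under these charitable assumptions, a modest amount of bias is enough to push the share of true claimed findings towards, and then below, one half.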

A second paper by Dr. Ioannidis was published that same year. In this paper he looked at 49 of the most cited articles in the most cited journals (Ioannidis, 2005a). 45 of these papers claimed to have described effective interventions for various diseases, ranging from heart attacks to cancer. Of these 45, seven were contradicted by subsequent studies, seven others had their reported effects diminished by subsequent studies and 11 remained largely unchallenged. Only 20 of the 45 (44%) field-guiding papers had been replicated successfully. And in a finding that shows how these irreproducible papers impact the field, Dr. Ioannidis found that even when a research paper is later soundly refuted its findings can persist, with researchers continuing to cite them as correct for years afterwards.

The counterargument is of course that, despite all this, the system clearly does work on the whole. Even if mistakes are being made and inefficiency is rampant, if something is wrong it will eventually be found out and corrected. Take for example the now infamous controversy regarding stimulus-triggered acquisition of pluripotency (STAP) cells. Published in Nature, these two papers were considered a massive breakthrough in stem cell research (Obokata et al., 2014a; Obokata et al., 2014b). However, problems very quickly emerged with the data presented in the papers. An investigation by the hosting institute found the lead investigator guilty of misconduct and the papers were subsequently retracted (Editorial, 2014). This is one example of how the checks and balances in place within biomedical research can work, and work well. Despite these measures, the price of irreproducible research remains a substantial one.

The price we pay

A recent study estimated the cost of irreproducible pre-clinical research at 28 billion dollars a year in the US alone (Freedman et al., 2015). The study put the overall rate of irreproducibility at 53%, but warned that the true rate could be anywhere between 18% and 89%. While the exact figures are certainly debatable, the clear message is that this is a significant issue even if we assume the rate of irreproducibility sits at the lower end of that scale. Even big pharma companies have noted the lack of reproducibility coming from academia, reporting that their attempts to replicate the conclusions of peer-reviewed papers fail at rates upwards of 75% (Prinz et al., 2011).
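
As a rough back-of-the-envelope check on these numbers (the roughly $56.4 billion baseline for annual US preclinical spending is taken from Freedman et al., 2015; everything else is simple arithmetic):

```python
# Back-of-the-envelope version of the Freedman et al. (2015) estimate.
# The ~$56.4B annual US preclinical spend baseline comes from their paper;
# their headline ~$28B figure sits near the product of that baseline and
# their central irreproducibility rate.
US_PRECLINICAL_SPEND_B = 56.4  # billions of dollars per year

for label, rate in [("low", 0.18), ("central", 0.53), ("high", 0.89)]:
    wasted = rate * US_PRECLINICAL_SPEND_B
    print(f"{label:7s} rate {rate:.0%}: ~${wasted:.0f}B/year irreproducible")
```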

Going hand in hand with the financial costs, we cannot forget the time invested in these irreproducible studies. One could argue that this is the more damaging factor, given that it slows the development of potentially lifesaving treatments and interventions that would significantly improve quality of life for large proportions of society. This is an even more difficult, if not impossible, metric to measure accurately, but it must logically be affected. For example, consider this: a researcher has a hypothesis and carries out experiments to test it. The first two experiments are successful and seem to confirm the hypothesis, so their findings are published. This naturally leads to a third experiment, which seems to strongly disprove the hypothesis, and the researcher abandons this line of work to move on to something else. The researcher is unlikely to publish the findings of the third experiment, but the two other papers will remain published. This may lead other research groups to continue the work from the first two papers, perhaps even carrying out the same failed experiment, unaware that it has already been tried and rejected. Again, it is difficult to know how often something like this happens, simply because we do not know how often researchers leave their negative results unpublished. However, even a cursory read through some biomedical journals will reveal that papers with negative results are few and far between.
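
To make that scenario concrete, here is a minimal and entirely hypothetical file-drawer simulation: every hypothesis tested is in fact false, each experiment still comes out “significant” 5% of the time, and only significant results are published. The published record then consists solely of false positives, while the corrective negative results stay invisible:

```python
import random

random.seed(42)
ALPHA = 0.05            # conventional significance threshold
N_EXPERIMENTS = 10_000  # experiments probing effects that do not exist

# A truly null effect still yields a spurious "significant" result with
# probability ALPHA; in this toy model only those results get published.
published = sum(random.random() < ALPHA for _ in range(N_EXPERIMENTS))

print(f"experiments run:   {N_EXPERIMENTS}")
print(f"results published: {published}")  # roughly 500, all false positives
print(f"correct negative results left in the file drawer: "
      f"{N_EXPERIMENTS - published}")
```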

Yet another group that believes this is a major issue in need of addressing is the Global Biological Standards Institute (GBSI). The GBSI carried out a study in which it interviewed 60 individuals throughout the life science community, including biopharmaceutical R&D executives, academic and industry researchers, editors of peer-reviewed journals, leaders of scientific professional societies, experts in quality management and standards, and academic research leaders, among others (GBSI, 2013). These extensive interviews revealed a systemic and pervasive problem with reproducibility. Over 80% of the academic research leaders interviewed had some experience with irreproducibility. The reasons given included inconsistencies between labs, non-standardised reagents, variability in assays and cell lines, experimental bias, differences in statistical methods, a lack of appropriate controls and several others.

Why we falter

There is a perception amongst the general public that scientists are a meticulous, highly organised and extremely intelligent section of society (Castell et al., 2014). This perception is certainly not without a basis in reality, but it fails to appreciate the human aspect of scientists and our work. Mistakes happen; negligence occurs. Politics, money, bureaucracy and rivalries all get in the way of scientific research (GBSI, 2013; Wilmshurst, 2007). According to the GBSI report this all happens on a fairly regular basis, with one researcher quoted as saying “We’ve had to address issues with replicating published work many times. Sometimes it’s a technical issue, sometimes a reagent issue, sometimes it’s that the technique was not being used appropriately” (GBSI, 2013). Within the biomedical research community these obstacles are a disliked but tolerated part of doing science. As their prevalence demonstrates, they are unfortunately sometimes considered part of the job and simply how the system works (Tavare, 2012; Wilmshurst, 2007).

This is a dangerous and irresponsible attitude to allow to continue within our community. It is precisely because of our line of work that we should seek to uphold the highest professional and academic standards. Our work can quite literally be the difference between someone living or dying, or having to suffer a debilitating illness on a daily basis. Just because our impact is delayed by its long journey from bench to bedside does not make it any less crucial to people’s lives.

Yes, there are reasons for things being the way they are. The publishing process is outdated. There is never enough funding to go around, and what little there is often has strict criteria attached to it (Editors, 2011). When applying for positions in academia publications are king, with quantity placed above quality or accuracy in many cases, and these publications are dominated by a select few in the field (Ioannidis et al., 2014). It is therefore unfortunately often necessary for scientists to play the game and submit to the demands of the system. Corners are cut, statistics are “reinterpreted” and results exaggerated, all in the hope of getting past a journal’s review panel or having a grant proposal approved (Baker, 2016; GBSI, 2013). None of this is done maliciously, of course, at least not usually, but malicious or not the negative effects are the same. Whatever the reason, it is still bad science.

The solutions

So what is the solution? First we must appreciate that the problems leading to this situation are clearly multifaceted and require concerted efforts from groups and individuals at all levels of the community. We will discuss four of the main areas that could be improved upon.

Publishing/Journals

One change that could have the most wide-reaching effect would be a greater emphasis on the importance of replicability in studies. Just as the number of citations a researcher has is today considered an important metric for the quality of their work, the number of a researcher’s papers that have been reproduced by other groups should be strongly considered too. Standardised metrics such as this would help place greater importance on the quality of a piece of work, rather than just the exciting nature of its claims, and could reduce the pressure to publish in the highest impact factor journals. With standardised metrics the prestige of a journal won’t necessarily matter as long as a paper’s quality metrics are solid.
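
As a purely hypothetical sketch of what such a metric might look like (the name `r_index` and its definition are our illustration, not an established measure):

```python
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    independent_replications: int  # successful reproductions by other groups

def r_index(papers: list[Paper]) -> float:
    """Hypothetical metric: fraction of a researcher's papers that have
    been independently reproduced at least once."""
    if not papers:
        return 0.0
    replicated = sum(p.independent_replications > 0 for p in papers)
    return replicated / len(papers)

record = [
    Paper("exciting but unconfirmed claim", independent_replications=0),
    Paper("solid methods paper", independent_replications=3),
    Paper("confirmatory study", independent_replications=1),
]
print(f"r-index: {r_index(record):.2f}")  # 0.67: two of three reproduced
```

Unlike a raw citation count, a measure along these lines would reward work that other groups have actually been able to build on.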

For this to work, however, journals need to be more accepting of papers whose sole purpose is to reproduce or confirm another group’s work, instead of favouring papers that report new discoveries or interventions. In addition, journals could allow researchers to pre-register their planned experiments with them in exchange for a potential fast track to publication if they then carry out those experiments, even if these give negative results. This would go a long way towards improving transparency and encouraging researchers not to discard unfavourable experiments. It would also help avoid situations like the one discussed above, where researchers continued working on the basis of experiments they were unaware had already been disproven. Examples of initiatives moving towards these goals include the AllTrials campaign and the registered reports approach (AllTrials, 2016; Chambers, 2016).

Standards and practices

Next, there needs to be expanded development and adoption of standards and best practices. Standards can be physical, such as standardised assays and cell lines, or documentary, such as protocols and practices. For this to happen, guidelines for best practices and standards need to be easily accessible and widely available, which is one of the aims of the EQUATOR network (EQUATOR, 2016). Improvement in standards is arguably the most important factor for increasing reliability, but given its wide-ranging, complicated and technical nature we will leave its discussion for others to pursue.

Funding bodies

Similarly to journals, funding bodies could introduce mechanisms to reward reproducibility in a researcher’s work. They should look at an investigator’s track record of producing reproducible work (new journal metrics would complement this) and also examine the current grant application to see whether it is set up to produce reproducible results. The NIH, for example, introduced new guidelines beginning in January 2016 to improve reproducibility in grant applications (NIH, 2016). The four main areas the guidelines seek to address are the scientific premise of the proposed research, rigorous experimental design for robust and unbiased results, the consideration of relevant biological variables and the authentication of key biological and/or chemical resources proposed in a grant application.

Investigators/Institutions

Finally, and perhaps ultimately, the responsibility for producing high quality, reproducible work comes down to the principal investigators, their lab groups and the institutions that host them. These are the people who can have the most immediate impact, through self-correction and adherence to the best standards and practices whenever and wherever possible.

The bottom line

This article is in no way all-encompassing regarding the issue of reproducibility. Several problems and solutions exist that we have not discussed in detail, not least the troubling subject of researcher bias. The references below do, however, discuss some of these in greater detail. Many will argue that implementing the above proposals will require additional work and possibly some significant upfront costs, but we would counter that the longer-term impact on biomedical research would be immense and one we cannot afford to miss out on.

Comments on this article (4)

Version 1, published 30 Mar 2016

  • Reader Comment, 28 Apr 2023
    Jagannadha Avasarala, Neurology, University of Kentucky, Lexington, USA
    This is a great topic for discussion not just because of the tax-payer dollars/money lost but also because of years and decades lost to junk science. As an extension of ...
  • Reader Comment, 09 May 2016
    Omar González-Santiago, Universidad Autónoma de Nuevo León, Mexico
    Standing on the shoulders of giants should be the premise in scientific research but where are the real giants, the real data. Any scientific research should be tested with acid ...
  • Reader Comment, 05 Apr 2016
    Shaun Lehmann, Australian National University, Australia
    I am glad that you are highlighting this issue as it is of real importance.
    I am currently working in phylogenetic method development and I have had quite a lot of ...
  • Author Response, 31 Mar 2016
    Hussein Jaafar, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
    A spelling mistake in the abstract has been noted and will be amended after peer review.
    Competing Interests: Author of article
How to cite this article: Jaafar H and Maweni RM. A failure to reproduce: How bad biomedical science is holding us back [version 1; peer review: 1 approved with reservations, 2 not approved]. F1000Research 2016, 5:415 (https://doi.org/10.12688/f1000research.8370.1)

Open Peer Review

Reviewer Report, 15 Jul 2016
Neil Chue Hong, Software Sustainability Institute, University of Edinburgh, Edinburgh, UK
Status: Not Approved
I carry out this review following the guidelines set in "An Open Science Peer Review Oath".
There has been increasing scrutiny of the many fields of research, with reproducible research being used as one of the key ...
Reviewer Report, 09 May 2016
Krzysztof Gorgolewski, Department of Psychology, Stanford University, Stanford, CA, USA
Status: Approved with Reservations
This is a very well written paper discussing the important and timely topic of reproducibility. As an opinion piece it is factually accurate, but despite a promising introduction to the topic, it provides very little in terms of practical solutions. Listed ...
Reviewer Report, 20 Apr 2016
Gary G Borisy, Department of Microbiology, The Forsyth Institute, Cambridge, MA, USA
Status: Not Approved
Jaafar and Maweni are entitled to their opinion but I don't believe they have contributed in a sufficiently significant way to warrant being indexed. They provide no original analysis of their own; they provide a brief overview of contributions by ...
  • Author Response, 21 Apr 2016
    Hussein Jaafar, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
    Thank you for your review Dr. Borisy. We would like to respond to your critiques.
    "They provide no original analysis of their own; they provide a brief overview of contributions by ...
