Opinion Article

What is reproducibility?

[version 1; peer review: 3 approved with reservations]
PUBLISHED 09 Jan 2019

This article is included in the Research on Research, Policy & Culture gateway.

Abstract

The debate on reproducibility in biomedicine will gain precision only if we agree on what reproducibility means. Importantly, reproducibility should be distinguished from validity (“truth”). We propose the application of an equivalence trials framework to clarify the concept of reproducibility, replacing the (narrow) equivalence zone around a zero difference with a zone of reproducibility around (a) previous finding(s).

Keywords

reproducibility, replicability, repeatability, agreement, validation, truth, methodology, equivalence

Introduction

Reproducibility is said to be a core principle of scientific progress. Nevertheless, poor reproducibility has recently been shown to haunt preclinical research1,2, translational research3, medicine4 and psychology5. False-positive initial results due to random chance or incorrect study design were among the reasons implicated, as well as data-dredging, publication bias and misconduct. Others called irreproducible results ‘biased’1 and ‘unreliable’5.

Coming from a background of meta-analysis, with its countless examples of unexplained heterogeneity and an ingrained appreciation of sampling variability, we were surprised that the outcries cited above were not accompanied by a formal definition of the concept of reproducibility. Goodman et al. did define three types of reproducibility (methods, results, and inferences) and stated that confusion arises when, inadvertently, people use reproducibility as a synonym for “truth”6. We read their paper as being about truth, although its title suggests otherwise. Our paper is about reproducibility sensu stricto: we revisit some basic definitions of reproducibility, note that these definitions are problematic, and argue that the concept of equivalence in randomized trials may be fruitfully applied to sharpen our understanding of what we mean by reproducibility. We propose that investigators aiming to reproduce others’ findings should pay more attention to predefining a margin of (unacceptable) discordance with existing findings.

Discussion

Box 1 shows two formal definitions of the concept of reproducibility.

Box 1

Definition 1:

“The value below which the absolute difference between two single test [or study, our addition] results may be expected to lie with a probability of 95%, when the results are obtained by the same method and equipment from identical test material in the same setting by the same operator within short intervals of time. A test or measurement [or study, our addition] is reproducible if the results are identical or closely similar each time it is conducted (Synonym, repeatability)”7

Definition 2:

“The degree of agreement among a set of observations […] after all known sources of error are accounted for (Synonym, precision)”8

Note the following differences between definitions 1 and 2:

  • (i) In definition 1, reproducibility is taken to be a binary concept: a result is either reproduced or not. Definition 2 takes reproducibility to be a continuous concept, a degree of concordance.

  • (ii) Related to (i), definition 1 implies the subjective choice of a difference, δ, whose value will depend on the measurement problem at hand. Definition 2 avoids a choice of δ.

  • (iii) Definition 1 fixes the confidence level at 95%. Definition 2 avoids subjective choices of a particular confidence level, such as 95%, 90% or 68%.

  • (iv) Only definition 2 emphasizes measurement that is free of bias.
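To make the 95% limit in definition 1 concrete, the following minimal sketch (ours, not part of the cited definition) estimates it from a set of replicate measurements. It assumes normally distributed measurement error, under which roughly 95% of absolute differences between two single results fall below 1.96·√2 times the within-condition standard deviation; the toy data and variable names are hypothetical.

import numpy as np

# Hypothetical replicate measurements of the same test material, obtained
# with the same method, equipment, operator and setting (definition 1).
replicates = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7])

# Within-condition standard deviation of single measurements.
s_w = replicates.std(ddof=1)

# Under a normal-error model, the difference between two single measurements
# has standard deviation sqrt(2) * s_w, so roughly 95% of absolute differences
# fall below 1.96 * sqrt(2) * s_w (about 2.77 * s_w).
repeatability_limit = 1.96 * np.sqrt(2) * s_w
print(f"s_w = {s_w:.3f}; 95% repeatability limit = {repeatability_limit:.3f}")

With replicate measurements on several test materials, s_w would instead be estimated from the pooled within-material variance, but the 95% limit would be computed in the same way.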

Reproducibility studies may be seen as a type of equivalence trial (see Figure 1). Briefly, in classic superiority trials, we pose a statistical null hypothesis of no difference, which we then seek to reject in order to conclude that a difference exists. In equivalence trials, we define a (narrow) zone around a zero difference (between, say, our new drug and an existing one) and establish equivalence if the entire confidence interval for the difference lies inside that zone. In this article, we propose to replace the difference of zero with the (pooled) value of the previous study or studies (vertical line in Figure 1). The width of the grey equivalence zone, or “zone of reproducibility”, is crucial, and it seems sensible to define it pragmatically for each research situation separately. Without concrete ideas about the maximal width of this zone, judgments of when a result counts as reproduced can be quite subjective. For example, Begley and Ellis considered positive results as not reproduced if the replicate findings were not sufficiently robust to drive a drug-development program. Ioannidis considered the results of a therapeutic intervention as reproduced if the researchers’ final interpretation of the data in both studies was that the intervention was effective (or ineffective). Figure 1, however, shows that even when one has strictly defined the width of the zone and a suitable type of confidence interval, undecided outcomes may still occur (scenarios 5–7, Figure 1).


Figure 1. Analogy between equivalence trials framework and reproducibility (concordance): 9 examples.

Numbers in brackets refer to the 9 scenarios; horizontal lines are xx% confidence intervals (CIs), where xx = 95, 90, 68, etc.; short vertical lines depict point estimates; the grey area signifies the zone of reproducibility; delta (δ) refers to the maximal absolute difference below which reproducibility (concordance with (an) existing finding(s)) is deemed present. Scenarios 1–4: reproducibility is present, since the new point estimate and its entire xx% CI lie within the grey zone. Scenarios 5–6: presence of reproducibility is uncertain, since the point estimate lies inside the grey zone but the xx% CI does not. Scenario 7: presence of reproducibility is uncertain, since the point estimate lies outside the grey zone but part of its xx% CI lies inside. Scenarios 8–9: reproducibility is absent, since the point estimate and the corresponding xx% CI lie outside the grey zone. Note that two components are subjective: (1) the choice of δ, although it should preferably be chosen with a thorough understanding of the theory or application of the research problem, and (2) the type of confidence interval, since choices other than a 95% CI may be possible and defensible. Note also that, even after delta and the type of confidence limit have been chosen, uncertainty may persist if confidence limits overlap the boundaries of delta.
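The decision rule that Figure 1 illustrates can be written down directly. The sketch below is our illustration, not code from the article: it classifies a new result against a zone of reproducibility of half-width delta centred on the previous (pooled) estimate, and the function name, arguments and example numbers are hypothetical.

def classify_reproducibility(prev_estimate, delta, new_estimate, ci_low, ci_high):
    """Classify a replication attempt against the zone of reproducibility
    [prev_estimate - delta, prev_estimate + delta] (cf. Figure 1)."""
    zone_low, zone_high = prev_estimate - delta, prev_estimate + delta
    ci_inside = zone_low <= ci_low and ci_high <= zone_high
    ci_overlaps = ci_high >= zone_low and ci_low <= zone_high
    point_inside = zone_low <= new_estimate <= zone_high

    if ci_inside:                    # scenarios 1-4: entire CI within the zone
        return "reproduced"
    if point_inside or ci_overlaps:  # scenarios 5-7: CI crosses a zone boundary
        return "uncertain"
    return "not reproduced"          # scenarios 8-9: estimate and CI outside the zone


# Illustrative use: previous pooled estimate 0.50, delta 0.20, and a new study
# reporting 0.65 with a 95% CI from 0.45 to 0.85 -> prints "uncertain",
# as in scenarios 5-6 of Figure 1.
print(classify_reproducibility(0.50, 0.20, 0.65, 0.45, 0.85))

Whether such a three-way classification is appropriate, and how delta and the confidence level are chosen, remain the subjective choices discussed in the figure note.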

Reproducibility studies imply healthy scepticism: “Can we reproduce this finding?” In contrast with the comment cited above, which states that irreproducible results are biased, we emphasize that (ir)reproducibility of results says nothing about the validity of either the previous or the current findings. For that, we need (validity) judgments about the rigor of study design and execution. Meta-analyses of many small, but concordant, studies that were subsequently negated by the result of a single mega-trial (believed by many to represent the truth) illustrate this point9.

In conclusion, the concept of reproducibility (repeatability, precision) should be distinguished from validity (“truth”). Furthermore, an equivalence trials framework can be fruitfully used to clarify the concept of reproducibility if we replace the (narrow) equivalence zone around a zero difference with a zone of reproducibility around (a) previous finding(s). Care should be exercised when selecting sensible margins (delta) to decide on the reproducibility of results10.

Data availability

No data are associated with this article.

How to cite this article:
ter Riet G, Storosum BWC and Zwinderman AH. What is reproducibility? [version 1; peer review: 3 approved with reservations]. F1000Research 2019, 8:36 (https://doi.org/10.12688/f1000research.17615.1)

Open Peer Review

Reviewer Report, 25 Feb 2019
Ksenija Bazdaric, Department of Medical Informatics, University of Rijeka Faculty of Medicine, Rijeka, Croatia
Approved with Reservations
“Thank you for giving me the opportunity to read this manuscript. It was very interesting. As opinion pieces are not supposed to be very long I understand that not all concepts/constructs could have been explained in detail. I think the ...”
Cite as: Bazdaric K. Reviewer Report For: What is reproducibility? [version 1; peer review: 3 approved with reservations]. F1000Research 2019, 8:36 (https://doi.org/10.5256/f1000research.19261.r44140)

Reviewer Report, 15 Feb 2019
C. Glenn Begley, BioCurate Pty. Ltd., Parkville, VIC, Australia
Approved with Reservations
“The issue of data reproducibility is central to science and is worthy of ongoing discussion. Although the authors state at the outset that ‘Reproducibility is said to be a core principle of scientific progress’, to me it IS a core principle. ...”
Cite as: Begley CG. Reviewer Report For: What is reproducibility? [version 1; peer review: 3 approved with reservations]. F1000Research 2019, 8:36 (https://doi.org/10.5256/f1000research.19261.r44139)

Reviewer Report, 13 Feb 2019
Steven N. Goodman, Departments of Health Research and Policy (Epidemiology) and Medicine, Stanford University, Stanford, CA, USA
Approved with Reservations
“This is a thoughtful piece that attempts to offer a construct that will help define research reproducibility. They say that their purpose is to offer an operational definition of reproducibility that they claim a previous paper entitled ‘What does research ...”
Cite as: Goodman SN. Reviewer Report For: What is reproducibility? [version 1; peer review: 3 approved with reservations]. F1000Research 2019, 8:36 (https://doi.org/10.5256/f1000research.19261.r42838)
