The imperative to find the courage to redesign the biomedical research enterprise [version 1; peer review: awaiting peer review]

Medical research aims to improve health for everyone. While its advances are undeniable, the pace and cost of progress are far from optimal. For example, independent analyses concluded that at least half of published biomedical research findings are irreproducible, and most scientific papers are never read or cited. This paper examines biomedical research holistically, as a system of incentives that shape the behavior of scientists, administrators, publishers, and funders in ways that are detrimental to medical progress. We identify opportunities to change and improve those incentives by altering the way research output is disseminated and evaluated, and recommend transparent, data-driven measures of the methodological rigor, reproducibility, and societal value of scientific discoveries. Embracing these opportunities would maximize our investments in biomedical research and optimize its value to human health, while simultaneously increasing the freedom, creativity, and satisfaction of the scientific workforce.


Introduction
At least half of the human and financial resources currently devoted to academic biomedical research generate findings that may not enhance collective knowledge or are unlikely to improve health. Every independent attempt to reproduce published biomedical studies has found stunning irreproducibility rates, ranging from 51% to 89%. [1][2][3][4][5][6][7] In a survey of more than 1500 researchers, 90% of respondents believed there is a reproducibility crisis in biomedical sciences. 8 Current solutions to this crisis focus on better training of scientists in the methods of reproducible research and on transparent reporting of methodological details. 9,10 While these efforts target important aspects of the problem, they fail to address its roots. This paper treats the research enterprise as a system and examines how that system shapes the behavior of scientists and administrators. It asks whether the current design of publicly funded biomedical research maximizes its societal value and considers modifications to its organizational structure to enhance research reproducibility and its relevance to human health.
What is the desired outcome or mission of biomedical research? The mission of the US National Institutes of Health (NIH), the largest publicly funded biomedical research enterprise in the world, is to "Uncover new knowledge that will lead to better health for everyone." One could assume that "uncovering new knowledge" should be optimized in terms of speed and value to improving health. To "lead to better health for everyone," the knowledge must be both reproducible, which ensures that we can trust the results and build upon them, and relevant to human health. 11 To achieve optimal speed and value in advancing the NIH's mission, the system must be organized around the pursuit of these implicit goals, which presently it is not. One way to see this is to explore the differences between the current biomedical research enterprise and uniquely successful past public projects such as the Manhattan Project 12 and the Apollo Mission. 13 The Manhattan and Apollo projects were entirely mission-driven, with defined, concrete end-products. Accomplishment of their missions, in the expected timeframe, was the definition of success for all actors, from funders to administrators and scientists (Figure 1A). Every expense, every human resource decision, every regulation imposed was judged based on its relevance to the mission: will it bring us closer to or further from the goal? Thus, the administrators and funders of successful organizations facilitate innovation by dismantling barriers and producing environments conducive to creativity and productivity.

Figure 1. A. Successful institutions or temporary endeavors such as the Manhattan Project or NASA's Apollo program are strictly mission-driven: all players (i.e., funders, administrators, and scientists) are rewarded based on achieving defined organizational goals. This leads to synergy between players, creating an environment that seeks and values creativity, diversity, cooperation, and shared responsibility. B. In most current, publicly funded biomedical research institutions, the mission exerts only a minor influence on daily decisions. Goals and award structures are largely separate for different actors. This creates inefficiencies and administrative burdens and wastes resources.
For example, at the inception of the Apollo project, the paucity of skilled aeronautic engineers was solved by administrators flying to Canada, recruiting Canadian engineers, and facilitating their integration into NASA teams. 13 There was a concerted effort to foster collaboration within the endeavor, challenging the current belief that internal competition generates efficient organizations. [14][15][16][17][18] Analogously, funders of productive organizations either directly lead the effort or facilitate and monitor accomplishment of the mission, consistent with their (often financial) interest in the outcome.
In stark contrast, while current publicly funded biomedical research institutions have defined broad aims (to improve health), there are few concrete steps or timelines, and these aims exert only a minor influence on daily decisions by researchers, administrators, and funders. Although researchers need freedom to pursue novel hypotheses, as these often lead to important discoveries, the award structures must unite the various actors within the system behind the organizational mission. Disturbingly, this is not currently the case, as goals and awards are often completely disconnected from the mission (Figure 1B).
For example, the NIH intramural research program has close to 100 different electronic systems. Most fulfill necessary administrative or regulatory needs, but generally disregard the burdens their mandated use imposes on researchers. Because these systems largely do not communicate with each other, researchers must repeatedly input identical data in different formats and at different time intervals mandated by each electronic system. Thus, medical providers and researchers with advanced degrees perform repetitive tasks that could be completed much faster, and without errors, by computers. Even worse, the lack of harmonization creates an almost comical level of complexity, where researchers must "remember" that the same event, for example a side effect of a medication, is considered "serious" by one regulatory system and must be entered immediately, "expected" by another system and should be entered as aggregate yearly data, and "serious but expected" by a third. These arbitrary complexities cause mistakes, which must be found and reported, creating a vicious cycle of useless paperwork.
One reason for such inefficiencies is that administrators are judged not on the output of the scientists in their division, much less on the reproducibility of the work coming out of their labs, but rather on the administrator's skill in accomplishing administrative tasks and, often, on their ability to raise money for the institution. No administrator is punished for being unwilling to remove an impediment to efficient scientific production, such as the NIH's byzantine electronic systems. On the contrary, the disincentives of potential failure and the effort needed for transformative leadership eventually stifle even the most mission-driven administrator.
Examining the incentive structure for scientists
Similarly, research teams, the "engines" discovering new knowledge, also operate within a reward structure that is not conducive to producing the most relevant and reproducible research possible.
Research teams typically consist of three to 30 people led by a single principal investigator (PI). Up to 80% of the team's workforce are trainees. Their inevitable turnover impedes the continuity of long-term projects, such as clinical trials. The peak of a career trajectory for an academic researcher is to attain a (scarce) tenured PI position with an independent laboratory and budget. The currency of their realm is publications: how many articles are published; the prestige of the journals in which the articles are published; and the fame these articles generate.
Astonishingly, research reproducibility and its value to health are almost never considered among hiring, promotion and tenure criteria. 19 There are few penalties for scientists who routinely publish irreproducible results except in cases of outright fraud or research misconduct. This gross, de-facto misalignment of the researcher's incentives causes the irreproducibility crisis.
Incremental changes versus system re-design?
Numerous authors and organizations have suggested or implemented incremental enhancements to the research enterprise. Partially successful fixes, such as pre-print services and journals that embrace open reviews and/or implement publication checklists providing transparency in disclosing methodological details, are steps in the right direction, but thus far they have failed to make a significant dent in irreproducibility. These incremental solutions often cannot become permanent in the face of existing incentives, as was the case with PubMed Commons post-publication open peer review, 20 which was discontinued because its use offered no advantage to scientists' careers.
Change is difficult. Even people deeply dissatisfied with the status quo fear its disruptive consequences. Furthermore, redesigning the publicly funded biomedical research enterprise is a daunting task, as no single entity oversees research in its entirety. Nevertheless, NIH is the leading force in the US, and as demonstrated in the coronavirus disease 2019 (COVID-19) pandemic, is best positioned to establish fruitful collaboration among funding agencies, academic institutions, and private entities to lead transformative initiatives.
Reforming the scientific enterprise into a self-regulating system
A mission-driven approach to scientific reform includes the following desired outcomes:
1. Incentivize reproducible, impactful research that is efficiently performed and rapidly disseminated.
2. Turn scientists into collaborators by rewarding sharing of resources, dynamic data re-analyses, and transparent criticism and correction of the methods, results, and interpretations.
3. Re-envision scientific teams by utilizing individuals' strengths, by fairly and transparently distributing benefits from scientific accomplishments to team members, and by facilitating dynamic collaborations.
To attain these outcomes, the greatest opportunity lies in leveraging technological advances that can integrate the existing system for cataloguing scientific publications (e.g., PubMed) with an objective, data-driven assessment of scientific papers' technical quality, generalizability, and resulting societal value. We call such a dynamic data integration the "biomedical research network" (BRN, Figure 2). This free, transparently curated resource would transform the way in which scientists (and their institutions) are judged and rewarded by their funders and the public, thereby affecting all outcomes listed above.

BRN and its metrics
We envision three ways to measure contributions to biomedical research objectively: a methodological rigor (MR) score; a reproducibility (R) score; and a societal impact (SI) score. Every publication would be assessed by each of these metrics.

Figure 2. Electronic navigation allows zooming in on a specific branch of scientific investigation, exemplified by the IL17 subnetwork (B). The IL17 branch begins with the discovery/sequencing of the IL17 gene in 1993. Many new branches of investigation relate to IL17, but the branch linking IL17 to autoimmunity is the most successful. Although IL17 has been associated with many animal models of autoimmunity and with many human diseases, only three subdivisions are highlighted due to space constraints: IL17 and ankylosing spondylitis (AS), IL17 and psoriasis, and IL17 and multiple sclerosis (MS). Of these, the IL17 and AS and the IL17 and psoriasis branches are highly successful, with proven therapeutic efficacy of many direct and indirect IL17 inhibitors. Due to this commercial and public health impact, the publications forming the branches highlighted in green accumulate positive Societal Impact (SI) scores, whereas the IL17 and MS branch has thus far had no commercial success; therefore, its publications do not partake in positive SI score allocation. The size of the circles of the defining papers (network "hubs," exemplified by the publication of the discovery of IL17, the first publication associating IL17 with autoimmunity, the first publication associating IL17 with human psoriasis, etc.) is proportional to the assigned SI scores: the discovery of IL17 has the highest Scientific Impact subscore, whereas positive clinical trials of IL17-blocking agents in AS and psoriasis have higher Commercial Impact subscores. Studies #1-3 are described in detail in the text; Studies #2-3 attempted to reproduce the findings of the high-impact Study #1 but failed. The font color in the integrated-knowledge diagram is proportional to the MR scores of the three publications.
Methodological rigor (MR) score
The first metric, the MR score, is the most critical and the easiest to implement. It would be based on established attributes of reproducible research, such as blinding, power calculation, randomization, use of controls, validation of results in an independent cohort, and availability of raw data and analysis code. 4,21,22 Though the quickest way forward is to assign an MR score to a paper based on a checklist self-reported by the author(s), this may generate bias and cannot be applied to already-published papers. Thus, at the BRN's inception, reviewers would need to validate the self-reported MR score against the Methods section of the paper. The latter can be accomplished through an open review system associated, for example, with a free, public pre-print service such as medRxiv. MedRxiv is run by a consortium of two academic institutions and the BMJ, is funded by a private foundation, the Chan Zuckerberg Initiative, and has proven to be an invaluable aggregator of science associated with the COVID-19 pandemic. Reviewers' curated MR scores would serve as the dataset for the development, optimization, and validation of an automated, word-recognition-based algorithm that assigns an MR score by scanning the Methods section of scientific papers.
Scoring all publications would produce an MR score for individual scientists, averaged across their publications. This would incentivize all co-authors to support methodological rigor for every experiment, effectively self-regulating high-quality experimental design. The MR score would also help non-experts judge the technical rigor of the presented work.
Scientists may raise two objections. First, retrospective grading may not accurately reflect the quality of the original experimental design, because journals historically restricted the word count of the Methods section. This can be remedied by standardizing an individual paper's MR score to the yearly average, a solution used in standardized testing. Second, authors may disagree with an assigned score. True mistakes, if validated in an open review, would improve the automated algorithm, possibly through public competitions similar to the Netflix Prize.
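As a minimal sketch of this standardization idea, a paper's raw MR score could be expressed as a z-score against all papers from its publication year; the function names, scoring scale, and the choice of a plain z-score are illustrative assumptions, not part of the proposal:

```python
from statistics import mean, pstdev

def standardized_mr(raw_score: float, same_year_scores: list[float]) -> float:
    """Standardize one paper's raw MR score against all papers from the
    same publication year, so papers written under historical word limits
    for Methods sections are compared only with their contemporaries."""
    mu = mean(same_year_scores)
    sigma = pstdev(same_year_scores)
    if sigma == 0:
        return 0.0  # every paper that year scored identically
    return (raw_score - mu) / sigma

def scientist_mr(papers: list[tuple[float, list[float]]]) -> float:
    """A scientist's cumulative MR score: the average of the standardized
    scores of their publications (each paired with its yearly cohort)."""
    return mean(standardized_mr(score, cohort) for score, cohort in papers)
```

For instance, a paper scoring 8 in a year whose papers scored 4, 6, and 8 would receive a standardized score of about +1.22, while the median paper of that year scores 0.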
The evolution of MR scores on a population level would objectively measure the success of the BRN strategy to combat irreproducibility. Improvements in a scientist's standardized MR scores could represent professional growth. Additional training initiatives intended to improve methodological deficiencies and currently imposed on all scientists, could be more effectively targeted to the scientists with consistently low MR scores.
Reproducibility (R) score
Unless we demand pre-registration of all experimental designs, akin to the pre-registration of clinical trials in www.clinicaltrials.gov, the MR score will depend on what scientists choose to disclose. Some fear that pre-registration of all experiments would increase administrative burden 23 and stifle innovation. Others believe that a lack of pre-registration replaces scientific predictions with less credible "postdictions", 24 exacerbating irreproducibility. A solution that integrates both views is to objectively assess the reproducibility of results, or more broadly, the "scientific truth" 25 of each study, post-publication.
Using language recognition, the BRN would assemble all related publications into a subnetwork (Figure 2B), in which the publication describing the crucial discovery that spurred a subsequent line of investigation would represent the main hub, located at the center of the subnetwork. For example, the discovery of the cytokine IL17 would sit at the hub of a subnetwork, and subsequent papers describing its role in autoimmune diseases would form one of the most successful "branches" of this IL17 subnetwork. Each new, related publication would be placed at the periphery of such a subnetwork, while minimizing the network "distance" between closely related research papers.
To illustrate how the BRN associates related articles and uses their MR scores to assign a Reproducibility, or R, score, consider three related articles that investigate the role of IL17F in predicting the therapeutic response to the drug interferon beta (IFNb) in treating multiple sclerosis (MS). Study #1 26 was published in the high-impact journal Nature Medicine. Using only 26 MS patients and no methods to prevent bias, this study extended observations in mouse models to humans, concluding that high blood levels of IL17F predict a poor response to IFNb. Two validation studies (Studies #2 27 and #3 28 ), which collectively (and blindly) analyzed samples from 357 randomly selected MS patients, demonstrated unequivocally that serum levels of IL17F do not predict IFNb response in MS. In contrast to PubMed, which does not automatically associate related studies, the BRN algorithm places the validation studies in proximity to Study #1 (i.e., on the same branch). Additionally, the BRN automatically integrates knowledge (Figure 2B) from related studies: because Studies #2 and #3 have high MR scores and validate each other, both receive high R scores. In contrast, Study #1 receives a low R score because it has a low MR score and was, in fact, not reproducible. To compute the R score, the algorithm compares the results of the two congruent studies, "weighted" by their MR scores, against the MR score of the conflicting study, to assign the probability that the specific result is true: the probability that serum IL17F levels predict the therapeutic response to IFNb based on this integrated knowledge is very low, say 0.1%; therefore, Study #1 receives an R score of 0.001, whereas Studies #2 and #3 receive R scores of 0.999.
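The MR-weighted evidence pooling can be sketched as follows, assuming the simplest possible linear pooling rule and hypothetical MR values on a 0-1 scale; a calibrated probability model would be needed to reach extremes like the 0.001/0.999 of the example above:

```python
def r_scores(studies: list[tuple[float, bool]]) -> list[float]:
    """Each study is (mr_score, supports_claim). Pool the MR-weighted
    evidence for and against a claim, estimate the probability that the
    claim is true, and give each study an R score equal to the probability
    that its own conclusion is correct."""
    support = sum(mr for mr, s in studies if s)
    against = sum(mr for mr, s in studies if not s)
    p_true = support / (support + against)
    return [p_true if s else 1 - p_true for _, s in studies]

# Study #1 (low MR, claims IL17F predicts IFNb response) versus
# Studies #2-3 (high MR, refute the claim); MR values are hypothetical:
scores = r_scores([(0.1, True), (0.9, False), (0.9, False)])
```

With these toy weights, Study #1's R score comes out near 0.05 and the validation studies' near 0.95; the point is only that congruent, rigorous studies dominate a single low-rigor outlier.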
The goal is to employ technology to curate knowledge while rewarding scientists for greater rigor. Because experimental rigor positively correlates with reproducibility, 21 consistently low R scores paired with consistently high MR scores would call the accuracy of self-reported experimental designs into question. Such discrepancies should become exceedingly rare, as the mere understanding that objective curation will identify contradictions would naturally limit any temptation to cheat.
The most obvious objection to implementing the R score is its potential inaccuracy. Like every machine-learning algorithm, the one generating R scores will not be 100% accurate in its first iteration, but access to large, curated datasets (provided by authors' contestations and human reviewers' adjudications of R scores) will allow data-driven optimization.
Ultimately, this will generate algorithms that consistently outperform humans. 29

Societal impact (SI) score and its scientific, commercial, and public health contributions
Innovation and the relevance of scientific discoveries to health are essential attributes of the NIH mission. Indeed, science can be well-designed and reproducible, but still irrelevant. 11 To assure that biomedical science is innovative and generates societal value, we must learn to quantify its worth objectively.
The impact of a discovery on science itself can be measured within the BRN as the growth of a new branch of scientific exploration. For example, basic scientific discoveries such as the reprogramming of somatic cells into induced pluripotent stem cells (iPS) 30 and CRISPR-Cas9 genetic editing both established new branches of science. 31 Thus, their scientific impact is enormous. To assess the public health and commercial impacts of scientific discoveries, the BRN must link to databases of patents, newly approved drugs, medical products, and new technologies. This would trace the origin of societal and public health advances back to individual scientists in order to develop integrated SI scores. The goal of the SI score is to objectively measure the impact of individual discoveries and the cumulative impact of individual scientists or even scientific institutions, which can be used to reward institutional leaders. Current attempts to assign value to a publication or a researcher, such as the NIH's Relative Citation Ratio, are insufficient because they rely on the number of accrued citations. In our concrete example, the irreproducible publication in the prestigious journal is easily found in internet searches and continues to accrue citations years after its findings were unequivocally refuted: Study #1 26 had accrued (as of May 2021) 590 citations, while the two congruent validation studies, published in less prestigious journals, had accrued only 114 citations in total. This example is hardly an exception: a major effort to reproduce important scientific discoveries found that non-reproduced articles were published in higher-impact journals and accrued more citations than reproducible articles. 3 Undoubtedly, the SI score is the most difficult metric to derive.
The challenge is to assign the accurate proportion of any given publication's SI score to individual discoveries, as no discovery is made in isolation (exemplified by CRISPR technology; see the Broad Institute's CRISPR Timeline). In other words, as the pie grows (measured by the growth of subsequent publications and their societal impact), the number of scientists credited with a share of the pie also expands, proportional to the calculated impact of their discoveries. The SI score will likely receive the greatest push-back, as scientists have limited influence over the commercialization of their discoveries, and comparing the societal impact of discoveries in different scientific fields may seem insurmountable. But should we shrink from the enormity of this task and accept that the opinion of a few "experts", who nominate and select scientists for prestigious awards, is the best we can ever do?
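The proportional pie-splitting itself is mechanically simple once contribution weights have been assessed; the hard problem the text describes is assessing those weights. A sketch of the splitting step, with an entirely hypothetical function name and weights:

```python
def allocate_si(total_value: float, contributions: dict[str, float]) -> dict[str, float]:
    """Split one discovery's societal-impact value among the contributing
    publications (or scientists) in proportion to their assessed
    contribution weights. Weights need not sum to 1; they are normalized."""
    norm = sum(contributions.values())
    return {name: total_value * weight / norm
            for name, weight in contributions.items()}

# Hypothetical split of an SI value of 100 between a founding discovery
# and a later validation study, weighted 3:1:
shares = allocate_si(100.0, {"discovery": 3.0, "validation": 1.0})
```

As the pie grows, the same rule applies with more contributors and re-assessed weights, so each scientist's share expands or shrinks with the calculated impact of their discovery.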
Linking the BRN with the reform of the scientific publication system
The overarching goal of the BRN is to align research incentives to promote scientific breakthroughs with meaningful societal impacts. The current emphasis on publications, detached from their scientific rigor, reproducibility, and societal impact, has been detrimental: thousands of scientists publish a paper every five days, 32 most of which are unread and never cited (see 33 and the NIH's iCite database). No human being can keep up with this exponential growth of publications, even in their own research field. Without transparent algorithms that curate this publication excess, work may be unnecessarily duplicative, and findings with potential societal value may be difficult to find in search engines, slowing medical progress.
Additionally, publications must become the means, while transformative advances in basic sciences and medical cures are the ends. Therefore, although the BRN can be developed in parallel with the existing publication and dissemination system, reforming that system offers major advantages. First, the current unacceptably long and pointlessly expensive publication process stifles innovation. The public may be unaware that most scientific articles take years to publish, with research teams reformatting rejected papers to conform to the requirements of different journals and paying thousands of US dollars per publication, even though many journals are no longer printed and publish only electronically. This causes frustration and wastes resources that should be dedicated to new discoveries. Making the BRN a universal, immediate, and free dissemination system for the biomedical sciences (e.g., by merging PubMed with current pre-print services such as bioRxiv and medRxiv) would avoid wasted time, labor, and money.
Open peer review
Second, a single-blinded review system is morally indefensible. It empowers reviewers to determine the fate of a publication without accepting public responsibility (or reward) for the review. This system institutionalizes bias by limiting scientific contributions from women and minorities, and it has shown itself incapable of weeding out irreproducible science. 3 Open peer review is often disparaged on the assumption that few scientists would choose to participate, fearing retaliation for a critical review. If this is true, the scientific culture is broken. Engaging in discourse on the validity and interpretation of scientific discoveries is a precondition to scientific progress. Providing constructive criticism of presented work and identifying ways to improve it, while respecting the dignity of the scientists who produced it, underlies scientific maturity. This is very different from what can occur now: anonymous reviews containing publicly indefensible arguments, whose authors refuse to take personal responsibility for their accuracy.
The validity of science is the responsibility of all scientists. The irreproducibility crisis reflects on all of us, whether we adhere to sound or poor experimental design. We are the ones who form and embrace scientific culture and the only ones who can reform it.

Reviewer impact (RI) score
If we want to achieve sustainable open peer review, 34 we must reward scientists for their willingness to provide critical, creative thinking. Let's return to the example described in the legend of Figure 2 and consider three hypothetical scenarios. In scenario #1, Study #1 is submitted to the BRN and a reviewer alerts the authors to the methodological flaw through open review. The authors acknowledge their mistake and amend or retract the problematic experiment in their paper. Because there were no negative societal consequences, the authors' MR score would go down only slightly for submitting a flawed experiment, and the reviewer would accrue a positive RI score for identifying the paper's shortcoming and triggering a correction.
In scenario #2, the authors submit a flawed experiment but choose to ignore the criticism raised. With the critical review in the public domain, an economical, well-designed (i.e., high-MR-score) independent validation study is performed and fails to reproduce the original finding. Now the authors accrue not only a low MR score but also a low R score and a negative SI score, equivalent to the value of the wasted resources.
Finally, in scenario #3, which in many ways reflects the status quo, all reviewers fear the consequences of posting a negative review of an experiment coming from an influential scientist. The reported findings appear so important that a pharmaceutical company decides to perform a new clinical trial based on the published data. After considerable expenditure, the trial unequivocally demonstrates the irreproducibility of the original publication. The authors accumulate not only low MR and R scores but also a highly negative SI score due to the accumulated costs of wasted resources. Everybody loses. Through transparent, responsible review we can correct mistakes while they are still small, help each other grow as scientists, and maximize the value of our discoveries.
Scientists should freely publish their experiments, and any reader should be able to transparently (and succinctly) critique them, accepting full responsibility for the review. Reviewers with habitually frivolous criticism, unsupported by data or scientific understanding, would themselves accrue low cumulative RI scores, prompting them to change this behavior. By contrast, critiques from scientists with high cumulative RI scores would impel authors to carefully consider the reviewers' arguments and perform additional experiments to ensure the validity of their findings.
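The three scenarios suggest a simple cumulative ledger for RI scores; the event names and point values below are entirely illustrative assumptions (the real weights would have to be set and validated empirically):

```python
# Illustrative point values for how a review plays out over time.
RI_DELTAS = {
    "triggered_correction": +2.0,       # review led the authors to fix a flaw
    "vindicated_by_replication": +3.0,  # later studies confirmed the critique
    "unsupported_criticism": -2.0,      # frivolous critique, not backed by data
}

def update_ri(ledger: dict[str, float], reviewer: str, event: str) -> None:
    """Accumulate a reviewer's cumulative RI score as review outcomes arrive."""
    ledger[reviewer] = ledger.get(reviewer, 0.0) + RI_DELTAS[event]
```

Under such a scheme, the reviewer of scenario #1 accrues positive RI for triggering a correction, while habitually frivolous reviewers drift toward negative cumulative scores.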
The BRN would automatically limit irreproducible science if career rewards depended on high MR, R, and SI metrics. If performing bad science were detrimental to one's career, why would anybody conduct, much less try to publish, flawed experiments? And if scientists read papers anyway (which they must do) and rating them were a simple task that promoted their careers, why would they not review?
Finally, an open review process would level the playing field of scientific education by allowing all trainees and even journalists to learn critical thinking from the greatest minds. Everybody wins.

Dynamic publications
We also wholeheartedly embrace the concept of dynamic publications. 35 Publications should be transparently modified (with the modification history recorded in the BRN) based on new experiments or evolved knowledge. Such modifications may enhance clarity, adjust conclusions, and add, improve, or retract experiments in response to open review. Furthermore, scientists who were not original authors should be able to attach the results of subsequent experiments to an existing publication, rather than publishing independent papers. For example, suppose an investigator publishes an intriguing finding that lacks an independent validation cohort. Another scientist has such a cohort and validates the published findings. Rather than submitting this as a new publication (and thus needlessly duplicating information already contained in the introduction and discussion of the original publication), this independent lab could simply submit the results of the validation experiment and its related methods to the BRN, and link them to the previous paper. If the results subsequently translate into societal value, all contributing scientists will share the benefit in proportion to their contributions. In this way, many scientists can efficiently strengthen or refute existing publications, solving the current problem of (not) publishing negative data 36 and replication studies, which are essential for assessing the generalizability of discoveries. 37,38

Sharing raw data and dynamic re-analyses and re-use of published cohorts
Related to the publication process is the issue of depositing full datasets into the public domain, not only to verify published results but also so that the datasets can be reused to test new hypotheses. The societal benefit of sharing raw data is widely recognized, 39,40 but it is practically left to the discretion of the scientists.
For example, the NIH encourages data sharing, and many journals request that authors include a statement on data availability, such as: "Anonymized data not published within this article will be made available by request from any qualified investigator." Such requests are not always granted, and the declarations in the publication cannot be enforced. 41 The reason for the lack of data sharing, especially from well-designed clinical studies, is that these datasets remain valuable: they allow scientists to ask new questions and publish new papers, the same value society loses if the data are not shared. Because different researchers ask different questions, some essential analyses may never be performed. Acknowledging truth on both sides, the solution is to incentivize data sharing by assigning an SI score to the investigators who generated valuable datasets whenever their reuse leads to new societal benefits. If one team publishes raw data after spending 10 years generating an extremely well-characterized longitudinal cohort, and a second team re-analyzes these data and gains valuable insights, then both teams should share the resulting SI score, proportional to their efforts and intellectual contributions. Everybody wins, especially patients.
Re-envisioning the role of publication industry experts in the research enterprise
Would the BRN destroy the current publication industry? Not necessarily. First, the BRN may be based on a public-private partnership and employ current publication industry experts. Second, the BRN would provide new job opportunities, such as science writers analyzing aggregate knowledge to identify new trends and knowledge gaps, or promoting the commercialization of discoveries by guiding investment funds. These roles would allow the current scientific publication workforce to directly advance science and health.
Re-envisioning scientific teams by mapping individuals' strengths, by fairly and transparently distributing benefits from scientific accomplishments, and by facilitating dynamic collaborations
Successful research requires a diverse workforce with complementary skills, as again highlighted by the Manhattan and Apollo projects. When people do their jobs expertly, they become invaluable.
Current co-authorship criteria on scientific publications favor intellectual over manual or technical contributions. But what prevents us from fairly crediting the work of all team members? The current acknowledgement style is needlessly vague: writing that persons #1-3 contributed to data analysis does not do justice to the reality if person #1 performed 90%, person #2 9%, and person #3 1% of the work. The BRN could measure scientific contributions comprehensively, providing person-specific aggregate data, essential for junior scientists and support staff. This level of transparency and granularity, if linked to reward structures (recruitment, promotion, salary), would inspire the scientific workforce to assume more responsibilities, enhance their skillsets, and increase productivity. This productivity, and the cumulative societal value of institutional discoveries measured by the BRN metrics, could re-align the incentive structures of administrators and empower funders to make more impactful funding decisions.
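The person-specific aggregation could be sketched as follows, assuming contribution fractions are recorded per task and tasks are weighted equally; both are assumptions for illustration, not part of the proposal:

```python
def contribution_report(tasks: dict[str, dict[str, float]]) -> dict[str, float]:
    """tasks maps a task name to each person's share of that task's work.
    Returns each person's overall contribution fraction, with every task
    weighted equally."""
    n_tasks = len(tasks)
    report: dict[str, float] = {}
    for shares in tasks.values():
        total = sum(shares.values())
        for person, amount in shares.items():
            report[person] = report.get(person, 0.0) + amount / total / n_tasks
    return report

# The example from the text: three people "contributed to data analysis",
# but with 90%, 9%, and 1% of the work respectively:
report = contribution_report({"data_analysis": {"p1": 90, "p2": 9, "p3": 1}})
```

Extending the dictionary with more tasks (experiments, writing, statistics) yields the person-specific aggregate data the BRN would expose, ready to be linked to recruitment, promotion, and salary decisions.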

Conclusion
Biomedical data have become unmanageably vast, and much of them do not contribute to new knowledge or better health, which are the stated goals of the NIH and undoubtedly of many, if not most, biomedical researchers. We need advanced computer algorithms to help us make sense of all these data, as well as a new set of incentives to encourage all researchers to produce replicable results. Everything described in this article is technologically feasible. The COVID-19 pandemic demonstrated that medical innovation can be both fast and transformative when leaders are willing to dismantle administrative barriers and foster an environment conducive to creativity and collaboration. Courageous leadership in science administration can propel us to a new era of accelerated biomedical advances that will achieve the mission of the NIH: to improve health for everyone.

Data availability
No data are associated with this article.