Imagining tomorrow's university in an era of open science

As part of a recent workshop entitled "Imagining Tomorrow's University", we were asked to visualize the future of universities as research becomes increasingly data- and computation-driven, and to identify a set of principles characterizing pertinent opportunities and obstacles presented by this shift. In order to establish a holistic view, we take a multilevel approach and examine the impact of open science on individual scholars as well as on the university as a whole. At the university level, open science presents a double-edged sword: when well executed, open science can accelerate the rate of scientific inquiry across the institution and beyond; however, haphazard or half-hearted efforts are likely to squander valuable resources, diminish university productivity and prestige, and potentially do more harm than good. We present our perspective on the role of open science at the university.


Introduction
The mission of universities, specifically land-grant institutions originating from the Morrill Act of 1862, is to provide accessible education and scholarship to all people. In a similar vein, open science has emerged as an approach to minimize the barriers associated with traditional ways of sharing the outcomes of scholarship. As defined by the Open Definition (https://okfn.org/), open science embodies the notion that information is available for anyone to "freely access, use, modify, and share for any purpose", regardless of personal or institutional resources. Fostered by increasingly data- and computation-driven research, universities are uniquely positioned to reimagine their role in knowledge dissemination vis-à-vis the principles of open science. As part of a recent workshop entitled "Imagining Tomorrow's University", we were asked to visualize the future of universities in an open, networked era and to identify a set of principles characterizing pertinent opportunities and obstacles presented by this shift. In order to establish a holistic view, we take a multilevel approach and examine the impact of open science on individual scholars as well as on the university as a whole. Generally, we agree that increased transparency in the scientific process can broaden and deepen scientific inquiry, understanding, and impact. However, the realization of these outcomes will require significant time, effort, and aptitude to successfully convey the means by which data are transformed into knowledge. We propose that open science can most effectively enable this evolution when it is conceptualized as a multifaceted pathway that includes:
• The provision of accessible and well-described data, along with information about its context [1];
• The methodology and mechanisms necessary to reproduce data analyses;
• Training products that provide transparent understanding of how the data can be applied to answer questions.
Thus, impactful open science requires investments from individual researchers that are often greater than those that might be needed for "non-open" science. At the university level, open science represents a double-edged sword: when well executed, it can accelerate the rate of scientific inquiry across the institution and beyond; however, haphazard or half-hearted efforts are likely to squander valuable resources and diminish university productivity and prestige, potentially doing more harm than good. Here, we present our perspective on the varying roles of open science.

Open science enables low-barrier collaborations
For some university researchers, open science can be both powerful and transformative [2]. Imagine a research program that not only generates publications but also develops code that can quickly reproduce each analysis and publishable figure with a minimal amount of manual intervention. This structure can provide continuity in a project and accelerate the research enterprise by allowing researchers to rapidly repeat the same analysis on new datasets, all while lowering training and other human capital investments. Included in a publication, this "research notebook" and accompanying datasets (e.g., [3]) could be compiled into a tutorial for others in the field, who could then repeat this work with their own data, all without the need for formal collaborations. Such approaches can benefit not only the initiating research group but also an entire scientific discipline.

turn increasingly to private sector funding, which comes with proprietary limitations on the dissemination of results.

The potential for broader impacts with open science
It is possible that the increasing availability and transparency of scientific inquiry could ignite broader interest in research. The current publishing paradigm of most fields limits research availability to a relatively narrow audience, with paid access to scientific journals. Meanwhile, polling data from Gallup indicates a slow but relatively steady decline in Americans' trust of institutions in general since 2000 [8], although Gallup does not include "universities" specifically in the poll. In one study that compared follow-on inventions from discoveries that were made simultaneously but separately at a university and at a corporate firm, the same discovery at a university was 20-30% less likely to be used in follow-up innovations [9]. This study also included open-ended interviews to shed light on this "Ivory Tower effect", and a driver appeared to be "considerable skepticism toward academic science." More openness in university science research may help to address this apparent skepticism.
Even though there are concerns associated with society's growing disconnect with the scientific enterprise and the accompanying devaluation of research, it should be noted that in general academics are still held in high regard and seen as reliable sources of information for a wide range of issues [10,11]. To maintain this esteem, it is important to realize that data without an understanding of what it entails or the questions it can answer can be considered useless and even dangerous when used improperly to influence decision-making and policy [12]. Thus, providing useful open data requires more thought on how this data can be translated into useful information. Mechanisms to reproduce analyses, together with communications that explain the complexities and intricacies of these tasks, could be an important first step. While the peer-reviewed-publication paradigm currently provides an established, if not optimal, communication mechanism for conveying the results of scientific activities to our peers, no such standard currently exists to govern the creation and exchange of open science products with our peers and beyond. Efforts at the university level that encourage the rigorous construction of appropriate dissemination systems are laying the foundation for success in this endeavor.

A path forward: recognition, training and infrastructure
Recognize open science impacts. Universities have a moral responsibility to educate, and there are significant opportunities in the open science model to broaden the output of research with an eye towards education. Nevertheless, the current university promotion and tenure system is optimized for evaluating the traditional format of peer-reviewed journals as the only necessary and sufficient product of a research project. Given the "publish or perish" paradigm that currently pervades the academy, an accompanying lack of recognition for the time and effort put into facilitating open science is apt to dampen participation [12]. For example, utilizing openly available code for an analysis in a subsequent publication does not require a citation, and even if the code were to be highly cited, it does not carry the same weight as a peer-reviewed publication. Thus, universities have an opportunity to reimagine what it means to contribute to research, specifically extending the definition to include more than a tally of peer-reviewed publications. The development of robust, reliable, and transparent tools to track utilization of open science products may be one path forward to quantitatively measure the impact of faculty-generated research outputs not currently tracked or rewarded, and to both incentivize and acknowledge the resources required to effectively engage in open science.
Train best practices and provide infrastructure to broaden participation. A notable effort to define the characteristics of open science products is the FAIR Data Principles [13], which emphasize that scholarly products should be findable, accessible, interoperable, and reusable and that good data management is not a goal in itself but can catalyze knowledge discovery and innovation. At the university, training for sustainable data management best practices would deepen the overall understanding of the opportunities inherent in open science. In many respects, the products of open science are available to benefit by all, and they require support infrastructure to share data, tools, and training in order to broaden participation and limit exploitation. This infrastructure could also be re-imagined to include metrics to quantify impact, supporting the need to acknowledge contributions.
In conclusion, open science is a significant opportunity for universities, but a one-size-fits-all approach is sub-optimal. Executing open science in a way that facilitates meaningful advances requires a personal investment of time, both upfront, to develop relevant capabilities, and ongoing, to cover the costs of execution. As such, it is important that universities develop infrastructure and training to support, measure, and reward efforts that deliver on the promise of open science, focusing on domains best positioned to further scientific understanding.

Author contributions
AH, MH, AK, and RR contributed equally in the preparation of this manuscript and have agreed to the final content.

Competing interests
No competing interests were disclosed.

Grant information
The author(s) declared that no grants were involved in supporting this work.

The paper seems to settle in on open analysis code as the central argument later on, but in the opening sections the argument seems overly broad for the examples and support that are given. The only example of benefits that is given is of the "open notebook" (a good and valuable example) but it is not sufficient to support broad claims about open science as a double-edged sword (where neither the broad benefits nor the potential downfalls are explained in detail or supported with evidence from the literature). I think focusing the argument and placing their perspective more clearly within the broader field of open science early on would create a more cohesive argument and one that can be better supported with the experiences and examples the authors provide.
Later on, citations 7 and 8 are used to support discussion of take up of academic discoveries. These are both citations of the same study though, with 7 being the study itself and 8 being a popular media report on the study. Using both seems to suggest that there are two independent sources to support this claim. There is also a published version of the study that might be a preferred citation. And I think it might be helpful to note that the study does not find mistrust to be the main reason for lack of uptake but says it is secondary to the natural competitiveness of industrial science, where they are constantly monitoring competitors and therefore likely to notice discoveries that competitors make.

Open Peer Review
"To maintain this esteem, it is important to realize that data without an understanding of what it entails or the questions it can answer can be considered useless and even dangerous when used improperly to influence decision making and policy" [11]. This is a strong claim, and it could represent important reservations about open science, but no support is provided. The citation does not seem to be related at all (Title: "Electrically conductive bacterial nanowires produced by Shewanella oneidensis strain MR-1 and other microorganisms.") Are the authors referring to some controversy surrounding the inappropriate use of that data for decision making and policy? If so, this needs to be explained and supported explicitly. Otherwise, other citations should be found to support this claim.
Referring to Hardin's "Tragedy of the commons" [Citation 14] also does not seem like a closely related source for the use of the term "common good" as the authors have used it. Some clarification would be helpful there as well. Neilsen's book similarly talks about "knowledge commons" specifically in relation to open science in a way that might be more relevant to the arguments made here.

Are the conclusions drawn balanced and justified on the basis of the presented arguments?
In the end, the authors make an important and valid argument about university supports and infrastructure, but the points leading up to that conclusion could be more clearly explained and better supported and connected. From the section "A path forward" and onward, the examples lead nicely towards the conclusion. Given the authors' expertise I feel that they could be expanded a bit to explore and support the conclusion, and the earlier paragraphs could focus more specifically on the issues of open analysis code to build towards that conclusion. For example, in discussing the Ivory Tower effect earlier on, the original study that is cited explores the publish or perish system as one of the reasons for distrust. The argument made about publish or perish later on could be more meaningful if that connection had been made in the previous section.
Overall, the authors have an important contribution to make to discussions of open science and important expertise in the practice of open science. With some clarification to the supports and the argument, this paper will be a valuable and interesting piece of that conversation.

Are arguments sufficiently supported by evidence from the published literature? Partly
Are the conclusions drawn balanced and justified on the basis of the presented arguments? Partly

Competing Interests: I contributed to a discussion paper for the same workshop (Imagining Tomorrow's University) but the papers were not in competition with one another, evaluated relative to one another, or written in collaboration with each other in any way.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author response
Later on, citations 7 and 8 are used to support discussion of take up of academic discoveries. These are both citations of the same study though, with 7 being the study itself and 8 being a popular media report on the study. Using both seems to suggest that there are two independent sources to support this claim. There is also a published version of the study that might be a preferred citation. And I think it might be helpful to note that the study does not find mistrust to be the main reason for lack of uptake but says it is secondary to the natural competitiveness of industrial science, where they are constantly monitoring competitors and therefore likely to notice discoveries that competitors make.
Thank you for these comments. We agree that using two citations here is misleading; we have removed citation #8 and replaced the original preprint with the suggested, more recent published study (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2333413). The cited study reports that "the results suggest that the peer-based knowledge validation process in academia creates uncertainty about the reliability and relevance of academic science as a map for technology development." [Bikard, 2015]. We also softened the stated reason that inventors draw on knowledge from firms rather than academics, so that skepticism toward academic science is now described as a driver rather than a key driver.
4. "To maintain this esteem, it is important to realize that data without an understanding of what it entails or the questions it can answer can be considered useless and even dangerous when used improperly to influence decision making and policy" [11]. This is a strong claim, and it could represent important reservations about open science, but no support is provided. The citation does not seem to be related at all (Title: "Electrically conductive bacterial nanowires produced by Shewanella oneidensis strain MR-1 and other microorganisms.") Are the authors referring to some controversy surrounding the inappropriate use of that data for decision making and policy? If so, this needs to be explained and supported explicitly. Otherwise, other citations should be found to support this claim.
Thank you for this observation; we've corrected the citation.
5. Referring to Hardin's "Tragedy of the commons" [Citation 14] also does not seem like a closely related source for the use of the term "common good" as the authors have used it. Some clarification would be helpful there as well. Neilsen's book similarly talks about "knowledge commons" specifically in relation to open science in a way that might be more relevant to the arguments made here.
To clarify this phrase, we modified "common good" to explicitly state "available to benefit by all".
6. In the end, the authors make an important and valid argument about university supports and infrastructure, but the points leading up to that conclusion could be more clearly explained and better supported and connected. From the section "A path forward" and onward, the examples lead nicely towards the conclusion. Given the authors' expertise I feel that they could be expanded a bit to explore and support the conclusion, and the earlier paragraphs could focus more specifically on the issues of open analysis code to build towards that conclusion. For example, in discussing the Ivory Tower effect earlier on, the original study that is cited explores the publish or perish system as one of the reasons for distrust. The argument made about publish or perish later on could be more meaningful if that connection had been made in the previous section.
Thank you for your suggestions. With these suggestions in mind, we have edited the text for a more natural flow, using additional examples and citations as well as specific headers.

Great suggestion. We agree that this was missing and we now provide a context for universities and their role in open science as an introduction.
2. While this is a well-written article in general, there are a few phrases used that should be reconsidered. For example, the use of the phrase "the whole enchilada" should be re-written in a more professional phrasing.

Modified.
3. I would argue that the most significant "costs" or "barriers" to open science are the financial costs (of which personnel labor/time and electronic/computer storage space issues are perhaps the biggest components). It is a missed opportunity to not mention the financial costs of open science. As part of that discussion, it would be interesting for the authors to discuss who bears the financial costs and whether there is room for improvement. For example, could funders/sponsors do more to support the costs of open science? And, are there opportunities for new policies (at the level of funders/sponsors) to be developed that could further support the wider implementation of open science?
Thank you for pointing out this opportunity in this piece. We have added a discussion to bring up the topic of financial costs associated with open science and the complexities of determining incentives at the university level. While there is opportunity for funders/sponsors to help bear these financial burdens, the scope of this effort is what a university can do, and we have limited our discussion to this topic.

I would replace the words "the whole enchilada" under the headline "Open science requires significant Investment". I would also replace "so on" in the same section.
Instead of the headline "Broader impact of open science is uncertain" may I suggest something along the lines of: "Open and broad communication could impact open science".
In the section "Open Science requires significant investment", it is suggested that a key cost of open science is time, which is reasonable. However, one of the points is that "time to produce scripts with a sufficient level of robustness and documentation to be useful to others"; this point is less reasonable. I believe that, with or without open science, this should be an integral requirement of all scientists and so this last point should be omitted.
The first three points you propose for enabling open science more effectively are valid; however, you don't circle back to these in your final discussion. May I suggest using instead the points that you end with, such as:
1. Finding a way to properly cite open codes or data available through open sources
2. Developing a reliable, robust tool to track utilization of open science (note: similar idea to 1, but goes one step further)
3. Universities need to support infrastructure to implement FAIR data principles
This way you end with developing the points that should enable open science.
All these are suggestions. Should the authors decide to ignore them, it will not dampen my enthusiasm for this well-written manuscript.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes
Are all factual statements correct and adequately supported by citations? Yes