The Varying Openness of Digital Open Science Tools

Digital tools that support Open Science practices play a key role in the seamless accumulation, archiving and dissemination of scholarly data, outcomes and conclusions. Despite their integration into Open Science practices, the provenance and design of these digital tools are rarely explicitly scrutinized. This means that influential factors, such as the funding models of the parent organizations, their geographic location, and their dependency on digital infrastructures, are rarely considered. Suggestions from the literature and anecdotal evidence already draw attention to the impact of these factors, and raise the question of whether the Open Science ecosystem can realise the aspiration to become a truly “unlimited digital commons” in its current structure. In an online research approach, we compiled and analysed the geolocation, terms and conditions, and funding models of 242 digital tools increasingly being used by researchers in various disciplines. Our findings indicate that design decisions and restrictions are biased towards researchers in North American and European scholarly communities. To make the future Open Science ecosystem inclusive and operable for researchers in all world regions, including Africa, Latin America, Asia and Oceania, researchers from these regions should be actively included in design decision processes.


Introduction
The evolution of the Open Science ecosystem
The Open Science movement has garnered support from both individual researchers and high-level policy and funding around the world, and has given rise to a range of influential regional high-level and grassroots initiatives in regions including Africa and IberoAmerica, facilitating openness during the research lifecycle. In this paper, the term 'DOST' (digital Open Science tool) includes any digital tool (whether provided by for-profit, non-profit or community-led entities) used in open research, irrespective of whether it was designed explicitly for Open Science or has been co-opted into Open Science practices.
Growing efforts to promote interoperability and open workflows have made interconnection key to the success of any DOST (Wilkinson et al., 2016). The interconnection of tools and the interoperability of their outputs enable users to move between tools at different stages of the research lifecycle to facilitate research, data dissemination and publication. The interconnectedness of the tools, as well as the overlaps in their function within the DOST landscape ("multiplicity"), means that multiple "pathways" exist for data to progress through a research lifecycle (see figure 1). How these "pathways" are selected depends on a variety of issues such as user preference, access to specific DOSTs, demands of the research project and preferences of the research community.
[Figure 1]
New tools are continually being added to the DOST landscape, and new connections between tools are regularly emerging to populate this ecosystem. We use the term "ecosystem" in contrast to the more common "landscape" to designate the dynamism of the online environment as an interconnected system through which resources move.
We ground this usage in the biological concept of an ecosystem: a community of interacting organisms and non-living components that function together as a system. This DOST ecosystem is dynamic, multiplicitous and subject to internal and external pressures. It includes interconnected and interdependent DOSTs, as well as the information and communication infrastructures, communities of users and socio-political stakeholders. Internal and external pressures from these actors determine the persistence of the DOSTs and the structure of the ecosystem.
The underlying dynamics and influences within the evolving DOST ecosystem have been extremely influential in driving forward the Open Science movement as a whole.
Tools within the ecosystem, such as GitHub, are changing the way collaborations are managed. Publishers like PLoS and F1000 are redefining transparent publishing models, and repositories such as Zenodo, OSF and dSpace are offering open platforms for sharing and re-using data.
The constantly growing uptake of DOSTs, and their increasing interconnectivity and interoperability, may give the impression that the digital landscape of Open Science is unfolding positively and developing to support the growing needs of the Open Science community. Widespread endorsement of many DOSTs, support from socio-political actors and the rapid organization of "user communities" associated with specific tools have left little time for critical reflection on how the ecosystem is evolving and what power dynamics are shaping its evolution. How, it is increasingly being asked, do the tools developed by multiple scholarly for-profit service providers, non-profit organizations and open source communities contribute to the Open Science vision and mission to make research workflows and results accessible to all sectors of society across the globe?
In this paper we critically interrogate the DOST ecosystem. We ask how its current structure enables knowledge availability and question whether social, political or economic barriers linked to DOST design and deployment undermine this objective. To do so, we ask three main questions of the ecosystem and its actors. The subsections below provide a short background to these three questions, and frame the empirical data presented in the following sections.

Geographic distribution of DOSTs and user communities
The Open Science movement supports the democratization of research resources. Increasing openness in research will make resources available to all individuals in all nations and at all levels of society. In this way, Open Science promotes equitable access to resources through the (self-described) model of the "knowledge commons" (Hess and Ostrom, 2007), which promotes a form of direct democracy where every individual has the right, and the ability, to access information, data, and content that is collectively owned and managed by a community of users. The specific geographic location of many DOSTs, as reflected by the location of their development, registration and hosting, contrasts with the approach championed by the Free and Open Source Software (FOSS) movement. FOSS has long been calling for and implementing a more representative form of democracy for software development and distribution by promoting models that avoid specific geographically clustered nodes (Vermeir et al., 2018; Tennant et al., 2020). This model of "software mirrors" (see note 14) is commonly used in systems such as GNU as well as Linux distributions like Debian and Fedora. Nonetheless, and likely due in part to economies of scale, this approach of mirroring services to increase access has not been replicated within the DOST ecosystem.

Heterogeneities in purpose and design of DOSTs
The Open Science movement promotes widely agreed values that also define good scientific practice. These include openness, credibility, reproducibility, and verifiability of any research output (Bartling and Friesike, 2014). Nonetheless, the endorsement of these core values can cause the widespread value/practice-heterogeneity within the Open Science movement to be overlooked. Indeed, Open Science can be thought of both as a practice and as a philosophy (Levin et al., 2016), implying that the motivations for individuals to get involved can vary considerably (Fecher and Friesike, 2014).
14. A software mirror is a server that provides an exact copy of data from another server. These mirrors can be held in different geographic locations and are intended to provide fault tolerance, or a means of redundancy in case something goes wrong with the primary or "principal" server.
Fecher and Friesike (2014: 17) mention five different Open Science schools of thought: the infrastructure school (which is concerned with the technological architecture), the public school (which is concerned with the accessibility of knowledge creation), the measurement school (which is concerned with alternative impact measurement), the democratic school (which is concerned with access to knowledge) and the pragmatic school (which is concerned with collaborative research). It cannot therefore be assumed that everyone is motivated to a similar degree by the core values. A number of pragmatic reasons also play important roles in the uptake of Open Science practices and tools, including efficiency, career advancement, journal and institutional requirements and community expectations (Ferguson, 2014).

External power dynamics
Research occurs within highly complex networks of power and influence of financial, governmental and societal actors (Vermeir et al., 2018).
- Tools may be uncritically integrated into the ecosystem, causing existing power dynamics to be perpetuated and leading to the marginalization of certain user groups
- Governments and commercial companies have undue influence on the landscape due to their hosting, financing, and otherwise influential roles
- The existing DOST ecosystem may become prescriptive of a specific way of "doing", as one tool becomes hyper-dominant
Table 1 adapts the concept of "data assemblages" developed in Critical Data Studies for use in outlining the DOST ecosystem. Data assemblage refers to the technological, political, social and economic apparatuses and elements that constitute and frame the generation, circulation and deployment of data (Kitchin and Lauriault, 2014). Just as data assemblages map the complex set of stakeholders and pressures that influence the production, dissemination and reuse of data, Table 1 highlights some of the key pressures on the DOST ecosystem and their potential impact on the evolution of these spaces.
[Table 1]
To our knowledge, there have not yet been attempts to provide an overview of how the DOST ecosystem shifts and adapts to these pressures. Indeed, the heterogeneity not only of the ecosystem, but also of the actors and pressures that influence it, makes this a challenging task. This paper presents a methodological attempt to map a selection of the DOST ecosystem, including links between the tools. Our intention is to generate an interactive map of the DOST ecosystem so as to be able to test pressure and tipping points that shape ecosystem make-up and functioning.

Methodology
DOSTs were identified from a range of different sources, including previous studies on DOSTs, extended web searches and tools foregrounded in key Open Science communities such as the Research Data Alliance and the Open Science MOOC, and were compiled into a database. As mentioned in the introduction, we used a very broad definition of DOSTs and included commercial, non-profit and community-driven digital tools that are currently used in open research. We did not make the availability of source code a prerequisite for inclusion. Neither did we limit the tools to those provided free to users.
This database was ordered according to the criteria outlined in the section below. We developed the categories based on our analysis of ecosystem pressures presented in table 1. The information used to populate these categories was freely available on the respective websites, each of which was examined by both authors. Database entries were cross-checked by the authors in duplicate. Discrepancies were discussed until consensus was reached. Using the network mapping criteria above, a database was developed for analysis.
The paper is based on the 3 September 2020 version of this database, which can be accessed at https://zenodo.org/record/4013812#.X1D-FnlKjIU. It is anticipated that the database will continue to evolve with community input. Contributions to the evolving database are encouraged through communication with the authors.

Sorting criteria for DOST database
In this paper we examine the current dataset, which includes 242 DOSTs, focusing on information about language, terms and conditions (T&Cs), host institution, and sponsoring or funding institutions. The columns in the dataset display the sorting criteria that were applied. The dataset is intended to be continually discussed and updated, both by ourselves and by other practitioners in the field. In this way, the DOST dataset may become a reference resource for the Open Science community for digital tools development and optimization.

Visualization of dataset
An interactive visual map of the 242 DOSTs was generated in Kumu. The interactive plot can be viewed here: https://kumu.io/a2p/dost. The Kumu software allows users to sort the data according to any of the sorting criteria discussed above. Figure 2 below illustrates the distribution of DOSTs according to research workflow steps. As can be seen from this figure, DOSTs actively contribute to all stages of research, but are particularly concentrated around analysis of data and publication.

Geographic distribution of tools and host organizations
The majority of DOSTs included in the database were explicitly connected to specific countries and regions. The geographic location of the DOST was available on the web pages through contact details, named host institutions or details of registration in the terms and conditions (T&Cs). Eighteen (18) of the listed tools did not give a specific geographic location on their websites.
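A tally of this kind can be reproduced from any export of the database. The following is a minimal sketch only: the tool names and rows are hypothetical placeholders rather than entries from the actual dataset, and the "Not stated" label for tools without a declared location is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical rows (tool name, country of registration or hosting).
# None marks a tool whose website gives no specific geographic location.
tools = [
    ("Tool A", "United States"),
    ("Tool B", "United States"),
    ("Tool C", "Germany"),
    ("Tool D", None),
]


def tally_by_country(rows):
    """Count tools per country, grouping missing locations under 'Not stated'."""
    return Counter(country if country else "Not stated" for _, country in rows)


counts = tally_by_country(tools)

# Report each country with its absolute count and its share of all tools.
for country, n in counts.most_common():
    print(f"{country}: {n} ({n / len(tools):.1%})")
```

Keeping tools without a stated location in their own category, rather than dropping them, keeps the shares comparable with the full set of tools.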
As can be seen from figure 3 below, a high proportion of DOSTs available to the international research community are registered in (or linked to) the United States. It is therefore likely that the design and deployment of many of these tools were influenced by that setting.
The complexity of the underlying funding mechanisms has significant implications for the DOST ecosystem. In particular, it complicates efforts to make the DOST ecosystem transparent with regard to funding sources and legislative influence. It also affects the financial viability and longevity of the DOSTs within the ecosystem. Indeed, the reliance of many DOSTs on crowdsourcing and time-limited grants means that many will struggle to achieve financial independence and sustainability.

Highly influential actors
We have documented selected interlinkages between the DOSTs in the database, which can be viewed at https://zenodo.org/record/4013812#.X1D-FnlKjIU. From the analysis of the database it became apparent that certain entities are highly interlinked within the DOST ecosystem, such as GitHub, Center for Open Science and Digital Science. Figure 5 below details eight highly influential organizations within the DOST landscape, demonstrating how these organizations and institutions are linked to DOSTs operating throughout the research workflow.

[Figure 5]
As shown in figure 5, 80.9% of the DOSTs in the database are linked to one or more of these eight entities. These interlinkages were diverse and included direct sponsorship, hosting of the DOST, or the hosting of DOST resources. These interlinkages can also be visualized in the Kumu plot at https://kumu.io/a2p/dost.
More research on the extent of geoblocking is urgently required to clarify these issues.
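The share of tools linked to a small set of influential entities is a simple proportion over the interlinkage data. The sketch below illustrates that calculation under stated assumptions: the mapping of tools to linked entities is invented for illustration, and only three of the eight entities (those named in the text) are listed, so the printed value does not reproduce the reported figure.

```python
# Hypothetical mapping from each tool to the entities it is linked to
# (through sponsorship, hosting of the tool, or hosting of its resources).
links = {
    "Tool A": {"GitHub"},
    "Tool B": {"Center for Open Science", "GitHub"},
    "Tool C": set(),
    "Tool D": {"Some University"},
}

# Illustrative subset of the highly interlinked entities named in the text.
hubs = {"GitHub", "Center for Open Science", "Digital Science"}


def share_linked(tool_links, hub_entities):
    """Return the fraction of tools linked to at least one hub entity."""
    if not tool_links:
        return 0.0
    linked = sum(1 for entities in tool_links.values() if entities & hub_entities)
    return linked / len(tool_links)


share = share_linked(links, hubs)
print(f"{share:.1%} of tools are linked to at least one hub entity")
```

A tool counts once however many hub entities it is linked to, which matches the "one or more" wording used above.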

Variations in Terms & Conditions

Persistence and preferences
The current structure of the DOST ecosystem means that the persistence of individual tools depends on attracting a community of users and securing stable funding. This might suggest that these features support a meritocracy, whereby the "best" DOSTs persist by common consent and investment. Such a position, however, overlooks key issues such as diversity within user communities and accessibility of funding.
Overlooking such issues can undermine the Open Science values described above, particularly the aspiration that the Open Science ecosystem be globally accessible and useful.
As illustrated in figure 3, many of the presented DOSTs are hosted in the United States. It is therefore likely that these tools have been piloted and beta-tested within the immediate research communities, and that many of the design decisions integrated into them reflect those communities. Researchers elsewhere may then adopt DOSTs without having the opportunity to feed back into design decisions (Arthur, 1989; Leonelli, 2017).
Situations of "locking" research communities into certain DOSTs and digital workflows can cause the ecosystem to unintentionally perpetuate marginalization. The design and persistence of the DOSTs influence not only the "pathways" that the research follows through the ecosystem, but also the research methods, data collection and curation methods and analysis tools used. The selection of certain tools over others can thus have far-reaching implications. The decisions incorporated into a tool's design reflect a specific geographic context and value system, and can influence research practices across the globe.
Such concerns relate to the "Juan Valdez problem" discussed by Busch and Juska (1997).

Power brokers in the DOST ecosystem
From figure 5 above it is evident that the DOST ecosystem is dominated not only by certain countries, but also by certain companies, organizations and institutions. Such clustering, in light of funding, access to target audiences, permissive legislation and business cultures, is not particularly surprising. Indeed, it may be said to follow other models of technical expansion throughout history. Accepting this expansion as entirely normal from the user perspective, however, does not make it unproblematic.
The DOST ecosystem and the DOSTs themselves are intended to be distributed and multiplicitous to allow the maximal flexibility of research practices. Allowing a small number of entities to dominate the ecosystem and its evolution thus presents challenges to these aims. In particular, two key concerns arise: first, the dominance of certain entities causes centralization and interdependence on individual actors. Second, the dominance of certain entities allows specific approaches to Open Science, and related values, practices and preferences to be prioritized. This can affect the heterogeneity of the Open Science movement and foster a perception that there is consensus on how Open Science "should be done" (Fecher and Friesike, 2014).
In recognizing the former, the DOST ecosystem must confront a paradox. While interconnectedness is vital for fostering open, global research and removing national, disciplinary, and linguistic silos, the same tools that facilitate this connectedness can lead to a centralism that drives out regional and local expertise and diversity. In particular, having tools such as GitHub dominate various stages of the research lifecycle not only enhances interoperability, but also increases centralization and dependence, thereby diminishing accessibility for some.
The latter concern relates to an often-overlooked aspect of technology: the intentions, experiences, priorities and cultures of IT professionals influence the design and deployment of the technology (Winner, 1986; Jasanoff and Kim, 2009, p. 126). DOSTs thus foreground certain views of openness through their positioning in the DOST ecosystem.
GitHub, for example, is a commercial company based in the US. The design of GitHub and its operating practices thus align with a specific set of values. As a result of its dominance within the DOST ecosystem, its positions on key issues such as inclusion, sharing and transparency are increasingly becoming the "norm" for many users, despite its political constraints and accessibility restrictions for many researchers. Recognizing such issues highlights the need for closer scrutiny of the value structures of the tools within the DOST ecosystem. Asking questions such as why tools were created, how users were recruited and why they favour one tool over another will shed light on these issues. In particular, it will highlight the limitations of allowing certain countries, tools and organizations to dominate the DOST ecosystem.

Access and underlying infrastructures
The decisions influencing the design of DOSTs do not only reflect user community preferences and perspectives of Open Science, but also assumptions about the availability of infrastructures and resources. These include a wide range of different issues, including access to funding and the ability to make online payments, linguistic competence, access to software and hardware, as well as infrastructure availability relating to internet connectivity and bandwidth.
For many DOSTs developed in Europe or the US there is an emphasis on the tools being cloud-based. On the one hand, such an emphasis makes sense: there is nothing to install, the latest version of the software can be delivered via the browser, and the content can be accessed from any device anywhere in the world, as long as it is connected to the internet. On the other hand, some institutions, especially in the European Union, prefer the tools to operate on their own servers to keep them confidential from potential competitors for patenting and to ensure data and content ownership through territorial storage. For research communities in low- and middle-income countries (LMICs), these same design decisions form a usage barrier, because low bandwidth and intermittent internet connections make an over-reliance on "online only" tools problematic (Bezuidenhout et al., 2016).
While multiplicity in the DOST landscape can allow marginal researchers to plot alternative pathways through the OS ecosystem, this can mean that they must resort to using less popular tools. As a result, there is a chance that these researchers continue to be excluded from the user communities that are driving research forward. This has obvious implications for collaborations, visibility/engagement with researcher communities and perceptions of worth.
Designing DOSTs for infrastructure present in the dominant geographical regions (such as the US) legitimizes a specific expectation of service access and provision. In this way, the DOST ecosystem fails to address the recognized imbalance between central and marginalized countries and research communities. Indeed, the cost of internet access and connectivity (institutional as well as private) varies drastically across world regions and tends to be extraordinarily high in LMICs. By embedding assumptions such as web interfaces or constant connectivity, Open Science continues to perpetuate a limited perspective on "inclusion" that often falls short of being inclusive.
Ensuring more inclusive design structures and processes will require ethnically and regionally diverse teams of DOST designers to ensure that infrastructural challenges are considered and responses incorporated into design decisions.

Sanctions and political clout
As demonstrated in the results section, the DOST ecosystem has to contend with a range of power dynamics external to research infrastructure. Perhaps the most pernicious of these is the role that financial legislation plays in dictating access to open resources (Bezuidenhout et al., 2019). This is perhaps best demonstrated by the impact that US financial sanctions have on access to DOSTs. Such sanctions create situations of marginalization and lack of access for certain communities of end-users. Even more concerning, however, is that one country's political preferences are able to dictate the evolution of aspects of the DOST ecosystem. While it is important to note that the introduction of these political values is likely done unintentionally or via funding-related necessity, the impact is nonetheless severe. Acknowledging that certain aspects of the DOST ecosystem are unavailable to certain communities of users is vital for further critical reflection on the evolution of Open Science. In particular, what does this mean for the core values of the Open Science movement and the notion of a "digital commons" (Hess and Ostrom, 2007; Bezuidenhout, 2020)?

A critical appraisal of the DOST ecosystem
The results and discussion presented in this paper draw attention to problems within the current DOST ecosystem. Without detracting from the importance of the emergence of more and more discipline- and region-specific DOSTs, and the work of dedicated individuals who create them, words of caution are appropriate. The results of this paper demonstrate the heterogeneity of the actors, power dynamics and stakeholders that are currently driving and dominating the evolution of the DOST ecosystem. Even if all DOSTs were created by well-meaning individuals who wish to promote Open Science, one cannot simply assume that the resultant ecosystem will automatically reflect and perpetuate the core values of Open Science. Instead, a range of different factors inherent within DOST design create a landscape that continues to perpetuate marginalization and exclusion.
This marginalization is multifaceted. Not only are marginal research communities excluded from design decisions of DOSTs, they are likely also sidelined in the user communities that develop around them. Moreover, DOST (un)availability/accessibility does more than exclude researchers from sharing communities, it also dictates research practices and digital workflows. In this way, the design of the DOST ecosystem can affect both present and future research. While the DOST ecosystem is dynamic and multiplicitous, the dominance of a few entities is rapidly driving forward a "status quo" of how research should be done. Once such practices reach a "carrying capacity" within the global research community, they are unlikely to be easily adapted.
This can mean that the current design of the DOST ecosystem marginalizes future, as well as present, researchers.
Many of the issues mentioned and concerns raised in this paper will not come as a surprise to Open Science practitioners. Nor will it be surprising to add that the current model of persistent barriers continues to place certain members of the Open Science community in uncomfortable and sometimes unethical positions.
The section above highlighted how inequalities, marginalization and injustices are perpetuated by the current structure of the DOST ecosystem. The design of DOSTs, the ways in which they are interlinked, and the dependencies on and dominance of certain entities raise the question of whether the DOST ecosystem can realise the aspiration of becoming a truly "unlimited digital commons" in its current structure. From the data presented above, it would seem that things need to change.
Nonetheless, the DOST ecosystem is a complicated landscape, and imposing a specific value set or "way of doing things" will harm the richness and diversity of this rapidly evolving field. Rather than imposing restrictions on what should constitute a DOST, we suggest that designers and users be supported to critically reflect on the values that they are introducing into the ecosystem. There are many models currently in use for balancing well-intentioned innovation with pragmatic requirements, and these need to be more strongly developed for DOSTs.
One such model, Responsible Research and Innovation (RRI), has made considerable contributions to discourse around socially responsible innovation. Opening up access to data and support of Open Science are fundamental components of the RRI model (Stilgoe, Owen and Macnaghten, 2013). A list of criteria that any open repository needs to demonstrate has also been developed. Such community standards have been highly influential, are being widely adopted by research communities, and provide for cross-regional and cross-disciplinary agreement and functionality. While conversations about Open Science tool standards have existed for more than a decade, the broader community needs to be engaged for such standards to become a reality.
The design of the DOST ecosystem not only determines how research is conducted today, but also determines the directions and practices of future research. Allowing certain actors, pathways or regions to become too entrenched will allow inequality and marginalization to persist and become a future norm. Research practices are changing rapidly (e.g. AI, big data), international politics are in flux (e.g. Brexit, the COVID-19 pandemic) and historically marginalized research communities (e.g. citizen scientists and LMIC researchers) are increasingly vocal and influential (Aspesi and Brand, 2020). Now is the right time to critically assess what has already been built, and what the united global research community wants to take forward into the future.

Concluding Comments
Much of the Open Science (OS) ecosystem has been developed by volunteers, who donate time and expertise to developing DOSTs, infrastructures and interoperable practices. This community has the history, expertise and perspectives to take up the challenges raised in this paper. How, they need to ask, can they guide and adapt the ecosystem that is rapidly changing research? This requires a reframing of Open Science responsibilities, from contributing labour and data to discussing the complex power dynamics underpinning the evolving ecosystem. Only then will the 2019 UNESCO theme of "Open Science: leaving no one behind" become a reality.
The OS landscape is expanding globally, including in historically underrepresented regions such as Latin America, Africa and Asia. We therefore suggest tying the digital development and regional adaptation of DOSTs to the Open Science Manifesto,