Current status of global research on novel coronavirus disease

Background: Novel coronavirus disease (COVID-19) is a major global health concern due to its pathogenicity and widespread distribution around the world. Despite growing interest, little is known about the current state of research on COVID-19. This bibliometric study evaluated the contemporary scientific literature to assess the evolution of knowledge on COVID-19, identify the leading research stakeholders, and analyze the conceptual areas of knowledge development in this domain.
Methods: Bibliometric data on COVID-19 related studies published until April 1, 2020, were retrieved from the Web of Science core collection. A quantitative evaluation and visualizations of knowledge areas in COVID-19 research were then created by statistical and text-mining approaches using bibliometric tools and R software.
Results: A total of 422 citations were retained in this study, including journal articles, reviews, letters, and other publications. The mean number of authors and citations per document was 3.91 and 2.47, respectively. The top ten articles, authors, and journals were identified based on the frequencies of citations and publications. Networks of contributing authors, institutions, and countries were visualized in maps, which highlight discrete developments in research collaborations. Major areas identified through evaluating keywords and text data included genetic, epidemiological, zoonotic, and other biological topics associated with COVID-19.
Conclusions: The current status of COVID-19 research shows early development in different areas of knowledge. More research should be conducted in less-explored areas, including socioeconomic determinants and impacts of COVID-19. Also, global research collaboration should be encouraged to strengthen evidence-based decision-making in preventing and addressing the COVID-19 pandemic and its aftermath.


Introduction
Coronaviruses are RNA viruses widely found among many mammal species, including human beings 1 . Although these viruses generally have low virulence, two epidemics caused by severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are considered major public health events of the past two decades. The case fatality rates were 10% and 37% for SARS-CoV 2 and MERS-CoV 3 , respectively. In December 2019, a novel coronavirus emerged in Wuhan City, Hubei Province of China 4 . This outbreak was unique in terms of high pathogenicity and mortality compared to the earlier epidemics by coronaviruses 5 . Soon, cases of infection with the novel coronavirus were found outside Wuhan and eventually around the world. On January 30, 2020, the World Health Organization (WHO) declared the outbreak a public health emergency of international concern 6 . Later, on February 11, 2020, the WHO named the disease caused by the novel coronavirus "COVID-19", a short form of "coronavirus disease 2019" 7 . With a growing number of new cases and increased mortality attributable to the COVID-19 pandemic 8 , global health discourses among the scientific community, policymakers, and the general population are emphasizing what is known about this virus. Although COVID-19 is known to differ from the diseases caused by SARS-CoV and MERS-CoV, the scientific knowledge on COVID-19 remains limited to the scope of recently published articles. It is essential to understand the evolution of emerging scientific knowledge on COVID-19 to inform further research as well as evidence-based policymaking.
Bibliometric analysis quantitatively examines the research progress of a topic and offers a comprehensive assessment of scientific research trends; it is widely used for mapping knowledge in different scientific disciplines [9][10][11] . As a research method, bibliometric study was used as early as 1917 by Cole and Eales, who studied the growth of scientific production of articles published in the field of comparative anatomy 12 . This approach was subsequently termed "bibliometrics" by the British scientist Alan Pritchard 13 . Over the years, bibliometric studies have been used for analyzing a topic or field emerging in the global knowledge landscape and evaluating the evolution of research over time [14][15][16] . More importantly, such studies provide critical insights on the most prolific authors, institutions, countries of affiliation, thematic change within the domain itself, co-citations, co-authorships, and holistic development of the field(s) of interest 9,13,16 .
With a growing interest in COVID-19 related research across the globe, a bibliometric study may inform the current status of global research and provide meaningful insights on future research. An earlier bibliometric analysis evaluated the scientific literature on different coronaviruses 17 . However, there is no bibliometric analysis available to date that specifically focuses on contemporary scientific development on COVID-19. This study aimed to address this knowledge gap and conducted a bibliometric analysis to evaluate the characteristics of the current body of literature on COVID-19, identify the prolific authors, institutions, and countries involved in COVID-19 research, and examine the evolution of key knowledge areas within COVID-19 related studies.

Methods
For this study, bibliometric data were collected from the Science Citation Index Expanded, Social Sciences Citation Index, and Emerging Sources Citation Index databases within the Web of Science (WoS) core collection. These databases are maintained by Clarivate Analytics, which offers the world's leading scientific citation search and analytical information platform 18 . Collectively, the WoS collection provides enriched bibliometric data useful for citation analytics and for mapping the knowledge in a given domain of scientific research by examining its leading authors, institutions, and collaborating nations.
The following query was administered to retrieve COVID-19 related bibliometric data: "Novel coronavirus" OR "Novel coronavirus 2019" OR "2019 Novel coronavirus" OR "2019 nCoV" OR "COVID-19" OR "Wuhan coronavirus" OR "Wuhan pneumonia" OR "SARS nCoV" OR "SARS-CoV-2". Considering the timing of the outbreak in late 2019, the search was limited to 2019-2020 to retrieve publications on COVID-19 rather than on earlier coronaviruses. All search fields, including topics, titles, and abstracts, were selected to ensure the sensitivity of the search strategy. The search was conducted on February 22, 2020, and last updated on April 1, 2020. Moreover, no restrictions on languages or publication types were applied to the search due to the low number of publications on this recent topic. The inclusion criteria for this bibliometric study were as follows: a) journal articles published on the COVID-19 topic, b) publications in English, c) articles irrespective of their methodology, and d) studies published between January 1, 2019, and April 1, 2020. Articles were excluded if they did not meet any of these inclusion criteria. The references of the retrieved articles were not evaluated; therefore, articles retrieved through the citation search are the only source of data in this bibliometric study.
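The search string and screening rules described above can be sketched in code. The following is a minimal Python illustration, not the authors' actual workflow: the record layout (dictionaries with `language` and `pub_date` keys) and all function names are assumptions made for this example, and only the language and date criteria are modeled.

```python
from datetime import date

# Search terms exactly as listed in the Methods text.
SEARCH_TERMS = [
    "Novel coronavirus",
    "Novel coronavirus 2019",
    "2019 Novel coronavirus",
    "2019 nCoV",
    "COVID-19",
    "Wuhan coronavirus",
    "Wuhan pneumonia",
    "SARS nCoV",
    "SARS-CoV-2",
]

# Publication window from inclusion criterion (d).
WINDOW_START = date(2019, 1, 1)
WINDOW_END = date(2020, 4, 1)


def build_query(terms):
    """Quote each term and join the terms with OR."""
    return " OR ".join(f'"{term}"' for term in terms)


def meets_inclusion_criteria(record):
    """Check criteria (b) and (d); the field names are hypothetical."""
    in_english = record["language"].strip().lower() == "english"
    in_window = WINDOW_START <= record["pub_date"] <= WINDOW_END
    return in_english and in_window


def screen(records):
    """Keep only the records that satisfy the modeled criteria."""
    return [r for r in records if meets_inclusion_criteria(r)]


if __name__ == "__main__":
    # Made-up records, used only to exercise the functions.
    sample = [
        {"title": "Example A", "language": "English", "pub_date": date(2020, 2, 10)},
        {"title": "Example B", "language": "Chinese", "pub_date": date(2020, 3, 1)},
        {"title": "Example C", "language": "English", "pub_date": date(2018, 6, 5)},
    ]
    print(build_query(SEARCH_TERMS))
    print([r["title"] for r in screen(sample)])
```

Keeping the query terms in a list makes the search strategy easy to audit and to rerun when the search is updated.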
After extracting bibliometric data from WoS, the citations were uploaded to RefWorks (freely available alternative: Mendeley), a cloud-based citation management software. They were then screened as per the criteria described earlier, and the retained citations were uploaded to R (Version 3.6.1). Using this software, descriptive analyses were conducted to evaluate the characteristics and types of documents. The WoS metrics were also used to identify the top ten most-cited articles in the literature, as well as the top ten authors and journals based on the number of published documents on COVID-19. In addition, co-authorship among all the authors in the bibliography was assessed to evaluate how many of them were connected through documents they authored or co-authored.
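The descriptive measures mentioned above can be illustrated with a short Python sketch. This is not the R workflow used in the study: the record layout (`authors` and `citations` keys) and the function names are hypothetical. Authors per document is computed here as unique authors divided by documents, which is an assumption about how the reported figure was derived.

```python
from collections import Counter


def describe(records):
    """Return unique authors per document and mean citations per document."""
    n_docs = len(records)
    if n_docs == 0:
        raise ValueError("no records to describe")
    unique_authors = {a for r in records for a in r["authors"]}
    total_citations = sum(r["citations"] for r in records)
    return {
        "documents": n_docs,
        "authors": len(unique_authors),
        "authors_per_document": len(unique_authors) / n_docs,
        "citations_per_document": total_citations / n_docs,
    }


def top_authors(records, n=10):
    """Rank authors by the number of documents they appear on."""
    counts = Counter(a for r in records for a in set(r["authors"]))
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up records, used only to exercise the functions.
    sample = [
        {"authors": ["A", "B"], "citations": 4},
        {"authors": ["A"], "citations": 1},
        {"authors": ["A", "C", "D"], "citations": 1},
    ]
    print(describe(sample))
    print(top_authors(sample, n=1))
```

The same counting pattern applies to journals by replacing the author list with the source title of each record.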
Further, the affiliated institutions and countries of the respective authors were mapped using a network analysis approach. This set of analyses allowed an evaluation of the nature and magnitude of collaboration at the individual, institutional, and international levels and of how such collaboration affected the knowledge base on COVID-19. Also, keywords and texts in titles and abstracts within scientific documents were identified and evaluated through text-mining approaches using the shiny package in R (Version 3.6.1). At this stage, network analyses were conducted to assess the connectedness among those documents and related keywords. For the co-occurrence of multiple authors, keywords, institutions, and countries, different thresholds were used to create visualizations of frequency distributions for each variable, whereas all entries within each variable were assessed at the same threshold to ensure equitable comparisons within the respective fields of analysis. The relational maps among the authors, institutions, countries, and common keywords were created using VOSviewer software, a bibliometric tool for visualizing citation data. In this mapping process, networks were developed at different thresholds for authors (n = 3 documents per author), institutions (n = 7 documents per institution), countries (n = 1 document per country), and keywords (n = 3 co-occurrences per keyword). In addition, a multidimensional scaling approach was used to conduct a factorial analysis of the 50 most frequently occurring research terms in the bibliometric data, using the R package stated earlier. This allowed the construction of a conceptual structure map depicting hierarchical relationships among knowledge areas within the research landscape of COVID-19.
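The threshold-based network construction described above can be sketched as follows. This is a simplified Python illustration, not the VOSviewer procedure: the function name is hypothetical, and the threshold is interpreted here as the minimum number of documents in which an item (an author, institution, country, or keyword) appears.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, min_occurrences):
    """Build weighted co-occurrence edges among items meeting a threshold.

    documents: one list of items (e.g. keywords or author names) per document.
    min_occurrences: minimum number of documents an item must appear in.
    Returns the set of retained items and a Counter of edge weights.
    """
    # Count each item at most once per document.
    item_counts = Counter(item for doc in documents for item in set(doc))
    kept = {item for item, count in item_counts.items() if count >= min_occurrences}

    edges = Counter()
    for doc in documents:
        # Sorting gives each pair a single canonical orientation.
        present = sorted(set(doc) & kept)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return kept, edges


if __name__ == "__main__":
    # Made-up keyword lists, used only to exercise the function.
    docs = [
        ["sars", "wuhan", "pneumonia"],
        ["sars", "wuhan"],
        ["sars", "outbreak"],
        ["wuhan", "sars"],
    ]
    nodes, links = cooccurrence_edges(docs, min_occurrences=3)
    print(sorted(nodes))
    print(dict(links))
```

Applying one threshold to all entries of a variable, as the text describes, corresponds to calling the function once per variable with a single `min_occurrences` value.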

Results
Summary of bibliometric data and the current status of scientific production on COVID-19
A total of 422 bibliometric records were retained in this study, which were written by 1652 authors, with 3.91 authors per document (Table 1). Most authors (n = 1581) contributed to multi-authored documents, and the mean number of citations received per document was 2.47. A major proportion of the publications were articles (33.41%) and editorials (32.23%).
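The summary figures above can be cross-checked with simple arithmetic. The sketch below recomputes the authors-per-document ratio from the reported totals and back-calculates approximate document counts from the reported percentages. The back-calculated counts are derived here for illustration and are not reported in the text, and the function names are hypothetical.

```python
def ratio(numerator, denominator, digits=2):
    """Divide and round to the number of decimals used in the text."""
    return round(numerator / denominator, digits)


def implied_count(total, percent):
    """Back-calculate an approximate count from a reported percentage."""
    return round(total * percent / 100)


RECORDS = 422   # bibliometric records reported in the text
AUTHORS = 1652  # authors reported in the text

if __name__ == "__main__":
    print(ratio(AUTHORS, RECORDS))        # 3.91 authors per document
    print(implied_count(RECORDS, 33.41))  # 141 articles (derived)
    print(implied_count(RECORDS, 32.23))  # 136 editorials (derived)
```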
In addition, the top ten articles based on the number of citations in WoS were identified (Table 2), which included genetic, epidemiological, and clinical studies on COVID-19. Among the authors, Mahase E. had the highest number of publications (n = 13), followed by Akhmetzhanov AR., Linton NM., Nishiura H., and Zhang W. with seven publications per author. Also, the top ten journals that published the highest number of documents were identified, which included British Medical Journal (n = 47), followed by The Lancet (n = 37), Eurosurveillance (n = 22), Journal of Medical Virology (n = 22), and Intensive Care Medicine (n = 13).

Networks of authors, institutions, and nations in COVID-19 research
A network map of co-authors who contributed to COVID-19 research was created at the threshold of 3 documents per author, which found 52 collaborating authors, as illustrated in Figure 1. Scattered zones show groups of collaborating authors, whereas connections between individuals and groups are plotted accordingly. Figure 2 shows the network of collaborating institutions that were affiliated with at least seven documents on COVID-19 research. This threshold identified 16 collaborating institutions, including Capital Medical University (number of documents, n_d = 13; number of citations, n_c = 237), Huazhong University of Science and Technology (n_d = 18, n_c = 167), and Wuhan University (n_d = 15, n_c = 154).
Further, the bibliometric records were analyzed for the contributing countries, and a network was developed using co-authorship among scholars from those nations. Any collaborating country with at least one publication was included in this map (Figure 3). Among all nations, China had the highest number of documents (n = 185), followed by the US (n = 68), the UK (n = 36), Italy (n = 23), and Canada (n = 23).

Intellectual evolution among documents in the key knowledge areas
In this study, a visualization guided by quantitative evaluation of the co-occurrence of keywords was prepared, as depicted in Figure 4. A threshold of at least three co-occurrences of a keyword was set to identify the most frequent research terms indexed in the literature, which revealed a total of 69 keywords. The most frequently co-occurring words included "coronavirus" (n = 69), "sars" (n = 47), "covid-19" (n = 44), "2019-ncov" (n = 43), "sars-cov-2" (n = 26), "pneumonia" (n = 25), "wuhan" (n = 18), and "outbreak" (n = 18).
A factorial analysis was conducted among the leading 50 key terms in the bibliometric data using a multidimensional scaling approach (Figure 5). This analysis resulted in a dendrogram of repeatedly co-appearing keywords in hierarchical clustering, which highlights conceptual structures in the research field. The first cluster in this dendrogram (in blue) included research terms such as diversity, multiple sequence alignment, and sars-like coronaviruses. Another cluster (in red) comprised research terms related to the pathogenicity of the coronavirus outbreak, earlier outbreaks with other typologies, epidemiology, and diagnostic approaches. Both structures shared several common thematic areas, including zoonotic connections in COVID-19 epidemiology and genetic and molecular properties of interest in COVID-19 research.

Discussion
This bibliometric study and knowledge mapping identified contemporary scientific documents on COVID-19 from scholarly sources. The findings of this study reflect the recent scholarly growth of the global body of knowledge on COVID-19. Most documents had multiple authors from different collaborating institutions and nations, which highlights the productivity of scientific activities. Moreover, research keywords presented in the bibliometric data reflect the complexity and inclusion of multiple disciplines like virology, microbiology, infectious diseases, clinical medicine, public health, allied health sciences, social sciences, and other branches of knowledge. Such scholarly growth of the knowledge base may help in understanding the ontology and phenomenology of a new global health challenge imposed by COVID-19.
Notably, the affiliated institutions and nations that collaborated in COVID-19 research may illustrate the utility of global research collaboration, particularly during complex public health problems where multiple stakeholders from different institutions and contexts may offer diverse resources and competencies for addressing knowledge gaps more efficiently than individualistic approaches. Addressing such knowledge gaps can be profoundly challenging for low- and middle-income countries, such as nations in Africa and South Asia, which have suboptimal research capacities and a poor evidence base for making informed decisions on highly prevalent health problems 19 .
Another challenge is a critical lack of technological infrastructure in such contexts, which reinforces the need to strengthen global collaborations for research and evidence synthesis. For example, resource-constrained contexts have less availability of and access to advanced technologies, which may limit their ability to conduct research requiring tools like deep learning or other computational approaches 20,21 . This may also restrict opportunities for substituting time-intensive lab-based research with simulation or for increasing the speed and quality of research processes. This may be a reason for the lack of representation of studies from countries in South Asia, South America, and Africa. Future efforts should focus on strengthening research capacities in those contexts, which is essential to improve regional and global knowledge on persisting and emerging diseases affecting global populations.
Also, it is essential to acknowledge the need for global collaborations, as the magnitude of the problem necessitates a series of large-scale analyses, exchange of perspectives, knowledge synthesis, and translation of that knowledge to inform evidence-based policies and practices 22,23 . More importantly, increased collaborations in research are likely to facilitate trust and cooperation in developing scalable solutions globally, minimizing the cost and maximizing human benefits beyond borders 24,25 . Lessons learned from research collaboration can foster hope amid existing global health disparities, particularly in developing vaccines and other preventive solutions 26,27 . These aspects are critical for the overall development of COVID-19 related research and practice, as the current evidence on collaboration shows scattered growth of research groups, which may limit the true potential that collaborative efforts may offer in this scenario.
This study identified top keywords that appeared in the scientific literature and demonstrated how they co-appeared across studies, taking intellectual roots from earlier studies. Moreover, the conceptual structure derived from those keywords reflects the current scenario of uncontrolled observations retrieved from global studies. Keywords are useful not only for retrieving studies from databases or topics within studies, but they also convey the scientometric themes underlying the information presented in a document 28,29 . In addition, these conceptual constructs may inform future scientific measures to define and distinguish sub-domains within the knowledge base on COVID-19. Furthermore, similar keywords appeared in multiple documents in this study, of which about one-third were original articles, which indicates the early stage of research. This early finding may offer critical directions for scientific development in this knowledge domain. Also, this study found more frequent appearances of epidemiological, genetic, and molecular biological keywords in COVID-19 studies, whereas some keywords indicated an emergence of zoonotic topics, including other animals related to the human food chain or ecology.
It is notable that neither the keyword co-occurrence analysis nor the conceptual structure mapping found a significant presence of social, economic, political, or cultural determinants of COVID-19 in the global landscape. It is increasingly being recognized that neither disease nor health can happen in isolation from the complex web of those determinants of human lives 30-33 . The findings of this study highlight this gap, which necessitates further multi-sectoral research on how different determinants can be associated with higher or lower risks of COVID-19 among individuals or populations. In addition, programs and policies for addressing epidemic outbreaks may influence physical and psychosocial health outcomes in diverse population groups 34,35 , which remains another potential area for future research. Moreover, there is a lack of research that may inform preventive measures like vaccinations, pharmacological interventions, clinical prognosis, and outcomes of COVID-19. Such studies may become available in the future, which will enrich future scientometric analyses and evidence mapping processes.
Another issue is that the existing literature mentions little about the psychosocial and economic consequences of COVID-19. A major public health crisis like COVID-19 can affect those aspects of lives and create lasting problems among the affected populations 36,37 . Perhaps it is too early to estimate such impacts or get them published in indexed sources, which would need more research and timely communication across journals and other media. In addition to epidemiological and genetic studies, psychological, econometric, and social sciences research assessing those concurrent and future challenges should be prioritized to improve the knowledge base in those areas.

Limitations of this study
This study has several limitations that must be acknowledged to interpret the findings and to address those limitations through future research. First, this study used three databases from the WoS core collection, which may have included most studies in a given domain but may exclude studies that are exclusively indexed in other databases. This may affect the generalizability of the findings. Second, newly published studies may take some time to get indexed in WoS; such studies could instead be sourced by searching individual journals that published articles related to COVID-19. A similar gap exists in terms of preprints that are available in their respective servers, and insights from those articles cannot be reflected in this study. Also, research studies may take time to get published as journal articles and to be indexed in associated databases, which may also limit the extent to which the current literature reflects contemporary knowledge. Third, a bibliometric analysis provides an overview of the evolution of a knowledge domain and is methodologically different from approaches used in clinical reviews. Such reviews may have different objectives and methods of synthesis, which were beyond the scope of this study. However, this study evaluated the knowledge evolution on COVID-19, which may have long-term impacts on the field of COVID-19 studies and future discourses on public health emergencies. The above-mentioned issues should be considered when using the findings of this study and when conducting future research and evidence synthesis on COVID-19 that addresses those challenges.

Conclusions
A public health emergency, like the COVID-19 pandemic, may affect different frontiers of human lives globally. To solve such problems, it is necessary to fully understand the problem and the solutions that may address it. This need for knowledge is a fundamental force that keeps science alive and allows scientists to thrive in their research domains, bringing the best possible methods and materials to answer real-life questions. Solving a complex public health problem like COVID-19 needs robust knowledge generated through rigorous methods specific to each problem related to different dimensions of COVID-19 as well as the lives of millions of people around the world. This study provided a global bibliometric evaluation of COVID-19 related studies, which may facilitate ongoing and future research. Such academic and professional efforts in understanding and addressing COVID-19 will be informed by the knowledge base we have today, which will continue to evolve over time, enriching science and societies globally.
