Keywords
bioinformatics, repurposing, coronavirus, SARS-CoV-2, COVID-19, virus, infection, therapeutic, target
This article is included in the Pathogens gateway.
This article is included in the Emerging Diseases and Outbreaks gateway.
This article is included in the Bioinformatics gateway.
This article is included in the Cell & Molecular Biology gateway.
This article is included in the Coronavirus (COVID-19) collection.
bioinformatics, repurposing, coronavirus, SARS-CoV-2, COVID-19, virus, infection, therapeutic, target
This updated version augments the prior contents of Table 4. Specifically, this table now includes newly-added columns that summarize the number of positive comparisons as well as the number, percentage, and rank of each drug based on the comparisons that were performed in this study.
See the authors' detailed response to the review by José Pedro Cerón-Carrasco
Coronaviruses consist of a lipid envelope that contains a single-stranded positive-sense RNA genome that is approximately 30 kilobases in length. Prior to 2019, six human coronavirus species had been discovered including HCoV-229E, HCoV-NL63, HCoV-HKU1, HCoV-OC43, SARS-CoV, and MERS-CoV. Four of these coronavirus species are endemic in humans and typically cause mild respiratory tract infections that present with cold-like symptoms but can cause more severe symptoms in immunocompromised individuals or infants1,2. Two of these four endemic virus species are members of the Alphacoronavirus genus (HCoV-229E and HCoV-NL63), while the other two species are members of the Betacoronavirus genus (HCoV-HKU1 and HCoV-OC43). The remaining two novel human coronavirus species discovered during this time are severe acute respiratory syndrome coronavirus (SARS-CoV; SARS) and Middle East respiratory syndrome coronavirus (MERS-CoV; MERS), which were emergent Betacoronaviruses responsible for epidemics in 2003 and 2012 respectively1,3. Human coronaviruses generally emerge from other animal hosts such as bats or mice, and typically pass through an intermediate host (e.g. civet cats for SARS and dromedary camels for MERS) before infecting a human host1,3.
In December 2019, a novel coronavirus was reported in Wuhan city, Hubei province, China and has since been named severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). SARS-CoV-2 is highly similar to other SARS-like viruses that have been isolated from bats previously, especially BatCoV RaTG13 with which it shares 96.3% identity4,5. Initial infections with SARS-CoV-2 were traced back to the Huanan Seafood Wholesale Market and likely infected humans via a pangolin intermediate6. The global COVID-19 pandemic, as of February 14, 2021, has resulted in over 108 million cases and over 2.3 million deaths worldwide. Approximately 7.6 million of the cases and over 500,000 deaths have occurred in the United States of America7.
Common symptoms of COVID-19 include fever, dry cough, dyspnea, sore throat, myalgia, fatigue, and in some cases diarrhea4,5,8–11. SARS-CoV-2 is spread through aerosols, droplets, direct contact between people, and fomites4,12. Other studies suggest a fecal-oral transmission route is possible due to the presence of SARS-CoV-2 in stool samples of infected patients12,13. SARS-CoV-2 infects cells by binding to the membrane-associated Angiotensin-converting enzyme 2 (ACE2) receptor, which generally plays a role in the renin-angiotensin-aldosterone system to regulate blood pressure and fluid balance in the body14,15. ACE2 receptors are known to be expressed in lung, renal, cardiac, vascular, intestinal, and placental tissues14,15.
Both the widespread effects of COVID-19 and the initial highly susceptible population emphasized the need to identify potential drug treatments for emerging diseases--particularly before vaccines become available. The aim of many recent and ongoing clinical trials is to quantify the efficacy of various therapeutics for COVID-1916. Many vaccines are in various stages of preclinical (at least 139) or clinical (at least 25) development17, as well as some that have gained emergency use authorization by the US Food and Drug Administration (FDA).
The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has highlighted the need to quickly and accurately identify therapeutics that can be repurposed to combat the signs, symptoms, and spread of disease. One method for predicting potential therapeutics is to identify host pathways that are dysregulated by infections and then find existing drugs that target those pathways. Programs to perform this analysis include DrugThatGene and GPSnet, which have both been used to identify potential drugs to target cancers18,19. The aim of this study was to construct a bioinformatics workflow that uniquely combines existing tools, databases, and programming libraries with custom scripts to predict potential human therapeutic targets for multiple members of the Betacoronavirus genus, which includes the SARS-CoV, SARS-CoV-2, and MERS-CoV species. Our unique combination of tools consists of a consistent and robust RNA-seq preprocessing workflow as well as an intracellular signaling pathway perturbation method that enables us to account for the role of protein-protein interaction networks instead of merely enriching for differentially expressed genes. We then applied this workflow to a use case involving a meta-analysis of coronaviruses and other respiratory viruses. This workflow first performs an analysis of human genes and significant signaling pathways that play a role in infection and pathogenesis. The pathway information is then used to predict relevant human drug targets and the associated small molecules or biologics that bind to the target of interest.
The rationale for identifying drug targets from multiple intersecting signaling pathways is based on the theory that a protein which participates in multiple affected pathways during viral infection has a higher likelihood of playing an important role in viral pathogenesis and replication. Targeting one or more of these proteins that act as “key hubs” with a therapeutic would therefore have a higher chance of reducing viral processes and the ensuing disease. Similarly, host proteins that participate in multiple pathways across various related viruses likely represent an evolutionarily conserved host-pathogen interaction that can be therapeutically modulated. We expect virus resistance to these host-based drugs to be relatively infrequent since they target relevant host processes. Our workflow could therefore be applicable not only to improving therapeutic treatment during infection with existing or emerging coronaviruses, such as SARS-CoV-2, but to rapidly identifying potential treatments for pathogens that may emerge in the future.
A search of the Gene Expression Omnibus (GEO) database, hosted at the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/geo/)20, was performed in mid-2020 to find RNA-sequencing datasets for various viruses including “MERS”, “SARS”, and “coronavirus”. The corresponding sequencing data for four GEO series were retrieved from the Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) at NCBI21: GSE122876, GSE56192, GSE147507, GSE13951622–24. These datasets were generated from cell cultures or patients infected with one of: respiratory syncytial virus (RSV), influenza A virus (IAV), MERS, SARS, and SARS-CoV-2. The SRA files were downloaded and converted to fastq format using version 2.10.1 of the NCBI sratools software package (https://github.com/ncbi/sra-tools).
The Automated Reproducible MOdular Workflow for Preprocessing and Differential Analysis of RNA-seq Data (ARMOR) workflow was used to preprocess and analyze the fastq files against the Ensembl reference transcriptome for Homo sapiens build GRCh38, release 98 [GCA_000001405.15]25. Briefly, this automated snakemake-based workflow performs quality control of the reads with fastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/), trims the adapters and poor-quality regions with TrimGalore! (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/), maps & quantifies reads to the human transcriptome using Salmon26, performs differential expression using edgeR27, identifies differential transcript usage with DRIMseq28, and enriches on Gene Ontology terms29 and Hallmark gene sets30 using the Camera algorithm to adjust for inter-gene correlation31. Differential expression was performed by calculating log2 fold-change and the associated p-values from infected samples versus mock-infected samples. The significance threshold for differential gene expression and Gene Ontology enrichment was defined as an false discovery rate (FDR)-corrected p-value < 0.05. Gene Ontology enrichment results from all datasets were then combined to identify shared terms.
Once the differentially expressed gene lists were constructed, the Ensembl identifiers for these human gene lists were mapped to the corresponding NCBI Entrez Gene identifiers using BioMart and Bioconductor32,33 prior to pathway enrichment using Signaling Pathway Impact Analysis (SPIA)34. Briefly, this pathway analysis combines robust statistics and bootstrapping to identify enriched pathways from lists of genes by incorporating the directionality of expression. Public pathway databases in the Graphite package in R Bioconductor were used by SPIA for enrichment with those surpassing a Bonferroni-corrected p-value < 0.05 being retained35. These databases include: KEGG36, Reactome37, Panther38, NCI39, and BioCarta. All pathways from each dataset were then compared to identify those that were shared among the various datasets, as well as those that were unique only to infection with SARS-CoV-2.
The output from the SPIA pathway enrichment were then used as input for a custom bioinformatics pipeline to identify existing drugs and biologics that could be repurposed to reduce signs, symptoms, and/or replication of coronaviruses. Specifically, this pipeline iteratively 1) retrieves the genes that participate in each statistically significant signaling pathway, 2) maps the gene identifiers to the corresponding UniProt protein identifier40, 3) searches the opentargets resource (www.opentargets.org41) to identify known drug targets and therapeutics, and 4) generates a table with various attributes of the target and small molecule/biologic treatment. These data were then integrated with the pathway comparisons across the various datasets and analyzed to determine what drugs would affect the pathways dysregulated in the greatest number of viruses and could be used as potential therapeutics for SARS-CoV-2 infection. The results were ranked by how many times a given protein target appeared in the results across the relevant virus taxa. The top ranked small molecules/biologic treatments were analyzed to determine which were predicted to reverse the effects of the viral infection on a given pathway. Manual review of high-ranking hits was then performed to determine the existing indication(s) for each treatment, followed by a literature search to determine which, if any, of the therapeutics identified had previously been used or considered for the treatment of COVID-19. Code for this workflow can be found on GitHub: https://github.com/bpickett/Pathway2Targets.
Each of the algorithmic components of our computational workflow was dependent on first calculating differentially expressed (DE) genes for each of the datasets. We consequently began by using the same computational workflow to generate the DE genes from the raw transcriptomic data for human cells infected with one of: RSV, IAV, MERS, SARS, and SARS-CoV-2. Each set of results was specific to the comparison that was performed (e.g. mock-infected vs. infected) and included both log2 fold-change values and FDR-corrected p-values. We then integrated and compared the differential expression results across all virus comparisons to facilitate downstream comparisons.
We began by identifying genes that were either up- or down-regulated across a large number of studies. Specifically, we calculated the number of DE genes (FDR-corrected p-value < 0.05) that significantly changed during infection with influenza A (4205 genes), RSV (3661 genes), MERS (range across studies: 1222-13006 genes), SARS-CoV (range across studies: 2130-5557 genes), or SARS-CoV-2 (range across studies: 427-6933 genes). Out of the 15 studies evaluated, we observed 11 genes that displayed increased expression across 10-13 study comparisons including: TRIM25, C3, NCOA7, PTAFR, TNFAIP3, EIF2AK2, HELZ2, HBEGF, NFKB2, REL and VEGFC. We found an additional 18 genes that were upregulated across 9 of the 15 comparisons including: IFIT2, EGFR, and FOSL1. In contrast, we identified 14 genes that showed decreased expression across 11-12 comparisons including: EIF4H, APH1A, NME2, SEC61B, TUBB4B, CHCHD2, TUBA1C, TP53I3, H2AFZ, PEBP1, HOXB6, TPM2, CBR1, and SIVA1.
We next analyzed the DE lists to identify significant genes that were shared across infections involving only the Betacoronaviruses including SARS-CoV, MERS-CoV, and SARS-CoV-2. This analysis revealed 12 genes that were upregulated in at least nine of thirteen comparisons including: TRIM25, REL, DNASE1, C3, NCOA7, PTAFR, EIF2AK2, HELZ2, NFKB2, ZNF385C, ZC3H12A, and OR7E122P. In contrast, the 10 genes that were found to be downregulated in ten of the thirteen comparisons involving any of the Betacoronaviruses included: CHCHD2, SMIM3, EIF4H, NME2, CBR1, SPARC, DSTN, CDC123, TIMM17A, and PDAP1.
We then performed a similar analysis to identify the five statistically significant DE genes (FDR-corrected p-value < 0.05) that displayed the highest and lowest average fold-change values during infection across all comparisons of SARS-CoV-2, Betacoronaviruses, or all viruses (Figure 1). This analysis showed that a subset of the genes such as CXCL11, which is induced by interferon and is involved in T-cell signaling, displayed similar fold-change values across multiple comparisons. This finding was somewhat expected since certain genes involved in the host innate immune response are expected to be modulated during virus infection. However, we also observed that genes such as MUC3A, PCSK5, MRC1, and CLEC3B displayed somewhat different average fold-change values across the included comparisons. This observation was also expected given the diverse virus replication requirements and the resulting host intracellular transcriptional response that occurs during infections with human viral pathogens.
The five genes with the highest and lowest log2 fold-change values for each virus comparison were identified and compared against expression in the other datasets. Squares in red or blue represent genes that are upregulated or downregulated (respectively) in the relevant comparison.
Given the number of DE genes involved in the analysis, manual interpretation of the results would be intractable. We therefore performed an analysis to determine which Gene Ontology (GO) terms and Hallmark gene sets were specifically enriched to better understand which biological processes and molecular functions were represented by the genes in each of the DE gene lists. Overall, we observed a superset of 580 terms that were enriched in any of the included viral infections. The enrichment results for each taxon of virus were then reviewed to identify annotated terms that were shared between multiple types of viral infections, shared amongst coronaviruses, or specific to SARS-CoV-2.
We found several notable terms that were shared among the results obtained from infection of multiple unrelated taxa of viruses. These terms referred to functions such as the host interferon response, regulation of virus response, chemokine activity, and immune cell migration that were positively enriched in both the RSV infection of A549 cells and the SARS-CoV-2 infection of either A549 or NHBE cells. As expected, the statistical significance of these terms supports the important role that the human immune system plays in response to virus infection, which is an expected result and validates the upstream DE gene analysis.
Our method identified no significant GO terms that were shared across all Betacoronavirus comparisons. Interestingly, we found that each coronavirus species had its own set of uniquely enriched terms. In MERS-CoV, “negative regulation of inclusion body assembly” was positively enriched in four of the seven comparisons, while “aggrephagy” and “regulation of nucleotide binding oligomerization domain containing 2 signaling pathway” were positively enriched in three of the seven comparisons. The SARS-CoV samples had very few significantly enriched annotation terms including the “signal peptidase complex” and “cyclin dependent protein serine threonine kinase activator activity” terms that were each negatively enriched in only one of the four SARS-CoV comparisons.
We also identified 290 terms that were uniquely present among the SARS-CoV-2 comparisons, although no significant terms were identified in comparisons involving either the human post-mortem biopsies or the infected Calu-3 cells. The shared significant terms that we identified in the NHBE- and A549-infected cells included “response to chemokine”, “antimicrobial humoral response”, and “humoral immune response.” Enriched terms in the NHBE-infected cells alluded to the role of the inflammatory response, interleukin-1, interleukin-6, neuroinflammation, arachidonic acid binding, and T-cell activation.
We then wanted to calculate which intracellular signaling pathways were significantly modulated during viral infection, based on the DE genes identified above. The pathway information used for this analysis was obtained from public databases that manually curate the proteins that participate in conveying a signal from receptors on the cell membrane to transcription factors within the nucleus in order to respond to a stimulus.
We used the results of the differential gene expression analysis as input to the SPIA algorithm. The robust bootstrap-based approach of this method identified 249 pathways that were significantly perturbed across at least one viral comparison. We subsequently analyzed the lists of significantly perturbed pathways for each comparison to determine those that were shared across virus taxa (e.g. influenza A, respiratory syncytial, MERS-CoV, SARS-CoV, and SARS-CoV-2), shared among coronaviruses, and unique to SARS-CoV-2. This analysis revealed a subset of the total number of pathways to be dysregulated across many of the viral taxa studied. These pathways included translation (affected in 12/17 comparisons and 4/5 viral taxa), cytokine-cytokine receptor interaction (modulated in 11/17 comparisons and 4/5 viral taxa), as well as rRNA processing (modulated in 10/17 comparisons and 3/5 viral taxa). We also observed that some of these pathways were predicted to be activated during infection with certain viral taxa in specific cell types, while inhibited in others.
We did not observe any signaling pathways that were significantly and consistently affected during infection by the individual coronavirus species examined. However, we did detect relevant pathways across multiple virus taxa. Interestingly, we found that the direction (i.e. activated or inhibited) was occasionally dependent on virus taxa or timepoint of infection. For example, the cytokine-cytokine receptor interaction pathway was predicted to be activated during infection with RSV, a subset of comparisons involving MERS-CoV, and many SARS-CoV-2 infections; while the same pathway was predicted to be inhibited during infection with SARS-CoV.
We predicted a total of 38 pathways that were uniquely affected during SARS-CoV-2 infection (Table 1). We noted that 35 of these pathways were only found to be significant in one SARS-CoV-2 comparison and included “NF-kB signaling pathway”, “Interleukin-1 signaling”, “IL6-mediated signaling events”, “PI3K-Akt signaling pathway”, “Jak-STAT signaling pathway”, “Apoptosis”, “Complement and coagulation cascades”, and other processes associated with either the immune system or infectious diseases. The remaining three pathways that were predicted as affected during at least two SARS-CoV-2 comparisons were “Cytokine signaling in Immune system”, “Tuberculosis”, and the “Innate immune system”. In short, these findings indicate a set of signaling pathways that are strongly associated with virus infection and/or immune activation in the host, some of which are uniquely detected during SARS-CoV-2 infection. The “Cytokine Signaling in Immune System” pathway is stored in the Reactome database and consists of interferon alpha/beta and gamma signaling, interleukin 1, 2, 6, and 10 signaling, and other components37. To better understand the impact of SARS-CoV-2 on interleukin-6 signaling, we overlaid the differential expression data on a representation of the signaling pathway (Figure 2). This analysis revealed six of the eleven total proteins in the pathway were upregulated, while another protein was downregulated during SARS-CoV-2 infection.
This signaling pathway is a component of the larger Cytokine Signaling in Immune System pathway in Reactome. Each node represents a protein in the network, while each edge represents a characterized interaction between the proteins. Higher log2 fold-change values (up-regulation) are represented by increasing saturation of red, while more negative log2-fold-change values (down-regulation) are colored with increased saturation of blue. White nodes indicate no measured log2-fold change.
We next wanted to determine whether any of the significantly affected signaling pathways contained known drug targets that could be modulated to reduce infection, virus replication, and/or clinical signs and symptoms associated with infection by a panel of Betacoronaviruses or by SARS-CoV-2 alone. To do so we cross-referenced the results from our significant pathways analysis with the drug-target information accessible through an application programming interface (API) to the opentargets.org database, which yielded 179 potential human drug targets (Table 2). We ranked the predicted drugs and their associated targets according to how many coronaviruses shared the same drug target across the various datasets included in our analysis. We performed a separate ranking based on the data obtained solely from the SARS-CoV-2 studies.
After reviewing the results of this analysis, we identified 38 potential human protein targets to counteract MERS-CoV, SARS-CoV, and/or SARS-CoV-2. Specifically, 38 targets were predicted to be relevant in at least 10 of the 15 coronavirus comparisons (Table 3). Seventeen of these targets were identified across 11 comparisons and included interferon (IFN)-A, IFN A receptors, IL1A and IL12B. Twenty-one additional targets were identified across 10 comparisons and included members of the tumor necrosis factor (TNF) superfamily, CCR4, CCR5, GHR, CXCR4, IL2R, IL4R, IL5, IL5R, IL6R, MPL, and IL20. We also predicted seven targets that were relevant specifically to SARS-CoV-2 including JAK1, FDPS, TLR7, TLR9, PRKCA, TYK2, and VDR.
Our analysis predicted twelve existing drugs that are predicted to be useful as repurposed therapeutics against 73.3% of all coronavirus taxa evaluated in this work and 100% of the SARS-CoV-2 comparisons included in our analysis (Table 4). Eight of these twelve drugs are used to treat common autoimmune disorders such as systemic lupus erythematosus (SLE), Crohn’s disease, and multiple sclerosis (MS). Five of the twelve drugs have been associated or used as antiviral measures largely against hepatitis C virus. Another 27 drugs were predicted to have potential therapeutic activity against 66.7% of all coronavirus infections and 30 were predicted as potential therapeutics against 75% of SARS-CoV-2 infections.
We then analyzed the drugs that were identified as having targets in pathways affected by SARS-CoV-2 to determine which were predicted to reverse the effects of the viral infection on the affected pathway. Of the 42 drugs that targeted SARS-CoV-2 related pathways, 27 were predicted to reverse viral effects on these pathways. We then performed a literature search to determine if any of these drugs had been previously used to treat COVID-19 or had been identified as potential therapeutics by other research groups. We found six of the 12 therapeutics that we predicted to be useful against SARS-CoV-2 had already shown positive results in clinical tests including canakinumab, anakinra, tocilizumab, sarilumab, and baricitinib. These results give further support to the validity of our computational workflow.
The computational workflow that we describe in this work predicts human therapeutic targets from signaling pathways and gene expression that are significantly affected during infection. We applied this workflow within the context of a meta-analysis that consisted of multiple public transcriptomic datasets of Betacoronaviruses. We then validated our results by comparing our predictions against recently published studies reporting therapeutics for SARS-CoV-2. Specifically, our downstream analyses enable us to calculate significant signaling pathways from DE genes using the SPIA algorithm as well as to predict potential therapeutics and their respective targets. Our analysis revealed thousands of DE genes, 580 enriched functional terms, as well as 249 significant pathways, including 38 pathways that were specifically affected during infection with SARS-CoV-2. It is important to point out that this workflow focuses on identifying human drug targets for two reasons: 1) to aid in the repurposing of existing drugs against emerging pathogens, and 2) to reduce the likelihood that a pathogen will develop resistance against the therapeutic(s) since they interact with a human protein that is much less likely to mutate than a viral protein.
Our approach differs from prior meta-analyses by focusing on a consistent, robust ARMOR-based RNA-seq preprocessing workflow for all datasets as well as a downstream pathway perturbation analysis. Previous studies have used a variety of approaches to predict possible therapeutics42–46, but none have combined the various aspects that are described in this work. The SPIA algorithm we used in this study has been shown to provide robust statistical results of perturbed pathways while not simply enriching for DE genes34. It also enabled us to identify protein components in signaling pathways that could be reversed to reduce the adverse signs and symptoms that occur during infection, which differs from simply attempting to target DE genes. This approach drastically differs from some attempts to directly target DE genes without accounting for how a treatment may affect the cellular protein-protein interaction network. In this analysis we have compared significant DE genes and pathways identified in multiple different studies that used different MOIs, timepoints, and cell types. We found many DE genes and pathways that were affected across a variety of samples which increases our confidence in our results as those genes and pathways appear to be affected more by the virus itself than by other variables such as cell type, MOI, or timepoint.
Although our approach was dependent on identifying DE genes using a consistent preprocessing workflow, the focus of our analysis was to identify relevant functions, pathways, and potential drug targets from the DE genes. These processed data could then be used to better understand the underlying biological mechanisms of pathogenesis, but to better identify host-based therapeutic targets. A subset of the enriched annotations identified by the Camera algorithm have been reported to be relevant during infection with SARS-CoV-2 in clinical studies including “response to chemokine”47, “humoral immune response”48–50, “chronic inflammatory response”51,52, “toll like receptor binding”53,54, “interleukin-6 production”55 and citrate metabolism. A separate study of COVID-19 identified Bradykinin as potentially playing a role in pathogenesis56. Interestingly, the annotations for this gene and its receptor include several of the enriched terms we identified such as arachidonic acid, inflammation, and G-protein coupled receptor activity. We interpret these separate studies to validate the findings of our functional enrichment analysis and anticipate that future studies will shed additional insight into the underlying mechanism(s) of viral pathogenesis.
Many of the signaling pathway components that we identified in this study have also been reported previously. One prior proteomics study reported applying translation inhibitors to Caco-2 cells infected with SARS-CoV-2 reduced virus replication57, which supports our pathway perturbation results. Other studies have reported that the ORF6, ORF8, and nucleocapsid proteins of SARS-CoV-2 are antagonists of type-I interferon and NF-kB in HEK-293 T cells58 or the induction of apoptosis during viral infection59. In contrast, our meta-analysis predicted the type-I interferon pathway to be activated, suggesting either that this response could be dependent on the cell type, or that a potential redundant mechanism in the host cell can still turn on this pathway even if certain components are down-regulated. Our analysis predicted that noncanonical NF-kB signaling was inhibited during MERS and SARS infection, while being activated during SARS-CoV-2 infection in Calu-3 cells. While it is possible that this difference is due to a cell-specific response from the studies included in our meta-analysis, a NF-kB inhibitor applied to Vero E6 cells infected with SARS-CoV-2 has been shown to reduce cytopathic effects and virus plaques60. This result suggests that NF-kB signaling may be active and contribute to the inflammatory signs and symptoms observed during virus infection, which agrees with our results. The Toll-like receptor and JAK-STAT pathways were previously found to be relevant to SARS-CoV-2 infection in A549 cells, which we also identified in infected Calu-3 cells42. Citrate metabolism was also identified as an important pathway by our analysis and has been supported elsewhere61.
Although the US FDA has only issued emergency use authorization for therapeutic treatment for severe cases of COVID-19, a multitude of studies have reported results from attempting to treat patients with a variety of existing FDA approved therapeutics62–69. We found that 27 of our 42 predicted therapeutics are predicted to “reverse” the effect on the pathways relevant to the viral infections being compared. Twelve of the 27 drugs that were predicted to be potential therapeutics against SARS-CoV-2 and are used to combat autoimmune or inflammatory diseases such as MS, SLE, and rheumatoid arthritis while others have been used in cancer treatments and against viral infections such as hepatitis C and human immunodeficiency virus. Six of these 12 drugs have been used to treat COVID-19 in patients including canakinumab, anakinra, tocilizumab, sarilumab, baricitinib, and hydroxychloroquine70. Two other drugs, maraviroc and brodalumab, have been identified as potential treatments via cell cultures and computer models71–74. Others on the list such as benralizumab have been identified through anecdotal data as biologics that potentially exert a prophylactic effect for SARS-CoV-2, when they are taken at the time of infection75. Baricitinib is of particular interest as the US FDA has issued emergency use authorization for its use in conjunction with remdesivir in the treatment of COVID-19 patients over the age of two that have been hospitalized and require supplemental oxygen, invasive mechanical ventilation, or extracorporeal membrane oxygenation76. Baricitinib has also been shown to be effective against COVID-19 when combined with corticosteroids77. A small study involving Tocilizumab has also shown it can be useful in improving the outcome of patients with severe COVID-1978.
In conclusion, we developed and applied an important bioinformatics workflow, that combines existing tools with custom scripts, to predict potential human therapeutic targets. This workflow was then validated through a meta-analysis of publicly available transcriptomics data. The multiple Betacoronavirus and SARS-CoV-2 datasets revealed significant genes, annotations, signaling pathways, and human proteins that could be targeted by therapeutics during infection with various Betacoronaviruses. It is important to recognize that many of the predictions made by our workflow have been supported by experimental and clinical work on this virus, which suggests that our approach could enable the rapid identification of relevant therapeutics against future emerging pathogens.
SARS-CoV-2: Severe acute respiratory syndrome coronavirus-2
COVID-19: Coronavirus disease-2019
GO: Gene ontology
SARS / SARS-CoV: Severe acute respiratory syndrome coronavirus
MERS / MERS-CoV: Middle east respiratory syndrome coronavirus
ACE2: Angiotensin-converting enzyme 2
FDA: Food and Drug Administration
GEO: Gene expression omnibus
NCBI: National center for biotechnology information
ARMOR: Automated Reproducible MOdular Workflow for Preprocessing and Differential Analysis of RNA-seq Data
SPIA: Signaling pathway impact analysis
DE: Differentially expressed
FDR: False-discovery rate
GEO: Transcriptomic analysis of MERS-CoV infected Calu-3 cell with or without AM580 treatment, Accession number GSE122876: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122876
GEO: Transcriptomic analysis of the Novel Middle East Respiratory Syndrome Coronavirus (Human, MRC5 cells, Accession number GSE56192: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56192
GEO: Transcriptional response to SARS-CoV-2 infection, Accession number GSE147507: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE147507
GEO: Transcriptomic Analysis Of circRNAs/miRNAs/mRNAs upon Middle East Respiratory Syndrome Coronavirus (MERS-CoV) infection, Accession number GSE139516: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139516
Code for this workflow can be found on GitHub: https://github.com/bpickett/Pathway2Targets.
Archived code as at time of publication: http://doi.org/10.5281/zenodo.470619779.
Code is available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
We would like to thank the Office of Research Computing at Brigham Young University for their help on this project. We also gratefully acknowledge the original providers of the RNA-seq datasets.
Views | Downloads | |
---|---|---|
F1000Research | - | - |
PubMed Central
Data from PMC are received and updated monthly.
|
- | - |
Is the rationale for developing the new method (or application) clearly explained?
Yes
Is the description of the method technically sound?
Yes
Are sufficient details provided to allow replication of the method development and its use by others?
Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Structural Bioinformatics, Computational Biophysics, MD Simulation
Is the rationale for developing the new method (or application) clearly explained?
Yes
Is the description of the method technically sound?
Yes
Are sufficient details provided to allow replication of the method development and its use by others?
Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Biochemistry
Is the rationale for developing the new method (or application) clearly explained?
Yes
Is the description of the method technically sound?
Yes
Are sufficient details provided to allow replication of the method development and its use by others?
Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Computational Chemistry for Drug Design.
Alongside their report, reviewers assign a status to the article:
Invited Reviewers | |||
---|---|---|---|
1 | 2 | 3 | |
Version 2 (revision) 20 Aug 21 |
read | ||
Version 1 29 Apr 21 |
read | read |
Provide sufficient details of any financial or non-financial competing interests to enable users to assess whether your comments might lead a reasonable person to question your impartiality. Consider the following examples, but note that this is not an exhaustive list:
Sign up for content alerts and receive a weekly or monthly email with all newly published articles
Already registered? Sign in
The email address should be the one you originally registered with F1000.
You registered with F1000 via Google, so we cannot reset your password.
To sign in, please click here.
If you still need help with your Google account password, please click here.
You registered with F1000 via Facebook, so we cannot reset your password.
To sign in, please click here.
If you still need help with your Facebook account password, please click here.
If your email address is registered with us, we will email you instructions to reset your password.
If you think you should have received this email but it has not arrived, please check your spam filters and/or contact for further assistance.
Comments on this article Comments (0)