Coronavirus disease 2019 drug discovery through molecular docking

Background: The dawn of the year 2020 witnessed the global spread of coronavirus disease 2019 (COVID-19), a highly infectious and communicable disease first reported in 2019. Severe acute respiratory syndrome coronavirus-2 is the causative agent. In total, 3,096,626 cases and 217,896 deaths owing to COVID-19 had been reported by the World Health Organization by 30th April, 2020, indicating that infections and deaths were growing exponentially worldwide. To tackle this pandemic, it is necessary to find easily accessible therapeutic agents until an effective vaccine is developed. Methods: In this study, we present the results of molecular docking through high-throughput virtual screening to analyze drugs recommended for the treatment of COVID-19. Results: Atovaquone, fexofenadine acetate (Allegra), ethamidindole, baicalin, glycyrrhetic acid, justicidin D, euphol, and curine are a few of the lead molecules found after docking 129 known antiviral, antimalarial, and antiparasitic drugs and 992 natural products. Conclusions: These molecules could act as effective inhibitory drugs against COVID-19.


Introduction
Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), which is responsible for respiratory illness and probably for further effects that are yet to be discovered. This novel virus was first identified on 30th December, 2019, with the first case of human infection reported in Wuhan city, Hubei, China 1 . Coronaviruses are enveloped RNA viruses that are mainly zoonotic and are present amongst birds and mammals, causing respiratory, neurological, hepatic and enteric diseases 2 . The World Health Organization (WHO) declared this disease a pandemic on 11th March, 2020, and SARS-CoV-2 had claimed 217,896 lives by 30th April, 2020 3 .
The rapid development and approval of a vaccine, which is not yet available, is therefore necessary 4 . Nevertheless, Chang et al. 5 have suggested that some drugs approved by the US Food and Drug Administration (FDA) against viruses of the same type might offer promising results. Hydroxychloroquine is one such drug that is used worldwide, whereas remdesivir and ivermectin have been reported by others to work against COVID-19 in silico.
The transmission of this coronavirus occurs through the binding of the CoV spike protein to the angiotensin-converting enzyme 2 (ACE2) receptor present on the surface of human host cells. The ACE2 receptor is present in the respiratory organs, kidneys, gastrointestinal tract (at high levels in the esophagus, colon, and small intestine, but at low levels in the stomach), and testes. The virulence of this novel virus is attributed to its main protease, which is responsible for virus replication along with many other major functions 6 . Therefore, we employed the main protease structure 6M03 as the target protein to identify the best inhibitory drugs in silico.
When negatively stained SARS-CoV-2 was observed by electron microscopy, the particles were found to be spherical with some pleomorphism. In sections of human airway epithelium, viruses were found in membrane-bound vesicles in the cytoplasm along with inclusion bodies. The virions are 60 to 140 nm in diameter and resemble the solar corona owing to their distinctive 9- to 12-nm spikes. On the basis of these morphological characteristics, together with a genome showing more than 85% identity with a bat SARS-like CoV (bat-SL-CoVZC45, MG772933.1) genome as previously assessed via genome sequencing, the virus was assigned to the Coronaviridae family 6 . SARS-CoV-2 initially infects the lower airways and binds to the ACE2 receptor on cells, activating immune cells and thereby inducing the secretion of inflammatory cytokines and chemokines in the human pulmonary system 4 . Most COVID-19 patients exhibit flu-like symptoms within two weeks of exposure to the virus, although there has been a major rise in the number of asymptomatic COVID-19 patients.
In this work, we performed high-throughput virtual screening (HTVS), since it is the fastest approach for finding probable drugs against a target. HTVS of two databases was carried out with PyRx (Python prescription) software, which uses Vina and AutoDock as docking tools. AutoDock itself uses MGLTools, which provides a computer-aided drug discovery (CADD) pipeline for HTVS of large databases for probable hits as target drugs. HTVS enables docking of multiple ligands on a single protein, and PyRx is freely available. Docking results are based on identifying the pose visually and quantitatively using a scoring algorithm. Docking calculates the free binding energy (∆G) between the ligands and the protein. The free binding energy thus calculated is fundamental to the formation of complex systems in biochemistry and molecular biology. A lower free binding energy corresponds to a more favorable binding affinity between a receptor and a ligand 7 .
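As a rough illustration of why a lower free binding energy indicates tighter binding, the standard thermodynamic relation ∆G = RT ln Kd can be used to turn a free binding energy into an approximate dissociation constant. The sketch below is not part of the pipeline used in this study: the temperature of 298.15 K and the function name are assumptions, and docking scores are only estimates of ∆G, so the resulting values should be read as orders of magnitude at best. The three input energies are the reference values reported in the Results.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_kd(delta_g_kcal_per_mol, temperature_k=298.15):
    """Approximate dissociation constant (mol/L) from dG = RT ln Kd.

    The temperature is an assumption; docking scores only approximate dG.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


# Free binding energies of the reference drugs reported in this study.
reference_scores = {
    "hydroxychloroquine": -5.5,
    "remdesivir": -6.3,
    "ivermectin": -8.7,
}

for name, delta_g in reference_scores.items():
    print(f"{name}: dG = {delta_g} kcal/mol, Kd ~ {estimated_kd(delta_g):.2e} M")
```

Because the relation is exponential, a difference of about 1.4 kcal/mol at this temperature corresponds to roughly a tenfold change in the estimated Kd.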

Molecular docking
Molecular docking is a bioinformatics method that predicts the orientation of a molecule when it is bound to another molecule 8,9 . There are two main approaches to molecular docking. The first approach describes the protein and the ligand as complementary surfaces 10 . The second approach simulates the docking process, calculating the ligand-protein interaction based on the free binding energy ∆G 11 .

Molecule selection
Selection of database and the COVID-19 main protease structure
In this study, we docked the X-ray crystal structure of the COVID-19 main protease (PDB ID: 6M03, resolution: 2 Å) with 129 molecules obtained from DrugBank and 992 molecules from the ZINC Natural Product database. The list of 129 molecules is provided, along with the link to the ZINC natural product database, in the Extended data 12 . The 129 molecules chosen were antimalarials, antiparasitics, antibiotics, or antivirals, since hydroxychloroquine, remdesivir and ivermectin are antimalarial, antiviral and antiparasitic drugs, respectively. The ZINC Natural Product database was chosen because most drugs currently used against various diseases are natural derivatives and because it is freely available. A similarity search could not be undertaken since no drug was known to be fully effective against this novel disease at the time of publication.

Processing of macromolecules and ligands
Docking requires processing of the macromolecules and the ligands 13,14 . Water molecules were removed and polar hydrogen atoms were added to the crystal structure of 6M03, which was then converted to PDBQT format using AutoDock version 4. The energy of all ligands was minimized and they were converted into PDBQT files using Open Babel version 2.2.3 in PyRx version 0.8 15 .

Molecular docking process
The grid box was centered at the coordinates X: 12.2632, Y: 12.3998, and Z: 5.4737, with dimensions X: 29.9242, Y: 64.1097, and Z: 48.1126.
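For readers who want to reproduce the search space outside the PyRx interface, the grid box above can be written out as an AutoDock Vina configuration file. The following is a minimal sketch under stated assumptions: the receptor, ligand and output file names are hypothetical, the grid values are assumed to be in ångström (the unit Vina expects), and `num_modes = 9` mirrors the nine poses per output file described in this section.

```python
def vina_config(receptor, ligand, out, center, size, num_modes=9):
    """Build the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
    ]
    # Grid box center and edge lengths, one key per axis.
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    lines.append(f"num_modes = {num_modes}")
    return "\n".join(lines) + "\n"


config_text = vina_config(
    receptor="6m03_receptor.pdbqt",  # hypothetical file name
    ligand="ligand.pdbqt",           # hypothetical file name
    out="ligand_out.pdbqt",          # hypothetical file name
    center=(12.2632, 12.3998, 5.4737),  # grid box center from the text
    size=(29.9242, 64.1097, 48.1126),   # grid box dimensions from the text
)
print(config_text)
```

The printed text can be saved to a file and passed to Vina with its `--config` option.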
The docking was performed using Vina version 2.0 in PyRx. After the run, the output files were stored in the user folder whose path had been specified in the edit preferences. These output files were in PDBQT format, each containing nine poses. The AutoDock application was then launched, showing an empty dashboard with "File" in the left-hand corner of the page. The output models were loaded from the selected output folder using the "read molecule" option. The different poses were analyzed in the AutoDock tool, and the pose with the lowest binding energy in kcal/mol was selected for further analysis. The docked molecules were then converted into PDB format in PyMol and their interactions were studied using Discovery Studio version 4.1. The interactions can also be studied with PyMol, but Discovery Studio gives a better quality picture. Three known drugs (hydroxychloroquine, remdesivir and ivermectin) were first docked against the virus main protease to check their binding energy. The interactions of these three drugs with the COVID-19 main protease could later be used as a reference for identifying hits.
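The selection of the lowest-energy pose can also be scripted rather than done by inspection. The sketch below assumes the output files follow the usual Vina PDBQT layout, in which each model carries a `REMARK VINA RESULT:` line whose first number is the binding energy in kcal/mol. The sample text is a made-up excerpt for illustration only (atom records omitted, energies not taken from this study), and the function names are hypothetical.

```python
def parse_vina_scores(pdbqt_text):
    """Return the binding energy (kcal/mol) of every pose, in file order."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK VINA RESULT: <energy> <rmsd l.b.> <rmsd u.b.>
            scores.append(float(line.split()[3]))
    return scores


def best_pose(pdbqt_text):
    """Return (1-based model number, energy) of the lowest-energy pose."""
    scores = parse_vina_scores(pdbqt_text)
    index = min(range(len(scores)), key=scores.__getitem__)
    return index + 1, scores[index]


# Illustrative excerpt only; these values are not results from the study.
SAMPLE_OUTPUT = """MODEL 1
REMARK VINA RESULT:      -8.7      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -8.1      2.134      4.871
ENDMDL
MODEL 3
REMARK VINA RESULT:      -7.9      3.402      6.015
ENDMDL
"""

print(best_pose(SAMPLE_OUTPUT))
```

Vina normally writes poses in order of increasing energy, so the first model is usually the best one; taking the minimum explicitly simply avoids relying on that ordering.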

Results
Docking results for reference molecules
The free binding energies of the drugs known to act against COVID-19, namely hydroxychloroquine, remdesivir, and ivermectin, were found to be -5.5 kcal/mol, -6.3 kcal/mol, and -8.7 kcal/mol, respectively, as indicated in Table 1. Table 1 lists: a) the PubChem compound ID (CID), which is the compound identifier in the PubChem database from which the 3D mol files of the 129 molecules were downloaded; b) the common drug name; and c) the free binding energy obtained after docking.
These drugs are known to improve the condition to some extent, yet their activity against COVID-19 is still under study [16][17][18][19] .
Therefore, we used these three molecules as our reference drugs. The interactions of these drugs with the virus main protease can be seen in Figure 1. Hydroxychloroquine forms a hydrogen bond with the Tyr237 residue of the 6M03 main protease at a distance of 2.10 Å. It also interacts with Leu272 and Leu287. Remdesivir forms six hydrogen bonds with Lys137, Thr199, and Tyr239 and also interacts with the Leu272, Leu287, Tyr237, and Asn238 residues of the 6M03 main protease. Ivermectin interacts with the Leu272, Tyr239, Leu286, Leu287, Gly275, Asn277, and Met276 residues of the 6M03 main protease.
Docking results for 129 additional molecules
Keeping the free binding energy of our reference molecules in mind, we shortlisted 77 molecules from the database of 129 molecules with a cut-off of -6 kcal/mol free binding energy. Eprinomectin, artefenomel, doramectin, betulinic acid, atovaquone, and tetrandrine showed the lowest binding energies, at -9 kcal/mol, -8.7 kcal/mol, -8.4 kcal/mol, -8.4 kcal/mol, -8.2 kcal/mol and -8 kcal/mol, respectively. Table 2 presents the details of the 18 best-performing of the 77 molecules along with their CID. These molecules were considered because they have the lowest free binding energy between ligand and protein. Furthermore, their interactions with the virus main protease 6M03 were studied. Artefenomel interacts with the Pro108, Val202, Ile249, Pro293, and Phe294 residues of the 6M03 main protease. Eprinomectin forms a hydrogen bond (2.26 Å) with the Lys5 residue of the 6M03 main protease and also interacts with Leu286, Leu287, and Asn277. Tetrandrine forms a hydrogen bond with Arg131 and also interacts with Leu272, Leu286 and Leu287. Betulinic acid forms a hydrogen bond with Arg137 at a distance of 2.66 Å and also interacts with Leu272, Leu286, Leu287, Tyr237, and Tyr239. Doramectin forms three hydrogen bonds with Thr199, Lys5, and Gly138 and also interacts with Leu286 and Asp289. Atovaquone forms a hydrogen bond with the Thr199 residue of the 6M03 main protease at a distance of 2.48 Å and also interacts with Lys137, Leu272, Leu286, Leu287, and Tyr239. These interactions can be observed in Figure 2. Full results are available in the Extended data 12 .
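The shortlisting step described above amounts to a threshold filter on the docking scores followed by a sort. A minimal sketch is given below, using only the nine energies quoted in this section and the previous one; it assumes that a molecule passes when its energy is at or below the -6 kcal/mol cut-off, which is our reading of the criterion, and the function name is hypothetical.

```python
CUTOFF_KCAL_PER_MOL = -6.0  # cut-off used for the 129-molecule set

# Free binding energies (kcal/mol) quoted in the text.
scores = {
    "hydroxychloroquine": -5.5,
    "remdesivir": -6.3,
    "ivermectin": -8.7,
    "eprinomectin": -9.0,
    "artefenomel": -8.7,
    "doramectin": -8.4,
    "betulinic acid": -8.4,
    "atovaquone": -8.2,
    "tetrandrine": -8.0,
}


def shortlist(energies, cutoff):
    """Keep molecules at or below the cut-off, most favorable first."""
    hits = [(name, e) for name, e in energies.items() if e <= cutoff]
    return sorted(hits, key=lambda item: item[1])


for name, energy in shortlist(scores, CUTOFF_KCAL_PER_MOL):
    print(f"{name}: {energy} kcal/mol")
```

The same function could be applied to the natural product set by changing the cut-off to -8.9 kcal/mol.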
Docking results for natural molecules
A total of 34 molecules of natural origin were chosen from the dataset of 992 molecules with a cut-off of -8.9 kcal/mol. Table 3 presents the details of the best 20 molecules from the natural products database, along with their common name and ZINC ID, sorted in ascending order of free binding energy. The interactions of various molecules with the 6M03 main protease were studied. Allegra forms three hydrogen bonds with Thr111, Asn151, and Asp153 and also interacts with the Arg298, Val305, and Phe305 residues of the 6M03 main protease. Baicalin forms four hydrogen bonds with Thr111, Thr292, Ile152, and Arg298 and also interacts with the Asp153, Asn151, and Val303 residues of the 6M03 main protease. Curine forms three hydrogen bonds with Arg131, Thr199, and Leu287 and also interacts with the Asp289, Leu286, and Tyr237 residues of the 6M03 main protease. Ethamidindole forms two hydrogen bonds with Thr111 and Asp295; it also interacts with the Phe8, Phe294, Arg295, Arg298, and Pro252 residues of the 6M03 main protease. Glycyrrhetic acid forms a hydrogen bond with Lys137 (2.24 Å) and also interacts with the Tyr237, Tyr239, Leu272, Leu286, and Leu287 residues of the 6M03 main protease. Euphol interacts with the Phe8, Val297, Arg298, and Val303 residues of the main protease. The interactions of a few of the 20 molecules with the protein can be seen in Figure 3. Full results are available in the Extended data 12 .

Discussion
HTVS is one of the best methods for identifying molecules acting against drug targets in a very short time compared with traditional drug identification strategies. Remdesivir was identified as a COVID-19 inhibitory drug via virtual screening 6 . Therefore, keeping the present scenario in mind, we undertook this study to identify already available drugs that could be offered to individuals with COVID-19. Surprisingly, we found good hits with medicinal properties in the databases.
Artefenomel is known to treat malaria and other parasitic diseases. Betulinic acid is under trial for the treatment of dysplastic nevus syndrome. Atovaquone is an approved drug for the treatment of Pneumocystis carinii pneumonia and malaria. Tetrandrine is at the experimental stage as an anticancer, antimalarial and antiparasitic agent. Eprinomectin and doramectin are veterinary antiparasitic drugs.
Many of the natural compounds identified have medicinal properties. Taraxerone has allelopathic and antifungal effects 20 , morusin has antioxidant and anticancer properties 21 , the RA VII compound is an antitumor agent 22 , and neoruscogenin is used against chronic venous disorders 23 . Justicidin D exhibits anti-inflammatory properties 24 and licoricidin is an antimetastatic molecule 25 , whereas euphol is used against asthma and cancer as well as syphilis and rheumatism 26 . Schisandrene has antioxidant activity 27 , curine is reported as a vasodilator 27 , angoluvarin has antimicrobial activity 28 , and baicalin is used to treat cardiovascular diseases, inflammation and hypertension 29 . Glycyrrhetic acid shows anti-inflammatory, anti-ulcer, hepatoprotective, anti-allergic, antitumor, antioxidant and anti-diabetic activity 30 . Isomitraphylline has antioxidant properties 31 , and bikaverin and rutarensin have antitumor activities 32,33 . Jolkinol B has anticancer properties 34 and ethamidindole exhibits antihistamine properties 35 . Fexofenadine acetate, commonly known as Allegra, is an antihistamine pharmaceutical drug presently used in the treatment of allergy symptoms such as urticaria and hay fever 36 .
When we compared our two datasets, we found that the majority of the molecules showed a lower free binding energy than the reference molecules, as shown in Table 1, Table 2, and Table 3. Surprisingly, we obtained good hits from the natural product database, which is encouraging: judging by their interactions and biotherapeutic functions, some of these molecules might serve as COVID-19 inhibitory drugs. One such drug from the natural product database is fexofenadine acetate (Allegra), which is presently in use as an anti-allergic medicine. We also obtained hits from the 129 drugs, among which atovaquone, presently used against pneumonia and malaria, is a very good candidate for COVID-19 treatment.

Conclusions
The best therapeutic drugs inferred from our study for probable COVID-19 treatment are atovaquone, fexofenadine acetate (Allegra), justicidin D, baicalin, glycyrrhetic acid and ethamidindole, based on their docking scores, interaction studies and present applications. The rest of the molecules could also be used as COVID-19 inhibitory drugs after further evaluation. When we compared our data with the scores of the reference drugs currently in use against COVID-19, we found that atovaquone showed a better binding energy than hydroxychloroquine and remdesivir. It is one of the best drug candidates for COVID-19 treatment since it is already in use for treating Pneumocystis carinii pneumonia and malaria. Fexofenadine acetate is another good candidate drug for COVID-19 treatment since it is naturally derived and presently used for its antihistamine properties. Ethamidindole could possibly act as a COVID-19 inhibitor since it is reported to be an antihistamine and this novel virus activates cytokine secretion in the human body. Apart from these, anti-inflammatory natural molecules such as justicidin D, baicalin, and glycyrrhetic acid could work against COVID-19 since SARS-CoV-2 induces inflammation. The rest of the top 20 molecules could also be considered since they all have some medicinal properties, as explained above.

This project contains the following extended data:
• Drugs repurposing list (PDF). (PubChem CID of each ligand along with the minimized energy of each molecule and binding affinity results)

Original> The transmission of this coronavirus occurs due to the binding of the CoV spike protein to the angiotensin converting enzyme 2 (ACE2) receptor present on the cell surface of the human host.
Suggestion> The transmission of this coronavirus occurs due to the binding of the CoV spike protein to the angiotensin-converting enzyme 2 (ACE2) receptor present on the human host's cell surface.

Introduction
Paragraph 2: I suggest that the authors focus on the evidence presented by Chang et al. We know that vaccines are not yet available for COVID, but your aim is to identify potential drugs for treatment.

Paragraph 3: Move this paragraph below "The World Health Organization (WHO) declared this disease as a pandemic on 11th March, 2020 and SARS-CoV-2 as the deadliest virus till date on earth claiming 217,896 deaths till 30th April, 2020".

Paragraph 4: Please remove this paragraph because it is not discussed in the corresponding section.

Paragraph 5: I recommend that the authors rewrite this paragraph. The authors used a database-based approach to search for drug candidates. Please focus on the main objective of the work. In my opinion the authors could highlight the importance of bioinformatic tools (docking as the first approach) using available databases. I suggest rewriting this paragraph because most of its content belongs in the Methodology section.

Methods
In my opinion the authors should be clearer about the methodology. You could cite a work in the introduction section where docking was the better strategy for searching for drug candidates. In this section, you should explain your criteria for protein selection, provide a list and, finally, describe the docking process. I suggest reorganizing this section as "Molecule selection and processing (or edition)" and "Molecular docking". In my opinion it is not necessary to split this information.

Results
For better understanding, I strongly recommend using a general graphical scheme explaining the selection, the editing of the macromolecules, and the pipeline used for the docking analysis. I suggest focusing only on the docking results against the different ligands (from databases, natural, etc.). Some sentences belong in the discussion section.

Discussion
It is not clear whether you worked with information from databases or with clinical samples. Please rewrite this sentence. The underlined (2nd) paragraph is disconnected from this section; please review it. In my opinion the third paragraph of this section is not necessary; the authors could put all this information in one sentence and cite it. I also suggest discussing your results in more depth and contrasting them with the available literature; the authors repeated the results in this section. I understand that your contribution is relevant, but from another perspective: bioinformatic tools for the quick detection of potential antivirals or drugs, using COVID as an example. Several papers cited in the introduction section are not discussed.

Conclusions
I suggest integrating this section with the discussion. The authors proposed several treatments inferred from HTVS; however, you should be careful with your statements. I recommend discussing and comparing the variables or data as indicators for potential treatments.
The manuscript is written in understandable English, though some English revision is necessary. In my opinion, after several major revisions and reorganization of the text, the study will be acceptable for indexing on F1000Research.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Evolutionary Biology, Genomics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.