Illustrating and homology modeling the proteins of the Zika virus

The Zika virus (ZIKV) is a flavivirus of the family Flaviviridae, which is similar to dengue virus, yellow fever and West Nile virus. Recent outbreaks in South America, Latin America, the Caribbean and in particular Brazil have led to concern for the spread of the disease and its potential to cause Guillain-Barré syndrome and microcephaly. Although ZIKV has been known for over 60 years, very little is known about the virus: there are few publications and no crystal structures. No antivirals have been tested against it either in vitro or in vivo. ZIKV therefore epitomizes a neglected disease. Several steps have been proposed that could be taken to initiate ZIKV antiviral drug discovery using both high throughput screens and structure-based design based on homology models for the key proteins. We now describe preliminary homology models created for NS5, FtsJ, NS4B, NS4A, HELICc, DEXDc, peptidase S7, NS2B, NS2A, NS1, E stem, glycoprotein M, propeptide, capsid and glycoprotein E using SWISS-MODEL. Eleven out of 15 models pass our model quality criteria for their further use. While a ZIKV glycoprotein E homology model was initially described in the immature conformation as a trimer, we now describe the mature dimer conformer, which allowed the construction of an illustration of the complete virion. By comparing illustrations of ZIKV based on this new homology model and the dengue virus crystal structure we propose potential differences that could be exploited for antiviral and vaccine design. The prediction of sites for glycosylation on this protein may also be useful in this regard. While we await a cryo-EM structure of ZIKV and eventual crystal structures of the individual proteins, these homology models provide the community with a starting point for structure-based design of drugs and vaccines as well as for computational virtual screening.

All flaviviruses are spherical and contain a genome of approximately 11 kb that functions as mRNA and encodes a polyprotein that leads to 10 proteins 1 . Examples include dengue virus, yellow fever and West Nile virus 2 . The recent pandemic of ZIKV occurring in South America, Latin America, the Caribbean and in particular Brazil, spread by the Aedes mosquito, has awakened dormant interest in this flavivirus, which causes a mild dengue-like disease 3 . However, several documented cases of Guillain-Barré syndrome and other neurologic conditions represent important complications of the disease. In recent weeks the extent of the disease has also become apparent as new discoveries and announcements are made almost daily. Clearly, though, there are significant gaps in our knowledge that need addressing 4 .
The most concerning issue, however, is microcephaly observed in infants of women who had ZIKV during pregnancy. There have been multiple cases of ZIKV found in fetal or newborn brain tissue that had signs of prenatal damage. The virus seems to have neurotropism in fetal brains, which may account for the presumed association between the infection and microcephaly 5,6 . The fetus in the recent case study had microcephaly with calcifications and ZIKV was found in the brain 6 . The ZIKV strain was identified as from French Polynesia (GenBank accession number KJ776791) and several polymorphisms were noted in the NS1, NS4B and FtsJ-like methyltransferase regions. While the findings are not absolute proof that ZIKV causes microcephaly, the evidence from this case report strengthens the linkage 7 . Experts involved in the World Health Organization decision to declare a Public Health Emergency of International Concern (PHEIC) recommended more research into the microcephaly link and the development of an animal model. This group also, interestingly, called for open data sharing 8 . Early work 45 years ago in inoculated newborn mice showed that ZIKV had neurological effects, enlarging astroglial cells and destroying pyriform cells. At the same time virus formation within the endoplasmic reticulum was also visualized 9 . We are not aware of any studies of the effects of ZIKV on human brain or brain cells. Localization of such viruses to the brain is not unusual for flaviviruses, e.g. West Nile virus, and this tropism may arise from viral binding to glycosaminoglycans, as has been observed for dengue virus in human microvascular endothelial cells 10 . Heparan sulfate and the C-type lectin DC-SIGN (dendritic cell-specific intercellular adhesion molecule 3-grabbing nonintegrin) are well characterized attachment structures for flaviviruses on cells. Interfering with glycan binding is one potential approach to preventing virus entry. Another is to interfere with acidification of the endosome, as has been demonstrated in vitro with chloroquine for dengue virus infection 2 . Several entry and adhesion factors, including DC-SIGN, Tyro3, and AXL as well as others, have been shown to permit ZIKV entry in human skin cells 11 .

Amendments from Version 1
We have added a paragraph and references at the end entitled "Notes added while in review" describing all the developments since this work was made public in early March 2016. We have addressed the reviewer comments and made minor edits to the manuscript. We have added the information on templates in Table 2 and removed this information from the text.

The routes for transmission of ZIKV besides mosquito are of some concern. Recent US CDC guidance to pregnant women describes precautions against sexual transmission of ZIKV 12 and notes that the virus can persist for up to 12 weeks 13 . Possible ZIKV transmission through blood transfusion in French Polynesia was described by detecting the virus in 3% of asymptomatic blood donors 14 . Given how widespread ZIKV has become, there is a risk of depleting the blood supply if donation after potential virus exposure is deferred. Methods have also been developed to inactivate ZIKV in plasma using amotosalen and UVA illumination 15 . Detection of ZIKV is also problematic: a traveler to Switzerland with a false positive dengue NS1 antigen test was later found to have the virus. Cross-reactivity therefore appears to be an issue in detection 16 , and this suggests the need for better diagnostics to be developed. Understanding the three-dimensional structure of antigenic ZIKV proteins may help accelerate the development of antibodies for diagnostics and rationally designed vaccines. In addition, the comparison of the assembled surface glycoprotein of ZIKV with that of dengue virus may help understand the accessible epitopes for the development of anti-flaviviral vaccines in general. There is considerable prior work including structure-based design and virtual screening for dengue, yellow fever and other related flaviviruses to develop antivirals to target envelope glycoproteins 21-26 , guanylyltransferase 27 , capsid protein, NS3 helicase, NS2B-NS3 protease and NS5 polymerase 28,29 , as well as whole cell screens 30 , which have produced many molecules potentially useful against ZIKV in vitro. Early work has only tested a small number of FDA-approved drugs against ZIKV, including (EC50 in parentheses) interferon (34.3 IU/ml), ribavirin (143 µg/ml), 6-azauridine (1.5 µg/ml) and glycyrrhizin (384 µg/ml) 31 . A recent paper by Hamel et al. from 2015 also showed that interferon inhibited ZIKV replication in primary skin fibroblasts 11 .
However, the use of compounds against ZIKV should take into account the treatment of pregnant women, and many of the potential options are unsuitable for use in pregnancy because of toxicity and/or teratogenicity. Although human data are limited, the available data in animal models suggest caution. Azauridine is highly toxic to the fetus in model systems (for example 32). Ribavirin is not recommended for use in pregnancy due to embryotoxic and teratogenic effects 33 . Interferon is a potential abortifacient 34 . We also need to consider the treatment of the fetus as well as children (who might become infected after birth) and the relatively small subset of FDA approved drugs that are approved for pediatric use 35 . Therefore, alternative potential drugs for ZIKV are needed. The risks of medication use in pregnancy are notable, teratogenicity in particular. There is also the issue that the disease does not usually pose a direct risk to pregnant women themselves, so it is important that any drug, which will not be improving their own health, does not damage it. Pregnancy can also increase risks such as liver problems and can affect drug distribution, a particular concern with any new drug. In pregnancy, a ZIKV infection at 13 weeks gestation was coupled with persistent virus in a fetus at 32 weeks 6 . Treatment of symptomatic pregnant women may reduce risks of transmission to the fetus. A potential drug for ZIKV could also protect fetuses from damage by reducing transmission in the general population. If the drug reduced the duration of symptoms (which though mild can be annoying) and in turn reduced viral load, thereby reducing the chance of transmission, this would also benefit the patients themselves. For example, cholera patients are often given antibiotics to reduce transmission. Similarly, those with the flu are prescribed Tamiflu/Oseltamivir to reduce the duration and severity of symptoms, but its use might also reduce transmission of flu in the general population. Since ZIKV has been found in semen many weeks after symptoms resolve 12 , treatment of male partners may reduce viral load and reduce the long term risk of transmission.
To help accelerate drug discovery through computational analysis, we have now developed homology models of ZIKV proteins that may serve as potential drug and vaccine targets. To complement high-throughput screening efforts, we could perform virtual screening against the proteins in ZIKV. While there are crystal structures for proteins from dengue 36,37 , yellow fever, West Nile virus and other flaviviruses 38-44 there are (to date) none for ZIKV. Therefore we are limited to generating homology models, although the close evolutionary relationships between flaviviruses and their component proteins and genomes make this a valid approach 45 .

Protein sequence alignment
As a prelude to modeling we assessed what 3D structural data had significant identity scores to a representative ZIKV polyprotein. For this we chose UniProtKB Q32ZE1_9FLAV. While we have requested the promotion of this entry to the Swiss-Prot expert review level it has been selected as the representative sequence for the UniRef90_Q32ZE1 entry that currently clusters 108 ZIKV individual sequence entries at 90% (or above) amino acid identity. We then performed a BLAST search of this against Protein Data Bank (PDB) sequence entries. Protein BLAST analysis was also performed for each ZIKV protein sequence 46 to identify the closest proteins 47 and understand potential evolution.
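As a rough illustration of the identity scores referred to above, the following Python sketch computes percent identity from two already-aligned sequences. The function name, the toy sequences and the choice of denominator (aligned columns without gaps) are assumptions made for illustration; BLAST and UniRef report identity using their own conventions, which can differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (hypothetical, not real ZIKV or template sequences).
example = percent_identity("MKNP-KKKSG", "MKNPAKRKSG")
```

A value computed this way can be compared against a clustering threshold such as the 90% used for UniRef90, keeping in mind that the denominator convention changes the number.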

Homology modeling
The amino acid sequences of ZIKV strain (GenBank accession number KJ776791 48 ) were retrieved from the GenBank database 49 and used as targets for homology modeling using the SWISS-MODEL server 50,51 . The server performed the target-template sequence alignment after searching the PDB for putative X-ray template proteins, and generated the 3D models for all target sequences. The best homology models were selected according to Global Model Quality Estimation (GMQE) and QMEAN statistical parameters. GMQE is a quality estimation which combines properties from the target-template alignment. The quality estimate ranges between 0 and 1, with higher values for better models. The QMEAN4 scoring function consists of a linear combination of four structural descriptors, as described elsewhere in more detail 52,53 . The pseudo energies returned from the four descriptors are related to what we would expect from high resolution X-ray structures of similar size using a Z-score scheme. Further, built models were exported to the SAVES server Version 4 54 and their overall stereochemical quality, including backbone torsional angles through the Ramachandran plot, was checked according to PROCHECK 55 . Lastly, each model was submitted to an energy minimization protocol, using the Smart Minimizer algorithm in Discovery Studio version 4.1 (Biovia, San Diego, CA).
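The Z-score scheme mentioned above can be sketched in a few lines of Python. This is only an illustration of the general idea (how many standard deviations a model's score lies from a reference set); the function name and the reference values are hypothetical, and the actual reference structures and normalization used by QMEAN are defined by the server, not by this snippet.

```python
from statistics import mean, stdev


def z_score(value: float, reference_values: list[float]) -> float:
    """Standard deviations separating `value` from the mean of a reference set."""
    mu = mean(reference_values)
    sigma = stdev(reference_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference values have zero spread")
    return (value - mu) / sigma


# Hypothetical reference scores, standing in for high resolution X-ray structures.
reference = [1.0, 2.0, 3.0]
z_at_mean = z_score(2.0, reference)
z_above = z_score(4.0, reference)
```

A score near zero indicates a model that looks like the reference structures on that descriptor; strongly negative values flag models that deviate from them.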

Site of glycosylation prediction
Mammalian N-glycosylation sites were predicted for glycoprotein E by submitting the sequence to web-based tools namely N-GlycoSite 56 , GlycoEP 57,58 and NetNGlyc Version 1.0 59 .
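A first-pass check for candidate N-glycosylation sites can be made by scanning for the N-X-S/T/C motif, with X conventionally taken as any residue except proline. The Python sketch below does only this motif scan; the web tools named above apply additional models on top of the motif, so this is not a reimplementation of any of them. The function name and the toy sequence are hypothetical.

```python
import re

# N-X-S/T/C with X != proline. The lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T/C motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy sequence (hypothetical, not glycoprotein E): N3 and N13 match, N8 is
# followed by proline and is skipped.
toy_sites = find_sequons("MANDTGHNPSAANCTQ")
```

Run against a full glycoprotein E sequence, such a scan would list every motif occurrence; deciding which of them are likely to be glycosylated is the job of the dedicated predictors.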

Illustration for Zika virion and animation
The Zika virion illustrations were created by combining the homology model of the envelope ZIKV glycoprotein E with the symmetry data from the dengue virus envelope. PDB ID: 1K4R 60 contains the coordinates for three copies of the protein subunit of the dengue virus envelope, along with the symmetry data necessary to create the 180-subunit icosahedral structure of the complete viral envelope. PyMOL (The PyMOL Molecular Graphics System, Version 1.7.6.0, Schrödinger, LLC) was used to export the surface models of the three proteins in .obj format. They were then imported into Lightwave 3D (NewTek, San Antonio, TX), where the symmetry data was used to instance copies of the model into the icosahedral envelope. The entire structure was copied several times, lighting was applied as a surfacing effect to create a visually pleasing composition, and the image was rendered out.
The next step was to import into PyMOL the homology model of the ZIKV envelope protein, which was built using PDB ID: 3P54 (from Japanese encephalitis virus) as a template.
A surface model of this protein was exported from PyMOL as an .obj file and imported into Lightwave in place of the dengue model, using the same symmetry operators to create the envelope array. Everything else about the picture was left the same (color, composition, lighting, etc.) to allow the structural differences to be more apparent, and that image was rendered out as well.
The last step was to overlay the detailed area of the two images and create an animated gif to flip back and forth between the two images of ZIKV and dengue, again to allow the differences to be more clearly seen. The structure of the Zika virion could be explored in a similar manner, using known data from other flaviviruses as a guide.
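The envelope assembly described in this section amounts to applying each symmetry operator (a rotation plus a translation) to each subunit. The Python sketch below shows that operation on toy coordinates; it is not the Lightwave workflow used here, and all function names, the 5-fold toy operators and the single-atom "subunits" are hypothetical stand-ins. For the real envelope, 60 icosahedral operators applied to 3 subunits give the 180 copies.

```python
import math


def apply_operator(rotation, translation, point):
    """Return rotation * point + translation for a 3x3 matrix and 3-vectors."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def build_assembly(operators, subunits):
    """Copy every subunit once per operator; returns a list of coordinate lists."""
    return [
        [apply_operator(rot, trans, atom) for atom in subunit]
        for rot, trans in operators
        for subunit in subunits
    ]


def rotation_about_z(degrees):
    """3x3 rotation matrix for a rotation about the z axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy stand-in: one 5-fold axis and three single-atom "subunits".
operators = [(rotation_about_z(72 * k), (0.0, 0.0, 0.0)) for k in range(5)]
subunits = [[(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
copies = build_assembly(operators, subunits)  # 5 operators x 3 subunits
```

Swapping the dengue subunit for the ZIKV homology model while keeping the operators fixed corresponds to replacing `subunits` and leaving `operators` unchanged.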

ZIKV glycoprotein E homology model conformation comparison
The immature 4 and mature (this study) homology models for glycoprotein E were compared using the 'align and superimpose' proteins protocol in Discovery Studio Version 4.1 (Biovia, San Diego, CA).

... criteria to discriminate good from bad models. Acceptable alignment values and higher GMQE and QMEAN4 scores were obtained during modeling, suggesting statistically acceptable homology models were generated for 11 proteins: NS5, FtsJ, HELICc, DEXDc, peptidase S7, NS1, E stem, glycoprotein M, propeptide, capsid, and glycoprotein E (Table 2, Figure 1 and Figure 2). The Ramachandran plots for these 11 proteins provide further evidence of their acceptability (Figure 2). On the other hand, because of low GMQE scores and the low coverage of the X-ray template proteins available in the PDB, homology models for NS4B, NS4A, NS2B, and NS2A proteins appeared to have limitations regarding active sites and epitopes and they could not be validated.

Sequence alignment across flaviviruses
After building the homology models, we performed an additional validation of stereochemical quality using PROCHECK, plotting the backbone dihedral angles phi against psi for the amino acid residues in each modeled structure and identifying the sterically allowed regions for these angles. The results shown in Table 2 and Figure 2 reveal that 58.4-70.3% of residues of the modeled proteins are within the most favored regions (red), 27.1-43.3% are within the additional allowed regions (yellow), 1.5-6.3% are within the generously allowed regions (beige), and only 0.0-6.1% are within the disallowed regions (white). These results showed that the overall stereochemical properties of the generated models were highly reliable and the models could be useful to future molecular modeling studies.
In the Ramachandran plots, red represents most favored regions; yellow represents additional allowed regions; beige represents generously allowed regions; and white areas are disallowed regions.
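The phi and psi angles plotted in a Ramachandran analysis are dihedral angles defined by four consecutive backbone atoms (phi: C of the previous residue, N, CA, C; psi: N, CA, C, N of the next residue). The Python sketch below computes a signed dihedral from four points using a standard vector construction. It is a minimal illustration, not PROCHECK's code, and the function names and toy coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy geometry: the fourth point is rotated a quarter turn about the z axis.
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Computing this angle for every residue and binning the (phi, psi) pairs into region classes gives percentages of the kind reported in Table 2.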

Site of glycosylation prediction
Several web-based tools were used for N-glycosylation site prediction, as combining them provides a more thorough approach. N-GlycoSite 56 suggested N154 as a single N-glycosylation site matching the N-X-S/T/C consensus sequence. The same site was identified by GlycoEP using BPP settings (binary profile of patterns) 57,58 , giving a score of 0.65/1.00. NetNGlyc 59 also gave the same predicted site, with a jury agreement of 6/9.

Zika virion compared to dengue virion
A qualitative comparison of the Zika virion (which was constructed based on the dengue virion) with the dengue cryo-EM virion (Figure 3) indicates that Zika appears to have slightly more raised 'pimples' on the surface. The glycoprotein E dimer in ZIKV also has a narrow 'letter-box' groove, while the dengue virion has a bigger 'pore' at the intersection of 5 dimers (5-fold axis). These differences are considerably more apparent in the animation (Supplementary material S4). It is important to note that the differences may also be artefacts of the homology modeling approach and template used for modeling ZIKV glycoprotein E.

ZIKV glycoprotein E homology model conformation comparison
The homology models developed using two different templates, namely the immature protein based on the dengue crystal structure PDB ID: 4GSX 50,71-73 and the mature protein based on PDB ID: 3P54 from Japanese encephalitis virus, showed a large difference (RMSD 13.47 Å) (Figure 4). The two models also differ around the pocket centered on residues 270-277.
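For readers unfamiliar with the RMSD value quoted above, the following Python sketch shows how a root-mean-square deviation is computed from two sets of paired coordinates. It assumes the structures are already superimposed and that equivalent atoms appear in the same order; the Discovery Studio protocol performs its own alignment and atom pairing, which this snippet does not reproduce. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates (same units as input)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by 2 units along z.
shifted = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)])
```

Because every atom contributes its squared displacement, a few domains in very different positions (as between an immature trimer and a mature dimer conformation) can dominate the overall value.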

Discussion
The genus Flavivirus consists of 70 viruses many of which can cause severe human disease. There have been few sequence analyses of ZIKV previously in comparison to other flaviviruses. The genus Flavivirus produces a monophyletic tree with ZIKV being closest to Spondweni virus 74 while mosquito borne, tick borne and no-vector viruses cluster separately 45 . A BLAST analysis of all the ZIKV proteins in this study suggests for 12 of 15, their closest protein is in Spondweni virus (Table 1). More often strain sequences are compared within ZIKV and these showed variations in the NS5 gene 75 and glycoprotein E 17 . This is important as it would suggest perhaps targeting other proteins would have less issue with resistance or variability due to the strain of ZIKV.
If we are to address ZIKV in the short term while we await a vaccine, we need to rapidly identify an antiviral, preferably one that can be used against other related flaviviruses. Ideally we would need to treat pregnant women and provide them with prophylaxis that is safe for them and their fetus. Such an antiviral could also be used to reduce transmission in the population in general (by reducing viral load and symptoms and/or duration). As was noted a decade ago and is still true today, no antiviral drug is approved for any flavivirus 76 . It has been suggested that one of the ways to target these viruses is to interfere with the NS2B/NS3 protease complex 76 . Understanding of flavivirus proteins and other RNA viruses has benefited from the EU funded project VIZIER 77 ; in particular, several West Nile virus, dengue virus and other flavivirus structures of NS3 or NS5 were solved during this project and allosteric inhibitor sites were identified on NS5 78 . Multiple pharmaceutical companies have worked on this target for HCV, leading to clinical candidates like IDX320 79 , danoprevir (ITMN-191/R7227) 80 , GS-9256 81 and others 82,83 . The only HCV protease targeting FDA approved drug is simeprevir (TMC435) 84,85 and its use is avoided in pregnancy. Other HCV protease compounds are in clinical trials or submitted for FDA approval including Ledipasvir (formerly GS-5885) 86 . Testing these molecules against ZIKV in vitro would be useful.
We recently described 6 steps which could be taken to kick start research on ZIKV 4 , one of which was to develop homology models for ZIKV proteins that are similar to those targeted by molecules that are also active against the dengue virus. Such an approach would then enable docking of compound libraries of known antivirals, FDA approved drugs or other compounds 4 . Generating homology models with a single tool may not be enough. In particular, for those proteins with low sequence identity the use of servers and methods that use threading may be worthwhile (e.g. I-TASSER 87-89 ). However, these methods are generally only accessible to academics while others are required to license the technologies. This is ironic as these technologies were developed in most cases with NIH and NSF funds. An alternative commercial homology modeling approach (MODELLER) was also used to generate an NS5 homology model; the top hit was again a Japanese encephalitis virus RdRp domain (PDB ID: 4HDH), compared with PDB ID: 4K6M from SWISS-MODEL 61 . 4HDH also includes the ATP and zinc metal where the catalytic centers are. The dengue virus 3 polymerase (PDB ID: 4HHJ) 90 has very high sequence homology and comes up as a potential target in MODELLER, which illustrates that all these viral RNA dependent polymerases are very similar.
While it is likely that the eventual availability of crystal structures of ZIKV proteins would improve the results of docking, the homology models described here (Figure 1, Figure 2, Supplementary material S2) represent a starting point that can be used to help prioritize compounds for testing as described previously 4 . Template sequence identity above 25-40% suggests that the proteins are related, while below this lies a twilight zone. Homology modeling is thought to fill in the gaps between proteins with X-ray structures and those with none 91 . Experimental testing of homology models and crystal structures indicates that a similar enrichment rate can be achieved when identifying active compounds in a set of decoys 92 . Others have also described homology models that may be an excellent alternative when crystal structures are unavailable for human GPCRs 93,94 , and that have led to the first identification of inhibitors of the Mycobacterium tuberculosis Topoisomerase I after virtual screening 95,96 , prior to the crystal structure becoming available 97 . Certainly there are still considerable challenges in using homology models, such as prediction of the correct binding pose 98 , but there are plenty of success stories 98-100 . While databases of homology models exist, like MODBASE 101 and SWISS-MODEL 50,51,71-73 , neither of these has any ZIKV protein homology models at the time of writing. There are many structural genomics initiatives and yet it would seem there are few if any continuing the work of VIZIER on flaviviruses or emerging viruses.
Availability of structures is important: the structure of the ZIKV glycoprotein could be useful for the design of antibodies selective for the virus, which will be critical for the development of diagnostics and for understanding antibody binding (including the use of IV immunoglobulin in pregnancy), and the organization of the epitopes on viral proteins may facilitate early work in vaccine development. There are further implications for understanding the antibody binding epitopes, which are sometimes shared between different flaviviruses. Broadly protective vaccines for flaviviruses may allow the simultaneous targeting of ZIKV and related viruses such as dengue 102 . Understanding glycosylation is therefore important. (Supplementary material S4). Even between the dengue serotypes 1, 2 and 4, for which there are cryo-EM structures 105,106 , it is apparent that while the rafts are very similar (as are the sequence identities [60%]) there is a different charge distribution on the surface of each. Dengue serotype 2 had larger continuous patches of positive charges, which was proposed to enable improved binding to heparan sulfate. This might also be the case for ZIKV, in that the charge pattern is again different and could be key for vaccine development. The availability of virion structures makes it feasible to understand the structure-function relationships of the complete virus, such as assessment of membrane curvature and how the organization of membrane proteins affects this 43 .
A model of the Zika virion was constructed as an illustration using the homology model of the glycoprotein E dimer (Figure 3). While the combined protein sequence of glycoprotein E and the immunoglobulin like domain is closest to dengue virus 1 (57 percent identity, Table 1), the closest template was the crystal structure of the Japanese encephalitis virus envelope protein (53.12 percent identity, Table 2). This would suggest the virion should more closely resemble that of dengue virus 1, while producing a homology model based on a more distant virus might not be ideal. The homology model of glycoprotein E developed for the mature conformation in this study is significantly different from that developed previously for the immature conformation (Figure 4). The proposed binding site centered around residues 270-277 appears shallower in the mature conformation and this would certainly affect the kinds of molecules that it could interact with. It might also suggest that interfering with the immature conformation is preferable to targeting the mature conformation. Ultimately perhaps this model of the Zika virion could help us understand how drugs could access the virus. Viruses affecting pregnancy, such as Varicella, which causes microcephaly and other developmental problems 106,107 , are often treated with IV immunoglobulin, i.e. antibodies, as well as antivirals to reduce the effect of the virus (or to avoid infection if given soon after exposure). The models could also help us design combination approaches, possibly targeting multiple proteins, that might prevent drug resistance from occurring.
Does having the homology models and the virion illustration help understand function? The surface charge pattern might be inferred from the homology model and could be compared with dengue and other flaviviruses for which there are cryo-EM structures. This may in turn present opportunities for vaccine design by indicating accessible surfaces and properties, allowing mapping of epitopes and the design of accessible fragments and peptides for vaccine/diagnostic design. Vaccines themselves might be the only way to avoid the inevitable; otherwise, simply reducing the spread of ZIKV would just delay it. Women ultimately may just want to 'get it over with' and have ZIKV before they get pregnant and hope there is lasting immunity.
In summary, in the absence of crystal structures for any of the proteins comprising the ZIKV, we are left to attempt to construct homology models which we have done using the freely available SWISS-MODEL server. Further preparation of these models required freely available and commercial tools. In the case of the ZIKV glycoprotein E homology model, this has the added benefit of enabling the construction of a full virion. By comparing the Zika virion to the existing structures for other flaviviruses we can see similarities and differences on the surface (Supplementary material S4, Supplementary material S5). This relatively crude approach could help to understand how we might develop antivirals and vaccines against it. In addition we now provide homology models as a starting point for (small and large scale) docking studies and further evaluation which may complement other modeling efforts for ZIKV 110 . Ultimately the results of their use can be compared with using ZIKV crystal structures once generated.

Notes added while in review
Since the initial publication of this article several cryo-EM and crystal structures have been published for ZIKV, including candidate targets for inhibitor medicinal chemistry optimization and vaccine design 112-115 . Detailed comparisons between the homology models described herein and the experimental structures will be the subject of future work. It should be noted that the prediction for the site of glycosylation of glycoprotein E was experimentally verified by Sirohi et al. 112 . We would also point out the utility of this work: through the World Community Grid we have started the OpenZika project 116-118 , which is using these homology models, template structures and crystal structures to dock millions of molecules (with AutoDock Vina 119-121 ) to identify compounds for testing against Zika with collaborators. Both the docking and activity screening results (http://openzika.ufg.br/experiments/) will be open to the community and published in due course. An obvious utility of this entire exercise is that, by testing, iterating and tweaking the methodologies each time (e.g. between Ebola and ZIKV), the open science community becomes better prepared for the next global pathogen emergency.

Author contributions
All authors contributed to the collaborative writing of this project. SE conceived and designed the experiments. SE, JL, BJN, WGL, CS and CHA carried out the research.

Competing interests
S.E. works for Collaborations in Chemistry, Collaborations Pharmaceuticals, Inc. and Collaborative Drug Discovery, Inc.

Grant information
CS was supported by Wellcome Trust Grant (to the IUPHAR/BPS Guide to PHARMACOLOGY) Number 099156/Z/12/Z. I confirm that the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments
Dr's Priscilla Yang, Andrew Marsh, Derek Gatherer, Lucio Freitas-Junior, Daniel Mietchen, Joel S Freundlich, Jair Siqueira-Neto, Antony J. Williams, Alex Perryman and Mr. Tom Stratton are thanked for their helpful discussions and tweets. Biovia is kindly acknowledged for providing Discovery Studio to SE.
Supplementary material S3. PDB files for ZIKV homology models.

F1000Research
Zika Virus proteins by homology modeling.
Infections by Zika virus are seriously preoccupying the population and governments of endemic countries, with particular attention to pregnant women. Indeed, the most serious complications of infections by Zika virus have been observed in pregnant women (i.e. microencephalic fetus). However, the evidence that pregnant women are most affected by Zika virus infection, set some difficulties in developing focused therapeutic strategies, as nicely discussed by the authors. However, in this respect, which is the scenario proposed by authors for a candidate inhibitor of Zika virus replication. Why this molecule should be useful, and how pregnant women could deal with the administration of the drug?
Overall, the work is well written and organized in a clear and rational way, even if no experimental validation of the work carried out is proposed. Accordingly, conclusions of this work are merely in silico speculative. My major concern is on the usability of the protein structures for drug design or virtual screening, as claimed by the authors in the abstract and discussion. Indeed, sequence identity between template and target sequences is rather low in some cases (< 60 %). Moreover, the atomistic detail of the active site and its conformation may vary noticeably in the models depending on the program used, the force field, the level of refinement (algorithm, steps, solvent model, …) and the quality of the template structure. All these variables should be taken into consideration by the authors, and the refinement of 3D models should be at least attempted or discussed in deeper details. In my personal opinion, details on the conservation rate, sequence identity and structural similarity of the binding sites would be more helpful for drug design purposes. Finally, an experimental validation of at least one of the structures modeled by authors (i.e. the most promising target for drug designing studies) should be provided. In fact, analysis with PROCHECK or the use of numerical scores is not sufficient to validate a 3D structure generated by homology modeling. There is a high risk that molecular docking towards a low-resolution structure could provide unrealistic results. The paper can be accepted after revision.
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
No competing interests were disclosed.
Infections by Zika virus are seriously preoccupying the population and governments of endemic countries, with particular attention to pregnant women. Indeed, the most serious complications of infections by Zika virus have been observed in pregnant women (i.e. microcephalic fetus). However, the evidence that pregnant women are most affected by Zika virus infection poses some difficulties in developing focused therapeutic strategies, as nicely discussed by the authors. In this respect, what is the scenario proposed by the authors for a candidate inhibitor of Zika virus replication? Why should this molecule be useful, and how could pregnant women deal with the administration of the drug?
Response: If we understand the question correctly, any drug for Zika should be safe for all, including pregnant women. If they have a Zika infection, it would be important to eradicate the virus as soon as possible so that the fetus is perhaps not exposed to the virus.
Overall, the work is well written and organized in a clear and rational way, even if no experimental validation of the work carried out is proposed.
Response: This work was initiated in Feb 2016, months before the first crystal structures and experimental in vitro systems were published. Our goal was to build protein models that we, and others in parallel, could eventually use to start docking-based virtual screening, identify virtual screening (VS) and experimental hits, and initiate the drug discovery cascade.
Accordingly, conclusions of this work are merely in silico speculative.
Response: A modeling study is, by definition, speculative, and we do not claim otherwise. However, in good faith, we provided models which did not previously exist. The fact that they have since gone on to form the basis of the OpenZika Project (manuscript submitted, http://openzika.ufg.br) and have been used to suggest compounds which are now being tested argues for their utility. Notwithstanding, we acknowledge that experimental structures of the Zika protease, helicase and glycoprotein E have now been elucidated and deposited in the PDB.
My major concern is on the usability of the protein structures for drug design or virtual screening, as claimed by the authors in the abstract and discussion. Indeed, sequence identity between template and target sequences is rather low in some cases (< 60 %).
Response: As expected for the enzyme functions, the local active-site identity and similarity scores are higher than indicated by the full-length identity percentage. Thus, a reasonable pocket model for docking is the primary goal, not structural accuracy for the whole protein. We would also add that any team using our models would doubtless take a pragmatic approach that would include running various types of internal controls for virtual screening studies and confirmatory screening experiments (e.g. running the Dengue virus templates alone and checking for enrichment of established Dengue actives against the purified proteins). They would also probably prioritize their effort on the models with the highest template similarity. Note also that the first reviewer, Dr. Fiser, accepts that our similarity values are high enough for reliable modelling.
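To illustrate the distinction between full-length identity and local active-site identity, the following minimal sketch computes percent identity over a whole pairwise alignment and over a chosen subset of alignment columns. The two aligned sequences and the pocket column indices are made-up toy values, not ZIKV or dengue data, and the function name `percent_identity` is hypothetical; this is not the procedure used to obtain the values reported in the paper.

```python
def percent_identity(aln_a, aln_b, columns=None):
    """Percent identity between two aligned sequences of equal length.

    columns: optional iterable of 0-based alignment columns to restrict
    the comparison to (e.g. binding-site positions). Columns where either
    sequence has a gap ('-') are skipped and do not count in the total.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    cols = range(len(aln_a)) if columns is None else columns
    compared = 0
    matches = 0
    for i in cols:
        a, b = aln_a[i], aln_b[i]
        if a == "-" or b == "-":
            continue  # gapped column: not comparable
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, for illustration only).
target = "MKTAYGHDSLLQ-RE"
template = "MRSAYGHDSIVQARD"
pocket_columns = [4, 5, 6, 7, 8]  # hypothetical binding-site columns

full_id = percent_identity(target, template)
site_id = percent_identity(target, template, pocket_columns)
print(f"full-length identity: {full_id:.1f}%")
print(f"binding-site identity: {site_id:.1f}%")
```

In this toy case the pocket columns are fully conserved while the full-length identity is lower, which is the situation the response describes. In practice the column list would have to come from a structural definition of the pocket in the template.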
Moreover, the atomistic detail of the active site and its conformation may vary noticeably in the models depending on the program used, the force field, the level of refinement (algorithm, steps, solvent model, …) and the quality of the template structure.
Response: We agree that this is one of the challenges with this approach, but there were no other options, as not a single crystal structure of a Zika protein was available when we started this work and submitted it.
All these variables should be taken into consideration by the authors, and the refinement of 3D models should be at least attempted or discussed in greater detail. In my personal opinion, details on the conservation rate, sequence identity and structural similarity of the binding sites would be more helpful for drug design purposes.
Response: We agree with your point. We are now refining the 3D models using molecular dynamics simulations. However, this will be the subject of ongoing work, taking an in-depth approach to each structure and comparing it to crystal structures when available. Our main goal in this paper, which we submitted in Feb 2016 (and made available in early March), was to provide the scientific community with the first 3D models of Zika proteins.
Finally, an experimental validation of at least one of the structures modeled by the authors (i.e. the most promising target for drug design studies) should be provided. In fact, analysis with PROCHECK or the use of numerical scores is not sufficient to validate a 3D structure generated by homology modeling.
Response: We agree that the experimental validation of the protein structures is of utmost importance in drug discovery programs, but this was not within the scope of this paper. Again, our main goal was to generate homology models for the key proteins, for which no crystal structure was available at the time of submission, so that these models could be used to initiate ZIKV antiviral drug discovery using both high throughput screens and structure-based design. Moreover, these structures were the core used to initiate the World Community Grid project called OpenZika (https://www.worldcommunitygrid.org/research/zika/overview.do), which is a global research project to accelerate the discovery of an antiviral against the Zika virus.
There is a high risk that molecular docking towards a low-resolution structure could provide unrealistic results.
Response: Yes, we absolutely agree, but when we started this work there were no crystal structures, and we had to start somewhere as a means of filtering compounds rather than relying on random HTS. Our work has also correctly predicted the glycosylation site for glycoprotein E, one of the first structures crystallized. Moreover, we have compared our models with the crystallographic structures as they became available. The RMSD values ranged from 0.72 to 1.8 Å for the structures of NS1, NS3, glycoprotein E and NS2/NS3 proteins.
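For readers unfamiliar with the RMSD comparison mentioned above, the following minimal sketch shows how a root-mean-square deviation between a model and a crystal structure can be computed from matched atom coordinates. It assumes the two coordinate sets are already superposed (for example by a least-squares fit such as the Kabsch algorithm, which is not implemented here) and listed in the same atom order. The coordinates are toy values and the function name `rmsd` is hypothetical; this is not necessarily the software or atom selection used for the values quoted in the response.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input, e.g. Å) between two matched,
    already-superposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy C-alpha coordinates (hypothetical, for illustration only).
model_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(f"RMSD: {rmsd(model_ca, crystal_ca):.2f} Å")
```

Because RMSD depends on which atoms are included (C-alpha only, backbone, or all heavy atoms) and on the superposition, any reported value is only comparable to others computed with the same choices.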
We have also optimized the 3D models, and the MolProbity scores ranged from 1.28 to 2.81 Å. The MolProbity score combines the clashscore, rotamer, and Ramachandran evaluations into a single score, normalized to be on the same scale as X-ray resolution. Therefore, we are optimistic that our models are reliable enough to start a docking-based virtual screening program in the search for a new antiviral drug, which we have already initiated.
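As a rough illustration of how such a combined score behaves, the sketch below implements the log-weighted combination of clashscore, rotamer outliers and Ramachandran statistics as we understand it from the published MolProbity description. The coefficients should be checked against MolProbity itself before any use, the input values are made up, and the function name `molprobity_score` is hypothetical; the scores quoted above were produced by the MolProbity software, not by this snippet.

```python
import math


def molprobity_score(clashscore, rotamer_outliers_pct, rama_favored_pct):
    """Combined MolProbity-style score (lower is better).

    clashscore: serious clashes per 1000 atoms
    rotamer_outliers_pct: percentage of side-chain rotamer outliers
    rama_favored_pct: percentage of residues in Ramachandran-favored regions

    Coefficients are taken from the published definition as recalled here
    (an assumption to verify against MolProbity).
    """
    rama_not_favored_pct = 100.0 - rama_favored_pct
    return (
        0.426 * math.log(1.0 + clashscore)
        + 0.33 * math.log(1.0 + max(0.0, rotamer_outliers_pct - 1.0))
        + 0.25 * math.log(1.0 + max(0.0, rama_not_favored_pct - 2.0))
        + 0.5
    )


# Hypothetical validation statistics for two models (illustration only).
good_model = molprobity_score(clashscore=2.0, rotamer_outliers_pct=0.5,
                              rama_favored_pct=98.5)
poor_model = molprobity_score(clashscore=25.0, rotamer_outliers_pct=6.0,
                              rama_favored_pct=90.0)
print(f"good model: {good_model:.2f}, poor model: {poor_model:.2f}")
```

The logarithms damp the effect of very large individual terms, so no single statistic dominates the combined value.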
The paper can be accepted after revision.

Response: Thank you!
No competing interests were disclosed.
The paper describes homology models of 15 proteins encoded in the genome of Zika virus that were built with the SWISS-MODEL web server. A glycosylation site was also identified. Using the models of Zika glycoprotein E, a complete structural model of the virion was constructed.
The practical results of the work are the 11 high quality homology models that could be used in the future for structure based drug development. I believe an interested researcher will initiate time consuming follow up studies with these models only if there is a substantial added value.
These models passed quality assessment criteria according to the authors. However, a number of successful models (passing the same quality requirements similarly well) can be generated that will differ in small but essential details, e.g. side chain placement or loop conformations. These differences can have a dramatic effect on the outcome of subsequent drug docking trials. Such alternative models can be obtained when different software is used, as the authors allude to, e.g. Modeller vs SWISS-MODEL, as these use different force fields and restraints to generate models. Therefore, the authors should consider providing a more insightful result by running several different modeling programs or using alternative templates (see 4., below), comparing the results, identifying similarly modeled parts of these models and providing a set of possible solutions for subsequent studies.
A battery of model quality checks and additional energy minimization were performed. However, the overall high sequence identities between the target proteins and their respective templates (55% and up) ensure that these models are highly reliable.
Therefore, the extensive reporting on Ramachandran plots (Figure 3) does not add much to the results and can be transferred to supplementary material.
Similarly, Table 2 can be shrunk by eliminating many details of PROCHECK results.
However the information on templates should be reported in the table, instead listing them 6.
Response: We totally agree with you, Dr. Fiser. Energy minimization does not improve model quality. The word "refined" in the paper was a mistake and we have now corrected it. We are using servers such as the KoBaMIN server to perform structure refinement and to improve model quality, but this will be the subject of another manuscript. Our main goal in this first paper was to provide the scientific community with homology models for the key proteins, for which no crystal structure was available at the time of submission, so that these models could be used to initiate ZIKV antiviral drug discovery using both high throughput screens and structure-based design. Moreover, these structures were the core used to initiate the World Community Grid project called OpenZika (https://www.worldcommunitygrid.org/research/zika/overview.do), which is a global research project to accelerate the discovery of an antiviral against the Zika virus.
Some statements require attention, e.g. in the Abstract: "Eleven out of 15 models pass our criteria for selection". What selection? This must be referring to quality or accuracy of models, and should be rephrased accordingly.
Response: OK, thank you! Changed to: "Eleven out of 15 models pass our model quality criteria for their further use."
Competing Interests: No competing interests were disclosed.