Characterization of sulfated polysaccharide activity against virulent Plasmodium falciparum PHISTb/RLP1 protein [version 2; peer review: 2 approved]

Background: The emergence of artemisinin resistance in South East Asia calls for urgent discovery of new drug compounds that have antiplasmodial activity. Unlike the classical compound screening drug discovery methods, the rational approach involving targeted drug discovery is less cumbersome and therefore key for innovation of new antiplasmodial compounds. Plasmodium falciparum (Pf) utilizes the process of host erythrocyte remodeling using Plasmodium-helical interspersed sub-telomeric domain (PHIST) containing proteins, which are amenable drug targets. The aim of this study is to identify inhibitors of PHIST from sulfated polysaccharides as new antimalarials. Methods: 251 samples from an ongoing study of epidemiology of malaria and drug resistance sensitivity patterns in Kenya were sequenced for PHISTb/RLP1 gene using Sanger sequencing. The sequenced reads were mapped to the reference Pf3D7 protein sequence of PHISTb/RLP1 using CLC Main Workbench. Homology modeling of both reference and mutant protein structures was achieved using the LOMETs tool. The models were refined using ModRefiner for energy minimization. Ramachandran plot was generated by ProCheck to assess the conformation of amino acids in the protein model. Protein binding sites predictions were assessed using FT SITE software. We searched for prospective antimalarials from PubChem. Docking experiments were achieved using AutoDock Vina and analysis results visualized in PyMOL. Results: Sanger sequencing generated 86 complete sequences. Upon mapping of the sequences to the reference, 12 non-synonymous single nucleotide polymorphisms were considered for mutant protein structure analysis. Eleven drug compounds with antiplasmodial activity were identified. Both modeled PHISTb/RLP1 reference and mutant structures had a Ramachandran score of >90% of the amino acids in the favored region. 
Ten of the drug compounds interacted with amino acid residues in PHISTb and RESA domains, showing potential activity against these proteins. Conclusion: This research identifies inhibitors of exported proteins that can be used in in vitro tests against the Plasmodium parasite.

The manuscript describes the modeling of a protein and its mutant from Plasmodium falciparum, and the assessment of some ligands with these models by molecular docking analysis. The mutant was built based on experimental data from several samples. Both protein models were validated using online tools. Although the authors used several online tools for assessing the structural performance of the proteins, the mutant model is lacking confidence in its tridimensional structure. Furthermore, the docking protocol is not fully described. The results are too preliminary to suggest a potential activity of the selected ligands against PHISTb/RLP1 proteins. I would recommend major revision prior to indexing. It is suggested that the mutant protein configuration be subjected to dynamics to let the structure minimize correctly. Otherwise, the predicted protein may have no biological relevance.


Introduction
African countries have 94% of malaria cases and the highest malaria-related death rates according to the 2019 World Health Organization (WHO) Malaria Report (WHO, 2019). Despite its prioritization in the Millennium Development Goals and other large scale global health initiatives, efforts and strategies to reduce the burden of malaria by 40% have stalled over the years due to different challenges (WHO, 2018). These challenges include the emergence of Plasmodium falciparum resistance to first-line treatment, the lack of an efficacious vaccine, vector resistance to insecticides, the great diversity of the malaria parasite, and insufficient funding towards the control of the disease. These challenges have created a gap that calls for quick intervention to reduce the burden of the disease (Dondorp et al., 2012). The recent development of P. falciparum resistance to artemisinin combination therapies in Southeast Asia (Takala-Harrison et al., 2015) calls for the development of new drug compounds that can overcome parasite resistance to existing drugs.
Studies have shown that sulfated polysaccharides have antimalarial activities (Mourão, 2015). Unlike most drugs that target the blood stages of the parasite, these compounds exploit newly identified pathways to interact with intracellular parasites. It has been established that merozoites perforate the erythrocyte membrane before egress (Boyle et al., 2010). Heparin enters the infected erythrocyte through these pores and prevents merozoite egress (Glushakova et al., 2017). These findings have led to therapies using heparin to treat severe malaria (Vogt et al., 2006). Moreover, heparin has been shown to reduce the cytoadherence of infected red blood cells and thus reduce parasitemia. However, heparin use was discontinued due to the induction of severe bleeding in patients (Leitgeb et al., 2011). Low concentrations of heparin were later used successfully in safety and efficacy trials to disrupt cytoadherence and rosette formation and additional research has been conducted to identify low anticoagulant sulfated polysaccharides with antimalarial activity. Modified heparin compounds and polysaccharide inhibitors were successfully profiled for activity against intracellular parasites (Boyle et al., 2017). These compounds have been identified in marine organisms and plants (Marques et al., 2016).
The interaction of sulfated polysaccharide molecules with merozoite proteins and the P. falciparum erythrocyte membrane protein family has been studied intensively. Members of the Duffy binding-like domain and reticulocyte binding-like domain were shown to interact with heparin at different affinities (Saiwaew et al., 2017). In addition, sulfated polysaccharides interact with all P. falciparum reticulocyte binding homologues, including PfRH2, PfRH4 and PfRH5 (Somner et al., 2000). The P. falciparum parasite uses not only the merozoite surface proteins for invasion but also exported proteins that are found on the surface of the erythrocyte to form cross-linkers with erythrocytes (Tarr et al., 2014). The exported proteins have been classified as essential for parasite survival and virulence, and hence contribute to the pathogenesis of malaria (Maier et al., 2008). The Plasmodium-helical interspersed sub-telomeric domain (PHIST) family of exported proteins is expressed in most Plasmodium species but is largely expanded in P. falciparum. The PHIST domain clusters into three subgroups across the Plasmodium genus: PHISTa, PHISTb, and PHISTc. PHISTa and PHISTb localize specifically in P. falciparum. PHISTb is found in P. vivax and P. knowlesi. One major virulent protein of the PHIST family is PHISTb containing ring-infected erythrocyte surface antigen (RESA)-like protein, abbreviated to PHISTb/RLP1 (PlasmoDB accession number, PF3D7_0201600). This protein has been implicated in the attachment of infected erythrocytes to organs and blood vessels, as well as in the erythrocyte remodeling mechanism. Studies have associated this protein with control of the expression of PfEMP1 (P. falciparum erythrocyte membrane protein 1) (Moreira et al., 2016; Oberli et al., 2016; Warncke et al., 2016). As a drug target in the P. falciparum genome, PHISTb/RLP1 is identified in the Tropical Disease Research Targets database (TDR Targets ID: 3288). In this study, we profile specific protein-ligand interactions that give the first insight into drug compounds targeting exported proteins. Furthermore, we outline known and novel mutations found within PHISTb/RLP1 protein sequences from whole blood samples collected from various malaria endemic sites in Kenya. We link the effects of these mutations to protein structure, active sites and interaction with the drug molecules.

Study setting
The study was conducted at the Malaria Drug Resistance laboratory in Kisumu, Kenya, under the United States Army Medical Research Directorate-Africa and Kenya Medical Research Institute. The study analyzed a subset of archived P. falciparum samples from ongoing approved research, studying the epidemiology of malaria drug resistance patterns in Kenya. Samples were collected from the following sites: Kisumu and Kombewa (endemic regions), Kericho and Kisii (highland epidemic areas), Marigat (seasonal transmission zone), and Malindi (coastal area with declining endemicity). The proportion of samples collected was roughly the same between the six collection sites. Samples were selected using a non-random method because of limited time and resources. Fifty samples were selected from each site from the laboratory inventory, which contained samples from 2008 to 2018. The factor considered was the availability of the sample collected at day zero of study participant recruitment.

Amendments from Version 1
The newly revised article contains additional information and analysis of the data as suggested by the reviewers. The included analysis is Absorption, Distribution, Metabolism, Excretion and Toxicity (ADMET) of all the reported drug molecules. A positive control, low molecular weight heparin, was included in the analysis for comparison of the results. The control was selected on the basis that it belongs to the sulfated polysaccharides group and has gone through clinical trials as reported in previous studies. In addition, figures elaborating hydrophobic and hydrogen bonding interactions between the ligands and protein targets were included in the discussion in line with the reviewers' comments. All other missing information regarding explanation of the protocol raised by the reviewers was also addressed and included in the new version. The specific sections are: explanation of the docking protocol, grid box parameters, energy minimization and the choice of modelled structure in LOMETS. An explanation of how samples were selected for this study and the treatment of patients was included in the methods.
Any further responses from the reviewers can be found at the end of the article.

Sample collection
Samples were collected from volunteers consenting to participate in the study, who were aged 6 months. 2.5 ml of venous blood was collected from participants presenting signs of uncomplicated malaria. 0.5 ml of the blood was transferred to Acid Citrate Dextrose tubes for sterile culture. Participants who tested positive for malaria were given artemether-lumefantrine, and follow-up samples were taken at days 7, 14 and 28 following treatment. This study analyzed samples collected at day zero before treatment. At least 1 ml of the blood was transferred into an ethylenediaminetetraacetic acid (EDTA) tube for molecular assays, which included DNA extraction. From the remaining blood sample, 3-5 drops were transferred from the syringe onto Whatman filter paper by blotting on three separate spots and archived at room temperature in a pouch.

Ethical clearance
Written informed consent was provided by participants and/or their legal guardians. The study was carried out in accordance with guidelines approved by the Ethical Review Committee of the Kenya Medical Research Institute (KEMRI), Nairobi, Kenya and the Walter Reed Army Institute of Research (WRAIR) Institutional Review Board, Silver Spring, MD. The study was conducted under the approved study protocols KEMRI #1330/WRAIR #1384 and KEMRI #3628/WRAIR #2454.
Sequencing of PHISTb/RLP1 gene using molecular assays
A total of 251 viable samples were used in this study. P. falciparum parasite DNA extraction was carried out using Qiagen DNA Mini extraction spin protocol (Qiagen, Valencia, CA) as per manufacturer's instructions. Extracted DNA was stored at -20°C for preservation.
Plasmodium testing and P. falciparum speciation using quantitative PCR
First, the samples were tested for the presence of Plasmodium parasite. The Plasmodium genus was detected with primers F1 and R1 shown in Table 1; the probe was labeled with FAM (6-carboxyfluorescein) at the 5' end and TAMRA (6-carboxytetramethyl-rhodamine) at the 3' end. All the Plasmodium positive samples were then tested for P. falciparum species. The P. falciparum species-specific assay was carried out with primers and probes designed with a FAM reporter, as reported in previous work (Kamau et al., 2013; Veron et al., 2009). The samples were amplified in 0.1 ml 96-well plates. The kit used for this assay was QuantiFast (QIAGEN). Each well contained 2 µl of DNA template, the recommended amount of QuantiFast master mix and the recommended amount of primers and probes. The reaction mixture amounts were calculated according to the number of samples run per assay.
The real-time PCR assay was then carried out in an Applied Biosystems QuantStudio 6 Flex (QuantStudio real-time PCR system, Applied Biosystems by Thermo Fisher Scientific) with the following conditions: 95°C for 10 minutes, 40 cycles at 95°C for 15 minutes, and then 60°C for 1 minute, as previously published (Kamau et al., 2013). RNAse-P was used as the housekeeping gene, and the Ct value was assigned by the instrument as the cycle at which the amplification was high enough to pass the detection threshold. The results were exported into an Excel worksheet showing the cycle threshold (Ct) value for each sample. Only samples positive by P. falciparum confirmatory PCR with a cycle threshold value of 32 or below were considered for downstream assays (Table 1).
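The Ct cut-off described above can be illustrated with a short sketch. The sample identifiers and Ct values below are hypothetical and are not taken from the study; a value of None is assumed here to stand for a well with no detectable amplification.

```python
# Minimal sketch of the Ct-based selection step.
# Sample IDs and Ct values are invented for illustration only.
CT_CUTOFF = 32.0


def select_for_sequencing(ct_values, cutoff=CT_CUTOFF):
    """Return the sorted sample IDs whose Ct is at or below the cutoff.

    A Ct of None is treated as "no amplification detected" and excluded.
    """
    return sorted(
        sample
        for sample, ct in ct_values.items()
        if ct is not None and ct <= cutoff
    )


example_cts = {"S001": 24.8, "S002": 32.0, "S003": 36.5, "S004": None}
selected = select_for_sequencing(example_cts)
print(selected)
```

Keeping the cutoff as a named constant makes it easy to check how sensitive the number of retained samples is to the chosen threshold.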
Conventional PCR assays for P. falciparum PHISTb/RLP1 gene amplification and gel electrophoresis
Gene-specific primers were designed using Primer3Plus, as described in the literature (Untergasser et al., 2012). Correct annealing of the primers to the target sequence was checked by mapping them with Sequence Manipulation Suite version 2. Primers are included in Table 2. PCR was carried out in an Applied Biosystems thermocycler. For the first pair of primers, amplifying the first fragment, the reaction conditions were as follows: denaturation at 94°C for 5 minutes initially and for 30 seconds in each cycle, annealing at 59°C for 30 seconds, and extension at 72°C for 1 minute in each cycle and for 5 minutes in the final elongation stage. For the second primer pair, the annealing temperature was 58°C for 30 seconds, while the denaturation and extension phases were the same as for primer pair one. We used the puReTaq Ready-To-Go PCR Beads standard master mix (GE Healthcare, CITY, STATE), as per manufacturer's instructions.
Table 1. Primers and probes used to diagnose Plasmodium parasite and detect Plasmodium falciparum species in real time PCR. These primers and probes were published in previous work by Kamau et al. (2013).

Following successful amplification, PCR amplicons were visualized by 2% agarose gel electrophoresis for confirmation. The 2% agarose gel was prepared using 2.25 g of agarose powder in 150 ml of 10% Tris Acetate EDTA (TAE) buffer. The mixture was heated in the microwave for 3 min to dissolve the agarose. The gel was then cooled until it was lukewarm, and 15 µl of GelRed was added. Thereafter the gel was poured into the gel tank in which gel combs had been inserted. The gel was incubated at room temperature for 30 to 60 minutes to allow it to set. Upon setting, the gel was flooded with TAE running buffer to just above the wells for loading of samples. Samples were then loaded into the wells after mixing 4 µl of the sample with 4 µl of loading dye. Additionally, 1.5 µl of 1 kb HyperLadder 1 (Bioline) was loaded into each of the first and final wells on the gel. After loading all the samples, the gel was completely flooded with running buffer (10% TAE) and was run at 230 V for 50 min. The gel was then read on a UVIsave HD5 gel documentation system (Uvitec Limited, United Kingdom).
Sanger sequencing of P. falciparum PHISTb/RLP1 gene
Amplification products were purified using the ExoSAP-IT (Affymetrix, CITY, STATE) enzymatic cleanup procedure. The enzymatic reaction involves an exonuclease, which degrades residual primers, and Shrimp Alkaline Phosphatase (SAP), which cleaves phosphate groups from unincorporated dNTPs. Using 0.2 ml 96-well plates, each well contained 2 µl of ExoSAP-IT enzyme mixed with 8 µl of PCR product. The plates were incubated at 37°C for 20 minutes to allow the enzymes to work, then for a further 20 minutes at 80°C to inactivate the ExoSAP-IT enzymes prior to sequencing. BigDye termination PCR was carried out using the primary PCR primers and amplification conditions. BigDye terminator sequencing is a modification of Sanger sequencing in which dideoxynucleotides (ddNTPs) labeled with a specific fluorescent dye corresponding to each nucleotide base are added to the reaction. The sequencing products were cleaned using Sephadex supplied by Sigma. 10 µl of HiDi formamide (Applied Biosystems) was added to the purified sequence products. The plates were then sealed and heated at 96°C for 3 minutes to denature the DNA and then analyzed by capillary electrophoresis on an ABI 3130/3500xL Genetic Analyzer. Read assembly was done in Qiagen CLC Main Workbench version 8.0.1 (CLC Bio, 2014).
Single nucleotide polymorphisms analysis in PHISTb/RLP1 protein sequences
Multiple sequence alignment was achieved using Multiple Sequence Comparison by Log-Expectation (MUSCLE) (Edgar, 2004) in Jalview version 2.11.0 (Alzohairy, 2014; Waterhouse et al., 2009). BioEdit version 7.2.5 (Hall, 1999) was used for intron and gap deletions, as well as translation of the nucleotides to protein sequences. The alignment was copied to Microsoft Excel, where the frequencies of the observed single nucleotide polymorphisms (SNPs) were counted and frequency bar graphs were generated.
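The frequency count that was done in a spreadsheet can also be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the toy sequences are invented, the function names are hypothetical, and the sketch assumes that the reference contains no gaps (so that the alignment column equals the residue position) and that '-' and 'X' mark missing residues.

```python
from collections import Counter


def snp_frequencies(reference, aligned_samples):
    """Fraction of samples carrying each amino acid substitution.

    All sequences are aligned protein strings of equal length. The
    reference is assumed to be gap-free, so column index + 1 is the
    residue position. Gaps ('-') and unknown residues ('X') in a
    sample are skipped. Keys look like 'F145L'.
    """
    counts = Counter()
    for seq in aligned_samples:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the same length")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa in "-X" or aa == ref_aa:
                continue
            counts[f"{ref_aa}{pos}{aa}"] += 1
    n = len(aligned_samples)
    if n == 0:
        return {}
    return {snp: count / n for snp, count in counts.items()}


def frequent_snps(freqs, threshold=0.5):
    """Substitutions observed in more than `threshold` of the samples."""
    return sorted(snp for snp, f in freqs.items() if f > threshold)


# Toy alignment, invented for illustration.
toy_reference = "MKFDY"
toy_samples = ["MKLDY", "MKLDY", "MKFDH", "MKLD-"]
toy_freqs = snp_frequencies(toy_reference, toy_samples)
print(toy_freqs)
print(frequent_snps(toy_freqs))
```

The `threshold` argument mirrors the kind of frequency cut-off applied later when choosing SNPs for structure analysis.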
Protein 3D structure modelling, model verification and binding site predictions
Protein structures of the PHISTb/RLP1 reference (Pf3D7) and mutant proteins were predicted via the ab initio multithreading tool LOMETS, which is verified to model multidomain protein structures (Wu & Zhang, 2007; Zheng et al., 2019). The LOMETS tool selects templates internally through an in-built multiple sequence alignment and ranks the templates in descending order according to a normalized Z-score. A Z-score greater than or equal to one is considered a good alignment.
Protein Data Bank (PDB) hits used as templates for the PHISTb/RLP1 reference structure were 4jle, 5ez3 and 6d03, all with a normalized Z-score ≥ 1. The templates selected for the PHISTb/RLP1 mutant structure modelling were 4jle, 2ziq and 6d03, which had a normalized Z-score ≥ 1. Two templates were the same for the reference and mutant protein structure threading because of sequence similarities at the starting and ending domains of the protein sequences. The middle template differed between the reference and mutant structures because of the presence of point mutations in this region of the mutant PHISTb/RLP1 protein sequence.
The models were submitted to ModRefiner (Xu & Zhang, 2011) for energy minimization. The energy minimized structure of the PHISTb/RLP1 reference had a Template Modeling (TM) score of 0.97 to the initial model. The TM score ranges between 0 and 1, whereby the higher the TM score, the closer the match between the two structures. The models were further refined using the Galaxy Refine web server tool (Ko et al., 2012) for correction of wrong rotamers. The stability of the protein models was validated through the percentage of residues lying within the favored and allowed regions using the Rampage tool (Lovell et al., 2003), and overall stability was confirmed using the PROCHECK web tool (Laskowski et al., 1993). Following model validation, the functional sites within the protein structure were predicted in the webserver tool FT Site (Brenke et al., 2009; Ngan et al., 2012). The FT Site algorithm was reported to achieve near experimental accuracy in predicting druggable hotspots in 94% of the apo-proteins used in the evaluation of binding site methods. For confirmation of our binding site clusters, COACH, a metaserver tool that compares the predictions of other servers to profile highly accurate protein-ligand binding sites, was used (Yang et al., 2013).
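To make the TM score mentioned above more concrete, the sketch below evaluates the standard TM-score sum for a fixed superposition. It is a simplification and not the tool used in the study: the full TM-score maximizes this sum over superpositions, whereas here the per-residue distances (in Å) are assumed to be given, and the function names are illustrative.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        raise ValueError("d0 is only defined here for proteins longer than 15 residues")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    # A lower bound of 0.5 is commonly applied for short chains.
    return max(d0, 0.5)


def tm_score(distances, target_length):
    """TM-score sum for one fixed superposition.

    `distances` holds the distance between each pair of aligned residues
    after superposition. The true TM-score is the maximum of this value
    over all superpositions, which this sketch does not search for.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example for a 485-residue protein:
# identical structures give 1.0, and residues displaced by exactly d0 give 0.5.
length = 485
print(tm_score([0.0] * length, length))
print(tm_score([tm_d0(length)] * length, length))
```

Because each residue contributes at most 1 and the sum is divided by the target length, the score stays between 0 and 1, which matches the range quoted in the text.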

Sulfated polysaccharides search and toxicity analysis
The sulfated polysaccharides containing anti-malarial properties were identified through chemical modifications, as per previous research (Boyle et al., 2017). We searched for these compounds in the PubChem database, which stores chemical structures of identified chemical compounds and their biochemical activities (Kim et al., 2016). Low molecular weight heparin was used as a positive control. The compound belongs to the sulfated polysaccharides group and was reported to have an inhibition concentration of 0.4 mg·mL⁻¹ against Plasmodium parasites (Skidmore et al., 2008). The drug likeness of the screened compounds was analyzed against the Lipinski Rule of Five, as follows: molecular mass <500 Dalton; high lipophilicity (expressed as LogP <5); <5 hydrogen bond donors; <10 hydrogen bond acceptors; and molar refractivity between 40 and 130. This was achieved using the Lipinski Rule of Five webserver (Gimenez et al., 2010). The absorption, distribution, metabolism, excretion, and toxicity (ADMET) analysis of the screened compounds was carried out using SwissADME. The tool was certified to evaluate the pharmacokinetics and drug-likeness parameters of chemical molecules (Daina et al., 2017). The algorithm contains integrated tools such as iLOG for testing water solubility and BOILED-Egg for absorption, metabolism and excretion of the screened compounds ( et al., 1995).
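The drug-likeness criteria listed above can be written as a small check. The sketch below simply encodes the thresholds as stated in the text (including the molar refractivity range); it is not the webserver's implementation, and the descriptor values in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Physicochemical descriptors of one compound (values supplied by the user)."""
    molecular_mass: float       # Dalton
    logp: float                 # lipophilicity
    h_bond_donors: int
    h_bond_acceptors: int
    molar_refractivity: float


def lipinski_violations(d):
    """Return the names of the criteria, as stated in the text, that fail."""
    checks = {
        "molecular mass < 500 Da": d.molecular_mass < 500,
        "LogP < 5": d.logp < 5,
        "hydrogen bond donors < 5": d.h_bond_donors < 5,
        "hydrogen bond acceptors < 10": d.h_bond_acceptors < 10,
        "molar refractivity 40-130": 40 <= d.molar_refractivity <= 130,
    }
    return [name for name, passed in checks.items() if not passed]


# Hypothetical compounds, for illustration only.
compliant = Descriptors(350.0, 2.1, 2, 6, 90.0)
deviating = Descriptors(620.0, 1.0, 7, 12, 140.0)
print(lipinski_violations(compliant))
print(lipinski_violations(deviating))
```

Returning the list of failed criteria, rather than a single pass or fail flag, makes it clear which property causes a compound to deviate from the rule.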

Diagnosis for Plasmodium genus and Plasmodium falciparum species detection
A total of 175 out of 251 (70%) samples tested positive for Plasmodium parasite. The Plasmodium-positive samples were further tested for P. falciparum species and 63% were positive. The P. falciparum-positive samples (n=110) were used for the downstream experiments, i.e. to sequence the PHISTb/RLP1 gene.
Genetic mutations in PHISTb/RLP1 sequences
102 of the 110 samples processed for Sanger sequencing yielded sequences. Of these, 86 samples generated complete sequences; 16 samples had poor sequences.
Non-synonymous SNPs identified in the PHISTb/RLP1 sequenced data
A total of 157 non-synonymous SNPs were observed across the full length of the protein sequences (485 amino acids in length). The SNPs occurred at different frequencies in the total sequenced samples. Only SNPs occurring at a frequency >50% were considered for protein structure analysis (n=20) (Table 3).

SNPs considered for mutant protein structure analysis
Within the 20 SNPs occurring at high frequency, codons F145L, D146R, Y147D, S156H, S208L and L219H were all novel mutations identified in our samples across all the sites where our samples were collected. We compared our data against SNPs recorded in PlasmoDB genetic variation tracks for the PHISTb/RLP1 3D7 reference (PF3D7_0201600). These six mutations were not present in the database; therefore, they could be newly identified mutations for the PHISTb/RLP1 gene as observed in our samples. Analyzing the amino acid substitutions further, we observed that mutations at codons N108T, F145L, W209F, Y210N, V269A, V274I and M277L were all substitutions of amino acids within the same group. The F145L, W209F, V269A, V274I and M277L mutations were all substitutions within the non-polar group, while N108T and Y210N were substitutions in the polar group. Amino acids within the same group play similar functional roles in the protein sequence and thus these SNPs were not considered for protein function analysis. On the other hand, I129T, T142V, Y147D, E154Q, S156H, T167I, S208L, M211T, L219H, D387N, D390N, and E403K were substitutions of amino acids across different functional groups. We postulated that these point mutations are most likely to change the folding of the protein structure and have an effect on the function of the protein. These codons were modelled in the mutant protein sequence for the PHISTb/RLP1 structure. Their effect on the tertiary protein structure, as well as on the interaction with a drug compound, was further investigated (Figure 1).
Homology modeling, structure refinement and binding sites prediction
PHISTb/RLP1 reference and mutant structures were modelled successfully. The LOMETS modelling results revealed five structures with significantly close scores. All structures were assessed for correct positioning of the amino acid side chains using the RAMPAGE tool. The model with a Ramachandran score above 90%, which is the recommended value for a significant model, was selected for further analysis. On this basis, the fourth model from both the wild-type and the mutant structure predictions was judged fit for the downstream docking experiments. The energy minimized structure of the PHISTb/RLP1 reference had a TM score of 0.97 to the initial model. The TM score ranges between 0 and 1, whereby the higher the TM score, the closer the match between the two structures. The PHISTb/RLP1 mutant protein had a TM score of 0.98 following energy minimization. The models were submitted to Galaxy Refine for further model refinement, and the first models, which had the best scores, were chosen. The Galaxy refined models were submitted to RAMPAGE software for Ramachandran plot assessment. The reference structure had 95% and the mutant structure 94% of the residues in the favored region. The PROCHECK results displayed 92% of the residues in the reference structure in the most favored region. The mutant structure had 90% of the residues in the favored region. Ramachandran plot analysis requires a good model to have at least 90% of the amino acids in the favored region. These Ramachandran plot scores qualified the two models for the docking experiments, and the models were submitted to the binding site prediction tools.
The FT Site prediction outcome included PyMOL session (.pse) files that were opened in PyMOL to visualize the active amino acids in the binding site clusters for both reference and mutant PHISTb/RLP1 protein structures, as shown in Table 4. We also considered residues that were predicted by COACH software and had the highest confidence score (C-score) of 0.05 for PHISTb/RLP1 reference protein structure. COACH predicted residues in PHISTb/RLP1 protein structure had a C-score between 0.07 and 0.08. The C-score ranges between 0 and 1, where a higher C-score gives a more reliable prediction. In both the mutant and the reference structures, there were active residues predicted by both tools. This double prediction emphasized the accuracy of the binding sites as functional regions of the proteins (Figure 2 and Table 5).
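The distinction drawn in this section between substitutions within the same amino acid group and substitutions across groups can be sketched as follows. The four-group classification used here (non-polar, polar uncharged, acidic, basic) is a common textbook scheme and is an assumption; the text does not state which exact grouping was applied.

```python
# Assumed textbook grouping of the 20 standard amino acids by side chain.
GROUPS = {
    "nonpolar": set("GAVLIMFWP"),
    "polar": set("STCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def group_of(residue):
    """Return the name of the group that a one-letter residue belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def parse_mutation(label):
    """Split a label such as 'F145L' into (reference, position, substitute)."""
    return label[0], int(label[1:-1]), label[-1]


def is_within_group(label):
    """True when the reference and substituted residues share a group."""
    ref, _, alt = parse_mutation(label)
    return group_of(ref) == group_of(alt)


snps = ["N108T", "F145L", "Y210N", "S156H", "E403K"]
for snp in snps:
    kind = "within group" if is_within_group(snp) else "across groups"
    print(snp, kind)
```

Under this assumed grouping, N108T, F145L and Y210N stay within a group, while S156H and E403K cross groups, in line with how the text classifies them.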

Sulfated polysaccharides search and Absorption, Distribution, Metabolism, Excretion, and Toxicity analysis
The PubChem database search yielded the following compounds: beta carrageenan, alpha carrageenan, dextrin sulfate, amylopectin sulfate, zinc sulfate, ghatti sulfate, 2,4-diaminoanisole sulfate, cyclodextrin sulfate, fucoidan, 3-aminophenylboronic acid and 3,6-Di-O-benzoyl-D-galactal. In this group of compounds, dextrin sulfate deviated from the Lipinski Rule of Five. The distribution of the compounds was indicated by a bioavailability score of 0.55, which is the recommended value for medicinal drugs. The compounds do not inhibit cytochrome isoenzymes and they do not cross the blood brain barrier (BBB). The skin permeability scores were negative, showing low permeability of the molecules (Table 6, Table 7).
Table 4. Galaxy refine scores for the best model of the protein structures for PHISTb/RLP1 reference and mutant proteins after structure refinement. The scores considered included the poor rotamers scores, which are very low in both models, indicating correction of the side chain (R chain) conformations. The Rama favored scores were also used to assess the models. Scores above 90 for Rama favored show that the models are good to be used for further analysis.

PHISTb/RLP1 protein-ligand interactions
The identified compounds interacted with the target protein.
Alpha carrageenan, amylopectin sulfate, cyclodextrin sulfate and fucoidan exhibited optimum interactions with the PHISTb/RLP1 protein. The amino acids interacting with these compounds were identified in the binding sites. Of these interactions, amino acids S132, K84 and N80 were found within the PHISTb domain. These interactions show the potential of interfering with the function of the exported protein. Beta carrageenan and dextrin sulfate had specific interactions with the target protein. Amino acids K376, E402 and E403 were identified in the binding site clusters. The interactions of 2,4-diaminoanisole sulfate and zinc sulfate with the protein were not specific to the identified binding site clusters. However, we noted that the amino acids interacting with these compounds are within the PHIST domain, which spans amino acids 1 to 167 of the protein sequence. Ghatti sulfate showed interaction with the reference protein as well, although not specific.
The same drug compounds were tested on the modelled PHISTb/RLP1 mutant protein structure carrying the non-synonymous SNPs identified from the sequenced data. The aim was to test whether mutations found within the protein interfere with the interactions. The interactions of the drug compounds with the mutant protein were not as strong as those with the reference protein. The interactions in the mutant protein occurred at different binding residues because the mutations within the protein affected the folding of the protein structure. Due to these mutations, there was a shift of the binding site clusters. Alpha carrageenan, ghatti sulfate and 3,6-Di-O-benzoyl-D-galactal showed specific interactions with the mutant protein. Residues K35, N90, D17 and N18 were within the PHIST domain. The other compounds interacted with different residues in the protein domains. The interactions were not optimized to the predicted binding site clusters. However, we noted that some of the interacting residues were within the PHISTb domain, which is the functional region of the protein. The interactions were compared to those of the positive control, low molecular weight heparin, which has predetermined activity against Plasmodium falciparum. Visualization of the interaction of low molecular weight heparin with the PHISTb/RLP1 reference protein revealed specific interactions between the compound and N197, K205 and S161, all predicted with a high confidence score in the target binding site, as shown in Figure 3. These interactions, therefore, gave insight into the action of the drug compounds against the mutant protein.
When selecting the poses after the docking experiments, we selected the first pose of the AutoDock Vina pdbqt output file. The docking energies of the screened compounds with the reference and mutant targets were very low. The energies released therefore supported the drug likeness of these compounds. The resulting Root Mean Square Deviation (RMSD) was zero (the RMSD of the first pose is always zero). There was a difference in the docking energies of the drug compounds interacting with the PHISTb/RLP1 reference and mutant proteins because of the difference in binding site clusters. The interactions with different residues, caused by the shifting of binding sites due to mutations, resulted in the slight difference in energy released during docking (Figure 4, Table 8-Table 10).
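Selecting the first pose from an AutoDock Vina output file can be sketched as below. The sketch relies on the `REMARK VINA RESULT:` lines that Vina writes for each pose (binding energy in kcal/mol, followed by the lower and upper bound RMSD relative to the best pose). The example text and its energy values are invented for illustration and are not results from this study.

```python
def parse_vina_results(pdbqt_text):
    """Return (energy, rmsd_lower, rmsd_upper) for every pose, in file order."""
    poses = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            fields = line.split(":", 1)[1].split()
            energy, rmsd_lb, rmsd_ub = (float(value) for value in fields[:3])
            poses.append((energy, rmsd_lb, rmsd_ub))
    return poses


# Invented excerpt of a Vina output file (atom records omitted).
example_output = """MODEL 1
REMARK VINA RESULT:      -7.3      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.0      1.852      3.210
ENDMDL
"""

poses = parse_vina_results(example_output)
first_pose = poses[0]   # Vina lists poses from best to worst energy
print(first_pose)
```

Since the RMSD columns are measured against the best pose, the first pose reports zero for both, which is why the first-pose RMSD quoted in the text is always zero.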

Discussion
Point mutations, protein structure analysis and binding sites
In P. falciparum, SNPs have been reported to cluster in subtelomeric regions of the chromosomes. A study comparing synteny of exported proteins in P. vivax and P. falciparum reported a large number of SNPs in chromosome 2 and 10 subtelomeric regions (Sargeant et al., 2006). The subtelomeric regions of the P. falciparum genome have been reported to be highly variable. The genes found in these regions have shown sequence diversity within and across different P. falciparum isolates. The drug target PHISTb/RLP1 is located in chromosome 2.
Table 7. ADMET analysis data for screened drug compounds. The compounds are highly soluble in water, with a logS score greater than -4, which is the range of most marketed drugs. The compounds have high gastrointestinal absorption except for carrageenan and 2,4-diaminoanisole sulfate. The bioavailability value is highly significant for most of the compounds. The drug compounds do not inhibit cytochromes, hence no formation of toxic compounds. Skin permeation values are negative, indicating low permeation. Lipophilicity greater than 1 in 3,6-Di-O-benzoyl-D-galactal and ghatti sulfate shows the ability of these compounds to reach target cells.
Our genetic diversity analysis of this protein in Kenyan isolates revealed many SNPs, many of which had been identified before. Novel SNPs were also found at high frequency across our samples. These novel point mutations spanned different groups of amino acids and included I129T, T142V, Y147D, E154Q, S156H, T167I, S208L, M211T, L219H, D387N, D390N, and E403K. These substitutions are hypothesized to affect the function of the protein, so we analyzed their effect on the structure of the protein as well as on interactions with drug compounds.
The homology modelling of full length PHISTb/RLP1 protein revealed that the 3D structure of this drug target is

The highlighted residues show optimized interactions with amino acids that were predicted by the binding site prediction tools. Alpha carrageenan, beta carrageenan, amylopectin sulfate, cyclodextrin sulfate and fucoidan compounds interacted specifically with amino acids predicted in binding sites of the reference protein structure. They showed potential ability to block functional domains of PHISTb/RLP1 protein.

Interacting residues, PHISTb/RLP1 reference protein
Low molecular weight heparin: N197, S161, K205
Alpha carrageenan: S132, R315, N80
Beta carrageenan: K376, N408, E402, E403
Dextrin sulfate: K376, E377, Q404, D406, E402
Amylopectin sulfate: S201, N197, S132, K84, K205
Phenoxyacetyl cellulose sulfate/zinc sulfate: S113, Y137, T29, K111
Ghatti sulfate: K375, G317
2,4-Diaminoanisole sulfate: N7, S12
Cyclodextrin sulfate: S132, A133, F134
Fucoidan: K205, S132, A133
3,6-Di-O-benzoyl-D-galactal: N406, K371

Table 10. Docking energies of each drug compound on interacting with PHISTb/RLP1 reference and mutant proteins. The energies released were all very low, supporting the activity of the sulfated polysaccharides as drug compounds against the exported proteins. The low binding energies show high binding affinities. Generally, the docking energies for the PHISTb/RLP1 mutant protein were slightly higher than those for the reference protein. As seen in the interaction results, the compounds interact with different residues in the two proteins, hence the difference in the released energy.

Sulfated polysaccharides drug compounds
Sulfated polysaccharides are a wide group of biochemical molecules with therapeutic properties including anti-thrombotic, anti-viral and anti-plasmodial activities (Mourão, 2015). We narrowed down our search to sulfated polysaccharides with potential anti-malarial properties (Boyle et al., 2017). The database search yielded eleven compounds that are curated in PubChem: beta carrageenan, alpha carrageenan, dextrin sulfate, amylopectin sulfate, zinc sulfate, ghatti sulfate, 2,4-Diaminoanisole sulfate, cyclodextrin sulfate, fucoidan, 3-Aminophenylboronic acid and 3,6-Di-O-benzoyl-D-galactal. The positive control, low molecular weight heparin, was previously reported to have activity against Plasmodium parasites (Leitgeb et al., 2011). The compound was reported to have a significant inhibitory concentration (IC50 = 1.97 × 10^-2 and 3.05 × 10^-3 mg/mL) against Plasmodium falciparum (Skidmore et al., 2008).
The compound is a glycosaminoglycan and therefore provides a structural and functional comparison with the screened sulfated polysaccharides (Sun et al., 2016). The compounds met most of the ADMET parameters with minor deviations. The water solubility score for most of the compounds was below six and above -4, which is the reported range for market drugs (Ritchie et al., 2013). The compounds neither cross the blood-brain barrier nor inhibit cytochrome enzymes, which suggests that they are unlikely to form toxic compounds in the body (Daina & Zoete, 2016). The more negative the skin permeability score, the lower the permeability. The bioavailability score of 0.55 is well correlated with most market drugs (Riyadi et al., 2021). The compounds have a lipophilicity score near one, which is the recommended score for a drug. This indicates that these molecules may reach target cells through hydrophobic interactions with lipids (Daina & Zoete, 2016). The lipophilicity scores are less than four, which is recommended for optimal absorption and lower toxicity of the compounds. 3,6-Di-O-benzoyl-D-galactal and ghatti sulfate have an optimal lipophilicity score, meaning that their physicochemical properties qualify them as drug molecules (Gao et al., 2017). This is significant because it indicates that the drugs can be metabolized and excreted safely in the body. The compounds generally have a low skin permeation ability and can hence be retained in the body longer (Potts & Guy, 1992).
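The ADMET criteria discussed above can be expressed as a simple screen. The following is a minimal sketch that uses only the cut-offs stated in the text (logS above -4, lipophilicity below four, no blood-brain barrier permeation, no cytochrome inhibition). The `AdmetProfile` class, the function name and the two example profiles are hypothetical; the values are illustrative and are not the SwissADME results of this study.

```python
from dataclasses import dataclass
from typing import List

# Cut-offs taken from the text; treat them as assumptions of this sketch.
LOG_S_MIN = -4.0   # solubility: logS should be greater than -4
LOG_P_MAX = 4.0    # lipophilicity: score should be less than four


@dataclass
class AdmetProfile:
    name: str
    log_s: float        # predicted water solubility (logS)
    log_p: float        # predicted lipophilicity
    crosses_bbb: bool   # predicted blood-brain barrier permeation
    inhibits_cyp: bool  # predicted inhibition of any cytochrome enzyme


def admet_flags(profile: AdmetProfile) -> List[str]:
    """Return the criteria a compound fails; an empty list means none."""
    flags = []
    if profile.log_s <= LOG_S_MIN:
        flags.append("low solubility")
    if profile.log_p >= LOG_P_MAX:
        flags.append("high lipophilicity")
    if profile.crosses_bbb:
        flags.append("crosses blood-brain barrier")
    if profile.inhibits_cyp:
        flags.append("inhibits cytochrome enzymes")
    return flags


# Hypothetical profiles, for illustration only.
compound_a = AdmetProfile("hypothetical compound A", -2.1, 1.2, False, False)
compound_b = AdmetProfile("hypothetical compound B", -5.0, 4.5, False, True)

for compound in (compound_a, compound_b):
    print(compound.name, admet_flags(compound) or "passes all criteria")
```

Returning the list of failed criteria, rather than a single pass/fail value, keeps minor deviations visible so that a compound can still be carried forward with its limitations recorded.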

The solubility score showed that these compounds could be easily handled in drug testing procedures such as in vitro assays of Plasmodium cultures. All of the compounds except 3-Aminophenylboronic acid were tested for activity against the exported protein PHISTb/RLP1. Dextrin sulfate was still tested despite deviating from Lipinski's Rule of Five, because previous research supports its antimalarial activity (Boyle et al., 2017). The drug activity of this compound supports its investigation as an antimalarial, with further chemical modifications to meet the rules for a drug compound. The antimalarial activity of carrageenan compounds had previously been reported; however, the compound had to be further modified to reduce its toxicity (Adams et al., 2005; Recuenco et al., 2014). The compound interacted optimally with the reference and mutant proteins through different hydrogen bonds as well as van der Waals forces, as shown in Figure 4. All these compounds interacted with PHISTb/RLP1 at different affinities, which suggests inhibitory activity against the PHIST family of proteins.
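For readers unfamiliar with Lipinski's Rule of Five, the check can be sketched as follows. The thresholds are the commonly cited ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors). The function name and the descriptor values in the example are hypothetical and do not describe any compound in this study.

```python
from typing import List


def lipinski_violations(mol_weight: float, log_p: float,
                        h_bond_donors: int, h_bond_acceptors: int) -> List[str]:
    """List the Rule of Five criteria that a molecule violates."""
    violations = []
    if mol_weight > 500.0:
        violations.append("molecular weight > 500 Da")
    if log_p > 5.0:
        violations.append("logP > 5")
    if h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Hypothetical descriptor sets, for illustration only.
small_molecule = lipinski_violations(350.0, 2.1, 2, 6)
large_polysaccharide = lipinski_violations(1200.0, -3.0, 12, 30)

print(small_molecule)        # no violations for this illustrative input
print(large_polysaccharide)  # size and hydrogen-bond criteria are exceeded
```

Large, highly sulfated polysaccharides tend to exceed the size and hydrogen-bonding limits, which is consistent with a compound such as dextrin sulfate deviating from the rule. The rule is an empirical guide to oral absorption and does not by itself measure toxicity.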

Interactions of sulfated polysaccharides with PHISTb/RLP1
The identified drug compounds interacted with the exported protein PHISTb/RLP1. Alpha carrageenan interacted with both the reference and the mutant proteins, showing potential inhibitory activity against exported proteins. Amylopectin sulfate, cyclodextrin sulfate, ghatti sulfate, fucoidan and 3,6-Di-O-benzoyl-D-galactal had specific interactions with the protein. The interactions of the drug compounds with specific amino acids found in binding site clusters indicate that these compounds have the potential to block the protein domains used to invade red blood cells. 2,4-Diaminoanisole sulfate and zinc sulfate showed weak interactions with the proteins. The interactions with the mutant protein were generally weaker. The mutations found in PHISTb/RLP1 changed the tertiary folding of the protein, thus interfering with the active sites. The identified mutations I129T, T142V, Y147D, E154Q, S156H and T167I are within the PHIST domain (PDB ID: 4JLE) (Oberli et al., 2014). I129T is a non-polar to polar substitution, whereas T142V and T167I are polar to non-polar substitutions. Y147D is a substitution of a polar amino acid by the negatively charged aspartic acid. E154Q is a substitution from negatively charged glutamic acid to polar uncharged glutamine, and S156H is a substitution from a polar uncharged amino acid to a positively charged one. These changes affected the protein structure as well as enhancing the protein function. Binding affinity differed between the residues contacted in the two structures, as shown by the difference in docking energies. The interactions of these compounds with exported proteins support the wide activity of sulfated polysaccharides against the P. falciparum parasite.
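The substitution classes described above can be derived mechanically from the mutation labels. The sketch below assumes a conventional four-class grouping of side chains in which tyrosine is treated as polar uncharged, matching the text; other schemes classify tyrosine, histidine, cysteine or glycine differently. The function and dictionary names are hypothetical.

```python
import re
from typing import Tuple

# Assumed side-chain grouping; alternative classification schemes exist.
SIDE_CHAIN_CLASS = {}
for residues, label in (
    ("AVLIMFWPG", "non-polar"),
    ("STYNQC", "polar uncharged"),
    ("DE", "negatively charged"),
    ("KRH", "positively charged"),
):
    for residue in residues:
        SIDE_CHAIN_CLASS[residue] = label

MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def describe_substitution(mutation: str) -> Tuple[int, str, str]:
    """Return (position, class of original residue, class of new residue)."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    original, position, replacement = match.groups()
    return (int(position),
            SIDE_CHAIN_CLASS[original],
            SIDE_CHAIN_CLASS[replacement])


# Novel substitutions listed in the text.
MUTATIONS = ["I129T", "T142V", "Y147D", "E154Q", "S156H", "T167I",
             "S208L", "M211T", "L219H", "D387N", "D390N", "E403K"]

for label in MUTATIONS:
    position, before, after = describe_substitution(label)
    print(f"{label}: {before} -> {after} (position {position})")
```

Applied to the full list, the same function also covers the six substitutions outside the PHIST domain, for example E403K, which replaces a negatively charged residue with a positively charged one.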
P. falciparum exports proteins to the host erythrocyte to enhance its virulence. The changes induced by the exported proteins include altering the physical properties of the cell and giving it adhesive properties, which enhance the pathogenesis of malaria in humans (Boddey et al., 2016). Among the exported proteins, the PHIST family, which contains 89 proteins, has been identified to play a role in making the remodeled host cell more cytoadherent (Warncke et al., 2016). The key target we have studied in this research, PHISTb/RLP1 (Pf3D7_0201600), plays a key role in remodeling the host erythrocyte to enhance malaria virulence through a cytoadherence mechanism (Warncke et al., 2016). The PRESAN domain present in this protein contains the Plasmodium protein export element (PEXEL) motif that enables the export of the protein (Boddey et al., 2016; Moreira et al., 2016). The interactions of the screened drug compounds with amino acids found in the functional domains of this protein reveal novel chemical inhibitors targeting exported proteins. The compounds have the potential to inhibit the PRESAN and PHIST domains from carrying out the export functions of the proteins. In P. falciparum, the PHIST family of proteins is expressed in the early and late ring stages, as well as in trophozoites. Using PHISTb/RLP1 as a representative of this protein family, the interactions of the protein with sulfated polysaccharides suggest that these compounds can deactivate exported proteins.

Conclusion
The interactions of specific sulfated polysaccharide compounds with the PHISTb/RLP1 protein are the first findings showing compounds that can act against exported proteins of the Plasmodium parasite. These findings support further downstream drug discovery with these compounds as leads for developing the next class of antimalarial agents. Solving crystal structures of PHISTb/RLP1 and other exported proteins is recommended to give more insight into the implications of structural variants for protein structure and function.

Data availability
Open

Abraham Madariaga
Institute of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
The manuscript describes the modeling of a protein and its mutant from Plasmodium falciparum, and the assessment of some ligands with these models by molecular docking analysis. The mutant was built based on experimental data from several samples. Both protein models were validated using online tools. Although the authors used several online tools for assessing the structural performance of the proteins, the mutant model is lacking confidence in its tridimensional structure. Furthermore, the docking protocol is not fully described. The results are too preliminary to suggest a potential activity of the selected ligands against PHISTb/RLP1 proteins. I would recommend major revision prior to indexing.
The modeled mutant protein resulted with a very different tridimensional configuration to the reference protein. For better prediction and assessment of a homology model, it is suggested to perform a molecular dynamics simulation of at least 50 ns in order to let the structure correctly minimize. Otherwise, the predicted protein may have a non-biological relevance.
It is not clear why the authors select the fourth model of each structure (reference and mutant) for energy minimization. Does LOMETs give a score? Furthermore, the TM score is significantly high for both structures, meaning a "perfect match" between the initial model and the generated model. It is surprising that the mutant model gets a TM as similar as the reference model, so the mutant has no apparent structural changes despite having a dozen different amino acids. Nevertheless we can see in Fig2a-b big differences among those structures. How do you explain this?
Rule of five is not synonymous with toxicity. Rule of five gives an empirical approximation of the pharmacokinetics of a molecule, in terms of absorption. Ro5 is a simple approach intended to select molecules with druggability properties. To examine toxicity of molecules, many other methods can be used (Ames mutagenicity prediction, ADME-Tox predictions, LD50, etc) in different online platforms.
Docking protocol. Were the ligands used in their salt forms? For better results, the ligands should undergo a "washing" procedure that includes removing counterions, adjusting charges, and geometry minimization. It is not described in the methodology.
Regarding the gridbox, what size does it have? Depending on the size, the ligands (in this case the ligands are large molecules) are allowed to freely rotate or not. This may bias the docking results.
On the other hand, if a positive control is not used as part of the docking experiments, it is difficult to argue that a molecule will have potent activity, as is claimed in the manuscript. It is preferable to suggest only a possible activity.
For completeness of the discussion, a figure depicting the ligand-protein interaction of the ligands with better affinity would be great.

Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Comment: The manuscript describes the modeling of a protein and its mutant from Plasmodium falciparum, and the assessment of some ligands with these models by molecular docking analysis. The mutant was built based on experimental data from several samples. Both protein models were validated using online tools. Although the authors used several online tools for assessing the structural performance of the proteins, the mutant model is lacking confidence in its tridimensional structure. Furthermore, the docking protocol is not fully described. The results are too preliminary to suggest a potential activity of the selected ligands against PHISTb/RLP1 proteins. I would recommend major revision prior to indexing.
Response: All the tools used in modeling and assessment of the protein structures are open source tools whose algorithms have been assessed and published. Furthermore, they have been used and cited in other scientific research prior to this work and are therefore able to provide significant scientific data, as documented in the provided links and cited publications for each tool used. The docking protocol details are now included, that is, the preparation of ligand files in MGL tools, the grid box parameters and the Python script used.
Comment: The modeled mutant protein resulted with a very different tridimensional configuration to the reference protein. For better prediction and assessment of a homology model, it is suggested to perform a molecular dynamics simulation of at least 50 ns in order to let the structure correctly minimize. Otherwise, the predicted protein may have a nonbiological relevance.
Response: Energy minimization was carried out using ModRefiner, an algorithm for atomic-level, high-resolution protein structure refinement that can start from a C-alpha trace. The models underwent energy minimization using this tool prior to the docking experiments. A molecular dynamics simulation of at least 50 ns was not possible to carry out because of limited expertise with a suitable tool. However, after energy minimization using ModRefiner, the quality of the models improved and they were fit for docking simulations.
Comment: It is not clear why the authors select the fourth model of each structure (reference and mutant) for energy minimization. Does LOMETs give a score? Furthermore, the TM score is significantly high for both structures, meaning a "perfect match" between the initial model and the generated model. It is surprising that the mutant model gets a TM as similar as the reference model, so the mutant has no apparent structural changes despite having a dozen different amino acids. Nevertheless, we can see in Fig2a-b big differences among those structures. How do you explain this?
Response: The fourth model was chosen after screening the Ramachandran scores of the five suggested models; the fourth model had the highest score. LOMETs does not provide scores for the models themselves, which are simply ranked from 1 to 5, but only for the selected template. The structures match because the middle template selected by the algorithm was similar in both the reference and mutant modelling jobs. The PHIST domain is the only crystal structure available for this protein family in PDB, which is why the TM scores match closely. The difference between the two structures can be seen in the alpha-helix folding, which is significantly distinct and postulated to be an effect of the SNPs identified.
Comment: Rule of five is not synonymous with toxicity. Rule of five gives an empirical approximation of the pharmacokinetics of a molecule, in terms of absorption. Ro5 is a simple approach intended to select molecules with druggability properties. To examine the toxicity of molecules, many other methods can be used (Ames mutagenicity prediction, ADME-Tox predictions, LD50, etc) in different online platforms.
Response: ADMET analysis data was included using the SwissADME tool as suggested.
Comment: Docking protocol. Were the ligands used in their salt forms? For better results, the ligands should undergo a "washing" procedure that includes removing counterions, adjusting charges, and geometry minimization. It is not described in the methodology.
Response: The ligand files were converted using the Open Babel tool and prepared using MGL tools as described in the text. This editing enabled the removal of counterions prior to docking.