Characterization of potential drug targeting folate transporter proteins from Eukaryotic Pathogens [version 2; peer review: 2 approved with reservations] Previously titled: Genome-wide characterization of folate transporter proteins of eukaryotic pathogens

Background: Medically important pathogens are responsible for the death of millions every year. For many of these pathogens, options for therapy are limited and resistance to commonly used drugs is emerging fast. The availability of genome sequences of many eukaryotic microbes is providing critical biological information for understanding parasite biology and identifying new drug and vaccine targets.
Methods: We developed automated search strategies in the Eukaryotic Pathogen Database Resources (EuPathDB) to construct a protein list and retrieve protein sequences of folate transporters encoded in the genomes of 200 eukaryotic microbes. The folate transporters were categorized according to features including mitochondrial localization, number of transmembrane helices, and protein sequence relatedness.
Results: We identified 234 folate transporter proteins associated with 63 eukaryotic microbes, including 48 protozoa and 13 fungi, the others being algae and bacteria. Phylogenetic analysis placed 219 proteins into a major clade and 15 proteins into a minor clade. All the folate transporter sequences from the malaria parasite, Plasmodium, belonged to the major clade. The identified folate transporters include folate-binding protein YgfZ, folate/pteridine transporter, folate/biopterin transporter, reduced folate carrier family protein and folate/methotrexate transporter FT1. About 60% of the identified proteins are reported for the first time. Phylogenetic analysis shows the relatedness of the identified proteins.
Conclusion: These findings offer new possibilities for potential drug development targeting folate-salvage proteins in eukaryotic pathogens.


Introduction
A diverse range of eukaryotic pathogens is responsible for the most economically important diseases of humans and animals 1,2. As a result of underdevelopment, a lack of social infrastructure and insufficient funding of public health facilities, most of these pathogens are endemic to resource-poor countries in sub-Saharan Africa, South-East Asia and South America, where they are responsible for high morbidity and mortality 1-3. Of these, parasitic protozoa form a major group. The apicomplexan and kinetoplastid parasites include important members that cause most of this morbidity and mortality through diseases such as malaria, cryptosporidiosis, toxoplasmosis, babesiosis, leishmaniasis, human African trypanosomiasis and South American trypanosomiasis (Chagas' disease) 4,5. Other important diseases caused by protozoans include giardiasis, amoebic dysentery 6,7 and trichomoniasis 8. A vicious cycle of poverty and disease exists for most of these parasites, with high infection and death rates in affected populations 9-11. The appreciable burden of disease caused by these parasites has been aggravated by the lack of a licensed vaccine for most of them 12. Furthermore, the current drugs of choice for many of the parasites have significant side effects, and drug-resistant strains are emerging 13-15. Despite the urgent demand for new therapies, few drugs have been developed to combat these parasites 16. A major limitation to the development of new drugs is the paucity of new drug targets. There is therefore a need to discover novel and alternative chemotherapeutic targets that can support drug development efforts for disease control 16-18. One approach to selective antimicrobial chemotherapy has been to inhibit unique targets that are vital to the pathogen and absent in mammals 17,18.
A metabolic pathway that has been exploited considerably for the development of drugs is the folate biosynthetic pathway 19 .
Antifolate drugs target this pathway and are among the most important and successful antimicrobial chemotherapies, acting against a range of bacterial and eukaryotic pathogens. While most parasitic protozoa can synthesize folates from simple precursors, such as GTP, p-aminobenzoic acid (pABA) and glutamate, higher animals and humans cannot 20. Additionally, a few of these parasites can also salvage folate as a nutrient from their host 21. Folate compounds are important for the synthesis of DNA, RNA and membrane lipids, and are transported by receptor-mediated and/or carrier-mediated transmembrane proteins, the folate transporters 20-22. Importantly, antifolates that target the biosynthesis and processing of folate cofactors have been effective in the chemotherapy of bacterial and protozoan pathogens 21. More importantly, the folate pathway has been confirmed as essential in some eukaryotic pathogens, such as Plasmodium, trypanosomes and Leishmania 19.
In addition to the folate biosynthesis pathway, proteins that mediate transport of useful nutrients such as folic acid have been identified as important chemotherapeutic drug targets 18,19,23. Hence, the folate pathway, its metabolites and its transporters continue to be studied extensively to identify new enzymes and transporters that may serve as drug targets 22. Recent estimates have ascribed eight different membrane transporters to eukaryotes 24.
Proteins that mediate the transport of folates have been well studied in a few parasites, such as Plasmodium falciparum, Trypanosoma brucei, Leishmania donovani and Leishmania major 25,26. These studies have provided information on the mode of action of drugs 25,27,28 and on mechanisms of parasite drug resistance 25-32. However, folate transport proteins remain unidentified and uncharacterized in many other eukaryotic pathogens. This is despite the sequencing of the genomes of most eukaryotic microbes, which has produced a wealth of data that could aid in the identification of druggable pathogen-specific proteins 33-39. It is therefore imperative to search these parasite genomes for additional proteins, such as folate transporters, that may serve as novel drug targets 40,41. In an attempt to identify and characterize targets for novel therapeutics, we report herein an extensive search for folate transporters in pathogen genomes. In addition, we investigated the evolutionary relationships of these transporters to determine the similarities and differences that make them attractive drug targets. The knowledge provided may assist in the design of new antifolates for protozoan parasites.

Methods
For approximately 200 pathogens, we extracted the sequences of proteins that mediate transport or salvage of folates from the Eukaryotic Pathogen Genome Database Resources (http://eupathdb.org/eupathdb/) and from the literature using a key-word search. We also searched the 200 pathogen genome sequences archived at the Eukaryotic Pathogen Genome Database Resources (http://eupathdb.org/eupathdb/). The search covered all proteins that mediate transport or salvage of folate alone, or of folate together with related compounds (such as pteridine, biopterin and methotrexate). This database gives public access to most sequenced emerging/re-emerging infectious pathogen genomes 42. We utilized

Amendments from Version 1
We have improved the manuscript following the suggestions of the reviewers. We have addressed their major and minor concerns.
• Title has been modified from the first version
• Abstract has also been corrected as suggested
• Workflow diagram has been removed and the workflow is described in more detail

Proteins that did not belong to these groups were classified as others (4%) (Figure 1A). Most of the identified proteins had predicted transmembrane helices, although a few had none (Figure 1B). Furthermore, a number of the transporters possess signal peptides (Dataset 1 43), which may be required for targeting to cellular locations. Deciphering the sequence of the targeting signal may indicate the destination of the protein.

species, which, like P. falciparum, may also be chemotherapeutic targets. Transport of folate in higher eukaryotes is made possible by the high-affinity folate-biopterin transporter (FBT, or BT1) family 22,30. Among the trypanosomes and related kinetoplastids, members of the FBT family were identified in Leishmania 28. It is thought that major facilitator superfamily (MFS) proteins are related to the FBT family. These proteins have been characterized in a few protozoa and cyanobacteria 56. Our finding that these transporters are present across several phyla corroborates other work establishing the conservation of folate transport function among FBT family proteins from plants and protists 22,56.
Malaria parasites encode transporters belonging to the organic anion transporter (OAT), folate-biopterin transporter (FBT) and glycoside-pentoside-hexuronide (GPH):cation symporter families, which are closely related to the major facilitator superfamily of membrane proteins 57. Inhibition of these transporters by blockers of organic anion transporters, such as probenecid, has been implicated in the sensitization of resistant Plasmodium parasites to antifolates 58,59. Thus, in Plasmodium, the identification of folate transporters could lead to screening for compounds that interfere with folate transport and salvage for antimalarial chemotherapy 22,30. We identified several types of folate transporters that have been described and functionally characterized in Leishmania, some of which are implicated in the import of the antifolate methotrexate 60,61. Thus far, among protozoa, only transporters in Plasmodium, Leishmania and Trypanosoma brucei have been characterized, and these are known to mediate the uptake of the vitamins folate and/or biopterin 22,62,63. Thus, in parasite species of medical importance, folate transporter proteins may provide new targets for therapy.
We also identified folate-salvaging proteins from fungi such as Coccidioides immitis and A. clavatus, which are found in soil 64-66, vegetables 64 and waters in tropical and subtropical areas 67. These fungi are known to occasionally become pathogenic and act as opportunistic pathogens of animals and humans 66. Coccidioidomycosis caused by C. immitis in association with AIDS is known to be a fatal disease 68. Treatment of acute and chronic infections with antifungals such as amphotericin B has not been adequate, hence folate transporters may present new targets in this group of pathogens. Folate transporters were also identified in C. fasciculata, which parasitizes several species of insects, including mosquitoes, and has been widely used to test new therapeutic strategies against parasitic infections 69. Because C. fasciculata is a model organism, the folate transporters identified in it may be useful in research to develop new drugs against medically important kinetoplastids, as has been shown for other targets in this protozoan parasite 70.
We noticed that P. parasitica INRA-310 and L. pyrrhocoris H10 had the highest number of folate transporters identified. Their utility as model fungal (P. parasitica) and monoxenous kinetoplastid (L. pyrrhocoris) organisms may prove instrumental in developing new antifolates for fungal and protozoan diseases. The relatedness of these proteins across the different pathogens shows that there are two major phylogenetically distinct clades in the eukaryotic pathogens examined. The clustering of these proteins suggests that these transport proteins have highly conserved regions often required for basic cellular function or stability 60,71-87. Thus, antifolate chemotherapeutic drugs that are effective against one pathogen might have some effect on others. However, the converse may be the case for the free-living, non-parasitic photosynthetic algae Chromera velia and Vitrella brassicaformis, protists related to apicomplexans 88,89. These algae live freely in their environment, unlike apicomplexans, which depend on a host animal to survive 88. This adaptation may explain the difference in the clustering of their transporters after phylogenetic analysis: they fell in the minor clade, whereas the transporters of the apicomplexans fell in the major clade. This suggests a high level of evolutionary divergence between the folate transporters of the apicomplexans and those of these algae, based on life-style adaptations.
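The split into a major and a minor clade described above can be illustrated with a small sketch that reads a Newick-format tree and counts the leaves under each child of the root. This is only an illustration under stated assumptions: the toy tree, its leaf labels and branch lengths, and the helper names `split_root_children` and `leaf_names` are hypothetical and are not taken from the study's data or workflow. The parser handles only simple, unquoted Newick labels.

```python
import re

def split_root_children(newick: str) -> list[str]:
    """Return the subtrees that hang directly off the root of a Newick string."""
    body = newick.strip().rstrip(";")
    # Keep only what lies between the outermost parentheses.
    inner = body[body.index("(") + 1 : body.rindex(")")]
    children, depth, current = [], 0, []
    for ch in inner:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma at depth 0 separates two children of the root.
            children.append("".join(current))
            current = []
        else:
            current.append(ch)
    children.append("".join(current))
    return children

def leaf_names(subtree: str) -> list[str]:
    """Collect leaf labels: names that directly follow '(' or ',' (or start the string)."""
    pattern = r"(?:^|[(,])\s*([^(),:;\s][^(),:;]*)"
    return [name.strip() for name in re.findall(pattern, subtree)]

# Hypothetical toy tree; labels and branch lengths are invented for illustration.
TOY_TREE = (
    "(((Pf_FT1:0.1,Pv_FT1:0.1):0.2,(Tb_FBT:0.3,Lm_FBT:0.2):0.1):0.1,"
    "(Cvel_FT:0.4,Vbra_FT:0.5):0.6);"
)

# Sort the root's children by leaf count: the largest is the "major" clade.
clades = sorted(
    (leaf_names(child) for child in split_root_children(TOY_TREE)),
    key=len,
    reverse=True,
)
major, minor = clades[0], clades[1]
print(f"major clade: {len(major)} leaves, minor clade: {len(minor)} leaves")
```

For real trees with quoted labels, comments or support values, an established phylogenetics library would be a safer choice than this hand-rolled parser.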

Conclusion
In summary, we have retrieved information on 234 folate transporter proteins from the Eukaryotic Pathogen Database (EuPathDB) resources. The folate transporter proteins were categorized according to potential drug-targeting features, including mitochondrial localization, number of transmembrane helices, and protein sequence relatedness. The identification of folate salvage proteins in diverse eukaryotes extends the known evolutionary diversity of these proteins and suggests that they might offer new possibilities for drug development targeting folate-salvaging routes in eukaryotic pathogens.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
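As a quick arithmetic check on how the 234 proteins break down by transporter type, the sketch below recomputes the percentage share of each category. The counts are those quoted in the peer-review report from version 1 of the manuscript (10, 25, 132, 2, 7, 49 and 9). Pairing each count with a category in the order listed is an assumption, and the names `VERSION1_COUNTS` and `percentage_breakdown` are illustrative only.

```python
# Counts per category as quoted from version 1 in the reviewer report.
# Assumption: the numbers map to the categories in the order they were listed.
VERSION1_COUNTS = {
    "folate-binding protein YgfZ": 10,
    "folate/pteridine transporter": 25,
    "folate/biopterin transporter": 132,
    "reduced folate carrier family protein": 2,
    "folate/methotrexate transporter FT1": 7,
    "folate transporter": 49,
    "others": 9,
}

def percentage_breakdown(counts: dict[str, int]) -> dict[str, int]:
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}

total = sum(VERSION1_COUNTS.values())
shares = percentage_breakdown(VERSION1_COUNTS)
for name, n in VERSION1_COUNTS.items():
    print(f"{name}: {n} of {total} ({shares[name]}%)")
```

Under that assumed pairing, the counts sum to 234 and the rounded shares agree with the percentages quoted in the text (4%, 11%, 56%, 1%, 3% and 21%, with 4% classified as others).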

Open Peer Review

INTRODUCTION
1. The statement of objectives of the research needs to be clear. "Therefore, in an attempt to identify and characterize targets for novel therapeutics, we report herein an extensive search of folate transporters from pathogen genomes. In addition, we investigated the evolutionary relationship of these transporters in a bid to determine similarities and differences that make them attractive drug targets. The knowledge provided may assist in the design of new antifolates for protozoan parasites." Suggested Revision: "The objectives of the research reported in this article were to (1) characterize folate transporters encoded in genomes of eukaryotic pathogens; and (2) determine evolutionary relationship of folate transporters in genomes of eukaryotic genomes. These research objectives will help advance the development of effective anti-folate drugs to reduce the morbidity and mortality associated with eukaryotic pathogens."

METHODS
1. Divide the methods section according to the two objectives.

2. "approximately 200-pathogens": Suggested revision: delete "-". The purpose of the dash is not justified.

3. The folate transporters were classified based on type of transporter, number of

profile (such as HMM) based searches. Thus the number of transporters identified by authors is most likely to be an underestimate.

4. The methodology is not clear. Authors write that "we utilized the word "folate" for search on the gene text and "folic acid" was used to confirm the hits", then how only transporters were retrieved? The Figure 1 is confusing. Where is BLAST used here?
5. The manuscript is written very poorly with so many scientific and grammatical mistakes that it is very difficult for the reader to follow the manuscript. Below are some examples.
a) "The mitochondrion is the predicted location of the majority of the proteins, with 15% possessing signal peptides." How can mitochondria be majority location if only 15% have signal peptides and even less with mitochondrial signal peptide? Shouldn't the majority then be cytoplasmic?
b) "We identified 234 proteins to be involve in folate transport".
c) "Since folate-binding protein YgfZ, folate/ pteridine transporter, folate/biopterin transporter, putative, reduced folate carrier family protein, folate/methotrexate transporter FT1, putative folate transporters alone and others have 10, 25, 132, 2, 7, 49 and 9." What are these numbers?
d) "So we decided to reconstruct the phylogeny based folate transporter, folate-biopterin transporter after considering the identification number, the species diversity in each category."
e) "The different proteins identified to be involved in folate salvage or related molecules were folate-binding protein YgfZ, folate/pteridine transporter, folate/biopterin transporter, reduced folate carrier family protein, folate/methotrexate transporter FT1 and folate transporters having a 4%, 11%, 56%, 1%, 3% and 21% identity, respectively." What does this statement mean?
f) Does Table 1 really need to be 4 page long?
g) "The only Plasmodium species with results for proteins that salvage folate was P. falciparum"
h) "However, folate transporters I and II were retrieved from our search of GeneDB for P. malariae and P. ovale curtisi, respectively." What are these transporter classes?
i) "Some of these pathogens include P. ultimum DAOM BR144, which has mitochondrial folate transporter/carrier proteins similar to Homo sapiens, E. cuniculi GB-M1, which has proteins similar to folate transporter, and S. punctatus DAOM BR117, which has folate-binding protein YgfZ."
j) "After phylogenetic analysis each sub-phylogeny show a clear characterization except for folate-biopterin transporters"
h) "In this study, 234 genes encoding homologues of folate salvaging proteins were identified in the genome of 64 strains, representing 28 species of eukaryotic pathogens. Some of the pathogens include P. falciparum 3D7 and IT, P. knowlesi H, P. berghei ANKA, P. chabaudi chabaudi, T. brucei Lister 427, T. brucei TREU927, T. brucei gambiense DAL972, Encephalitozoon cuniculi GB-M1. The pathogens range from bacteria through to fungi, intracellular parasites such as Plasmodium and leishmania species, to extracellular parasites such as trypanosome species" Which bacteria was included in the study?
i) "It has been estimated that over half of the drugs currently on the market target integral membrane proteins of which membrane transporters are a part, but unfortunately, these transporters have not been adequately explored as drug targets. Folate transporters therefore represent attractive drug targets for treatment of infectious diseases." Please tell us how many drugs are available in the market which target folate transporters, which is a more relevant statistic with respect to this study.
j) "In the trypanosomes and related kinetoplastids, a member of these transporters, the folate biopterin transporter (FBT) family of proteins was identified in Leishmania."
k) "It is thought that MFS proteins are related to the FBT." What is MFS?
l) "Results from our study describing the presence of these transporters across several phyla corroborate results other researches, establishing the conservation of folate transport function among FBT family proteins from species from plants and protists".
m) "The clustering of these proteins suggests that these transport proteins have highly conserved regions often required for basic cellular function or stability". The clustering does not suggest that these transport proteins have highly conserved regions often required for basic cellular function or stability.
n) "We also performed phylogenetic comparisons of identified proteins. .".

Competing Interests:
No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Summary of Referee's Report
The manuscript presents a strong justification for research on folate transporter proteins as drug targets for diseases caused by eukaryotic pathogens including the malaria parasite. The manuscript reports a data curation effort that involves the use of the Eukaryotic Pathogens Database (EuPathDB) Resource. Several novel results to guide future research are included such as (i) a list of 234 folate transporter proteins from 63 eukaryotic microbes including eukaryotic pathogens; and (ii) phylogenetic trees of relatedness of the protein sequences. The authors observed the clustering of the protein sequences that indicate the possibility that antifolate drugs could be effective for multiple eukaryotic pathogens.
Major concerns are:
(i) The need for a clearer description of the workflow for the construction of the protein list.
(ii) There is inadequate support for the statement that "60% of the proteins were identified for the first time".
(iii) Confusion between retrieval and identification of protein sequences. The workflow diagram indicates retrieval of sequences but narrative text describes identification in multiple sections.

The potential drug targeting categorization of the retrieved protein sequences is a key contribution of the research study.

4. "pathway are important for" > "pathways are necessary for"

5. "Methods: We applied a combination of bioinformatics methods to examine the genomes of pathogens in the EupathDB for genes encoding homologues of proteins that mediate folate salvage in a bid to identify and assign putative functions." > "Methods: We developed automated search strategies in the Eukaryotic Pathogen Database Resources (EuPathDB) to construct a protein list and retrieve protein sequences of folate transporters encoded in the genomes of 200 eukaryotic microbes. The folate transporters were categorized according to features including mitochondrial localization, number of transmembrane helix, and protein sequence relatedness."

6. Provide key result(s) of the protein list retrieval and phylogenetic comparison of the retrieved proteins. For example, We constructed a list of 234 folate transporter proteins associated with 63 eukaryotic microbes including ??? algae, ??? fungi and ??? protozoa. Seven percent of the proteins were predicted to localize on the mitochondrial membrane. Phylogenetic tree revealed major (??? proteins) and minor (??? Proteins) clades. All the folate transporter sequences from the malaria parasite, Plasmodium, belonged to the major clade.

"The mitochondrion is the predicted location of the majority of the proteins". This statement is not supported by Figure 2D, where 7% of the protein sequences are labelled as Mitochondrial folate transporters.

Article content: Have the design, methods and analysis of the results from the study been explained and are they appropriate for the topic being studied?
Design and Methods:
1. Figure 1 presents a conceptual hierarchical methodology. The rectangle labelled "Protein names/ sequences verification" has arrows to PubMed, UniProt, Membranetransporters.org, NCBI, GeneDB, Google Scholar and Phylogenetic analyses. It appears that the integrated results from the search strategies in the databases provided the input for the phylogenetic analyses. Please clarify.
2. Which step of the workflow resulted in the list of 234 proteins?
3. How many proteins were retrieved from the initial search using EuPathDB?
4. Protein Features Retrieved rectangle: Was the retrieval of protein features performed on only the 234 proteins?
5. There is adequate explanation of the methods for phylogenetic analyses. Please provide the Newick format phylogenetic tree as a supplementary dataset.
Analysis of the Results:
1. Table 1 is a major curation effort presented in the manuscript.
a. The title of Table 1 should be updated to "Eukaryotic microbes from which folate transporters were identified". The list includes non-pathogens.
b. The content of the table (especially the Kingdom entries) should be checked for accuracy. The Kingdom column could be updated to Eukaryotic Microbe Group with entries as algae, protozoa or fungi.
c. The column entries for A. capsulatus G186AR (mislabelled as bacteria) should be updated as the organism name is for a fungus (genus Ajellomyces). This update will also affect the Phylogenetic Tree (Figure 3). The node labelled Actinobacillus clusters with Ajellomyces macrogynus.
d. An updated Table 1 should be presented as Dataset 2 in a spreadsheet file. This would enable secondary data analysis by other researchers.
e. A new Table 1 could consist of columns for Eukaryotic Microbe Group, Genera of Eukaryotic Microbe, List of Species/Strain and Number of Folate Transport Proteins. This will provide reader with an overview of how the 234 proteins is distributed into the genera of the eukaryotic microbes.
f. References listed for confirmation searches. Page 8, Paragraph 1, Sentence 1: "Our literature search for parasite folate transporters on PubMed and Google Scholar indicated 60% (38 out 63) of the proteins were identified for the first time as presented in Table 1." Comment: Among the references included in Table 1, only eight references (22, 73 to 77 and 82) on the basis of the article title provide experimental assessments of the folate transporters. Reference 85 is a reference for MEGA7 software. In the sentence "proteins" should be eukaryotic microbes. The proportion of eukaryotic microbes whose folate transporter(s) have been previously investigated with functional assays should be revised.

Conclusion
Consider revising "In summary, we have identified and classified 234 proteins…" to "In summary, we have retrieved information on 234 folate transporter proteins from the Eukaryotic Pathogen Database (EuPathDB) resources. The folate transporter proteins were categorized into potential drug targeting features including mitochondrial localization, number of transmembrane helix, and protein sequence relatedness."

Data (if applicable):
Has enough information been provided to be able to replicate the experiment? Are the data in a usable format/structure and have all the data been provided?
1. Table 1 needs to be revised and converted to a Dataset.
2. Please provide the Newick format phylogenetic tree as a supplementary dataset.
3. The Gene Identifiers [Gene ID] in Dataset 1 can be used to retrieve the protein sequences