Keywords
Vibrio cholerae, Whole-genome sequencing, Bioinformatics, Genomics
This article is included in the Pathogens gateway.
This article is included in the Antimicrobial Resistance collection.
Cholera remains a serious public health problem in Uganda and Africa as a whole1,2. It is characterized by a large disease burden, recurrent outbreaks, high case fatality rates, and persistent endemicity1,2. Over the last four decades, Uganda has experienced several cholera outbreaks2. The detection, monitoring, and surveillance of cholera in Uganda rely upon the isolation of Vibrio cholerae using culture-based methods in microbiology laboratories2,3. However, these methods face several challenges, including long turn-around times (24–48 hours); limited microbiology laboratory capacity; lack of laboratory supplies; poor laboratory infrastructure, particularly the electricity necessary to operate laboratory equipment; and limited availability of reliable and rapid diagnostic tests2. High-throughput sequencing, a culture-independent method, has been documented to be less affected by most of these challenges, while providing an unprecedented view of pathogen biology and delivering high-resolution genomic epidemiology via rapid and cheap whole-genome sequencing4–7. Despite this, sequencing remains a less desirable option for most scientists in Uganda, and data on the genomic characteristics of V. cholerae therefore remain scarce, likely because of underdeveloped bioinformatics capacity and a lack of the expertise necessary for analyzing whole-genome sequencing data7. This study set out to use bioinformatics approaches to analyze whole-genome sequence data obtained from V. cholerae isolates from different outbreaks in Uganda, with the aim of providing the complete array of virulence genes, pathogenicity islands, antimicrobial resistance genes, integrative and conjugative elements and the antimicrobial resistance genes associated with them, plasmids, and insertion sequences.
In addition, this study also provided a single nucleotide polymorphism (SNP) based phylogenetic analysis of the strains.
This was a cross-sectional study that analyzed 10 whole-genome sequences of V. cholerae isolates. These isolates were collected during three different cholera outbreaks in Uganda between 2014 and 2016 and sequenced by a group from the University of Maryland (Bwire et al., 2018). The whole-genome sequencing data were deposited in NCBI’s Sequence Read Archive (SRA) under accession number SRP136117.
Procedures and considerations in sample collection and whole-genome sequencing are described by Bwire et al., 2018. Briefly, whole-genome sequencing was done using three or four representative samples obtained from each of the three Multiple-Locus Variable Number Tandem-Repeat Analysis clonal complexes that had been identified during the period 2014–2016. Libraries were prepared from fragmented DNA using the KAPA High Throughput Library Preparation Kit (Millipore-Sigma, St. Louis, MO), then enriched and barcoded, and subsequently sequenced using a 100 bp paired-end run on an Illumina HiSeq2500 (Illumina, San Diego, CA).
Whole-genome sequencing data for the 10 V. cholerae isolates were downloaded from NCBI’s SRA using fastq-dump from the SRA Toolkit v2.9.3. An overview of the bioinformatics workflow adopted in this study is provided in Figure 1.
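The download step can be sketched as follows. This is a minimal illustration, not the commands actually run in this study: the `--split-files`, `--gzip`, and `--outdir` options are assumed here because the data are paired-end, and the output directory name `reads` is hypothetical. The run accessions are two of those named in the results.

```python
def fastq_dump_command(run_accession, outdir="reads"):
    """Build a fastq-dump command line for one paired-end SRA run.

    The flags are an assumption for illustration; the study does not
    state which options were used.
    """
    return [
        "fastq-dump",
        "--split-files",   # write forward and reverse reads to separate files
        "--gzip",          # compress the output FASTQ files
        "--outdir", outdir,
        run_accession,
    ]

# Two of the run accessions reported for this dataset.
runs = ["SRR6871245", "SRR6871246"]
commands = [fastq_dump_command(run) for run in runs]
for command in commands:
    # Printed only; pass the list to subprocess.run(command, check=True)
    # on a machine where the SRA Toolkit is installed.
    print(" ".join(command))
```

Building the command as a list keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting.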
Untrimmed sequence data quality reports were generated with FastQC v0.11.8 and MultiQC v1.7 using default settings.
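To illustrate the kind of per-read summary that underlies such quality reports, the sketch below computes the mean Phred score of each read in a FASTQ stream. It assumes the Phred+33 quality encoding and four-line FASTQ records; it is a simplified illustration, not a reimplementation of FastQC, and the two toy reads are invented.

```python
from io import StringIO
from statistics import mean

def phred33_scores(quality_line):
    """Convert a FASTQ quality string (Phred+33 encoding) to integer scores."""
    return [ord(ch) - 33 for ch in quality_line]

def mean_quality_per_read(handle):
    """Yield (read_id, mean Phred score) for each four-line FASTQ record."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        handle.readline()                    # sequence line (not needed here)
        handle.readline()                    # '+' separator line
        quality = handle.readline().strip()
        read_id = header[1:].split()[0]      # drop '@' and any description
        yield read_id, mean(phred33_scores(quality))

# Toy input: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
example = StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n55II\n")
results = list(mean_quality_per_read(example))
for read_id, score in results:
    print(read_id, score)
```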
Bacterial SNP calling was done using Snippy 3.2-dev, a tool for rapid variant calling and core genome alignment. The V. cholerae genome assembly (accession number GCF_002892855.1) was obtained from NCBI Assembly and used as the reference during variant calling. We then used BCFtools v1.9 to extract all SNPs that were shared by the 10 V. cholerae isolates. Custom bash scripts, available on GitHub (see Software availability)8, were used to extract only missense (nonsynonymous) SNPs. Gene ontology enrichment analysis was performed using the PANTHER Overrepresentation Test (released 2019-06-06) with PANTHER annotation version 14.1 (released 2019-03-12) and a V. cholerae reference list. Biological process and molecular function enrichment analyses were also carried out using the same database.
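The logic of finding shared SNPs and then keeping only missense ones can be sketched as below. This is not the authors' bash script or the BCFtools command they ran; it is a simplified illustration that assumes each VCF record carries a SnpEff-style `ANN` annotation in its INFO column, where the second `|`-separated sub-field names the effect. The toy records and gene names are invented.

```python
def variant_key(vcf_line):
    """Identify a variant by (chromosome, position, ref allele, alt allele)."""
    chrom, pos, _id, ref, alt = vcf_line.split("\t")[:5]
    return chrom, int(pos), ref, alt

def shared_variants(per_isolate_lines):
    """Return the variant keys that occur in every isolate's records."""
    key_sets = [{variant_key(line) for line in lines}
                for lines in per_isolate_lines]
    return set.intersection(*key_sets) if key_sets else set()

def is_missense(vcf_line):
    """True if any ANN entry lists 'missense_variant' as its effect."""
    info = vcf_line.rstrip("\n").split("\t")[7]
    for item in info.split(";"):
        if item.startswith("ANN="):
            for annotation in item[4:].split(","):
                fields = annotation.split("|")
                # Sub-field 2 is the effect; several effects are '&'-joined.
                if len(fields) > 1 and "missense_variant" in fields[1].split("&"):
                    return True
    return False

# Invented records for two isolates (data lines only, no VCF header).
isolate_1 = [
    "chr1\t100\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|geneA",
    "chr1\t200\t.\tC\tT\t50\tPASS\tANN=T|synonymous_variant|LOW|geneB",
]
isolate_2 = [
    "chr1\t100\t.\tA\tG\t60\tPASS\tANN=G|missense_variant|MODERATE|geneA",
    "chr1\t200\t.\tC\tT\t60\tPASS\tANN=T|synonymous_variant|LOW|geneB",
    "chr1\t300\t.\tG\tA\t60\tPASS\tANN=A|missense_variant|MODERATE|geneC",
]

shared = shared_variants([isolate_1, isolate_2])
shared_missense = sorted(
    variant_key(line) for line in isolate_1
    if variant_key(line) in shared and is_missense(line)
)
print(shared_missense)
```

In the toy data, the variant at position 300 is missense but present in only one isolate, so it is excluded from the shared set.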
V. cholerae genomic reads were assembled using Unicycler v0.4.8-beta9 to generate contigs. The Pathosystems Resource Integration Center (PATRIC) v3.5.39 was used to annotate the assembled genomes.
PATRIC v3.5.39 was used to generate genome assembly metrics and to identify antimicrobial resistance genes and virulence factors. We used ISfinder, a dedicated database for bacterial insertion sequences, to screen the assembled genomes for insertion sequences. In addition, we analyzed the assembled genomes using several pipelines at the Center for Genomic Epidemiology: multilocus sequence typing (MLST) using MLST v2.0, plasmid searches using PlasmidFinder v2.0, BLAST-based phenotyping against the V. cholerae database using MyDbFinder v1.2, and identification of acquired antibiotic resistance genes using ResFinder. Default settings were used for all of these pipelines.
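As an illustration of one commonly reported assembly metric, the sketch below computes N50 from a list of contig lengths. Which metrics PATRIC reported for these genomes is not specified here, so N50 is chosen as an assumed example, and the contig lengths are invented.

```python
def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running total,
    taken from the longest contig downwards, first reaches half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("at least one contig is required")

# Invented contig lengths (in bp) for a toy assembly.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("total length:", sum(toy_contigs))
print("N50:", n50(toy_contigs))
```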
Using the Mashtree command-line based tool, we generated a Newick file. This file was then uploaded to IcyTree, a browser-based phylogenetic tree viewer.
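Mashtree builds its tree from Mash distances between genomes. The sketch below shows how a Mash distance is derived from the Jaccard index of two k-mer sets, using the formula D = (1/k) ln((1 + j) / (2j)). It is a simplification under stated assumptions: exact k-mer sets are compared rather than MinHash sketches, reverse complements are ignored, and the toy sequences and k = 3 are invented for illustration.

```python
import math

def kmers(sequence, k):
    """Return the set of all k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def jaccard(set_a, set_b):
    """Jaccard index: shared k-mers divided by all distinct k-mers."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

def mash_distance(j, k):
    """Convert a Jaccard index j into a Mash distance for k-mer size k."""
    if j == 0:
        return 1.0          # no shared k-mers: maximal distance by convention
    return math.log((1 + j) / (2 * j)) / k

# Two toy sequences that differ at the final base.
k = 3
j = jaccard(kmers("ACGTACGT", k), kmers("ACGTACGA", k))
print(j, mash_distance(j, k))
```

A matrix of such distances is what a neighbor-joining step would then turn into the Newick tree.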
The 10 sequenced strains of V. cholerae, belonging to the Inaba and Ogawa serotypes, were characterized through analysis of the whole-genome sequencing data. Except for one strain, which was non-O1, all strains belonged to serogroup O1, as indicated by the presence of the rfbV-O1 gene. All 10 sequenced strains had the biotype-specific genes ctxB, rstR, and tcpA; hence, all were atypical El Tor biotype variants of V. cholerae belonging to the third wave of the seventh pandemic. In silico MLST revealed that the sequenced strains belonged to two different sequence types (STs): ST69 and ST515. Table 1 shows the genomic characteristics of the V. cholerae strains.
In addition, all the 10 sequenced strains were found to carry virulence-associated genes. These included makA, ctxA, ctxB, carA, carB, trpB, clpB, ace, toxR, zot, rtxA, ompW, ompR, gmhA, fur, hlyA, and rstR. The 10 sequenced strains also carried genes for the Type VI secretion system (T6SS), including vasA, B, C, D, E, F, G, H, I, J, K, and L, vgrG-2, vgrG-3, vipA/mglA, and vipB/mglB. The strains were also found to have the following genes: alsD (VC1589), involved in the synthesis of 2,3-butanediol; alsR, involved in the acetate-responsive LysR-type regulation; makA, the flagella-mediated cytotoxin gene; Type IV pilus genes, including tcpA, B, C, D, E, F, H, I, J, N, P, Q, R, S, and T, as well as icmF/vasK; adherence genes acfA, B, C, D, and IlpA; and quorum sensing system genes luxS and cqsA. Pathogenicity islands were also present in all the 10 sequenced strains; namely, the Vibrio seventh pandemic islands VSP-1 and VSP-2 as well as VPI-1 and VPI-2.
The 10 sequenced strains all showed genotypic resistance to streptomycin, aminoglycosides, fosfomycin, fluoroquinolones, sulphonamides, trimethoprim, chloramphenicol/florfenicol, and tetracyclines, as illustrated in Table 2.
Furthermore, all the 10 sequenced strains contained the VC1786 integrative and conjugative elements (VC1786ICE genes). Antimicrobial resistance genes associated with resistance to chloramphenicol, streptomycin, sulfamethoxazole, and trimethoprim usually found on the VC1786ICE, such as strA and strB, floR as well as sul2, were found present in the strains according to MyDbFinder 1.2. The sequenced strains were also found to have a genomic organization of the integrative and conjugative element similar to that of the V. cholerae ICEVchHai1 reference strain10.
In addition, no plasmids were detected in any of the sequenced strains according to PlasmidFinder 2.0; in particular, IncA/C plasmids were absent. The cryptic plasmids pSDH1-2 were also absent from all the strains according to MyDbFinder 1.2 and the BLAST atlas.
Insertion sequences were, however, present in all the sequenced strains; these included IS200/IS605, IS630, IS66, IS3, and IS4 (Table 3).
SNP-based phylogenetic analysis showed an overall SNP difference of 120 among the 10 sequenced V. cholerae strains. Close relatedness was observed among strains SRR6871252, SRR6871253, and SRR6871254 (only seven SNP differences); strains SRR6871247, SRR6871249, and SRR6871250 (only four SNP differences), and among strains SRR6871245, SRR6871246, and SRR6871248 (only six SNP differences) (Figure 2).
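Pairwise SNP differences of this kind can be counted from an alignment of the isolates' sequences. The sketch below is a minimal illustration, assuming equal-length aligned sequences and treating gaps and ambiguous bases as uninformative; the isolate names and sequences are invented and do not reproduce the counts reported above.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b, ignore="-N"):
    """Count aligned positions where two sequences carry different bases,
    skipping positions where either sequence has a gap or an N."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )

def pairwise_distances(alignment):
    """Return {(name_x, name_y): SNP distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }

# Invented toy alignment of three isolates.
toy_alignment = {
    "isolate_A": "ACGTAC",
    "isolate_B": "ACGTAT",
    "isolate_C": "TCGNAT",
}
distances = pairwise_distances(toy_alignment)
for pair, count in distances.items():
    print(pair, count)
```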
Furthermore, analysis of shared SNPs showed that the sequenced strains shared 218 SNPs, of which 98 were missense (non-synonymous). Gene enrichment analysis of the SNPs using the PANTHER Gene Ontology database showed enrichment in genes that mediate transmembrane-signaling receptor activity, peptidyl-prolyl cis-trans isomerase activity, and phosphorelay response regulator activity.
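The statistical idea behind an overrepresentation test can be sketched with a one-sided Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution. Whether the PANTHER analysis here used Fisher's exact test or the binomial option is not stated, so this choice is an assumption, and the gene counts in the example are invented.

```python
from math import comb

def overrepresentation_p(k, n, K, N):
    """P(X >= k) under the hypergeometric distribution.

    N: genes in the reference list
    K: reference genes annotated with the GO term
    n: genes in the test list (for example, genes carrying missense SNPs)
    k: test-list genes annotated with the GO term
    """
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)

# Invented counts: 10 reference genes, 5 with the term; 5 test genes, all with it.
p_value = overrepresentation_p(k=5, n=5, K=5, N=10)
print(p_value)
```

A small p-value indicates that the term appears in the test list more often than expected by chance; in practice a multiple-testing correction across all GO terms would follow.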
This was a cross-sectional study that aimed at providing a comprehensive genomic analysis of 10 whole-genome sequences of V. cholerae isolates collected during three different cholera outbreaks in Uganda between 2014 and 2016, submitted by the University of Maryland (Baltimore, MD, United States) to the NCBI SRA database under a study titled, “Molecular characterization of Vibrio cholerae responsible for cholera epidemics in Uganda by PCR, MLVA and WGS”.
In the genomic analysis, this study confirmed the identity of the isolates, provided the complete array of virulence genes, pathogenicity islands, antimicrobial resistance genes, integrative and conjugative elements, and antimicrobial resistance genes associated with these elements, plasmids, and insertion sequences. In addition, this study also provided a SNP-based phylogenetic analysis of the strains.
The identity of the isolates was consistent with what was reported by Bwire et al., 2018. In addition, the finding that most of the isolates belonged to the O1 serogroup is consistent with studies elsewhere5, which attributed this classification to the presence of the rfbV-O1 gene. Unlike a similar study done in the East African region5, in which MLST revealed that the isolates belonged to a single ST (ST69), this study revealed that the isolates belonged to two STs, namely ST69 and ST515. Strains of V. cholerae belonging to ST515 have been reported elsewhere11.
The virulence genes reported in this study are similar to those reported in studies elsewhere5,12–14. These genes included, among others, those belonging to the Type VI secretion system, those involved in adherence, Type IV pilus genes, and those involved in quorum sensing.
Accessory genetic elements, particularly pathogenicity islands previously reported to commonly occur in V. cholerae15,16, were also found in this study; namely, VSP-1, VSP-2, VPI-1, and VPI-2. These have not only been documented to encode virulence-associated genes in V. cholerae, but have also been reported to facilitate a better understanding of the evolutionary events that lead to the emergence of pathogenic V. cholerae clones15,16.
Despite the World Health Organization (WHO) recommendations in regards to the management of cholera with oral rehydration salts in addition to antibiotics such as streptomycin, aminoglycosides, trimethoprim, fosfomycin, fluoroquinolones, sulphonamides, chloramphenicol/florfenicol, and tetracyclines, this study reports genotypic resistance in the isolates to these same antibiotics. Similar resistance has been reported in similar studies from the East African region and elsewhere5,17–19.
The presence of integrative and conjugative elements (VC1786ICE) containing genes associated with resistance to sulfamethoxazole, trimethoprim, chloramphenicol, and streptomycin is also reported in this study. Consistent with previous reports17,19, these elements are genomically similar to V. cholerae ICEVchHai1.
This study found no plasmids or intI genes. This could be attributed to the presence of integrative and conjugative elements, which may have made them insignificant with regard to encoding antimicrobial resistance. Similar studies have reported similar findings5.
This study also reported the presence of insertion sequences IS605, IS66, IS3, and IS4. Insertion sequences have been described in various studies to be drivers of genetic variability. These studies have also alluded to them being fixed by natural selection each time that a mutation induced by these elements is selected20,21.
The presence of T6SS genes in the study isolates could explain their antimicrobial resistance gene profile. T6SS-dependent killing of other bacteria is mostly directed at neighboring cells. The killed cells release their DNA, which is taken up by the killer cells; in the process, the killer cells can integrate valuable genes, including those that encode antimicrobial resistance, which may lead to antimicrobial resistance in the killer cells22.
The results obtained from the SNP-based phylogenetic analysis show the relatedness of the Ugandan V. cholerae strains and are in agreement with the results obtained by Bwire et al., 2018.
The analysis of shared SNPs and the subsequent gene enrichment analysis revealed enrichment in genes that mediate transmembrane-signaling receptor activity, peptidyl-prolyl cis-trans isomerase activity, and phosphorelay response regulator activity. These play a fundamental role in quorum sensing in V. cholerae, a process of cell-cell communication that allows these bacteria to share information about cell density and adjust gene expression accordingly23–25. Quorum sensing has been documented to regulate the expression of virulence factors in V. cholerae23–25.
Despite the fact that bioinformatics capacity remains underdeveloped in Uganda and Africa as a whole, this study demonstrated the ability to apply bioinformatics approaches to zoom into genomes (in this case, V. cholerae genomes obtained from Uganda) to provide a comprehensive genomic analysis. This study sets a stage that encourages more sequencing work with potential public health consequences to be done in African settings. Furthermore, it also encourages the need to build bioinformatics capacity in African settings to enable analysis of whole-genome sequence data generated from the continent.
Whole-genome sequences of the ten Vibrio cholerae isolates from Sequence Read Archive, Accession number SRP136117: https://identifiers.org/insdc.sra/SRP136117
Vibrio cholerae reference genome assembly from NCBI Assembly, Accession number GCF_002892855.1: https://www.ncbi.nlm.nih.gov/assembly/GCF_002892855.1
- Source code available from: https://github.com/gmboowa/extractonlymissenseSNPs-
- Archived source code at the time of publication8: https://doi.org/10.5281/zenodo.3354469
- License: GPL-3.0
Gerald Mboowa is supported through the DELTAS Africa Initiative [DEL15011] to THRiVE-2 (the Training Health Researchers into Vocational Excellence in East Africa). The DELTAS Africa Initiative is an independent funding scheme of the African Academy of Sciences’ (AAS) Alliance for Accelerating Excellence in Science in Africa (AESA) and supported by the New Partnership for Africa’s Development Planning and Coordinating Agency (NEPAD Agency) with funding from the Wellcome Trust [107742] and the UK Government.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Is the work clearly and accurately presented and does it cite the current literature?
Partly
Is the study design appropriate and is the work technically sound?
No
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Partly
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular genetics of Vibrio cholerae with a focus on Type six secretion, natural transformation, quorum sensing as well as expertise in genome sequencing and data analysis.
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
No
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
No
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genome assembly, expression analysis.
Version 1 (02 Aug 19): invited reviewers 1 and 2.