Research Article

Whole-genome sequence analysis of Vibrio cholerae from three outbreaks in Uganda, 2014 - 2016

[version 1; peer review: 2 not approved]
PUBLISHED 02 Aug 2019

This article is included in the Pathogens gateway.

This article is included in the Antimicrobial Resistance collection.

Abstract

Background: Cholera remains a serious public health problem in Uganda and Africa. The aim of this study was to provide the complete array of antimicrobial resistance genes, integrative and conjugative elements, virulence genes, pathogenicity islands, plasmids, and insertion sequences in the strains. In addition, this study also aimed to provide a single nucleotide polymorphism (SNP) based phylogenetic analysis of the strains.
Methods: Both Linux-based and web-based bioinformatics approaches were used to analyze the study sequences. Tools and databases used included FastQC, MultiQC, Snippy, PANTHER, PATRIC, Unicycler, ISfinder, the Center for Genomic Epidemiology pipelines (i.e. MLST, PlasmidFinder, MyDbFinder, and ResFinder), Mashtree, and IcyTree.
Results: The 10 sequenced strains of Vibrio cholerae were found to carry virulence-associated genes including makA, ctxA, ctxB, carA, carB, trpB, clpB, ace, toxR, zot, rtxA, ompW, ompR, gmhA, fur, hlyA, and rstR. Also identified were: genes of the Type VI secretion system including vasA-L, vgrG-2, vgrG-3, vipA/mglA, and vipB/mglB; alsD (VC1589), involved in the synthesis of 2,3-butanediol; alsR, involved in acetate-responsive LysR-type regulation; makA, the flagella-mediated cytotoxin gene; Type IV pilus genes including tcpA-F, tcpH-J, tcpN, tcpP-T, and icmF/vasK; adherence genes acfA-D and IlpA; and quorum sensing system genes luxS and cqsA. The pathogenicity islands identified comprised VSP-1 and VSP-2, as well as VPI-1 and VPI-2. In addition, strA, strB, APH(3'')-I, APH(3'')-Ib, APH(6)-Id, APH(6)-Ic, murA, parE, dfrA1, floR, catB, and catB9 were among the antimicrobial resistance genes found in the sequences. Analysis for SNPs shared among the sequences showed that the sequenced strains shared 218 SNPs, of which 98 were missense. Gene enrichment analysis of these SNPs showed enrichment in genes that mediate transmembrane-signaling receptor activity, peptidyl-prolyl cis-trans isomerase activity, and phosphorelay response regulator activity.
Conclusions: This study applied bioinformatics approaches to provide comprehensive genomic analysis of V. cholerae genomes obtained from Uganda.

Keywords

Vibrio cholerae, Whole-genome sequencing, Bioinformatics, Genomics

Introduction

Cholera remains a serious public health problem in Uganda and Africa as a whole [1,2]. It is characterized by a large disease burden, recurrent outbreaks, high case fatality rates, and tenacious endemicity [1,2]. Over the last four decades, Uganda has experienced several cholera outbreaks [2]. The detection, monitoring, and surveillance of cholera in Uganda rely upon the isolation of Vibrio cholerae using culture-based methods in microbiology laboratories [2,3]. However, these methods face several challenges, including long turn-around times (24–48 hours); limited microbiology laboratory capacity; lack of laboratory supplies; poor laboratory infrastructure, particularly the electricity necessary to operate laboratory equipment; and a limited supply of reliable and rapid diagnostic tests [2]. Unlike culture-based methods, high-throughput sequencing, a culture-independent method, has been documented to be less affected by most of these challenges, while providing an unprecedented view of pathogen biology and delivering high-resolution genomic epidemiology via rapid and cheap whole-genome sequencing [4–7]. Despite this knowledge, sequencing remains a less desirable option for most scientists in Uganda; hence, data on the genomic characteristics of V. cholerae remain scarce, likely because of underdeveloped bioinformatics capacity and a lack of the expertise necessary for analyzing whole-genome sequencing data [7]. This study set out to use bioinformatics approaches to analyze whole-genome sequence data obtained from V. cholerae isolates from different outbreaks in Uganda, with the aim of providing the complete array of virulence genes, pathogenicity islands, antimicrobial resistance genes, integrative and conjugative elements, and antimicrobial resistance genes associated with these elements, plasmids, and insertion sequences. In addition, this study provided a single nucleotide polymorphism (SNP) based phylogenetic analysis of the strains.

Methods

Study design

This was a cross-sectional study that analyzed 10 whole-genome sequences of V. cholerae isolates. These isolates were collected during three different cholera outbreaks in Uganda between 2014 and 2016 and sequenced by a group from the University of Maryland (Bwire et al., 2018). The whole-genome sequencing data were deposited in NCBI’s Sequence Read Archive (SRA) under accession number SRP136117.

Sample collection and whole-genome sequencing

Procedures and considerations in sample collection and whole-genome sequencing are described by Bwire et al., 2018. Briefly, whole-genome sequencing was done using three or four representative samples obtained from each of the three Multiple-Locus Variable Number Tandem-Repeat Analysis (MLVA) clonal complexes that had been identified during the period 2014–2016. Libraries were prepared from fragmented DNA using the KAPA High Throughput Library Preparation Kit (Millipore-Sigma, St. Louis, MO). Following enrichment and barcoding, the libraries were sequenced using a 100 bp paired-end run on an Illumina HiSeq2500 (Illumina, San Diego, CA).

Bioinformatics workflow

Whole-genome sequencing data for the 10 V. cholerae isolates were downloaded from NCBI’s SRA using fastq-dump from the SRA Toolkit v2.9.3. An overview of the bioinformatics workflow adopted in this study is provided in Figure 1.


Figure 1. Outline of the bioinformatics workflow.

IS, insertion sequences; MLST, multilocus sequence typing; AMR, antimicrobial resistance; SNP, single nucleotide polymorphism; PATRIC, Pathosystems Resource Integration Center.
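The download step can be sketched as follows. This is a minimal illustration, not the authors' script: the run accessions are those listed in Table 1, while the `--split-files`, `--gzip`, and `--outdir` options and the `raw_reads` directory name are assumptions chosen for paired-end data. The sketch only prints the commands rather than executing them.

```python
import shlex

# Run accessions as listed in Table 1 of this article (study SRP136117).
RUN_ACCESSIONS = [
    "SRR6871245", "SRR6871246", "SRR6871247", "SRR6871248", "SRR6871249",
    "SRR6871250", "SRR6871251", "SRR6871252", "SRR6871253", "SRR6871254",
]


def fastq_dump_command(accession, outdir="raw_reads"):
    """Build a fastq-dump call that writes gzipped, per-mate FASTQ files.

    The flags and the output directory are assumptions, not the authors' settings.
    """
    return ["fastq-dump", "--split-files", "--gzip", "--outdir", outdir, accession]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for accession in RUN_ACCESSIONS:
        print(shlex.join(fastq_dump_command(accession)))
```

Each printed line can be pasted into a shell where the SRA Toolkit is installed, or passed to `subprocess.run` as a list.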

Quality control of untrimmed sequence data

Untrimmed sequence data quality reports were generated with FastQC v0.11.8 and MultiQC v1.7 using default settings.
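As an illustration of this step, the sketch below builds a FastQC call for one pair of read files and a MultiQC call that aggregates the resulting reports. The file names follow the `<accession>_1.fastq.gz` pattern that `fastq-dump --split-files --gzip` would produce, and the `raw_reads` and `qc_reports` directories are hypothetical; only the output directory is set, leaving all other settings at their defaults.

```python
from pathlib import Path


def fastqc_command(fastq_files, outdir="qc_reports"):
    """Build a FastQC call with default settings apart from the output directory."""
    return ["fastqc", "--outdir", outdir, *map(str, fastq_files)]


def multiqc_command(report_dir="qc_reports"):
    """Build a MultiQC call that scans report_dir and writes its summary there."""
    return ["multiqc", report_dir, "--outdir", report_dir]


# Hypothetical paths for one isolate's paired-end reads.
reads = [Path("raw_reads") / f"SRR6871251_{mate}.fastq.gz" for mate in (1, 2)]
print(" ".join(fastqc_command(reads)))
print(" ".join(multiqc_command()))
```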

Bacterial variant calling and gene ontology enrichment analysis

Bacterial SNP calling was done using Snippy 3.2-dev, a tool for rapid haploid variant calling and core genome alignment. A V. cholerae genome assembly (accession number GCF_002892855.1) was obtained from NCBI and used as the reference during variant calling. We then used BCFtools v1.9 to extract all SNPs that were shared by the 10 V. cholerae isolates. Custom bash scripts, available on GitHub (see Software availability) [8], were used to extract only the missense (nonsynonymous) SNPs. Gene ontology enrichment analysis was performed using the PANTHER Overrepresentation Test (released 2019-06-06), PANTHER annotation version 14.1 (released 2019-03-12), and a V. cholerae reference list. Biological process and molecular function enrichment analyses were also carried out using the same database.
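The logic of extracting shared and missense SNPs can be sketched in plain Python. This is not the authors' BCFtools command or bash script: it assumes one VCF per isolate, treats a SNP as shared when the same position, reference base, and alternate base occur in every isolate, and assumes that the INFO column carries a snpEff-style annotation containing the term `missense_variant`. The two toy VCFs are invented for illustration.

```python
def read_snps(vcf_text):
    """Return {(chrom, pos, ref, alt): info} for single-base substitutions."""
    snps = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.split("\t")[:8]
        if len(ref) == 1 and len(alt) == 1:  # ignore indels and multi-allelic sites
            snps[(chrom, int(pos), ref, alt)] = info
    return snps


def shared_snps(per_isolate):
    """Keep only the SNPs that are present in every isolate."""
    common = set.intersection(*(set(snps) for snps in per_isolate))
    first = per_isolate[0]
    return {key: first[key] for key in sorted(common)}


def missense_only(snps):
    """Keep SNPs whose annotation mentions a missense effect (assumed ANN format)."""
    return {key: info for key, info in snps.items() if "missense_variant" in info}


# Toy VCF records (hypothetical data, tab-separated as in a real VCF).
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
ISOLATE_A = "\n".join([
    "##fileformat=VCFv4.2",
    HEADER,
    "chr1\t100\t.\tA\tG\t.\t.\tANN=G|missense_variant",
    "chr1\t200\t.\tC\tT\t.\t.\tANN=T|synonymous_variant",
])
ISOLATE_B = "\n".join([
    "##fileformat=VCFv4.2",
    HEADER,
    "chr1\t100\t.\tA\tG\t.\t.\tANN=G|missense_variant",
    "chr1\t200\t.\tC\tT\t.\t.\tANN=T|synonymous_variant",
    "chr1\t300\t.\tG\tA\t.\t.\tANN=A|missense_variant",
])

shared = shared_snps([read_snps(ISOLATE_A), read_snps(ISOLATE_B)])
print(len(shared), "shared SNPs;", len(missense_only(shared)), "missense")
```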

Bacterial genome assembly and annotation

V. cholerae genomic reads were assembled using Unicycler v0.4.8-beta [9] to generate contigs. The Pathosystems Resource Integration Center (PATRIC) v3.5.39 was used to annotate the assembled genomes.
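A short-read-only Unicycler run for one isolate might look like the sketch below. The article does not report the exact options, so the thread count, directory layout, and file names here are assumptions; `-1`, `-2`, `-o`, and `-t` are Unicycler's options for the two read files, the output directory, and the number of threads.

```python
import shutil
import subprocess


def unicycler_command(reads_1, reads_2, outdir, threads=4):
    """Build a Unicycler call for an Illumina-only (short-read) assembly."""
    return ["unicycler", "-1", reads_1, "-2", reads_2, "-o", outdir, "-t", str(threads)]


def assemble(accession, read_dir="raw_reads", assembly_dir="assemblies"):
    """Run Unicycler for one accession if the program is installed."""
    cmd = unicycler_command(
        f"{read_dir}/{accession}_1.fastq.gz",
        f"{read_dir}/{accession}_2.fastq.gz",
        f"{assembly_dir}/{accession}",
    )
    if shutil.which("unicycler") is None:
        # Fall back to showing the command when Unicycler is not on the PATH.
        print("unicycler not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)
```

Calling `assemble("SRR6871251")` would write the contigs to `assemblies/SRR6871251/assembly.fasta`, the file name Unicycler uses for its final assembly.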

Identification of antimicrobial resistance genes, virulence genes, insertion sequences, integrative and conjugative elements, pathogenicity islands, and plasmids

PATRIC v3.5.39 was used to generate genome assembly metrics and to identify antimicrobial resistance genes and virulence factors. We used ISfinder, a dedicated database for bacterial insertion sequences, to screen for the presence of insertion sequences in our assembled bacterial genomes. In addition, we analyzed the assembled genomes using several pipelines at the Center for Genomic Epidemiology. These analyses were multilocus sequence typing (MLST) using MLST v2.0, plasmid searches using PlasmidFinder v2.0, phenotyping based on BLAST searches against the V. cholerae database using MyDbFinder v1.2, and identification of acquired antibiotic resistance genes using ResFinder. For all the above pipelines, default settings were used.
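The article does not list which assembly metrics were taken from PATRIC beyond those in Table 1. As an illustration of one commonly reported metric, the sketch below computes N50, the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. The contig lengths are invented for the example.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(contig_lengths) / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, not taken from the study assemblies.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print("N50:", n50(example_lengths))
```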

SNP-based phylogenetic analysis

Using the Mashtree command-line based tool, we generated a Newick file. This file was then uploaded to IcyTree, a browser-based phylogenetic tree viewer.
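Mashtree builds its tree from Mash (MinHash) distances between genomes rather than from a SNP alignment. The sketch below illustrates the Mash distance formula, D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard index of the two k-mer sets. It is a simplification: it computes the exact Jaccard index on all k-mers instead of estimating it from a MinHash sketch, it does not use canonical (strand-independent) k-mers, and the sequences are toy strings.

```python
import math


def kmers(sequence, k):
    """Return the set of all k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def mash_distance(seq_a, seq_b, k=21):
    """Mash distance from the exact Jaccard index of two k-mer sets."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    union = set_a | set_b
    if not union:
        raise ValueError("sequences must be at least k bases long")
    jaccard = len(set_a & set_b) / len(union)
    if jaccard == 0:
        return 1.0  # no shared k-mers: distance is capped at 1
    return -math.log(2 * jaccard / (1 + jaccard)) / k


# Toy sequences with a small k so that the example stays readable.
print(mash_distance("ACGTACGTAC", "ACGTACGTTT", k=4))
```

A tree-building method such as neighbor joining can then be applied to the matrix of pairwise distances; that step is omitted here.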

Results

Genomic characterization of the V. cholerae strains

The 10 sequenced strains of V. cholerae, belonging to the Inaba and Ogawa serotypes, were characterized through analysis of the whole-genome sequencing data. Except for one strain, which was non-O1, all strains belonged to serogroup O1, as indicated by the presence of the rfbV-O1 gene. All 10 sequenced strains had the biotype-specific genes ctxB, rstR, and tcpA; hence, all were atypical El Tor biotype variants of V. cholerae, and they belonged to the third wave of the seventh pandemic. In silico MLST revealed that the sequenced strains belonged to two different sequence types (STs): ST69 and ST515. Table 1 shows the genomic characteristics of the V. cholerae strains.

Table 1. Biosample data, serotype and genomic sequence data of the V. cholerae strains.

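| Isolate ID | Serotype | Year of collection | Month of collection | District of origin | Coverage | Genome size (bp) | No. of contigs | No. of coding sequences |
|---|---|---|---|---|---|---|---|---|
| SRR6871251 | Inaba | 2014 | Apr | Arua | 155 | 4025190 | 88 | 3733 |
| SRR6871254 | Inaba | 2014 | May | Moyo | 240 | 4025367 | 85 | 3743 |
| SRR6871247 | Ogawa | 2015 | Apr | Kasese | 200 | 4012560 | 88 | 3697 |
| SRR6871245 | Inaba | 2015 | Apr | Kasese | 260 | 4030572 | 96 | 3734 |
| SRR6871253 | Inaba | 2015 | Jul | Arua | 200 | 4024003 | 89 | 3743 |
| SRR6871248 | Inaba | 2015 | May | Kasese | 260 | 4034890 | 102 | 3745 |
| SRR6871246 | Inaba | 2015 | Sep | Hoima | 250 | 4032599 | 90 | 3734 |
| SRR6871252 | Inaba | 2015 | Sep | Hoima | 200 | 4025376 | 87 | 3731 |
| SRR6871249 | Ogawa | 2016 | Jan | Mbale | 300 | 3998859 | 82 | 3688 |
| SRR6871250 | Ogawa | 2016 | Jan | Mbale | 250 | 4011573 | 89 | 3700 |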

In addition, all 10 sequenced strains were found to carry virulence-associated genes. These included makA, ctxA, ctxB, carA, carB, trpB, clpB, ace, toxR, zot, rtxA, ompW, ompR, gmhA, fur, hlyA, and rstR. The 10 sequenced strains also carried genes for the Type VI secretion system (T6SS), including vasA, B, C, D, E, F, G, H, I, J, K, and L, vgrG-2, vgrG-3, vipA/mglA, and vipB/mglB. The strains were also found to have the following genes: alsD (VC1589), involved in the synthesis of 2,3-butanediol; alsR, involved in acetate-responsive LysR-type regulation; makA, the flagella-mediated cytotoxin gene; Type IV pilus genes, including tcpA, B, C, D, E, F, H, I, J, N, P, Q, R, S, and T, as well as icmF/vasK; adherence genes acfA, B, C, D, and IlpA; and quorum sensing system genes luxS and cqsA. Pathogenicity islands were also present in all 10 sequenced strains, namely the Vibrio seventh pandemic islands VSP-1 and VSP-2, as well as VPI-1 and VPI-2.

Genotypic antimicrobial resistance and mobile genetic elements

The 10 sequenced strains all showed genotypic resistance to streptomycin, aminoglycosides, fosfomycin, fluoroquinolones, sulphonamides, trimethoprim, chloramphenicol/florfenicol, and tetracyclines, as illustrated in Table 2.

Table 2. Antimicrobial resistance genes in the V. cholerae strains.

| Antibiotic category | Genes associated with resistance | Resistance phenotype |
|---|---|---|
| Tetracycline | Tet (35) | Tetracycline resistance |
| Sulphonamides | sul2 | Sulphonamide resistance |
| Trimethoprim | dfrA1 | Trimethoprim resistance |
| Phenicols | catB9, catB, floR | Chloramphenicol resistance |
| Fosfomycin | MurA | Fosfomycin resistance |
| Fluoroquinolone | ParE | Fluoroquinolone resistance |
| Aminoglycoside | APH(3'')-I, APH(3'')-Ib, APH(6)-Id, APH(6)-Ic | Aminoglycoside resistance |

Furthermore, all 10 sequenced strains contained the VC1786 integrative and conjugative element (VC1786ICE) genes. Antimicrobial resistance genes usually found on VC1786ICE and associated with resistance to chloramphenicol, streptomycin, sulfamethoxazole, and trimethoprim, such as strA, strB, floR, and sul2, were present in the strains according to MyDbFinder 1.2. The sequenced strains were also found to have a genomic organization of the integrative and conjugative element similar to that of the V. cholerae ICEVchHai1 reference strain [10].

In addition, none of the sequenced strains carried plasmids according to PlasmidFinder 2.0. In particular, IncA/C plasmids were absent, and the cryptic plasmids pSDH1-2 were also absent from all the strains according to MyDbFinder 1.2 and the BLAST atlas.

Insertion sequences were, however, present in all the sequenced strains; these included IS200/IS605, IS630, IS66, IS3, and IS4 (Table 3).

Table 3. Insertion sequences in the V. cholerae strains.

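| Isolate ID | IS200/IS605 | IS630 | IS66 | IS3 | IS4 |
|---|---|---|---|---|---|
| SRR6871251 | + | + | + | - | - |
| SRR6871254 | + | + | + | - | - |
| SRR6871247 | + | + | + | - | - |
| SRR6871245 | + | + | + | - | - |
| SRR6871253 | + | + | + | - | - |
| SRR6871248 | + | + | + | - | - |
| SRR6871246 | - | - | + | + | + |
| SRR6871252 | + | + | + | - | - |
| SRR6871249 | + | + | + | - | - |
| SRR6871250 | + | + | + | + | - |

Key: +, present; -, absent.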

Phylogenetic comparison of the V. cholerae strains

SNP-based phylogenetic analysis showed an overall SNP difference of 120 among the 10 sequenced V. cholerae strains. Close relatedness was observed among strains SRR6871252, SRR6871253, and SRR6871254 (only seven SNP differences); strains SRR6871247, SRR6871249, and SRR6871250 (only four SNP differences), and among strains SRR6871245, SRR6871246, and SRR6871248 (only six SNP differences) (Figure 2).


Figure 2. Single nucleotide polymorphism-tree showing the phylogenetic relationship among the V. cholerae strains.
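Pairwise SNP differences of the kind quoted above are typically counted over an alignment of equal-length sequences. The sketch below shows that counting step on a toy alignment; the isolate names and sequences are hypothetical, and positions where either sequence has a gap or an ambiguous base are skipped, which is one possible convention rather than the one used in the study.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different A/C/G/T bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def pairwise_snp_distances(alignment):
    """Return {(name_1, name_2): distance} for every pair in the alignment."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Toy alignment with hypothetical isolate names.
toy_alignment = {"iso1": "ACGTAC", "iso2": "ACGTAT", "iso3": "TCGTAT"}
for pair, distance in pairwise_snp_distances(toy_alignment).items():
    print(pair, distance)
```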

Furthermore, analysis for shared SNPs among the sequences showed that the sequenced strains shared 218 SNPs. Of these, 98 SNPs were missense (non-synonymous). Gene enrichment analysis of the SNPs using the PANTHER Gene Ontology database showed enrichment in genes that mediate transmembrane-signaling receptor activity, peptidyl-prolyl cis-trans isomerase activity, and phosphorelay response regulator activity.
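The idea behind an over-representation test can be sketched with a one-sided hypergeometric tail probability: given N reference genes of which K carry a GO term, how likely is it to see at least k annotated genes in a study list of n genes by chance? The statistical settings used in PANTHER are not stated in the article, so this is an illustration of the general approach, and the counts below are invented.

```python
from math import comb


def enrichment_p_value(n_reference, n_annotated, n_study, n_study_annotated):
    """P(X >= n_study_annotated) for X ~ Hypergeometric(N, K, n)."""
    upper = min(n_annotated, n_study)
    total = comb(n_reference, n_study)
    tail = sum(
        comb(n_annotated, i) * comb(n_reference - n_annotated, n_study - i)
        for i in range(n_study_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 reference genes, 5 annotated with the term,
# and a study list of 5 genes that all carry the term.
print(enrichment_p_value(10, 5, 5, 5))
```

In practice, one such p-value is computed per GO term and then corrected for multiple testing.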

Discussion

This was a cross-sectional study that aimed at providing a comprehensive genomic analysis of 10 whole-genome sequences of V. cholerae isolates collected during three different cholera outbreaks in Uganda between 2014 and 2016, submitted by the University of Maryland (Baltimore, MD, United States) to the NCBI SRA database under a study titled, “Molecular characterization of Vibrio cholerae responsible for cholera epidemics in Uganda by PCR, MLVA and WGS”.

In the genomic analysis, this study confirmed the identity of the isolates, provided the complete array of virulence genes, pathogenicity islands, antimicrobial resistance genes, integrative and conjugative elements, and antimicrobial resistance genes associated with these elements, plasmids, and insertion sequences. In addition, this study also provided a SNP-based phylogenetic analysis of the strains.

The identity of the isolates was consistent with what was reported by Bwire et al., 2018. In addition, the finding that most of the isolates belonged to the O1 serogroup is consistent with studies elsewhere, which attributed this to the presence of the rfbV-O1 gene in isolates classified as O1 [5]. Unlike a similar study done in the East African region [5], in which MLST revealed that the isolates belonged to a single ST (ST69), this study revealed that the isolates belonged to two STs, namely ST69 and ST515. Strains of V. cholerae belonging to ST515 have been reported elsewhere [11].

The virulence genes reported in this study are similar to those reported in studies elsewhere [5,12–14]. These genes included, among others, those belonging to the Type VI secretion system, those involved in adherence, Type IV pilus genes, and those involved in quorum sensing.

Accessory genetic elements, particularly pathogenicity islands previously reported to commonly occur in V. cholerae, were also reported in this study, namely VSP-1 and VSP-2 as well as VPI-1 and VPI-2 [15,16]. These have not only been documented to encode virulence-associated genes in V. cholerae, but have also been reported to facilitate a better understanding of the evolutionary events that lead to the emergence of pathogenic V. cholerae clones [15,16].

Although World Health Organization (WHO) recommendations for the management of cholera include oral rehydration salts together with antibiotics such as streptomycin, aminoglycosides, trimethoprim, fosfomycin, fluoroquinolones, sulphonamides, chloramphenicol/florfenicol, and tetracyclines, this study reports genotypic resistance in the isolates to these same antibiotics. Similar resistance has been reported in comparable studies from the East African region and elsewhere [5,17–19].

The presence of integrative and conjugative elements (VC1786ICE) containing genes associated with resistance to sulfamethoxazole, trimethoprim, chloramphenicol, and streptomycin is also reported in this study. These elements are genomically similar to V. cholerae ICEVchHai1 [17,19].

This study found no plasmids or intI genes. This could be attributed to the presence of integrative and conjugative elements, which may have made plasmids and integrons insignificant for encoding antimicrobial resistance in these strains. Similar studies have reported comparable findings [5].

This study also reported the presence of the insertion sequences IS200/IS605, IS630, IS66, IS3, and IS4. Insertion sequences have been described in various studies as drivers of genetic variability. These studies also suggest that they become fixed by natural selection each time a mutation induced by these elements is selected [20,21].

The presence of T6SS genes in the study isolates could explain their antimicrobial resistance gene profile. T6SS-dependent killing of other bacteria is mostly directed at neighboring cells. The killed cells release their DNA, which is taken up by the killer cells; in the process, the killer cells can integrate valuable genes, including those that encode antimicrobial resistance, and this may lead to antimicrobial resistance in the killer cells [22].

The results obtained from the SNP-based phylogenetic analysis show the relatedness of the Ugandan V. cholerae strains and are in agreement with the results obtained by Bwire et al., 2018.

The analysis for shared SNPs among the sequences and the subsequent gene enrichment analysis revealed enrichment in genes that mediate transmembrane-signaling receptor activity, peptidyl-prolyl cis-trans isomerase activity, and phosphorelay response regulator activity. These play a fundamental role in quorum sensing in V. cholerae, a process of cell-cell communication that allows these bacteria to share information about cell density and adjust gene expression accordingly [23–25]. Quorum sensing has been documented to regulate the expression of virulence factors in V. cholerae [23–25].

Conclusions

Despite the fact that bioinformatics capacity remains underdeveloped in Uganda and Africa as a whole, this study demonstrated the ability to apply bioinformatics approaches to zoom into genomes (in this case, V. cholerae genomes obtained from Uganda) to provide a comprehensive genomic analysis. This study sets a stage that encourages more sequencing work with potential public health consequences to be done in African settings. Furthermore, it also encourages the need to build bioinformatics capacity in African settings to enable analysis of whole-genome sequence data generated from the continent.

Data availability

Underlying data

Whole-genome sequences of the ten Vibrio cholerae isolates from Sequence Read Archive, Accession number SRP136117: https://identifiers.org/insdc.sra/SRP136117

Vibrio cholerae reference genome assembly from NCBI Assembly, Accession number GCF_002892855.1: https://www.ncbi.nlm.nih.gov/assembly/GCF_002892855.1

Software availability

how to cite this article
Aruhomukama D, Sserwadda I and Mboowa G. Whole-genome sequence analysis of Vibrio cholerae from three outbreaks in Uganda, 2014 - 2016 [version 1; peer review: 2 not approved]. F1000Research 2019, 8:1340 (https://doi.org/10.12688/f1000research.20048.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
Open Peer Review

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 18 Sep 2019
Samit Watve, Sackler School of Graduate Biomedical Sciences, Tufts University, Medford, MA, USA
Not Approved
General comments:

This study is tough to review because there is no clear scientific question being addressed and also lacks particularly novel or interesting findings. The authors claim, “The aim of this study was to provide the ... Continue reading
Watve S. Reviewer Report For: Whole-genome sequence analysis of Vibrio cholerae from three outbreaks in Uganda, 2014 - 2016 [version 1; peer review: 2 not approved]. F1000Research 2019, 8:1340 (https://doi.org/10.5256/f1000research.22012.r53477)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Reviewer Report 19 Aug 2019
Jason Miller, College of Natural Sciences and Mathematics, Shepherd University, Shepherdstown, WV, USA 
Not Approved
Summary of the Manuscript: Genome assemblies of ten strains of Vibrio cholerae were generated from public sequence data. The assembled contigs were analyzed for gene content, insertion sequence content, SNPs content, etc.
 
Summary of the Review: This ... Continue reading
Miller J. Reviewer Report For: Whole-genome sequence analysis of Vibrio cholerae from three outbreaks in Uganda, 2014 - 2016 [version 1; peer review: 2 not approved]. F1000Research 2019, 8:1340 (https://doi.org/10.5256/f1000research.22012.r52002)