Genome of Serratia plymuthica UBCF_13, Insight into diverse unique traits [version 1; peer review: awaiting peer review]

Background: Whole genome sequencing has become an essential tool for exploring the potential of microorganisms and for evolutionary studies. Serratia plymuthica UBCF_13 is a phylloplane-associated bacterium that shows antifungal activity. Its complete genome information is therefore needed to develop its potential as a biocontrol agent against plant-pathogenic fungi. Here, we report the genome sequence of Serratia plymuthica UBCF_13 to understand the molecular mechanisms underlying its biocontrol ability. Methods: Continuous short reads were obtained from Illumina sequencing runs, and the 150 bp reads were merged into a single dataset. A pan-genome-based method was used to identify the core-genome of S. plymuthica and the unique genes in UBCF_13. Results: Assembly of the Illumina reads of S. plymuthica strain UBCF_13 produced a 5.46 Mb circular genome sequence. A total of 3321 genes belong to the core-genome shared by the 18 strains evaluated. The UBCF_13 genome harbors 485 unique genes, 300 of which are found only in this strain. Conclusions: The UBCF_13 genome sequence data will contribute to further exploration of the potential of S. plymuthica UBCF_13 as an antibiotic-producing bacterium.


Introduction
Serratia plymuthica bacteria have been isolated from many environmental sources and are found associated with diverse plants [1][2][3][4][5]. Many strains of this species have been reported to inhibit the growth of plant-pathogenic fungi and to stimulate plant growth [6][7][8][9]. UBCF_13 is one strain of this species. It has the ability to inhibit Colletotrichum gloeosporioides, a post-harvest pathogenic fungus that causes anthracnose disease in various plants [10].
Here, we report the complete genome sequence of this bacterium, constructed using Illumina sequencing technology. Our dataset may be useful as a comparative genome for evolutionary and speciation studies, for the analysis of protein-coding RNA and biosynthetic gene clusters, and for further studies such as the regulation of gene expression in relation to the antifungal activity of this bacterium.

Methods
Genomic DNA isolation and sequencing
S. plymuthica strain UBCF_13 was isolated in 2012 from the phylloplane of Brassica juncea L. in the District of Solok, Province of West Sumatera, Indonesia [10]. The bacterium was cultivated in Luria-Bertani (LB) broth at 27°C for 16 hours with shaking at 150 rpm. Genomic DNA was extracted using the method of Chen and Kuo (1993) [11], and residual RNA was then degraded with RNase. Library preparation and sequencing were done by Novogene (Hong Kong). Sequencing was performed using an Illumina NovaSeq 6000 (Illumina NovaSeq 6000 Sequencing System, RRID:SCR_016387).

Genome assembly and annotation
Continuous short reads of 150 bp were merged into a single dataset. The assembly was obtained using a combination of reference-based mapping and de novo assembly performed in Geneious software (Geneious, RRID:SCR_010519) [12]. Annotation for genome submission was carried out using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [13]. The annotated genome sequence of UBCF_13 has been deposited in NCBI GenBank under accession number CP068771.
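As a simple sanity check on an assembled sequence, its length and GC content can be computed directly from a FASTA file. The following is a minimal sketch, not the procedure used in this study; the function names `read_fasta` and `length_and_gc` and the toy record are hypothetical, and only unambiguous A/C/G/T bases are assumed to count toward the GC percentage.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def length_and_gc(sequence):
    """Return (length, GC percentage), counting only A/C/G/T bases (assumption)."""
    counts = {base: sequence.count(base) for base in "ACGT"}
    called = sum(counts.values())
    gc = 100.0 * (counts["G"] + counts["C"]) / called if called else 0.0
    return len(sequence), gc


# Toy record standing in for a real assembly file (hypothetical data).
toy = StringIO(">toy_contig\nGGCCAT\nATGC\n")
for name, seq in read_fasta(toy):
    size, gc = length_and_gc(seq)
    print(f"{name}\t{size} bp\tGC {gc:.1f}%")
```

For a real assembly, the `StringIO` object would be replaced by an opened FASTA file.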

Comparative genomics of Serratia plymuthica strains
Comparative genomic analysis was carried out using the genome sequence of UBCF_13 from this study and 17 whole-genome sequences of other Serratia plymuthica strains retrieved from NCBI's GenBank. The genomes were reannotated using Prokka (Galaxy Version 1.14.6+galaxy0) (Prokka, RRID:SCR_014732) [14,15]. Genes shared between the strains and the 'presence-absence gene set' were identified using Roary (Galaxy Version 3.13.0+galaxy1) (Roary, RRID:SCR_018172) [16] with a similarity threshold of 70%. Genes present in all strains constitute the core-genome. Phylogenetic trees were constructed using RAxML (maximum-likelihood-based inference of large phylogenetic trees; Galaxy Version 8.2.4+galaxy2) (RAxML, RRID:SCR_006086) [17], based on a multiple alignment of the concatenated core-genome. Phandango (Phandango, RRID:SCR_015243) [18] was used to view the resulting output graphs.
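The definition of the core-genome given above can be illustrated with a short sketch that reads a gene presence-absence matrix and separates genes present in every strain from genes present in only one focal strain. This is an illustration under stated assumptions, not the pipeline used here: the input is assumed to be a tab-separated 0/1 table with a `Gene` column followed by one column per strain, the function name `classify_genes` is hypothetical, and the toy table is invented.

```python
import csv
from io import StringIO


def classify_genes(handle, focal_strain):
    """Split gene clusters into core genes and genes found only in focal_strain.

    Assumes a tab-separated table: first column is the gene name, remaining
    columns are strains, and each cell is 1 (present) or 0 (absent).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    strains = header[1:]
    focal = strains.index(focal_strain)
    core, focal_only = [], []
    for row in reader:
        if not row:
            continue
        gene, flags = row[0], [int(x) for x in row[1:]]
        if all(flags):
            # Present in every strain: part of the core-genome.
            core.append(gene)
        elif flags[focal] == 1 and sum(flags) == 1:
            # Present in the focal strain and nowhere else.
            focal_only.append(gene)
    return core, focal_only


# Invented toy matrix; strain names are borrowed from the text for readability.
toy_table = StringIO(
    "Gene\tUBCF_13\tAS9\tPRI-2C\n"
    "geneA\t1\t1\t1\n"
    "geneB\t1\t0\t0\n"
    "geneC\t0\t1\t1\n"
    "geneD\t1\t1\t0\n"
)
core, only = classify_genes(toy_table, "UBCF_13")
print("core:", core)
print("UBCF_13 only:", only)
```

In this toy table, `geneA` is a core gene and `geneB` is found only in UBCF_13; `geneC` and `geneD` belong to neither set.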

Clusters of orthologous groups of UBCF_13
The translated protein-coding genes of UBCF_13 were used to identify clusters of orthologous groups (COG). Assignments were obtained using NCBI BLAST+ rpsblast (Galaxy Version 2.10.1+galaxy0) [19] and eggNOG Mapper (Galaxy Version 2.0.1+galaxy1) (eggNOG, RRID:SCR_002456) [20]. The COG assignments were classified according to the categories in the NCBI COG database [21].
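The classification step can be sketched as a tally of COG category letters per gene. This is a minimal illustration only: the function name `tally_cog_categories` and the toy assignments are hypothetical, a gene annotated with several letters (for example `KT`) is assumed to count once in each category, and an empty string or `-` is assumed to mean that no category was assigned.

```python
from collections import Counter


def tally_cog_categories(assignments):
    """Count genes per COG category letter.

    `assignments` maps a gene identifier to a string of one or more
    single-letter COG categories. Multi-letter strings contribute to
    every listed category (assumption).
    """
    counts = Counter()
    for gene, letters in assignments.items():
        letters = letters.strip()
        if not letters or letters == "-":
            counts["unassigned"] += 1
            continue
        for letter in letters:
            counts[letter] += 1
    return counts


# Invented toy assignments (K: transcription, T: signal transduction,
# E: amino acid transport and metabolism).
toy = {"gene1": "K", "gene2": "KT", "gene3": "-", "gene4": "E"}
counts = tally_cog_categories(toy)
for category, n in sorted(counts.items()):
    print(category, n)
```

Because multi-letter assignments are counted in each category, the category totals can exceed the number of genes.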

Results and discussion
Comparative genomics of Serratia plymuthica strains
The whole genome sequencing reads of Serratia plymuthica UBCF_13 were assembled into a single circular 5.46 Mb chromosome with an overall GC content of 56.2% (Table 1). S. plymuthica genomes range in size from 5.40 to 5.70 Mb, with a GC content of 55.70-56.60%. Genome reannotation with Prokka found a different number of CDS in each S. plymuthica genome (Table 1). All of the compared S. plymuthica genomes shared a highly conserved genomic architecture, as inferred from synteny of protein-coding orthologs. Figure 1A shows the phylogenetic tree of the 18 S. plymuthica strains. The tree places S. plymuthica UBCF_13 in the same cluster as strains AS9, PRI-2C, NCTC8015, and NCTC8900. Strain PRI-2C has been reclassified and transferred to the species S. inhibens [22]. The pan-genome analysis was performed together with the other strains to gain further insight into features specific to UBCF_13. A total of 3315 genes belong to the core-genome shared by the 18 strains evaluated. The genome of UBCF_13 harbors 488 unique genes, of which 300 are found only in this strain. The presence-absence gene set is shown in supplementary data 1.

The Cluster of Orthologous Groups of UBCF_13
Functional categories of the CDS in S. plymuthica UBCF_13 based on the Cluster of Orthologous Groups (COG) categories are shown in Table 2. The list of UBCF_13 COGs and their functional classification based on the COG database is shown in extended dataset 2 [23].