Keywords
Sugarcane genome, DNA sequencing, sequencing reads
This article is included in the Agriculture, Food and Nutrition gateway.
This article is included in the Genomics and Genetics gateway.
This article is included in the Data: Use and Reuse collection.
Sugarcane is an important crop for food and energy production. The genomes of modern cultivars are hybrids of species that are themselves polyploid; see, for example, Vilela et al. (2017). Selected genomic BAC sequences have been sequenced and assembled (de Setta et al., 2014; Okura et al., 2016). Chloroplast and mitochondrial genomes have been published (Asano et al., 2004; Shearman et al., 2016), as have several transcriptomes (Cardoso-Silva et al., 2014). Whole genome sequence assemblies have not been published. CP 96-1252 is the top commercial sugarcane cultivar in Florida, USA (Sandhu & Davidson, 2016). It was developed by USDA-ARS, the University of Florida, and the Florida Sugar Cane League and released to growers in 2003. CP 96-1252 is a complex hybrid of Saccharum officinarum L., S. barberi Jeswiet, S. spontaneum L., and S. sinense Roxb. amend. Jeswiet (Edmé et al., 2005). Toward a better understanding of this cultivar through its genome sequence, DNA reads were generated and made public.
Using lab-grown plantlets, kindly provided by USDA, 14 g of tissue was harvested from the leaves of Saccharum hybrid cultivar CP 96-1252 (Reg. no CV-120, PI 634935, NCBI taxon ID 1983727). DNA was extracted from purified plant nuclei at Amplicon Express (Pullman, WA, USA). Separately, DNA was extracted from whole cells at JCVI (Rockville, MD, USA) using a Qiagen Plant DNA isolation kit. Extracted DNA was fragmented and size selected on the Blue Pippin (Sage Scientific) prior to library construction to ensure a 260 bp insert size. Standard Illumina PE libraries were generated using the NEBNext kit (NEB). Libraries were size selected, quality checked, and quantified by qPCR prior to sequencing. Barcode BS78 AGCCATGC was used for the nuclei prep library and barcode BS79 AGGCTAAC was used for the cell prep library. The libraries were generated and sequenced at the JCVI sequencing core in La Jolla, CA, USA.
To test for bacterial contamination, both DNA samples plus negative controls were used to generate amplicon libraries targeting the V4 16S region, followed by Illumina MiSeq sequencing. These reads were processed by a pipeline using usearch version 8.1.1.1861 for clustering (Edgar, 2017), mothur version 1.36.1 for taxonomic classification (Schloss et al., 2011), and the SILVA SSURef NR99 123 database for reference (Quast et al., 2013). Hits to chloroplast and mitochondria were observed as expected, but bacteria were virtually absent, at levels similar to the controls.
An Illumina NextSeq 500 instrument was used to generate paired 150 bp shotgun reads. Run #1 applied the Illumina High Output kit to libraries BS78 and BS79. Run #1 instrument metrics were: 1.8 pM pool loaded, 1% PhiX spike-in with 1.8% aligned, cluster density 138 K/mm², 96% pass filter, and 106 Gbp in 345 M PE reads. Barcode analysis indicated 46% BS78 and 49% BS79. Run #2 applied the Illumina High Output kit to library BS78 only. Run #2 metrics were: 1.8 pM pool loaded, 1% PhiX spike-in with 1% aligned, and 110 Gbp in 360 M PE reads. The resulting FASTQ files contained 101 Gbp in 161 M pairs from BS78 run #1, 169 M pairs from BS79 run #1, and 341 M pairs from BS78 run #2.
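As a rough cross-check of the run #1 figures, the per-barcode pair counts can be estimated from the reported barcode percentages. The sketch below is illustrative only: it assumes that "345 M PE reads" counts read pairs and that every read is a full 150 bp, and the function names are hypothetical rather than part of any pipeline used by the authors.

```python
def pairs_for_barcode(total_pairs_millions, barcode_percent):
    """Estimate read pairs (in millions) assigned to one barcode."""
    return total_pairs_millions * barcode_percent / 100


def nominal_yield_gbp(pairs_millions, read_length_bp=150):
    """Nominal yield in Gbp, assuming two full-length reads per pair."""
    return pairs_millions * 1e6 * 2 * read_length_bp / 1e9


# Values reported for run #1 (assumption: 345 M is a count of pairs).
RUN1_PAIRS_M = 345

bs78_pairs_m = pairs_for_barcode(RUN1_PAIRS_M, 46)  # nuclei prep library
bs79_pairs_m = pairs_for_barcode(RUN1_PAIRS_M, 49)  # cell prep library
run1_yield_gbp = nominal_yield_gbp(RUN1_PAIRS_M)

print(f"BS78 estimate: {bs78_pairs_m:.1f} M pairs")
print(f"BS79 estimate: {bs79_pairs_m:.1f} M pairs")
print(f"Run #1 nominal yield: {run1_yield_gbp:.1f} Gbp")
```

Under these assumptions the estimates come out near 159 M and 169 M pairs, close to the 161 M and 169 M pairs reported for the FASTQ files; small differences are expected because the barcode percentages are rounded.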
To confirm the sugarcane origin of the DNA, the run #1 reads were mapped to available BACs, namely the 608 Kbp of R570 BACs (GenBank accessions KF184657.1 to KF184973.1; de Setta et al., 2014). Reads were mapped with bowtie2 (Langmead & Salzberg, 2012) version 2.2.5 with options “-p 4 --no-unal --no-mixed --no-discordant --end-to-end --fast”. Both sequencing libraries demonstrated concordant pair mapping rates of 4.1% unique, 27% repeat, and 69% unmapped. Genome coverage analysis was inconclusive; the K-mer frequency distribution computed by Jellyfish (Marçais & Kingsford, 2011) version 2.2.4 with K=17 showed no peak above 1X coverage.
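To illustrate what a K-mer frequency distribution is, the following minimal sketch counts K-mers in a few toy reads and tabulates how many distinct K-mers occur at each multiplicity. It is not the Jellyfish implementation: it counts forward-strand K-mers only, holds everything in memory, and uses made-up reads and a small K. The function name `kmer_spectrum` is hypothetical, and the Jellyfish options used in the study (other than K=17) are not stated in the text.

```python
from collections import Counter


def kmer_spectrum(reads, k=17):
    """Return {multiplicity: number of distinct k-mers seen that many times}."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases.
            if "N" not in kmer:
                counts[kmer] += 1
    # Histogram over the per-k-mer counts, sorted by multiplicity.
    return dict(sorted(Counter(counts.values()).items()))


# Toy reads (invented for illustration) and a small k so repeats are visible.
toy_reads = ["ACGTACGTAC", "CGTACGTACG"]
spectrum = kmer_spectrum(toy_reads, k=4)
for multiplicity, n_kmers in spectrum.items():
    print(multiplicity, n_kmers)
```

In a genome sequenced to sufficient depth, such a histogram typically shows a peak at the multiplicity corresponding to the average coverage; a distribution with no peak above a multiplicity of 1 gives no usable coverage estimate, which matches the inconclusive result described above.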
The data are available at NCBI SRA under BioProject PRJNA345486, Study SRP091668. Amplified reads from BS78 and BS79 have respective accessions SRR5500242 and SRR5500243. Genomic reads from BS78 have accessions SRR5500246 and SRR5500247. Genomic reads from BS79 have accession SRR5500249.
Design of experiment: TBS, RS. Sample preparation: KD, DMH. Amplicons: MGT, KJM. Sequencing: KG, KB. Bioinformatics: GS, JM, DMH. Manuscript: JM.
This work was funded by US Department of Homeland Security (contract HSHQDC-15-C-B0059).
The authors are grateful for assistance from Jack Comstock, Per McCord, and M.D. Islam of USDA-ARS.
Is the rationale for creating the dataset(s) clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of methods and materials provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Yes
Competing Interests: No competing interests were disclosed.
A draft genome sequence assembly, for a different hybrid of sugarcane, appeared shortly after this paper: Riaño-Pachón DM and Mattiello L. Draft genome sequencing of the sugarcane hybrid SP80-3280. F1000Research 2017, 6:861 (doi: 10.12688/f1000research.11859.1)