Keywords
Abiotic stress, Genome assembly, Halophyte, Nanopore, NGS, Salt stress, Wild Oryza, Whole genome sequencing
This article is included in the Nanopore Analysis gateway.
This article is included in the Genomics and Genetics gateway.
This article is included in the Agriculture, Food and Nutrition gateway.
Soil salinity is a major abiotic stress for rice cultivation worldwide (Molla et al., 2015), and the rice-growing area affected by soil salinity is gradually increasing. The genetic potential for salt tolerance that exists within natural rice populations has been largely exploited, and alternative useful alleles may enhance salinity tolerance further. Wild species are a potential source of many useful genes and QTLs that may not be present in the gene pool of the domesticated species.
Oryza coarctata, known as Asian wild rice, grows naturally in the coastal regions of South-East Asian countries. It flowers and sets seed in saline soil with an electrical conductivity (ECe) as high as 40 dS m-1 (Bal & Dutt, 1986). It is the only halophyte in the genus Oryza. However, apart from one transcriptomic study (Garg et al., 2014) and one miRNA study (Mondal et al., 2014), no large-scale genomic resource is available for this important species, although several pinitol biosynthesis pathway genes have been cloned for functional genomics studies (Sengupta & Majumder, 2009).
The plants were collected from their native habitat, the Sundarban delta of West Bengal, India (21°36′N, 88°15′E), and established in our institute's net house. To determine the genome size, 20 mg of young leaf tissue from net house-grown plants was chopped into small pieces and stained with propidium iodide (50 μg/ml) containing RNase (BD Science, India), following the protocol of Dolezel et al. (2007). The samples were filtered through a 40-μm mesh sieve (Corning, USA) before analysis on a BD FACSCalibur flow cytometer (BD Biosciences, San Jose, CA, USA). Pisum sativum leaf was used as the standard for calculating the genome size.
High-quality genomic DNA was extracted from 100 mg of young leaf tissue using the CTAB method (Ganie et al., 2016) for the preparation of the genomic DNA libraries. Sequencing was carried out on an Illumina 4000 GA IIx sequencer (San Diego, CA, USA) using 150-bp paired-end libraries and four mate-pair libraries (also sequenced as 150-bp paired ends) with average insert sizes of 2, 4, 6 and 10 kb. In addition, we used third-generation sequencing (Oxford Nanopore) to improve the assembly. Nanopore sequencing was performed on a MinION Mk1b (Oxford Nanopore Technologies, Oxford, UK) using a SpotON flow cell (R9.4) in a 48-h sequencing protocol on MinKNOW 1.4.32. Base calling was performed using Albacore. Base-called reads were processed using poRe version 0.24 (Watson et al., 2015) and poretools version 0.6.0 (Loman & Quinlan, 2014).
The simple sequence repeats (SSRs) of each scaffold were identified with the MISA Perl script (Thiel et al., 2003). Gene models were predicted with AUGUSTUS 3.1 (Stanke & Waack, 2003), and genes were functionally analysed using InterProScan version 5.16.55 (Jones et al., 2014). The InterProScan results were further parsed for additional functional evidence (GO terms and KEGG pathways) using the interproscanParser script available at iPlant (Brozynska et al., 2016).
Noncoding RNAs, such as miRNA, tRNA, rRNA, snoRNA and snRNA, were identified using Infernal v1.1.2 (Nawrocki & Eddy, 2013) with Rfam (Nawrocki et al., 2015). Transfer RNAs were predicted using tRNAscan-SE v1.23 (Schattner et al., 2005).
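The flow cytometric genome size estimate against an internal standard is usually a simple ratio calculation. The sketch below illustrates that arithmetic only; it is not the authors' analysis script. The peak means are hypothetical, the 2C value of 9.09 pg for the P. sativum standard is an assumption (the text does not state which reference value was used), and the conversion factor 1 pg = 978 Mbp is the commonly used one from Dolezel et al. (2003).

```python
# Commonly used conversion factor (Dolezel et al., 2003): 1 pg DNA = 978 Mbp.
MBP_PER_PG = 978.0


def genome_size_mbp(sample_peak, standard_peak, standard_2c_pg):
    """Estimate the sample 2C DNA content in Mbp from G0/G1 peak means.

    sample 2C (pg) = (sample peak mean / standard peak mean) * standard 2C (pg)
    """
    if standard_peak <= 0:
        raise ValueError("standard peak must be positive")
    sample_2c_pg = (sample_peak / standard_peak) * standard_2c_pg
    return sample_2c_pg * MBP_PER_PG


# Hypothetical values for illustration only; NOT measurements from this study.
PEA_2C_PG = 9.09       # assumed 2C value of the P. sativum standard
sample_peak = 100.0    # hypothetical G0/G1 peak mean of the sample
standard_peak = 400.0  # hypothetical G0/G1 peak mean of the standard

estimate = genome_size_mbp(sample_peak, standard_peak, PEA_2C_PG)
print(f"Estimated 2C content: {estimate:.1f} Mbp")
```

Whether a reported genome size refers to the 1C or the 2C value has to be taken from the study itself; the function above returns the 2C value by construction.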
The genome (KKLL) of O. coarctata is tetraploid (2n=4X=48), and its size, estimated by flow cytometry, is approximately 665 Mb. The Illumina 4000 GA IIx paired-end libraries generated 137 Gb of data. The four mate-pair libraries together generated a further 104.35 Gb, and Nanopore sequencing generated 6.35 Gb. In total, this corresponds to 372.48X depth of the O. coarctata genome. The final assembly comprises 58,362 contigs, ranging from 200 bp to 7,855,609 bp in length, with an N50 of 1,858,627 bp. The total contig length is 569,994,164 bp (around 570 Mb), corresponding to 85.71% genome coverage. The data contain only a very small proportion of non-ATGC characters. Repeats make up 19.89% of the genome. We also identified approximately 1,605 different non-coding RNAs and around 105,673 SSRs. Gene ontology analysis identified several salt-responsive genes.
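The depth and coverage figures follow directly from the numbers quoted above, as the short check below shows. The `n50` helper and its toy contig lengths are illustrative assumptions, included only to make the N50 definition concrete; they do not reproduce the statistic from the actual assembly.

```python
# Figures quoted in the text.
GENOME_SIZE_MB = 665.0
ILLUMINA_PE_GB = 137.0
MATE_PAIR_GB = 104.35
NANOPORE_GB = 6.35
ASSEMBLY_BP = 569_994_164

# Sequencing depth = total bases sequenced / estimated genome size.
total_gb = ILLUMINA_PE_GB + MATE_PAIR_GB + NANOPORE_GB
depth = total_gb * 1000 / GENOME_SIZE_MB

# Genome coverage = assembled length / estimated genome size.
coverage_pct = ASSEMBLY_BP / (GENOME_SIZE_MB * 1e6) * 100


def n50(lengths):
    """Return the contig length at which the cumulative sum of contigs,
    sorted from longest to shortest, first reaches half the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


print(f"Total data: {total_gb:.2f} Gb")
print(f"Depth: {depth:.2f}X")
print(f"Coverage: {coverage_pct:.2f}%")
print("Toy N50:", n50([2, 3, 4, 5, 6]))  # hypothetical contig lengths
```

With these inputs the depth and coverage round to the reported 372.48X and 85.71%.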
Raw sequence data are available at NCBI SRA under the BioProject ID: PRJNA396417.
TKM is also grateful to Mr Sukdev Nath, who provided the planting material. TRS is thankful to the DST, Govt. of India for JC Bose National Fellowship. The authors are thankful to M/S Genotypic Technology Private Limited, Bengaluru, India for sequencing work and M/S BD Biosciences, India for Flow Cytometer work.
Is the rationale for creating the dataset(s) clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of methods and materials provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Yes
Competing Interests: No competing interests were disclosed.
Is the rationale for creating the dataset(s) clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of methods and materials provided to allow replication by others?
Partly
Are the datasets clearly presented in a useable and accessible format?
Yes
Competing Interests: No competing interests were disclosed.
Is the rationale for creating the dataset(s) clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of methods and materials provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Comparative genomics, brassica, polyploidy, regulatory evolution
Alongside their report, reviewers assign a status to the article:
Version | Reviewer 1 | Reviewer 2 | Reviewer 3 |
---|---|---|---|
Version 2 (revision), 15 Dec 17 | read | | |
Version 1, 25 Sep 17 | read | read | read |