Keywords
Cervus elaphus nannodes, genome draft, mammalian genome assembly, tule elk
This article is included in the Genomics and Genetics gateway.
To date, the closest genomic resource for elk (Cervus elaphus) is a full mitochondrial assembly of white-tailed deer (Odocoileus virginianus), a distantly related cervid [1]. This paper presents the first de novo genomic draft of the tule elk (C. elaphus nannodes). This California-endemic elk subspecies underwent a major genetic bottleneck when its numbers were reduced to as few as three individuals in the 1870s [2,3]. Although the population has grown to more than 5,000 animals today [4], the historical bottleneck left its mark on the tule elk’s genome, rendering it more homozygous than those of other elk subspecies.
Our motivation for generating a genomic resource for the tule elk was to create a reference for identifying single nucleotide polymorphisms (SNPs), from which assays can be developed to monitor elk population abundance and to support related population genetic applications. Because the coverage generated in this work was relatively low (40X overall, with an average of 10X from each individual), we used the MEGAHIT metagenome assembler, which has been found to perform well on low-quality or low-coverage DNA sequencing data in bacteria [5].
Elk were selected from four geographically distinct populations across northern California to maximize genomic diversity (San Luis Reservoir, California Valley, American Canyon, and the San Luis National Wildlife Refuge) [4]. Genomic DNA was extracted from skin biopsies, which were obtained by the California Department of Fish and Wildlife as part of their elk management activities [4]. We extracted DNA from skin using Qiagen DNeasy blood & tissue kits (QIAGEN Inc., Valencia, CA), according to the manufacturer’s instructions. The DNA was then fragmented to 300 to 400 base pairs (bp) by sonication using a Bioruptor (Diagenode, Denville, NJ) prior to adapter ligation.
After the fragment size range was verified by agarose gel electrophoresis, the NEBNext® Ultra™ DNA Library Prep Kit for Illumina® (New England Biolabs, Inc., Ipswich, MA) was used to ligate Illumina adapters. Multiplexed libraries were prepared using NEBNext Multiplex Oligos for Illumina (New England Biolabs) to barcode each of the four elk individually. Barcodes were annealed using low-cycle polymerase chain reactions during library preparation. To assess library quality, trace analysis was performed using a Bioanalyzer 2100 (Agilent, Santa Clara, CA), and libraries were quantified fluorometrically using a Qubit fluorometer (Invitrogen, Carlsbad, CA). After library quality control, the four samples (one from each population) were pooled in equimolar concentrations and submitted for paired-end sequencing. Samples were sequenced on an Illumina HiSeq 3000 at the DNA Technologies and Expression Analysis Core of the UC Davis Genome Center.
Sequencing quality of the demultiplexed reads was evaluated using FastQC v0.11.3 (RRID:SCR_014583) [6]. The Illumina TruSeq3-PE sequencing adapters were removed using Trimmomatic v0.30 (RRID:SCR_011848) [7] with the ILLUMINACLIP parameter set to TruSeq3-PE.fa:2:40:15. The TruSeq3-PE.fa sequence was downloaded from https://anonscm.debian.org/cgit/debian-med/trimmomatic.git/plain/adapters/TruSeq3-PE.fa. The LEADING and TRAILING parameters were set to 2, resulting in the removal of bases at the start and end of each read with a quality score below 2 on the phred33 scale. The SLIDINGWINDOW parameter of 4:2 was used to clip reads once the average quality score within a 4-base window fell below 2. The MINLEN parameter, set to 25, dropped any reads that fell below that length after quality trimming. The demultiplexed, quality-filtered reads were interleaved using the interleave-reads.py script in khmer v2.0 (RRID:SCR_001156) [8]. The assembly was performed using MEGAHIT v1.0.5 [9] on the interleaved, quality-filtered reads. Assembly statistics were computed using QUAST v3.0 (RRID:SCR_001228) [10]. All code used is publicly available at https://github.com/dib-lab/2017-tule-elk/.
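The trimming settings described above can be collected into a single Trimmomatic paired-end invocation. The following Python sketch only assembles the argument list; it is not taken from the project repository. The jar path, the input and output file names, and the helper name trimmomatic_pe_command are hypothetical, and the minimum-length step is written as MINLEN, which is how Trimmomatic spells it on the command line.

```python
def trimmomatic_pe_command(r1, r2, prefix,
                           jar="trimmomatic-0.30.jar",   # assumed jar location
                           adapters="TruSeq3-PE.fa"):
    """Build a Trimmomatic PE argument list from the settings in the text.

    r1, r2 and prefix are hypothetical file names chosen for illustration.
    """
    # Paired-end mode writes paired (P) and unpaired (U) output per mate.
    outputs = [
        f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz",
        f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz",
    ]
    # Steps are applied in the order given on the command line.
    steps = [
        f"ILLUMINACLIP:{adapters}:2:40:15",  # adapter clipping
        "LEADING:2",                         # trim low-quality leading bases
        "TRAILING:2",                        # trim low-quality trailing bases
        "SLIDINGWINDOW:4:2",                 # 4-base window, mean quality 2
        "MINLEN:25",                         # drop reads shorter than 25 bp
    ]
    return ["java", "-jar", jar, "PE", "-phred33", r1, r2, *outputs, *steps]


cmd = trimmomatic_pe_command("elk1_R1.fastq.gz", "elk1_R2.fastq.gz", "elk1")
print(" ".join(cmd))
```

The list form can be passed directly to subprocess.run once the paths point at real files; the exact invocation used for the published assembly is the one in the project repository.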
We obtained 377,980,276 demultiplexed raw read pairs (150 bp paired-end), containing a total of 113.394 Gbp of sequence, or approximately 40X coverage of the approximately 3 Gbp tule elk genome. The assembly had a total size of 2.395 Gbp. Reads were assembled into 602,862 contiguous sequences ("contigs") averaging 3,973 bp in length, with a minimum contig length of 201 bp. The G+C content of the genome was 41.55%. The N50 was 6,885 bp and the maximum contig length was 72,391 bp. Additional assembly statistics are available in Table 1. No contigs (e.g. under a certain size or likely to reflect repeats) were removed from the assembly.
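The figures above can be cross-checked with a few lines of arithmetic. The Python sketch below assumes that the reported 377,980,276 counts read pairs (two 150 bp reads each), which is what the 113.394 Gbp total implies, and that the genome size is 3 Gbp as stated. The n50 helper and the toy contig lengths are illustrative only; they are not the QUAST implementation or the real contig set.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together hold at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Sequencing yield and coverage (values from the text).
read_pairs = 377_980_276        # assumed to be pairs, not single reads
read_length = 150               # bp per read
genome_size = 3e9               # approximate genome size in bp
total_bases = read_pairs * 2 * read_length
coverage = total_bases / genome_size

# Assembly size implied by contig count times mean contig length.
n_contigs = 602_862
mean_contig_length = 3_973
approx_assembly_size = n_contigs * mean_contig_length

print(f"total bases: {total_bases:,}")                 # 113,394,082,800
print(f"coverage: {coverage:.1f}X")                    # 37.8X
print(f"assembly size: {approx_assembly_size:,} bp")   # 2,395,170,726 bp

# Toy example of the N50 definition (hypothetical contig lengths).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(f"toy N50: {n50(toy_contigs)}")                  # 70
```

The computed coverage of about 37.8X is consistent with the rounded 40X reported, and the product of contig count and mean length reproduces the 2.395 Gbp assembly size.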
This genome can serve as the basis for further genomic work on tule elk and other cervids, such as the development of a SNP assay to track elk population movement across increasingly developed northern Californian terrain. Furthermore, it is the first whole genome assembly available from the family Cervidae, providing a useful interim reference genome for bioinformatic analyses on other deer and elk species.
Raw reads are available in the SRA under the BioProject ID PRJNA345218. The genome draft is available at https://doi.org/10.6084/m9.figshare.5382565.v1 [11].
Code used in this study has been archived at http://doi.org/10.5281/zenodo.887935 [12].
Support for this project was provided by a grant to BNS from the California Department of Fish and Wildlife, FY1516 Big Game Management Program (Grant ID P1580009).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
JM would like to thank Luiz Irber, Camille Scott, and Lisa Johnson of the DIB lab at UC Davis for assistance with bioinformatics processing. We also thank C. Langner and J. Hobbs of the California Department of Fish and Wildlife for providing samples.
Is the rationale for creating the dataset(s) clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of methods and materials provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Yes
Competing Interests: No competing interests were disclosed.
Version 2 (revision), 11 Dec 17: report from invited reviewer 1.
Version 1, 15 Sep 17: reports from invited reviewers 1 and 2.