Shark and ray genome size estimation: methodological optimization for inclusive and controllable biodiversity genomics

Mitsutaka Kadota; Kaori Tatsumi; Kazuaki Yamaguchi; Atsuko Yamaguchi; Takashi Asahida; Keiichi Sato; Tatsuya Sakamoto; Yoshinobu Uno; Shigehiro Kuraku

doi:10.12688/f1000research.136385.1

Home Browse Shark and ray genome size estimation: methodological optimization...

ALL Metrics

-

Views

-

Downloads

Get PDF

Get XML

Export

▬

✚

Research Article

Shark and ray genome size estimation: methodological optimization for inclusive and controllable biodiversity genomics

[version 1; peer review: 1 approved, 1 approved with reservations]

Mitsutaka Kadota¹, Kaori Tatsumi¹, Kazuaki Yamaguchi¹, [...] Atsuko Yamaguchi², Takashi Asahida³, Keiichi Sato^4,5, Tatsuya Sakamoto⁶, Yoshinobu Uno⁷, Shigehiro Kuraku ^8,9

Mitsutaka Kadota¹, Kaori Tatsumi¹, [...] Kazuaki Yamaguchi¹, Atsuko Yamaguchi², Takashi Asahida³, Keiichi Sato^4,5, Tatsuya Sakamoto⁶, Yoshinobu Uno⁷, Shigehiro Kuraku ^8,9

PUBLISHED 25 Sep 2023

Author details Author details

¹ Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo, 650-0047, Japan
² Graduate School of Fisheries and Environmental Sciences, Nagasaki University, Nagasaki, Nagasaki, 852-8131, Japan
³ School of Marine Biosciences, Kitasato University, Sagamihara, Kanagawa, 252-0373, Japan
⁴ Okinawa Churashima Research Center, Okinawa Churashima Foundation, Motobu, Okinawa, 905-0206, Japan
⁵ Okinawa Churaumi Aquarium, Okinawa Churashima Foundation, Motobu, Okinawa, 905-0206, Japan
⁶ Ushimado Marine Institute, Graduate School of Natural Science and Technology, Okayama University, Setouchi, Okayama, 701-4303, Japan
⁷ Graduate School of Arts and Sciences, The University of Tokyo, Meguro-ku, Tokyo, 153-0041, Japan
⁸ Molecular Life History Laboratory, National Institute of Genetics, Mishima, Shizuoka, 411-8540, Japan
⁹ Genetics Program, Sokendai (Graduate University for Advanced Studies), Mishima, Shizuoka, 411-8540, Japan

Mitsutaka Kadota
Roles: Formal Analysis, Investigation, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Kaori Tatsumi
Roles: Formal Analysis, Resources

Kazuaki Yamaguchi
Roles: Resources

Atsuko Yamaguchi
Roles: Resources

Takashi Asahida
Roles: Resources

Keiichi Sato
Roles: Resources

Tatsuya Sakamoto
Roles: Resources

Yoshinobu Uno
Roles: Resources

Shigehiro Kuraku
Roles: Conceptualization, Formal Analysis, Funding Acquisition, Project Administration, Supervision, Writing – Original Draft Preparation, Writing – Review & Editing

OPEN PEER REVIEW

REVIEWER STATUS

This article is included in the Genomics and Genetics gateway.

Abstract

Estimate of nuclear DNA content serves as an independent tool for validating the completeness of whole genome sequences and investigating the among-species variation of genome sizes, but for some species, the requirement of fresh cells makes this tool highly inaccessible. Here we focused on elasmobranch species (sharks and rays), and using flow cytometry or quantitative PCR (qPCR), estimated the nuclear DNA contents of brownbanded bamboo shark, white spotted bamboo shark, zebra shark, small-spotted catshark, sandbar shark, slendertail lanternshark, basking shark, megamouth shark, red stingray, and ocellate spot skate. Our results revealed their genome sizes spanning from 3.18 pg (for ocellate spot skate) to 13.34 pg (for slendertail lanternshark), reflecting the huge variation of genome sizes already documented for elasmobranchs. Our qPCR-based method ‘sQuantGenome’ enabled accurate genome size estimation without using live cells, which has been a severe limitation with elasmobranchs. These findings and our methodology are expected to contribute to better understanding of the diversity of genome sizes in elasmobranchs even including species with limited availability of fresh tissue materials. It will also help validate the completeness of already obtained or anticipated whole genome sequences.

Keywords

nuclear DNA content, C-value, shark, ray, genome size

Corresponding authors: Mitsutaka Kadota, Shigehiro Kuraku

Competing interests: No competing interests were disclosed.

Grant information: This study was funded by intramural grants from RIKEN and National Institute of Genetics (NIG), as well as JSPS KAKENHI Grant Nos. 20H03269 and 22K15088.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2023 Kadota M et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Kadota M, Tatsumi K, Yamaguchi K et al. Shark and ray genome size estimation: methodological optimization for inclusive and controllable biodiversity genomics [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:1204 (https://doi.org/10.12688/f1000research.136385.1) First published: 25 Sep 2023, 12:1204 (https://doi.org/10.12688/f1000research.136385.1) Latest published: 25 Sep 2023, 12:1204 (https://doi.org/10.12688/f1000research.136385.1)

Introduction

Genome size refers to the amount of DNA in a haploid genome, expressed either by weight or number of nucleotides. Development of DNA sequencing technologies and related methods have yielded chromosome-scale sequence information with higher completeness than before, but the total number of nucleotides in resultant genome sequences rarely match the genome size, mainly because of incomplete sequencing that results from highly heterochromatic or repetitive regions in the genome. For example, the axolotl genome sequencing, a pioneering effort on a large vertebrate genome, yielded an assembly consisting of 28 gigabases¹ (Gb), but it still lacks a large fraction larger than a whole human genome, to complete its huge genome that is estimated to amount to 32 Gb.²

In the early days, genome size was determined by measuring the amount of DNA extracted from a defined number of cells or nuclei. The technique revealed variable genome size among species and differential DNA contents between somatic and germ cells.³^–⁵ Other methods based on reassociation kinetics⁶ also provided estimations on genome size or the degree of genome complexity in various experimental settings, i.e., repetitive and non-repetitive content of the genomic DNA.⁷^–⁹ Currently, the most commonly used methods for genome size estimation are DNA staining followed by densitometry or cytometry measurement, i.e., Feulgen densitometry or flow cytometry.¹⁰^,¹¹ These methods, however, require fresh biological samples, which is sometimes challenging, especially when the target species are elusive in the wild.

One of the taxa with scarce genome size reports is the vertebrate class of cartilaginous fishes (Chondrichthyes) comprised of chimaeras, sharks, and rays. This taxon includes some species with enlarged genome sizes over 10 picogram (pg), making it a class with a remarkable genome size variation among vertebrates,¹² next to amphibians in which genome size enlargement is observed particularly in salamanders.¹³ As posed in ‘C-value enigma’,¹² what drives genome size increase/decrease remains inexplicable, and the taxon Chondrichthyes provides an intriguing system to scrutinize it in relation to the variation of habitat environment, body size, propagule size, prey, population size, and lifespan (reviewed in Ref. 14). Among cartilaginous fishes, genome size reports are currently available only up to 134 species (based on the information in the Animal Genome Size Database: http://www.genomesize.com) among approximately 1,300 described species, which is largely due to their inaccessibility—many of them inhabit open ocean or deep-sea and thus are not immediately subjected to laboratory experiments using fresh animal tissues.

An alternative method for genome size estimation based on quantitative PCR (qPCR) was developed¹⁵ and is expected to circumvent the limitation of existing methods with elusive species. The method requires prior information on the genomic or transcriptomic sequences of single-copy genes to design specific oligonucleotide primers, as well as genomic DNA as a template, for PCR. Genome size is then estimated by calculating the DNA copy number contained within a defined amount of DNA. Although the qPCR-based method has been used to assess genome sizes of several plant and animal species,¹⁶^–¹⁹ its utility has been questioned. The qPCR-based method can be more labor-intensive than established methods. Moreover, in some cases, qPCR-based estimates deviated from the values obtained by other established methods such as Feulgen densitometry or flow cytometry.²⁰^,²¹ Thus, establishing a streamlined protocol will allow a wide application of the qPCR-based method to provide accurate genome size estimates even in species with low sample availability.

In this study, we selected several elasmobranch species and estimated their genome sizes. We primarily adopted flow cytometry, and in parallel, we optimized the previously introduced qPCR-based method to achieve higher accuracy. This yielded genome size estimation of the species including those with low availability of fresh tissue samples, and for the brownbanded bamboo shark, zebra shark, small-spotted catshark, ocellate spot skate, the estimates were cross-validated between the two methods. Our technical optimization has been summarized in a protocol that laboratory personnel can refer at the bench for reliable genome size estimation. Based on the data we obtained, we discuss the genome size variation among elasmobranchs.

Methods

Animals

The sources of the animal individuals used for the study (see below) are described in Supplementary Table 1 in the separately supplied file.²² Red blood cells, tissue or embryo of sandbar shark Carcharhinus plumbeus, basking shark Cetorhinus maximus, white-spotted bamboo shark Chiloscyllium plagiosum, brownbanded bamboo shark Chiloscyllium punctatum, slendertail lanternshark Etmopterus molleri, red stingray Hemitrygon akajei, megamouth shark Megachasma pelagios, ocellate spot skate Okamejei kenojei, small-spotted catshark Scyliorhinus canicula, zebra shark (or leopard shark) Stegostoma tigrinum (formerly Stegostoma fasciatum), and mouse Mus musculus were used for the analysis. Animal sampling was conducted in accordance with the institutional guideline Regulations for the Animal Experiments specified by the Institutional Animal Care and Use Committee of RIKEN Kobe Branch and the official guideline for animal experiments at Okinawa Churaumi Aquarium. Phylogenetic relationship of the elasmobranch species covered in the study are described in Figure 1.

Figure 1. Phylogenetic relationship of the elasmobranch species used in this study.

Details of the animals used (e.g., source, developmental stage, and cell type) are included in Supplementary Table 1 in the separately supplied file.²² The branch lengths in the tree are not scaled to relative divergence times.

Cell preparation for flow cytometry

Elasmobranch blood was collected using a syringe containing heparin sodium salt solution (MOCHIDA, cat: 5,000 units/5 mL; use 10 μL for 2-3 ml of blood) or EDTA (0.5 M EDTA, Gibco, cat: 15575-020; use 10 μL for 1-2 ml of blood). Elasmobranch cells from the tissue and embryo were prepared by mincing and dissociation with trypsin (0.5% solution in shark PBS) at 25-28°C for 15-30 min with rotation. Mouse embryonic fibroblast cells prepared from E14.5 embryos of C57BL/6 strain mice, or mouse lung fibroblast cells prepared from an adult male C57BL/6 strain mouse were cultured in DMEM (Nacalai tesque), supplemented with 10% FBS (Gibco) and 1× Antibiotic-Antimycotic solution (Gibco). Human GM12878 lymphoblastoid cell line (Coriell Institute) which has a relatively normal karyotype was cultured in RPMI (Gibco) supplemented with 10% FBS, 1× L-glutamine (Gibco), and 1× Antibiotic-Antimycotic solution. Two million elasmobranch cells or one million reference cells (mouse fibroblast cells or GM12878 cells) were permeabilized in PBS (shark PBS or PBS) containing 0.1 % Triton-X100 and stained with propidium iodide (PI/RNase Staining Buffer, BD Pharmingen).

Flow cytometry

Estimation of the nuclear DNA content by flow cytometry was performed as follows.²³^,²⁴ The PI intensity of the elasmobranch cells and reference cells were measured separately in a flow cytometer (Ploidy Analyser; Sysmex). Then, a single analysis setting applicable for both sample types was determined by adjusting the gain and the threshold in the CyView analysis software (Sysmex). Subsequently, the elasmobranch cells and reference cells were mixed at equal ratio and analyzed again. Genome size of the elasmobranch species was estimated relatively from the calibration curve between the 2c (G0/G1 phase) and the 4c (G2/M phase) peaks of the reference cells with known genome size information, i.e., C-value = 3.25 pg²⁵ for mouse or C-value = 3.37 pg for human GM12878 cells determined by flow cytometry using mouse MEF cells as reference. The estimation is also based on the assumption that the elasmobranch species analyzed are diploid organisms. Step-by-step procedures of cell preparation, flow cytometer operation, and genome size calculation are described in the Supplementary Protocol.

DNA extraction for qPCR

Frozen samples (red blood cells or powderized tissue) were used for the genomic DNA (gDNA) extraction with NucleoBond AXG 100 or AXG 500 with NucleoBond Buffer Set IV (Clontech) or DNeasy Blood and Tissue kit (Qiagen). Genomic DNA was analyzed by 4200 TapeStation (Agilent Technologies) with the Genomic DNA ScreenTape analysis kit to confirm high molecular weight (HMW) DNA extraction. 1-4 μg of genomic DNA, adjusted to the total volume to 150 μL with TE buffer, was sheared to 10 kb using g-TUBE (Covaris) in an MX-300 high-speed centrifuge (TOMY) equipped with an AR015-24 rotor (TOMY) at 6,100 rpm for 1 min, at 25°C. After the shearing, size distribution of the DNA was confirmed to be around 10 kb by 4200 TapeStation with the Genomic DNA ScreenTape analysis kit. The sheared gDNA were stored at 4°C until use.

Selection of reference genes and primer design

Target genes for the analysis were selected from the previously defined set of Core Vertebrate Genes²⁶ (CVG), consisting of 233 genes that are shared as one-to-one single-copy orthologs in the genomes of 29 selected vertebrate species. We selected the components of CVG that are also included in Core Eukaryotic Genes²⁷ (CEG) and BUSCO²⁸ (Benchmarking Universal Single-Copy Orthologs) vertebrate gene set identified by Hara et al.²⁶ (see Supplementary Table S3 in the separately supplied file²² for detailed information of the selected genes). Orthologs of those CVGs of the target species were identified by performing BLAST (TBLASTN) search in transcriptome assemblies of the target species in the Squalomix elasmobranch sequence archive (https://github.com/Squalomix/sequences; Ref. 29), using amino acid sequences of human CVG orthologs as the query. Also, the deduced amino acid sequences of the longest open reading frame of the identified transcripts were used for the BLAST (BLASTP) search against the human protein database. Transcripts of the target species that displayed the best hits reciprocally in the BLAST searches were considered orthologous and retained for further analysis (see Supplementary Tables S3 and S4 in the separately supplied file²² for the CVG transcripts used for the study). Primers were designed inside the putative last exon of the transcripts inferred by aligning the target species’ transcript sequences to the last exon of the human counterpart. Primers for the first round of PCR to generate copy-number standards were designed using the Primer3 program³⁰ (https://bioinfo.ut.ee/primer3/), and primers for the nested qPCR were designed using the Primer Express software (ThermoFisher Scientific). The sequences of CVG transcript of the target species and the oligonucleotide sequences of the PCR primers are included in Supplementary Table S3 and S4 in the separately supplied file,²² respectively.

Preparation of qPCR copy number standards

The first PCR was carried out using the KAPA HiFi HotStart ReadyMix (KAPA Biosystems) with 5 ng of the sheared gDNA and 0.3 μM concentration of each primer in 20 μL reaction volume at [98°C for 3 min, 34 cycles of (98°C for 20 sec, 58°C-66°C in 2°C increments of temperature gradient for 15 sec, 72°C for 15 sec), 72°C for 1 min]. A portion (5 μL) of the PCR products were analyzed by electrophoresis on a 2% Agarose S gel (Nippon gene) in 1× TBE buffer and stained with GelStar (Lonza). The remaining PCR product of the condition that displayed a single-band or near single-band product on the gel was double size-selected and purified using AMPure XP beads (Beckman coulter) and eluted in 30 μL of TE. The optimal annealing temperature of the PCR reaction and size-selection condition by AMPure XP beads for each gene are included in Supplementary Table S4. Purified PCR products were sequenced directly by Sanger sequencing using the same forward and reverse primers used in the initial PCR reaction to identify possible polymorphic sites. If the polymorphic sites overlapped qPCR primer sequences, the qPCR primers were re-designed. As the first PCR was performed at a high PCR cycle number and may have reached the plateau phase resulting in the PCR product to possess a complex secondary structure, the second PCR was performed with a low cycle number using a small aliquot of the first PCR product as template. The number of PCR cycles in the second PCR was determined with QuantiTect SYBR Green PCR Kit (Qiagen) using 1 μL of the first PCR product after 5-folds of dilution or H₂O as the template with 0.3 μM concentration of each primer in 10 μL reaction volume in a duplicate measurement setting, at [95°C for 15 min, 30 cycles of (94°C for 15 sec, 60°C for 15 sec, 72°C for 20 sec)] with QuantStudio 7 Flex system. The baseline signal was set to the 1st cycle only. Ct (cycle threshold) that reached 1/4^th of the maximum signal intensity observed at the 30th cycle of PCR was determined as the number of PCR cycles for the second PCR. The second PCR was performed using the KAPA HiFi HotStart ReadyMix (KAPA Biosystems) at the optimal annealing temperature determined earlier. The second PCR product was double-size-selected and purified using AMPure XP beads and eluted in 30 μL of TE buffer. The size distribution analysis was performed again by 2100 Bioanalyzer (Agilent Technologies) to confirm that the purified second PCR product is truly a single-band DNA. The purified second PCR product, which will be used as the DNA copy number standard (DNA standard) in the qPCR reaction, were stored at 4°C until use.

qPCR analysis

On the day of qPCR, DNA standards and the sheared gDNA were diluted to 1-10 ng/μL in 1× DB (dilution buffer; 10 mM Tris-HCl pH8.0, 0.05% Tween-20) using 50× DB stock solution (500 mM Tris-HCl pH8.0, 2.5% Tween-20), and 5 μL solution was used for quantitation with QubitFlex fluorometer (ThermoFisher Scientific) in four independent measurements. DNA concentration was determined by taking the average of the four independent measurements. The molecular weight (MW) of the double-strand DNA standards was calculated by the OligoCalc program (http://biotools.nubic.northwestern.edu/OligoCalc.html). The concentration of the DNA standards in copy numbers was calculated with the following formula: DNA copy number in 1 μL DNA = ((DNA concentration (ng/μL)) / (MW (g) × 10⁹)) × (6.02214076 × 10²³/mol). DNA standards were diluted hundred-folds in 1× DB, followed by another hundred-folds dilution in 1× DB, and further diluted in 1× DB to generate 100 μL of DNA standard at 50,000 copy DNA/μL (will be STD1). STD1 was diluted five-fold in 1 ×DB by mixing 5 μL of STD1 with 20 μL of 1× DB to generate STD2 at 10,000 copy DNA/μL. Similarly, STD3, 4, and 5 at 2,000 copy DNA/μL, 400 copy DNA/μL, and 80 copy DNA /μL were generated by serial dilution. qPCR for genome size estimation was performed with the QuantiTect SYBR Green PCR Kit (Qiagen) using 1 μL of the sheared gDNA (at 1-10 ng/μL in 1× DB), STD DNAs (STD 1~5) in 1× DB, or H2O as templates with 0.2 μM concentration of each primer in 10 μL reaction volume in 4 technical replicate setting, at [95°C for 15 min, 40 cycles of (94°C for 15 sec, 60°C for 15 sec, 72°C for 20 sec)] with the dissociation curve analysis option in QuantStudio 7 Flex system. Quantification was performed with the QuantStudio Real-Time PCR Software v1.3. The R² of the standard curve and the dissociation curve were also analyzed to confirm specific amplification and quantification of the target region. C-value was determined with the following formula: C-value = amount of gDNA in the qPCR reaction (ng)/DNA copy number of the gDNA. Genome size was estimated by averaging the C-values from multiple (at least three) target gene regions. Step-by-step procedures for gDNA preparation, DNA standard preparation, qPCR, and genome size calculation are described in the Supplementary Protocol in the separately supplied file.²²

Sequence-based genome-size estimation with k-mer frequency

In parallel with the genome size estimate using the flow cytometry and qPCR, those of the zebra shark and thorny skate were estimated in silico based on k-mer frequency distribution with a conventional procedure. For the zebra shark, short-read genomic data produced previously by us (see Yamaguchi et al., 2022)³¹ were used. For the thorny skate, short-read genomic data produced by the Vertebrate Genomes Project³² was downloaded from VGP GenomeArk (s3://genomeark/species/Amblyraja_radiata/sAmbRad1/genomic_data/illumina/). Sequencing reads were trimmed with trim_galore v0.6.4 to remove low quality bases and the Illumina adaptor sequences. The processed reads (both read 1 and read 2; 351 and 751 M read pairs for the zebra shark and the thorny skate, respectively) were analyzed by jellyfish v2.3.0 program with the k value set at 21. The resultant .histo file was uploaded onto the GenomeScape webserver (http://qb.cshl.edu/genomescope/) to obtain estimates of the genome size.

Results

Genome size estimation with flow cytometry

Genome size estimation of ten elasmobranch species that belong to the orders of Orectolobiformes, Lamniformes, Carcharhiniformes, Squaliformes, Rajiformes, and Myliobatiformes, namely eight shark and two skate/ray species (Figure 1), was performed by flow cytometry or qPCR. The flow cytometry method utilizes propidium iodide (PI) that binds to the double-strand DNA to estimate genome sizes relative to the reference mouse or human cells with known genome size information. The reference cells at G0/G1 phase (2c) and G2/M phase (4c) were given a pre-defined nuclear genome content of 6.5 pg and 13 pg for mouse cells and 6.74 pg and 13.48 pg for human GM12878 cells to draw a calibration line. Then, the genome size of the target elasmobranch species was estimated based on the calibration line, assuming that they are diploid organisms (Figure 2). See Supplementary Table 2 in the separately supplied file²² for detailed information on the PI-signal values and the reference cells used in each measurement.

Figure 2. Genome size estimation by flow cytometry.

a) Flow cytometry analysis of the bamboo shark Chiloscyllium punctatum with Ploidy Analyser (Sysmex). Red blood cells of the bamboo shark and the reference mouse lung fibroblast cells were analyzed together. PI signal of the cells corresponding to mouse 2c (cells at G0/G1 phase), mouse 4c (cells at G2/M phase), and bamboo shark 2c (cells at G0/G1 phase) were obtained. b) Genome size estimation of the bamboo shark based on the standard line of the mouse cells. C-value of the reference mouse cells was set to 3.25 pg.²⁵ The PI-signal values and the estimated genome sizes for the species analyzed by flow cytometry are included in Supplementary Table 2 in the separately supplied file.²²

Genome size estimation with qPCR

The qPCR-based method requires the target species’ nucleotide sequence information and materials, i.e., transcriptome sequences for primer design and the genomic DNA for PCR amplification. We followed the method reported by Wilhelm et al.¹⁵ while improving the existing protocol to achieve more accurate estimation (Figure 3). The major modifications in the protocol were 1) selection of multiple (at least three) genes from the established single-copy orthologue gene set CVG, 2) use of the sheared gDNA instead of HMW gDNA for DNA quantification and PCR, 3) careful evaluation of the DNA standards; sequence analysis by Sanger sequencing and size distribution analysis by 2100 Bioanalyzer, 4) Use of a dilution buffer that contains a surfactant in the qPCR reaction. We first determined the PCR condition that gives a single-band product in the PCR reaction. The nucleotide sequence of the PCR product was confirmed by Sanger sequencing to avoid overlap of the qPCR primers with possible polymorphic sites inside the target region of the selected CVGs. The second PCR was performed using a limited number of PCR cycles, not to let the PCR amplification hit the plateau. Finally, qPCR was performed using the sheared gDNA of the target species and the DNA standards of 80~50,000 copy/μL diluted in dilution buffer.

Figure 3. Genome size estimation by quantitative PCR.

a) Workflow of the genome size estimation by quantitative PCR. b) Agarose gel electrophoresis of the first PCR product at various annealing temperature conditions. Shown is the result for the brownbanded bamboo shark MPI gene (CpMPI). c) qPCR-based determination of the PCR cycles for the second PCR. Amplification curve for CpMPI (red), negative control (H₂O; blue lines), and the threshold (green line) are shown. d) Size distribution analysis of the DNA standard by Agilent Bioanalyzer with the High Sensitivity DNA Kit. Shown is the result for CpMPI. e) Amplification of CpMPI for the sheared gDNA, copy number standards (STD1-5), and negative control (NTC). HMW, high molecular weight; Rn, normalized reporter.

The reliability of our analysis methods was assessed by comparing our estimates based on flow cytometry and qPCR in four species. C-values of the brownbanded bamboo shark were estimated to be 4.83 pg by flow cytometry (Figure 2) and 5.05 pg by qPCR; C-values of the zebra shark were estimated to be 3.79 pg by flow cytometry and 3.39 pg by qPCR; C-values of the small-spotted catshark were estimated to be 5.46 pg by flow cytometry and 5.49 pg by qPCR; C-values of the ocellate spot skate were estimated to be 3.40 pg by flow cytometry and 3.18 pg by qPCR. Close correlation in the estimated genome size values based on two independent methods suggested high reliability of the data except one case. The values were also compared to the genome size information documented previously (see Table 1). The estimated C-values from this study for sandbar shark (3.43 pg) by flow cytometry, brownbanded bamboo shark (4.83) by flow cytometry, red stingray (3.82 pg) by qPCR, ocellate spot skate (3.40 pg) by flow cytometry, and small-spotted catshark (5.46 pg) by flow cytometry were comparable to previous reports.³³^–³⁷ The estimated C-value of the house mouse, a control non-elasmobranch species, was 3.18 pg by qPCR and was within 3 % of difference compared to the previous documentation (3.25 pg; Ref. 25). For the rest of the study species selected, genome size was obtained for the first time in this study, namely for the basking shark: C-value = 4.31 pg, white spotted bamboo shark: C-value = 4.96 pg, slendertail lanternshark: C-value = 13.34 pg, megamouth shark: C-value = 5.81 pg, and zebra shark: C-value = 3.79 pg. Our results of genome size estimation are tabulated under the Squalomix consortium gateway page (https://github.com/Squalomix/c-value), which will be kept up-to-date.

Table 1. Estimated genome sizes for the species selected for this study.

Species	Common name	Estimated C-value (pg)
		This study		Previous study
		Flow cytometry	Quantitative PCR	Previous study
Carcharhinus plumbeus	sandbar shark	3.43	-	2.98³⁶
Cetorhinus maximus	basking shark	-	4.31	-
Chiloscyllium plagiosum	white-spotted bamboo shark	4.96	-	-
Chiloscyllium punctatum	brownbanded bamboo shark	4.83	5.05	4.56,¹⁰ 4.83⁴⁵
Etmopterus molleri	slendertail lanternshark	-	13.34	-
Hemitrygon akajei	red stingray	-	3.82	4.02,³⁷ 4.15,³⁴ 4.32³⁵
Megachasma pelagios	megamouth shark	-	5.81	-
Okamejei kenojei	ocellate spot skate	3.40	3.18	3.09³⁷
Scyliorhinus canicula	small-spotted catshark	5.46	5.49	5.65³³
Stegostoma tigrinum	zebra shark	3.79	3.39	-
Mus musculus	house mouse	-	3.18	3.25²⁵

Genome sizes estimated from whole genome sequences

We also estimated genome sizes using DNA sequences with the k-mer-based method³⁸ as supplementary references. For the thorny skate Amblyraja radiata whose nuclear DNA content was previously estimated to be 4.25 Gb,³⁹ the total sequence length of the publicly available genome assembly³² (sAmbRad1.1.pri; GCF_010909765.2) amounts to only 2.56 Gbp. Our k-mer-based estimate of its genome size using raw sequence reads resulted in an even smaller value of approximately 1.7 Gbp. Similarly, for the zebra shark Stegostoma tigrinum whose nuclear DNA content was estimated to be 3.70 Gb (C-value = 3.79 pg), the total sequence length of the genome assembly released previously by us³¹ (sSteFas1.1; GCA_022316705.1) amounts to only 2.771 Gbp. Our k-mer-based estimate of its genome size using raw sequence reads resulted in approximately 2.805 Gbp.

Discussion

Importance of accurate genome size estimates—towards ‘T2T’

Recent advances in DNA sequencing technologies have enabled genome sequencing in unprecedented scale and accuracy. For setting reasonable goals in the genome sequencing project that can affect overall expenses, it is instrumental to know the karyotype and the total lengths of nucleotide sequences (namely the genome size) of the target species. Upon complete sequencing of the whole genome, the total genome sequence length should be equal to the that measured with nuclear DNA content, but in reality, the severe gap of the lengths appears especially when early technologies such as short-read sequencing is employed (see Introduction). This is the case with all cartilaginous fish species for which whole genome sequencing was conducted (Figure 4a; reviewed in Ref. 14). In the small-spotted catshark genome assembly, more than 1 Gb, in comparison to its genome size, is missing. Also, a huge fraction (1.7 Gb, 40%) is missing in the genome assembly of the thorny skate (Figure 4). The severe gap is also often observed when genome size is estimated using k-mers, as demonstrated for the zebra shark and thorny skate in this study. As shown previously,⁴⁰ the unreliability of sequence length-based genome size estimate demands independent measurements with more reliable methods. However, the widely used methods for genome size estimation employ flow cytometry and densitometry and require fresh live materials to give an accurate estimation, which was difficult for some elusive species. To circumvent this limitation, we optimized a previously introduced qPCR-based method.

Figure 4. Comparisons of estimated genome sizes for cartilaginous fishes.

a) Total lengths of genome assembly sequences compared with genome sizes. Included in this plot are species with unequivocal genome size estimates³³^,³⁹^,⁴⁵^,⁵⁰^,⁵¹ and whole genome assemblies,³²^,⁵²^–⁵⁵ namely, whale shark (NCBI Assembly ID, GCA_013626285.1), thorny skate (GCA_010909765.2), small-spotted catshark (GCA_902713615.2), white-spotted bamboo shark (GCA_004010195.1), white shark (GCA_003604245.1), elephant fish Callorhinchus milii (GCA_018977255.1). The genome sizes were measured with flow cytometry except for the elephant fish with the measure by Feulgen densitometry. b) Results of our genome size estimation. Species with two estimate values (out of qPCR-based estimate, flow cytometry-based estimate, or at least one estimate from previous study) are included. Estimates from previous reports are included for reference. Multiple estimates for an individual species are shown with multiple data points and lines (see Table 1).

Which method?—reconsideration in this post-genomic era

Our cell preparation method for flow cytometry utilized an isotonic solution (shark PBS solution)⁴¹^,⁴² allowing mild permeabilization of live elasmobranch cells. Data acquisition with a flow cytometer was performed similarly to genome content analysis (e.g., ploidy level, cell cycle) using fluorescent dyes. The reference cells were cultured mouse or human cells under exponential growth conditions. Using cultured cells enabled the acquisition of PI signals for 2c (G0/G1 phase) and 4c (G2/M phase) cells to draw a regression line between these two data points. In all the measurements based on flow cytometry, we confirmed that the PI signal of the reference 4c cells is about double of the 2c cells (Supplementary Table 2 in the separately supplied file²²), assuring that the reference data points are within the quantitative range and applicable for the estimation of relatively large elasmobranch genomes. The qPCR-based method was used for several non-shark species in past studies.¹⁶^–¹⁹ However, it received criticism about the accuracy of the estimate.²⁰^,²¹ We suspected that the possible inaccuracy resulted from the lack of careful evaluation of the DNA copy number standards, careful handling in qPCR, and the use of a single target gene (instead of multiple target genes) for estimation.

Optimization of qPCR-based method and our protocol

In our preliminary trials, we identified several keys for improving the accuracy in qPCR-based genome size estimation and derived a protocol implementing the following modifications (see Supplementary Protocol for the entire procedure). This protocol is designated sQuantGenome (streamlined protocol for quantitative PCR-based genome size estimation) and is publicly available at the Squalomix consortium gateway (https://github.com/squalomix/c-value/). First, we used low-bind plasticwares (tubes and tips) to minimize the loss of DNA during the laboratory handling. Second, we used Qubit Fluorometer (ThermoFisher Scientific) to measure the concentration of double-stranded DNA (dsDNA) specifically. Third, we sheared HMW gDNA to approximately 10 kb-long fragments for DNA quantification (by Qubit), PCR, and qPCR. Use of un-sheared HMW DNA causes inconsistent measurements in both DNA quantification and qPCR. Fourth, DNA standards were prepared with maximal caution. We designed PCR and qPCR primers within the putative last exon (that tends to be the longest among multiple exons in most eukaryotes⁴³^,⁴⁴) of the target gene by aligning the target species’ transcript sequence to its human ortholog, based on the assumption that the exon-intron structure is conserved between elasmobranchs and mammals. The first PCR was performed at multiple annealing temperatures to identify a condition that gives a single- or near single-band product on agarose gel. We performed Sanger sequencing to confirm the identity of the nucleotide sequence of the first PCR product to the sequence from the transcriptome assembly which was obtained from the Squalomix sequence archive.²⁹ We also confirmed that qPCR primers (to be used for genome size estimation) do not overlap heterozygous polymorphic sites inside the DNA standard sequence. qPCR primers were re-designed if they overlapped heterozygous polymorphic sites. The second PCR was performed at a low PCR cycle number within the exponential amplification phase (confirmed by qPCR for cycle determination) to ensure that the amplified DNA is double-stranded and does not contain incomplete or chimeric PCR products. We size-selected first and second PCR products to remove DNA fragments outside the target size range. We also analyzed the DNA standards with 2100 Bioanalyzer to confirm their uniform DNA size. Fifth, we reduced the sources of variability in the qPCR reaction to improve quantification accuracy. We used a DNA dilution buffer (DB) that contains a surfactant (0.05% Tween-20) to minimize binding of the DNA to the plasticwares (tubes and tips). We prepared template DNAs (sheared gDNA and the DNA standards) in DB and quantified them with Qubit in four independent replications, using a large liquid volume (of 5 uL) containing at least 5 ng of DNA, for improved accuracy. We used an electronic 8-well pipette for consistent dispensation of the qPCR mixture into the 384-well reaction plate. We also performed the experiment with proper liquid handling techniques, e.g., pre-rinsing of the pipette tip, reverse-pipetting, etc. Sixth, our technical consideration also encompassed the in silico steps for selecting adequate amplification targets and primer pairs to amplify them. Choosing multiple single-copy orthologous genes (ideally no fewer than three) as amplification targets per species is a must to take a variation of estimates into consideration and to identity inappropriate choice of genes, e.g., genes with variable copy number caused by structural variation or genes on sex chromosomes.

In silico preparation for qPCR-based genome size estimation

Recently, transcriptome sequencing with RNA-seq, as well as whole genome sequencing has led to the large-scale production of exonic sequences including protein-coding regions of various genes. This allows efficient identification of single-copy genes from the de novo transcriptome assembly of the target species, even when high-quality whole genome assembly is unavailable. The rigorous selection of such genes has been facilitated by the introduction of already prepared sets of single-copy orthologs, such as CEG,²⁷ CVG,²⁶ and BUSCO,²⁸ that were pre-selected to assess the completeness of whole genome sequence assemblies. Targeting genes from the single-copy ortholog set is a highly recommended approach for gene selection in the qPCR-based method. However, one should be aware of the possibility of lineage- or species-specific gene duplication or a gene-loss event that may cause the qPCR-based estimate to diverge from the actual genome size. In case of gene duplication, the prepared DNA standards may contain DNA of alternative sequence or alternative fragment size which should be identified with Sanger sequencing or size distribution analysis with 2100 Bioanalyzer. In case of gene loss, the candidate transcript of the target species belonging to a non-orthologous gene, may display a low similarity score or may fail to be identified as the transcript with reciprocal best hit in the BLAST analysis. In any of these cases, choosing multiple genes from the single-copy orthologous gene set and performing rigorous evaluation of the DNA standards will help prevent errors caused by the use of inappropriate genes for the analysis.

Genome size variation among elasmobranchs

Among the ten elasmobranch species analyzed in this study, genome size information was previously documented for five species (Table 1); sandbar shark (reported to be 2.98 pg using red blood cells with Feulgen staining³⁶), brownbanded bamboo shark (reported to be 4.56 pg or 4.83 pg using red blood cells with Feulgen staining or flow cytometry¹⁰^,⁴⁵), red stingray (reported to be 4.02 pg, 4.15 pg, or 4.32 pg using red blood cells with Feulgen staining or flow cytometry³⁴^,³⁵^,³⁷), ocellate spot skate (reported to be 3.09 pg using red blood cells with flow cytometry³⁷), and small-spotted catshark (reported to be 5.65 pg using red blood cells with Feulgen staining³³). Whether the estimation was based on flow cytometry or qPCR, our estimates were mostly comparable to the existing reports, suggesting its high accuracy. Among the species with a relatively large gap, the sandbar shark exhibited a difference of as much as 15%, between 3.43 pg (qPCR-based estimate in the present study) and 2.98 pg from a past study.³⁶

We have demonstrated the utility of the qPCR-based method for genome size estimation of species whose fresh tissue material is difficult to obtain. The qPCR-based method is also recommended when suitable reference cells with known comparable genome size to the study species are not available in Feulgen densitometry or flow cytometry-based analyses, e.g., for genome size estimation of animal and plant species with extremely large genome sizes. In the case of the South American lungfish Lepidosiren paradoxa, the C-value was estimated to be 80.55-123.90 pg⁴⁶^–⁴⁸ based on flow cytometry and Feulgen densitometry using reference cells with C-values of 3.3-6.7 pg. In the case of the canopy plant Paris japonica (Melanthiaceae) which has the largest genome quantified among eukaryotes to date (C-value = 152.23 pg),⁴⁹ the genome size was estimated by flow cytometry using multiple reference cells at the C-value of 16.75 pg, 27.11 pg, and 54.08 pg. In both cases, C-value of the reference cells is much smaller than the target species, which in principle should be avoided in reference-based genome size estimation as it harbors risks of calibration error due to extrapolation. On the other hand, the qPCR-based method, which does not require any reference cells, is expected to provide accurate estimation for a wide variety of species with a wide range of genome sizes.

Author contributions

S.K. and M.K. conceived the study. S.K., A.Y., T.A., K.S., T.S., K.Y., Y.U., and K.T. collected samples. M.K. and K.T. performed experiments, and S.K. and M.K. analyzed data. S.K. and M.K. wrote the manuscript, which was approved by all authors.

Data availability

Extended data

Figshare: Extended data. https://doi.org/10.6084/m9.figshare.23974383.v1.⁵⁶

Data are available under the terms of the Creative Commons Attribution (CC BY 4.0).

Acknowledgments

We thank Hatsune Makino-Itou of Molecular Life History Laboratory, National Institute of Genetics, for assistance with qPCR, Hiroshi Kiyonari and Mayumi Watase of Laboratory for Animal Resources and Genetic Engineering, RIKEN BDR, for providing mouse cells, Itsuki Kiyatake and Takaomi Ito of Osaka Aquarium Kaiyukan, Kiyomi Murakumo, Rui Matsumoto, Ryo Nozu, and Taketeru Tomita of Okinawa Churaumi Aquarium, Tomoyuki Satonaka of Shima Marineland Aquarium, Nobuyuki Higashiguchi of Suma Sea Life Park, and Kazuhiro Saito and Waichiro Godo of Ushimado Marine Institute of Okayama University for providing elasmobranch fish samples. We also thank Shinichi Nagai of Sysmex corporation for technical assistance on flow cytometry.

References

1. Nowoshilow S, Schloissnig S, Fei JF, et al.: The axolotl genome and the evolution of key tissue formation regulators. Nature. 2018; 554: 50–55. PubMed Abstract | Publisher Full Text
2. Keinath MC, Timoshevskiy VA, Timoshevskaya NY, et al.: Initial characterization of the large genome of the salamander Ambystoma mexicanum using shotgun and laser capture chromosome sequencing. Sci. Rep. 2015; 5: 16413. PubMed Abstract | Publisher Full Text | Free Full Text
3. Vendrely R, Vendrely C: The content of dexyxyconucleic acid in the cell nucleus through organs, individuals and animal species. Experientia. 1948; 4: 434–436. PubMed Abstract | Publisher Full Text
4. Mirsky AE, Ris H: The desoxyribonucleic acid content of animal cells and its evolutionary significance. J. Gen. Physiol. 1951; 34: 451–462. PubMed Abstract | Publisher Full Text | Free Full Text
5. Heagy FC, Roper JA: Deoxyribonucleic acid content of haploid and diploid Aspergillus conidia. Nature. 1952; 170: 713–714. PubMed Abstract | Publisher Full Text
6. Britten RJ, Kohne DE: Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science. 1968; 161: 529–540. Publisher Full Text
7. Mori K, Wintzerith M, Mandel P: Reassociation of normal mouse DNA and mouse plasmocytoma DNA. FEBS Lett. 1973; 35: 7–10. PubMed Abstract | Publisher Full Text
8. Singer DS: Arrangement of a highly repeated DNA sequence in the genome and chromatin of the African green monkey. J. Biol. Chem. 1979; 254: 5506–5514. PubMed Abstract | Publisher Full Text
9. Ward BL, Anderson RS, Bendich AJ: The mitochondrial genome is large and variable in a family of plants (cucurbitaceae). Cell. 1981; 25: 793–803. PubMed Abstract | Publisher Full Text
10. Hardie DC, Hebert PD: The nucleotypic effects of cellular DNA content in cartilaginous and ray-finned fishes. Genome. 2003; 46: 683–706. PubMed Abstract | Publisher Full Text
11. Michaelson MJ, Price HJ, Ellison JR, et al.: Comparison of plant DNA contents determined by Feulgen microspectrophotometry and laser flow cytometry. Am. J. Bot. 1991; 78: 183–188. Publisher Full Text
12. Gregory TR: Synergy between sequence and size in large-scale genomics. Nat. Rev. Genet. 2005; 6: 699–708. PubMed Abstract | Publisher Full Text
13. Liedtke HC, Gower DJ, Wilkinson M, et al.: Macroevolutionary shift in the size of amphibian genomes and the role of life history and climate. Nat. Ecol. Evol. 2018; 2: 1792–1799.
14. Kuraku S: Shark and ray genomics for disentangling their morphological diversity and vertebrate evolution. Dev. Biol. 2021; 477: 262–272. PubMed Abstract | Publisher Full Text
15. Wilhelm J, Pingoud A, Hahn M: Real-time PCR-based method for the estimation of genome sizes. Nucleic Acids Res. 2003; 31: 56e–e56, 556. PubMed Abstract | Publisher Full Text | Free Full Text
16. Mounsey KE, Willis C, Burgess ST, et al.: Quantitative PCR-based genome size estimation of the astigmatid mites Sarcoptes scabiei, Psoroptes ovis and Dermatophagoides pteronyssinus. Parasit. Vectors. 2012; 5: 3.
17. Kim JH, Roh JY, Kwon DH, et al.: Estimation of the genome sizes of the chigger mites Leptotrombidium pallidum and Leptotrombidium scutellare based on quantitative PCR and k-mer analysis. Parasit. Vectors. 2014; 7: 279. PubMed Abstract | Publisher Full Text | Free Full Text
18. Kuang YY, Zheng XH, Li CY, et al.: The genetic map of goldfish (Carassius auratus) provided insights to the divergent genome evolutions in the Cyprinidae family. Sci. Rep. 2016; 6: 34849. PubMed Abstract | Publisher Full Text | Free Full Text
19. Malli RPN, Adal AM, Sarker LS, et al.: De novo sequencing of the Lavandula angustifolia genome reveals highly duplicated and optimized features for essential oil production. Planta. 2019; 249: 251–256. PubMed Abstract | Publisher Full Text
20. Gregory TR, Nathwani P, Bonnett TR, et al.: Sizing up arthropod genomes: an evaluation of the impact of environmental variation on genome size estimates by flow cytometry and the use of qPCR as a method of estimation. Genome. 2013; 56: 505–510. PubMed Abstract | Publisher Full Text
21. Edwards RJ, Tuipulotu DE, Amos TG, et al.: Draft genome assembly of the invasive cane toad, Rhinella marina. GigaScience. 2018; 7. PubMed Abstract | Publisher Full Text | Free Full Text
22. Kadota M, Kuraku S: Supplementary Information. dataset. Figshare. 2023. Publisher Full Text
23. Hara Y, Takeuchi M, Kageyama Y, et al.: Madagascar ground gecko genome analysis characterizes asymmetric fates of duplicated genes. BMC Biol. 2018; 16: 40. PubMed Abstract | Publisher Full Text | Free Full Text
24. Hare EE, Johnston JS: Genome size determination using flow cytometry of propidium iodide-stained nuclei. Methods Mol. Biol. 2011; 772: 3–12. PubMed Abstract | Publisher Full Text
25. Vinogradov AE: Genome size and GC-percent in vertebrates as determined by flow cytometry: the triangular relationship. Cytometry. 1998; 31: 100–109. Publisher Full Text
26. Hara Y, Tatsumi K, Yoshida M, et al.: Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation. BMC Genomics. 2015; 16: 977. PubMed Abstract | Publisher Full Text | Free Full Text
27. Parra G, Bradnam K, Korf I: CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007; 23: 1061–1067. PubMed Abstract | Publisher Full Text
28. Simão FA, Waterhouse RM, Ioannidis P, et al.: BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015; 31: 3210–3212. Publisher Full Text
29. Nishimura O, Rozewicki J, Yamaguchi K, et al.: Squalomix: shark and ray genome analysis consortium and its data sharing platform [version 1; peer review: 2 approved]. F1000Res. 2022; 11: 1077. PubMed Abstract | Publisher Full Text | Free Full Text
30. Untergasser A, Cutcutache I, Koressaar T, et al.: Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012; 40: e115. PubMed Abstract | Publisher Full Text | Free Full Text
31. Yamaguchi K, Uno Y, Kadota M, et al.: Elasmobranch genome sequencing reveals evolutionary trends of vertebrate karyotype organization. Genome Res. 2023. Publisher Full Text
32. Rhie A, McCarthy SA, Fedrigo O, et al.: Towards complete and error-free genome assemblies of all vertebrate species. Nature. 2021; 592: 737–746. PubMed Abstract | Publisher Full Text | Free Full Text
33. Stingo V, Du Buit M-H, Odierna G: Genome size of some selachian fishes. Boll. Zool. 1980; 47: 129–137. Publisher Full Text
34. Asahida T, Ida H, Inoue S: Karyotypes of three rays in the order Myliobatiformes. Jap. J. Ichthyol. 1987; 33: 426–430.
35. Ojima Y: Cellular DNA contents of fishes determined by flow cytometry. La Kromosomo II. 1990; 57: 1871–1888.
36. Asahida T, Ida H: Karyotype and cellular DNA content of a guitarfish, Rhinobatos schlegelii. La Kromosomo II. 1995; 79(80): 2725–2730.
37. Chang H-Y, Sang T-K, Jan K-Y, et al.: Cellular DNA contents and cell volumes of batoids. Copeia. 1995; 1995: 571–576. Publisher Full Text
38. Liu B, Shi Y, Yuan J, et al.: Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. arXiv: Genomics. 2013.
39. Schwartz FJ, Maddock MB: Cytogenetics of the elasmobranchs: Genome evolution and phylogenetic implications. Mar. Freshw. Res. 2002; 53: 491–502. Publisher Full Text
40. Pflug JM, Holmes VR, Burrus C, et al.: Measuring Genome Sizes Using Read-Depth, k-mers, and Flow Cytometry: Methodological Comparisons in Beetles (Coleoptera). G3 (Bethesda, Md.). 2020; 10: 3047–3060. PubMed Abstract | Publisher Full Text | Free Full Text
41. Uno Y, Nozu R, Kiyatake I, et al.: Cell culture-based karyotyping of orectolobiform sharks for chromosome-scale genome analysis. Commun. Biol. 2020; 3: 652. PubMed Abstract | Publisher Full Text | Free Full Text
42. Maddock MB, Schwartz FJ: Elasmobranch Cytogenetics: Methods and Sex Chromosomes. Bull. Mar. Sci. 1996; 58: 147–155.
43. Kalari KR, Casavant M, Bair TB, et al.: First exons and introns--a survey of GC content and gene structure in the human genome. In Silico Biol. 2006; 6: 237–242. PubMed Abstract
44. Zhu L, Zhang Y, Zhang W, et al.: Patterns of exon-intron architecture variation of genes in eukaryotic genomes. BMC Genomics. 2009; 10: 47. PubMed Abstract | Publisher Full Text | Free Full Text
45. Hara Y, Yamaguchi K, Onimaru K, et al.: Shark genomes provide insights into elasmobranch evolution and the origin of vertebrates. Nat. Ecol. Evol. 2018; 2: 1761–1771. PubMed Abstract | Publisher Full Text
46. Ohno S, Atkin NB: Comparative DNA values and chromosome complements of eight species of fishes. Chromosoma. 1966; 18: 455–466. PubMed Abstract | Publisher Full Text
47. Pedersen RA: DNA content, ribosomal gene multiplicity, and cell size in fish. J. Exp. Zool. 1971; 177: 65–78. PubMed Abstract | Publisher Full Text
48. Vinogradov AE: Genome size and chromatin condensation in vertebrates. Chromosoma. 2005; 113: 362–369. PubMed Abstract | Publisher Full Text
49. Pellicer J, Fay MF, Leitch IJ: The largest eukaryotic genome of them all?. Bot. J. Linn. Soc. 2010; 164: 10–15. Publisher Full Text
50. Schwartz FJ: Comparisons of karyotypes and cellular DNA contents within and between major lines of elasmobranchs. Indo-Pacific fish biology: proceedings of the Second International Conference on Indo-Paclfic Fishes, Ichthyological Soc. of Japan. 1986; pp. 148–157.
51. Hardie DC, Hebert PD: Genome-size evolution in fishes. Can. J. Fish. Aquat. Sci. 2004; 61: 1636–1646. Publisher Full Text
52. Weber JA, Park SG, Luria V, et al.: The whale shark genome reveals how genomic and physiological properties scale with body size. Proc. Natl. Acad. Sci. USA. 2020; 117: 20662–20671. PubMed Abstract | Publisher Full Text | Free Full Text
53. Zhang Y, Gao H, Li H, et al.: The White-Spotted Bamboo Shark Genome Reveals Chromosome Rearrangements and Fast-Evolving Immune Genes of Cartilaginous Fish. iScience. 2020; 23: 101754. PubMed Abstract | Publisher Full Text | Free Full Text
54. Marra NJ, Stanhope MJ, Jue NK, et al.: White shark genome reveals ancient elasmobranch adaptations associated with wound healing and the maintenance of genome stability. Proc Natl Acad Sci USA. 2019; 116: 4446–4455. PubMed Abstract | Publisher Full Text | Free Full Text
55. Nakatani Y, Shingate P, Ravi V, et al.: Reconstruction of proto-vertebrate, proto-cyclostome and proto-gnathostome genomes provides new insights into early vertebrate evolution. Nat. comm. 2021; 12: 4489. PubMed Abstract | Publisher Full Text | Free Full Text
56. National Institute of Genetics (formerly at RIKEN BDR), Kuraku Lab: Supplementary information for Kadota et al., 2023. figshare. Journal contribution. 2023. Publisher Full Text

Comments on this article Comments (0)

Version 1

VERSION 1 PUBLISHED 25 Sep 2023

Author details Author details

¹ Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo, 650-0047, Japan
² Graduate School of Fisheries and Environmental Sciences, Nagasaki University, Nagasaki, Nagasaki, 852-8131, Japan
³ School of Marine Biosciences, Kitasato University, Sagamihara, Kanagawa, 252-0373, Japan
⁴ Okinawa Churashima Research Center, Okinawa Churashima Foundation, Motobu, Okinawa, 905-0206, Japan
⁵ Okinawa Churaumi Aquarium, Okinawa Churashima Foundation, Motobu, Okinawa, 905-0206, Japan
⁶ Ushimado Marine Institute, Graduate School of Natural Science and Technology, Okayama University, Setouchi, Okayama, 701-4303, Japan
⁷ Graduate School of Arts and Sciences, The University of Tokyo, Meguro-ku, Tokyo, 153-0041, Japan
⁸ Molecular Life History Laboratory, National Institute of Genetics, Mishima, Shizuoka, 411-8540, Japan
⁹ Genetics Program, Sokendai (Graduate University for Advanced Studies), Mishima, Shizuoka, 411-8540, Japan

Mitsutaka Kadota
Roles: Formal Analysis, Investigation, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Kaori Tatsumi
Roles: Formal Analysis, Resources

Kazuaki Yamaguchi
Roles: Resources

Atsuko Yamaguchi
Roles: Resources

Takashi Asahida
Roles: Resources

Keiichi Sato
Roles: Resources

Tatsuya Sakamoto
Roles: Resources

Yoshinobu Uno
Roles: Resources

Shigehiro Kuraku
Roles: Conceptualization, Formal Analysis, Funding Acquisition, Project Administration, Supervision, Writing – Original Draft Preparation, Writing – Review & Editing

Competing interests

No competing interests were disclosed.

Grant information

This study was funded by intramural grants from RIKEN and National Institute of Genetics (NIG), as well as JSPS KAKENHI Grant Nos. 20H03269 and 22K15088.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Article Versions (1)

version 1

Published: 25 Sep 2023, 12:1204

https://doi.org/10.12688/f1000research.136385.1

Copyright

© 2023 Kadota M et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Download

Export To

metrics

	Views	Downloads
F1000Research	-	-
PubMed Central Data from PMC are received and updated monthly.	-	-

Citations

0

SEE MORE DETAILS

CITE

how to cite this article

Kadota M, Tatsumi K, Yamaguchi K et al. Shark and ray genome size estimation: methodological optimization for inclusive and controllable biodiversity genomics [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2023, 12:1204 (https://doi.org/10.12688/f1000research.136385.1)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

track

receive updates on this article

Track an article to receive email alerts on any updates to this article.

Open Peer Review

Version 1

VERSION 1

PUBLISHED 25 Sep 2023

Views

11

Reviewer Report 27 Jan 2024

Nancy Saavedra-Sotelo, Universidad Autónoma de Sinaloa, Mazatlán, Mexico

Approved with Reservations

https://doi.org/10.5256/f1000research.149528.r236221

The authors present a research focus on estimations of the size of the genomes of sharks and rays using qPCR. I consider that it is research that has value to science because this knowledge will serve as to base for ... Continue reading

The authors present a research focus on estimations of the size of the genomes of sharks and rays using qPCR. I consider that it is research that has value to science because this knowledge will serve as to base for future studies.
Abstract
Reading the paper, I consider that its results do not show this sentence: “Our qPCR-based method ‘sQuantGenome’ enabled accurate genome size estimation without using live cells”. I invite the authors to write more objective sentences. I make this observation because there is no methodology that allows us to assess the accurate genome size estimation of their results.
Introduction
I considered that the introduction contained enough information to understand the utility of using qPCR to quantify the genome sizes in wild species with difficult-to-obtain fresh samples.
Methods
I wonder if it is impossible to have shark cells in 4c? or use gametes to have another reference in addition to the mouse cells (2c and 4c). Maybe this comment is for future experiments.
Results.
I consider that it is too bold to say: “Close correlation in the estimated genome size values based on two independent methods suggested high reliability of the data except one case”. There is no statistical analysis that verifies the real difference, much less a statistical correlation. I recommend the authors simply mention that the results can be similar without any apparent pattern. This is because in some cases the qPCR presents higher values but in others lower than cytometry.
Discussion
1. The authors say: “The pPCR-based method was used for several non-shark species in past studies. However, it received criticism about the accuracy of the estimate…” This is precisely why my comment about not ensuring that there is a correlation between methods is good. Although the authors standardize to a single gene, this method still has certain sensitivity biases when making measurements.
2. The section “Optimization of qPCR-based method our protocol” should be in methods. In this case, if authors need to convince the readers, they should write about the high points of methods, at what point the DNA quantification of the protocol can be biased, and how it is reduced with their methodology. However, they cannot give the detailed methodology under discussion.
3. The authors say: “Whether the estimation was based on flow cytometry of pPCR, our estimates were mostly comparable to the existing reports, suggesting its high accuracy”. I find this sentence to be pretentious, I think the authors should consider that they do not have a statistical method to say the results are comparable and reliable. Although it is an advance in the knowledge of the size of the genomes in no model organisms, the methodologies have biases and are evident in the difference in values when visually comparing the results.
4. The last two sentences of the discussion may be controversial. I believe that any approach can have biases when quantifying the size of the genomes, including qPCR. I invite the authors to write these sentences mentioning the usefulness of making comparisons between methods to have greater certainty of the genome sizes, I mean that it will always be useful to have genome data with cytometry, qPCR, and other methods that allow us to give an idea of the average genome size of no model species.

Is the work clearly and accurately presented and does it cite the current literature?

Yes
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Population genomics, phylogeny, phylogeography, and reproductive behavior

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

CITE

Report a concern

Respond or Comment

Views

11

Reviewer Report 23 Oct 2023

Jochen Wilhelm, Universities of Giessen and Marburg Lung Center, Giessen, Hesse, Germany

Approved

https://doi.org/10.5256/f1000research.149528.r210076

The report presents optimized procedures to determine the genome sizes of shark species based on quantitative real-time PCR and gives results for 10 shark species (plus M. musculus as a mammalian control), 5 of which were determined by established methods ... Continue reading

The report presents optimized procedures to determine the genome sizes of shark species based on quantitative real-time PCR and gives results for 10 shark species (plus M. musculus as a mammalian control), 5 of which were determined by established methods in previous studies.

Most of the relevant details of the protocols and results are given in the online supplement.

A statistical analysis of the results is missing. It would be good to have some measure of the precision of the estimates, not just the point estimates. I'd prefer seeing 95% confidence intervals from samples of 3-5 independent determinations (ideally from different subjects/animals, if this is possible; the individual values could be listed in the supplement). This would also allow judging if the differences between the methods are systematic or within the margin of error and strengthen the conclusions.

Is the work clearly and accurately presented and does it cite the current literature?

Yes
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: qrtPCR, RNA-seq, statistics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

CITE

Report a concern

Respond or Comment

Comments on this article Comments (0)

Version 1

VERSION 1 PUBLISHED 25 Sep 2023

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1	2
Version 1 25 Sep 23	read	read

Jochen Wilhelm, Universities of Giessen and Marburg Lung Center, Giessen, Germany
Nancy Saavedra-Sotelo, Universidad Autónoma de Sinaloa, Mazatlán, Mexico

Comments on this article

All Comments(0)

Add a comment

Sign up for content alerts

Browse by related subjects

Back to all reports

Reviewer Report

11 Views

27 Jan 2024 | for Version 1

Nancy Saavedra-Sotelo, Universidad Autónoma de Sinaloa, Mazatlán, Mexico

11 Views Cite this report Responses(0)

Approved With Reservations

The authors present a research focus on estimations of the size of the genomes of sharks and rays using qPCR. I consider that it is research that has value to science because this knowledge will serve as to base for future studies.
Abstract
Reading the paper, I consider that its results do not show this sentence: “Our qPCR-based method ‘sQuantGenome’ enabled accurate genome size estimation without using live cells”. I invite the authors to write more objective sentences. I make this observation because there is no methodology that allows us to assess the accurate genome size estimation of their results.
Introduction
I considered that the introduction contained enough information to understand the utility of using qPCR to quantify the genome sizes in wild species with difficult-to-obtain fresh samples.
Methods
I wonder if it is impossible to have shark cells in 4c? or use gametes to have another reference in addition to the mouse cells (2c and 4c). Maybe this comment is for future experiments.
Results.
I consider that it is too bold to say: “Close correlation in the estimated genome size values based on two independent methods suggested high reliability of the data except one case”. There is no statistical analysis that verifies the real difference, much less a statistical correlation. I recommend the authors simply mention that the results can be similar without any apparent pattern. This is because in some cases the qPCR presents higher values but in others lower than cytometry.
Discussion
1. The authors say: “The pPCR-based method was used for several non-shark species in past studies. However, it received criticism about the accuracy of the estimate…” This is precisely why my comment about not ensuring that there is a correlation between methods is good. Although the authors standardize to a single gene, this method still has certain sensitivity biases when making measurements.
2. The section “Optimization of qPCR-based method our protocol” should be in methods. In this case, if authors need to convince the readers, they should write about the high points of methods, at what point the DNA quantification of the protocol can be biased, and how it is reduced with their methodology. However, they cannot give the detailed methodology under discussion.
3. The authors say: “Whether the estimation was based on flow cytometry of pPCR, our estimates were mostly comparable to the existing reports, suggesting its high accuracy”. I find this sentence to be pretentious, I think the authors should consider that they do not have a statistical method to say the results are comparable and reliable. Although it is an advance in the knowledge of the size of the genomes in no model organisms, the methodologies have biases and are evident in the difference in values when visually comparing the results.
4. The last two sentences of the discussion may be controversial. I believe that any approach can have biases when quantifying the size of the genomes, including qPCR. I invite the authors to write these sentences mentioning the usefulness of making comparisons between methods to have greater certainty of the genome sizes, I mean that it will always be useful to have genome data with cytometry, qPCR, and other methods that allow us to give an idea of the average genome size of no model species.

Is the work clearly and accurately presented and does it cite the current literature?

Yes
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Population genomics, phylogeny, phylogeography, and reproductive behavior

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Respond to this report

Responses (0)

Back to all reports

Reviewer Report

11 Views

23 Oct 2023 | for Version 1

Jochen Wilhelm, Universities of Giessen and Marburg Lung Center, Giessen, Hesse, Germany

11 Views Cite this report Responses(0)

Approved

The report presents optimized procedures to determine the genome sizes of shark species based on quantitative real-time PCR and gives results for 10 shark species (plus M. musculus as a mammalian control), 5 of which were determined by established methods in previous studies.

Most of the relevant details of the protocols and results are given in the online supplement.

A statistical analysis of the results is missing. It would be good to have some measure of the precision of the estimates, not just the point estimates. I'd prefer seeing 95% confidence intervals from samples of 3-5 independent determinations (ideally from different subjects/animals, if this is possible; the individual values could be listed in the supplement). This would also allow judging if the differences between the methods are systematic or within the margin of error and strengthen the conclusions.

Is the work clearly and accurately presented and does it cite the current literature?

Yes
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

qrtPCR, RNA-seq, statistics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Respond to this report

Responses (0)

[1] 1. Nowoshilow S, Schloissnig S, Fei JF, et al.: The axolotl genome and the evolution of key tissue formation regulators. Nature. 2018; 554: 50–55. PubMed Abstract | Publisher Full Text

[2] 2. Keinath MC, Timoshevskiy VA, Timoshevskaya NY, et al.: Initial characterization of the large genome of the salamander Ambystoma mexicanum using shotgun and laser capture chromosome sequencing. Sci. Rep. 2015; 5: 16413. PubMed Abstract | Publisher Full Text | Free Full Text

[3] 3. Vendrely R, Vendrely C: The content of dexyxyconucleic acid in the cell nucleus through organs, individuals and animal species. Experientia. 1948; 4: 434–436. PubMed Abstract | Publisher Full Text

[4] 4. Mirsky AE, Ris H: The desoxyribonucleic acid content of animal cells and its evolutionary significance. J. Gen. Physiol. 1951; 34: 451–462. PubMed Abstract | Publisher Full Text | Free Full Text

[5] 5. Heagy FC, Roper JA: Deoxyribonucleic acid content of haploid and diploid Aspergillus conidia. Nature. 1952; 170: 713–714. PubMed Abstract | Publisher Full Text

[6] 6. Britten RJ, Kohne DE: Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science. 1968; 161: 529–540. Publisher Full Text

[7] 7. Mori K, Wintzerith M, Mandel P: Reassociation of normal mouse DNA and mouse plasmocytoma DNA. FEBS Lett. 1973; 35: 7–10. PubMed Abstract | Publisher Full Text

[8] 8. Singer DS: Arrangement of a highly repeated DNA sequence in the genome and chromatin of the African green monkey. J. Biol. Chem. 1979; 254: 5506–5514. PubMed Abstract | Publisher Full Text

[9] 9. Ward BL, Anderson RS, Bendich AJ: The mitochondrial genome is large and variable in a family of plants (cucurbitaceae). Cell. 1981; 25: 793–803. PubMed Abstract | Publisher Full Text

[10] 10. Hardie DC, Hebert PD: The nucleotypic effects of cellular DNA content in cartilaginous and ray-finned fishes. Genome. 2003; 46: 683–706. PubMed Abstract | Publisher Full Text

[11] 11. Michaelson MJ, Price HJ, Ellison JR, et al.: Comparison of plant DNA contents determined by Feulgen microspectrophotometry and laser flow cytometry. Am. J. Bot. 1991; 78: 183–188. Publisher Full Text

[12] 12. Gregory TR: Synergy between sequence and size in large-scale genomics. Nat. Rev. Genet. 2005; 6: 699–708. PubMed Abstract | Publisher Full Text

[13] 13. Liedtke HC, Gower DJ, Wilkinson M, et al.: Macroevolutionary shift in the size of amphibian genomes and the role of life history and climate. Nat. Ecol. Evol. 2018; 2: 1792–1799.

[14] 14. Kuraku S: Shark and ray genomics for disentangling their morphological diversity and vertebrate evolution. Dev. Biol. 2021; 477: 262–272. PubMed Abstract | Publisher Full Text

[15] 15. Wilhelm J, Pingoud A, Hahn M: Real-time PCR-based method for the estimation of genome sizes. Nucleic Acids Res. 2003; 31: 56e–e56, 556. PubMed Abstract | Publisher Full Text | Free Full Text

[16] 16. Mounsey KE, Willis C, Burgess ST, et al.: Quantitative PCR-based genome size estimation of the astigmatid mites Sarcoptes scabiei, Psoroptes ovis and Dermatophagoides pteronyssinus. Parasit. Vectors. 2012; 5: 3.

[17] 17. Kim JH, Roh JY, Kwon DH, et al.: Estimation of the genome sizes of the chigger mites Leptotrombidium pallidum and Leptotrombidium scutellare based on quantitative PCR and k-mer analysis. Parasit. Vectors. 2014; 7: 279. PubMed Abstract | Publisher Full Text | Free Full Text

[18] 18. Kuang YY, Zheng XH, Li CY, et al.: The genetic map of goldfish (Carassius auratus) provided insights to the divergent genome evolutions in the Cyprinidae family. Sci. Rep. 2016; 6: 34849. PubMed Abstract | Publisher Full Text | Free Full Text

[19] 19. Malli RPN, Adal AM, Sarker LS, et al.: De novo sequencing of the Lavandula angustifolia genome reveals highly duplicated and optimized features for essential oil production. Planta. 2019; 249: 251–256. PubMed Abstract | Publisher Full Text

[20] 20. Gregory TR, Nathwani P, Bonnett TR, et al.: Sizing up arthropod genomes: an evaluation of the impact of environmental variation on genome size estimates by flow cytometry and the use of qPCR as a method of estimation. Genome. 2013; 56: 505–510. PubMed Abstract | Publisher Full Text

[21] 21. Edwards RJ, Tuipulotu DE, Amos TG, et al.: Draft genome assembly of the invasive cane toad, Rhinella marina. GigaScience. 2018; 7. PubMed Abstract | Publisher Full Text | Free Full Text

[22] 22. Kadota M, Kuraku S: Supplementary Information. dataset. Figshare. 2023. Publisher Full Text

[23] 23. Hara Y, Takeuchi M, Kageyama Y, et al.: Madagascar ground gecko genome analysis characterizes asymmetric fates of duplicated genes. BMC Biol. 2018; 16: 40. PubMed Abstract | Publisher Full Text | Free Full Text

[24] 24. Hare EE, Johnston JS: Genome size determination using flow cytometry of propidium iodide-stained nuclei. Methods Mol. Biol. 2011; 772: 3–12. PubMed Abstract | Publisher Full Text

[25] 25. Vinogradov AE: Genome size and GC-percent in vertebrates as determined by flow cytometry: the triangular relationship. Cytometry. 1998; 31: 100–109. Publisher Full Text

[26] 26. Hara Y, Tatsumi K, Yoshida M, et al.: Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation. BMC Genomics. 2015; 16: 977. PubMed Abstract | Publisher Full Text | Free Full Text

[27] 27. Parra G, Bradnam K, Korf I: CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007; 23: 1061–1067. PubMed Abstract | Publisher Full Text

[28] 28. Simão FA, Waterhouse RM, Ioannidis P, et al.: BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015; 31: 3210–3212. Publisher Full Text

[29] 29. Nishimura O, Rozewicki J, Yamaguchi K, et al.: Squalomix: shark and ray genome analysis consortium and its data sharing platform [version 1; peer review: 2 approved]. F1000Res. 2022; 11: 1077. PubMed Abstract | Publisher Full Text | Free Full Text

[30] 30. Untergasser A, Cutcutache I, Koressaar T, et al.: Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012; 40: e115. PubMed Abstract | Publisher Full Text | Free Full Text

[31] 31. Yamaguchi K, Uno Y, Kadota M, et al.: Elasmobranch genome sequencing reveals evolutionary trends of vertebrate karyotype organization. Genome Res. 2023. Publisher Full Text

[32] 32. Rhie A, McCarthy SA, Fedrigo O, et al.: Towards complete and error-free genome assemblies of all vertebrate species. Nature. 2021; 592: 737–746. PubMed Abstract | Publisher Full Text | Free Full Text

[33] 33. Stingo V, Du Buit M-H, Odierna G: Genome size of some selachian fishes. Boll. Zool. 1980; 47: 129–137. Publisher Full Text

[34] 34. Asahida T, Ida H, Inoue S: Karyotypes of three rays in the order Myliobatiformes. Jap. J. Ichthyol. 1987; 33: 426–430.

[35] 35. Ojima Y: Cellular DNA contents of fishes determined by flow cytometry. La Kromosomo II. 1990; 57: 1871–1888.

[36] 36. Asahida T, Ida H: Karyotype and cellular DNA content of a guitarfish, Rhinobatos schlegelii. La Kromosomo II. 1995; 79(80): 2725–2730.

[37] 37. Chang H-Y, Sang T-K, Jan K-Y, et al.: Cellular DNA contents and cell volumes of batoids. Copeia. 1995; 1995: 571–576. Publisher Full Text

[38] 38. Liu B, Shi Y, Yuan J, et al.: Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. arXiv: Genomics. 2013.

[39] 39. Schwartz FJ, Maddock MB: Cytogenetics of the elasmobranchs: Genome evolution and phylogenetic implications. Mar. Freshw. Res. 2002; 53: 491–502. Publisher Full Text

[40] 40. Pflug JM, Holmes VR, Burrus C, et al.: Measuring Genome Sizes Using Read-Depth, k-mers, and Flow Cytometry: Methodological Comparisons in Beetles (Coleoptera). G3 (Bethesda, Md.). 2020; 10: 3047–3060. PubMed Abstract | Publisher Full Text | Free Full Text

[41] 41. Uno Y, Nozu R, Kiyatake I, et al.: Cell culture-based karyotyping of orectolobiform sharks for chromosome-scale genome analysis. Commun. Biol. 2020; 3: 652. PubMed Abstract | Publisher Full Text | Free Full Text

[42] 42. Maddock MB, Schwartz FJ: Elasmobranch Cytogenetics: Methods and Sex Chromosomes. Bull. Mar. Sci. 1996; 58: 147–155.

[43] 43. Kalari KR, Casavant M, Bair TB, et al.: First exons and introns--a survey of GC content and gene structure in the human genome. In Silico Biol. 2006; 6: 237–242. PubMed Abstract

[44] 44. Zhu L, Zhang Y, Zhang W, et al.: Patterns of exon-intron architecture variation of genes in eukaryotic genomes. BMC Genomics. 2009; 10: 47. PubMed Abstract | Publisher Full Text | Free Full Text

[45] 45. Hara Y, Yamaguchi K, Onimaru K, et al.: Shark genomes provide insights into elasmobranch evolution and the origin of vertebrates. Nat. Ecol. Evol. 2018; 2: 1761–1771. PubMed Abstract | Publisher Full Text

[46] 46. Ohno S, Atkin NB: Comparative DNA values and chromosome complements of eight species of fishes. Chromosoma. 1966; 18: 455–466. PubMed Abstract | Publisher Full Text

[47] 47. Pedersen RA: DNA content, ribosomal gene multiplicity, and cell size in fish. J. Exp. Zool. 1971; 177: 65–78. PubMed Abstract | Publisher Full Text

[48] 48. Vinogradov AE: Genome size and chromatin condensation in vertebrates. Chromosoma. 2005; 113: 362–369. PubMed Abstract | Publisher Full Text

[49] 49. Pellicer J, Fay MF, Leitch IJ: The largest eukaryotic genome of them all?. Bot. J. Linn. Soc. 2010; 164: 10–15. Publisher Full Text

[50] 50. Schwartz FJ: Comparisons of karyotypes and cellular DNA contents within and between major lines of elasmobranchs. Indo-Pacific fish biology: proceedings of the Second International Conference on Indo-Paclfic Fishes, Ichthyological Soc. of Japan. 1986; pp. 148–157.

[51] 51. Hardie DC, Hebert PD: Genome-size evolution in fishes. Can. J. Fish. Aquat. Sci. 2004; 61: 1636–1646. Publisher Full Text

[52] 52. Weber JA, Park SG, Luria V, et al.: The whale shark genome reveals how genomic and physiological properties scale with body size. Proc. Natl. Acad. Sci. USA. 2020; 117: 20662–20671. PubMed Abstract | Publisher Full Text | Free Full Text

[53] 53. Zhang Y, Gao H, Li H, et al.: The White-Spotted Bamboo Shark Genome Reveals Chromosome Rearrangements and Fast-Evolving Immune Genes of Cartilaginous Fish. iScience. 2020; 23: 101754. PubMed Abstract | Publisher Full Text | Free Full Text

[54] 54. Marra NJ, Stanhope MJ, Jue NK, et al.: White shark genome reveals ancient elasmobranch adaptations associated with wound healing and the maintenance of genome stability. Proc Natl Acad Sci USA. 2019; 116: 4446–4455. PubMed Abstract | Publisher Full Text | Free Full Text

[55] 55. Nakatani Y, Shingate P, Ravi V, et al.: Reconstruction of proto-vertebrate, proto-cyclostome and proto-gnathostome genomes provides new insights into early vertebrate evolution. Nat. comm. 2021; 12: 4489. PubMed Abstract | Publisher Full Text | Free Full Text

[56] 56. National Institute of Genetics (formerly at RIKEN BDR), Kuraku Lab: Supplementary information for Kadota et al., 2023. figshare. Journal contribution. 2023. Publisher Full Text

Shark and ray genome size estimation: methodological optimization for inclusive and controllable biodiversity genomics

Abstract

Keywords

Introduction

Methods

Animals

Figure 1. Phylogenetic relationship of the elasmobranch species used in this study.

Cell preparation for flow cytometry

Flow cytometry

DNA extraction for qPCR

Selection of reference genes and primer design

Preparation of qPCR copy number standards

qPCR analysis

Sequence-based genome-size estimation with k-mer frequency

Results

Genome size estimation with flow cytometry

Figure 2. Genome size estimation by flow cytometry.

Genome size estimation with qPCR

Figure 3. Genome size estimation by quantitative PCR.

Table 1. Estimated genome sizes for the species selected for this study.

Genome sizes estimated from whole genome sequences

Discussion

Importance of accurate genome size estimates—towards ‘T2T’

Figure 4. Comparisons of estimated genome sizes for cartilaginous fishes.

Which method?—reconsideration in this post-genomic era

Optimization of qPCR-based method and our protocol

In silico preparation for qPCR-based genome size estimation

Genome size variation among elasmobranchs

Author contributions

Data availability

Extended data

Acknowledgments

References

Comments on this article Comments (0)

Open Peer Review

Comments on this article Comments (0)

Open Peer Review

Reviewer Status

Reviewer Reports

Comments on this article

Browse by related subjects

Competing Interests Policy

Stay Updated