Keywords
tunicate, larvacean, appendicularian
This article is included in the Genomics and Genetics gateway.
Appendicularians are planktonic tunicates abundant all over the world. Currently, only two complete annotated mitochondrial genome assemblies are available for appendicularians, both for cryptic species of Oikopleura dioica. This underrepresentation of available appendicularian mitochondrial genomes limits environmental DNA sequencing (eDNA) studies that rely on mitochondrial markers as a taxonomic barcode. We report the complete mitochondrial genome assembly and annotation of an unknown appendicularian species isolated from the Amami Oshima island, Kagoshima prefecture, Japan, that has significant sequence difference with other currently available assemblies and will serve as a useful resource for ecological studies and further mitochondrial studies of appendicularians.
tunicate, larvacean, appendicularian
We have modified the manuscript according to the comments of the reviewers in this round of review.
See the authors' detailed response to the review by Thomas Stach
See the authors' detailed response to the review by Ayelet Voskoboynik
See the authors' detailed response to the review by Carmela Gissi
See the authors' detailed response to the review by Daniel M Chourrout
Appendicularians (synonym: larvaceans) are tunicates distributed all over the world’s ocean that do not have a sessile stage, remaining free-swimming throughout their life cycle, and construct a cellulose “house” which is used for feeding and protection.1 The best studied appendicularian is the ~4 mm length O. dioica, 2 but there are also large species with a body size ranging between 3–10 cm.3 Appendicularian mitochondrial genomes use the ascidian mitochondrial genetic code,4–8 which differs from the invertebrate one by the reassignment of AGR codons from serine to glycine. In one clade within appendicularians containing O. dioica, homopolymers interrupt coding sequences and are resolved to hexamers by an unknown editing process.4–6 In this study, we sequenced the mitochondrial genome of an unknown appendicularian species sampled from the Amami Oshima island, Japan (Figure 1), in order to increase the taxonomic power of eDNA studies based on the sequence of mitochondrial genes. In addition to that, mitochondrial DNA sequence of appendicularians can be useful in evolutionary studies to the origin of tunicates, as well as chordates, and as a means to understand the hidden biodiversity of Appendicularia, which is a quite neglected taxon.
We collected specimens at Tamari harbor, Amami Oshima island, Kagoshima prefecture, Japan (28.41491667 N 129.59016667 E) in July 2023. An identical specimen from the same catch was deposited at the Kagoshima University Museum (www.museum.kagoshima-u.ac.jp) under the voucher number KAUM-UR 1, the original specimen was consumed for sequencing. Samples were preserved in 99% ethanol at -80°C prior to DNA extraction. DNA extraction was done by firstly washing samples with 5 mL of filtered autoclaved seawater 3 times before resuspending in 200 μl of lysis buffer from the MagAttract HMW DNA Kit (Qiagen, USA #67563) with 20 μL of 10 μg/mL proteinase K and incubated for 1 h at 56°C. Next, 50 μL of 5 M NaCl was added before centrifugation of the mixture (5000 × g at 4°C) for 15 min. The supernatant was transferred into a new microtube and mixed with 400 μL of 100% EtOH and 5 μL of glycogen (20 mg/mL) and cooled at -80°C for 20 min. Further centrifugation at 6250 × g, 4°C for 5 min was performed and the supernatant removed. The obtained pellet was then washed with 1 mL of cold 70% ethanol, centrifuged, and air-dried for 5 min. The DNA was then resuspended in nuclease free water and quantified using a Qubit 3 Fluorometer (Thermo Fisher, USA). DNA was fragmented using Megaruptor 3® (Diagenode, USA) using the Megaruptor 3 Shearing Kit (Diagenode, USA #E07010003) at speed 32 and purified using SMRTbell cleanup beads (Pacific Biosciences, USA #102-158-300). Quality control of obtained DNA was performed using the Femto Pulse System (Agilent, USA) and the Genomic DNA 165 kb Kit (Agilent, USA #FP-1002-0275). The sequencing was performed on a PacBio® Sequel II sequencer (Pacific Biosciences, USA) using the Sequel II sequencing kit 2.0 (Pacific Biosciences, USA #101-820-200). The DNA size profile and sequencing metrics are available on Zenodo (doi: 10.5281/zenodo.14934254).
The sequenced reads were assembled with NOVOLoci (https://github.com/ndierckx/NOVOLoci) in targeted assembly mode. A partial PacBio read sequence found by a BLAST9 search of cytochrome c oxidase subunit 1 from Okinawa O. dioica5 to the raw whole DNA reads was used as a seed sequence. The coverage plot was generated by mapping the sequencing reads that were used to assemble the mitogenome to the assembly itself using minimap version 2.28-r1209.10
We annotated the assembly with MITOS2 v2.1.911 using the ascidian mitochondrial genetic code.7 ARWEN version 1.2.312 was used in addition to annotate putative tRNAs.
Protein-coding mitochondrial sequences were extracted from GenBank records with EMBOSS13 and codon-aligned manually in SeaView 5.0.514 after a first alignment with Clustal Omega version 1.2.4.15 The phylogenetic tree was computed with IQ-TREE version 2.0.716 using the command-line options “-T AUTO (run as many CPU threads as possible) --runs 3 (perform 3 runs to assess model convergence) --polytomy (allow unresolved branches) --ufboot 1000 (1000 boostraps with the ultrafast method) -m MFP (automatic selection of the best model for maximum likelihood inference using the ModelFinder Plus method, with one partition per gene. Sequences, accession numbers, and trees are available on Zenodo (doi: 10.5281/zenodo.14934254).
We noticed large appendicularians during a sampling trip targeting O. dioica in the Amami Oshima island, Kagoshima, Japan. We took the opportunity to collect several of these large appendicularians and sequenced a single individual, from which we assembled a circular mitogenome of 13,058 bp length (Figure 2). We mapped the fraction of sequencing reads that were used for the assembly to the assembled sequence and obtained a sequencing depth between 36–176 × and an average depth of 122.4 × (Fig. S1).
Circular plot generated by Geneious Prime version 2024.0.5. Protein coding genes and tRNAs are displayed on a yellow and pink background respectively. The circle in the middle illustrates the homopolymers; 6 or more successive Ts (forward) or As (reverse) in red and Cs (forward) or Gs (reverse) in blue. Abbreviations as follows, cox2: cytochrome c oxidase subunit 2, nad5: NADH dehydrogenase subunit 5, cob: cytochrome b, nad4: NADH dehydrogenase subunit 4, nad1: NADH dehydrogenase subunit 1, putative nad2: putative NADH dehydrogenase subunit 2, atp6: ATP synthase Fo subunit 6, cox1: cytochrome c oxidase subunit 1, cox3: cytochrome c oxidase subunit 3.
The mitogenome does not possess stretches of homopolymers like the ones observed in O. dioica. There is also no evidence of mitochondrial introns. Thus, we could translate its open reading frames (ORFs) with no interruptions. We found a total of 9 protein coding genes, two pseudogenes (fragments of cox3) and 2 tRNA genes, all on the same strand (Table S1), but could not annotate ribosomal RNA genes, although the length of the remaining unannotated regions suggest that they may be present. We were only able to detect tRNA genes for Leucine and Valine.
Eight out of the nine protein-coding genes match known mitochondrial proteins without ambiguity. Previous work on other species suggests that synteny is usually not conserved in tunicates.20 Indeed, we found that the gene order bears no resemblance to the one of O. dioica5,6 nor to the one of O. longicauda21 (scaffold SCLD01101138.1) nor to one in the stolidobranch ascidians Herdmania momus and Halocynthia roretzi.
We also found an ORF with homology to the putative NADH dehydrogenase 2 (nad2) reported by Klirs et al.,6 and we also found matches in other appendicularian species, confirming its presence across Oikopleuridae (Fig. S2). Searches using BLASTp on the non-redundant protein sequences database did not yield hits, and a tBLASTx search on whole-genome shotgun contigs database of tunicates (taxid:7712) matched a predicted Oikopleura longicauda mitochondrial contig (SCLD01139119.1) which is different from the one we used for the phylogenetic analysis and misses four of the eight expected mitochondrial proteins. The predicted structure of the putative nad2 using colabfold17 consists of alpha helices (Fig. S3), similar to reported nad2 protein from human (PDB IDs: 5XTC chain Q). This observation might have been caused by tunicates having fast evolving mitochondria,20 and the coverage gap in the database that currently is available.
We extended the automatic gene annotation to the longest ORF which has stop codons (TAA, TAG) and start codons (TTG, ATA, ATG, GTG) accepted by the ascidian mitochondrial code.7 Nevertheless, due to the variability of initiation tRNA, we cannot rule out the possibility of the translation start codon being different, for example Halocynthia roretzi uses ATT as a start codon.22
The codon usage (table S2) shows that, while TGA codes for tryptophan in tunicates, it is used in less than 5% of the tryptophan positions. Furthermore, these TGA codons were only found in the most N-terminal region of cox2, which is not well supported by alignment to other appendicularians and has a possible alternative start site downstream of these codons. Thus, depending on the real position of cox2’s translation start site, it is possible that the TGA codon is not used in this genome, similar to what was reported for O. longicauda on the cox1 and cob genes.7 Other than that, there are several other codon biases, such as towards TTG (39.7% and TTA (30.4%) for leucine. Another bias is present towards GTG that is coding for valine (56.7%).
The phylogenetic tree using protein-coding mitochondrial sequences (Figure 3, Figure S4) shows that this unknown species belongs to the clade of appendicularians that includes Bathochordaeus, Mesochordaeus and Oikopleura longicauda but not O. dioica. This clade was also found in a phylogenetic analysis of ribosomal protein sequences.21 The split between O. dioica and the other appendicularians in our tree corresponds to the bioluminescent/non-bioluminescent classification of Galt et al., 1985.23 This is also reflected in the situation of homopolymers which are not abundant in this mitogenome, similar to O. longicauda21 and not O. dioica.5 As the Oikopleura genus is paraphyletic in our phylogenetic analysis and that of others, further work not in the scope of this manuscript will be needed to resolve which genus has to be corrected. Considering the tunicates phylogeny, the tree recovered clades for the free-living Appendicularia, Thaliacea, and for the sessile Stolidobranchia, Aplousobranchia and Phlebobranchia, from which it detached the Ciona genus as a separate clade. A single thaliacean species, Doliolum nationalis, grouped with the sessile tunicates, however this is not fully supported by bootstrap values. This computed tree suggests that targeted sampling and sequencing of additional doliolids and Ciona species may be useful to further clarify the phylogeny of tunicates classes and order.
Phylogenetic tree computed by maximum likelihood inference on a ~13 kbp codon alignment of 13 mitochondrial genome protein-coding genes collected from publicly available aquatic chordate genomes.
As a final attempt to identify the species of this appendicularian, we extracted the sequence of the nuclear ribosomal RNA gene from one sequence read (see supplemental material), which we used to screen the GenBank database. The best hit (MK621860) has 1757 identical nucleotides over a length of 1773 (99%), and is from an O. fusiformis individual sampled in Croatia.
We present here the complete mitogenome of an unidentified Oikopleura species. Our phylogenetic analysis and the lack of homopolymer insertions show that it is closer to the lineage of O. longicauda than to the one of O. dioica. Morphological similarity and a preliminary analysis using nuclear genome rRNA sequences suggest that this unknown appendicularian is most closely related to O. fusiformis, however as our recent studies of O. dioica24,25 uncovered cryptic speciation in appendicularians, and in the absence of specimen preservation allowing for confident taxonomic identification, we refrain from naming the species at this current stage. We project that the data produced in this study will be useful in future eDNA studies.
JNW, CP, and NML conceived the study. AM and JM collected samples and AM performed sequencing. JNW, ND, and CP performed bioinformatics analysis. JNW drafted the manuscript. CP, ND, and NML critically revised the manuscript. All authors approved the final manuscript and agreed to be accountable for all aspects of this work.
The mitochondrial genome sequence was deposited in GenBank under the accession number LC830956. The associated BioProject, SRA, and Bio-Sample numbers are PRJNA1152617, SRR30429256, and SAMN43370082 respectively. The annotation and the sequences used to compute the phylogenetic tree in Fig. 3 are available in Zenodo (doi:10.5281/zenodo.13864550).
Accession numbers
NCBI Nucleotide database: Oikopleura sp. bigama1 mitochondrial DNA, complete genome. Accession number; LC830956. https://www.ncbi.nlm.nih.gov/nuccore/LC830956.1/.
NCBI SRA: Genome sequencing of an unknown Oikopleura species.
Accession number; PRJNA1152617. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1152617/.
NCBI BioSample: Invertebrate sample from Oikopleura sp. bigama1. Accession number; SAMN43370082. https://www.ncbi.nlm.nih.gov/biosample/?term=SAMN43370082
NCBI Sequence Read Archive (SRA). WGS of Oikopleura sp. bigama1 Accession number; SRR30429256. https://www.ncbi.nlm.nih.gov/sra/?term=SRR30429256
Zenodo: Supplemental material to the journal article “The complete mitogenome of an unidentified Oikopleura species”. doi:10.5281/zenodo.13864550.26
Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).
We thank Dr. Michael Mansfield for the invaluable suggestions and guidance in the construction of the nucleotide-based phylogenetic tree, Dr. Yongkai Tan for providing photographs of the specimens and Dr. Rade Garić for critical insights on sequence and morphological similarities to O. fusiformis. We thank the DNA Sequencing Section and the Scientific Computing and Data Analysis Section of the Research Support Division at OIST for their support.
Views | Downloads | |
---|---|---|
F1000Research | - | - |
PubMed Central
Data from PMC are received and updated monthly.
|
- | - |
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: comparative zoology
Are the rationale for sequencing the genome and the species significance clearly described?
Partly
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?
Partly
Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: comparative zoology
Are the rationale for sequencing the genome and the species significance clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Yes
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?
Yes
Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: animal genomics
Are the rationale for sequencing the genome and the species significance clearly described?
Yes
Are the protocols appropriate and is the work technically sound?
Partly
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?
Partly
Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: genomic, evolution, comparative immunology and stem cell biology
Are the rationale for sequencing the genome and the species significance clearly described?
Partly
Are the protocols appropriate and is the work technically sound?
Partly
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?
Partly
Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: mitogenomics, tunicate evolution, long read NGS
Alongside their report, reviewers assign a status to the article:
Invited Reviewers | ||||
---|---|---|---|---|
1 | 2 | 3 | 4 | |
Version 3 (revision) 06 May 25 |
read | |||
Version 2 (revision) 07 Mar 25 |
read | read | ||
Version 1 12 Nov 24 |
read | read |
Provide sufficient details of any financial or non-financial competing interests to enable users to assess whether your comments might lead a reasonable person to question your impartiality. Consider the following examples, but note that this is not an exhaustive list:
Sign up for content alerts and receive a weekly or monthly email with all newly published articles
Already registered? Sign in
The email address should be the one you originally registered with F1000.
You registered with F1000 via Google, so we cannot reset your password.
To sign in, please click here.
If you still need help with your Google account password, please click here.
You registered with F1000 via Facebook, so we cannot reset your password.
To sign in, please click here.
If you still need help with your Facebook account password, please click here.
If your email address is registered with us, we will email you instructions to reset your password.
If you think you should have received this email but it has not arrived, please check your spam filters and/or contact for further assistance.
Comments on this article Comments (0)