ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article
Revised

Widespread use of the “ascidian” mitochondrial genetic code in tunicates

[version 2; peer review: 2 approved]
PUBLISHED 14 Apr 2020
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

Abstract

Background: Ascidians, a tunicate class, use a mitochondrial genetic code that is distinct from vertebrates and other invertebrates. Though it has been used to translate the coding sequences from other tunicate species on a case-by-case basis, it is has not been investigated whether this can be done systematically. This is an important because a) some tunicate mitochondrial sequences are currently translated with the invertebrate code by repositories such as NCBI GenBank, and b) uncertainties about the genetic code to use can complicate or introduce errors in phylogenetic studies based on translated mitochondrial protein sequences.
Methods: We collected publicly available nucleotide sequences for non-ascidian tunicates including appendicularians such as Oikopleura dioica, translated them using the ascidian mitochondrial code, and built multiple sequence alignments covering all tunicate classes.
Results: All tunicates studied here appear to translate AGR codons to glycine instead of serine (invertebrates) or as a stop codon (vertebrates), as initially described in ascidians. Among Oikopleuridae, we suggest further possible changes in the use of the ATA (Ile → Met) and TGA (Trp → Arg) codons.
Conclusions: We recommend using the ascidian mitochondrial code in automatic translation pipelines of mitochondrial sequences for all tunicates. Further investigation is required for additional species-specific differences.

Keywords

Tunicate, Oikopleura, Genetic code, Mitochondria, Cytochrome oxidase subunit I

Revised Amendments from Version 1

Based on Reviewer 2's comments, we expanded our introduction.

See the authors' detailed response to the review by Yuanning Li

Introduction

Tunicates are marine animals that have acquired the capacity to produce cellulose by horizontal gene transfer approximately 500 million years ago (Matthysse et al., 2004; Nakashima et al., 2004). Together with vertebrates and cephalochordates, they belong to the chordate phylum, in which they share morphological features such as a muscular tail during larval stages. Phylogenetic studies place the tunicates as the closest living relatives of vertebrates (Delsuc et al., 2006). Tunicates can be subdivided in three classes: Thaliacea (free-swimming colonial species, for instance salps or dolioids), Appendicularia (free-swimming solitary species with an adult morphologically similar to the larval stage of other tunicates), and Ascidiacea (attached to solid substrates in their adult stage, for instance sea squirts). The relationship between these classes and therefore their mono- or paraphyly has been revised multiple times. For instance the 18S rRNA analysis of Stach & Turbeville (2002) nested Appendicularia within Ascidiacea, but more recently Delsuc et al. (2018) placed them as sister groups using a multigene approach. The paraphyly of Ascidiacea is now widely accepted, as the above studies and others demonstrated that they contain the Thaliacea.

Mitochondrial genomes undergo major changes at the geological time scale due to their small size and clonal reproduction, including changes to their genetic code (Osawa et al., 1992). In animals, alternative genetic codes have first been found in large clades, for instance echinoderms (Himeno et al., 1987) and hemichordates (Castresana et al., 1998), but more recent works underline the presence of changes deeper in the phylogenetic tree, for instance within nematodes (Jacob et al., 2009) and within hemichordates (Li et al., 2019). The first evidence that ascidians use a specific mitochondrial genetic code stemmed from observations that the cytochrome c oxidase subunit 1 (Cox1) sequence from Halocynthia roretzi (Yokobori et al., 1993) and the Cox3 sequence of Pyura stolonifera (Durrheim et al., 1993) are interrupted by stop codons if translated using the vertebrate mitochondrial code. Reassignment of AGR codons to glycine was later confirmed by the discovery of a glycine (Gly) tRNA in the H. roretzi genome (Yokobori et al., 1999) and by the sequencing of its anticodon (U*CU) (Kondow et al., 1999). Apart from the AGR codons, the ascidian code is similar to the vertebrate and the invertebrate ones, with ATA assigned to methionine (Met) and TGA to tryptophan (Trp) (Yokobori et al., 1993).

This genetic code is known as the “ascidian” genetic code; however, it is also used by non-ascidian tunicates, such as the thaliacean Doliolum nationalis (Yokobori et al., 2005). The possibility that this genetic code emerged earlier than tunicates was raised by the study of partial genome sequences of Branchiostoma lanceolatum (Delarbre et al., 1997) leading to the proposition that AGR might encode Gly in cephalochordates. While this seemed to be supported by the discovery of a putative TCT (Gly) tRNA in the full mitochondrial genome of B. lanceolatum (Spruyt et al., 1998), this hypothesis was later ruled out by an analysis of the related amphioxus Branchiostoma floridae (Boore et al., 1999), and has not been reconsidered since. Finally, studies on the appendicularian branch showed compatibility between the mitochondrial sequence of Oikopleura dioica and the ascidian code (Denoeud et al., 2010). Nevertheless, support for compatibility was not demonstrated explicitly for the ATA and TGA codons and the mitochondrial sequence of O. dioica were not released in International Nucleotide Sequence Database Collaborations (INSDC) databanks.

Cox1 is the most conserved mitochondrial protein. Although no mitochondrial genome has been fully sequenced yet for appendicularians, partial Cox1 sequences are present in the INSDC databanks for Oikopleuridae. Sakaguchi et al. (2017) reported that all Oikopleura mitochondrial sequences (AY116609AY116611 and KF977307) may be contaminations from bacteria or cnidarians, and provided partial sequences for Oikopleura longicauda in the same study. Partial mitochondrial sequences were published for Bathochordaeus and Mesochordaeus species by Sherlock et al. (2017). In addition, Naville et al. (2019) recently published draft genome for several appendicularian species. Therefore, to assess whether the ascidian mitochondrial code is used across the whole tunicate subphylum, we took advantage of these public data and prepared a curated alignment of Cox1 sequences comprising representatives of the major tunicate branches, to study the consensus sequences at conserved residues.

Methods

We identified Cox1 and Cytochrome b (Cob) gene sequences for Oikopleura longicauda, Mesochordaeus erythrocephalus and Bathochordaeus stygius by screening published genome assemblies (Naville et al., 2019) with the partial Cox1 sequence of O. longicauda LC222754.1 (Sakaguchi et al., 2017) using tblastn and the ascidian mitochondrial code (-db_gencode=13) (Gertz et al., 2006). Mitochondrial genome sequences were then translated using the cons and getorf commands from EMBOSS (Rice et al., 2000), using the ascidian mitochondrial code.

Oikopleura longicauda 

We identified the circular contig SCLD01101138.1 (length: 10,324 nt) as a potential mitochondrial genome, and translated Cox1 from position 4530 to 6230. We also translated Cob from 3697 to 4668.

Mesochordaeus erythrocephalus 

We translated Cox1 in contig SCLF01725989.1 (length 7,034 nt) on reverse strand from position 1792 to 272. Using the same method with O. longicauda’s Cob sequence as a bait, we also recovered a Cob sequence from contig SCLF01109548.1 (length 5,010 nt), reverse strand, 1604 to 2590.

Bathochordaeus stygius 

We used the consensus of the published B. stygius Cox1 sequences KX599267.1 to KX599281.1 from GenBank (Sherlock et al., 2017), to screen the genome and scaffold SCLE01415711.1 (length 10,388 nt) gave a perfect hit. We translated Cox1 from position 8054 to 6522 on the reverse strand, and a partial Cob sequence from scaffold SCLE01415711.1 (2319 to 2963, reverse strand). We also found a second fragment aligning well with C-terminal sequences between positions 2373 and 1978, but we did not include it due to the difficulty of resolving the overlap between both fragments. When screening with the M. erythrocephalus Cox1 sequence recovered above, we found that another scaffold SCLE01416475.1 gave a perfect hit, hinting at a possible contamination.

Oikopleura dioica 

To assemble a Cox1 sequence in O. dioica, we downloaded expressed sequence tags (file 10_ESTall.txt) from Oikobase (Danks et al., 2012) and extracted hits matching the O. longicauda sequence using tblastn (see above). We then aligned and visualised the hits using Clustal Omega (Sievers et al., 2011) and SeaView (Gouy et al., 2009), filtering out those too short or introducing gap columns. Inspection of the alignment let us notice three possible haplotypes. We generated a consensus for each of them, translated them (see above) and trimmed the proteins sequences in order to match the length of the other reference sequences in the alignment. All variants found between the haplotypes were synonymous codons. We used the same methodology to generate a consensus for Cob and translate it.

Cox1 accession numbers

Bathochordaeus charon KT881544.1 ORF2 translated with ascidian code; Bathochordaeus stygius: SCLE01415711.1[8054:6522] translated with ascidian code; Branchiostoma lanceolatum:BAD93656.1; Caenorhabditis elegans: NP_006961.1; Ciona intestinalis: CAL23359.2; Clavelina oblonga: YP_009029840.1; Doliolum nationalis: BAD86512.1; Halocynthia roretzi: NP_038239.1; Mesochordaeus erythrocephalus: SCLF01725989.1[1915:260] translated with ascidian code; Mus musculus: NP_904330.1; Oikopleura dioica: consensus of Oikobase contigs (see file 10_ESTall.txt) KT0AAA24YA11, KT0AAA22YO17, KT0AAA22YO04, KT0AAA13YK14, KT0AAA18YK22, KT0AAA16YP04, KT0AAA13YE23, KT0AAA8YH10, KT0AAA4YK01, KT0AAA24YE23, KT0AAA18YO18, KT0AAA3YP19, KT0AAA10YF12; O. longicauda: SCLD01101138.1[4678:6230] translated with ascidian code; Salpa thompsoni: BBB04277.1.

Cob accession numbers

Bathochordaeus stygius: SCLE01415711.1[2963:2319] translated with ascidian code; Branchiostoma lanceolatum: BAD93666.1; Caenorhabditis elegans: NP_006958.1; Ciona intestinalis: CAL23352.2; Clavelina oblonga: YP_009029843.1; Doliolum nationalis: BAD86520.1; Halocynthia roretzi: NP_038246.1; Mesochordaeus erythrocephalus: SCLF01109548.1[1604:2590] translated with ascidian code; Mus musculus: NP_904340.1; Oikopleura dioica: consensus of Oikobase contigs KT0AAA23YJ17, KT0AAA16YJ22, KT0AAA17YO14, KT0AAA10YI15, KT0AAA18YI18, KT0AAA11YF07, KT0AAA10YG05, KT0AAA1YH02, KT0AAA12YH10, KT0AAA12YC07, KT0AAA12YC07, KT0AAA18YM15 (see file 10_ESTall.txt); O. longicauda: SCLD01101138.1[3697:4668] translated with ascidian code; Salpa thompsoni: BBB04269.1.

Sequence alignments

Translated Cox1 and Cob sequences were aligned using Clustal Omega (Sievers et al., 2011) and SeaView (Gouy et al., 2009). The alignments were post-processed using the showalign -show=n command of EMBOSS (Rice et al., 2000) to show the differences to the inferred consensus. Graphical processing of the alignments were performed with Jalview (Waterhouse et al., 2009). The codon sequences encoding Cox1 and Cob of the tunicate species were then added aligned to the corresponding amino-acid (three lines per species, see Extended data (Plessy & Pichon, 2019)) and then the text files were transposed, so that each line would correspond to a single position in the alignment, and interrogated with custom Unix commands to compute the tables presented in this manuscript.

Results

AGR encodes for Gly across all tunicates

We selected species according to sequence availability and to ensure coverage of the tunicate subphylum in a way that stays broad under the various hypotheses of monophyly or paraphyly for its major groups. For ascidians, we have included the phlebobranchian Ciona intestinalis, the aplousobranchian Clavelina oblonga and the pyurid stolidobranchian Halocynthia roretzi. For thaliaceans, we selected Doliolum nationalis and Salpa thompsoni. For appendicularians we selected Oikopleura dioica, Oikopleura longicauda, Bathochordaeus stygius and Mesochordaeus erythrocephalus. We ensured that all tunicate sequences were translated with the ascidian mitochondrial genetic code. Lastly, we included outgroup sequences from Caenorhabditis elegans and Branchiostoma lanceolatum (invertebrate mitochondrial code) and from Mus musculus (vertebrate mitochondrial code) to better highlight conserved amino acid positions. In Figure 1, we illustrate the relation between these species based on the phylogeny of Naville et al. (2019) for appendicularians and of Delsuc et al. (2018) for the other tunicates. We prepared Cox1 sequences from the selected species using mitochondrial genomes (for ascidians, thaliaceans, and outgroups), from draft genomes in which we found a putative mitochondrial contig after screening with a partial or a related Cox1 sequence (for O. longicauda, B. stygius, and M. erythrocephalus) and from EST sequences (for O. dioica). We aligned the translated Cox1 and Cob sequences (Figure 2 and Figure 3) and inspected the positions where all species use the same amino acid. Conserved glycines supported the use of AGR codons across the whole tunicate clade. We confirmed this observation with Cob sequences obtained with the same method.

019c999e-f48f-4d49-a229-715dcb0c3997_figure1.gif

Figure 1. Left: Cladogram illustrating the relations between the species selected in study.

Different branch colors indicate different mitochondrial genetic codes. Codon assignments with an equal sign indicate how the nucleotide sequences were translated. Codon assignments with a question mark indicate a possible finding, but were not used for translation. Ascidians, in which the AGR to Gly codon reassignment was initially discovered, are highlighted among the tunicates. Right: codon sequence of Cox1 genes on positions where proposed changes of genetic code would make all species use the same amino acid.

019c999e-f48f-4d49-a229-715dcb0c3997_figure2.gif

Figure 2. Sequence alignment of Cox1 proteins.

White stars indicate conserved cysteines when at least one tunicate uses an AGR codon. Black stars indicate positions suggesting a different genetic code.

019c999e-f48f-4d49-a229-715dcb0c3997_figure3.gif

Figure 3. Sequence alignment of Cob proteins.

White stars indicate conserved cysteines when at least one tunicate uses an AGR codon. Black stars indicate positions suggesting a different genetic code.

Possible lineage specific use of ATA Ile and TGA Arg codons

We then searched for positions where a single tunicate species differed from the other sequences with the same replacement amino acid more than once. We found multiple cases of methionine being replaced by isoleucine and arginine replaced by tryptophan in O. longicauda and B. stygius (Figure 2). Given their phylogenetic proximity, we grouped the two species in the analysis below and we calculated the number of mismatches to the other sequences. We redefined a position as “conserved” if there is at most one mismatch from one sequence to the others.

M. erythrocephalus does not seem to use ATA codons and O. longicauda and B. stygius use ATA codons at positions where all other species had an isoleucine (Ile) (Table 1 and Table 2). In the ancestral invertebrate mitochondrial code and the sister vertebrate code, ATA encodes Met. Although Met and Ile both have hydrophobic side chains that often can substitute for each other, this also suggests a change of the genetic code. Evidence for this is that 1) non-appendicularian species do not display ATA codons at positions where all other species encode Ile; 2) the change would be parsimonious as O. longicauda, B. stygius and M. erythrocephalus are more closely related to each other than to O. dioica (Naville et al., 2019); and 3) these three species never have ATA codons at positions where Met is conserved in every species (in contrast to O. dioica). Furthermore, reversion of the ATA codon to Ile have occurred in other branches of the tree of Life, for instance in echinoderms (Jacobs et al., 1988). Finally, inspection of a partial Cox1 sequence of the related Bathochordaeus charon (KT881544.1) provided one extra instance of an ATA codon at a conserved Ile position.

Table 1. ATN codons in Cox1 in Oikopleuridae.

O. dio.O. lon.B. sty.M. ery.
ATA number2212160
ATC number3802
ATG number14342533
ATT number29182138
ATA on cons. Met5000
ATA on cons. Ile0420

Table 2. ATN codons in Cob in Oikopleuridae.

O. dio.O. lon.B. sty.M. ery.
ATA number6990
ATC number5212
ATG number99816
ATT number21111122
ATA on cons. Met0000
ATA on cons. Ile0110

The TGA codon is known to encode tryptophan (Trp) in vertebrate, invertebrate and ascidian mitochondria (Fox, 1979). We found that B. stygius uses TGA at positions where all other species would encode Arg (Table 3 and Table 4). This is surprising as these two amino acids are unlikely to functionally substitute for each other. O. longicauda does not use TGA codons, and M. erythrocephalus does not use TGA at conserved Arg, although it is found at a position where all other species encode for Arg except C. elegans which encodes lysine, the other positively charged amino-acid. This again suggests a possible change of genetic code, although the numbers are currently too small to draw a solid conclusion.

Table 3. TGR codons in Cox1 in Oikopleuridae.

O. dio.O. lon.B. sty.M. ery.
TGA number2031
TGG number13161916
TGA on cons. Trp1000
TGA on cons. Arg0020

Table 4. TGR codons in Cob in Oikopleuridae.

O. dio.O. lon.B. sty.M. ery.
TGA number3021
TGG number4745
TGA on cons. Trp1000
TGA on cons. Arg0010

Discussion

We extracted Cox1 and Cob sequences of four different appendicularians from public databases. As a nucleotide sequence, Cox1 might be useful for mining databases of molecular barcodes sequenced from the environment, or for studies of population diversity within a species. As a protein sequence, Cox1 might be useful for refining the phylogeny of appendicularians. However, a translation code needs to be chosen.

Our alignments of tunicate Cox1 and Cob protein sequences support the view that all tunicates translate AGR codons as Gly (although this conclusion might be limited by the lack of coverage for the Kowalevskiidae and Fritillariidae families). While our analysis suggests that the last common ancestor of the tunicates used the “ascidian” code, it is not possible to conclude that all contemporary tunicates still do, as we found discrepancies on other conserved residues that could be explained by a genetic code change of ATA and TGA codons within a sub-clade of the appendicularians containing M. erythrocephalus, O. longicauda and B. stygius.

The “ascidian” genetic code is table number 13 in the NCBI protein database, where it is used to translate sequences from ascidians and non-ascidian tunicates, for instance D. nationalis. However for appendicularians, the NCBI currently applies the invertebrate table (number 5). This has the consequences of turning Gly to Ser at functionally important positions. Therefore, the ascidian is probably a more appropriate default. At present, it is unclear whether some appendicularians have additional changes; however, the accurate translation of AGR codons to Gly would nonetheless reduce the amount of error in translated protein sequences.

To confirm a change of genetic code, it is necessary to detect corresponding changes in the respective tRNAs. This beyond reach for the present study because the mitochondrial genomic sequences that we used are extracted from draft genome sequences that may be incomplete, or even contain contaminations (see B. stygius in the Methods section). As a result, we also cannot entirely rule out the possibility that we have examined pseudogenes, although the high conservation found in the alignments suggest this in unlikely. For all these reasons, it is necessary to sequence full-length mitochondrial genomes from appendicularians.

Conclusions

Our alignments of translated mitochondrial sequences suggest that the last common ancestor of living tunicates may have already used the “ascidian” genetic code. Thus, we recommend the use of that code instead of the “invertebrate” one for all tunicates in automatic translation pipelines, with the caveat that additional changes might be found in appendicularians. This observation is a reminder that in biology, exception is the rule, and that each time a mitochondrial sequence is extracted from a species for the first time, it is important to carefully examine its genetic code.

Data availability

Underlying data

All data underlying the results are available as part of the article and no additional source data are required.

Extended data

Zenodo: Aligned Cox1 and Cob sequences from Oikopleura dioica and other tunicates. https://doi.org/10.5281/zenodo.3490310 (Plessy & Pichon, 2019).

This project contains alignment files and descriptions of how the files were generated.

Extended data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 10 Dec 2019
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Pichon J, Luscombe NM and Plessy C. Widespread use of the “ascidian” mitochondrial genetic code in tunicates [version 2; peer review: 2 approved]. F1000Research 2020, 8:2072 (https://doi.org/10.12688/f1000research.21551.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 1
VERSION 1
PUBLISHED 10 Dec 2019
Views
45
Cite
Reviewer Report 26 Feb 2020
Yuanning Li, Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA 
Approved
VIEWS 45
Pichon et al. conducted a study by comparing two mt genes across tunicates and recovered conserved tunicate-specific mt codon usage and potential Oikopleuridae-codon. The writing of the manuscript is clear and easy to follow. The methods are also valid and ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Li Y. Reviewer Report For: Widespread use of the “ascidian” mitochondrial genetic code in tunicates [version 2; peer review: 2 approved]. F1000Research 2020, 8:2072 (https://doi.org/10.5256/f1000research.23747.r59671)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 14 Apr 2020
    Charles Plessy, Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Onna-son, 904-0495, Japan
    14 Apr 2020
    Author Response
    We thank the reviewer for their 2 suggestions.

    1) “To make sure readers understand there are many mt codon changes within deuterostomes”, we are adding the following text and ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 14 Apr 2020
    Charles Plessy, Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Onna-son, 904-0495, Japan
    14 Apr 2020
    Author Response
    We thank the reviewer for their 2 suggestions.

    1) “To make sure readers understand there are many mt codon changes within deuterostomes”, we are adding the following text and ... Continue reading
Views
27
Cite
Reviewer Report 13 Jan 2020
Patrick Lemaire, Montpellier Cell Biology Research Center (CRBM), CNRS, University of Montpellier, Montpellier, France 
Approved
VIEWS 27
In this short article Pichon and colleagues use publicly available molecular datasets, mostly partial mitochondrial genomes, to re-explore the mitochondrial genetic code across tunicates, based on the analysis of the Cox1 and Cob genes.

Their data suggest ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Lemaire P. Reviewer Report For: Widespread use of the “ascidian” mitochondrial genetic code in tunicates [version 2; peer review: 2 approved]. F1000Research 2020, 8:2072 (https://doi.org/10.5256/f1000research.23747.r57710)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 10 Dec 2019
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.