ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article

Identification and Functional Characterization of microRNAs in Ichthyosporea

[version 1; peer review: awaiting peer review]
PUBLISHED 24 Mar 2025
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS AWAITING PEER REVIEW

This article is included in the Genomics and Genetics gateway.

Abstract

Background

microRNAs (miRNAs) are key post-transcriptional regulators of gene expression in animals, yet their origins and early functions remain elusive. In this study, we explore the diversity and functionality of miRNAs in Ichthyosporea, a group of unicellular organisms that represent some of the closest relatives of animals.

Methods

We have performed miRNA identification by integrating deep small RNA sequencing with draft genome assemblies across six species of Ichthyosporea, including Abeoforma whisleri, Pirum gemmata, and Sphaeroforma sp. To assess the functional roles of these miRNAs, we have used computational target prediction and degradome sequencing.

Results

We identified a rich and varied repertoire of miRNAs across Ichthyosporea. Many of these miRNAs appear to be lineage-specific, while a subset is conserved among closely related species, suggesting both rapid evolutionary turnover and the persistence of ancient regulatory elements. Moreover, predicted target genes span a wide range of transcripts, implicating miRNAs in diverse cellular processes. Additionally, degradome sequencing reveal that in S. arctica, several miRNAs are likely to induce mRNA cleavage.

Conclusions

These findings not only extend previous reports of miRNA presence in Ichthyosporea but also underscore the deep evolutionary roots of miRNA-mediated gene regulation in the holozoan lineage. The miRNA induced target cleavage in S. arctica is also observed in cnidarians and supports the hypothesis that mRNA cleavage may represent the ancestral function of animal miRNAs.

Keywords

Ichthyosporea, small RNA, microRNA, miRNA, origin of animals, evolution, gene regulation

Introduction

microRNAs (miRNAs) are short RNA molecules, 21-26 nucleotides, which post-transcriptionally regulate other genes.1,2 They are generated from genome-encoded single-stranded hairpins and are involved in a variety of physiological processes such as embryonic development, cell proliferation, immune response, and stress response.2

miRNAs have been identified across a range of eukaryote lineages, including plants, animals, green and brown algae, slime molds, and excavates.24 Among animals, miRNAs have been extensively studied in bilaterians, but also identified in cnidarians and sponges,5,6 two phyla whose ancestors evolved before the emergence of Bilateria. More recently, miRNAs have also been discovered in unicellular relatives of animals, the ichthyosporeans, but only in a single genus, Sphaeroforma.7

Despite these discoveries, the diversity and functional roles of miRNAs in ichthyosporeans, sponges, and cnidarians remain poorly understood. It is unclear how miRNAs emerged and evolved within Holozoa (animals and their closest unicellular relatives) and how their biogenesis and functions diversified across these lineages. And while bilaterian animals exhibit highly complex miRNA regulatory networks, little is known about how miRNAs contribute to gene regulation in earlier-branching holozoans.

In animals, miRNAs are generated from hairpin structures of RNA (pri-miRNAs) which are transcribed from the genome by RNA polymerase II.2 The pri-miRNA hairpin is cleaved inside the nucleus by the Microprocessor complex which consists of the proteins Drosha and Pasha (called DGCR8 in vertebrates). The resulting ~60 nucleotide precursor miRNA (pre-miRNA), which has an overhang of two nucleotides from the processing of the two RNase III domains in Drosha, is then exported to the cytoplasm. Here, the pre-miRNA structure is further cleaved by another double-RNase III protein called Dicer near the loop region of the pre-miRNA. This results in a double-stranded short RNA molecule with the characteristic 3′ two-nucleotide overhangs on both ends. One of the strands of this so-called miRNA duplex will be bound by the protein Argonaute (AGO). This strand is known as the mature miRNA (often denoted miRNA) and the other strand, which is released, is known as the star miRNA (often denoted miRNA*). These structural features, shaped by the sequential enzymatic actions, provide critical markers for identifying miRNAs in genomic and transcriptomic datasets.

The initial clues about the function of miRNAs in animals came from the observation of several conserved sites of lin-4 in C. elegans8 which were complementary to the 3′UTRs that had been proven to be essential for the repression of lin-14.9 We now know that miRNAs regulate gene expression by targeting specific gene transcripts in both plants and animals, however, they achieve it differently. While in plants, the miRNAs bind with a perfect, or near perfect complementarity to mRNAs (usually in the exons), animal miRNAs usually target mRNAs by a short complementary “seed” sequence (and most often target the 3′-UTRs).4,10 And in plants, as opposed to animals, this targeting mode generally results in the cleavage of the mRNA target.2 The cleavage of the mRNA usually occurs between nucleotide 10 and 11 of the miRNA and is performed by the PIWI domain in AGO.11

Until recently, the cleavage of mRNAs through AGO was believed to be limited to plants. However, a study on the cnidarian Nematostella vectensis found that many miRNAs actually target mRNAs via nearly perfect complementarity, and that these mRNAs are cleaved at position 10 in the target site, resembling the targeting mode in plants.12 It has been proposed that the “seed-based" targeting was a bilaterian innovation, which widened the binding capacity of miRNAs and the regulatory networks and contributed to the complexity of Bilateria.13 The finding of Moran et al.12 could therefore support cleavage as the ancestral function of animal miRNAs. However, the targeting modes of miRNAs in other basal holozoans remain entirely unknown, hence also whether cleavage-based targeting is a shared ancestral trait of all Holozoa or if it evolved independently in certain lineages.

The main objective of this study was to explore the presence, diversity, and functional roles of miRNAs in Ichthyosporea. Although miRNAs have previously been identified in the ichthyosporean genus Sphaeroforma, it remains unclear whether this genus is unique or if miRNAs are more widely distributed across other ichthyosporean lineages. To address this, we analyzed transcriptomic datasets from Abeoforma whisleri and Pirum gemmata, alongside new data from several additional species of Sphaeroforma. Additionally, we sought to determine whether miRNAs in S. arctica target and cleave mRNAs in a manner similar to the plant-like mechanism observed in the cnidarian Nematostella vectensis.

Methods

Culture conditions

We studied six species of Ichthyosporea: Abeoforma whisleri (Awh), Pirum gemmata (Pge), Sphaeroforma arctica (Sar), S. sirkka (Ssi), S. tapetis (Sta), and S. gastrica (Sga). All species were cultured in tissue culture TC flasks T25 (Sarstedt, NRW, Germany) with 15 ml sterile Marine Broth (cat. no: 279110, Difco BD, NJ, US; 37.4 g/L) at 12°C with no light. Cultures were sustained monthly by transferring 1 ml of old cultures to 14 ml of Marine Broth.

Small RNA isolation, sequencing and processing of reads

Total RNA was isolated from the different species by first lysing the cells on a FastPrep system (MP Biomedicals, Santa Ana, CA, USA) followed by smallRNA purification using the mirPremiere RNA kit (cat. no: SNC50, Sigma-Aldrich, St. Louis, MO, USA). All cultures were saturated and contained a mix of different cell stages. For S. arctica we also included data from a time series experiment (in two biological replicates) where both small RNAs and mRNAs had been Illumina (Illumina, San Diego, CA, USA) sequenced at 12 time points starting from inoculation of an old culture and until 72 hours.14

The smallRNA libraries were prepared for sequencing with the NEBNext Small RNA Library Prep kit (cat. no: E7330L, New England Biolabs, Ipswich, MA, USA) and sequenced on the Illumina NextSeq 500 platform with 75 bp single end reads. Library preparation and sequencing services were provided by the Norwegian Sequencing Center (NSC).

The resulting Illumina reads were first trimmed for sequencing adaptors and low-quality bases using Trimmomatic v0.3815 (parameters: ILLUMINACLIP:TruSeq3-SE.fa:2:25:7 SLIDINGWINDOW:4:28 LEADING:28 TRAILING:28 MINLEN:18) by removing bases with a phred score less than 28 from both 5′- and 3′ ends. In addition, a sliding window of 4 nucleotides was used, trimming sequences from the 5′ end if the average phred score dropped below 28. Reads shorter than 18 bp were filtered out. Next, we used PrinSeq-lite v.0.20.416 to filter reads containing N′s and reads longer than 26bp (parameters: -max_len 26 -ns_max_n 0).

DNA isolation, sequencing and genome assembly

DNA was isolated from Sta, Sga, Awh and Pga using the DNeasy Blood & Tissue Kit (Qiagen, MD, USA) after lysing the cells using FastPrep as described above. All cultures were saturated and contained a mixture of cell stages.

For Sta and Sga, DNA was shipped to NSC for library preparation and sequencing. 150 bp paired-end (PE) libraries were generated and sequenced on one lane on the Illumina HiSeq4000. The resulting reads were processed into high-quality reads as described for the RNA data above. DNA from Awh and Pge was sequenced on the PacBio RS II instrument (Pacific Biosciences, CA, US). Libraries were prepared using the Pacific Biosciences 20 kb library preparation protocol, and size selection of the final libraries was performed using a BluePippin (Sage Science, MA, US) with a 7 kb cut-off. The libraries were sequenced on 6 SMRT cells. For Awh and Pge we also downloaded publicly available Illumina data from NCBI (BioProject PRJNA360047). The PacBio reads were used as they were delivered from the sequencing facility, taking advantage of the error correction through the use of Illumina reads in the hybrid assembly described below.

For all species, draft genome assemblies were created using Spades v.3.13.17 For Awh and Pge we used hybrid assembly with both Illumina and PacBio reads. As the assemblies were only used to identify precursor miRNAs, we did not perform scaffolding or other optimizations aimed at generation as long, or complete, genomic fragments as possible.

For Sar and Ssi we downloaded the already published genome assemblies (NCBI accessions GCA_008580545.1 and GCA_001586965.1).

mRNA sequencing and assembly

We sequenced the expressed transcriptomes of Awh, Pge, Ssi, Sta and Sga by isolating total RNA using the RNeasy Mini Kit (Qiagen, MD, USA) after cell lysis using FastPrep as described above. All cultures were saturated and contained a mixture of cell stages. The purified total RNA was sent to the NSC for library preparation and sequencing. RNA from Awh, Pge, Sta and Sga was prepared into 150 bp PE libraries and sequenced together on one lane on the Illumina HiSeq X platform. RNA from Ssi was prepared into a 300 bp PE library and sequenced on an Illumina MiSeq.

The sequence reads were processed into high-quality reads using Trimmomatic as described above and assembled into contigs using Trinity v2.8.2.18 Protein coding sequences were predicted using TransDecoder v5.5.0.19

For Sar we downloaded the transcriptome assembly and annotation accompanying the genome assembly on NCBI (accession. GCA_008580545.1).

microRNA identification

After trimming, the smallRNA reads were processed by the script process-reads-fasta.py which comes with the program miR-PREFeR.20 This script collapses identical reads into a single sequence with a number in the fasta header that indicates the number of identical reads. This was done to reduce the data size and speed up the analysis.

miRNAs were identified using a two-step process. First, miR-PREFeR was run to identify potential primary miRNA (pri-miRNA) sequences in the different genome assemblies. Then, the tentative pri-miRNAs were manually curated by inspecting the mapping of small RNA reads against the pri-miRNA sequences and comparing it against a set of criteria (described below). The curated sets of miRNAs were divided into three categories: high-, mid-, and low-confidence, based on the number of criteria fulfilled.

pri-miRNA identification by miR-PREFeR

To identify potential pri-miRNA loci in the genomes, miR-PREFeR was run with the processed small RNA sequences and a genome assembly as inputs. The program was run with the following parameters: PRECURSOR_LEN = 500, READS_DEPTH_CUTOFF = 2, MAX_GAP = 100, MIN_MATURE_LEN = 18, MAX_MATURE_LEN = 26, ALLOW_NO_STAR_EXPRESSION = N, ALLOW_3NT_OVERHANG = N. This means that the size of pri-miRNAs cannot be longer than 500nt, that at least 2 small RNA reads need to map to the mature region of the pri-miRNA and at least 1 read needs to map to the star region, that the length of the mature region should be between 18nt and 26nt, and that 3nt overhang between mature and star sequences is not allowed.

Curation of miRNA candidates

To further validate the miRNA identification by miR-PREFeR, and reduce the number of false positives, we mapped the processed smallRNAs to the pri-miRNA sequences identified by miR-PREFeR using Bowtie v1.1.221 with the parameters: -p 16 -v 0 -k 20 –best --strata --norc. This means that no mismatch is allowed, that up to 20 valid alignments are reported, that strand bias is eliminated and that no reads are mapped to the reverse complement. The read mapping was then visualized in Geneious v11.1.4 (www.geneious.com) and each miRNA candidate was inspected manually and scored according to the criteria determined by Kozomora and Griffith-Jones22 Fromm et al.,1 but slightly modified according to criteria updated by Axtell and Meyers23 (some of these criteria are also already fulfilled by the miR-PREFeR parameters) (see Table 1 and Figure 1).

Table 1. miRNA selection criteria.

Criteria numberShort nameExplanation
I)ExpressionAt least two reads mapping to the mature region (mature expression) and one read mapping to the star region (star expression).
II)5’ mature homogeneityAt least 50% of the reads mapping to the mature region must start at the first nucleotide.
III)OverhangThere must be a 2-nucleotide overhang on both ends of the mature and star duplex.
IV)5’ star homogeneityAt least 50% of the reads mapping to the star region must start at the first nucleotide.
V)ComplementarityAt least 16 complementary nucleotides in the mature and star duplex.
VI)Loop sizeAt least 8 nucleotides separating the mature and star region.
VII)Loop readsLess than 50% of the reads mapping to the pri-miRNA can map to the loop.
e16e098c-2dd4-419e-9328-cbdec89b7fc6_figure1.gif

Figure 1. miRNA identification.

Legends: Schematic structure of a pri-miRNA hairpin as identified by miR-PREFeR and accepted as a valid pri-miRNA in this study. Red color indicates the mature region, blue color indicates the star region. The two bulges in the mature and star regions indicate non-complementary nucleotides. Green color indicates indicates the loop, and black color indicates the flanking regions. Thin black lines indicate smallRNA reads mapping to the pri-miRNA sequence. The dashed lines indicate the 5’ starting point of the mature and star regions. The roman letters in parentheses correspond to the different miRNA identification criteria in Table 1.

The miRNA candidates predicted by miR-PREFeR were divided into the following categories depending on how many of the above criteria they fulfilled (in addition to the settings already used by miR-PREFeR):

High-confidence miRNAs: miRNA candidates that fulfilled all seven criteria in Table 1. As an example, miRNAs in this category are named Sar-high-miR-1.

Mid-confidence miRNAs: miRNA candidates that fulfill six of the criteria. As an example, miRNAs in this category are named Sar-mid-miR-1.

Low-confidence miRNAs: miRNA candidates that fulfill five of the criteria. As an example, miRNAs in this category are named Sar-low-miR-1.

Discarded miRNAs: miRNA candidates that fulfill less than five of the criteria.

Some miRNAs were reported to arise from both strands of the same genomic locus, and with the same small RNA reads mapping to the mature and star regions. These cases could be due to artifacts in the genome assembly or from true inverted repeats. Nevertheless, we only counted these miRNAs as a single miRNA. And in cases where two pri-miRNA sequences were identical, but originated from different genomic loci, they were counted as two miRNAs.

Removal of non-coding RNAs

After curation, the miRNA precursor sequences were analyzed with the cmscan function of infernal v1.124 (parameters --cut_ga, --rfam, --nohmmonly, ---tblout) against the Rfam database25 to identify other types of non-coding RNAs that are not miRNAs. Precursors with significant matches were discarded from the final set of miRNA sequences.

Identification of evolutionary conserved miRNAs

We used the pri-miRNA sequences of the identified miRNAs in a pairwise blast against the genomes of the different ichthyosporeans. We used Blastn implemented in Geneious with the parameters: scoring (match mismatch) 2-3, maximum hits 10, max E-value 1e-10, Gap cost (open extend) 5 2, word size 7.

Genomic loci which covered the entire pre-miRNA (when a pri-miRNA had several matches in a genome, the match with the lowest E-value was selected) were selected to further investigate whether they were also transcribed and processed into a proper miRNA. To do this, the small RNA reads were mapped to the conserved genomic regions and evaluated using the criteria described above. Since these genomic loci are evolutionary conserved, and the query sequence already fulfill the criteria for a miRNA, we only required that these conserved genomic loci only fulfilled criteria I in Table 1 (i.e. that at least two reads map to the mature region and one read to the star) to designate it as a conserved miRNA.

Functional roles of ichthyosporean miRNAs

miRNA target prediction

Tapir v1.226 was used to identify potential miRNA-targeted genes with near-perfect complementarity. Tapir was run using the mature miRNA sequences, as well as star sequences in cases where they were expressed equally high as the matures. In this case they were annotated as co-mature. The mature and co-mature sequences were searched against the assembled transcriptomes of the different ichthyosporean species. Tapir was run with the precise search option which selects potential targets according to a complementarity score cutoff (default 4) and an estimated minimum free energy (mfe) ratio of the miRNA:mRNA duplexes (default 0.7, which indicates that at least 70% of a miRNA:mRNA duplex should have a perfect complementarity to pass the threshold).

To annotate the potential target genes, protein sequences (for protein coding genes) were blasted against the RefSeq protein database using blastp with default settings, while the nucleotide sequences of non-coding genes were blasted against RefSeq RNA database with default settings.

To identify genes potentially targeted by miRNAs similar to that in animals, TargetScan v.7.027 was used with the “seed region” (nucleotides 2-8) of the mature and co-mature sequences as input and run against the 3′-UTRs of the Sar transcriptome genome with default parameters. There are four types of targeting sites identified by TargetScan 8mer-A1 (full match across the entire seed region and with an A nucleotide opposite of position 1 of the 3′ UTR), 7mer-m8 (match from position 2-8 of the seed), 7mer-A1 (match from position 2-7 in the seed and and A opposite position 1), and 6mer (match from position 2-7 in the seed), with decreased decreasing stringency and estimated effect on the targets.

Target cleavage analysis

Total RNA isolation

Degradome-seq was only performed on S. arctica. 1 ml of saturated Sar culture (containing mostly senescent cells) was transferred into 14 ml fresh Marine Broth. Total RNA was extracted 30h, 48h and 54h after inoculation (two biological replicates for each time points) using the RNeasy Mini Kit (Qiagen, Hilden, Germany), followed by DNase I (Invitrogen, CA, US) and RNA cleanup (Zymo research, CA, US) treatments according to provided protocols. The quality of the total RNA was evaluated on the Bioanalyzer 2100 (Agilent Technologies, CA, US).

Degradome library preparation and sequencing

The isolated total RNA was sent to LC sciences (Houston, TX, US) for preparation of the degradome libraries and Illumina sequencing. Briefly, the libraries were created by first capturing polyadenylated RNAs, then a sequencing adapter was ligated to the 5′-end of mRNAs. This adapter will only ligate to a 5′-monophosphate (i.e. uncapped 5′ ends). Then cDNA was generated using random primers and only those fragments carrying the 5′ sequencing adapter were sequenced. The cDNA fragments were sequenced from the 5′ sequencing adapter as 50 bp single-end reads on an llumina HiSeq 2500 instrument. The resulting reads were processed as described for the other small RNA reads above.

Identification of cleaved target genes

CleaveLand v4.528 was run with a p-value cutoff of 0.05 to report a significant miRNA-directed mRNA cleavage. CleaveLand first identifies potential binding sites for the mature miRNAs in the transcriptome, and then it aligns the degradome reads to the transcriptome and counts the number of reads starting at each nucleotide (degradome density). The peaks of reads starting at each position are divided into four categories, 0-4, where category 0 has the highest support for miRNA-directed cleavage. These peaks are then compared to the mature miRNA binding sites. If there is a peak exactly at the predicted cleavage site (between position 10 and 11 in the mature miRNA), a p-value is calculated, which takes both noise (the chances of observing the degradome peak of the given category at random) and the quality of the predicted target site (alignment score) into account. CleaveLand creates the degradome density by mapping with Bowtie (parameters: -f -v 1 --best -k 1 --norc -S). A program called GSTAr (distributed with CleaveLand) identifies the target sites by aligning the mature sequences. As input to CleaveLand both the mature (and co-mature) sequences of the miRNAs from all three categories of miRNAs from Sar and the Sar transcripts (including UTRs) were used.

Results

miRNA identification

The number of reads in the small RNA libraries from the different ichthyosporean species is described in Table 2. For miRNA identification we combined all the libraries from each species, hence the numbers in Table 2 represent combined libraries per species (for Sar this includes the time series datasets). Overall, removing low quality base calls and reads, as well as retaining only reads between 18-26 nucleotides removed on average close to 60% of the sequences (the bulk of the reads removed was due to the length filtering).

Table 2. small RNA sequencing data.

SpeciesNr. raw readsNr. processed reads % reduction
Awh31 973 48717 464 48845.4
Pge32 891 32812 430 12362.2
Sar638 699 149192 089 60769.9
Sga33 115 36815 158 13054.2
Ssi6 464 0072 499 19961.3
Sta31 123 32411 891 72261.8

For Awh, Pge and Sar there was a clear enrichment of reads at 20 nt ( Figure 2). For the other species, the read lengths 18-26 nt was more evenly distributed, although 20 nt was the most common length except in Sga.

e16e098c-2dd4-419e-9328-cbdec89b7fc6_figure2.gif

Figure 2. Length distribution of small RNA reads.

The fraction of reads with nucleotide lengths between 18-26 nt after processing.

Manual curation of the initial predictions made by miR-PREFEeR identified a total of 180, 74, 105, 13, 16 and 19 miRNAs in Awh, Pge,Sar, Ssi, Sta and Sga respectively ( Table 3). These miRNAs were further classified into three categories dependent on how confident we were in the predictions after mapping the small RNA reads to the precursor sequences (see Methods section and Figure 3 for examples of miRNAs from the three categories).

Table 3. Identified miRNAs.

SpeciesHighMidLow Discarded
Awh40 (70.0)85 (76.5)55 (76.4)119 (65.5)
Pge23 (26.1)31 (38.7)20 (45.0)32 (68.8)
Sar28 (78.6)34 (67.3)43 (83.7)316 (74.7)
Sga9 (55.6)4 (50.0)6 (33.3)12 (33.3)
Ssi4 (75.0)6 (0.0)3 (0.0)6 (66.7)
Sta7 (42.9)5 (40.0)4 (50.0)12 (75.0)
e16e098c-2dd4-419e-9328-cbdec89b7fc6_figure3.gif

Figure 3. Examples of the identified ichthyosporean miRNAs.

Legends: Examples of high-confidence (A), mid-confidence (B), and low-confidence (C) miRNAs from the different ichthyosporean species. The y-axis shows mapped reads per million processed reads (RPM), and the nucleotide position on the pri-miRNA sequence is shown on the x-axis. The secondary structures of the pri-miRNAs as predicted by mfold37 are shown next to the mapping. Reads mapping to the mature region are colored red and reads mapping to star region are colored blue. The same colors are used on the secondary structure.

Awh and Sar showed a clear A/U preference at the first nucleotide of miRNA mature strands ( Table 3). For the other Sphaeroforma species we detected very few miRNAs, so we cannot determine whether there was an A/U preference or not. For Pge no such A/U signal was detected.

Target prediction

Most of the identified miRNAs in all the ichthyosporean species were predicted to have at least one target according to Tapir ( Table 4). However, when considering only scores of 2.5 or lower (where a Tapir score of 2.5 may indicate complementarity along the entire miRNA with one seed mismatch, and a score of 0 represents a perfect complementarity), approximately 50% or fewer of the miRNAs had a potential target. It should be noted that these results are highly dependent on the number and quality of the transcriptomes used as potential targets. For Pge, Sga, Ssi and Sga the transcriptome assemblies were probably highly fragmented and redundant, reflected in a high number of transcripts for these species and therefore a low fraction of potentially targeted transcripts. Among Sphaeroforma, Sar had a lot more predicted targets even though fewer mRNA transcripts were used as input. However, this is most likely because of the large difference in the number of predicted miRNAs from these species.

Table 4. Tapir miRNA target prediction.

SpeciesInput transcriptsmiRNAsTotal targetsUnique targetsScore ≤ 2.5UTRExon Non-coding
Awh10448191 (202)35412121 (20.3%)0.2155515201466
Pge9740178 (83)174487055 (7.2%)0.483388232411736
Sar34753101 (117)36072004 (5.8%)0.2520051269333
Sga6133513 (15)13372 (0.12%)0.13316636
Ssi6667919 (20)199160 (0.24%)0.123111355
Sta4424216 (21)11670 (0.15%)0.56125351

The TargetScan analysis was only run on Sar since full genome annotation of the 3′UTRs was only available for this species. This analysis identified more than 100,000 potential seed binding sites within the 3’-UTRs ( Table 5). Some miRNAs were predicted to target many different locations of the same gene. This was especially the case for Sar-high-miR-15 which had 937 potential target sites in a total of 300 different genes. Among those genes, Sarc4_g19159T, Sarc4_g11565T, Sarc_g22056T, and Sarc4_23261T contained binding sites for Sar-high-miR-15 at more than 20 different locations. Therefore, this analysis should only be used as an indication that there are potentially a huge number of binding sites in the 3’ UTR regions of the different transcripts, but experimental evidence is needed to confirm actual binding and transcriptional regulation.

Table 5. TargetScan miRNA target prediction.

miRNAsTotal targetsUnique targets miRNA (8mer-1a) 8mer-1a 7mer-m8 7mer-1a 6mer
117 (117)100 4886 750117 (117)3 5244 8135 1886 079

Evolutionary conserved miRNAs in Ichthyosporea

Conservation across species and over long time scales can be an indication of functionally important miRNAs, and in animals conserved miRNAs are generally functionally important.2 Therefore, we investigated whether any of the identified miRNAs were conserved between the investigated ichthyosporeans. There were several pri-miRNA sequences that had homologous loci in the genomes of two or more ichthyosporeans ( Table 6). Most of these however were only processed into proper miRNAs in one of the species.

Table 6. Conserved pri-miRNAs within Ichthyosporea.

AwhPgeSarSgaSsi Sta
Awh pri-miRNAs 1/0/01/0/00/0/00/0/00/0/0
Pge pri-miRNAs 2/0/01/0/00/0/00/0/00/0/0
Sar pri-miRNAs 6/0/01/0/068/28/669/9/116/0/0
Sga pri-miRNAs 0/0/00/0/014/7/515/6/32/0/0
Ssi pri-miRNAs 0/0/00/0/09/8/212/7/12/1/0
Sta pri-miRNAs 0/0/00/0/00/0/02/0/00/0/0

However, 13 different miRNAs had conserved pri-miRNA sequences that were also expressed and processed into proper miRNAs in two or more species ( Table 7). Based on sequence similarity between the pri-miRNAs, these 13 miRNAs seem to belong to four different evolutionary conserved loci ( Table 7 and Figure 4).

Table 7. Conserved miRNAs within Ichthyosporea.

Conserved group Conserved miRNAs
1Sar-mid-miR-30, Sar-low-miR-35, Sga-low-miR-5
2Sar-mid-miR-16, Sga-high-miR-1
3Sar-high-miR-8, Sar-high-miR-6, Ssi-high-miR-1, Sga-high-miR-2
4Sar-high-miR-21, Ssi-high-miR-3, Sga-high-miR-3, Sga-mid-miR-3
e16e098c-2dd4-419e-9328-cbdec89b7fc6_figure4.gif

Figure 4. Four groups of conserved miRNAs.

Legends: Pairwise MUSCLE alignment of the pri-miRNA sequences of the conserved pri-miRNA loci that were expressed and processed into proper miRNAs in two or more species. The red and blue rectangles indicates the mature and star regions. Colored nucleotides in the alignment indicate sites with sequence variation. The sequences in group 1 are wrapped and the dots indicate the break point. Group numbers correspond to the groups in Table 7.

Identification of miRNA-induced mRNA cleavage in Sar

In plants, mRNA transcripts targeted by miRNAs are silenced by cleavage between position 10 and 11 of the mature miRNA binding site.12,29 These mRNAs will therefore lack the 5′-cap and can be identified by a so-called degradome sequencing (degradome-seq). Degradome-seq targets mRNAs which lacks the 5′-cap and is a modified version of 5′-Rapid Amplification of cDNA Ends (RACE) and similar to a 5′-uncapped polyadenylated mRNA sequencing (also referred to as parallel analysis of RNA ends (PARE).

We performed duplicate isolations of total RNA from three different timepoints and prepared degradome-seq libraries. From the sequencing of cleaved mRNAs we received between 20-30 million reads after quality processing ( Table 8). Overall, the number of degradome reads was between 25 ~ 35 million before processing, while after the number of reads decreased to between 20 ~ 30 million.

Table 8. Degradome reads processing.

LibraryRaw reads Processed reads
30h_136 343 89130 046 849
30h_234 362 83429 194 149
48h_128 891 63924 013 478
48h_236 088 17731 236 800
54h_123 173 81419 547 675
54h_232 508 07628 123 224

CleaveLand identified 18 different mRNA transcripts that were potentially cleaved through miRNA-directed cleavage by 11 different miRNAs ( Table 9). 10 transcripts were cleaved only at a specific time point, 4 were cleaved at two time points, while 4 were cleaved at all three time points. Only two of the miRNAs did not start with an A or U. 15 out of 18 of the cleaved targets were targeted in exons, while one cleaved by Sar-mid-miR-32 co-mature strand was targeted at 5′UTR, and the rest of the targets were non-coding RNAs. Only six of the 18 transcripts were also identified as targets by Tapir. Two of the highly conserved miRNAs (Sar-high-miR-6 (Sar-high-miR-8 has identical mature sequences so these are functionally equivalent) and Sar-mid-miR-16), and two miRNAs with conserved pri-miRNAs only (Sar-mid-miR-3 and Sar-high-miR-2) had induced target cleavage. Furthermore, all the cleaved targets with category 0 were targeted by the same miRNA (Sar-mid-miR-3) which had conserved pri-miRNA sequence ( Figure 5).

Table 9. Cleaved targets identified by degradome sequencing.

miRNAConserved 1st ntCategoryTargetTapirmRNA featureTimepoint RefSeq hit
Sar-mid-miR-3pri-miRNA A0Sarc4_g15395TYesExon30h, 48hNo hit
Sar-mid-miR-32 (co-mature)NoA3Sarc4_g2876TYes5'UTR30h, 48hTransmembrane protein 35A-like (6e-06)
Sar-high-miR-2pri-miRNA A3Sarc4_g26280TNoExon30h, 54hNo hit
Sar-high-miR-6(8) amiRNAU1Sarc4_g31131TNoExon48h, 54hAutotransporter domain-containing protein (3e-04)
Sar-mid-miR-3pri-miRNA A0Sarc4_g19705YesExon30hNo hit
Sar-low-miR-38NoG3Sarc4_g24093TNoExon30hH/ACA ribonucleoprotein complex subunit 1 (5e-47)
Sar-low-miR-41NoU3Sarc4_g3869TNoExon30hTranslocation protein SEC62-like (7e-28)
Sar-mid-miR-3pri-miRNA A0Sarc4_g5854TYesExon30hNo hit
Sar-mid-miR-3pri-miRNA A0Sarc4_g19698TYesExon48hNo hit
Sar-mid-miR-6NoG3Sarc4_g6144TNoExon48hIsopenicillin N synthase family oxygenase (6e-55)
Sar-low-miR-16NoU2Sarc4_g8134TNoExon48hZinc finger X-linked protein ZXDB-like (3e-54)
Sar-high-miR-16NoA2Sarc4_g31551TNoExon54hNo hit
Sar-mid-miR-3pri-miRNA A0Sarc4_g32950TYesExon54hAldehyde reductase (6e-65)
Sar-low-miR-27 (co-mature)NoU4Sarc4_g7215TNoExon54hSortase B protein-sorting domain-containing protein (3e-09)
Sar-mid-miR-16miRNAU3Sarc4_g182TNoExon30h, 48h, 54hConserved hypothetical protein (6e-08)
Sar-mid-miR-16miRNAU3Sarc4_g31933TNoExon30h, 48h, 54hPutative E3 ubiquitin-protein ligase, makorin-related (6e-16)
Sar-mid-miR-3pri-miRNA A0asmbl_10212NoNon-coding RNA30h, 48h, 54hNo hit
Sar-high-miR-2pri-miRNA A3asmbl_14039NoNon-coding RNA30h, 48h, 54hNo hit

a Both Sar-high-miR-6(8)a have identical mature sequences.

e16e098c-2dd4-419e-9328-cbdec89b7fc6_figure5.gif

Figure 5. Category 0 targets identified by CleaveLand.

Legends: The plots show the category 0 targets identified by CleaveLand. The y-axis shows the number of degradome reads (from cleaved mRNAs) starting at each nucleotide position on the transcripts (x-axis). The red dots mark the degradome reads starting at the predicted cleavage site (i.e., at the 10th nucleotide of the complementary sequence to the mature miRNA).

Discussion

We have identified miRNAs in two ichthyosporean species outside Sphaeroforma, A. whisleri and P. gemmata, as well as several novel miRNAs among Sphaeroforma sp. This extends the results of Bråte et al.,7 and shows that Sphaeroforma is not an exception, but that miRNAs are probably a general phenomenon among Ichthyosporea. This also strengthens the hypothesis that miRNAs were present in the last common ancestor of Holozoa, and that they have been lost in the lineages Filasterea, Choanoflagellata and Ctenophora.

Most of the identified ichthyosporean miRNAs are lineage-specific, meaning that they are not conserved between A. whisleri, P. gemmata and Sphaeroforma sp. This lack of conservation is perhaps unsurprising, given that these lineages are estimated to have diverged more than 500 million years ago.30 This suggest that most of the miRNAs evolved independently in each genus, and that there is a rapid birth-and-death dynamic in the evolution of miRNAs in Ichthyosporea. Similar patterns have also been observed in sponges and cnidarians, where no miRNAs are conserved between the major sponge lineages,6,31 and only a handful of the nearly 100 miRNAs in Nematostella vectensis are shared with Hydra, another cnidarian.12

Despite the lack of conserved miRNAs across Ichthyosporea, several miRNAs are conserved between the Sphaeroforma species. These species have very similar cellular morphologies and life cycles, and exhibit high similarity based on 18S small ribosomal subunit comparisons,32 although precise estimates of their divergence times are unavailable. Given their close relationships, an even higher degree of miRNA conservation across Sphaeroforma species might have been expected. The low observed conservation is likely due to incomplete identification of miRNAs in S. sirkka, S. tapetis, and S. gastrica because of poor quality of the transcriptome assemblies.

Additionally, there are many more primary miRNA genomic loci conserved between Sphaeroforma species than there are miRNAs. Even a few shared between S. arctica, A. whisleri, and P. gemmata. This suggests that lineage-specific miRNAs may have evolved from ancestral genomic loci that gained the ability to be transcribed, likely through the acquisition of transcription factor binding sites. This aligns with the idea of rapid evolutionary turnover in miRNA processing,33 where minor genomic changes can enable or disrupt miRNA expression and processing.

Because miRNAs are short non-coding molecules, often with very few nucleotides of functional importance, they can be difficult to identify based on sequencing and gene prediction alone. It is also obvious that miRNA identification is dependent on the input data. For example, the reason why so many novel miRNAs in Sphaeroforma are detected here compared to the 8 miRNAs identified in Bråte et al7 is probably because a much higher number of small RNA reads were sequenced, including different cellular stages. And in S. sirkka, which is very closely related to S. arctica with almost identical cellular morphology,32 only a few miRNAs were identified. But S. sirkka also had the least number of small RNA reads. However, the number of conserved pri-miRNAs between the two species is quite high and probably more of these could be identified as expressed had we sequenced the small RNA fraction deeper. For S. tapetis and S. gastrica we also identified few miRNAs despite a large number of small RNA reads. But these species have very poor and highly fragmented genome assemblies, generated from low coverage short read data. And this has probably reduced the power to detect miRNA precursor sequences in the genomes.

To identify novel miRNAs, one of the biggest jobs is to separate true miRNAs from other RNA segments with hairpin characteristics or other miRNA-resembling features.1 All miRNA identification tools report a large number of false positives, and even the accuracy of the best tool rarely surpasses 60%.23 Because Ichthyosporea are non-model species, with very little information available on the kind of characteristics to expect from the miRNAs, we used a less stringent system for miRNA identification but classified the miRNAs into three different confidence categories based on the number of identification criteria they met. This classification of miRNAs is by no means perfect, and there are several caveats; Of the seven identification criteria used here, we consider criteria I-IV the most important, as they directly reflect signatures of the pri- and pre-miRNA processing. Therefore, some of the miRNAs included here could be for instance either endogenous siRNAs or other spuriously transcribed and processed hairpin structures. This is especially true for the low-confidence miRNAs, where several of the miRNAs have uneven processing of the mature and star regions, or many reads mapping to the loop region. On the other hand, several low-confidence miRNAs do show consistent processing of the mature and star region, but could have unexpected hairpin structure, or some would lack star consistency but otherwise shown consistent mature processing. Therefore, it is not easy to simply discard the miRNAs belonging to the low- or even mid-confidence categories. Nevertheless, there are many miRNAs identified here belonging to the high-confidence category, and that these show all the signs of processing by the Microprocessor and are likely to be functional miRNAs.

The functional mechanisms of miRNAs in Ichthyosporea are unknown. It is unclear whether they operate like plant and cnidarian miRNAs, which target mRNAs through long complementary binding sites to cleave transcripts, or like bilaterian miRNAs, which primarily use a short seed region for target recognition. We investigated the possibility of animal-like targeting in S. arctica using TargetScan,27 which identifies potential targets based on seed region complementarity. This showed that thousands of transcripts contain binding sites for the miRNAs in their 3’ UTRs, indicating a potential for extensive gene regulation similar to bilaterian animals. However, the reliability of this prediction alone is very limited due to the small sequence length of seed regions and the lack of experimental validation. These results should therefore only serve as a foundation for future experimental studies.

Moran et al.12 showed that in Cnidarians miRNAs can induce mRNA target cleavage and speculated that this might be the ancestral mode of action for animal miRNAs. We examined whether this could also be the case for ichthyosporean miRNAs. Using Tapir,26 we identified thousands of mRNA transcripts across Ichthyosporea with high sequence complementarity to miRNAs, suggesting that many mRNAs could be targeted in this manner. This is further supported by our comparison of mRNA cleavage positions with the predicted target sites of the S. arctica miRNAs. We find clear indications for miRNA-induced slicing of mRNA transcripts for 11 of the predicted miRNAs. Many of these cleaved transcripts were targeted by miRNAs conserved between Sphareoforma sp. In bilaterians, evolutionarily older and conserved miRNAs tend to target conserved genes.2 In S. arctica, no such clear pattern is observed. Many of the cleaved transcripts targeted by conserved miRNAs are a mixture of genes with identifiable homologs in other species and apparently species-specific genes.

A notable feature of the miRNAs directing cleavage in S. arctica is their preference for starting with an A or U nucleotide. This is similar to what has been found for vertebrate miRNAs, where almost 90% of the miRNAs start with a U or an A nucleotide,1 and is believed to reflect the binding preference of the Argonaute protein34 in the formation of the gene silencing complex. Ichthyosporeans also have homologs of Argonaute,7 and the A/U bias may suggest that these miRNAs are also bound by Argonaute, although this remains to be verified.

It is worth noting that the degradome sequencing approach employed here likely underestimates the true number of cleaved targets. This is because mRNAs marked for cleavage often lose their poly(A) tail, and the poly(A)-selection step in the degradome-seq protocol would exclude such transcripts. Moran et al.12 demonstrated that using 5′-RACE, which sequences 5′-start sites independent of poly(A) selection, can identify significantly more cleavage events. Additionally, the CleaveLand28 algorithm used here only detects cleavage between positions 10 and 11 of the miRNA binding site. However, miRNA-induced cleavage in S. arctica could also occur at alternative positions. These factors suggest the number of cleaved targets in S. arctica is likely higher than identified in this study.

Thus, it is likely that ichthyosporean miRNAs bind to and regulate a diverse array of gene transcripts, though the precise targeting mode and functional outcomes remain to be experimentally verified. These findings indicate that S. arctica miRNAs regulate gene expression by inducing mRNA cleavage, a mechanism reminiscent of cnidarians and supporting the hypothesis that cleavage was the ancestral function of animal miRNAs. It remains to be determined whether such cleavage events occur at specific developmental time points, potentially linking miRNA activity to particular cellular stages.

Ethics and consent

Ethical approval and consent were not required.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 24 Mar 2025
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Zhu H and Bråte J. Identification and Functional Characterization of microRNAs in Ichthyosporea [version 1; peer review: awaiting peer review]. F1000Research 2025, 14:319 (https://doi.org/10.12688/f1000research.162501.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status:
AWAITING PEER REVIEW
AWAITING PEER REVIEW
?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 24 Mar 2025
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.