<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.161461.3</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Genome Note</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>The genome sequence of 
                    <italic>Tethysbaena scabra</italic> (Pretus, 1991), the first known in the peracarid crustacean order 
                    <italic>Thermosbaenacea</italic>.</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 3; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Pons</surname>
                        <given-names>Joan</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4683-8840</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sch&#x00f6;ninger-Almaraz</surname>
                        <given-names>Karen D.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0009-0006-8477-2279</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Triginer-Llabr&#x00e9;s</surname>
                        <given-names>Laura</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0009-0002-4680-1172</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Juan</surname>
                        <given-names>Carlos</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6067-2963</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jaume</surname>
                        <given-names>Dami&#x00e0;</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jurado-Rivera</surname>
                        <given-names>Jos&#x00e9; A.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0999-2803</uri>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Animal and Microbial Biodiversity, Institut Mediterrani d'Estudis Avancats, Esporles, Illes Balears, 07190, Spain</aff>
                <aff id="a2">
                    <label>2</label>Centre Balear de Biodiversitat, Departament de Biologia, Universitat de les Illes Balears, Palma, Balearic Islands, 07122, Spain</aff>
                <aff id="a3">
                    <label>3</label>Biologia, Universitat de les Illes Balears, Palma, Balearic Islands, 07122, Spain</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:jpons@imedea.uib-csic.es">jpons@imedea.uib-csic.es</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>19</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2025</year>
            </pub-date>
            <volume>14</volume>
            <elocation-id>293</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>16</day>
                    <month>9</month>
                    <year>2025</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Pons J et al.</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/14-293/pdf"/>
            <abstract>
                <p>We present a genome assembly of 
                    <italic toggle="yes">Tethysbaena scabra</italic> (Arthropoda; Crustacea; Malacostraca; Eumalacostraca; Peracarida; Thermosbaenacea; Monodellidae), a species endemic to Mallorca, Spain. The assembly spans 1.18 gigabases, scaffolded into 17 chromosomes, plus a mitochondrial genome 16.5 kilobases in length.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Thermosbaenacea</kwd>
                <kwd>anchialine environment</kwd>
                <kwd>stygobiont species</kwd>
                <kwd>Tethysbaena scabra</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="https://doi.org/10.13039/501100022397">
                    <funding-source>Govern de les Illes Balears</funding-source>
                    <award-id>Conselleria d&#x2019;Educaci&#x00f3; i Universitats and by the European Union - NextGenerationEU (BIO2022/013A)</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="https://doi.org/10.13039/501100010770">
                    <funding-source>Institut d'Estudis Catalans</funding-source>
                    <award-id>Catalan Biogenome Project (PRO2021-S02-Jurado)</award-id>
                </award-group>
                <funding-statement>Funding: This work has been partially sponsored and promoted by the Institut d'Estudis Catalans (Catalan Biogenome Project grant PRO2021-S02-Jurado). The Catalan Biogenome Project is an EBP-affiliated project network that aims to sequence the genomes of more than 40,000 eukaryotic species living in the Catalan Linguistic Area (which includes the Balearic Islands). Additional funding was provided by the Govern de les Illes Balears - Conselleria d&#x2019;Educaci&#x00f3; i Universitats and by the European Union - Next Generation EU (BIO2022/013A). The work of KDSA and LTL has been partially funded and promoted by the Comunitat Aut&#x00f2;noma de les Illes Balears through the Conselleria d'Educaci&#x00f3; i Universitats and by the European Union - Next Generation EU/PRTR-C17.I1 (SINCO2022/6717). Nevertheless, the views and opinions expressed are solely those of the authors and do not necessarily reflect those of the Conselleria d&#x2019;Educaci&#x00f3; i Universitats, the European Union or the European Commission. Therefore, none of these organizations shall be held liable. This study has been funded by GOIB/Conselleria d'Educaci&#x00f3; i Universitats through the project "SINCO2022/18146" and co-funded by the European Union.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 2</title>
                <p>We corrected the BioSample accession number, reworded a sentence to clarify the number of scaffolds, and replaced Figure 4 so that it shows the results after the second round of filtering in BlobToolKit; the previous version showed the results after the first round.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>

                <italic toggle="yes">Tethysbaena scabra</italic> (Pretus, 1991) (NCBI:txid203899) is a thermosbaenacean (Crustacea; Multicrustacea; Malacostraca; Eumalacostraca; Peracarida; Thermosbaenacea; Monodellidae). Thermosbaenaceans are a relict group of peracarid crustaceans in which gravid females carry a dorsal brood pouch formed by a posterior extension of the carapace (
                <xref ref-type="fig" rid="f1">Figure 1</xref>). This species measures 2&#x2013;3 mm in length, is completely eyeless and depigmented, and inhabits subterranean waters of elevated salinity in caves and wells located near the coast. It is endemic to the Mediterranean islands of Mallorca and Menorca (Balearic Archipelago). It feeds as a particle collector and thrives primarily in the pycnoclines that develop within the water column of anchialine caves, where organic debris, bacteria, and fungi accumulate. No information is available on genome size or chromosome number in thermosbaenaceans. The closest taxa with known genome sizes (
                <ext-link ext-link-type="uri" xlink:href="https://www.genomesize.com">https://www.genomesize.com</ext-link>, 1C values in pg) belong to the peracarid groups Isopoda (1.70&#x2013;8.60), Amphipoda (0.52&#x2013;64.62) and Mysida (10.81&#x2013;12.00).</p>
            <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                <label>
Figure 1. </label>
                <caption>
                    <title>Photograph of a 
                        <italic toggle="yes">Tethysbaena scabra</italic> (qmTetScab1) specimen.</title>
                </caption>
                <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure1.gif"/>
            </fig>
            <p>The genome sequence from 
                <italic toggle="yes">T. scabra</italic> will help to study adaptation to underground environments, particularly anchialine ones, that are characterized by oligotrophy, darkness and salinity. The genome of 
                <italic toggle="yes">T. scabra</italic> was sequenced under the umbrella of the Catalan Initiative for the Earth BioGenome Project (CBP). Here we present a chromosome-level genome assembly for 
                <italic toggle="yes">T. scabra</italic> from Mallorca, Spain, which represents the first reference genome for the order Thermosbaenacea.</p>
        </sec>
        <sec id="sec2" sec-type="methods">
            <title>Methods</title>
            <p>Specimens were collected in late spring 2022 with a modified plankton net from the bottom of a well in an old windmill at Es Pil&#x00b7;lar&#x00ed;, Palma, Mallorca, Spain (39.533831, 2.747581), and were sorted under a stereomicroscope (
                <xref ref-type="fig" rid="f2">Figure 2</xref>). Several batches of 20 specimens each were placed in cryovials, snap-frozen in liquid nitrogen, and subsequently shipped on dry ice to the sequencing facilities. Specimens were collected and identified by Dami&#x00e0; Jaume. Extraction of high-molecular-weight DNA, construction of Pacific Biosciences HiFi circular consensus sequencing libraries, and sequencing on a Pacific Biosciences Sequel II instrument (HiFi) were performed by the Delaware Biotechnology Institute, University of Delaware (DE, USA), using a pool of 20 specimens (Accession number: SAMEA113414145, qmTetScab1). Hi-C data were generated from another pool of 20 individuals from the same collection site (Accession number: SAMEA118091338) using an Omni-C library preparation and sequenced as 2 &#x00d7; 150 bp reads on the Illumina NovaSeq 6000 S4 instrument at the Centre Nacional d&#x2019;An&#x00e0;lisi Gen&#x00f2;mica (CNAG), Barcelona, Spain.</p>
            <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                <label>
Figure 2. </label>
                <caption>
                    <title>Photograph of 
                        <italic toggle="yes">Tethysbaena scabra</italic> specimens under magnification.</title>
                </caption>
                <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure2.gif"/>
            </fig>
            <p>The genome size was estimated using GenomeScope2 (
                <xref ref-type="bibr" rid="ref16">Vurture et al., 2017</xref>), and diploidy was confirmed with Smudgeplot (
                <xref ref-type="bibr" rid="ref10">Ranallo-Benavidez et al., 2020</xref>). Assembly was conducted using hifiasm (
                <xref ref-type="bibr" rid="ref3">Cheng et al., 2021</xref>) with n_hap=40 (accounting for diploidy and the 20 pooled individuals). The large number of haplotypic duplications, presumably caused by the high number of specimens pooled for DNA extraction, was removed with purge_dups (
                <xref ref-type="bibr" rid="ref5">Guan et al., 2020</xref>), reducing the assembly from 2208 to 1272 contigs. Genomic DNA was extracted from individuals smaller than 5 mm that were not externally cleaned, so it could also contain DNA from microbial and other eukaryotic contaminants. Hence, contigs from contaminant species were removed from the assembly using two bioinformatic tools, Foreign Contamination Screen (FCS, 
                <xref ref-type="bibr" rid="ref1">Astashyn et al., 2024</xref>), and Whokaryote (
                <xref ref-type="bibr" rid="ref9">Pronk and Medema, 2022</xref>), obtaining 993 contigs. FCS aligns assemblies, preprocessed to mask repetitive and low-complexity regions, to a curated reference database. The pipeline segments scaffolds into 100-kb subsequences and employs hashed k-mers as alignment seeds. Sequences assigned to taxonomic groups distinct from the query organism (NCBI:txid203899) were then excluded. Whokaryote is a computational tool that differentiates eukaryotic from prokaryotic contig sequences based on fundamental differences in gene structure between the two taxonomic domains. It uses a random forest approach in combination with Tiara predictions, which incorporate k-mer frequency distributions as a classification feature. The assembly was scaffolded with Hi-C data (
                <xref ref-type="bibr" rid="ref11">Rao et al., 2014</xref>) using YaHS (
                <xref ref-type="bibr" rid="ref17">Zhou et al., 2023</xref>), obtaining 821 scaffolds. The assembly was then screened for contamination with two rounds of BlobToolKit filtering to ensure complete decontamination, leaving 59 scaffolds. FCS and Whokaryote removed very few sequences compared with BlobToolKit because they rely only on a reference from a closely related taxon, which is not available for Thermosbaenacea, and on gene structure and domains, whereas BlobToolKit combines several features (GC content, coverage, BUSCO reference, etc.). The contact map was curated using Pretext (
                <xref ref-type="bibr" rid="ref6">Harry, 2022</xref>), which suggested connections between scaffolds and reduced the final assembly from 59 to 17 scaffolds, while retaining 229 gaps of unknown size (represented as 100 consecutive Ns in the FASTA file). Putative sex chromosomes have not been identified, likely due to the genomic material being sourced from a pool of 20 individuals of unknown sex, and the Hi-C data being derived from a separate pool of specimens. Additionally, the coverage obtained has not been sufficient to deduce sex-linked chromosomes. The genome was analysed within the BlobToolKit environment and BUSCO scores were generated (
                <xref ref-type="bibr" rid="ref2">Challis et al., 2020</xref>). 
                <xref ref-type="table" rid="T1">
Table 1</xref> lists the software tools and versions used, where appropriate. To assess assembly quality, k-mer completeness and consensus quality values (QV) were calculated using Meryl and Merqury (
                <xref ref-type="bibr" rid="ref12">Rhie et al., 2020</xref>).</p>
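            <p>Because gaps of unknown size are encoded as runs of 100 consecutive Ns, the gap count of an assembly can be checked directly from its sequences. The following is a minimal illustrative sketch, not part of the published pipeline; the function name and the toy sequence are hypothetical.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
import re


def count_gaps(sequence: str, gap_len: int = 100) -> int:
    """Count runs of N (either case) that are exactly gap_len bases long.

    Hypothetical helper for illustration: it assumes each gap of unknown
    size is written as a run of exactly gap_len consecutive Ns.
    """
    runs = re.findall(r"[Nn]+", sequence)
    return sum(1 for run in runs if len(run) == gap_len)


# Toy scaffold: two 100-N gaps plus a short 3-N run that is not a gap.
toy_scaffold = "ACGT" + "N" * 100 + "GGCC" + "N" * 100 + "TTAA" + "N" * 3
print(count_gaps(toy_scaffold))
```
]]></code>
            <p>Summing this count over all scaffolds of a FASTA file would give the total number of gaps; shorter or longer runs of Ns are deliberately ignored in this sketch.</p>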
            <table-wrap id="T1" orientation="portrait" position="float">
                <label>
Table 1. </label>
                <caption>
                    <title>Software tools: versions and sources.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Software tool</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Version</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Source</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Blastn</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2.12.0+</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html">
https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">BlobToolKit</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">4.3.5</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/blobtoolkit/blobtoolkit">https://github.com/blobtoolkit/blobtoolkit</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">BUSCO</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">5.5.0</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://gitlab.com/ezlab/busco/-/archive/5.5.0/busco-5.5.0.zip">https://gitlab.com/ezlab/busco/-/archive/5.5.0/busco-5.5.0.zip</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">FCS</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.5.3</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/ncbi/fcs">https://github.com/ncbi/fcs</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">GenomeScope2</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2.0</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/tbenavi1/genomescope2.0">https://github.com/tbenavi1/genomescope2.0</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Hifiasm</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.20.0-r639</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/chhylp123/hifiasm">https://github.com/chhylp123/hifiasm</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Merqury</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.3</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/marbl/merqury">https://github.com/marbl/merqury</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Meryl</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.4.1</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/marbl/meryl">https://github.com/marbl/meryl</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">PretextMap</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.1.9</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/sanger-tol/PretextMap">https://github.com/sanger-tol/PretextMap</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">RepeatMasker</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">4.1.7</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/Dfam-consortium/RepeatMasker">https://github.com/Dfam-consortium/RepeatMasker</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">RepeatOBServer</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.0</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/celphin/RepeatOBserverV1">https://github.com/celphin/RepeatOBserverV1</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Purge_dups</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.2.5</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/dfguan/purge_dups">https://github.com/dfguan/purge_dups</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Smudgeplot</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.3.0</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.dev/KamilSJaron/smudgeplot">https://github.dev/KamilSJaron/smudgeplot</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Whokaryote</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/LottePronk/whokaryote/commit/df4d26240e7ad5c7080486ddf27538ecec85e7eb">1.1.2</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/LottePronk/whokaryote">https://github.com/LottePronk/whokaryote</ext-link>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">YaHS</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.2</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://github.com/c-zhou/yahs">https://github.com/c-zhou/yahs</ext-link>
</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>Assembly of the mitochondrial genome failed using MitoHiFi (
                <xref ref-type="bibr" rid="ref15">Uliano-Silva et al., 2023</xref>), likely because no mitogenome sequence from a sufficiently close taxon is available in genome databanks. For this reason, contigs were compared with a relaxed BLASTn search against a database built from mitogenome sequences of several peracarid species. The 30-kb sequence with a positive match was circularized in MitoMaker (
                <xref ref-type="bibr" rid="ref13">Schomaker-Bastos and Prosdocimi, 2018</xref>), and annotated in Mitos2 (
                <xref ref-type="bibr" rid="ref4">Donath et al., 2019</xref>).</p>
            <p>Repeat annotation was performed using RepeatMasker (
                <xref ref-type="bibr" rid="ref19">Smit et al., 2013&#x2013;2015</xref>) and RepeatOBserver (
                <xref ref-type="bibr" rid="ref18">Elphinstone et al., 2025</xref>). The former tool identifies low-complexity DNA regions as well as interspersed repeats. In contrast, RepeatOBserver describes tandem repeats and clusters of transposons found in a chromosome-level assembly, based on repeat patterns. It also returns a predicted centromere location for each chromosome.</p>
        </sec>
        <sec id="sec3" sec-type="results">
            <title>Results</title>
            <p>The genome sequence was obtained from a DNA pool of 20 specimens of 
                <italic toggle="yes">T. scabra</italic> for HiFi data, plus a second pool of 20 specimens for Hi-C data, from individuals collected in a well in Es Pil&#x00b7;lar&#x00ed;, Palma, Mallorca, Spain. Two Pacific Biosciences sequencing cells yielded a total of 63.5 gigabases of high-fidelity (HiFi) long reads with an N50 of 13,270 bp, achieving a coverage of 53.8X. Primary contig assemblies were then scaffolded using 73.9 Gb of paired-end Illumina reads derived from Hi-C chromosome conformation data. Manual curation corrected 39 misassemblies, including missing joins and misjoins, resulting in a 0.28% reduction in the total assembly length, a 61.02% decrease in scaffold count, and an 89.99% increase in scaffold N50. The final genome assembly spans 1.18 Gb across 23 scaffolds, with a scaffold N50 of 74.6 Mb (
                <xref ref-type="fig" rid="f3">Figure 3</xref>, 
                <xref ref-type="table" rid="T2">
Table 2</xref>). GC-coverage (
                <xref ref-type="fig" rid="f4">Figure 4</xref>) and cumulative sequence plots (
                <xref ref-type="fig" rid="f5">Figure 5</xref>) from BlobToolKit showed minimal parameter variation with few outliers, and only a very small fraction of sequences failed to match arthropod sequences deposited in databases. Most of the assembly sequence (99.2%) was assigned to the final chromosomes. The final assembly sequence confirmed by Hi-C data was assigned to 17 chromosome-level scaffolds, which are numbered in the order in which they appear in the PretextMap (
                <xref ref-type="fig" rid="f6">Figure 6</xref>; 
                <xref ref-type="table" rid="T3">
Table 3</xref>). The assembly has a BUSCO v5.5.0 (
                <xref ref-type="bibr" rid="ref7">Manni et al., 2021</xref>; 
                <xref ref-type="bibr" rid="ref14">Sim&#x00e3;o et al., 2015</xref>) completeness of 93.7% (single-copy 93.0%, duplicated 0.7%) using the arthropoda_odb10 reference set. The mitochondrial genome contig is included in the multi-FASTA file of the genome submission.</p>
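            <p>As a quick consistency check on the sequencing depth quoted above, mean coverage can be approximated as total read yield divided by assembly span. The following is a minimal sketch using the figures from the text; the function and variable names are illustrative assumptions and are not part of the assembly pipeline.</p>
            <preformat>
```python
# Sketch: mean sequencing depth as total read yield divided by assembly span.
# The figures are those quoted in the Results text; all names are illustrative.


def fold_coverage(yield_gb: float, assembly_span_gb: float) -> float:
    """Return the approximate mean depth of coverage (X)."""
    if not assembly_span_gb > 0:
        raise ValueError("assembly span must be positive")
    return yield_gb / assembly_span_gb


hifi_yield_gb = 63.5     # total HiFi long-read yield (Gb)
assembly_span_gb = 1.18  # final assembly span (Gb)

hifi_depth = fold_coverage(hifi_yield_gb, assembly_span_gb)
print(f"HiFi depth: {hifi_depth:.1f}X")
```
            </preformat>
            <p>With the values above, the result rounds to the 53.8X coverage reported for the HiFi data.</p>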
            <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                <label>
Figure 3. </label>
                <caption>
                    <title>Snailplot of the genome assembly of 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1.</title>
                    <p>This snailplot, generated by BlobToolKit, displays several metrics, including the longest scaffold, N50, and BUSCO gene completeness. The main plot is divided into 50 bins, ordered by size around the circumference, with each bin representing 2% of the 1.18 Gbp assembly. Scaffold length distribution is shown in dark grey, with the plot radius scaled to the length of the longest scaffold in the assembly (104 Mbp). Orange and light-orange arcs indicate the N50 and N90 scaffold lengths (74.6 Mbp and 55.4 Mbp, respectively). A pale grey spiral illustrates the cumulative scaffold count on a log scale, with white scale lines marking successive orders of magnitude. The blue and pale-blue areas along the plot&#x2019;s outer edge depict the GC, AT, and N content distribution across these bins. A summary of the BUSCO results appears in the figure&#x2019;s top right corner.</p>
                </caption>
                <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure3.gif"/>
            </fig>
            <table-wrap id="T2" orientation="portrait" position="float">
                <label>
Table 2. </label>
                <caption>
                    <title>Genome data for 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1.</title>
                    <p>Assembly metric benchmarks are adapted from the 6.C.Q40 standard of the Earth BioGenome Project (
                        <xref ref-type="bibr" rid="ref8">Lawniczak et al., 2022</xref>). BUSCO scores are based on the arthropoda_odb10 lineage set using BUSCO v5.5.0. C = complete, [S = single copy, D = duplicated], F = fragmented, M = missing, n = number of orthologues in comparison.</p>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="3" rowspan="1" valign="top">Project accession data</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
Assembly name</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Tethysbaena scabra</italic>
</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Assembly accession</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">GCA_964277195</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Accession of alternate haplotype</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">-</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Span (Mb)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1200</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Number of contigs</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">322</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Contig N50 length (Mb)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">6.1</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Number of scaffolds</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">23</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Scaffold N50 length (Mb)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">74.5</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Longest scaffold (Mb)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">104.45</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Gaps (bp)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">299 standardized 100 bp gaps</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>Assembly metrics</bold>
</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>
Benchmark</bold>
</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Consensus quality (QV)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">50.41</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2265;40</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">K-mer completeness</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">92.5%</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2265;90%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">BUSCO</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">C:93.7%[S:93%,D:0.7%],
                                <break/>F:3%,M:3.4%,n:1,013</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">C &#x2265;90%, D &lt;5%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Percentage of assembly mapped to chromosomes</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">99.2%</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2265;90%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Organelles</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">MT</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Complete single alleles</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
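            <p>The consensus quality value (QV) in Table 2 is Phred-scaled, so it can be converted to an approximate per-base error probability as P = 10^(-QV/10). A minimal sketch of that conversion follows; the function name is hypothetical, and the calculation only illustrates the scale of the reported value and the benchmark, not how the QV itself was estimated.</p>
            <preformat>
```python
# Sketch: convert a Phred-scaled consensus quality value (QV) into an
# approximate per-base error probability. Names are illustrative only.


def qv_to_error_rate(qv: float) -> float:
    """Return the per-base error probability for a Phred-scaled QV."""
    return 10 ** (-qv / 10)


assembly_qv = 50.41   # consensus QV reported in Table 2
benchmark_qv = 40.0   # benchmark threshold listed in Table 2

p_assembly = qv_to_error_rate(assembly_qv)
p_benchmark = qv_to_error_rate(benchmark_qv)

print(f"Assembly:  about one error per {1 / p_assembly:,.0f} bases")
print(f"Benchmark: about one error per {1 / p_benchmark:,.0f} bases")
```
            </preformat>
            <p>On this scale, a QV of 50.41 corresponds to roughly one consensus error per 110,000 bases, about an order of magnitude fewer errors than the QV 40 benchmark allows.</p>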
            <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                <label>
Figure 4. </label>
                <caption>
                    <title>Genome assembly of 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1: BlobToolKit GC-coverage plot.</title>
                    <p>Scaffolds are colored by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis.</p>
                </caption>
                <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure4.gif"/>
            </fig>
            <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                <label>
Figure 5. </label>
                <caption>
                    <title>Genome assembly of 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1: BlobToolKit cumulative sequence plot.</title>
                    <p>The gray line represents the cumulative length of all scaffolds, while the colored lines indicate the cumulative lengths of scaffolds assigned to each individual phylum.</p>
                </caption>
                <graphic id="gr5" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure5.gif"/>
            </fig>
            <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                <label>
Figure 6. </label>
                <caption>
                    <title>Genome assembly of 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1: Hi-C contact map of the assembly, visualised using PretextMap.</title>
                    <p>Chromosomes are shown in the order in which they appear in PretextMap, not in order of size.</p>
                </caption>
                <graphic id="gr6" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/188221/569ae74d-6e28-4f55-bc3c-53613adad555_figure6.gif"/>
            </fig>
            <table-wrap id="T3" orientation="portrait" position="float">
                <label>
Table 3. </label>
                <caption>
                    <title>Chromosomal pseudomolecules in the genome assembly of 
                        <italic toggle="yes">Tethysbaena scabra</italic>.</title>
                    <p>

                        <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/GCA_964277195.1?show=chromosomes">https://www.ebi.ac.uk/ena/browser/view/GCA_964277195.1?show=chromosomes</ext-link>.</p>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Accession</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Name</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Length (Mb)</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
GC%</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195310">OZ195310</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_1</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">83.11</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.29</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195311">OZ195311</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_2</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">104.46</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.18</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195312">OZ195312</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_3</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">85.72</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.29</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195313">OZ195313</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_4</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">82.79</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.44</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195314">OZ195314</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_5</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">87.20</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.33</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195315">OZ195315</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_6</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">74.56</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.45</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195316">OZ195316</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_7</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">74.67</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.31</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195317">OZ195317</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_8</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">72.98</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.51</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195318">OZ195318</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_9</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">61.70</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.58</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195319">OZ195319</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_10</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">49.14</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.44</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195320">OZ195320</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_11</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">56.67</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.72</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195321">OZ195321</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_12</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">70.05</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.68</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195322">OZ195322</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_13</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">55.35</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.43</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195323">OZ195323</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_14</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">59.37</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.46</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195324">OZ195324</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_15</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">57.12</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.76</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195325">OZ195325</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_16</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">55.68</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.69</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195326">OZ195326</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">tros_17</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">45.10</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">33.69</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/OZ195327">OZ195327</ext-link>
</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">MT</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">32.04</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
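            <p>The N50 and N90 statistics quoted for this assembly can be approximated from the sequence lengths listed in Table 3. The sketch below is an illustration only: it uses just the 18 sequences in Table 3 (unplaced scaffolds are not included, so values may differ slightly from those computed on the full assembly), and the function name is an assumption rather than part of any tool used here.</p>
            <preformat>
```python
# Sketch: Nx statistic computed from the sequence lengths in Table 3 (Mb).
# Only the sequences listed in Table 3 are used; unplaced scaffolds are omitted.


def nx(lengths, fraction=0.5):
    """Return the length L such that sequences of length >= L together
    cover at least `fraction` of the summed length."""
    total = sum(lengths)
    running = 0.0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("lengths must not be empty")


# Lengths in Mb, in Table 3 order (tros_1 to tros_17, then MT).
lengths_mb = [
    83.11, 104.46, 85.72, 82.79, 87.20, 74.56, 74.67, 72.98, 61.70,
    49.14, 56.67, 70.05, 55.35, 59.37, 57.12, 55.68, 45.10, 0.016,
]

n50 = nx(lengths_mb, 0.5)
n90 = nx(lengths_mb, 0.9)
print(f"N50 = {n50} Mb, N90 = {n90} Mb")
```
            </preformat>
            <p>Under these assumptions the sketch gives values close to the scaffold N50 and N90 shown in Figure 3.</p>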
            <p>BUSCO assessment of the genome annotation gave C:93.1% [S:73.2%, D:19.9%], F:2.2%, M:4.7%. The annotation comprises 22,834 genes and 27,004 transcripts. The average alignment length reported by rnaQUAST was 1,248.6 bp. Repetitive regions are summarized in 
                <xref ref-type="table" rid="T4">
Table 4</xref>.</p>
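            <p>BUSCO summary strings of the kind quoted above can be checked for internal consistency: the complete fraction should equal single-copy plus duplicated, and complete, fragmented and missing should sum to 100%. The following sketch parses the annotation summary and applies both checks; the helper name and the rounding tolerance are assumptions made for illustration.</p>
            <preformat>
```python
# Sketch: parse a BUSCO summary string and check that its percentages are
# internally consistent. The helper name and tolerance are illustrative.
import math
import re


def parse_busco(summary: str) -> dict:
    """Map each BUSCO category letter (C, S, D, F, M) to its percentage."""
    pairs = re.findall(r"([CSDFM]):\s*([\d.]+)%", summary)
    return {key: float(value) for key, value in pairs}


annotation = parse_busco("C:93.1% [S:73.2%, D:19.9%], F:2.2%, M:4.7%")

# A tolerance of 0.15 allows for each value being rounded to one decimal place.
complete_ok = math.isclose(
    annotation["S"] + annotation["D"], annotation["C"], abs_tol=0.15
)
total_ok = math.isclose(
    annotation["C"] + annotation["F"] + annotation["M"], 100.0, abs_tol=0.15
)

# Mean number of transcripts per annotated gene, from the counts in the text.
transcripts_per_gene = 27004 / 22834

print(complete_ok, total_ok, round(transcripts_per_gene, 2))
```
            </preformat>
            <p>Both checks pass for the annotation summary, and the transcript and gene counts correspond to about 1.18 transcripts per gene.</p>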
            <table-wrap id="T4" orientation="portrait" position="float">
                <label>Table 4. </label>
                <caption>
                    <title>Summary of the repetitive elements found by RepeatMasker in the genome of 
                        <italic toggle="yes">Tethysbaena scabra</italic>, qmTetScab1.1.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="1" rowspan="1" valign="top">Number of elements</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Length occupied</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Percentage of sequence</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">SINEs:</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">3,285</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">217,586 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.02%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">ALUs</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">7</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">499 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">MIRs</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">381</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">30,645 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">LINEs:</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">100,876</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">97,666,309 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">8.30%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">LINE1</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">3,138</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">378,718 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.03%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">LINE2</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">47,591</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">44,895,798 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">3.81%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">L3/CR1</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">49,210</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">52,023,230 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">4.42%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">LTR elements:</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">1,726</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">541,618 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.05%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">ERVL</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">80</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">10,534 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">ERVL-MaLRs</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">118</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">12,225 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">ERV_classI</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1,224</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">337,524 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.03%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">ERV_classII</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">46</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">4,692 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">DNA elements:</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">39,909</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">19,071,121 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.62%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">hAT-Charlie</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">20,903</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">9,453,122 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.80%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">TcMar-Tigger</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">3,285</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1,466,820 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.12%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Unclassified</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">20</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">3,649 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Total</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Interspersed</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">117,500,283 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">9.98%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">Small RNA</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1,757</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">176,391 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.01%</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Satellites</td>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">94</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">13,096 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.00%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">Simple repeats</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">552,457</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">26,333,387 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2.24%</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1"/>
                            <td align="left" colspan="1" rowspan="1" valign="top">Low complexity</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">71,177</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">3,444,953 bp</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.29%</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <sec id="sec4">
                <title>Ethics and consent</title>
                <p>Ethical approval and consent were not required.</p>
            </sec>
        </sec>
        <sec id="sec5">
            <title>Author contributions</title>
            <p>Conceptualization (JP, CJ, DJ, JAJR), Data Curation (KDSA, LTL, JP), Formal Analysis (LTL, KDSA, JP), Funding Acquisition (JAJR, JP), Resources (DJ), Writing &#x2013; Original Draft Preparation (LTL, KDSA, JP), and Writing &#x2013; Review &amp; Editing (all).</p>
        </sec>
    </body>
    <back>
        <sec id="sec8" sec-type="data-availability">
            <title>Data and software availability</title>
            <p>The 
                <italic toggle="yes">Tethysbaena scabra</italic> genome project is integrated into the Catalan Initiative for the Earth BioGenome Project (CBP), and all raw data and the assembly were deposited in the European Nucleotide Archive: 
                <italic toggle="yes">Tethysbaena scabra.</italic> Accession number PRJEB61927; 
                <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/ena.embl/PRJEB61927">https://identifiers.org/ena.embl/PRJEB61927</ext-link>. Raw data and assembly accession identifiers are reported in 
                <xref ref-type="table" rid="T3">Table 3</xref>.</p>
        </sec>
        <ack>
            <title>Acknowledgements</title>
            <p>We thank the bioinformaticians Jessica G&#x00f3;mez-Garrido and Tyler Alioto (Centre Nacional d&#x2019;An&#x00e0;lisi Gen&#x00f2;mica, CNAG) and Emilio Righi (Centre for Genomic Regulation, CRG), both centres in Barcelona, Spain, for their invaluable assistance.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Astashyn</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tvedte</surname>
                            <given-names>ES</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sweeney</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Rapid and sensitive detection of genome contamination at scale with FCS-GX.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2024</year>;<volume>25</volume>(<issue>1</issue>):<fpage>60</fpage>.
                    <pub-id pub-id-type="pmid">38409096</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-024-03198-7</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10898089</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Challis</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Richards</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rajan</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BlobToolKit &#x2013; interactive quality assessment of genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">G3: Genes, Genomes, Genetics.</italic>
</source>
                    <year>2020</year>;<volume>10</volume>(<issue>4</issue>):<fpage>1361</fpage>&#x2013;<lpage>1374</lpage>.
                    <pub-id pub-id-type="pmid">32071071</pub-id>
                    <pub-id pub-id-type="doi">10.1534/g3.119.400908</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7144090</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cheng</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Concepcion</surname>
                            <given-names>GT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Feng</surname>
                            <given-names>X</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2021</year>;<volume>18</volume>(<issue>2</issue>):<fpage>170</fpage>&#x2013;<lpage>175</lpage>.
                    <pub-id pub-id-type="pmid">33526886</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41592-020-01056-5</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7961889</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Donath</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>J&#x00fc;hling</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Al-Arab</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Improved annotation of protein-coding genes boundaries in metazoan mitochondrial genomes.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2019</year>;<volume>47</volume>(<issue>20</issue>):<fpage>10543</fpage>&#x2013;<lpage>10552</lpage>.
                    <pub-id pub-id-type="pmid">31584075</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkz833</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6847864</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Elphinstone</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Elphinstone</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Todesco</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>RepeatOBserver: Tandem Repeat Visualisation and Putative Centromere Detection.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Ecol. Resour.</italic>
</source>
                    <year>2025</year>; e14084.
                    <pub-id pub-id-type="doi">10.1111/1755-0998.14084</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Guan</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>SA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wood</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identifying and removing haplotypic duplication in primary genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2020</year>;<volume>36</volume>(<issue>9</issue>):<fpage>2896</fpage>&#x2013;<lpage>2898</lpage>.
                    <pub-id pub-id-type="pmid">31971576</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa025</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7203741</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Harry</surname>
                            <given-names>E</given-names>
                        </name>
</person-group>:
                    <article-title>PretextView (Paired REad TEXTure Viewer): A desktop application for viewing pretext contact maps.</article-title>
                    <year>2022</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/sanger-tol/PretextView">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Manni</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Berkeley</surname>
                            <given-names>MR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Seppey</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BUSCO: assessing genomic data quality and beyond.</article-title>
                    <source>

                        <italic toggle="yes">Curr. Protoc.</italic>
</source>
                    <year>2021</year>;<volume>1</volume>:<fpage>e323</fpage>.
                    <pub-id pub-id-type="pmid">34936221</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cpz1.323</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lawniczak</surname>
                            <given-names>MK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Durbin</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Flicek</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Standards recommendations for the Earth BioGenome Project.</article-title>
                    <source>

                        <italic toggle="yes">Proc. Natl. Acad. Sci.</italic>
</source>
                    <year>2022</year>;<volume>119</volume>(<issue>4</issue>):<fpage>e2115639118</fpage>.
                    <pub-id pub-id-type="pmid">35042802</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.2115639118</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8795494</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pronk</surname>
                            <given-names>LJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Medema</surname>
                            <given-names>MH</given-names>
                        </name>
</person-group>:
                    <article-title>Whokaryote: distinguishing eukaryotic and prokaryotic contigs in metagenomes based on gene structure.</article-title>
                    <source>

                        <italic toggle="yes">Microb. Genom.</italic>
</source>
                    <year>2022</year>;<volume>8</volume>:<fpage>000823</fpage>.
                    <pub-id pub-id-type="pmid">35503723</pub-id>
                    <pub-id pub-id-type="doi">10.1099/mgen.0.000823</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9465069</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ranallo-Benavidez</surname>
                            <given-names>TR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jaron</surname>
                            <given-names>KS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schatz</surname>
                            <given-names>MC</given-names>
                        </name>
</person-group>:
                    <article-title>GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Commun.</italic>
</source>
                    <year>2020</year>;<volume>11</volume>(<issue>1</issue>):<fpage>1432</fpage>.
                    <pub-id pub-id-type="pmid">32188846</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-020-14998-3</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7080791</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rao</surname>
                            <given-names>SS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huntley</surname>
                            <given-names>MH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Durand</surname>
                            <given-names>NC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping.</article-title>
                    <source>

                        <italic toggle="yes">Cell.</italic>
</source>
                    <year>2014</year>;<volume>159</volume>(<issue>7</issue>):<fpage>1665</fpage>&#x2013;<lpage>1680</lpage>.
                    <pub-id pub-id-type="pmid">25497547</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cell.2014.11.021</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5635824</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rhie</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Walenz</surname>
                            <given-names>BP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Koren</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>(<issue>1</issue>):<fpage>245</fpage>.
                    <pub-id pub-id-type="pmid">32928274</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-02134-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7488777</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schomaker-Bastos</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Prosdocimi</surname>
                            <given-names>F</given-names>
                        </name>
</person-group>:
                    <article-title>mitoMaker: a pipeline for automatic assembly and annotation of animal mitochondria using raw NGS data.</article-title>
                    <source>

                        <italic toggle="yes">Preprints.</italic>
</source>
                    <year>2018</year>.
                    <pub-id pub-id-type="doi">10.20944/preprints201808.0423.v1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sim&#x00e3;o</surname>
                            <given-names>FA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Waterhouse</surname>
                            <given-names>RM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ioannidis</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2015</year>;<volume>31</volume>(<issue>19</issue>):<fpage>3210</fpage>&#x2013;<lpage>3212</lpage>.
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv351</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smit</surname>
                            <given-names>AFA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hubley</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Green</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>RepeatMasker Open-4.0 [Software].</article-title>
                    <year>2013&#x2013;2015</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.repeatmasker.org">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Uliano-Silva</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ferreira</surname>
                            <given-names>JGR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Krasheninnikova</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2023</year>;<volume>24</volume>(<issue>1</issue>):<fpage>288</fpage>.
                    <pub-id pub-id-type="pmid">37464285</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12859-023-05385-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10354987</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Vurture</surname>
                            <given-names>GW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sedlazeck</surname>
                            <given-names>FJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nattestad</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>GenomeScope: fast reference-free genome profiling from short reads.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2017</year>;<volume>33</volume>(<issue>14</issue>):<fpage>2202</fpage>&#x2013;<lpage>2204</lpage>.
                    <pub-id pub-id-type="pmid">28369201</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btx153</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5870704</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhou</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>SA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Durbin</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>:
                    <article-title>YaHS: yet another Hi-C scaffolding tool.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2023</year>;<volume>39</volume>(<issue>1</issue>):<fpage>btac808</fpage>.
                    <pub-id pub-id-type="pmid">36525368</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btac808</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9848053</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report415498">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.188221.r415498</article-id>
            <title-group>
                <article-title>Reviewer response for version 3</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Schwentner</surname>
                        <given-names>Martin</given-names>
                    </name>
                    <xref ref-type="aff" rid="r415498a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r415498a1">
                    <label>1</label>Naturhistorisches Museum Vienna (Austria), Vienna, Austria</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Schwentner M</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport415498" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.161461.3"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I think the manuscript is ready for acceptance; the authors have responded to all issues raised and revised all relevant sections.</p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Partly</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Yes</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>genomics, systematics, evolutionary research, Crustacea</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report406386">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.183424.r406386</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Schwentner</surname>
                        <given-names>Martin</given-names>
                    </name>
                    <xref ref-type="aff" rid="r406386a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r406386a1">
                    <label>1</label>Naturhistorisches Museum Vienna (Austria), Vienna, Austria</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>4</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Schwentner M</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport406386" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.161461.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors present the first genome of a thermosbaenacean, which will be an important resource for future research. Overall, the manuscript is well written and structured, and the methods and results are appropriate and well presented.</p>
            <p> I have a few comments that will help to clarify some issues.</p>
            <p> </p>
            <p> 1. The authors described in the Amendments that they now report 299 standardized 100 bp gaps, but I could not find this information in the actual manuscript.</p>
            <p> </p>
            <p> 2. I tried to download the genome. Maybe I have missed it, but I could not find the whole genome (I was only able to download the first scaffold) and could not find a gene file or similar. Please make sure that all data are available.</p>
            <p> </p>
            <p> 3. The contamination check with Blobtools had a strong impact, as it removed ~770 of the 820 scaffolds. Given this impact, it would be very helpful if the actual values used were described. Currently only the metrics are named (GC, coverage, BUSCO), but not the specific values or ranges employed.</p>
            <p> </p>
            <p> </p>
            <p> 4. I find it a bit difficult to follow the reported numbers of scaffolds and contigs and, if I understand the text correctly, the final numbers of contigs and scaffolds do not quite match those from the beginning. 993 contigs were assembled into 821 scaffolds (thus most scaffolds include only one contig). Only 59 scaffolds were retained after Blobtools filtering, and the final set is 322 contigs in 23 scaffolds. That should not be possible; even the 59 scaffolds after Blobtools should not hold more than ~200 contigs. Maybe I did not fully understand the numbers; the authors should make this procedure clearer.</p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Partly</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Yes</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>genomics, systematics, evolutionary research, crustacea</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment14539-406386">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Pons</surname>
                            <given-names>Joan</given-names>
                        </name>
                        <aff>Animal&amp;Microbial Biodiversity, Institut Mediterrani d'Estudis Avancats, Esporles, Illes Balears, Spain</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>9</day>
                    <month>9</month>
                    <year>2025</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Here are our responses to each question:</p>
                <p> </p>
                <p> 
                    <italic>1. The authors described in the Amendments that they now report 299 standardized 100 bp gaps, but I could not find this information in the actual manuscript.</italic>
                </p>
                <p> </p>
                <p> The information about the 299 standardized 100 bp gaps was included in Table 2. The new version also includes that information in the main text: &#x201c;The contact map was curated using Pretext (Harry, 2022), which suggested connections between scaffolds and reduced the final assembly from 59 to 17 scaffolds, while retaining 229 scaffolds of unknown size (represented as 100 consecutive Ns in the FASTA file).&#x201d;.</p>
                <p> </p>
                <p> 
                    <italic>2. I tried to download the genome. Maybe I have missed it, but I could not find the whole genome (I was only able to download the first scaffold) and could not find a gene file or similar. Please make sure that all data is available</italic>
                </p>
                <p> </p>
                <p> The link in the manuscript 
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/ena.embl/PRJEB61927">https://identifiers.org/ena.embl/PRJEB61927</ext-link>
                    </underline> points to 
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJEB61927">https://www.ebi.ac.uk/ena/browser/view/PRJEB61927</ext-link>
                    </underline> in ENA, where the user can go to the right panel, click on Related ENA Records, and find links to the results of the project:</p>
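                <table-wrap position="anchor">
                    <table>
                        <thead>
                            <tr>
                                <th>Result</th>
                                <th>Count</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td>Experiment (PACBIO, HiC, and RNAseq reads)</td>
                                <td>3</td>
                            </tr>
                            <tr>
                                <td>Assembly</td>
                                <td>1</td>
                            </tr>
                            <tr>
                                <td>Genome assembly contig set</td>
                                <td>1</td>
                            </tr>
                            <tr>
                                <td>Sequence (By Chromosome)</td>
                                <td>18</td>
                            </tr>
                            <tr>
                                <td>Run (PACBIO, HiC, and RNAseq reads)</td>
                                <td>3</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>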
                <p> The raw data (PacBio, Hi-C, and RNA-seq reads) are also available for download from NCBI under BioProject PRJEB61927.</p>
                <p> </p>
                <p> 
                    <italic>3. The contamination check with Blobtools had a strong impact, as it removed ~770 of the 820 scaffolds. Given this impact, it would be very helpful if the actual values used were described. Currently only the metrics are named (GC, coverage, BUSCO), but not the specific values or ranges employed.</italic>
                </p>
                <p> </p>
                <p> GC, length, and coverage after cut-off in Blobtools are indicated in Figure 4. Here we summarize the cut-off filter values implemented in the two rounds of Blobtools:</p>
                <p> First round in BlobToolKit:</p>
                <p> GC_MIN = 0.320</p>
                <p> GC_MAX = 0.351</p>
                <p> SORTED_ALIGNMENT_COVERAGE_MIN = 36.320</p>
                <p> SORTED_ALIGNMENT_COVERAGE_MAX = 680.661</p>
                <p> LENGTH_MIN = 10000</p>
                <p> </p>
                <p> </p>
                <p> Second round in BlobToolKit:</p>
                <p> GC_MIN = 0.330</p>
                <p> GC_MAX = 0.340</p>
                <p> SORTED_ALIGNMENT_COVERAGE_MIN = 50.000</p>
                <p> SORTED_ALIGNMENT_COVERAGE_MAX = 170.000</p>
                <p> LENGTH_MIN = 10000</p>
                <p> </p>
                <p> </p>
                <p> We replaced the results shown in Figure 4, which correspond to the first round of filtering in BlobTools, with those obtained after the second round.</p>
                <p> </p>
                <p> </p>
                <p> 
                    <italic>4. I find it a bit difficult to follow the reported numbers of scaffolds and contigs and, if I understand the text correctly, the final numbers of contigs and scaffolds do not quite match those from the beginning. 993 contigs were assembled into 821 scaffolds (thus most scaffolds include only one contig). Only 59 scaffolds were retained after Blobtools filtering, and the final set is 322 contigs in 23 scaffolds. That should not be possible; even the 59 scaffolds after Blobtools should not hold more than ~200 contigs. Maybe I did not fully understand the numbers; the authors should make this procedure clearer.</italic>
                </p>
                <p> </p>
                <p> Our previous explanation was confusing, so we have provided new wording in the main text: &#x201c;The contact map was curated using Pretext (Harry, 2022), which suggested connections between scaffolds and reduced the final assembly from 59 to 17 scaffolds, while retaining 229 scaffolds of unknown size (represented as 100 consecutive Ns in the FASTA file).&#x201d;</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report396911">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.183424.r396911</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Angst</surname>
                        <given-names>Pascal</given-names>
                    </name>
                    <xref ref-type="aff" rid="r396911a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8654-2251</uri>
                </contrib>
                <aff id="r396911a1">
                    <label>1</label>University of Basel, Basel, Switzerland</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>8</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Angst P</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport396911" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.161461.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The revised manuscript effectively addresses the comments I made as a reviewer. I appreciate the improvements made and the attention given to the issues I raised. Most of the revisions are clear and well implemented. However, I would be interested in receiving some brief clarification on a few of the changes. Could the authors please elaborate on the following points:</p>
            <p> &#x00a0; 
                <list list-type="bullet">
                    <list-item>
                        <p>The accession added, &#x201c;SAMEA11313135&#x201d;, is titled &#x201c;COVID-HUB-PL deep NGS sequencing of SARS-CoV-2 genomes&#x201d; in the ENA. Should this be &#x201c;SAMEA113414145&#x201d;? &#x2013; For &#x201c;SAMEA118091338&#x201d;, the associated Hi-C data seems to be unavailable on the ENA website.</p>
                    </list-item>
                    <list-item>
                        <p>&#x201c;In also returns a predicted centromere location for each chromosome.&#x201d; &#x201c;It also [&#x2026;]&#x201d;? Further, did you find any centromeres or centromere-characteristic repeats?</p>
                    </list-item>
                    <list-item>
                        <p>&#x201c;The genome annotation was assessed using BUSCO obtaining: C:93,1% [S:73,2%, D:19,9%], F:2,2%, M:4,7%, also 27004 transcripts and 22834 genes. RNAQuast has been performed to check the average alignment length, being 1248,6bp.&#x201d; This is the first time that transcripts, genes, and RNA are mentioned. Is this related to the RNA-seq in PRJEB61927? Was gene annotation performed? If so, how was it done, and where can the annotations be found?</p>
                    </list-item>
                </list>
            </p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Partly</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Partly</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Partly</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Genome assembly and annotation | Wet Lab (DNA/RNA extraction and library preparation) | Short- and Long-read sequencing technologies | Evolutionary Biology | Population Genetics | Metapopulation Ecology | Host-Parasite Interactions | Genome Evolution | Environmental Science</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment14538-396911">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Pons</surname>
                            <given-names>Joan</given-names>
                        </name>
                        <aff>Animal&amp;Microbial Biodiversity, Institut Mediterrani d'Estudis Avancats, Esporles, Illes Balears, Spain</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>9</day>
                    <month>9</month>
                    <year>2025</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We reply to each question:</p>
                <p> </p>
                <p> 
                    <italic>1) The accession added, &#x201c;SAMEA11313135&#x201d;, is titled &#x201c;COVID-HUB-PL deep NGS sequencing of SARS-CoV-2 genomes&#x201d; in the ENA. Should this be &#x201c;SAMEA113414145&#x201d;? &#x2013; For &#x201c;SAMEA118091338&#x201d;, the associated Hi-C data seems to be unavailable on the ENA website.</italic>
                </p>
                <p> </p>
                <p> The BioSample used to obtain the PacBio data is SAMEA113414145. We apologize for the mistake and have fixed the error in the new version. The BioSample for the Hi-C data is SAMEA118091338.</p>
                <p> </p>
                <p> The link in the manuscript 
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/ena.embl/PRJEB61927">https://identifiers.org/ena.embl/PRJEB61927</ext-link>
                    </underline> points to 
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJEB61927">https://www.ebi.ac.uk/ena/browser/view/PRJEB61927</ext-link>
                    </underline> in ENA. Then, in the right panel, the user can click on Related ENA Records and find links to all results of the project:</p>
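                <table-wrap position="anchor">
                    <table>
                        <thead>
                            <tr>
                                <th>Result</th>
                                <th>Count</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td>Experiment (PACBIO, HiC, and RNAseq reads)</td>
                                <td>3</td>
                            </tr>
                            <tr>
                                <td>Assembly</td>
                                <td>1</td>
                            </tr>
                            <tr>
                                <td>Genome assembly contig set</td>
                                <td>1</td>
                            </tr>
                            <tr>
                                <td>Sequence (By Chromosome)</td>
                                <td>18</td>
                            </tr>
                            <tr>
                                <td>Run (PACBIO, HiC, and RNAseq reads)</td>
                                <td>3</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>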
                <p> The raw data (PacBio, Hi-C, and RNA-seq reads) are also available for download from NCBI under BioProject PRJEB61927.</p>
                <p> </p>
                <p> 
                    <italic>2) &#x201c;In also returns a predicted centromere location for each chromosome.&#x201d; &#x201c;It also [&#x2026;]&#x201d;? Further, did you find any centromeres or centromere-characteristic repeats?</italic>
                </p>
                <p> </p>
                <p> We did not analyze the sequences of centromere repeats. We just annotated them.</p>
                <p> </p>
                <p> </p>
                <p> 
                    <italic>3) &#x201c;The genome annotation was assessed using BUSCO obtaining: C:93,1% [S:73,2%, D:19,9%], F:2,2%, M:4,7%, also 27004 transcripts and 22834 genes. RNAQuast has been performed to check the average alignment length, being 1248,6bp.&#x201d; This is the first time that transcripts, genes, and RNA are mentioned. Is this related to the RNA-seq in PRJEB61927? Was gene annotation performed? If so, how was it done, and where can the annotations be found?</italic>
                </p>
                <p> </p>
                <p> We performed a preliminary analysis of the RNA-seq data from another sample (BioSample SAMEA117791495) to obtain a preliminary number of coding genes. The workflow consists of several steps. In brief, Illumina paired-end adapters (Nextera PE) were removed from the raw reads using Trimmomatic v0.39. Next, the reference genome assembly was soft-masked with dustmasker, and an index was generated with hisat2-build. The trimmed reads were then aligned to the indexed genome using hisat2, and the resulting alignments were processed with samtools. Finally, gene annotation was performed with BRAKER v3, using a protein database retrieved from OrthoDB as external evidence. The RNA-seq data are available in the project (
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/ena.embl/PRJEB61927">https://identifiers.org/ena.embl/PRJEB61927</ext-link>
                    </underline>). Here is a direct link:</p>
                <p> 
                    <underline>
                        <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/ERX14107725">https://www.ebi.ac.uk/ena/browser/view/ERX14107725</ext-link>
                    </underline>. We plan to produce a full annotation in the near future, once new funding is obtained.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report376178">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.177497.r376178</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Angst</surname>
                        <given-names>Pascal</given-names>
                    </name>
                    <xref ref-type="aff" rid="r376178a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8654-2251</uri>
                </contrib>
                <aff id="r376178a1">
                    <label>1</label>University of Basel, Basel, Switzerland</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>16</day>
                <month>4</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Angst P</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport376178" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.161461.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This genome note presents the genome of 
                <italic>Tethysbaena scabra</italic>. The authors sampled two pools of specimens for sequencing using PacBio HiFi and Hi-C technologies. They used the latest software to assemble the sequencing reads and to assess the assembly&#x2019;s quality, completeness, and contamination level. They discussed potential sources of contamination and separated target from non-target (contaminant) contigs using specialized software. Generally, this article is sound, but I identified a few inconsistencies and a lack of detail, which I would like the authors to address. I also expected more details on the sequence (variation) of the genome from a genome note, e.g., a summary of the repeat content and other annotations.</p>
            <p> </p>
            <p> Details on the mandatory reviewer questions: 
                <list list-type="bullet">
                    <list-item>
                        <p>The main rationale presented for sequencing the genome was the Catalan Initiative for the Earth BioGenome Project (CBP). It would thus be nice to have a short explanation of what that is and what its significance is.</p>
                    </list-item>
                    <list-item>
                        <p>The protocols and the work seem technically sound but would benefit from more detail. For example, it is mentioned that pools of specimens were used for sequencing but not why this was done. Clearly, sequencing a single individual would be preferable. Related to this, the methods state that two pools of 20 specimens were sampled. However, the results state that &#x201c;identical&#x201d; pools were used. The word identical is confusing in that context, because in the methods the pools are described as separate pools. Also, on NCBI, there is only one BioSample (a batch of 20 individuals) registered, and it is linked to both the Illumina and the PacBio sequencing, which is not what is described in the article. The BioSample should either be a batch of 40 individuals or there should be two BioSamples of 20 individuals each.</p>
                        <p> </p>
                        <p> Details on DNA extraction and library preparations are missing. To reproduce the work or to apply it to other systems, it is necessary to know what kits, reagents, and protocols were used.</p>
                    </list-item>
                    <list-item>
                        <p>The software parameters needed to replicate the assembly process and the subsequent assembly modifications are missing. In particular, what parameters were used for hifiasm? Given that it was not designed for assembling the genomes of multiple specimens, were the parameters adjusted accordingly? It seems hifiasm had issues in the collapsing step, since the authors applied purge_dups, which led to a great reduction in the number of contigs. It is normally not desirable to purge haplotigs from hifiasm assemblies and, if applied, purging does not result in such a drastic reduction in the number of contigs. I missed a discussion of all of this.</p>
                        <p> </p>
                        <p> &#x201c;individuals that were not externally cleaned so it could also contain DNA from microbial and other eukaryote contaminants&#x201d;. Why were the individuals not cleaned before assembly, even though this is known to cause assembly issues?</p>
                        <p> </p>
                        <p> It is not clear how many contigs remained after each step in the methods. What was the number of contigs after Hi-C scaffolding? In the methods it says 821. It also says 59 contigs were obtained after applying BlobToolKit. Are the other 762 (821 - 59) contigs from contaminants? How does that align with the 322 contigs mentioned in Table 2 and the 17 scaffolds mentioned in the Results?</p>
                        <p> </p>
                        <p> "The coverage obtained has not been sufficient to deduce sex-linked chromosomes". Would this be a sensible analysis given the pools of specimens? Why is 53.8x coverage not enough? Is this the haploid sequence coverage? From Figure 4, the coverage seems twice as high.</p>
                    </list-item>
                    <list-item>
                        <p>The genome is available from NCBI. However, there are no annotations provided. At least a description of the repetitive content would be valuable. Repetitive content seems to have been assessed since assemblies were&#x00a0;"preprocessed to mask repetitive and low-complexity regions".</p>
                    </list-item>
                </list> Additional aspects: 
                <list list-type="bullet">
                    <list-item>
                        <p>The completeness of the assembly was assessed using BUSCOs and k-mers. Given the chromosome-level assembly, it would be valuable to describe the sequence content and arrangement, and the structural variation. For example, what is the telomeric end repeat motif; what characterizes the centromeres (GC content, sequence content, repeat content); what is the distribution of repeat versus genic content?</p>
                    </list-item>
                    <list-item>
                        <p>It would be important to know the average read lengths or read length N50s.</p>
                    </list-item>
                    <list-item>
                        <p>The keywords should include the full species name.</p>
                    </list-item>
                    <list-item>
                        <p>Are there assembly gaps? What is their size?</p>
                    </list-item>
                    <list-item>
                        <p>There is no phylogenetic analysis. I suggest including one or referring to a previous solid (whole genome) phylogeny.</p>
                    </list-item>
                </list>
            </p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Partly</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Partly</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Partly</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Genome assembly and annotation | Wet Lab (DNA/RNA extraction and library preparation) | Short- and Long-read sequencing technologies | Evolutionary Biology | Population Genetics | Metapopulation Ecology | Host-Parasite Interactions | Genome Evolution | Environmental Science</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment14030-376178">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Pons</surname>
                            <given-names>Joan</given-names>
                        </name>
                        <aff>Animal&amp;Microbial Biodiversity, Institut Mediterrani d'Estudis Avancats, Esporles, Illes Balears, Spain</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>6</day>
                    <month>6</month>
                    <year>2025</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <list list-type="order">
                        <list-item>
                            <p>&#x00a0;The main rationale presented for sequencing the genome was the Catalan Initiative for the Earth BioGenome Project (CBP). It would thus be nice to have a short explanation of what that is and what its significance is. 
                                <italic>** </italic>
                                <italic>We added further information as suggested: &#x201c;The Catalan Biogenome is an EBP-affiliated project network with the objective of sequencing the genomes of more than 40000 eukaryotic species living in the Catalan Linguistic Area (such as the Balearic Islands)&#x201d;.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>&#x00a0;The protocols and the work seem technically sound but would profit from more details. For example, it is mentioned that pools of specimens were used for sequencing but not why this was done.&#x00a0;
                                <italic>** </italic>
                                <italic>We agree that our wording was confusing, so we rewrote the text to clarify the issue: &#x201c;Extraction of high molecular weight DNA, construction of Pacific Biosciences HiFi circular consensus DNA sequencing libraries, and sequencing on a Pacific Biosciences SEQUEL II (HiFi) instrument were performed by the Delaware Biotechnology Institute, University of Delaware (DE, USA) using a pool of 20 specimens (Accession number: SAMEA113414145 qmTetScab1). Hi-C data were generated from another pool of 20 individuals from the same collection site (Accession number: SAMEA118091338) using an Omni-C DNA library preparation and sequenced 2 x 150 bp on the Illumina NovaSeq 6000 S4 instrument at the Centre Nacional de Seq&#x00fc;enciaci&#x00f3; Gen&#x00f2;mica (CNAG), Barcelona, Spain.&#x201d;</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Details on DNA extraction and library preparations are missing. To reproduce the work or to apply it to other systems, it is necessary to know what kits, reagents, and protocols were used. 
                                <italic>** </italic>
                                <italic>For the HiFi sequencing, a DNA library was prepared using the PacBio SMRTbell Express Template Prep Kit 2.0 </italic>
                                <italic>(Pacific Biosciences of California, CA, USA) </italic>
                                <italic>following the official protocol. Approximately 300 ng of high-quality genomic DNA was sheared to ~15&#x2013;20 kb, repaired, and ligated with SMRTbell adapters to create circular molecules. The library was size-selected to remove fragments smaller than 1,000 bp, purified with AMPure PB beads, and quality-checked using Qubit and TapeStation. Finally, it was sequenced on the PacBio Sequel II platform in CCS mode to generate highly accurate HiFi reads suitable for genome assembly and analysis. These steps were performed at the University of Delaware (USA).</italic>&#x00a0;**&#x00a0;
                                <italic>Hi-C libraries were prepared using the Dovetail Omni-C protocol (Cantata Bio, CA, USA). Briefly, chromatin was extracted from frozen tissue and crosslinked with formaldehyde to preserve the native three-dimensional genome architecture by stabilizing DNA-protein and DNA-DNA interactions within the nucleus. The crosslinked chromatin was then fragmented using DNase I, and spatially proximal DNA ends were ligated to capture physical interactions between genomic regions, providing long-range linkage information useful for validating and scaffolding genome assemblies. Hi-C library preparation and sequencing were performed at the Centre Nacional d&#x2019;An&#x00e0;lisi Gen&#x00f2;mica (CNAG), Barcelona, Spain.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The software parameters needed to replicate the assembly process and the subsequent assembly modification are missing. In particular, what parameters were used for hifiasm? Given that it was not designed to assemble the genomes of multiple specimens, were the parameters adjusted accordingly? It seems hifiasm had issues in the collapsing step, since the authors applied purge_dups, which led to a great reduction in the number of contigs. Purging haplotigs from hifiasm assemblies is normally not desired and, if applied, does not result in such a drastic reduction in the number of contigs. A discussion of all of this is missing.
                                <italic>&#x00a0;** </italic>
                                <italic>The number of haplotypes was specified with the --n-hap parameter in hifiasm. Given that our species is presumably diploid, as estimated by Smudgeplot, and the sequencing pool included 20 individuals, the parameter was set to 40. However, due to the presence of multiple individuals, hifiasm may struggle to accurately resolve haplotypes, which may necessitate the use of purge_dups to remove redundant contigs.</italic>&#x00a0;
                                <italic>** </italic>
                                <italic>We suspect that most of the duplications are due to the DNA being sourced from a pool of 20 individuals, as a single individual did not provide enough material to construct a HiFi library. We clarify this point in the main text: &#x201c;The genome size was estimated using GenomeScope2 (Vurture et al., 2017), and diploidy was confirmed with Smudgeplot (Ranallo-Benavidez et al., 2020). Assembly was conducted using hifiasm (Cheng et al., 2021) with n_hap=40 (considering diploidy and 20 individuals). The large number of haplotypic duplications, presumably caused by the high number of specimens used for DNA extraction, was removed with purge_dups (Guan et al., 2020), reducing the assembly from 2208 to 1272 contigs&#x201d;. We would like to point out that the final genome size after purging duplicates and removing contamination matched the size initially predicted by GenomeScope.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>&#x00a0;&#x201c;individuals that were not externally cleaned so it could also contain DNA from microbial and other eukaryote contaminants&#x201d;. Why were the individuals not cleaned before assembly, even though this is known to cause assembly issues? 
                                <italic>** Specimens were isolated from well water and individually collected to minimize contamination from other macroscopic species. However, this approach did not prevent contamination by microscopic organisms.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>&#x00a0;It is not clear how many contigs remained after each step in the methods. What was the number of contigs after Hi-C scaffolding? In the Methods it says 821. It also says 59 contigs were obtained after applying BlobToolKit. Are the other 762 (821 - 59) contigs from contaminants? How does that align with the 322 contigs mentioned in Table 2 and the 17 scaffolds mentioned in the Results? 
                                <italic>** </italic>
                                <italic>We rewrote the text to make it clearer and added a new sentence to the Methods section.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>&#x00a0;"The coverage obtained has not been sufficient to deduce sex-linked chromosomes". Would this be a sensible analysis given the pools of specimens? Why is 53.8x coverage not enough? Is this the haploid sequence coverage? From Figure 4, the coverage seems twice as high. 
                                <italic>** </italic>
                                <italic>Several factors hindered the identification of sex chromosomes in our diploid species. Most prominently, the characteristic haploid coverage pattern typically associated with sex chromosomes was absent. Furthermore, the genome assembly and scaffolding were performed using two separate DNA pools without prior knowledge of the individuals&#x2019; sex, complicating the detection of sex-specific sequences. In addition, the lack of biological information on the species and genus&#x2014;particularly whether sex determination is chromosomal or genetic&#x2014;further limits the applicability of standard methods for identifying sex chromosomes.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The genome is available from NCBI. However, there are no annotations provided. At least a description of the repetitive content would be valuable. Repetitive content seems to have been assessed since assemblies were "preprocessed to mask repetitive and low-complexity regions". 
                                <italic>** </italic>
                                <italic>We added the annotation for the repetitive sequences as requested.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The completeness of the assembly was assessed using BUSCOs and k-mers. Given the chromosome-level assembly, it would be valuable to describe the sequence content and arrangement, and the structural variation. For example, what is the telomeric end repeat motif; what characterizes the centromeres (GC content, sequence content, repeat content); what is the distribution of repeat versus genic content? 
                                <italic>** We present a new table summarizing the chromosomal location, composition, and sequences of the repetitive DNA elements.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>&#x00a0;It would be important to know the average read lengths or read length N50s. 
                                <italic>** </italic>
                                <italic>The read length N50 of PacBio raw reads has been added to the results section. </italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The keywords should include the full species name. 
                                <italic>** </italic>
                                <italic>The species name has been added to the keywords section. </italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Are there assembly gaps? What is their size? 
                                <italic>** The final assembly contains 299 gaps, each 100 bp in length. This is due to the default behavior of tools such as hifiasm, YaHS, and PretextMap, which insert standardized 100 bp gaps when the true gap size cannot be determined.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>There is no phylogenetic analysis. I suggest including one or referring to a previous solid (whole genome) phylogeny. 
                                <italic>** </italic>
                                <italic>We appreciate the suggestion to include a phylogenetic analysis. However, given the current lack of available genomic data from other species in the peracarid crustacean order, we believe that a phylogenetic analysis conducted at this stage would not be sufficiently reliable. We agree that such an analysis would be highly valuable, particularly once more genomic data from related taxa become available, and it is a future goal of our research group.</italic>
                            </p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
</article>
