<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.129212.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Genome Note</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>An updated version of the Madagascar periwinkle genome</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no" equal-contrib="yes">
                    <name>
                        <surname>Cuello</surname>
                        <given-names>Cl&#x00e9;ment</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5901-9748</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no" equal-contrib="yes">
                    <name>
                        <surname>Stander</surname>
                        <given-names>Emily Amor</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jansen</surname>
                        <given-names>Hans J.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8563-4146</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dug&#x00e9; De Bernonville</surname>
                        <given-names>Thomas</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Oudin</surname>
                        <given-names>Audrey</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Birer Williams</surname>
                        <given-names>Caroline</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Lanoue</surname>
                        <given-names>Arnaud</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Giglioli Guivarc'h</surname>
                        <given-names>Nathalie</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Papon</surname>
                        <given-names>Nicolas</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dirks</surname>
                        <given-names>Ron P.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jensen</surname>
                        <given-names>Michael Krogh</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a5">5</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>O'Connor</surname>
                        <given-names>Sarah Ellen</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a6">6</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Besseau</surname>
                        <given-names>S&#x00e9;bastien</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Courdavault</surname>
                        <given-names>Vincent</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8902-4532</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>EA2106 Biomol&#x00e9;cules et Biotechnologies V&#x00e9;g&#x00e9;tales, Universit&#x00e9; de Tours, Tours, 37200, France</aff>
                <aff id="a2">
                    <label>2</label>Future Genomics Technologies, Leiden, 2333BE, The Netherlands</aff>
                <aff id="a3">
                    <label>3</label>Present address: Centre de Recherche, Limagrain, Chappes, 07745, France</aff>
                <aff id="a4">
                    <label>4</label>IRF, SFR ICAT, Univ Angers, Univ Brest, Angers, 49000, France</aff>
                <aff id="a5">
                    <label>5</label>Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, 2800, Denmark</aff>
                <aff id="a6">
                    <label>6</label>Department of Natural Product Biosynthesis, Max Planck Institute for Chemical Ecology, Jena, 07745, Germany</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:vincent.courdavault@univ-tours.fr">vincent.courdavault@univ-tours.fr</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>Ron P. Dirks and Hans J. Jansen are CEO and CTO of Future Genomics Technologies, respectively.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>21</day>
                <month>12</month>
                <year>2022</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2022</year>
            </pub-date>
            <volume>11</volume>
            <elocation-id>1541</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>13</day>
                    <month>12</month>
                    <year>2022</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2022 Cuello C et al.</copyright-statement>
                <copyright-year>2022</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/11-1541/pdf"/>
            <abstract>
                <p>The Madagascar periwinkle, 
                    <italic toggle="yes">Catharanthus roseus</italic>, belongs to the 
                    <italic toggle="yes">Apocynaceae</italic> family. This medicinal plant, endemic to Madagascar, produces many important drugs including the monoterpene indole alkaloids (MIA) vincristine and vinblastine used to treat cancer worldwide. Here, we provide a new version of the 
                    <italic toggle="yes">C. roseus</italic> genome sequence obtained through the combination of Oxford Nanopore Technologies long reads and Illumina short reads. This more contiguous assembly consists of 173 scaffolds with a total length of 581.128 Mb and an N50 of 12.241 Mb. Using publicly available RNA-seq data, 21,061 protein-coding genes were predicted and functionally annotated. A total of 42.87% of the genome was annotated as transposable elements, most of them being long terminal repeat (LTR) retrotransposons. Together with the increasing availability of genomes from MIA-producing plants, this updated version should facilitate evolutionary studies leading to a better understanding of MIA biosynthetic pathway evolution.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Monoterpene indole alkaloids</kwd>
                <kwd>Catharanthus roseus</kwd>
                <kwd>Apocynaceae</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1">
                    <funding-source>ARD CVL Biopharmaceutical program of the R&#x00e9;gion Centre-Val de Loire</funding-source>
                    <award-id>ETOPOCentre</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/501100001665">
                    <funding-source>Agence Nationale de la Recherche</funding-source>
                    <award-id>ANR-20-CE43-0010</award-id>
                </award-group>
                <award-group id="fund-3" xlink:href="http://dx.doi.org/10.13039/501100007601">
                    <funding-source>Horizon 2020</funding-source>
                    <award-id>814645</award-id>
                </award-group>
                <funding-statement>This work was supported by EU Horizon 2020 research and innovation program [MIAMi project, grant number 814645; MKJ, SEO, VC]; ARD CVL Biopharmaceutical program of the R&#x00e9;gion Centre-Val de Loire [ETOPOCentre project, VC]; and ANR [project MIACYC &#x2013; ANR-20-CE43-0010, VC].</funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>The Madagascar periwinkle, 
                <italic toggle="yes">Catharanthus roseus</italic> (L.) G. Don, is an 
                <italic toggle="yes">Apocynaceae</italic> plant native to Madagascar. 
                <italic toggle="yes">C. roseus</italic> produces several specialized metabolites including monoterpene indole alkaloids (MIA; 
                <xref ref-type="bibr" rid="ref15">O&#x2019;Connor and Maresh, 2006</xref>). Plants produce these molecules to cope with biotic and abiotic pressures, which accounts for their wide range of bioactive properties (
                <xref ref-type="bibr" rid="ref5">Dug&#x00e9; de Bernonville 
                    <italic toggle="yes">et al.</italic>, 2015</xref>). Above all, MIAs produced by 
                <italic toggle="yes">C. roseus</italic>, notably vinblastine and vincristine, and MIA derivatives such as vinorelbine are part of the human pharmacopoeia against cancer (
                <xref ref-type="bibr" rid="ref15">O&#x2019;Connor and Maresh, 2006</xref>).</p>
            <p>Due to its high economic importance, 
                <italic toggle="yes">C. roseus</italic> has been extensively studied over the last three decades, becoming the model species for MIA biosynthetic pathway studies (see 
                <xref ref-type="bibr" rid="ref18">Pan 
                    <italic toggle="yes">et al.</italic>, 2016</xref> and 
                <xref ref-type="bibr" rid="ref14">Kulagina 
                    <italic toggle="yes">et al.</italic>, 2022</xref> for extensive reviews). The 
                <italic toggle="yes">C. roseus</italic> genome was first sequenced in 2015 (
                <xref ref-type="bibr" rid="ref11">Kellner 
                    <italic toggle="yes">et al.</italic>, 2015</xref>). Recently, a more contiguous version (v2) was generated to ease inter-species genomic comparison (
                <xref ref-type="bibr" rid="ref9">Franke 
                    <italic toggle="yes">et al.</italic>, 2019</xref>). To date, 
                <italic toggle="yes">C. roseus</italic> genome sequencing and assembly have not benefited from the development of third-generation sequencing technologies, which lead to more contiguous genome assemblies (
                <xref ref-type="bibr" rid="ref10">Jiao and Schneeberger, 2017</xref>). Thanks to these new technologies, we present here an even more contiguous genome assembly. This updated version (v2.1) should ease inter-species studies in order to better understand the diversification of MIAs and the evolution of their biosynthetic pathways.</p>
        </sec>
        <sec id="sec2" sec-type="methods">
            <title>Methods</title>
            <sec id="sec3">
                <title>Sample collection, DNA extraction and sequencing</title>
                <p>
                    <italic toggle="yes">C. roseus</italic> cv &#x2018;SunStorm
                    <sup>&#x00ae;</sup> Apricot&#x2019; seeds (variety ID: 70001114, Syngenta flowers, Basel, Switzerland) were greenhouse-grown at the University of Tours for 1 month before sampling. DNA was extracted from 
                    <italic toggle="yes">C. roseus</italic> leaves using the Qiagen Plant DNeasy kit (ID: 69204, Qiagen, Hilden, Germany) following the manufacturer&#x2019;s instructions. Illumina sequencing libraries were constructed using the TruSeq DNA PCR-free kit (ID: 20015962, Illumina, San Diego, USA) and sequenced in paired-end mode (2 &#x00d7; 150 bp) by Eurofins Genomics (Les Ulis, France) using Illumina NextSeq500 technology. Future Genomics Technologies (Leiden, The Netherlands) constructed the ONT libraries using the ONT 1D ligation sequencing kit (SQK-LSK109, Oxford Nanopore Technologies Ltd, Oxford, United Kingdom); these were subsequently sequenced on Nanopore GridION and PromethION flow cells (Oxford Nanopore Technologies Ltd, Oxford, United Kingdom) and basecalled with the 
                    <ext-link ext-link-type="uri" xlink:href="https://edspace.american.edu/openbehavior/project/guppy/">Guppy</ext-link> (RRID:SCR_022353) version 3.2.6 high-accuracy basecaller. A total of 114,329,683 paired-end reads were obtained from Illumina sequencing, and 908,999 and 2,588,997 reads from the ONT GridION and ONT PromethION sequencing, respectively.</p>
            </sec>
            <sec id="sec4">
                <title>
                    <italic toggle="yes">De novo</italic> genome assembly</title>
                <p>The 
                    <italic toggle="yes">C. roseus</italic> genome was assembled by Future Genomics Technologies (Leiden, The Netherlands). After adapter removal using 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/rrwick/Porechop">Porechop</ext-link> (RRID:SCR_016967) (
                    <xref ref-type="bibr" rid="ref27">Wick 
                        <italic toggle="yes">et al.</italic>, 2017</xref>), ONT reads were first assembled into contigs using the 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/fenderglass/Flye">Flye</ext-link> (RRID:SCR_017016) assembler (v.2.5, 
                    <xref ref-type="bibr" rid="ref13">Kolmogorov 
                        <italic toggle="yes">et al.</italic>, 2019</xref>) with the following options: --min-overlap 10000 -i 2. Redundant contigs were removed using 
                    <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/mroachawri/purge_haplotigs/src">Purge_haplotigs</ext-link> (RRID:SCR_017616) (v.1.1.0) followed by two rounds of polishing with Illumina paired-end reads using 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/broadinstitute/pilon/">Pilon</ext-link> (RRID:SCR_014731) (v.1.23, 
                    <xref ref-type="bibr" rid="ref26">Walker 
                        <italic toggle="yes">et al.</italic>, 2014</xref>).</p>
            </sec>
            <sec id="sec5">
                <title>Gene model prediction and gene functional annotation</title>
                <p>RNA-seq data were retrieved from the 
                    <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/sra">NCBI Sequence Read Archive</ext-link> (SRA; RRID:SCR_004891) database using the following accession numbers: 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229288">ERS1229288</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229289">ERS1229289</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229290">ERS1229290</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229291">ERS1229291</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229292">ERS1229292</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229293">ERS1229293</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229294">ERS1229294</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229295">ERS1229295</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1229296">ERS1229296</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS1907920">ERS1907920</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS2396963">ERS2396963</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS2396964">ERS2396964</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS2396965">ERS2396965</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:ERS2396966">ERS2396966</ext-link>, 
                    and <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/insdc.sra:SRR20661631">SRR20661631</ext-link>. Each RNA-seq dataset was aligned individually to the 
                    <italic toggle="yes">C. roseus</italic> genome using 
                    <ext-link ext-link-type="uri" xlink:href="http://ccb.jhu.edu/software/hisat2/index.shtml">HISAT2</ext-link> (RRID:SCR_015530) (v.2.2.1, 
                    <xref ref-type="bibr" rid="ref12">Kim 
                        <italic toggle="yes">et al.</italic>, 2019</xref>). Transcripts were subsequently assembled using the resulting RNA-seq alignments and 
                    <ext-link ext-link-type="uri" xlink:href="https://ccb.jhu.edu/software/stringtie/">StringTie</ext-link> (RRID:SCR_016323) (v.2.1.7, 
                    <xref ref-type="bibr" rid="ref19">Pertea 
                        <italic toggle="yes">et al.</italic>, 2015</xref>). These individual transcriptomes were then merged into a non-redundant set of transcripts using stringtie-merge. A combination of similarity searches using 
                    <ext-link ext-link-type="uri" xlink:href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastx&amp;BLAST_PROGRAMS=blastx&amp;PAGE_TYPE=BlastSearch&amp;SHOW_DEFAULTS=on&amp;LINK_LOC=blasthome">BLASTX</ext-link> (RRID:SCR_001653) and 
                    <ext-link ext-link-type="uri" xlink:href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&amp;PAGE_TYPE=BlastSearch&amp;LINK_LOC=blasthome">BLASTP</ext-link> (v.2.6.0-1, 
                    <xref ref-type="bibr" rid="ref2">Camacho 
                        <italic toggle="yes">et al.</italic>, 2009</xref>) against the 
                    <ext-link ext-link-type="uri" xlink:href="https://www.uniprot.org/">UniProt</ext-link> (RRID:SCR_002380) database (v.2022-10-12) and hmmscan (v.3.1b2, 
                    <xref ref-type="bibr" rid="ref7">Finn 
                        <italic toggle="yes">et al.</italic>, 2011</xref>) against the 
                    <ext-link ext-link-type="uri" xlink:href="http://pfam.xfam.org/">Pfam</ext-link> (RRID:SCR_004726) database was used to assign a putative function to each gene model.</p>
            </sec>
            <sec id="sec6">
                <title>Assembly completeness assessment</title>
                <p>The stats program from the 
                    <ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/projects/bbmap">BBMap</ext-link> (RRID:SCR_016965) suite (v.38.94, 
                    <xref ref-type="bibr" rid="ref1">Bushnell, 2014</xref>) was used to assess assembly quality. Benchmarking Universal Single-Copy Orthologs (
                    <ext-link ext-link-type="uri" xlink:href="http://busco.ezlab.org/">BUSCO</ext-link> v.5.2.2, 
                    <xref ref-type="bibr" rid="ref21">Sim&#x00e3;o 
                        <italic toggle="yes">et al.</italic>, 2015</xref>) (RRID:SCR_015008) with default settings was used to assess genome and gene model completeness against a eudicot-specific database of 2,326 single-copy orthologs (eudicots_odb10). The agat_sp_statistics Perl script from the AGAT package (v.0.8.0, 
                    <xref ref-type="bibr" rid="ref4">Dainat 
                        <italic toggle="yes">et al.</italic>, 2022</xref>) was used to compute gene model statistics.</p>
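                <p>As an illustration of the contiguity metric reported in this note, the N50 can be computed from a list of scaffold lengths as sketched below. This is a minimal sketch and not the BBMap implementation; the function name and the toy lengths are hypothetical and are not data from this study.</p>
                <code language="python" xml:space="preserve">
```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # N50 is the length of the sequence at which the cumulative
        # length first reaches half of the total assembly size.
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


# Hypothetical toy assembly (lengths in bp), not data from this study.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```
                </code>
                <p>Sorting from longest to shortest means that an assembly made of fewer, longer scaffolds reaches the halfway point sooner, and therefore reports a larger N50.</p>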
            </sec>
            <sec id="sec7">
                <title>Transposable elements (TE) prediction and annotation</title>
                <p>Transposable elements were identified and annotated using the Extensive 
                    <italic toggle="yes">de novo</italic> TE Annotator (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/oushujun/EDTA">EDTA</ext-link> v.1.9.5, 
                    <xref ref-type="bibr" rid="ref17">Ou 
                        <italic toggle="yes">et al.</italic>, 2019</xref>) (RRID:SCR_022063) in sensitive mode. This pipeline annotates long terminal repeat (LTR) retrotransposons using 
                    <ext-link ext-link-type="uri" xlink:href="http://tlife.fudan.edu.cn/ltr_finder/">LTR_Finder</ext-link> (RRID:SCR_015247) (v. 1.07, 
                    <xref ref-type="bibr" rid="ref29">Xu and Wang, 2007</xref>) and LTRharvest (RRID:SCR_018970) included in 
                    <ext-link ext-link-type="uri" xlink:href="http://genometools.org/">GenomeTools</ext-link> (RRID:SCR_016120) (v.1.5.10, 
                    <xref ref-type="bibr" rid="ref6">Ellinghaus 
                        <italic toggle="yes">et al.</italic>, 2008</xref>); terminal inverted repeat (TIR) transposons using Generic Repeat Finder (v.1.0, 
                    <xref ref-type="bibr" rid="ref20">Shi and Liang, 2019</xref>) and TIR-Learner (v.2.5, 
                    <xref ref-type="bibr" rid="ref23">Su 
                        <italic toggle="yes">et al.</italic>, 2019</xref>); and Helitrons using HelitronScanner (v.1.1, 
                    <xref ref-type="bibr" rid="ref28">Xiong 
                        <italic toggle="yes">et al.</italic>, 2014</xref>). Size thresholds are then applied to limit false discoveries: TIR candidates shorter than 80 bp, as well as LTR and Helitron candidates shorter than 100 bp, are considered tandem repeats or short sequences. To further limit false LTR discoveries, LTR candidates are filtered using 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/oushujun/LTR_retriever">LTR_retriever</ext-link> (RRID:SCR_017623) (v.2.9.0, 
                    <xref ref-type="bibr" rid="ref16">Ou and Jiang, 2018</xref>). TIR candidates not exceeding 600 bp are classified as miniature inverted-repeat transposable elements (MITEs). TIR and Helitron candidates are further filtered using EDTA advanced filters (see 
                    <xref ref-type="bibr" rid="ref17">Ou 
                        <italic toggle="yes">et al.</italic>, 2019</xref> for details). The genome is then masked using the resulting TE library, and the unmasked part of the genome is scanned by 
                    <ext-link ext-link-type="uri" xlink:href="http://www.repeatmasker.org/RepeatModeler/">RepeatModeler</ext-link> (RRID:SCR_015027) (v.2.0.1, default parameters, 
                    <xref ref-type="bibr" rid="ref8">Flynn 
                        <italic toggle="yes">et al.</italic>, 2020</xref>) to identify non-LTR retrotransposons and unclassified TEs missed by structure-based TE identification tools. Finally, EDTA uses the provided coding sequences (CDS) to remove gene-related sequences.</p>
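                <p>The size thresholds described above can be summarised as a small decision rule. The sketch below is only an illustration of that rule under the thresholds stated in this section; it is not code from EDTA, and the function name, constant names and returned labels are hypothetical.</p>
                <code language="python" xml:space="preserve">
```python
# Thresholds as stated in the text (base pairs).
MIN_TIR_BP = 80
MIN_LTR_HELITRON_BP = 100
MAX_MITE_BP = 600


def triage_candidate(te_type, length_bp):
    """Label one TE candidate as 'too_short', 'MITE' or 'kept'.

    Hypothetical helper for illustration only, not part of EDTA.
    """
    if te_type == "TIR":
        if MIN_TIR_BP > length_bp:
            return "too_short"   # treated as tandem repeat or short sequence
        if MAX_MITE_BP >= length_bp:
            return "MITE"        # TIR candidate not exceeding 600 bp
        return "kept"
    if te_type in ("LTR", "Helitron"):
        if MIN_LTR_HELITRON_BP > length_bp:
            return "too_short"
        return "kept"
    raise ValueError(f"unexpected TE type: {te_type}")


# Hypothetical candidates, not data from this study.
for te_type, length_bp in [("TIR", 75), ("TIR", 450), ("LTR", 5200)]:
    print(te_type, length_bp, triage_candidate(te_type, length_bp))
```
                </code>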
            </sec>
        </sec>
        <sec id="sec8" sec-type="results">
            <title>Results</title>
            <sec id="sec9">
                <title>Genome assembly</title>
                <p>The 
                    <italic toggle="yes">C. roseus</italic> genome was assembled from ONT long reads using Flye (v.2.5), resulting in a 651.9 Mb assembly distributed across 788 contigs. This assembly was collapsed using purge_haplotigs into 173 scaffolds, reducing its length to 585.8 Mb but increasing the N50 from 10.3 Mb to 12.3 Mb. The assembly was then polished twice with Illumina short reads using Pilon (v.1.23). The final 
                    <italic toggle="yes">C. roseus</italic> assembly consisted of 173 scaffolds with a total length of 581.45 Mb. Even though 
                    <italic toggle="yes">C. roseus</italic> v.2.1 displayed BUSCO scores similar to those of 
                    <italic toggle="yes">C. roseus</italic> v.2 based on 
                    <italic toggle="yes">Eudicotyledons</italic> Benchmarking Universal Single-Copy Orthologs (BUSCO), the new version is much more contiguous, with 12 times fewer contigs and a six-fold larger N50 (
                    <xref ref-type="table" rid="T1">Table 1</xref>) (
                    <xref ref-type="bibr" rid="ref3">Cuello 
                        <italic toggle="yes">et al.</italic>, 2022</xref>).</p>
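                <p>For readers less familiar with the N50 metric, the following minimal Python sketch computes it from a list of sequence lengths using the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly). The function name and the toy lengths are hypothetical and are not the actual contig lengths of this assembly.</p>
                <code language="python">
```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example (arbitrary units): removing redundant sequences shrinks the
# assembly but can raise the N50.
before = [12, 10, 8, 8, 6, 6]
after = [12, 10, 8, 6]
print(sum(before), n50(before))
print(sum(after), n50(after))
```
                </code>
                <p>The toy example mirrors the qualitative effect reported above, where collapsing haplotigs reduced the total assembly length while increasing the N50.</p>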
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>Table 1. </label>
                    <caption>
                        <title>Genome assembly metrics.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Version</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Assembly size (Mb)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">No. of scaff.
                                    <xref ref-type="table-fn" rid="tfn1">
                                        <sup>a</sup>
                                    </xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">N50 (Mb)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">BUSCO scores (genome mode) C [S; D]; F; M
                                    <xref ref-type="table-fn" rid="tfn2">
                                        <sup>b</sup>
                                    </xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Protein coding genes</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Ref.</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">C. roseus</italic> v.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">541.13</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2,090</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2.58</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">97.0 [95.5; 1.5]; 1.3; 1.7</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">34,363</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref9">Franke 
                                        <italic toggle="yes">et al.</italic>, 2019</xref>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">C. roseus</italic> v.2.1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">581.45</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">173</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">12.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">97.1 [94.2; 2.9]; 1.0; 1.9</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">21,061</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">This study</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <p>
                            <italic toggle="yes">C. roseus</italic>: 
                            <italic toggle="yes">Catharanthus roseus</italic>; BUSCO: Benchmarking Universal Single-Copy Orthologs.</p>
                        <fn-group content-type="footnotes">
                            <fn id="tfn1">
                                <label>
                                    <sup>a</sup>
                                </label>
                                <p>Number of scaffolds.</p>
                            </fn>
                            <fn id="tfn2">
                                <label>
                                    <sup>b</sup>
                                </label>
                                <p>BUSCO scores (genome mode) % Complete [% Complete and single-copy; % Complete and Duplicated]; % Fragmented; % Missing (n = 2,326).</p>
                            </fn>
                        </fn-group>
                    </table-wrap-foot>
                </table-wrap>
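                <p>The compact BUSCO notation used in Table 1 can be checked for internal consistency with a short Python sketch. The helper names below are hypothetical, and the check assumes that the rounded percentages add up exactly, which happens to hold for the two rows of Table 1 but is not guaranteed for rounded BUSCO summaries in general.</p>
                <code language="python">
```python
import re
from decimal import Decimal


def parse_busco(summary):
    """Split a 'C [S; D]; F; M' string into exact decimal percentages."""
    values = [Decimal(v) for v in re.findall(r"\d+(?:\.\d+)?", summary)]
    if len(values) != 5:
        raise ValueError("expected five percentages: C, S, D, F, M")
    return dict(zip(("C", "S", "D", "F", "M"), values))


def is_consistent(scores):
    """Check that S + D equals C and that C + F + M sums to 100%."""
    return (scores["S"] + scores["D"] == scores["C"]
            and scores["C"] + scores["F"] + scores["M"] == Decimal(100))


# BUSCO strings copied from Table 1.
table1 = {
    "v.2": "97.0 [95.5; 1.5]; 1.3; 1.7",
    "v.2.1": "97.1 [94.2; 2.9]; 1.0; 1.9",
}
for version, summary in table1.items():
    print(version, is_consistent(parse_busco(summary)))
```
                </code>
                <p>Decimal arithmetic is used here instead of floating point so that sums such as 94.2 + 2.9 compare exactly with 97.1.</p>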
            </sec>
            <sec id="sec10">
                <title>Gene annotation</title>
                <p>RNA-seq-based gene model prediction using publicly available data resulted in a total of 21,061 genes. Although fewer genes were annotated than in v.2, a higher BUSCO score was obtained (
                    <xref ref-type="fig" rid="f1">Figure 1</xref>). The combination of BLASTP and BLASTX searches against the UniProt database and hmmscan searches against the PFAM database led to the functional annotation of 76.5% of the predicted genes (16,118 of the 21,061 genes, Supplementary Table S1 in 
                    <italic toggle="yes">Underlying data</italic> (
                    <xref ref-type="bibr" rid="ref3">Cuello 
                        <italic toggle="yes">et al.</italic>, 2022</xref>)). All functionally validated MIA biosynthetic genes from 
                    <italic toggle="yes">C. roseus</italic> were found in version v.2.1 of the genome, with identity and coverage percentages ranging from 95 to 100% and from 94 to 100%, respectively, with the exception of 
                    <italic toggle="yes">G10H</italic> and 
                    <italic toggle="yes">DAT</italic> (Supplementary Tables S2 and S3 in 
                    <italic toggle="yes">Underlying data</italic> (
                    <xref ref-type="bibr" rid="ref3">Cuello 
                        <italic toggle="yes">et al.</italic>, 2022</xref>)).</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>BUSCO scores of the predicted gene set.</title>
                        <p>BUSCO: Benchmarking Universal Single-Copy Orthologs.</p>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/141880/431bb438-c8a5-40cb-ad65-e749423081a9_figure1.gif"/>
                </fig>
            </sec>
            <sec id="sec11">
                <title>Transposable element annotation</title>
                <p>Finally, we analyzed the TE composition of this updated 
                    <italic toggle="yes">C. roseus</italic> genome. While TEs made up 38.78% of the genome in 
                    <italic toggle="yes">C. roseus</italic> v.2, a higher proportion (42.87%) was annotated as TEs in this new version (v.2.1), with a similar distribution across the different TE families (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>). Notably, the TE proportion of v.2.1 is closer to that of the recently sequenced, closely related species 
                    <italic toggle="yes">Vinca minor</italic> (
                    <xref ref-type="bibr" rid="ref22">Stander 
                        <italic toggle="yes">et al.</italic>, 2022</xref>).</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Proportion of transposable elements (TEs) in 
                            <italic toggle="yes">C. roseus</italic> assembly version 2 (A) and version 2.1 (B).</title>
                        <p>TIR: terminal inverted repeat; LTR: long terminal repeat; non LTR: retrotransposons without LTR sequences; other LTR: LTR-containing retrotransposons other than 
                            <italic toggle="yes">Gypsy</italic> and 
                            <italic toggle="yes">Copia.</italic>
                        </p>
                    </caption>
                    <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/141880/431bb438-c8a5-40cb-ad65-e749423081a9_figure2.gif"/>
                </fig>
            </sec>
        </sec>
    </body>
    <back>
        <sec id="sec14" sec-type="data-availability">
            <title>Data availability</title>
            <sec id="sec15">
                <title>Underlying data</title>
                <p>BioProject: 
                    <italic toggle="yes">Catharanthus roseus</italic> genome sequencing. Raw sequence reads, complete genome. Accession number PRJNA907167, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/NCBI/bioproject:PRJNA907167">https://identifiers.org/NCBI/bioproject:PRJNA907167</ext-link> (
                    <xref ref-type="bibr" rid="ref24">Tours University, 2022a</xref>).</p>
                <p>BioSample: Plant sample from 
                    <italic toggle="yes">Catharanthus roseus</italic>, Accession number SAMN31953452, 
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/NCBI/biosample:SAMN31953452">https://identifiers.org/NCBI/biosample:SAMN31953452</ext-link> (
                    <xref ref-type="bibr" rid="ref25">Tours University, 2022b</xref>).</p>
                <p>Figshare: An updated version of 
                    <italic toggle="yes">Catharanthus roseus</italic> genome. 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.21641111">10.6084/m9.figshare.21641111</ext-link> (
                    <xref ref-type="bibr" rid="ref3">Cuello 
                        <italic toggle="yes">et al.</italic>, 2022</xref>).</p>
                <p>This project contains the following underlying data:
                    <list list-type="bullet">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>
Catharanthus_roseus_v2.1_UT.cds (Predicted CDS).</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>
Catharanthus_roseus_v2.1_UT.gff (Genome annotation file (GFF)).</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>
Catharanthus_roseus_v2.1_UT.pep (Predicted proteins).</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>
Catharanthus_roseus_v2.1_UT.tr (Predicted transcripts).</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Cuello et al &#x2013; F1000R &#x2013; SuppMat.xlsx (Supplementary tables).</p>
                        </list-item>
                    </list>
                </p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International license</ext-link> (CC-BY 4.0).</p>
            </sec>
        </sec>
        <ack>
            <title>Acknowledgments</title>
            <p>The authors benefitted from the use of the cluster at the Centre de Calcul Scientifique en r&#x00e9;gion Centre-Val de Loire.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bushnell</surname>
                            <given-names>B</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">BBMap: A Fast, Accurate, Splice-Aware Aligner (No. LBNL-7065E).</italic>
</source>
                    <publisher-loc>Berkeley, CA (United States)</publisher-loc>:
                    <publisher-name>Lawrence Berkeley National Lab. (LBNL)</publisher-name>;<year>2014</year>.</mixed-citation>
            </ref>
            <ref id="ref2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Camacho</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Coulouris</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Avagyan</surname>
                            <given-names>V</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BLAST+: architecture and applications.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2009</year>;<volume>10</volume>:<fpage>421</fpage>.
                    <pub-id pub-id-type="pmid">20003500</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-10-421</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2803857</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <mixed-citation publication-type="data">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cuello</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stander</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jansen</surname>
                            <given-names>HJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <data-title>An updated version of the Madagascar periwinkle genome. figshare.</data-title>[Dataset].<year>2022</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.21641111</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dainat</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Here&#x00f1;&#x00fa;</surname>
                            <given-names>D</given-names>
                        </name>

                        <collab>LucileSol, pascal-git</collab>
</person-group>:
                    <article-title>NBISweden/AGAT: AGAT-v0.8.1.</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2022</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.5834795</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dug&#x00e9; de Bernonville</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Clastre</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Besseau</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Phytochemical genomics of the Madagascar periwinkle: Unravelling the last twists of the alkaloid engine.</article-title>
                    <source>

                        <italic toggle="yes">Phytochemistry.</italic>
</source>
                    <year>2015</year>;<volume>113</volume>:<fpage>9</fpage>&#x2013;<lpage>23</lpage>.
                    <pub-id pub-id-type="pmid">25146650</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.phytochem.2014.07.023</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ellinghaus</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kurtz</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Willhoeft</surname>
                            <given-names>U</given-names>
                        </name>
</person-group>:
                    <article-title>LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2008</year>;<volume>9</volume>(<issue>1</issue>):<fpage>18</fpage>.
                    <pub-id pub-id-type="pmid">18194517</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-9-18</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2253517</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Finn</surname>
                            <given-names>RD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Clements</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Eddy</surname>
                            <given-names>SR</given-names>
                        </name>
</person-group>:
                    <article-title>HMMER web server: interactive sequence similarity searching.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2011</year>;<volume>39</volume>:<fpage>W29</fpage>&#x2013;<lpage>W37</lpage>.
                    <pub-id pub-id-type="pmid">21593126</pub-id>
                    <pub-id pub-id-type="doi">10.1093/NAR/GKR367</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3125773</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Flynn</surname>
                            <given-names>JM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hubley</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Goubert</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>RepeatModeler2 for automated genomic discovery of transposable element families.</article-title>
                    <source>

                        <italic toggle="yes">PNAS.</italic>
</source>
                    <year>2020</year>;<volume>117</volume>(<issue>17</issue>):<fpage>9451</fpage>&#x2013;<lpage>9457</lpage>.
                    <pub-id pub-id-type="pmid">32300014</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.1921046117</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7196820</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Franke</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hamilton</surname>
                            <given-names>JP</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Gene Discovery in 
                        <italic toggle="yes">Gelsemium</italic> Highlights Conserved Gene Clusters in Monoterpene Indole Alkaloid Biosynthesis.</article-title>
                    <source>

                        <italic toggle="yes">ChemBioChem.</italic>
</source>
                    <year>2019</year>;<volume>20</volume>:<fpage>83</fpage>&#x2013;<lpage>87</lpage>.
                    <pub-id pub-id-type="pmid">30300974</pub-id>
                    <pub-id pub-id-type="doi">10.1002/CBIC.201800592</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jiao</surname>
                            <given-names>WB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schneeberger</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>The impact of third generation genomic technologies on plant genome assembly.</article-title>
                    <source>

                        <italic toggle="yes">Curr. Opin. Plant Biol.</italic>
</source>
                    <year>2017</year>;<volume>36</volume>:<fpage>64</fpage>&#x2013;<lpage>70</lpage>.
                    <pub-id pub-id-type="pmid">28231512</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.pbi.2017.02.002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kellner</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Clavijo</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome-guided investigation of plant natural product biosynthesis.</article-title>
                    <source>

                        <italic toggle="yes">Plant J.</italic>
</source>
                    <year>2015</year>;<volume>82</volume>:<fpage>680</fpage>&#x2013;<lpage>692</lpage>.
                    <pub-id pub-id-type="pmid">25759247</pub-id>
                    <pub-id pub-id-type="doi">10.1111/tpj.12827</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Paggi</surname>
                            <given-names>JM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Park</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2019</year>;<volume>37</volume>(<issue>8</issue>):<fpage>907</fpage>&#x2013;<lpage>915</lpage>.
                    <pub-id pub-id-type="pmid">31375807</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41587-019-0201-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7605509</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kolmogorov</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yuan</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Assembly of long, error-prone reads using repeat graphs.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2019</year>;<volume>37</volume>:<fpage>540</fpage>&#x2013;<lpage>546</lpage>.
                    <pub-id pub-id-type="pmid">30936562</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41587-019-0072-8</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kulagina</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>M&#x00e9;teignier</surname>
                            <given-names>LV</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Papon</surname>
                            <given-names>N</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>More than a Catharanthus plant: A multicellular and pluri-organelle alkaloid-producing factory.</article-title>
                    <source>

                        <italic toggle="yes">Curr. Opin. Plant Biol.</italic>
</source>
                    <year>2022</year>;<volume>67</volume>:<fpage>102200</fpage>.
                    <pub-id pub-id-type="pmid">35339956</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.pbi.2022.102200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>O&#x2019;Connor</surname>
                            <given-names>SE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Maresh</surname>
                            <given-names>JJ</given-names>
                        </name>
</person-group>:
                    <article-title>Chemistry and biology of monoterpene indole alkaloid biosynthesis.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Prod. Rep.</italic>
</source>
                    <year>2006</year>;<volume>23</volume>:<fpage>532</fpage>&#x2013;<lpage>547</lpage>.
                    <pub-id pub-id-type="pmid">16874388</pub-id>
                    <pub-id pub-id-type="doi">10.1039/B512615K</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ou</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jiang</surname>
                            <given-names>N</given-names>
                        </name>
</person-group>:
                    <article-title>LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons.</article-title>
                    <source>

                        <italic toggle="yes">Plant Physiol.</italic>
</source>
                    <year>2018</year>;<volume>176</volume>(<issue>2</issue>):<fpage>1410</fpage>&#x2013;<lpage>1422</lpage>.
                    <pub-id pub-id-type="pmid">29233850</pub-id>
                    <pub-id pub-id-type="doi">10.1104/pp.17.01310</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5813529</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ou</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Su</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liao</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2019</year>;<volume>20</volume>:<fpage>275</fpage>.
                    <pub-id pub-id-type="pmid">31843001</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-019-1905-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6913007</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pan</surname>
                            <given-names>Q</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mustafa</surname>
                            <given-names>NR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tang</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Monoterpenoid indole alkaloids biosynthesis and its regulation in 
                        <italic toggle="yes">Catharanthus roseus</italic>: a literature review from genes to metabolites.</article-title>
                    <source>

                        <italic toggle="yes">Phytochem. Rev.</italic>
</source>
                    <year>2016</year>;<volume>15</volume>:<fpage>221</fpage>&#x2013;<lpage>250</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s11101-015-9406-4</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pertea</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pertea</surname>
                            <given-names>GM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Antonescu</surname>
                            <given-names>CM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2015</year>;<volume>33</volume>(<issue>3</issue>):<fpage>290</fpage>&#x2013;<lpage>295</lpage>.
                    <pub-id pub-id-type="pmid">25690850</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3122</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4643835</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shi</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liang</surname>
                            <given-names>C</given-names>
                        </name>
</person-group>:
                    <article-title>Generic repeat finder: a high-sensitivity tool for genome-wide de novo repeat detection.</article-title>
                    <source>

                        <italic toggle="yes">Plant Physiol.</italic>
</source>
                    <year>2019</year>;<volume>180</volume>(<issue>4</issue>):<fpage>1803</fpage>&#x2013;<lpage>1815</lpage>.
                    <pub-id pub-id-type="pmid">31152127</pub-id>
                    <pub-id pub-id-type="doi">10.1104/pp.19.00386</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6670090</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sim&#x00e3;o</surname>
                            <given-names>FA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Waterhouse</surname>
                            <given-names>RM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ioannidis</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2015</year>;<volume>31</volume>:<fpage>3210</fpage>&#x2013;<lpage>3212</lpage>.
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv351</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Stander</surname>
                            <given-names>EA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cuello</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Birer-Williams</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The Vinca minor genome highlights conserved evolutionary traits in monoterpene indole alkaloid synthesis.</article-title>
                    <source>

                        <italic toggle="yes">G3 Genes|Genomes|Genetics.</italic>
</source>
                    <year>2022</year>;<volume>12</volume>:<fpage>jkac268</fpage>.
                    <pub-id pub-id-type="pmid">36200869</pub-id>
                    <pub-id pub-id-type="doi">10.1093/g3journal/jkac268</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9713385</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Su</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gu</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Peterson</surname>
                            <given-names>T</given-names>
                        </name>
</person-group>:
                    <article-title>TIR-learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Plant.</italic>
</source>
                    <year>2019</year>;<volume>12</volume>(<issue>3</issue>):<fpage>447</fpage>&#x2013;<lpage>460</lpage>.
                    <pub-id pub-id-type="pmid">30802553</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.molp.2019.02.008</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <mixed-citation publication-type="data">
                    <collab>Tours University</collab>:
                    <data-title>Catharanthus roseus genome.</data-title>[Dataset].
                    <source>

                        <italic toggle="yes">BioProject.</italic>
</source>
                    <year>2022a</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/NCBI/bioproject:PRJNA907167">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <mixed-citation publication-type="data">
                    <collab>Tours University</collab>:
                    <data-title>Plant sample from Catharanthus roseus.</data-title>[Dataset].
                    <source>

                        <italic toggle="yes">BioSample.</italic>
</source>
                    <year>2022b</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://identifiers.org/NCBI/biosample:SAMN31953452">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Walker</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Abeel</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shea</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2014</year>;<volume>9</volume>:<fpage>e112963</fpage>.
                    <pub-id pub-id-type="pmid">25409509</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0112963</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4237348</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wick</surname>
                            <given-names>RR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Judd</surname>
                            <given-names>LM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gorrie</surname>
                            <given-names>CL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Completing bacterial genome assemblies with multiplex MinION sequencing.</article-title>
                    <source>

                        <italic toggle="yes">Microb. Genom.</italic>
</source>
                    <year>2017</year>;<volume>3</volume>(<issue>10</issue>):<fpage>e000132</fpage>.
                    <pub-id pub-id-type="pmid">29177090</pub-id>
                    <pub-id pub-id-type="doi">10.1099/mgen.0.000132</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5695209</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xiong</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>He</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lai</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes.</article-title>
                    <source>

                        <italic toggle="yes">Proc. Natl. Acad. Sci. USA.</italic>
</source>
                    <year>2014</year>;<volume>111</volume>(<issue>28</issue>):<fpage>10263</fpage>&#x2013;<lpage>10268</lpage>.
                    <pub-id pub-id-type="pmid">24982153</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.1410068111</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4104883</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xu</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <article-title>LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2007</year>;<volume>35</volume>(<issue>Web Server issue</issue>):<fpage>W265</fpage>&#x2013;<lpage>W268</lpage>.
                    <pub-id pub-id-type="pmid">17485477</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkm286</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1933203</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report158636">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.141880.r158636</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Rai</surname>
                        <given-names>Amit</given-names>
                    </name>
                    <xref ref-type="aff" rid="r158636a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5715-8541</uri>
                </contrib>
                <aff id="r158636a1">
                    <label>1</label>Graduate School of Pharmaceutical Sciences, Chiba University, Chiba, Japan</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>6</day>
                <month>2</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Rai A</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport158636" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.129212.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>In the presented article, the authors report an updated genome assembly for&#x00a0;Madagascar periwinkle, a valuable model plant species for studying MIA biosynthesis. Compared with the previously published genome assemblies for&#x00a0;Madagascar periwinkle, this study used long-read sequencing technology and achieved an improved contig N50.</p>
            <p>Without a doubt, this is a better genome assembly, but the authors should have considered Hi-C scaffolding to achieve a chromosome-scale genome assembly, as that would have allowed them to discover novel features contributing to MIA biosynthesis and evolution.</p>
            <p>Nevertheless, the resource presented here is valuable and will inspire researchers to combine the datasets generated in this study with new sequencing data to derive a chromosome-scale genome assembly for 
                <italic>C. roseus</italic>. For these reasons, I support its indexing.</p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Yes</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Yes</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Genome sciences, metabolomics research, evolutionary biology, MIA biosynthesis pathways</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report158640">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.141880.r158640</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Tatsis</surname>
                        <given-names>Evangelos</given-names>
                    </name>
                    <xref ref-type="aff" rid="r158640a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4013-1537</uri>
                </contrib>
                <aff id="r158640a1">
                    <label>1</label>Chinese Academy of Sciences Centre for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Shanghai, China</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>4</day>
                <month>1</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Tatsis E</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport158640" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.129212.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The current work reports an updated version of the genome of the significant medicinal plant 
                <italic>Catharanthus roseus</italic>. 
                <italic>C. roseus</italic> is the only source of the clinically used anticancer drug vinblastine, and such work can provide resources for molecular breeding to improve the production of vinblastine and other MIAs.</p>
            <p> The sequencing techniques and the bioinformatic analysis are appropriate. I recommend the current manuscript for indexing as it is.</p>
            <p>Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository?</p>
            <p>Yes</p>
            <p>Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the rationale for sequencing the genome and the species significance clearly described?</p>
            <p>Yes</p>
            <p>Are the protocols appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Plant specialised metabolism</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
