<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.166538.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Review</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>White paper: standards for handling and analyzing plant pan-genomes</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Heuermann</surname>
                        <given-names>Marc C.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Barros</surname>
                        <given-names>Pedro</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5626-0619</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Beier</surname>
                        <given-names>Sebastian</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2177-8781</uri>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Gundlach</surname>
                        <given-names>Heidrun</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6757-0943</uri>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Alvarez-Jarreta</surname>
                        <given-names>Jorge</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-0946-0957</uri>
                    <xref ref-type="aff" rid="a5">5</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Hassani-Pak</surname>
                        <given-names>Keywan</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9625-0511</uri>
                    <xref ref-type="aff" rid="a6">6</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>K&#x00f6;nig</surname>
                        <given-names>Patrick</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8948-6793</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Fiebig</surname>
                        <given-names>Anne</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3159-3593</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Godec</surname>
                        <given-names>Tim</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1719-3107</uri>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Gruden</surname>
                        <given-names>Kristina</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Nolte</surname>
                        <given-names>Nadja</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Petek</surname>
                        <given-names>Marko</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3644-7827</uri>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Scholz</surname>
                        <given-names>Uwe</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6113-3518</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Zagor&#x0161;&#x010d;ak</surname>
                        <given-names>Maja</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Vandepoele</surname>
                        <given-names>Klaas</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a8">8</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Van Bel</surname>
                        <given-names>Michiel</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a8">8</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Saxony-Anhalt, 06466, Germany</aff>
                <aff id="a2">
                    <label>2</label>Universidade Nova de Lisboa Instituto de Tecnologia Quimica e Biologica, Oeiras, Lisbon, Portugal</aff>
                <aff id="a3">
                    <label>3</label>Forschungszentrum J&#x00fc;lich GmbH Institute of Bio- and Geosciences, J&#x00fc;lich, North Rhine-Westphalia, 52425, Germany</aff>
                <aff id="a4">
                    <label>4</label>Helmholtz Zentrum M&#x00fc;nchen, Neuherberg, 85764, Germany</aff>
                <aff id="a5">
                    <label>5</label>Systems Immunity Research Institute, Cardiff University School of Medicine, Cardiff, Wales, UK</aff>
                <aff id="a6">
                    <label>6</label>Rothamsted Research, Harpenden, England, AL52JQ, UK</aff>
                <aff id="a7">
                    <label>7</label>National Institute of Biology, Ve&#x010d;na pot 111, Ljubljana, 1000, Slovenia</aff>
                <aff id="a8">
                    <label>8</label>Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, Ghent, 9052, Belgium</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:marcheuermann@ipk-gatersleben.de">marcheuermann@ipk-gatersleben.de</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>7</month>
                <year>2025</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2025</year>
            </pub-date>
            <volume>14</volume>
            <elocation-id>739</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>21</day>
                    <month>7</month>
                    <year>2025</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Heuermann MC et al.</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/14-739/pdf"/>
            <abstract>
                <p>Plant pan-genomes, which aggregate genomic sequences and annotations from multiple individuals of a species, have emerged as transformative tools for understanding genetic diversity, adaptation, and evolutionary dynamics. Super-pan-genomes, extending across species boundaries, further enable comparative analyses of clades or genera, bridging breeding applications with evolutionary insights (
                    <xref ref-type="bibr" rid="ref49">Shang et al., 2022</xref>; Li et al., 2023a). However, the absence of standardized practices for data generation, analysis, and sharing hinders reproducibility and interoperability. This white paper presents a harmonized framework developed by the ELIXIR E-PAN consortium, addressing nomenclature, quality control (QC), data formats, visualization, and community practices. By adopting these guidelines, researchers can enhance FAIR (Findable, Accessible, Interoperable, Reusable) compliance, foster collaboration, and accelerate translational applications in crop improvement and evolutionary biology.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>plant pan-genome</kwd>
                <kwd>white paper</kwd>
                <kwd>standards</kwd>
                <kwd>quality control</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="https://doi.org/10.13039/501100002347">
                    <funding-source>Federal Ministry of Education and Research</funding-source>
                    <award-id>de.NBI/ELIXIR-DE(W-de.NBI-009)</award-id>
                </award-group>
                <award-group id="fund-2">
                    <funding-source>ELIXIR under the call for proposals on Biodiversity, Food Safety and Pathogens</funding-source>
                    <award-id>(2024-SCIENCE-BFSP)</award-id>
                </award-group>
                <funding-statement>This study has received funding from ELIXIR under the call for proposals on Biodiversity, Food Safety and Pathogens (2024-SCIENCE-BFSP) as part of WP2 E-PAN: Enhancing pan-genome analysis in plants. MCH received funding from ELIXIR-DE, which was supported by the Federal Ministry of Education and Research BMBF within the framework of de.NBI/ELIXIR-DE (W-de.NBI-009).</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>1. Introduction</title>
            <p>Pan-genomes capture both core genomic elements (shared across individuals) and accessory components (variable or unique to subsets), offering unprecedented resolution for studying traits such as disease resistance, environmental adaptation, and domestication (
                <xref ref-type="bibr" rid="ref44">Qin et al., 2021</xref>; 
                <xref ref-type="bibr" rid="ref60">Zhou et al., 2022</xref>). Super-pan-genomes, which span multiple species, provide evolutionary context for gene family dynamics and speciation events, as demonstrated in clades like Brassicaceae (
                <xref ref-type="bibr" rid="ref26">Jiao &amp; Schneeberger, 2020</xref>) and Solanaceae (
                <xref ref-type="bibr" rid="ref1">Alonge et al., 2022</xref>). In plant genomics, pan-genomes are vital for understanding genetic diversity, adaptation, and evolutionary dynamics, particularly given the extensive variation observed in plant species (
                <xref ref-type="bibr" rid="ref48">Schreiber et al., 2024</xref>). Despite their potential, inconsistencies in data management&#x2014;such as ad hoc naming conventions, variable QC practices, and fragmented repository use&#x2014;limit cross-study comparisons and data reuse.</p>
            <p>The ELIXIR E-PAN consortium synthesizes insights from foundational studies on barley (
                <italic toggle="yes">Hordeum vulgare</italic>), rice (
                <italic toggle="yes">Oryza sativa</italic>), tomato (
                <italic toggle="yes">Solanum lycopersicum</italic>), and Arabidopsis (
                <italic toggle="yes">Arabidopsis thaliana</italic>) to propose actionable standards. These guidelines aim to unify the plant genomics community, ensuring robust, interoperable resources for breeding and evolutionary research.</p>
        </sec>
        <sec id="sec2">
            <title>2. Naming conventions and ontologies</title>
            <sec id="sec3">
                <title>2.1 Accession and assembly identifiers</title>
                <p>Accession naming should adhere to the MIAPPE (Minimum Information About Plant Phenotyping Experiments) standard. The Biological Material ID should combine an institutional identifier with the accession number from the germplasm catalogue or the common name of the plant source or variety (e.g., IPK-Gatersleben:HOR_13170 for barley accession &#x201c;Barke&#x201d;) to ensure traceability (MIAPPE v1.1, 
                    <xref ref-type="bibr" rid="ref41">Papoutsoglou et al., 2020</xref>). When complementary data on an accession are available from external sources (e.g., BioSamples), a link should be provided in the metadata as a Biological material external ID.</p>
                <p>Genome assembly identifiers should contain at least four fields (species, variety/line, project group, and assembly version) separated by periods (&#x2018;.&#x2019;), with an optional fifth field for additional information (
                    <xref ref-type="bibr" rid="ref6">Cannon et al., 2025</xref>). For example, drOrySati.Nipponbare.RicePan.1.0, which refers to the assembly of 
                    <italic toggle="yes">Oryza sativa</italic>, Nipponbare cultivar, RicePan project, version 1.0 (ToLID identifier, 
                    <ext-link ext-link-type="uri" xlink:href="https://id.tol.sanger.ac.uk/">https://id.tol.sanger.ac.uk/</ext-link>, 
                    <xref ref-type="bibr" rid="ref9">Darwin Tree of Life Consortium, 2023</xref>).</p>
            </sec>
            <sec id="sec4">
                <title>2.2 Gene identifiers</title>
                <p>Gene identifiers must balance stability with biological relevance, as outlined by 
                    <xref ref-type="bibr" rid="ref6">Cannon et al. (2025)</xref>, keeping track of the annotation version, chromosome and gene ID. Their framework proposes human- and machine-readable identifiers that extend the assembly name (e.g., drOrySati.Nipponbare.RicePan.1.0) with the annotation version, chromosome, and gene number, as in drOrySati.Nipponbare.RicePan.1.0.1.01.g000100 (assembly version 1.0, annotation version 1, chromosome 01, gene 100). To adapt this scheme to pan-genomics, the &#x201c;group&#x201d; field can denote the pan-genome project (e.g., RicePan) and thereby link multiple assemblies, while optional fields such as &#x201c;Hap1&#x201d; or metadata tags distinguish haplotypes or accession types (e.g., wild vs. cultivated). Pangenes, which represent orthologous gene clusters, can be assigned identifiers like drOrySati.RicePan.pan00001, with metadata linking to the specific gene models across assemblies. 
                    <xref ref-type="bibr" rid="ref6">Cannon et al. (2025)</xref> advocate preserving legacy identifiers via cross-references to ensure stability, avoiding disruptive renaming as new accessions are added.</p>
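                <p>The identifier layout described above can be checked mechanically. The following Python sketch is illustrative only and is not a reference implementation of the Cannon et al. (2025) scheme: the function names and regular expressions are hypothetical, and it assumes the four-field assembly form with a two-part numeric version (as in the example 1.0), without handling the optional fifth field.</p>
                <code language="python"><![CDATA[
```python
import re

# Assumed layout: species.line.group.major.minor
# (the optional fifth field is deliberately not handled in this sketch).
ASSEMBLY_RE = re.compile(
    r"^(?P<species>[^.]+)\.(?P<line>[^.]+)\.(?P<group>[^.]+)\.(?P<version>\d+\.\d+)$"
)

# Assumed layout: assembly ID, annotation version, chromosome, "g" + gene number.
GENE_RE = re.compile(
    r"^(?P<assembly>[^.]+\.[^.]+\.[^.]+\.\d+\.\d+)"
    r"\.(?P<annotation>\d+)\.(?P<chromosome>[^.]+)\.g(?P<gene>\d+)$"
)


def parse_assembly_id(identifier):
    """Split a four-field assembly identifier into named parts."""
    match = ASSEMBLY_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a four-field assembly identifier: {identifier!r}")
    return match.groupdict()


def parse_gene_id(identifier):
    """Split a gene identifier into its assembly fields plus gene-level fields."""
    match = GENE_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised gene identifier: {identifier!r}")
    fields = parse_assembly_id(match.group("assembly"))
    fields["annotation_version"] = match.group("annotation")
    fields["chromosome"] = match.group("chromosome")
    fields["gene_number"] = int(match.group("gene"))
    return fields


if __name__ == "__main__":
    print(parse_gene_id("drOrySati.Nipponbare.RicePan.1.0.1.01.g000100"))
```
]]></code>
                <p>Because the version field itself contains a period, the sketch matches the whole identifier with a pattern rather than splitting on every period.</p>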
            </sec>
            <sec id="sec5">
                <title>2.3 Metadata and ontologies</title>
                <p>A core metadata schema is critical for interoperability. Required fields to properly annotate pan-genome studies include species details such as name (TaxonID), pedigree, geographic origin, ploidy and chromosome number, as well as sequencing technology used (e.g. PacBio HiFi, Oxford Nanopore, Hi-C, Illumina), assembly pipelines (e.g., Flye (
                    <xref ref-type="bibr" rid="ref28">Kolmogorov et al., 2020</xref>), hifiasm (
                    <xref ref-type="bibr" rid="ref8">Cheng et al., 2024</xref>), Canu (
                    <xref ref-type="bibr" rid="ref29">Koren et al., 2017</xref>), &#x2026;), and assembly QC metrics (e.g., BUSCO scores (
                    <xref ref-type="bibr" rid="ref35">Manni et al., 2021</xref>)). Existing ontologies such as the Plant Ontology (PO) and Gene Ontology (GO), or the probably better suited Sequence Ontology (SO), should be extended to include pan-genome-specific terms that describe the layouts and structures of pan-genomes (
                    <xref ref-type="bibr" rid="ref42">Plant Ontology Consortium, 2023</xref>; The 
                    <xref ref-type="bibr" rid="ref17">Gene Ontology Consortium, 2023</xref>; 
                    <xref ref-type="bibr" rid="ref13">Eilbeck et al., 2005</xref>). Such terms could classify genes as belonging to the core, shell, or cloud genome, although these categories depend on the number of genomes and genotypes selected (
                    <xref ref-type="bibr" rid="ref24">Jayakodi et al., 2024</xref>). Collaboration with the AgBioData Nomenclature Working Group and the Genomic Standards Consortium (
                    <ext-link ext-link-type="uri" xlink:href="https://www.gensc.org/">https://www.gensc.org/</ext-link>) ensures alignment with broader genomic standards (
                    <xref ref-type="bibr" rid="ref6">Cannon et al., 2025</xref>).</p>
            </sec>
        </sec>
        <sec id="sec6">
            <title>3. Quality Control (QC) standards</title>
            <sec id="sec7">
                <title>3.1 Sequencing and assembly QC</title>
                <p>Quality control in genome assembly workflows begins with sequencing QC, where tools like FastQC assess raw read integrity and adapter content, while k-mer plots generated via Jellyfish (
                    <xref ref-type="bibr" rid="ref36">Mar&#x00e7;ais et al., 2011</xref>) paired with GenomeScope (
                    <xref ref-type="bibr" rid="ref45">Ranallo-Benavidez et al., 2020</xref>) estimations reveal genome complexity, including ploidy, heterozygosity, and repetitive element profiles. For QC of individual assemblies, metrics such as contiguity, completeness, and consensus accuracy are evaluated using QUAST (
                    <xref ref-type="bibr" rid="ref38">Mikheenko et al., 2023</xref>) and CRAQ (
                    <xref ref-type="bibr" rid="ref33">Li et al., 2023b</xref>), with Merqury validating haplotype resolution in polyploid or heterozygous genomes (e.g., wheat, potato) by comparing k-mer spectra between raw reads and assemblies (
                    <xref ref-type="bibr" rid="ref46">Rhie et al., 2020</xref>). Finally, repeat-based QC assesses assembly integrity using the LTR Assembly Index (LAI, 
                    <xref ref-type="bibr" rid="ref57">Ou et al., 2018</xref>), which evaluates the completeness of assembled LTR retrotransposons, and tidk (
                    <xref ref-type="bibr" rid="ref5">Brown et al., 2025</xref>), which identifies telomeric repeat motifs and thus indicates whether chromosome ends have been assembled.</p>
            </sec>
            <sec id="sec8">
                <title>3.2 Annotation QC</title>
                <p>In addition to the assembly strategy used, the corresponding annotation pipelines should also be documented. These may include gene model integration pipelines based on MAKER2 (
                    <xref ref-type="bibr" rid="ref23">Holt &amp; Yandell, 2011</xref>), PASA (
                    <xref ref-type="bibr" rid="ref19">Haas et al., 2003</xref>) or BRAKER3 (
                    <xref ref-type="bibr" rid="ref15">Gabriel et al., 2024</xref>), or more advanced tools integrating deep learning methods like Helixer (
                    <xref ref-type="bibr" rid="ref54">Stiehler et al., 2020</xref>) for ab initio prediction. Additionally, Liftoff (
                    <xref ref-type="bibr" rid="ref51">Shumate &amp; Salzberg, 2021</xref>) for annotation transfer should be integrated into versioned workflows (e.g., Snakemake (
                    <xref ref-type="bibr" rid="ref30">K&#x00f6;ster &amp; Rahmann, 2012</xref>), Nextflow (
                    <xref ref-type="bibr" rid="ref11">Di Tommaso et al., 2017</xref>)). Transcriptomic data (RNA-Seq) may be used to validate gene models, particularly for accessory genes lacking orthologs (
                    <xref ref-type="bibr" rid="ref44">Qin et al., 2021</xref>). It is important to ensure that multiple tissues, such as roots and shoots, are represented, and that sufficient replication is included to capture transcriptome diversity.</p>
                <p>QC of genome structural annotation may leverage lineage-specific BUSCO analyses to assess gene space completeness (adjusted for polyploidy; 
                    <xref ref-type="bibr" rid="ref35">Manni et al., 2021</xref>), complemented by Mercator4 (
                    <xref ref-type="bibr" rid="ref3">Bolger et al., 2021</xref>) for automated gene family classification, PSAURON for structural annotation validation (
                    <xref ref-type="bibr" rid="ref53">Sommer et al., 2025</xref>), and OMArk for contamination detection through evolutionary consistency checks (
                    <xref ref-type="bibr" rid="ref40">Nevers et al., 2025</xref>).</p>
            </sec>
            <sec id="sec9">
                <title>3.3 Pan-genome-specific QC</title>
                <p>Assessing pan-genome completeness requires saturation analysis, in which gene accumulation curves show whether adding new accessions still yields novel genes (
                    <xref ref-type="bibr" rid="ref55">Tettelin et al., 2005</xref>). For species like barley, benchmark datasets of 100+ conserved genes enable orthology tool validation (
                    <xref ref-type="bibr" rid="ref24">Jayakodi et al., 2024</xref>). Additionally, gene family expansion and contraction analyses, facilitated by tools such as OrthoFinder (
                    <xref ref-type="bibr" rid="ref14">Emms &amp; Kelly, 2019</xref>) and CAFE5 (
                    <xref ref-type="bibr" rid="ref37">Mendes et al., 2020</xref>), provide insights into evolutionary dynamics across accessions. Structural variant detection (e.g., using Sniffles2 (
                    <xref ref-type="bibr" rid="ref52">Smolka et al., 2024</xref>) or SVIM (
                    <xref ref-type="bibr" rid="ref21">Heller &amp; Vingron, 2019</xref>)) must quantify representation of indels and inversions (
                    <xref ref-type="bibr" rid="ref44">Qin et al., 2021</xref>). Similarly, presence-absence variation (PAV) detection, enabled by tools like Panaroo or PAV-specific pipelines, is critical for identifying variable gene content that contributes to phenotypic diversity (
                    <xref ref-type="bibr" rid="ref56">Tonkin-Hill et al., 2020</xref>).</p>
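                <p>One common way to approximate the saturation analysis mentioned at the start of this section is to average pan- and core-genome sizes over random orderings of the accessions. The Python sketch below illustrates this under stated assumptions: the function name, the input structure (a mapping from accession to a set of pangene identifiers), and the toy data are hypothetical, and the permutation approach is a generic technique rather than the specific procedure of any cited study.</p>
                <code language="python"><![CDATA[
```python
import random


def accumulation_curves(gene_sets, permutations=100, seed=0):
    """Return (pan_curve, core_curve): mean pan- and core-genome sizes
    after adding 1..N accessions, averaged over random addition orders.

    gene_sets: dict mapping accession name -> set of pangene identifiers.
    """
    rng = random.Random(seed)
    accessions = list(gene_sets)
    n = len(accessions)
    pan_totals = [0] * n
    core_totals = [0] * n

    for _ in range(permutations):
        order = accessions[:]
        rng.shuffle(order)
        pan = set()
        core = None
        for k, accession in enumerate(order):
            genes = gene_sets[accession]
            pan |= genes  # union: every gene seen so far
            core = set(genes) if core is None else core & genes  # shared by all so far
            pan_totals[k] += len(pan)
            core_totals[k] += len(core)

    pan_curve = [total / permutations for total in pan_totals]
    core_curve = [total / permutations for total in core_totals]
    return pan_curve, core_curve


if __name__ == "__main__":
    # Toy presence/absence data (hypothetical accessions and genes).
    toy = {
        "acc_A": {"g1", "g2", "g3"},
        "acc_B": {"g1", "g2", "g4"},
        "acc_C": {"g1", "g2", "g3"},
    }
    pan, core = accumulation_curves(toy)
    print("pan-genome size:", pan)
    print("core-genome size:", core)
```
]]></code>
                <p>A pan-genome curve that flattens as accessions are added suggests that sampling is approaching saturation, whereas a curve that keeps rising indicates that further accessions are likely to contribute novel genes.</p>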
            </sec>
        </sec>
        <sec id="sec10">
            <title>4. Data formats and sharing</title>
            <sec id="sec11">
                <title>4.1 File formats</title>
                <p>

                    <list list-type="bullet">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>

                                <bold>Raw data</bold>: Assemblies must be submitted in FASTA format with headers containing unique sequence identifiers (e.g., &gt;chr1, &gt;chr2). Annotations must be provided in GFF3 or GTF format (compliant with Sequence Ontology), with the sequence IDs in the first column exactly matching the sequence identifiers used in the FASTA headers.</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>

                                <bold>Derived data</bold>: Structural variants should be provided in VCF/BCF, orthogroups in TSV (cluster ID and member gene), and graph-based representations of complex pan-genomes in GFA format (
                                <xref ref-type="bibr" rid="ref31">Li et al., 2020</xref>).</p>
                        </list-item>
                    </list>
                </p>
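                <p>The requirement that GFF3 sequence IDs match the FASTA headers can be verified before submission. The following Python sketch shows one way to do so; the function names are hypothetical, and it assumes that the FASTA identifier is the first whitespace-delimited token after &#x201c;&gt;&#x201d; and that any sequence embedded after a ##FASTA directive in the GFF3 file should be ignored.</p>
                <code language="python"><![CDATA[
```python
def fasta_ids(lines):
    """Collect sequence identifiers from FASTA header lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])  # first token after ">" is taken as the ID
    return ids


def gff3_seqids(lines):
    """Collect the sequence IDs used in column 1 of a GFF3 file."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_lines, gff3_lines):
    """Return GFF3 sequence IDs that have no matching FASTA header."""
    return sorted(gff3_seqids(gff3_lines) - fasta_ids(fasta_lines))


if __name__ == "__main__":
    fasta = [">chr1 example description", "ACGT", ">chr2", "GGCC"]
    gff3 = [
        "##gff-version 3",
        "chr1\tsketch\tgene\t1\t4\t.\t+\t.\tID=gene1",
        "Chr2\tsketch\tgene\t1\t4\t.\t+\t.\tID=gene2",
    ]
    print(unmatched_seqids(fasta, gff3))
```
]]></code>
                <p>Both functions accept any iterable of lines, so open file handles can be passed directly. In the toy input, the mismatch in letter case between &#x201c;Chr2&#x201d; and &#x201c;chr2&#x201d; is reported, which is a typical source of broken annotation-to-assembly links.</p>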
            </sec>
            <sec id="sec12">
                <title>4.2 Repositories</title>
                <p>A centralized E-PAN repository hosted by ELIXIR will archive versioned datasets (e.g., Barley v2, Rice v1.5) with DOI-based identifiers (DataCite). Public deposition of raw reads and assemblies in INSDC databases (ENA Documentation, 2025) and of annotations in Ensembl Plants (see documentation of Ensembl, 2025) ensures global accessibility.</p>
            </sec>
            <sec id="sec13">
                <title>4.3 Metadata requirements</title>
                <p>Mandatory metadata fields include sequencing technology and coverage (e.g., PacBio HiFi, Oxford Nanopore), assembly method (e.g., Flye (
                    <xref ref-type="bibr" rid="ref28">Kolmogorov et al., 2020</xref>), Canu (
                    <xref ref-type="bibr" rid="ref29">Koren et al., 2017</xref>)), and accession provenance (BioSample IDs). Missing metadata, as observed in early barley submissions, must be addressed via enforced submission guidelines (
                    <xref ref-type="bibr" rid="ref24">Jayakodi et al., 2024</xref>).</p>
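                <p>Enforcing these fields at submission time can be as simple as rejecting records with missing or empty values. The Python sketch below illustrates the idea; the field keys and the function name are hypothetical placeholders and do not correspond to an existing E-PAN or INSDC schema.</p>
                <code language="python"><![CDATA[
```python
# Hypothetical keys for the mandatory fields listed in this section.
REQUIRED_FIELDS = (
    "sequencing_technology",
    "coverage",
    "assembly_method",
    "biosample_id",
)


def missing_metadata(record):
    """Return the mandatory fields that are absent or empty in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


if __name__ == "__main__":
    # Toy record with an empty BioSample ID (values are illustrative only).
    record = {
        "sequencing_technology": "PacBio HiFi",
        "coverage": "30x",
        "assembly_method": "hifiasm",
        "biosample_id": "",
    }
    print(missing_metadata(record))
```
]]></code>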
            </sec>
        </sec>
        <sec id="sec14">
            <title>5. Visualization and analysis guidelines</title>
            <sec id="sec15">
                <title>5.1 Visualization tools</title>
                <p>Plant pan-genomes, which aim to capture the full genomic diversity of a species, are constructed using 
                    <bold>linear-based</bold> or 
                    <bold>graph-based</bold> methods, each with distinct strengths and limitations.</p>
                <p>
                    <bold>Linear-based</bold> approaches align genomes to a single reference or consensus sequence, identifying SNPs, indels, and presence/absence variations (PAVs) using tools like Ensembl Compara (
                    <xref ref-type="bibr" rid="ref12">Dyer et al., 2025</xref>) or OrthoFinder (
                    <xref ref-type="bibr" rid="ref14">Emms &amp; Kelly, 2019</xref>). These methods are simple, computationally efficient, and compatible with tools like JBrowse2 or IGV for synteny and PAV visualization (
                    <xref ref-type="bibr" rid="ref10">Diesh et al., 2023</xref>; 
                    <xref ref-type="bibr" rid="ref47">Robinson et al., 2023</xref>). However, reference bias limits their ability to capture complex structural variations, particularly in repetitive or polyploid plant genomes.</p>
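                <p>
                    <bold>Graph-based</bold> approaches model genomes as graphs in which nodes represent sequence segments and edges connect segments that are adjacent in at least one genome, so that SNPs, indels, and structural variants appear as alternative paths. Tools for building and working with such graphs include VG Toolkit (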
                    <xref ref-type="bibr" rid="ref22">Hickey et al., 2020</xref>), PGGB (
                    <xref ref-type="bibr" rid="ref16">Garrison et al., 2024</xref>), PanTools (
                    <xref ref-type="bibr" rid="ref27">Jonkheer et al., 2022</xref>), and wfmash (
                    <xref ref-type="bibr" rid="ref18">Guarracino et al., 2021</xref>). They reduce reference bias and provide a more comprehensive view of genomic diversity, which suits complex genomes such as tomato (
                    <xref ref-type="bibr" rid="ref60">Zhou et al., 2022</xref>). Visualization tools like Bandage (
                    <xref ref-type="bibr" rid="ref58">Wick et al., 2015</xref>) or Cytoscape (
                    <xref ref-type="bibr" rid="ref50">Shannon et al., 2003</xref>) address structural complexity but are computationally intensive and require specialized expertise.</p>
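                <p>The graph model can be made concrete with a toy example. The following Python sketch assumes a simple sequence-graph layout in which nodes hold sequence segments, edges record adjacency, and each accession is a path through the graph; a SNP then appears as two alternative nodes between shared flanks. The class, node identifiers, and accession names are hypothetical and do not reproduce the data model of any tool cited here.</p>
                <code language="python">
```python
from dataclasses import dataclass, field


@dataclass
class SequenceGraph:
    """Toy sequence graph: nodes hold sequence, paths spell accessions."""
    nodes: dict = field(default_factory=dict)   # node id -> sequence segment
    edges: set = field(default_factory=set)     # (from id, to id) adjacencies
    paths: dict = field(default_factory=dict)   # accession -> list of node ids

    def add_path(self, accession, node_ids):
        """Register an accession and the edges implied by its walk."""
        for node_id in node_ids:
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.paths[accession] = list(node_ids)
        self.edges.update(zip(node_ids, node_ids[1:]))

    def sequence(self, accession):
        """Spell out the linear sequence of one accession."""
        return "".join(self.nodes[n] for n in self.paths[accession])

    def shared_nodes(self):
        """Nodes traversed by every accession (the shared backbone)."""
        walks = [set(p) for p in self.paths.values()]
        return set.intersection(*walks) if walks else set()


# Two hypothetical accessions that differ by a single SNP (A versus G).
graph = SequenceGraph()
graph.nodes.update({1: "ACGT", 2: "A", 3: "G", 4: "TTCA"})
graph.add_path("accession_A", [1, 2, 4])
graph.add_path("accession_B", [1, 3, 4])

print(graph.sequence("accession_A"))
print(graph.sequence("accession_B"))
print(sorted(graph.shared_nodes()))
```
                </code>
                <p>In this layout no accession is privileged as the reference: both alleles are ordinary nodes, and the variant is recovered by comparing paths rather than by reading it off an edge label.</p>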
                <p>In essence, linear methods suit rapid, basic analyses, while graph-based approaches excel for complex structural variations despite greater computational demands. As tools and computational resources advance, graph-based methods are becoming more accessible, enhancing plant pan-genome studies. A multitude of tools are summarized in 
                    <xref ref-type="bibr" rid="ref39">Naithani et al. (2023)</xref>.</p>
            </sec>
            <sec id="sec16">
                <title>5.2 Analysis best practices</title>
                <p>Orthology inference tools (e.g., OrthoFinder, OMA) must be benchmarked, for example by sweeping the MCL inflation parameter where the tool exposes one, to minimize false positives (
                    <xref ref-type="bibr" rid="ref14">Emms &amp; Kelly, 2019</xref>). Trait association studies should integrate pan-genomes with GWAS/QTL data, as demonstrated in rice (
                    <xref ref-type="bibr" rid="ref44">Qin et al., 2021</xref>). The resulting integrated knowledge graphs should be made available to scientists and breeders to support evidence-based gene discovery, as is being developed at KnetMiner (
                    <xref ref-type="bibr" rid="ref20">Hassani-Pak et al., 2021</xref>).</p>
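                <p>One way to evaluate an inflation value sweep is to score the orthogroups obtained at each value against a small set of trusted ortholog pairs. The Python sketch below illustrates that scoring step only. It does not invoke any orthology tool, and the gene names, the trusted pairs, the clustering results, and the use of F1 as the selection criterion are all assumptions made for illustration.</p>
                <code language="python">
```python
from itertools import combinations


def pairs_from_orthogroups(orthogroups):
    """Return the set of unordered gene pairs placed in the same orthogroup."""
    pairs = set()
    for genes in orthogroups:
        pairs.update(frozenset(p) for p in combinations(sorted(genes), 2))
    return pairs


def score(orthogroups, true_pairs):
    """Precision and recall of predicted pairs against a trusted pair set."""
    predicted = pairs_from_orthogroups(orthogroups)
    hits = len(predicted.intersection(true_pairs))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(true_pairs) if true_pairs else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical trusted ortholog pairs, e.g. from manual curation.
true_pairs = {frozenset(p) for p in [("a1", "b1"), ("a2", "b2")]}

# Hypothetical clustering results from separate runs, keyed by inflation value.
sweep = {
    1.2: [{"a1", "b1", "a2", "b2"}],        # coarse: families merged
    1.5: [{"a1", "b1"}, {"a2", "b2"}],      # agrees with the trusted set
    3.0: [{"a1", "b1"}, {"a2"}, {"b2"}],    # fine: one family split
}

results = {value: score(groups, true_pairs) for value, groups in sweep.items()}
best = max(results, key=lambda value: f1(*results[value]))

for value, (precision, recall) in sorted(results.items()):
    print(f"inflation={value}: precision={precision:.2f} recall={recall:.2f}")
print("best inflation:", best)
```
                </code>
                <p>Low inflation values tend to merge families and lower precision, whereas high values tend to split them and lower recall; reporting both metrics across the sweep makes the chosen setting auditable.</p>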
            </sec>
        </sec>
        <sec id="sec17">
            <title>6. Case studies</title>
            <p>

                <list list-type="order">
                    <list-item>
                        <label>1.</label>
                        <p>

                            <bold>Barley Pan-genome</bold> (
                            <xref ref-type="bibr" rid="ref24">Jayakodi et al., 2024</xref>): The IPK barley pan-genome (76 accessions) highlighted challenges in assembling large, repeat-rich genomes and in manual curation. Automated QC pipelines (Snakemake (
                            <xref ref-type="bibr" rid="ref30">K&#x00f6;ster &amp; Rahmann, 2012</xref>)) and validation gene sets improved reproducibility.</p>
                    </list-item>
                    <list-item>
                        <label>2.</label>
                        <p>

                            <bold>Rice Pan-genome</bold> (
                            <xref ref-type="bibr" rid="ref44">Qin et al., 2021</xref>): Analysis of 31 accessions revealed hidden structural variations using Sniffles, underscoring the need for standardized QC metrics.</p>
                    </list-item>
                    <list-item>
                        <label>3.</label>
                        <p>

                            <bold>Tomato Super-Pan-genome</bold> (
                            <xref ref-type="bibr" rid="ref60">Zhou et al., 2022</xref>): A graph-based representation of 838 genomes resolved complex structural variants, informing breeding for disease resistance.</p>
                    </list-item>
                    <list-item>
                        <label>4.</label>
                        <p>

                            <bold>Arabidopsis</bold> (
                            <xref ref-type="bibr" rid="ref26">Jiao &amp; Schneeberger, 2020</xref>; 
                            <xref ref-type="bibr" rid="ref59">Zhong et al., 2025</xref>): Annotation transfer across eight high-quality genomes demonstrated the utility of Liftoff for cross-accession consistency. A multi-omic, pan-genomic assessment further compared the genomes that constitute the Arabidopsis MAGIC population.</p>
                    </list-item>
                </list>
            </p>
        </sec>
        <sec id="sec18">
            <title>7. Future directions</title>
            <p>

                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>

                            <bold>Machine learning</bold>: Tools like DeepVariant (
                            <xref ref-type="bibr" rid="ref43">Poplin et al., 2018</xref>) are expected to enhance variant calling in polyploid genomes. Foundation models should also enable and refine the detection of other genomic features, such as repeat elements, regulatory elements, and binding sites, as recent studies have demonstrated. For instance, BigRNA predicts tissue-specific RNA expression and identifies regulatory elements such as microRNA and protein binding sites with high accuracy (
                            <xref ref-type="bibr" rid="ref7">Celaj et al., 2023</xref>). Similarly, Evo 2 detects transcription factor binding sites and exon-intron boundaries across diverse genomes (
                            <xref ref-type="bibr" rid="ref4">Brixi et al., 2025</xref>), while models like DNABERT (
                            <xref ref-type="bibr" rid="ref25">Ji et al., 2021</xref>) and Enformer (
                            <xref ref-type="bibr" rid="ref2">Avsec et al., 2021</xref>) excel in promoter prediction and variant effect analysis (
                            <xref ref-type="bibr" rid="ref34">Li et al., 2024</xref>). These advances highlight the potential of foundation models to refine genomic feature detection, particularly in complex polyploid genomes.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>

                            <bold>Cross-species standards</bold>: Develop clade-wide frameworks (e.g., Brassicaceae) to unify super-pan-genome analyses.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>

                            <bold>Community engagement</bold>: ELIXIR hackathons will refine workflows and ontology terms, ensuring adaptability to technological advances.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>

                            <bold>Generalized feature identification:</bold> As pan-genome graphs grow to encompass not just core and variable genes but a full spectrum of genomic elements, we need a unified identification system. Current annotation often focuses on genes, leaving features like transposable elements, SSRs, non-coding RNAs, and regulatory motifs with inconsistent or tool-specific labels. We propose the development of a 
                            <bold>generalized feature identifier (GFI)</bold>. This system would provide a stable, queryable, and standardized format for any annotated feature, independent of its type or the discovery tool used. A GFI would support pan-genome-scale association studies and the functional characterization of the entire &#x201c;dark matter&#x201d; of the genome, ensuring that a SNP in a long terminal repeat or a copy number variation in a novel ncRNA can be cataloged and compared with the same rigor as a variant in a gene.</p>
                    </list-item>
                </list>
            </p>
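            <p>To illustrate what a tool-independent identifier could look like, the Python sketch below derives a deterministic ID from a feature&#x2019;s assembly, coordinates, strand, and Sequence Ontology type. This scheme is purely hypothetical: the field names, the <monospace>GFI</monospace> prefix, the hashing approach, and the example features are assumptions made for illustration and are not a specification proposed by the consortium.</p>
            <code language="python">
```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    assembly: str   # assembly or accession label
    seqid: str      # chromosome or contig name
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    strand: str     # "+", "-" or "."
    so_term: str    # Sequence Ontology type, e.g. "LTR_retrotransposon"


def make_gfi(feature, prefix="GFI"):
    """Derive a deterministic identifier from a feature's type and location.

    Hypothetical scheme: the discovery tool is deliberately not part of the
    key, so two tools reporting the same feature agree on its identifier.
    """
    key = "|".join([
        feature.assembly, feature.seqid, str(feature.start),
        str(feature.end), feature.strand, feature.so_term,
    ])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"{prefix}:{feature.so_term}:{digest}"


# Hypothetical non-genic features on one accession.
ltr = Feature("accession_A", "chr1", 10500, 16200, "+", "LTR_retrotransposon")
ncrna = Feature("accession_A", "chr1", 20100, 20450, "-", "ncRNA")

for feature in (ltr, ncrna):
    print(make_gfi(feature), feature.seqid, feature.start, feature.end)
```
            </code>
            <p>A coordinate-derived key is only stable within one assembly version, so a production scheme would additionally need a registry that maps identifiers across assembly updates and between accessions.</p>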
        </sec>
        <sec id="sec19" sec-type="conclusion">
            <title>8. Conclusion</title>
            <p>This white paper establishes a community-driven framework for plant pan-genome research. By adopting these guidelines, researchers can ensure data interoperability, reproducibility, and translational impact. The E-PAN consortium calls for global collaboration to iteratively refine these standards, fostering innovation in plant genomics and breeding.</p>
            <p>

                <bold>Endorsed by ELIXIR Nodes</bold>: DE, BE, PT, SI, UK.</p>
            <p>

                <bold>Contact</bold>: 
                <email xlink:href="mailto:elixir-epan@elixir-europe.org">elixir-epan@elixir-europe.org</email>
            </p>
        </sec>
    </body>
    <back>
        <sec id="sec22" sec-type="data-availability">
            <title>Data availability</title>
            <p>No data are associated with this article.</p>
        </sec>
        <ack>
            <title>Acknowledgments</title>
            <p>The E-PAN consortium acknowledges contributions from researchers at ELIXIR nodes and foundational studies in rice, barley, tomato, and Arabidopsis. AI tools (DeepSeek R1, Qwen QwQ 32B) hosted on 
                <ext-link ext-link-type="uri" xlink:href="https://chat-ai.academiccloud.de">https://chat-ai.academiccloud.de</ext-link> helped create the draft from multiple meeting notes, with thorough human oversight ensuring scientific accuracy.</p>
            <p>For updates, visit the 
                <ext-link ext-link-type="uri" xlink:href="https://elixir-europe.org/communities/plant-sciences">

                    <italic toggle="yes">ELIXIR Plant Sciences Community Portal</italic>
</ext-link>.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Alonge</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Major impacts of widespread structural variation on gene expression and crop improvement in tomato.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2022</year>;<volume>606</volume>:<fpage>527</fpage>&#x2013;<lpage>534</lpage>.
                    <pub-id pub-id-type="pmid">35676474</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41586-022-04808-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9200638</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Avsec</surname>
                            <given-names>&#x017d;</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Effective gene expression prediction from sequence by integrating long-range interactions.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2021</year>;<volume>18</volume>:<fpage>1196</fpage>&#x2013;<lpage>1203</lpage>.
                    <pub-id pub-id-type="pmid">34608324</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41592-021-01252-x</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8490152</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bolger</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>MapMan Visualization of RNA-Seq Data Using Mercator4 Functional Annotations. </chapter-title>
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Dobnik</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gruden</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ram&#x0161;ak</surname>
                            <given-names>&#x017d;</given-names>
                        </name>

                        <etal/>
</person-group>, editors.
                    <source>

                        <italic toggle="yes">Solanum tuberosum. Methods in Molecular Biology.</italic>
</source>
                    <publisher-loc>New York, NY</publisher-loc>:
                    <publisher-name>Humana</publisher-name>;<year>2021</year>; vol.<volume>2354</volume>.
                    <pub-id pub-id-type="doi">10.1007/978-1-0716-1609-3_9</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Brixi</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome modeling and design across all domains of life with Evo 2.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2025</year>.
                    <pub-id pub-id-type="doi">10.1101/2025.02.18.638918</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Brown</surname>
                            <given-names>MR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>tidk: a toolkit to rapidly identify telomeric repeats from genomic datasets.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2025</year>;<volume>41</volume>(<issue>2</issue>):<fpage>btaf049</fpage>.
                    <pub-id pub-id-type="pmid">39891350</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btaf049</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11814493</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cannon</surname>
                            <given-names>EKS</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Guidelines for gene and genome assembly nomenclature.</article-title>
                    <source>

                        <italic toggle="yes">Genetics.</italic>
</source>
                    <year>2025</year>;<volume>229</volume>(<issue>3</issue>):<fpage>iyaf006</fpage>.
                    <pub-id pub-id-type="pmid">39813136</pub-id>
                    <pub-id pub-id-type="doi">10.1093/genetics/iyaf006</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11912837</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Celaj</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2023</year>.
                    <pub-id pub-id-type="doi">10.1101/2023.09.20.558508</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cheng</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2024</year>;<volume>21</volume>:<fpage>967</fpage>&#x2013;<lpage>970</lpage>.
                    <pub-id pub-id-type="pmid">38730258</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41592-024-02269-8</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11214949</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <mixed-citation publication-type="other">
                    <collab>Darwin Tree of Life Consortium</collab>:
                    <article-title>The Darwin Tree of Life project: Sequencing all life in Britain and Ireland.</article-title>
                    <year>2023</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.darwintreeoflife.org/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Diesh</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>JBrowse 2: a modular genome browser with views of synteny and structural variation.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2023</year>;<volume>24</volume>:<fpage>74</fpage>.
                    <pub-id pub-id-type="pmid">37069644</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-023-02914-z</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10108523</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Di Tommaso</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Nextflow enables reproducible computational workflows.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2017</year>;<volume>35</volume>:<fpage>316</fpage>&#x2013;<lpage>319</lpage>.
                    <pub-id pub-id-type="pmid">28398311</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3820</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dyer</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Ensembl 2025.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2025</year>;<volume>53</volume>(<issue>D1</issue>):<fpage>D948</fpage>&#x2013;<lpage>D957</lpage>.
                    <pub-id pub-id-type="pmid">39656687</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkae1071</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11701638</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Eilbeck</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The Sequence Ontology: A tool for the unification of genome annotations.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2005</year>;<volume>6</volume>(<issue>5</issue>):<fpage>R44</fpage>.
                    <pub-id pub-id-type="pmid">15892872</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gb-2005-6-5-r44</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1175956</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Emms</surname>
                            <given-names>DM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kelly</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>OrthoFinder: Phylogenetic orthology inference for comparative genomics.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2019</year>;<volume>20</volume>:<fpage>238</fpage>.
                    <pub-id pub-id-type="pmid">31727128</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-019-1832-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6857279</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gabriel</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA.</article-title>
                    <source>

                        <italic toggle="yes">Genome Res.</italic>
</source>
                    <year>2024</year>;<volume>34</volume>(<issue>5</issue>):<fpage>769</fpage>&#x2013;<lpage>777</lpage>.
                    <pub-id pub-id-type="pmid">38866550</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.278090.123</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11216308</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Garrison</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Building pangenome graphs.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2024</year>;<volume>21</volume>:<fpage>2008</fpage>&#x2013;<lpage>2012</lpage>.
                    <pub-id pub-id-type="pmid">39433878</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41592-024-02430-3</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <mixed-citation publication-type="journal">
                    <collab>Gene Ontology Consortium</collab>:
                    <article-title>The Gene Ontology knowledgebase.</article-title>
                    <source>

                        <italic toggle="yes">Genetics.</italic>
</source>
                    <year>2023</year>;<volume>224</volume>(<issue>1</issue>):<fpage>iyad031</fpage>.
                    <pub-id pub-id-type="pmid">36866529</pub-id>
                    <pub-id pub-id-type="doi">10.1093/genetics/iyad031</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10158837</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Guarracino</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <source>

                        <italic toggle="yes">wfmash: whole-chromosome pairwise alignment using the hierarchical wavefront algorithm.</italic>
</source>
                    <publisher-name>GitHub</publisher-name>;<year>2021</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/waveygang/wfmash">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Haas</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2003</year>;<volume>31</volume>(<issue>19</issue>):<fpage>5654</fpage>&#x2013;<lpage>5666</lpage>.
                    <pub-id pub-id-type="pmid">14500829</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkg770</pub-id>
                    <pub-id pub-id-type="pmcid">PMC206470</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hassani-Pak</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>KnetMiner: a comprehensive approach for supporting evidence-based gene discovery and complex trait analysis across species.</article-title>
                    <source>

                        <italic toggle="yes">Plant Biotechnol. J.</italic>
</source>
                    <year>2021</year>;<volume>19</volume>:<fpage>1670</fpage>&#x2013;<lpage>1678</lpage>.
                    <pub-id pub-id-type="pmid">33750020</pub-id>
                    <pub-id pub-id-type="doi">10.1111/pbi.13583</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8384599</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Heller</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vingron</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>SVIM: structural variant identification using mapped long reads.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2019</year>;<volume>35</volume>(<issue>17</issue>):<fpage>2907</fpage>&#x2013;<lpage>2915</lpage>.
                    <pub-id pub-id-type="pmid">30668829</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btz041</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6735718</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hickey</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genotyping structural variants in pangenome graphs using the vg toolkit.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>:<fpage>35</fpage>.
                    <pub-id pub-id-type="pmid">32051000</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-1941-7</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7017486</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Holt</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yandell</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2011</year>;<volume>12</volume>:<fpage>491</fpage>.
                    <pub-id pub-id-type="pmid">22192575</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-12-491</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3280279</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jayakodi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The barley pan-genome reveals genomic diversity across wild and cultivated accessions.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2024</year>;<volume>636</volume>:<fpage>654</fpage>&#x2013;<lpage>662</lpage>.
                    <pub-id pub-id-type="pmid">39537924</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41586-024-08187-1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11655362</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ji</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2021</year>;<volume>37</volume>(<issue>15</issue>):<fpage>2112</fpage>&#x2013;<lpage>2120</lpage>.
                    <pub-id pub-id-type="pmid">33538820</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btab083</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11025658</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jiao</surname>
                            <given-names>W-B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schneeberger</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>Chromosome-level assemblies of multiple Arabidopsis genomes reveal hotspots of rearrangements with altered evolutionary dynamics.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Commun.</italic>
</source>
                    <year>2020</year>;<volume>11</volume>:<fpage>989</fpage>.
                    <pub-id pub-id-type="pmid">32080174</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-020-14779-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7033125</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jonkheer</surname>
                            <given-names>EM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>PanTools v3: Functional analysis of prokaryotic pangenomes.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2022</year>;<volume>38</volume>(<issue>18</issue>):<fpage>4403</fpage>&#x2013;<lpage>4405</lpage>.
                    <pub-id pub-id-type="pmid">35861394</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btac506</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9477522</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kolmogorov</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>metaFlye: scalable long-read metagenome assembly using repeat graphs.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2020</year>;<volume>17</volume>:<fpage>1103</fpage>&#x2013;<lpage>1110</lpage>.
                    <pub-id pub-id-type="pmid">33020656</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41592-020-00971-x</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10699202</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Koren</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.</article-title>
                    <source>

                        <italic toggle="yes">Genome Res.</italic>
</source>
                    <year>2017</year>;<volume>27</volume>:<fpage>722</fpage>&#x2013;<lpage>736</lpage>.
                    <pub-id pub-id-type="pmid">28298431</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.215087.116</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5411767</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>K&#x00f6;ster</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rahmann</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>Snakemake&#x2014;A scalable bioinformatics workflow engine.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2012</year>;<volume>28</volume>(<issue>19</issue>):<fpage>2520</fpage>&#x2013;<lpage>2522</lpage>.
                    <pub-id pub-id-type="pmid">22908215</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts480</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The design and construction of reference pangenome graphs.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>:<fpage>265</fpage>.
                    <pub-id pub-id-type="pmid">33066802</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-02168-z</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7568353</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref32">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>N</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Genet.</italic>
</source>
                    <year>2023a</year>;<volume>55</volume>(<issue>8</issue>):<fpage>852</fpage>&#x2013;<lpage>860</lpage>.
                    <pub-id pub-id-type="pmid">37024581</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41588-023-01340-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10181942</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Commun.</italic>
</source>
                    <year>2023b</year>;<volume>14</volume>:<fpage>6556</fpage>.
                    <pub-id pub-id-type="pmid">37848433</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-023-42336-w</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10582259</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref34">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>Q</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Progress and opportunities of foundation models in bioinformatics.</article-title>
                    <source>

                        <italic toggle="yes">Brief. Bioinform.</italic>
</source>
                    <year>2024</year>;<volume>25</volume>(<issue>6</issue>):<fpage>bbae548</fpage>.
                    <pub-id pub-id-type="pmid">39461902</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bib/bbae548</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11512649</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref35">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Manni</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BUSCO: Assessing genome assembly and annotation completeness.</article-title>
                    <source>

                        <italic toggle="yes">Current Protocols.</italic>
</source>
                    <year>2021</year>;<volume>1</volume>(<issue>7</issue>):<fpage>e323</fpage>.
                    <pub-id pub-id-type="pmid">34936221</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cpz1.323</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref36">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mar&#x00e7;ais</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A fast, lock-free approach for efficient parallel counting of occurrences of k-mers.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2011</year>;<volume>27</volume>(<issue>6</issue>):<fpage>764</fpage>&#x2013;<lpage>770</lpage>.
                    <pub-id pub-id-type="pmid">21217122</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btr011</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3051319</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref37">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mendes</surname>
                            <given-names>FK</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>CAFE 5 models variation in evolutionary rates among gene families.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2020</year>;<volume>36</volume>(<issue>22-23</issue>):<fpage>5516</fpage>&#x2013;<lpage>5518</lpage>.
                    <pub-id pub-id-type="pmid">33325502</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa1022</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref38">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mikheenko</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>WebQUAST: online evaluation of genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2023</year>;<volume>51</volume>(<issue>W1</issue>):<fpage>W601</fpage>&#x2013;<lpage>W606</lpage>.
                    <pub-id pub-id-type="doi">10.1093/nar/gkad406</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref39">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Naithani</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Exploring Pan-Genomes: An Overview of Resources and Tools for Unraveling Structure, Function, and Evolution of Crop Genes and Genomes.</article-title>
                    <source>

                        <italic toggle="yes">Biomolecules.</italic>
</source>
                    <year>2023</year>;<volume>13</volume>(<issue>9</issue>):<fpage>1403</fpage>.
                    <pub-id pub-id-type="pmid">37759803</pub-id>
                    <pub-id pub-id-type="doi">10.3390/biom13091403</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10527062</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref40">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Nevers</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>OMArk: Genome assembly quality assessment using evolutionary signals.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2025</year>;<volume>43</volume>:<fpage>124</fpage>&#x2013;<lpage>133</lpage>.
                    <pub-id pub-id-type="pmid">38383603</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41587-024-02147-w</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11738984</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref41">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Papoutsoglou</surname>
                            <given-names>EA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Enabling reusability of plant phenomic datasets with MIAPPE 1.1.</article-title>
                    <source>

                        <italic toggle="yes">New Phytol.</italic>
</source>
                    <year>2020</year>;<volume>227</volume>:<fpage>260</fpage>&#x2013;<lpage>273</lpage>.
                    <pub-id pub-id-type="pmid">32171029</pub-id>
                    <pub-id pub-id-type="doi">10.1111/nph.16544</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7317793</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref42">
                <mixed-citation publication-type="journal">
                    <collab>Plant Ontology Consortium</collab>:
                    <article-title>The Plant Ontology: A tool for plant genomics.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2023</year>;<volume>51</volume>(<issue>D1</issue>):<fpage>D232</fpage>&#x2013;<lpage>D239</lpage>.
                    <pub-id pub-id-type="pmid">36373614</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkac1002</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9825547</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref43">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Poplin</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A universal SNP and small-indel variant caller using deep neural networks.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2018</year>;<volume>36</volume>(<issue>10</issue>):<fpage>983</fpage>&#x2013;<lpage>987</lpage>.
                    <pub-id pub-id-type="pmid">30247488</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.4235</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref44">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Qin</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations.</article-title>
                    <source>

                        <italic toggle="yes">Cell.</italic>
</source>
                    <year>2021</year>;<volume>184</volume>(<issue>13</issue>):<fpage>3542</fpage>&#x2013;<lpage>3558.e16</lpage>.
                    <pub-id pub-id-type="pmid">34051138</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cell.2021.04.046</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref45">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ranallo-Benavidez</surname>
                            <given-names>TR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Commun.</italic>
</source>
                    <year>2020</year>;<volume>11</volume>:<fpage>1432</fpage>.
                    <pub-id pub-id-type="pmid">32188846</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-020-14998-3</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7080791</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref46">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rhie</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Merqury: Reference-free quality assessment of genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>:<fpage>245</fpage>.
                    <pub-id pub-id-type="pmid">32928274</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-02134-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7488777</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref47">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>JT</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV).</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2023</year>;<volume>39</volume>(<issue>1</issue>):<fpage>btac830</fpage>.
                    <pub-id pub-id-type="pmid">36562559</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btac830</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9825295</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref48">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schreiber</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Plant pangenomes for crop improvement, biodiversity and evolution.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Genet.</italic>
</source>
                    <year>2024</year>;<volume>25</volume>:<fpage>563</fpage>&#x2013;<lpage>577</lpage>.
                    <pub-id pub-id-type="pmid">38378816</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41576-024-00691-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7616794</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref49">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shang</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A super pan-genomic landscape of rice.</article-title>
                    <source>

                        <italic toggle="yes">Cell Res.</italic>
</source>
                    <year>2022</year>;<volume>32</volume>:<fpage>878</fpage>&#x2013;<lpage>896</lpage>.
                    <pub-id pub-id-type="pmid">35821092</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41422-022-00685-z</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9525306</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref50">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shannon</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Cytoscape: A software environment for integrated models of biomolecular interaction networks.</article-title>
                    <source>

                        <italic toggle="yes">Genome Res.</italic>
</source>
                    <year>2003</year>;<volume>13</volume>(<issue>11</issue>):<fpage>2498</fpage>&#x2013;<lpage>2504</lpage>.
                    <pub-id pub-id-type="pmid">14597658</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.1239303</pub-id>
                    <pub-id pub-id-type="pmcid">PMC403769</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref51">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shumate</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Salzberg</surname>
                            <given-names>SL</given-names>
                        </name>
</person-group>:
                    <article-title>Liftoff: Accurate alignment-based annotation transfer in phylogenomics.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2021</year>;<volume>37</volume>(<issue>12</issue>):<fpage>1639</fpage>&#x2013;<lpage>1643</lpage>.
                    <pub-id pub-id-type="pmid">33320174</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa1016</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8289374</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref52">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smolka</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Detection of mosaic and population-level structural variants with Sniffles2.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2024</year>;<volume>42</volume>:<fpage>1571</fpage>&#x2013;<lpage>1580</lpage>.
                    <pub-id pub-id-type="pmid">38168980</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41587-023-02024-y</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11217151</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref53">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sommer</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>PSAURON: A tool for structural annotation quality assessment.</article-title>
                    <source>

                        <italic toggle="yes">NAR Genomics and Bioinformatics.</italic>
</source>
                    <year>2025</year>;<volume>7</volume>(<issue>1</issue>):<fpage>lqae189</fpage>.
                    <pub-id pub-id-type="pmid">39781514</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nargab/lqae189</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11704789</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref54">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Stiehler</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Helixer: Cross-species gene annotation with deep learning.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2020</year>;<volume>36</volume>(<issue>22-23</issue>):<fpage>5291</fpage>&#x2013;<lpage>5298</lpage>.
                    <pub-id pub-id-type="pmid">33325516</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa1044</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8016489</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref55">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tettelin</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae.</article-title>
                    <source>

                        <italic toggle="yes">Proc. Natl. Acad. Sci.</italic>
</source>
                    <year>2005</year>;<volume>102</volume>(<issue>39</issue>):<fpage>13950</fpage>&#x2013;<lpage>13955</lpage>.
                    <pub-id pub-id-type="pmid">16172379</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.0506758102</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1216834</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref56">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tonkin-Hill</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Panaroo: Pangenome analysis pipeline for microbial genomes.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>(<issue>1</issue>):<fpage>180</fpage>.
                    <pub-id pub-id-type="pmid">32698896</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-02090-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7376924</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref57">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ou</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Assessing genome assembly quality using the LTR Assembly Index (LAI).</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2018</year>;<volume>46</volume>(<issue>21</issue>):<fpage>e126</fpage>.
                    <pub-id pub-id-type="pmid">30107434</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gky730</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6265445</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref58">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wick</surname>
                            <given-names>RR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Bandage: Interactive visualization of de novo genome assemblies.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2015</year>;<volume>31</volume>(<issue>20</issue>):<fpage>3350</fpage>&#x2013;<lpage>3352</lpage>.
                    <pub-id pub-id-type="pmid">26099265</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv383</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4595904</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref59">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhong</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Revisiting the Central Dogma: the distinct roles of genome, methylation, transcription, and translation on protein expression in 
                        <italic toggle="yes">Arabidopsis thaliana.</italic>
                    </article-title>
                    <year>2025</year>.
                    <pub-id pub-id-type="doi">10.1101/2025.01.08.631880</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref60">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhou</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Graph pangenome captures missing heritability and empowers tomato breeding.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2022</year>;<volume>606</volume>:<fpage>527</fpage>&#x2013;<lpage>534</lpage>.
                    <pub-id pub-id-type="pmid">35676474</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41586-022-04808-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9200638</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
        <ref-list>
            <title>Additional Resources</title>
            <ref id="ref61">
                <mixed-citation publication-type="other">
                    <collab>AgBioData Nomenclature Working Group</collab>:
                    <article-title>GitHub repository.</article-title>
                    <year>2025</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/AgBioData/Pan-genomes">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref62">
                <mixed-citation publication-type="other">
                    <collab>ENA Metadata Standards</collab>:
                    <article-title>European Nucleotide Archive.</article-title>
                    <year>2025</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://ena-docs.readthedocs.io/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref63">
                <mixed-citation publication-type="other">
                    <collab>ENSEMBL</collab>:
                    <article-title>Genome data &amp; annotation.</article-title>
                    <year>2025</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://beta.ensembl.org/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref64">
                <mixed-citation publication-type="other">
                    <collab>FAIR Cookbook</collab>:
                    <article-title>ELIXIR Europe.</article-title>
                    <year>2025</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://faircookbook.elixir-europe.org">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref65">
                <mixed-citation publication-type="other">
                    <collab>INSDC</collab>:
                    <article-title>The International Nucleotide Sequence Database Collaboration.</article-title>
                    <year>2025</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.insdc.org/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref66">
                <mixed-citation publication-type="other">
                    <collab>Merqury Documentation</collab>:
                    <article-title>GitHub.</article-title>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/marbl/merqury">Reference Source</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report404154">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.183537.r404154</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Xie</surname>
                        <given-names>Xiaoming</given-names>
                    </name>
                    <xref ref-type="aff" rid="r404154a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7925-4964</uri>
                </contrib>
                <aff id="r404154a1">
                    <label>1</label>Wheat Genetics and Genomics Center, China Agricultural University College of Agronomy and Biotechnology (Ringgold ID: 200630), Beijing, Beijing, China</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>4</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Xie X</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport404154" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.166538.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This white paper by Heuermann 
                <italic>et al.</italic> presents a timely and comprehensive framework aiming to establish community-wide standards for plant pan-genome analysis. The authors cover critical aspects from nomenclature and quality control to data sharing and visualization. This work represents a valuable and necessary initiative to promote FAIR principles in a rapidly evolving field. However, several major revisions are required to enhance its practical utility, scientific rigor, and overall impact before it can be endorsed as a foundational guide for the community.</p>
            <p> Major comments</p>
            <p> 1. Lack of Practical Guidance and Actionable Workflows. While the paper provides an exhaustive list of tools and standards, it currently functions more as a catalogue than a practical guide. A researcher new to the field would struggle to navigate the options and select the most appropriate methodology for their specific project. To address this, the authors should incorporate one or more decision-tree figures or summary tables that guide users based on their specific research context (e.g., species ploidy, data type, biological question). Such a resource would transform this document from a simple list into an indispensable, actionable guide.</p>
            <p> 2. Understated Importance of the Generalized Feature Identifier (GFI) Concept. The proposal of a 'Generalized Feature Identifier' (GFI) in the 'Future Directions' section is a highly innovative and critical idea. It elegantly addresses a major bottleneck in functionally annotating the non-genic 'dark matter' of pan-genomes, which is often overlooked. However, its placement as a future thought diminishes its significance. This concept should be introduced much earlier in the manuscript, possibly in the nomenclature section, and framed as a core recommendation of this white paper to highlight its forward-thinking contribution.</p>
            <p> 3.&#x00a0;Imprecise Comparison of Linear- vs. Graph-Based Approaches. The distinction between linear- and graph-based pan-genomes in Section 5.1 is crucial, but the current description of linear approaches could be refined for greater accuracy and clarity. The manuscript conflates two distinct types of 'linear-based' analyses. It states that these approaches identify sequence-level variations (SNPs, indels) as well as gene-level presence/absence variations (PAVs). However, the tools cited as examples, such as OrthoFinder and Ensembl Compara, are primarily used for inferring orthology and identifying gene-level PAVs. They are not the primary tools for calling SNPs and small indels from whole-genome alignments (which typically involves a separate workflow with variant callers like GATK against a linear reference). This conflation creates an imprecise comparison with graph-based approaches, which are fundamentally designed to model sequence-level variation directly. To improve this section, we recommend the authors explicitly distinguish between: (a) Sequence-based linear analysis: Aligning multiple genomes to a single linear reference to call SNPs and indels. (b) Gene-based linear analysis: Using orthology inference tools on annotated gene sets to determine the pan-gene repertoire and gene-level PAVs. By separating these two concepts, the manuscript can provide a more accurate and nuanced comparison, highlighting how graph-based pan-genomes aim to integrate both types of variation in a way that requires distinct workflows in a traditional linear framework.</p>
            <p> Minor comments</p>
            <p> 1. The authors should adopt a more authoritative tone appropriate for a standards paper. Phrases like "probably more suited" (Section 2.3) should be replaced with definitive recommendations (e.g., "we recommend the use of...") to provide clear guidance.</p>
            <p> 2. In Section 4.3 (Metadata requirements), the list of mandatory fields should be expanded to include the specific versions of all software and pipelines used. This is essential for ensuring full reproducibility of the analyses.</p>
            <p> 3. The case studies in Section 6 are too brief to be impactful. Each case should be slightly expanded to illustrate how the application (or lack thereof) of the proposed standards directly impacted the project's outcomes, challenges, or successes. This would provide concrete evidence for the importance of the proposed standards.</p>
            <p> 4. Regarding the classification of genes into "core, shell, and cloud" (Section 2.3), it is crucial to state that the specific percentage thresholds used for these definitions must be explicitly reported in the metadata, as they can significantly impact downstream comparative analyses.</p>
            <p>Is the review written in accessible language?</p>
            <p>Partly</p>
            <p>Are all factual statements correct and adequately supported by citations?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn appropriate in the context of the current research literature?</p>
            <p>Yes</p>
            <p>Is the topic of the review discussed comprehensively in the context of the current literature?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>wheat pangenome, gene-based pangenome,&#x00a0;comparative genomics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment14928-404154">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Heuermann</surname>
                            <given-names>Marc Christian</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>11</month>
                    <year>2025</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Reviewer 2</bold>
                </p>
                <p> Xiaoming Xie (
                    <ext-link ext-link-type="uri" xlink:href="https://orcid.org/0000-0002-7925-4964">https://orcid.org/0000-0002-7925-4964</ext-link>), Wheat Genetics and Genomics Center, China Agricultural University College of Agronomy and Biotechnology (Ringgold ID: 200630), Beijing, Beijing, China</p>
                <p> Approved with Reservations</p>
                <p> This white paper by Heuermann 
                    <italic>et al.</italic> presents a timely and comprehensive framework aiming to establish community-wide standards for plant pan-genome analysis. The authors cover critical aspects from nomenclature and quality control to data sharing and visualization. This work represents a valuable and necessary initiative to promote FAIR principles in a rapidly evolving field. However, several major revisions are required to enhance its practical utility, scientific rigor, and overall impact before it can be endorsed as a foundational guide for the community.</p>
                <p> &#x00a0;Major comments</p>
                <p> &#x00a0;1. Lack of Practical Guidance and Actionable Workflows. While the paper provides an exhaustive list of tools and standards, it currently functions more as a catalogue than a practical guide. A researcher new to the field would struggle to navigate the options and select the most appropriate methodology for their specific project. To address this, the authors should incorporate one or more decision-tree figures or summary tables that guide users based on their specific research context (e.g., species ploidy, data type, biological question). Such a resource would transform this document from a simple list into an indispensable, actionable guide.</p>
                <p> 
                    <bold>Answer: Thank you for your insightful comments. Both reviewers raised this point, prompting us to create Figure 1, which clearly illustrates how each tool fits into the pan-genome analysis workflow. We have now cross-referenced the figure in both Section 3 and Section 5.</bold>
                </p>
                <p> &#x00a0;2. Understated Importance of the Generalized Feature Identifier (GFI) Concept. The proposal of a 'Generalized Feature Identifier' (GFI) in the 'Future Directions' section is a highly innovative and critical idea. It elegantly addresses a major bottleneck in functionally annotating the non-genic 'dark matter' of pan-genomes, which is often overlooked. However, its placement as a future thought diminishes its significance. This concept should be introduced much earlier in the manuscript, possibly in the nomenclature section, and framed as a core recommendation of this white paper to highlight its forward-thinking contribution.</p>
                <p> 
                    <bold>Answer: We agree with this assessment and have now positioned the GFI concept as Section 2.4.</bold>
                </p>
                <p> &#x00a0;3. Imprecise Comparison of Linear- vs. Graph-Based Approaches. The distinction between linear- and graph-based pan-genomes in Section 5.1 is crucial, but the current description of linear approaches could be refined for greater accuracy and clarity. The manuscript conflates two distinct types of 'linear-based' analyses. It states that these approaches identify sequence-level variations (SNPs, indels) as well as gene-level presence/absence variations (PAVs). However, the tools cited as examples, such as OrthoFinder and Ensembl Compara, are primarily used for inferring orthology and identifying gene-level PAVs. They are not the primary tools for calling SNPs and small indels from whole-genome alignments (which typically involves a separate workflow with variant callers like GATK against a linear reference). This conflation creates an imprecise comparison with graph-based approaches, which are fundamentally designed to model sequence-level variation directly. To improve this section, we recommend the authors explicitly distinguish between: (a) Sequence-based linear analysis: Aligning multiple genomes to a single linear reference to call SNPs and indels. (b) Gene-based linear analysis: Using orthology inference tools on annotated gene sets to determine the pan-gene repertoire and gene-level PAVs. By separating these two concepts, the manuscript can provide a more accurate and nuanced comparison, highlighting how graph-based pan-genomes aim to integrate both types of variation in a way that requires distinct workflows in a traditional linear framework.</p>
                <p> 
                    <bold>Answer: Thank you for this specific and helpful comment. We have reworked and updated Section 5 accordingly.</bold>
                </p>
                <p> &#x00a0;Minor comments</p>
                <p> &#x00a0;1. The authors should adopt a more authoritative tone appropriate for a standards paper. Phrases like "probably more suited" (Section 2.3) should be replaced with definitive recommendations (e.g., "we recommend the use of...") to provide clear guidance.</p>
                <p> 
                    <bold>Answer: Thank you for the suggestion. We have rephrased the paragraph.</bold>
                </p>
                <p> &#x00a0;2. In Section 4.3 (Metadata requirements), the list of mandatory fields should be expanded to include the specific versions of all software and pipelines used. This is essential for ensuring full reproducibility of the analyses.</p>
                <p> 
                    <bold>Answer: Thank you. We have addressed this important point and extended the paragraph to elaborate on the importance of metadata requirements.</bold>
                </p>
                <p> &#x00a0;3. The case studies in Section 6 are too brief to be impactful. Each case should be slightly expanded to illustrate how the application (or lack thereof) of the proposed standards directly impacted the project's outcomes, challenges, or successes. This would provide concrete evidence for the importance of the proposed standards.</p>
                <p> 
                    <bold>Answer: Agreed. We have updated the paragraph and extended the description of the case studies and their relevance.</bold>
                </p>
                <p> &#x00a0;4. Regarding the classification of genes into "core, shell, and cloud" (Section 2.3), it is crucial to state that the specific percentage thresholds used for these definitions must be explicitly reported in the metadata, as they can significantly impact downstream comparative analyses.</p>
                <p> 
                    <bold>Answer: We added a sentence to be more precise regarding this point.</bold> 
                    <list list-type="bullet">
                        <list-item>
                            <p>Is the topic of the review discussed comprehensively in the context of the current literature?</p>
                        </list-item>
                    </list> Yes 
                    <list list-type="bullet">
                        <list-item>
                            <p>Are all factual statements correct and adequately supported by citations?</p>
                        </list-item>
                    </list> Yes 
                    <list list-type="bullet">
                        <list-item>
                            <p>Is the review written in accessible language?</p>
                        </list-item>
                    </list> Partly 
                    <list list-type="bullet">
                        <list-item>
                            <p>Are the conclusions drawn appropriate in the context of the current research literature?</p>
                        </list-item>
                    </list> Yes</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report404153">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.183537.r404153</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Sahu</surname>
                        <given-names>Sunil Kumar</given-names>
                    </name>
                    <xref ref-type="aff" rid="r404153a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4742-9870</uri>
                </contrib>
                <aff id="r404153a1">
                    <label>1</label>State Key Laboratory of&#x00a0;Genome&#x00a0;and Multi-omics Technologies, BGI Research, Shenzhen, China</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Sahu SK</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport404153" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.166538.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This article presents a very comprehensive and thoughtful set of recommendations, covering a wide spectrum from naming conventions to quality control and data sharing. I enjoyed reading it and have a few suggestions that I believe could strengthen its impact and practicality for the community.</p>
            <p> My main thought is that the sheer breadth of recommendations, while excellent, might feel daunting for some labs, especially those with limited resources. To make the framework more accessible, it would be incredibly helpful if the authors could more clearly distinguish between what they consider essential "minimum standards" and what are "aspirational best practices." For instance, while the framework is well-described, I found myself wishing for a more concrete, practical roadmap. The QC section, for example, lists many excellent tools (FastQC, GenomeScope, QUAST, etc.), but a visual workflow diagram would be immensely valuable. A figure illustrating the step-by-step process from raw data QC, through assembly and annotation QC, to pan-genome QC would really help readers visualize how to integrate these tools into their own standardized processes.</p>
            <p> Finally, on the topic of quality control, the article does a great job listing the available tools but could go further in guiding users on how to apply them. For example, some guidance on tool selection would be useful, such as which aspects of QUAST or CRAQ are best for evaluating polyploid assemblies. It would also be helpful to address how to interpret conflicting results from different tools or databases. Furthermore, in the pan-genome section, the concept of "saturation analysis" is mentioned. It would strengthen this part to include some discussion on the sample size required to confidently claim saturation, particularly for species with different ploidy levels.</p>
            <p> These are all meant as constructive feedback to enhance what is already a very valuable and needed framework. I hope my comments are helpful.</p>
            <p>Is the review written in accessible language?</p>
            <p>Yes</p>
            <p>Are all factual statements correct and adequately supported by citations?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn appropriate in the context of the current research literature?</p>
            <p>Yes</p>
            <p>Is the topic of the review discussed comprehensively in the context of the current literature?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Plant genomics and evolution</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment14927-404153">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Heuermann</surname>
                            <given-names>Marc Christian</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>11</month>
                    <year>2025</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Reviewer 1</bold>
                </p>
                <p> Sunil Kumar Sahu (
                    <ext-link ext-link-type="uri" xlink:href="https://orcid.org/0000-0002-4742-9870">https://orcid.org/0000-0002-4742-9870</ext-link>), State Key Laboratory of Genome and Multi-omics Technologies, BGI Research, Shenzhen, China</p>
                <p>Approved with Reservations</p>
                <p>This article presents a very comprehensive and thoughtful set of recommendations, covering a wide spectrum from naming conventions to quality control and data sharing. I enjoyed reading it and have a few suggestions that I believe could strengthen its impact and practicality for the community.</p>
                <p>My main thought is that the sheer breadth of recommendations, while excellent, might feel daunting for some labs, especially those with limited resources. To make the framework more accessible, it would be incredibly helpful if the authors could more clearly distinguish between what they consider essential "minimum standards" and what are "aspirational best practices." For instance, while the framework is well-described, I found myself wishing for a more concrete, practical roadmap. The QC section, for example, lists many excellent tools (FastQC, GenomeScope, QUAST, etc.), but a visual workflow diagram would be immensely valuable. A figure illustrating the step-by-step process from raw data QC, through assembly and annotation QC, to pan-genome QC would really help readers visualize how to integrate these tools into their own standardized processes.</p>
                <p> 
                    <bold>Answer: Thank you for your constructive feedback. We have added Figure 1 to the manuscript to clearly illustrate the recommended tools for each step of the pan-genome analysis workflow. The figure is now cross-referenced in both Section 3 and Section 5 for better integration with the discussed content.</bold>
                </p>
                <p>Finally, on the topic of quality control, the article does a great job listing the available tools but could go further in guiding users on how to apply them. For example, some guidance on tool selection would be useful, such as which aspects of QUAST or CRAQ are best for evaluating polyploid assemblies. It would also be helpful to address how to interpret conflicting results from different tools or databases. Furthermore, in the pan-genome section, the concept of "saturation analysis" is mentioned. It would strengthen this part to include some discussion on the sample size required to confidently claim saturation, particularly for species with different ploidy levels.</p>
                <p>These are all meant as constructive feedback to enhance what is already a very valuable and needed framework. I hope my comments are helpful.</p>
                <p> 
                    <bold>Answer: We have revised Section 3, "Quality Control (QC) Standards", to fully incorporate your suggestions.</bold> 
                    <list list-type="bullet">
                        <list-item>
                            <p>Is the topic of the review discussed comprehensively in the context of the current literature? Yes</p>
                        </list-item>
                        <list-item>
                            <p>Are all factual statements correct and adequately supported by citations? Yes</p>
                        </list-item>
                        <list-item>
                            <p>Is the review written in accessible language? Yes</p>
                        </list-item>
                        <list-item>
                            <p>Are the conclusions drawn appropriate in the context of the current research literature? Yes</p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
</article>
