<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.163833.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>LitSieve: An integrated literature search and triage tool for biocuration</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jeffryes</surname>
                        <given-names>Matt</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Harrison</surname>
                        <given-names>Melissa</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3523-4408</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Hermjakob</surname>
                        <given-names>Henning</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>McEntyre</surname>
                        <given-names>Johanna</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:mharrison@ebi.ac.uk">mharrison@ebi.ac.uk</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>11</day>
                <month>7</month>
                <year>2025</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2025</year>
            </pub-date>
            <volume>14</volume>
            <elocation-id>685</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>25</day>
                    <month>6</month>
                    <year>2025</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Jeffryes M et al.</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/14-685/pdf"/>
            <abstract>
                <p>Biomedical databases are an important part of the scientific infrastructure for organising and synergising research outputs. Many of these databases abstract content from the rapidly expanding scientific literature. Therefore, database curators require effective literature search methods in order to capture research relevant to their domain.</p>
                <p>This article describes LitSieve, a literature search tool with filtering based on text-mined annotations, and flexible article organisation features. It allows users to define filters based on biomedical entities like genes, diseases and species to include or exclude particular articles within their results. By combining a search query with a filter, curators are able to identify articles relevant to the database they are curating. LitSieve uses APIs provided by Europe PMC, from which abstracts, article full text and text-mined annotations are drawn.</p>
                <p>LitSieve is available at 
                    <uri xlink:href="https://www.ebi.ac.uk/europepmc/litsieve/">https://www.ebi.ac.uk/europepmc/litsieve/</uri>.
                </p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>biocuration</kwd>
                <kwd>information retrieval</kwd>
                <kwd>text mining</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="https://doi.org/10.13039/100013060">
                    <funding-source>European Molecular Biology Laboratory</funding-source>
                </award-group>
                <award-group id="fund-2" xlink:href="https://doi.org/10.13039/100018694">
                    <funding-source>HORIZON EUROPE Marie Sklodowska-Curie Actions</funding-source>
                    <award-id>945405</award-id>
                </award-group>
                <funding-statement>MJ has received funding from the European Union&#x2019;s Horizon 2020 research and innovation programme under the Marie Sk&#x0142;odowska-Curie grant agreement No 945405. </funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>Biomedical databases have become critical infrastructure supporting life science research. Biologists and bioinformaticians depend on databases to interpret their results.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>,
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup> Many important databases depend upon curation of the scientific literature in order to identify and extract relevant information into a structured format. When curating the literature, domain expert biocurators search and sort through scientific articles, and read those that appear relevant to their databases, focussing on the specific facts that they wish to capture.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup> Examples of such facts include a report that a particular pair of proteins interacts, or an association between a gene and a disease.</p>
            <p>Biocurators may use biomedical literature databases to identify &#x2018;curatable&#x2019; literature. In this work we describe LitSieve, a system building on the Europe PMC database to provide literature filtering and organisational functions designed to assist with biocuration workflows.
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> While there is a variety of literature search and organisation software already available, LitSieve provides a unique ability to filter based on a wide range of text-mined annotations.</p>
            <sec id="sec2">
                <title>Europe PMC</title>
                <p>Europe PMC is a comprehensive database of life science literature. It contains abstracts from PubMed and Agricola, full text articles from PubMed Central and content from 35 preprint servers relevant to the life sciences, including bioRxiv and medRxiv. The database contains over 45 million articles, of which 10 million have searchable full text. Literature in Europe PMC is enriched by over 2 billion text-mined annotations. Annotations are references to biomedical entities or concepts such as gene/protein names and diseases, extracted from the literature using a variety of methods. In total there are 43 different categories of annotation. Each entity is normalised to an entry in a reference database. For example, species names are normalised to the NCBI taxonomy. These annotations are made available via a public REST API.
                    <sup>
                        <xref ref-type="bibr" rid="ref3">3</xref>
                    </sup>
                </p>
            </sec>
            <sec id="sec3">
                <title>Biocuration tools</title>
                <p>A number of tools to assist biocurators have been developed. PubTator permits users to search based on six types of text-mined &#x2018;bioentities&#x2019;&#x2013;genes, diseases, chemicals, single nucleotide polymorphisms (SNPs), species and cell lines&#x2013;against the PubMed and PubMed Central databases.
                    <sup>
                        <xref ref-type="bibr" rid="ref4">4</xref>
                    </sup> Users are further able to search based on 12 types of text-mined interactions between bioentities, for example drug interactions between two chemicals or causation between a SNP and a disease. It also allows users to gather articles into user-defined &#x2018;collections&#x2019;.</p>
                <p>LitSuggest uses machine learning (ML) to suggest articles similar to those selected by the user.
                    <sup>
                        <xref ref-type="bibr" rid="ref7">5</xref>
                    </sup> Articles identified by the trained model can then be marked as relevant or irrelevant to further refine the model. In the context of curation, this permits biocurators to submit a list of articles they have already curated and ideally find further &#x2018;curatable&#x2019; articles.</p>
                <p>Tools using large language models (LLMs) to search and summarise literature have also emerged;
                    <sup>
                        <xref ref-type="bibr" rid="ref5">6</xref>
                    </sup> however, these tools have yet to be comprehensively assessed in the context of biocuration. Given the impact of LLMs like ChatGPT on the wider technology landscape, it seems inevitable that biocurators will use LLM-based tools. However, their propensity for factual errors remains an open problem and presents a challenge to deploying them on biocuration tasks, where statements must be reliably attributed.
                    <sup>
                        <xref ref-type="bibr" rid="ref6">7</xref>
                    </sup>
                </p>
            </sec>
            <sec id="sec4">
                <title>LitSieve development process</title>
                <p>Development of LitSieve began with the goal of providing an interface to Europe PMC with improved utility for biocurators. The initial concept was that curators may prefer not to use certain ML-based suggestion or recommendation systems because of their &#x2018;black box&#x2019; nature.
                    <sup>
                        <xref ref-type="bibr" rid="ref8">8</xref>
                    </sup> An internal survey of biocurators was conducted to understand their usage of literature search tools and the types of literature they were interested in. Possible features were discussed, and curators were observed while completing tasks. As development progressed, feedback from biocurators was incorporated into the prerelease versions at each stage.</p>
                <p>The LitSieve system is based upon retrieval using a user-specified search query, the results of which are filtered as chosen by the user. This concept prioritises the explainability of the results, since it is clear to users exactly why a particular article has been included in or excluded from their search results: only articles retrieved by the Boolean search query are candidates and, of those, only articles that match all the filters are displayed. Therefore, the reason for the inclusion or exclusion of a particular article is always transparent (see 
                    <xref ref-type="fig" rid="f1">Figure 1c</xref> for an illustration of a matching search result).</p>
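                <p>The retrieve-then-filter logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the LitSieve implementation: the <monospace>Article</monospace> class, the <monospace>apply_filters</monospace> function, the filter names and the identifier strings are all hypothetical. An article is shown only if it passes every filter, and for each hidden article the filters that rejected it are recorded, which is what makes each exclusion explainable.</p>
                <code language="python"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical search result: an ID plus its annotation identifiers."""
    article_id: str
    annotation_ids: set = field(default_factory=set)


def apply_filters(results, filters):
    """Keep articles that pass every filter; record why the others were hidden.

    `filters` is a list of (name, predicate) pairs. Both the representation
    and the names are assumptions made for this sketch.
    """
    shown = []
    rejected = {}
    for article in results:
        failed = [name for name, passes in filters if not passes(article)]
        if failed:
            rejected[article.article_id] = failed
        else:
            shown.append(article)
    return shown, rejected


# Illustrative data; the identifier format is invented for this example.
results = [
    Article("A1", {"taxon:10090"}),
    Article("A2", {"taxon:9606"}),
]
filters = [
    ("include mouse", lambda a: "taxon:10090" in a.annotation_ids),
    ("exclude human", lambda a: "taxon:9606" not in a.annotation_ids),
]

shown, rejected = apply_filters(results, filters)
print([a.article_id for a in shown])  # ['A1']
print(rejected)  # {'A2': ['include mouse', 'exclude human']}
```
]]></code>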
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1.</label>
                    <caption>
                        <title>The LitSieve workflow.</title>
                        <p>A literature search is performed (a), the results optionally filtered (b), and then the literature retrieved (c) which can be read and annotated (d) according to the requirements of the user.</p>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/180243/e19d146d-b71f-4646-a135-3c7f50147804_figure1.gif"/>
                </fig>
                <p>Although ML is used to identify many of the text-mined annotations used to filter, this approach narrows the &#x2018;black box&#x2019; portion of the retrieval system, making it smaller and more comprehensible. This modular, filter-based concept also enables additional filters to be developed and added within the same architecture.</p>
            </sec>
        </sec>
        <sec id="sec5">
            <title>System overview</title>
            <p>LitSieve is a literature search and organisation tool designed for biocuration. It permits users to perform a standard literature search and then filter the results based upon text-mined annotations. Filters can be applied to any annotation category, restricted to a section of the article, and combined, which accommodates a wide variety of use cases. An overview of the process of using LitSieve is shown in 
                <xref ref-type="fig" rid="f1">Figure 1</xref>.</p>
            <p>LitSieve builds upon Europe PMC&#x2019;s public articles and annotations APIs and is implemented using the 
                <italic toggle="yes">Vue</italic> JavaScript framework. Searches are configured using a form, and user-selected parameters define which articles are fetched from Europe PMC. These results may then be filtered according to the text-mined annotations found in the articles. Any of the 43 categories of annotation in Europe PMC can be used for filtering. Users can filter articles according to the presence or absence of a specific annotation. Three types of filter are available (include, exclude, ignore), listed in 
                <xref ref-type="table" rid="T1">Table 1</xref> and illustrated in 
                <xref ref-type="fig" rid="f2">Figure 2</xref>. Annotations are fetched from Europe PMC, and then used to filter the articles client-side. The basic search, filtering and reading functions can be used by anonymous users. Saving lists, highlights and notes requires users to register, either with an email address or by using ORCID login.</p>
            <table-wrap id="T1" orientation="portrait" position="float">
                <label>Table 1.</label>
                <caption>
                    <title>The three filter types.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Filter type</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Action</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">include</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Only show search results that have an annotation mapped to a specified identifier</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">exclude</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Only show search results that do not have an annotation mapped to a specified identifier</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">ignore</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Only show search results that have an annotation of a specified type that is not among the specified identifiers</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
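            <p>The three filter types in Table 1 can be expressed as set operations. The sketch below is an illustration only: the function name <monospace>passes</monospace> and the four example documents are hypothetical, and it assumes that an &#x2018;include&#x2019; filter is satisfied when any one of the specified identifiers is present, which the table does not state explicitly. The identifiers are NCBI taxonomy IDs.</p>
            <code language="python"><![CDATA[
```python
def passes(filter_type, specified, found):
    """Decide whether an article passes a single filter.

    specified: identifiers chosen by the user
    found: identifiers of the article's annotations in the filtered category
    """
    specified, found = set(specified), set(found)
    if filter_type == "include":
        # Assumption: matching any one specified identifier is sufficient.
        return bool(found & specified)
    if filter_type == "exclude":
        # None of the specified identifiers may be present.
        return not (found & specified)
    if filter_type == "ignore":
        # At least one annotation of this category lies outside the list.
        return bool(found - specified)
    raise ValueError(f"unknown filter type: {filter_type}")


# NCBI taxonomy IDs: 9606 human, 10090 mouse, 10116 rat, 7227 fruit fly.
specified = {"9606", "10090", "10116"}

# Hypothetical documents, described by their species annotations.
documents = {
    "only listed taxa": {"9606", "10090"},
    "listed and unlisted taxa": {"9606", "7227"},
    "only unlisted taxa": {"7227"},
    "no species annotations": set(),
}

for label, found in documents.items():
    verdicts = {t: passes(t, specified, found) for t in ("include", "exclude", "ignore")}
    print(label, verdicts)
```
]]></code>
            <p>Under these assumptions, a document with no annotations of the filtered category passes only the &#x2018;exclude&#x2019; filter, because the &#x2018;ignore&#x2019; filter requires at least one annotation that is not in the list.</p>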
            <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                <label>Figure 2.</label>
                <caption>
                    <title>An illustration of the filter types.</title>
                    <p>Three taxa are specified for filtering at the top. In the left column, 4 documents are shown. In reality, the filter would be applied using the entire document, or a specified section, but in this case a short fragment is used for illustrative purposes. In the fragments, all mentions of a species are underlined, and the specified species are highlighted. In the top row, each filter type is listed. Below the filter types it is indicated whether a filter of the corresponding type, with the three specified taxa, would result in the document being included in the search results or not.</p>
                </caption>
                <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/180243/e19d146d-b71f-4646-a135-3c7f50147804_figure2.gif"/>
            </fig>
            <p>The filter may be restricted to a specific section of the article (for example, finding only articles that have a &#x2018;mouse&#x2019; annotation within their Methods section). Lists of identifiers may be saved for convenience, for example, if a curator has a list of diseases of interest that they wish to use as a filter on many searches.</p>
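            <p>Section-restricted filtering can be pictured as narrowing the set of annotations before a filter is evaluated. The following sketch assumes that each annotation records the article section in which it was found; the <monospace>Annotation</monospace> type, the <monospace>identifiers_in</monospace> helper and the section labels are hypothetical and do not reflect the actual data model.</p>
            <code language="python"><![CDATA[
```python
from typing import NamedTuple


class Annotation(NamedTuple):
    identifier: str  # normalised identifier, here an NCBI taxonomy ID
    section: str     # article section in which the mention occurs


def identifiers_in(annotations, section=None):
    """Return annotation identifiers, optionally limited to one section."""
    return {
        a.identifier
        for a in annotations
        if section is None or a.section == section
    }


# Hypothetical article: mouse is mentioned only in the Introduction.
annotations = [
    Annotation("10090", "Introduction"),  # mouse
    Annotation("9606", "Methods"),        # human
]

MOUSE = "10090"
print(MOUSE in identifiers_in(annotations))                     # True
print(MOUSE in identifiers_in(annotations, section="Methods"))  # False
```
]]></code>
            <p>In this example the article would pass an unrestricted &#x2018;mouse&#x2019; include filter but not one restricted to the Methods section.</p>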
            <p>For convenience, several types of annotation can be filtered using an integrated auto-complete interface. Species and other taxonomic ranks can be retrieved from the NCBI taxonomy 
                <xref ref-type="fn" rid="fn1">[1]</xref>, gene and protein names can be retrieved from UniProt, and terms from the Gene Ontology, Uberon, Experimental Factor Ontology, and ChEBI can be retrieved from the Ontology Lookup Service.
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>,
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup>
            </p>
            <p>Articles found using LitSieve can be saved to lists. This accommodates a triage workflow where users can flag literature as either curatable or non-curatable, but users may organise lists as they wish, and no specific workflow is imposed. Articles may be added to or removed from lists directly from the search result page, or from the reading view. This permits, for example, a curator to remove an article from their &#x2018;triage&#x2019; list after having read it and found it to be non-curatable. The &#x201c;quick lists&#x201d; feature allows users to assign an icon and colour to a particular list, which permits easy visual identification of list membership in the search results page. This allows a curator to identify, for example, articles they have already triaged.</p>
            <p>In the reader view, users may highlight and add private notes to articles (see 
                <xref ref-type="fig" rid="f1">Figure 1d</xref>). Biocurators may use this to highlight curatable passages from the article or other pertinent details such as cell lines used in experiments.</p>
            <p>Users may recall and reorder saved articles from a list management view. A list of all articles to which notes or highlights have been added is also available. Lists may be used to organise or prioritise articles for curation, or to save a group of related articles.</p>
        </sec>
        <sec id="sec6">
            <title>Use cases</title>
            <sec id="sec7">
                <title>IntAct</title>
                <p>IntAct is a molecular interaction database.
                    <sup>
                        <xref ref-type="bibr" rid="ref11">11</xref>
                    </sup> It is essentially a graph of interacting molecules, with the vertices being biologically active molecules like proteins, and the edges denoting some kind of interaction between a pair of them. IntAct is manually curated; every interaction has been captured by a biocurator. This is a time-intensive process and, given the available resources, prioritisation is necessary because not every published interaction can be incorporated into the database. As a strategic goal, IntAct has prioritised adding new molecules to the database (increasing the number of vertices) over adding edges between molecules already in the database, favouring coverage over additional evidence for known relationships. Therefore, it is desirable to find literature that discusses protein&#x2013;protein interactions where at least one of the proteins is not yet listed in IntAct.</p>
                <p>LitSieve enables the IntAct biocurators to filter out articles that will not add new molecules to the database. After performing a literature search, an &#x2018;ignore&#x2019; filter that lists UniProt identifiers for all proteins already present in IntAct can be applied. This removes any article that does not mention at least one protein absent from the user-specified list. That is, only articles mentioning proteins new to IntAct will be shown in the result list. While this does not guarantee that the article will discuss a curatable protein interaction, it will filter out articles which certainly do not increase the number of proteins covered by IntAct. In this way, LitSieve enables IntAct curators to perform literature searches constructed using their experience while benefiting from the text-mined annotations in Europe PMC to speed up their triage of the results. A step-by-step illustration of this workflow is available at 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.15682791">10.5281/zenodo.15682791</ext-link>.</p>
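                <p>As a worked illustration of this triage step, the sketch below applies the &#x2018;ignore&#x2019; logic to a small result list and reports, for each retained article, which accessions are not in the known set. Everything here is hypothetical: the accessions are placeholders, the contents of the <monospace>in_intact</monospace> set are invented, and reporting the novel accessions is an addition for clarity rather than a described LitSieve feature.</p>
                <code language="python"><![CDATA[
```python
# Placeholder accessions; which proteins IntAct holds is invented here.
in_intact = {"P11111", "P22222", "P33333"}

# Hypothetical search results: article ID -> accessions annotated in it.
search_results = {
    "article-1": {"P11111", "P22222"},  # only known proteins
    "article-2": {"P11111", "Q44444"},  # one protein not in the known set
    "article-3": set(),                 # no protein annotations
}


def triage(search_results, known):
    """Keep articles mentioning at least one accession outside `known`.

    Returns a mapping from retained article ID to its novel accessions.
    """
    retained = {}
    for article_id, accessions in search_results.items():
        novel = accessions - known
        if novel:
            retained[article_id] = sorted(novel)
    return retained


print(triage(search_results, in_intact))  # {'article-2': ['Q44444']}
```
]]></code>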
            </sec>
            <sec id="sec8">
                <title>UniProt</title>
                <p>UniProt is a data resource for protein sequence and functional information. One component of UniProt is the Swiss-Prot subset of the UniProt knowledgebase (UniProtKB/Swiss-Prot). This is a curated resource summarising experimental and computationally predicted functional information selected and reviewed by an expert biocurator. In order to carry out this work, UniProt biocurators search for, and read, literature related to the proteins whose records they are tasked with creating and updating.</p>
                <p>LitSieve has been used to curate proteins related to antimicrobial resistance into UniProt. The ability to filter search results based on species is beneficial during triage to sift out articles not related to the entry being curated. Since a single species may be referred to by multiple different names (for example, mouse, mice, 
                    <italic toggle="yes">M. musculus</italic>, 
                    <italic toggle="yes">Mus musculus</italic>), filtering based on concept rather than exact text matches can save time and effort during the triage process.</p>
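                <p>The difference between exact text matching and concept-based filtering can be shown with a small example. The mentions below are hypothetical text-mined annotations, each pairing the surface text with a normalised NCBI taxonomy identifier (10090 for mouse, 7955 for zebrafish); the function names are likewise illustrative.</p>
                <code language="python"><![CDATA[
```python
# Hypothetical text-mined mentions: (surface text, NCBI taxonomy ID).
mentions = [
    ("mice", "10090"),
    ("M. musculus", "10090"),
    ("zebrafish", "7955"),
]


def matches_text(mentions, term):
    """True if any mention's surface text equals the search term."""
    return any(text.lower() == term.lower() for text, _ in mentions)


def matches_concept(mentions, taxon_id):
    """True if any mention is normalised to the given identifier."""
    return any(identifier == taxon_id for _, identifier in mentions)


print(matches_text(mentions, "mouse"))     # False: the word never appears
print(matches_concept(mentions, "10090"))  # True: both synonyms map to mouse
```
]]></code>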
            </sec>
        </sec>
        <sec id="sec9" sec-type="conclusion">
            <title>Conclusion</title>
            <p>LitSieve allows biocurators to combine their literature search expertise with filters based on text-mined annotations. This transparent and reproducible approach to literature discovery allows biocurators and other users to understand why a particular article has or has not been captured by their query. The flexible filter architecture permits use cases that we have not yet anticipated. Based on Europe PMC, LitSieve benefits from daily literature updates and can search across over 31 million abstracts and over 10 million full text articles. Filtering can be performed using 2 billion text-mined annotations in 43 categories. A variety of other tools are available for biocuration literature search; however, to our knowledge, no others are able to search based on this many types of annotation.</p>
            <p>LitSieve provides an integrated interface for organising and prioritising literature. We anticipate that by integrating biocuration related features into a single application, biocuration workflows can be made more efficient.</p>
        </sec>
        <sec id="sec10">
            <title>Software availability</title>
            <p>LitSieve is available publicly at 
                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/europepmc/litsieve/">https://www.ebi.ac.uk/europepmc/litsieve/</ext-link>.</p>
            <p>Source code is available in two repositories under an MIT licence. The front-end is available at 
                <ext-link ext-link-type="uri" xlink:href="https://gitlab.ebi.ac.uk/mjj/biocuration-toolbox">https://gitlab.ebi.ac.uk/mjj/biocuration-toolbox</ext-link>, and the back-end is available at 
                <ext-link ext-link-type="uri" xlink:href="https://gitlab.ebi.ac.uk/mjj/litsieve-backend">https://gitlab.ebi.ac.uk/mjj/litsieve-backend</ext-link>. An archived copy of these repositories at time of submission has been deposited in Zenodo: 
                <ext-link ext-link-type="uri" xlink:href="https://dx.doi.org/10.5281/zenodo.15480211">https://dx.doi.org/10.5281/zenodo.15480211</ext-link>.</p>
        </sec>
        <sec id="sec11">
            <title>Author contributions</title>
            <p>All the authors contributed to conceptualisation and determining the methodology. HH, MH and JM provided supervision. MJ was responsible for software development, and for drafting the original manuscript. All authors contributed to review and editing.</p>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We thank Islam Hassan, Mohamed Selim, and Jagadeeswararao Poluru for software engineering, and Kalpana Panneerselvam, Paul Denny and other users for testing and feedback. This work was supported by the European Molecular Biology Laboratory (EMBL).</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <collab>International Society for Biocuration</collab>:
                    <article-title>Biocuration: Distilling data into knowledge.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Biol.</italic>
</source>
                    <year>2018 Apr 16</year>;<volume>16</volume>(<issue>4</issue>):<fpage>e2002846</fpage>.
                    <pub-id pub-id-type="pmid">29659566</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pbio.2002846</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5919672</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hirschman</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Berardini</surname>
                            <given-names>TZ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Drabkin</surname>
                            <given-names>HJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A MOD(ern) perspective on literature curation.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Genet. Genomics.</italic>
</source>
                    <year>2010 May</year>;<volume>283</volume>(<issue>5</issue>):<fpage>415</fpage>&#x2013;<lpage>425</lpage>.
                    <pub-id pub-id-type="pmid">20221640</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s00438-010-0525-8</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2854346</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rosonovski</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Levchenko</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bhatnagar</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Europe PMC in 2023.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2024 Jan 5</year>;<volume>52</volume>(<issue>D1</issue>):<fpage>D1668</fpage>&#x2013;<lpage>D1676</lpage>.
                    <pub-id pub-id-type="pmid">37994696</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkad1085</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10767826</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wei</surname>
                            <given-names>C-H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Allot</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lai</surname>
                            <given-names>P-T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>PubTator 3.0: an AI-powered Literature Resource for Unlocking Biomedical Knowledge.</article-title>
                    <source>

                        <italic toggle="yes">arXiv.</italic>
</source>
                    <year>2024 Jan 19</year>;<volume>52</volume>:<fpage>W540</fpage>&#x2013;<lpage>W546</lpage>.
                    <pub-id pub-id-type="pmid">39314498</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkae235</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Allot</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>Q</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>LitSuggest: a web-based system for literature recommendation and curation using machine learning.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2021 Jul 2</year>;<volume>49</volume>(<issue>W1</issue>):<fpage>W352</fpage>&#x2013;<lpage>W358</lpage>.
                    <pub-id pub-id-type="pmid">33950204</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkab326</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8262723</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jin</surname>
                            <given-names>Q</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Leaman</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lu</surname>
                            <given-names>Z</given-names>
                        </name>
</person-group>:
                    <article-title>PubMed and beyond: biomedical literature search in the age of artificial intelligence.</article-title>
                    <source>

                        <italic toggle="yes">EBioMedicine.</italic>
</source>
                    <year>2024 Feb 1</year>;<volume>100</volume>:<fpage>104988</fpage>.
                    <pub-id pub-id-type="pmid">38306900</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.ebiom.2024.104988</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10850402</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wynter</surname>
                            <given-names>A</given-names>
                            <prefix>de</prefix>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sokolov</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>An evaluation on large language model outputs: Discourse and memorization.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Lang. Proc. J.</italic>
</source>
                    <year>2023 Sep</year>;<volume>4</volume>:<fpage>100024</fpage>.
                    <pub-id pub-id-type="doi">10.1016/j.nlp.2023.100024</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Holzinger</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Langs</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Denk</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Causability and explainability of artificial intelligence in medicine.</article-title>
                    <source>

                        <italic toggle="yes">Wiley Interdiscip. Rev. Data Min. Knowl. Discov.</italic>
</source>
                    <year>2019 Apr 2</year>;<volume>9</volume>(<issue>4</issue>):<fpage>e1312</fpage>.
                    <pub-id pub-id-type="pmid">32089788</pub-id>
                    <pub-id pub-id-type="doi">10.1002/widm.1312</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7017860</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <collab>UniProt Consortium</collab>:
                    <article-title>UniProt: the universal protein knowledgebase in 2023.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2023 Jan 6</year>;<volume>51</volume>(<issue>D1</issue>):<fpage>D523</fpage>&#x2013;<lpage>D531</lpage>.
                    <pub-id pub-id-type="pmid">36408920</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkac1052</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9825514</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>C&#x00f4;t&#x00e9;</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Reisinger</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Martens</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The Ontology Lookup Service: bigger and better.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2010 Jul</year>;<volume>38</volume>(<issue>Web Server issue</issue>):<fpage>W155</fpage>&#x2013;<lpage>W160</lpage>.
                    <pub-id pub-id-type="pmid">20460452</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkq331</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2896109</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Del Toro</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shrivastava</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ragueneau</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The IntAct database: efficient access to fine-grained molecular interaction data.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2022 Jan 7</year>;<volume>50</volume>(<issue>D1</issue>):<fpage>D648</fpage>&#x2013;<lpage>D653</lpage>.
                    <pub-id pub-id-type="pmid">34761267</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkab1006</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
        <fn-group content-type="footnotes">
            <fn id="fn1">
                <label>
                    <sup>1</sup>
                </label>
                <p>

                    <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/taxonomy">https://www.ncbi.nlm.nih.gov/taxonomy</ext-link>
                </p>
            </fn>
        </fn-group>
    </back>
    <sub-article article-type="reviewer-report" id="report410949">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.180243.r410949</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Rutherford</surname>
                        <given-names>Kim</given-names>
                    </name>
                    <xref ref-type="aff" rid="r410949a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6277-726X</uri>
                </contrib>
                <aff id="r410949a1">
                    <label>1</label>University of Cambridge, Cambridge, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>19</day>
                <month>9</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Rutherford K</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport410949" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.163833.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper describes "LitSieve", a literature search tool that improves on previous systems by providing integrated access to publication details and text-mined annotations, along with a filtering system that allows users to narrow their search to relevant articles.</p>
            <p> </p>
            <p> --------------------------</p>
            <p> </p>
            <p> I appreciate the summary of the filters in Figure 2.</p>
            <p> </p>
            <p> The "exclude" and "include" filter types seem straightforward but I struggle to understand the "ignore" filter type.&#x00a0; An example of "ignore" is given in the "Use cases" section but could the function of "ignore" be more precisely described earlier?&#x00a0; Perhaps in the section that introduces the filters?</p>
            <p> </p>
            <p> --------------------------</p>
            <p> </p>
            <p> "Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?"</p>
            <p> </p>
            <p> It's good to see that the source code is available and has been deposited in Zenodo.</p>
            <p> </p>
            <p> The frontend repository README says "It may be run with or without the backend" but doesn't specify how.&#x00a0; I can't see documentation for how to configure the frontend, either in the manuscript or in the repository.&#x00a0; Please add this documentation to the repository or manuscript.</p>
            <p> </p>
            <p> --------------------------</p>
            <p> </p>
            <p> Please add a statement about future support, maintenance and software availability.&#x00a0; I notice that the repositories linked to from the manuscript have had no code changes for 4 months.&#x00a0; Have development and bug fixing stopped?&#x00a0; Will there be future support?</p>
            <p> </p>
            <p> --------------------------</p>
            <p> </p>
            <p> "We thank Islam Hassan, Mohamed Selim, and Jagadeeswararao Poluru for software engineering"</p>
            <p> </p>
            <p> If these software engineers made substantial contributions to the software, they should be co-authors.&#x00a0; If not, consider a separate explanation for the contribution of each engineer, if there are differences.&#x00a0; Thanking someone for "software engineering" in a software publication would be like thanking someone for "lab work" in an experimental publication.</p>
            <p> </p>
            <p> --------------------------</p>
            <p> </p>
            <p> The user-driven approach described here is encouraging:</p>
            <p> </p>
            <p> "An internal survey of biocurators was conducted to understand their usage of literature search tools and the types of literature they were interested in. Possible features were discussed, and curators were observed while completing tasks."</p>
            <p> </p>
            <p> "We thank ... Kalpana Panneerselvam, Paul Denny and other users for testing, and feedback"</p>
            <p> </p>
            <p> "As development progressed, feedback from biocurators was incorporated into the prerelease versions at each stage."</p>
            <p> </p>
            <p> Any users who have made substantial contributions in the form of feedback or ideas should be considered for co-authorship.&#x00a0; Especially consider any biocurators who contributed multiple major suggestions that have been incorporated into the system.&#x00a0; Are there users who have contributed more ideas or feedback than any of the current co-authors?&#x00a0; If there are, they should be on the author list.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Software engineering. Bioinformatics.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
</article>
