<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.36395.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Organizing training workshops on gene literature retrieval, profiling, and visualization for early career researchers</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no" equal-contrib="yes">
                    <name>
                        <surname>Al Ali</surname>
                        <given-names>Fatima</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6451-7357</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no" equal-contrib="yes">
                    <name>
                        <surname>Marr</surname>
                        <given-names>Alexandra K</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Tatari-Calderone</surname>
                        <given-names>Zohreh</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Alfaki</surname>
                        <given-names>Mohamed</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-4913-2357</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Toufiq</surname>
                        <given-names>Mohammed</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6368-6746</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Roelands</surname>
                        <given-names>Jessica</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3631-2041</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Syed Ahamed Kabeer</surname>
                        <given-names>Basirudeen</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7287-1636</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Bedognetti</surname>
                        <given-names>Davide</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Marr</surname>
                        <given-names>Nico</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Garand</surname>
                        <given-names>Mathieu</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Rinchai</surname>
                        <given-names>Darawan</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8851-7730</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Chaussabel</surname>
                        <given-names>Damien</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7287-1636</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Research Branch, Sidra Medicine, Doha, Qatar</aff>
                <aff id="a2">
                    <label>2</label>College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar</aff>
                <aff id="a3">
                    <label>3</label>Department of Internal Medicine and Medical Specialties, University of Genoa, Genoa, 16126, Italy</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:Damien.chaussabel@jax.org">Damien.chaussabel@jax.org</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>11</day>
                <month>5</month>
                <year>2023</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2021</year>
            </pub-date>
            <volume>10</volume>
            <elocation-id>275</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>20</day>
                    <month>3</month>
                    <year>2023</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Al Ali F et al.</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/10-275/pdf"/>
            <abstract>
                <p>Developing the skills needed to effectively search and extract information from biomedical literature is essential for early-career researchers. It is, for instance, on this basis that the novelty of experimental results, and therefore publishing opportunities, can be evaluated. Given the unprecedented volume of publications in the field of biomedical research, new systematic approaches need to be devised and adopted for the retrieval and curation of literature relevant to a specific theme. Here we describe a hands-on training curriculum aimed at retrieval, profiling, and visualization of literature associated with a given topic. This curriculum was implemented in a workshop in January 2021. We provide supporting material and step-by-step implementation guidelines with the ISG15 gene literature serving as an illustrative use case. Through participation in such a workshop, trainees can learn: 1) to build and troubleshoot PubMed queries in order to retrieve the literature associated with a gene of interest; 2) to identify key concepts relevant to given themes (such as cell types, diseases, and biological processes); 3) to measure the prevalence of these concepts in the gene literature; 4) to extract key information from relevant articles; and 5) to develop a background section or summary on the basis of this information. Finally, trainees can learn to consolidate the structured information captured through this process for presentation via an interactive web application.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Literature profiling</kwd>
                <kwd>Science education</kwd>
                <kwd>Concept extraction</kwd>
                <kwd>Data visualization</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100008982">
                    <funding-source>Qatar National Research Fund</funding-source>
                    <award-id>NPRPgrant#10-0205-170348</award-id>
                </award-group>
                <funding-statement>The work presented here was supported in part by NPRP grant # 10-0205-170348 from the Qatar National Research Fund (a member of Qatar Foundation). The work reported herein is solely the responsibility of the authors. </funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>In response to one of the reviewer's comments we removed the step where information captured from research articles by workshop participants and structured in a spreadsheet format was used to automatically generate "standard sentences", to be used later to form a coherent narrative. We incorporated the edits and corrections suggested by the reviewer throughout the document, and we revised the abstract and title to improve clarity and conciseness.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>Peer-reviewed publications constitute the body of knowledge upon which biomedical research is based. This resource is essential for the generation of novel hypotheses and the design of studies, experiments, and trials and is thus key to the discovery process itself. However, the volume of literature on any given research topic has grown dramatically in recent years, making it increasingly challenging for a single person to manually survey the relevant literature in its entirety, and consequently to acquire the foundational knowledge needed for discovery research. Hence, developing a solid foundation of skills for literature retrieval and profiling is of critical importance for early career biomedical scientists. These competencies will, for instance, be needed: 1) to acquire a sound knowledge base through developing the ability to compile and summarize large volumes of literature; 2) to develop data interpretation skills and become able to assess the novelty and potential impact of a given finding; and 3) to develop scientific writing skills and become able to write background material on specific topics.</p>
            <p>Publicly available omics data provides ideal material for training new generations of biomedical researchers.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> One of the &#x201c;collective omics data&#x201d; training modules that we have developed follows a reductionist analysis and interpretation workflow using publicly available transcriptome data as the source material.
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup> The first activities that form part of this training module involve literature retrieval and profiling for a given candidate gene.</p>
            <p>In this article, we describe a detailed stepwise approach that we have developed for running literature profiling training workshops. Literature profiling here means information extraction from titles and determination of keyword frequencies in titles and abstracts. The training program that we have devised takes advantage of publicly available omics data.
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup> The training module presented here focuses on retrieval of gene-centric literature. Supporting material, such as sets of slides, templates, and a handout, is also provided along with an illustrative use case. Notably, although the training activity that is presented focuses on gene-centric literature retrieval and profiling, the skills and approaches can be adapted to any other application (e.g., focusing on disease-centric, or pathway/process-centric literature).</p>
        </sec>
        <sec id="sec2" sec-type="methods">
            <title>Methods</title>
            <p>The hands-on training exercise described here is suitable for implementation as part of a broader course on research methodology, or as a stand-alone workshop. The training is appropriate for undergraduate, graduate and post-doctoral trainees and no prior bioinformatics experience is required in order to participate. The time commitment depends on the level of experience of the attendees, the overall organization of the workflow and the volume of literature associated with the candidate gene(s) selected. For instance, for a gene with about 1,000 associated articles, five participants could work together on this same gene, each focusing on a different theme. Generally, the training could be covered in one introductory session (40 min), and three two-hour hands-on sessions. The format and content of these sessions are described in more detail below. These would cover:
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Introductory session</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Step 1 &#x2013; Retrieving the relevant literature</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Step 2 &#x2013; Extracting concepts</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Step 3 &#x2013; Generating literature profiles</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Step 4 &#x2013; Developing interactive data visualizations</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Step 5 &#x2013; Writing a narrative</p>
                    </list-item>
                </list>
            </p>
            <p>In addition to covering workshop pre-requisites and time commitment, announcements for such workshops can also list the sets of skills that trainees are expected to develop, including:
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Literature retrieval (development of advanced PubMed queries).</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Literature profiling (information extraction, determination of keyword frequencies).</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Data visualization (structuring and presenting information in an interactive format).</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Extraction of biomedical information (capturing information in a structured format).</p>
                    </list-item>
                </list>
            </p>
            <p>Trainees can be asked to perform tasks between each of the sessions. Alternative formats are possible depending on the needs of the participants and a range of biomedical themes (e.g., diseases, pathways, or cell types) could be selected as the focus of the literature workshop. The endpoint for such workshops may include interactive literature profiling representations generated as part of the hands-on training activities and/or a peer-reviewed publication that, for instance, could build on such a resource.</p>
            <sec id="sec3">
                <title>Introductory session</title>
                <p>The introductory session is designed to provide participants with an understanding of basic concepts and present an outline of the training curriculum. In addition, this session presents the overall rationale and teaching objectives of the training program. The introduction also defines the endpoint of the workshop activity as the development of a web resource that permits the visualization and exploration of structured literature profiles for a gene of interest (described in detail below, taking 
                    <italic toggle="yes">ISG15</italic> as an illustrative case). This is followed by an overview of each of the steps that will be undertaken during the hands-on sessions. These are outlined above and described in more detail below. The session also provides participants with an opportunity to ask questions before the start of the hands-on sessions. Ideally, the presentation should last no longer than 40 minutes and an illustrative case could serve to support the presenter&#x2019;s narrative. A generic presentation is provided (
                    <bold>Extended data File 1</bold>).
                    <sup>
                        <xref ref-type="bibr" rid="ref5">5</xref>
                    </sup>
                </p>
            </sec>
            <sec id="sec4">
                <title>Step 1: Retrieving the relevant literature</title>
                <p>As a first step, all the literature that is relevant to a given gene of interest must be identified. This forms the body of literature that will be subjected to literature profiling in subsequent steps. Most researchers will already be familiar with PubMed, the search engine hosted by the US National Center for Biotechnology Information (NCBI) (
                    <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/">https://pubmed.ncbi.nlm.nih.gov/</ext-link>). While this tool is straightforward to use, developing queries that will permit the comprehensive retrieval of literature associated with a given theme can be more complicated.</p>
                <p>It is important to design a PubMed query that will permit the retrieval of all the literature that is available for the gene of interest. Challenges when retrieving literature associated with a given gene include capturing all aliases and variations that may exist in addition to the official gene names and symbols. For instance, 14 aliases were used to develop a query for the gene officially known as &#x201c;
                    <italic toggle="yes">ISG15</italic> Ubiquitin Like Modifier&#x201d;. Failure to include all the aliases can lead to under-representation of the literature and erroneous judgments that may have major consequences (e.g., when gauging the novelty of a given finding and its suitability for publication). The use of synonyms among gene names and aliases can also result in high proportions of false-positive results, particularly when aliases are common language terms (e.g., 
                    <italic toggle="yes">CAMEL</italic> genes [official symbol: 
                    <italic toggle="yes">CTAG2</italic>] or 
                    <italic toggle="yes">WARS</italic> genes [official symbol: 
                    <italic toggle="yes">WARS1</italic>]). If the literature retrieved in this step misses relevant records for the candidate gene (false negatives), the resulting literature profiles will be incomplete. If the literature retrieved contains records that are irrelevant (false positives), the resulting literature profiles will be inaccurate. For these reasons, it is important to optimize PubMed queries used for gene literature retrieval. Notably, the same principle applies when generating PubMed queries for any given topic. For instance, when querying PubMed to identify literature associated with inflammation, records using terms such as &#x201c;inflammatory&#x201d; or &#x201c;inflamed&#x201d; should also be captured.</p>
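                <p>The two failure modes described above can be made concrete with a short sketch. It assumes that a trainee has manually curated a small reference set of relevant records. The helper name and the PMIDs below are hypothetical and are not part of the workshop material. The function compares the records returned by a query with the curated set and reports false positives, false negatives, precision, and recall.</p>
                <code language="python"><![CDATA[
```python
def retrieval_errors(retrieved, relevant):
    """Compare PMIDs returned by a query with a curated set of relevant PMIDs.

    Both arguments are iterables of PMIDs. The curated set is an assumption:
    in practice it would come from manual review of a sample of records.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = retrieved & relevant
    false_positives = retrieved - relevant   # returned, but not about the gene
    false_negatives = relevant - retrieved   # about the gene, but missed
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return {
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical PMIDs, used for illustration only.
example = retrieval_errors(
    retrieved={101, 102, 103, 104},
    relevant={102, 103, 104, 105, 106},
)
print(example)
```
]]></code>
                <p>A low precision suggests that at least one search term is too permissive, whereas a low recall suggests that aliases are missing from the query.</p>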
                <p>The endpoint of this step is the development of an optimized PubMed query for a given gene. Any criterion can be used for the selection of a candidate gene to be assigned to a trainee or group of trainees. However, a significant body of literature should be associated with the gene in question. For the purposes of illustration in this article, we have selected 
                    <italic toggle="yes">ISG15</italic>.
                </p>
                <p>Practical activities for this step:
                    <list list-type="alpha-lower">
                        <list-item>
                            <p>The official gene name, symbol and all aliases are retrieved from the GeneCards website (e.g., for 
                                <italic toggle="yes">ISG15</italic>: 
                                <ext-link ext-link-type="uri" xlink:href="https://www.genecards.org/cgi-bin/carddisp.pl?gene=ISG15">https://www.genecards.org/cgi-bin/carddisp.pl?gene=ISG15</ext-link>). Although gene names may also be retrieved from different sources, the advantage of using GeneCards is that information has already been compiled from reference databases such as NCBI Entrez Gene, UniProt or 
                                <ext-link ext-link-type="uri" xlink:href="http://genenames.org">genenames.org</ext-link>.</p>
                        </list-item>
                        <list-item>
                            <p>PubMed queries are built using the official gene name, official symbol, and all aliases for the gene of interest as search terms, and by using the appropriate Boolean operators (AND, OR, NOT), field restriction tags, and suitable syntax. For instance, in a query using multiple search terms, the Boolean operator OR is placed between each keyword and must be capitalized. The field restriction tag [tw] can be used after each term to search &#x201c;text words&#x201d;, which include, for instance, the title, abstract, MeSH terms and subheadings, publication types, and substance names. Alternatively, the more restrictive tag [tiab] can be used to limit the search to titles and abstracts only. A query using [tw] for multiple search terms could, for example, be written as follows: ISG15 [tw] OR IFI-15K [tw] OR IFI15 [tw]. In addition, the field restriction tag [pt] can be used to restrict the search to, or exclude from it, a particular publication type (e.g., NOT review [pt] would exclude review articles from the search results). Quotation marks are employed when compound terms must be searched as an exact phrase (e.g., &#x201c;
                                <italic toggle="yes">ISG15</italic> Ubiquitin Like Modifier&#x201d; [tw]). Notably, PubMed&#x2019;s advanced search builder can also be used to design such complex queries (
                                <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/advanced/">https://pubmed.ncbi.nlm.nih.gov/advanced/</ext-link>). For reference, a wide range of training material can also be found on the NCBI website:</p>
                            <p>
                                <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/help/">https://pubmed.ncbi.nlm.nih.gov/help/</ext-link>
                            </p>
                            <p>
                                <ext-link ext-link-type="uri" xlink:href="https://learn.nlm.nih.gov/documentation/training-packets/T0042010P/">https://learn.nlm.nih.gov/documentation/training-packets/T0042010P/</ext-link>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The PubMed query is run, and quality checks are performed on the publications that were returned. This is to identify search terms that may be too permissive and return false-positive results. For instance, short acronyms tend to be more problematic, as are aliases that are also common language words (e.g., CAMEL, WARS).</p>
                        </list-item>
                        <list-item>
                            <p>If necessary, the query is optimized by addressing problematic terms. Removing an ambiguous or problematic term altogether may be a solution, but this could also lead to false-negative results (missing literature that is actually relevant to the gene of interest). As a compromise, the search term could be retained but optimized by adding a keyword that would restrict the search (see the illustrative case below for HUCRP and UCRP). One may also find in some instances that the list of aliases known for a given gene is incomplete and needs to be amended (an example is also given in the illustrative case below).</p>
                        </list-item>
                    </list>
                </p>
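                <p>The query-building activity above can be sketched in a few lines of Python. This is a minimal illustration and not a tool distributed with the workshop. The function names, the <monospace>qualifiers</monospace> argument, and the outer parentheses placed around the OR group before the NOT clause are assumptions made for this example.</p>
                <code language="python"><![CDATA[
```python
def format_term(term, tag="tw"):
    """Append a field restriction tag, quoting multi-word names as phrases."""
    term = term.strip()
    if " " in term:
        term = f'"{term}"'
    return f"{term} [{tag}]"


def build_query(terms, tag="tw", qualifiers=None, exclude_reviews=False):
    """Join gene names and aliases with OR.

    qualifiers maps an ambiguous alias to a phrase that must co-occur with it
    (combined with AND), which narrows a search term that is too permissive.
    """
    qualifiers = qualifiers or {}
    clauses = []
    for term in terms:
        clause = format_term(term, tag)
        if term in qualifiers:
            clause = f"({clause} AND {format_term(qualifiers[term], tag)})"
        clauses.append(clause)
    query = " OR ".join(clauses)
    if exclude_reviews:
        # Assumption: the OR group is parenthesized to make the scope explicit.
        query = f"({query}) NOT review [pt]"
    return query


print(build_query(["ISG15", "IFI-15K", "IFI15"]))
print(build_query(
    ["ISG15", "ISG15 Ubiquitin Like Modifier", "UCRP"],
    qualifiers={"UCRP": "Cross-Reactive Protein"},
    exclude_reviews=True,
))
```
]]></code>
                <p>Generating the query from a list keeps the aliases in one place, so that a term can be added, removed, or qualified without retyping the whole expression.</p>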
                <p>
                    <italic toggle="yes">
                        <underline>Illustrative case:</underline>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">The optimized query (searching all words and numbers in the title, abstract, other abstract, MeSH terms, MeSH subheadings, publication types, substance names, personal name as subject, corporate author, secondary source, comment/correction notes, and other terms) for ISG15 is as follows:</italic>
                </p>
                <p>
                    <italic toggle="yes">ISG15 [tw] OR &#x201c;ISG15 Ubiquitin Like Modifier&#x201d; [tw] OR "Interferon-stimulated gene 15" [tw] OR "IFN-induced 15-kDa protein" [tw] OR IFI-15K [tw] OR &#x201c;Ubiquitin Cross-Reactive Protein&#x201d; [tw] OR "Ubiquitin-Like Protein ISG15" [tw] OR (HUCRP [tw] AND "Cross-Reactive Protein" [tw]) OR G1P2 [tw] OR IP17 [tw] OR (UCRP [tw] AND "Cross-Reactive Protein" [tw]) OR "Interferon-Induced 17-KDa/15-KDa Protein" [tw] OR "Interferon-Stimulated Protein, 15 Kda" [tw] OR "Interferon-Induced 17 Kda Protein" [tw] OR IFI15 [tw] OR IMD38 [tw] NOT review [pt].</italic>
                </p>
                <p>
                    <italic toggle="yes">This query returned 1,186 results (as of September 1
                        <sup>st</sup>, 2020). Optimization included the addition of aliases for this gene that were not captured in the GeneCards database but were identified by reviewing some of the results (notably the search argument "Interferon-stimulated gene 15" [tw], which alone retrieves over 230 entries in PubMed). The acronyms HUCRP and UCRP also proved to be a source of false-positive results (in these aliases CRP stands for cross-reactive protein, whereas elsewhere it commonly stands for C-reactive protein). This was rectified by adding the argument &#x2018;AND "Cross-Reactive Protein" [tw]&#x2019; to each of the ambiguous acronyms.</italic>
                </p>
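                <p>The assembly of such a query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the outer parentheses around the OR list are an addition that does not appear in the query above, and the URL points to the public NCBI E-utilities ESearch service, which can return a result count for a query. The alias list is abbreviated.</p>
                <code language="python">
```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_query(aliases, ambiguous=None, tag="tw", exclude_reviews=True):
    """Join gene aliases with OR, qualifying ambiguous acronyms with AND."""
    ambiguous = ambiguous or {}
    terms = []
    for alias in aliases:
        # Multi-word names are quoted so that they are searched as phrases.
        term = f'"{alias}" [{tag}]' if " " in alias else f"{alias} [{tag}]"
        if alias in ambiguous:
            # The restricting phrase is always searched as a text word.
            term = f'({term} AND "{ambiguous[alias]}" [tw])'
        terms.append(term)
    query = " OR ".join(terms)
    if exclude_reviews:
        # Assumption: parentheses added so NOT applies to the whole OR list.
        query = f"({query}) NOT review [pt]"
    return query


def esearch_url(query):
    """Build an ESearch URL for PubMed (no request is sent here)."""
    params = {"db": "pubmed", "term": query, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Abbreviated alias list for ISG15, for illustration only.
query = build_gene_query(
    ["ISG15", "ISG15 Ubiquitin Like Modifier", "UCRP"],
    ambiguous={"UCRP": "Cross-Reactive Protein"},
)
print(query)
print(esearch_url(query))
```
</code>
                <p>Pasting the printed query into the PubMed search box remains the approach used in the workshop; the URL is shown only for participants who wish to script the count retrieval.</p>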
            </sec>
            <sec id="sec5">
                <title>Step 2: Extracting concepts</title>
                <p>A large body of literature can be associated with a given gene. It may be useful then to employ a systematic &#x201c;cataloguing&#x201d; or indexing approach. For this step, workshop participants may first define themes under which concepts, and keywords associated with these concepts, will be categorized (e.g., themes could be &#x2018;Human diseases and pathogens&#x2019;, &#x2018;Tissues&#x2019;, &#x2018;Cell types&#x2019;, or &#x2018;Cellular processes&#x2019;). Participants would in turn scan titles of articles associated with 
                    <italic toggle="yes">ISG15</italic>, looking for concepts linked to the theme of interest. For example, under &#x2018;Human diseases and pathogens&#x2019;, a concept could be &#x2018;liver cancer&#x2019; and associated keywords could be &#x2018;liver cancer&#x2019;, &#x2018;hepatic carcinoma&#x2019;, &#x2018;hepatocellular carcinoma&#x2019;, &#x2018;HCC&#x2019;, or &#x2018;liver carcinoma&#x2019;. The output for this activity would be lists of concepts and keywords associated with the gene literature organized under different themes. It is the prevalence of these concepts that will be measured in the subsequent step (generation of literature profiles).</p>
                <p>Notably, this process could be repeated for other themes (or different themes could be assigned to each of the participants). If the literature is extensive and time is limited, it would also be possible to divide the literature into subsets (e.g., by batches of 100 articles, with participants assigned different batches to work on for the same theme). If the literature is sparse, all available articles for the gene may be used rather than just focusing on those articles including search terms in titles (i.e., using [tw], instead of [ti] in Step 2a). However, it may be generally preferable for a workshop to be based on selected genes with a relatively abundant literature (e.g., &gt;100 articles returned when restricting the search to titles).</p>
                <p>Practical activities for this step:
                    <list list-type="alpha-lower">
                        <list-item>
                            <p>A subset of the literature is retrieved, restricting the search to titles (using the optimized query from Step 1, and substituting the field restricting argument [ti] for [tw]).</p>
                        </list-item>
                        <list-item>
                            <p>A given theme is assigned to, or selected by, participants (e.g., &#x2018;Human diseases and pathogens&#x2019;, &#x2018;Tissues&#x2019;, &#x2018;Cell types&#x2019;, &#x2018;Biomolecules&#x2019;, &#x2018;Pathways&#x2019;, &#x2018;Biological processes&#x2019;).</p>
                        </list-item>
                        <list-item>
                            <p>Concepts relevant to the theme in question are identified in the titles of articles retrieved by the query designed in a) (e.g., liver cancer, HIV).</p>
                        </list-item>
                        <list-item>
                            <p>The concepts and associated keywords (e.g., &#x2018;hepatocellular carcinoma&#x2019;, &#x2018;liver carcinoma&#x2019;, &#x2018;hepatic cancer&#x2019;, for the concept &#x2018;liver cancer&#x2019; or &#x2018;virus&#x2019;, &#x2018;viral&#x2019;, &#x2018;human immunodeficiency virus&#x2019; for the concept HIV) are recorded in a spreadsheet (example and templates are available in 
                                <bold>Extended data File 2</bold>).
                                <sup>
                                    <xref ref-type="bibr" rid="ref6">6</xref>
                                </sup>
                            </p>
                        </list-item>
                    </list>
                </p>
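                <p>Once concepts and keywords have been recorded by hand, a short script may help check how many titles each keyword list actually covers. The sketch below is not part of the workshop protocol; the function name and the two titles are invented for illustration, and simple substring matching is assumed (short keywords such as HCC may therefore match inside longer words).</p>
                <code language="python">
```python
def tag_titles(titles, concepts):
    """Map each concept to the titles mentioning any of its keywords."""
    hits = {concept: [] for concept in concepts}
    for title in titles:
        lowered = title.lower()
        for concept, keywords in concepts.items():
            # Case-insensitive substring match on the article title.
            if any(keyword.lower() in lowered for keyword in keywords):
                hits[concept].append(title)
    return hits


# Keywords taken from the liver cancer example in the text.
concepts = {
    "liver cancer": [
        "liver cancer",
        "hepatic carcinoma",
        "hepatocellular carcinoma",
        "HCC",
        "liver carcinoma",
    ],
}

# Hypothetical titles, invented for illustration only.
titles = [
    "ISG15 expression in hepatocellular carcinoma",
    "ISG15 and the antiviral response",
]

hits = tag_titles(titles, concepts)
for concept, matched in hits.items():
    print(concept, len(matched))
```
</code>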
                <p>
                    <italic toggle="yes">
                        <underline>Illustrative case:</underline>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">The subset of the ISG15 literature in which the official gene name, symbol or aliases are present in titles is retrieved. The query below is adapted to retrieve only the literature for which gene name and aliases are present in titles (using [ti]).</italic>
                </p>
                <p>
                    <italic toggle="yes">ISG15 [ti] OR &#x201c;ISG15 Ubiquitin Like Modifier&#x201d; [ti] OR "Interferon-stimulated gene 15" [ti] OR "IFN-induced 15-kDa protein" [ti] OR IFI-15K [ti] OR &#x201c;Ubiquitin Cross-Reactive Protein&#x201d; [ti] OR "Ubiquitin-Like Protein ISG15" [ti] OR (HUCRP [ti] AND "Cross-Reactive Protein" [tw]) OR G1P2 [ti] OR IP17 [ti] OR (UCRP [ti] AND "Cross-Reactive Protein" [tw]) OR "Interferon-Induced 17-KDa/15-KDa Protein" [ti] OR "Interferon-Stimulated Protein, 15 Kda" [ti] OR "Interferon-Induced 17 Kda Protein" [ti] OR IFI15 [ti] OR IMD38 [ti] NOT review [pt]. This query returned 312 results (as of September 1
                        <sup>st</sup>, 2020).</italic>
                </p>
                <p>
                    <italic toggle="yes">Titles are then parsed from the ISG15 literature for concepts corresponding to the theme: &#x201c;human diseases and pathogens&#x201d;. The concepts and associated keywords retrieved for the &#x201c;human diseases and pathogens&#x201d; category are listed in 
                        <bold>Extended data File 2</bold>
                        <sup>
                            <xref ref-type="bibr" rid="ref6">6</xref>
                        </sup> (Excel File: Extraction of concepts Tab).</italic>
                </p>
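                <p>The substitution of [ti] for [tw] can also be scripted. The minimal sketch below assumes, as in the query above, that the restricting phrase attached to ambiguous acronyms stays a text-word search; the function name and the abbreviated query are hypothetical.</p>
                <code language="python">
```python
def restrict_to_titles(query, keep=()):
    """Replace [tw] with [ti], except in the clauses listed in keep."""
    retagged = query.replace("[tw]", "[ti]")
    for clause in keep:
        # Restore clauses that should remain text-word searches.
        retagged = retagged.replace(clause.replace("[tw]", "[ti]"), clause)
    return retagged


# Abbreviated Step 1 query, for illustration only.
step1_query = (
    'ISG15 [tw] OR (UCRP [tw] AND "Cross-Reactive Protein" [tw]) '
    "NOT review [pt]"
)
title_query = restrict_to_titles(
    step1_query, keep=['"Cross-Reactive Protein" [tw]']
)
print(title_query)
```
</code>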
            </sec>
            <sec id="sec6">
                <title>Step 3: Generating literature profiles</title>
                <p>Determining the relative prevalence of concepts in the literature associated with a given gene can be useful, for instance when assessing the novelty of a finding (e.g., a change in transcript or protein abundance associated with pathogenesis). The information derived from such an exercise is also useful when writing a general background or summary about the gene for a report or a manuscript.</p>
                <p>With the two previous steps completed, determining the prevalence of concepts in the literature associated with a given gene can be achieved quite simply. For this, the literature query developed in Step 1 is modified to narrow the search and retrieve literature associated with the concepts identified in Step 2. The endpoint for this activity is a table showing the frequency of articles for these concepts in the literature associated with a given gene (e.g., 
                    <bold>Extended data File 2</bold>
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>
                    </sup> [Excel File: Literature Profiling Tab]).</p>
                <p>Notes: Participants may also be encouraged to explore different types of visual representations for this type of data. Treemaps or word clouds can for instance show relative prevalence, while other types of graphs, such as 2D bubble graphs, may also show article frequencies (e.g., 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>). Another exercise could involve visualization of changes in the abundance of the literature associated with the selected concepts over the years. For this purpose, queries can be amended to add a range of publication dates with the field restriction tag [dp] (e.g., adding AND 2000:2010 [dp] to the query).</p>
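                <p>As a sketch of the date-range exercise, the snippet below appends a [dp] range to a query and splits a period into consecutive windows. The function names are hypothetical, and wrapping the original query in parentheses is an assumption made so that the date restriction applies to the query as a whole.</p>
                <code language="python">
```python
def add_date_range(query, start_year, end_year):
    """Append a publication date range using the [dp] field tag."""
    return f"({query}) AND {start_year}:{end_year} [dp]"


def year_windows(first_year, last_year, width):
    """Split a period into consecutive windows of at most width years."""
    return [
        (start, min(start + width - 1, last_year))
        for start in range(first_year, last_year + 1, width)
    ]


# One query per decade, to be run in PubMed and tallied in a spreadsheet.
for start, end in year_windows(2000, 2020, 10):
    print(add_date_range("ISG15 [tw]", start, end))
```
</code>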
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Visualizing 
                            <italic toggle="yes">ISG15</italic> literature profiles.</title>
                        <p>Treemap representation of the relative prevalence of concepts associated with the &#x201c;human diseases and pathogens&#x201d; theme among the ISG15 literature.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/145671/d2ee1429-3d48-428c-b319-9aefe12f8f8f_figure1.gif"/>
                </fig>
                <p>Practical activities for this step:
                    <list list-type="alpha-lower">
                        <list-item>
                            <p>The literature query from Step 1 is employed (using the field search restriction [tw]).</p>
                        </list-item>
                        <list-item>
                            <p>The Boolean operator AND is added, followed by search terms corresponding to keywords related to one of the concepts identified in Step 2 (e.g., &#x201c;Liver cancer&#x201d;, &#x201c;Liver carcinoma&#x201d;, &#x201c;Hepatic carcinoma&#x201d;).</p>
                        </list-item>
                        <list-item>
                            <p>Quotation marks, field search restriction and the Boolean OR are added, so that the notation would read as follows: &#x2026; AND (&#x201c;Liver cancer&#x201d; [tw] OR &#x201c;Liver carcinoma&#x201d; [tw] OR &#x201c;Hepatic carcinoma&#x201d; [tw]).</p>
                        </list-item>
                        <list-item>
                            <p>The query thus constructed is run, and the number of articles retrieved recorded (for instance in a spreadsheet: 
                                <bold>Extended data File 2</bold>).
                                <sup>
                                    <xref ref-type="bibr" rid="ref6">6</xref>
                                </sup> Steps b through d are repeated for the rest of the concepts identified in Step 2.</p>
                        </list-item>
                        <list-item>
                            <p>Visual representations of the literature profiles are generated.</p>
                        </list-item>
                    </list>
                </p>
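                <p>Activities b through d lend themselves to a small helper that builds one query per concept. The following is a minimal sketch with hypothetical function names; the gene query is abbreviated, and it is wrapped in parentheses here (an assumption) so that AND applies to the entire alias list.</p>
                <code language="python">
```python
def concept_clause(keywords, tag="tw"):
    """Quote each keyword, add the field tag and join with OR."""
    return "(" + " OR ".join(f'"{kw}" [{tag}]' for kw in keywords) + ")"


def profile_queries(gene_query, concepts):
    """Return one concept-restricted query per concept."""
    return {
        concept: f"({gene_query}) AND {concept_clause(keywords)}"
        for concept, keywords in concepts.items()
    }


concepts = {
    "liver cancer": ["Liver cancer", "Liver carcinoma", "Hepatic carcinoma"],
}
queries = profile_queries("ISG15 [tw]", concepts)
for concept, query in queries.items():
    print(concept, query)
```
</code>
                <p>Each printed query can then be run in PubMed and the number of articles retrieved recorded against the concept.</p>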
                <p>
                    <italic toggle="yes">
                        <underline>Illustrative case:</underline>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">The prevalence of concepts among the ISG15 literature associated with the theme &#x201c;Human diseases and pathogens&#x201d; is given in 
                        <bold>Extended data File 2</bold>
                        <sup>
                            <xref ref-type="bibr" rid="ref6">6</xref>
                        </sup> (Excel File: Literature Profiling Tab) and represented visually in 
                        <xref ref-type="fig" rid="f1">Figure 1</xref> (also in 
                        <bold>Extended data File 2</bold>
                        <sup>
                            <xref ref-type="bibr" rid="ref6">6</xref>
                        </sup> [Excel File: Generating plots Tab]).</italic>
                </p>
            </sec>
            <sec id="sec7">
                <title>Step 4: Developing interactive data visualizations</title>
                <p>Valuable insights can be gained from visual representation of information. This is further facilitated when the information underlying these visual representations can be accessed interactively by the end user.</p>
                <p>To produce such interactive visual representations in the context of training workshops, we recommend using the Prezi web application (
                    <ext-link ext-link-type="uri" xlink:href="https://prezi.com">https://prezi.com</ext-link>). This tool has been developed to create presentations in which it is possible to zoom in and out between levels of information. This gives users the opportunity to visualize the prevalence of concepts while at the same time allowing them to &#x201c;drill down&#x201d; into each individual concept in order to access relevant underlying information. The endpoint for this activity is the production of a &#x201c;hierarchical circle packing chart&#x201d; representing the relative abundance of concepts in the literature for a given gene and theme (
                    <xref ref-type="fig" rid="f2">Figure 2A</xref>, 
                    <ext-link ext-link-type="uri" xlink:href="https://prezi.com/view/zCedrcYaAEUAON1VeEUi">https://prezi.com/view/zCedrcYaAEUAON1VeEUi</ext-link>). Such a resource gives users access to underlying reference information for each of the concepts (
                    <xref ref-type="fig" rid="f2">Figure 2B</xref>) and can be made available publicly.</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>
                            <italic toggle="yes">ISG15</italic> Human diseases and pathogens circle packing chart.</title>
                        <p>
                            <bold>A.</bold> At the highest level, the representation permits visualization of the relative abundance of concepts associated with the &#x201c;human diseases and pathogens&#x201d; theme for the 
                            <italic toggle="yes">ISG15</italic> gene. The color and size of the circles represent the number of articles retrieved for a given concept (as indicated in the figure key). 
                            <bold>B.</bold> Zooming into each of the circles permits access to another level of information, such as links to PubMed results as well as screen captures of articles relevant to a specific topic (e.g., the relevance of 
                            <italic toggle="yes">ISG15</italic> as a biomarker). An interactive version of this presentation can be accessed via: 
                            <ext-link ext-link-type="uri" xlink:href="https://prezi.com/view/zCedrcYaAEUAON1VeEUi/">
                                <italic toggle="yes">https://prezi.com/view/zCedrcYaAEUAON1VeEUi/</italic>
                            </ext-link>
                        </p>
                    </caption>
                    <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/145671/d2ee1429-3d48-428c-b319-9aefe12f8f8f_figure2.gif"/>
                </fig>
                <p>For the practical activity designed for this step, participants will need to register with Prezi and create an account to access and edit the interactive presentations. They can do so free of charge by selecting the basic account at: 
                    <ext-link ext-link-type="uri" xlink:href="https://prezi.com/pricing/basic/">https://prezi.com/pricing/basic/</ext-link>. Ideally, this is completed ahead of the training session. If instructors have access to a paid subscription, they can create an unlimited number of presentations and invite individual participants to collaborate with editing rights. Otherwise, the participants can create their own presentation and invite the instructors as collaborators. They would also need to copy and paste material from a master template into their own Prezi account. This alternative is workable, but probably not ideal.</p>
                <p>Practical activities for this step:
                    <list list-type="alpha-lower">
                        <list-item>
                            <p>Participants are given access to a Prezi presentation that contains a template and illustrative example (e.g. 
                                <ext-link ext-link-type="uri" xlink:href="https://prezi.com/view/u65ZqHn9ZBJx1VKHqfZA/">https://prezi.com/view/u65ZqHn9ZBJx1VKHqfZA/</ext-link>). As indicated above, one such presentation could be created and made available for each participant. It could serve as their own &#x201c;Sandbox&#x201d;, to familiarize themselves with the application and use as a starting point to develop an interactive resource. Multiple users can also work simultaneously on the same shared presentation. It would thus also be possible to create one presentation for a group of participants to work on cooperatively.</p>
                        </list-item>
                        <list-item>
                            <p>Starting with the template and following the illustrative example, participants can build a circle packing chart for concepts identified in Step 3.</p>
                        </list-item>
                        <list-item>
                            <p>The size and color of the circles are determined by the frequency of articles in the gene-associated literature related to a given concept (as shown in 
                                <xref ref-type="fig" rid="f2">Figure 2A</xref>).</p>
                        </list-item>
                        <list-item>
                            <p>The circles can be arranged manually to create a visually attractive representation.</p>
                        </list-item>
                        <list-item>
                            <p>Underlying information is then added to each of the circles and accessed by zooming in to different levels (
                                <xref ref-type="fig" rid="f2">Figure 2B</xref>). This information could include, for instance:</p>
                            <list list-type="bullet">
                                <list-item>
                                    <label>&#x2022;</label>
                                    <p>The gene symbol and concept;</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2022;</label>
                                    <p>The query link;</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2022;</label>
                                    <p>Number of articles retrieved on a specified date;</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2022;</label>
                                    <p>Result link to those articles;</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2022;</label>
                                    <p>Screen capture and links pointing to articles relevant to a specific topic (see Step 5 below).</p>
                                </list-item>
                            </list>
                        </list-item>
                    </list>
                </p>
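                <p>When sizing circles by hand, it can help to compute target dimensions beforehand. The sketch below assumes that circle area, rather than radius, should track the article count, so the radius scales with the square root of the count. The function name, the maximum radius and the article counts are hypothetical.</p>
                <code language="python">
```python
import math


def circle_radius(article_count, largest_count, max_radius=100.0):
    """Scale the radius so that circle area is proportional to the count."""
    return max_radius * math.sqrt(article_count / largest_count)


# Hypothetical article counts per concept, for illustration only.
counts = {"concept A": 100, "concept B": 25}
largest = max(counts.values())
for concept, count in counts.items():
    print(concept, circle_radius(count, largest))
```
</code>
                <p>Scaling the radius directly with the count would instead exaggerate differences, since a circle with twice the radius has four times the area.</p>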
                <p>
                    <italic toggle="yes">
                        <underline>Illustrative case:</underline>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">The circle packing chart representation for ISG15, focusing on the &#x201c;Human diseases and pathogens&#x201d; topic is shown in</italic> 
                    <xref ref-type="fig" rid="f2">
                        <italic toggle="yes">Figure 2</italic>
                    </xref>
                    <italic toggle="yes">. The Prezi presentation can be accessed for interactive exploration via:</italic> 
                    <ext-link ext-link-type="uri" xlink:href="https://prezi.com/view/zCedrcYaAEUAON1VeEUi/">https://prezi.com/view/zCedrcYaAEUAON1VeEUi/</ext-link>
                </p>
                <p>A screen recording demonstrating how to add underlying information to a Prezi circle packing chart can be accessed via: 
                    <ext-link ext-link-type="uri" xlink:href="https://soapbox.wistia.com/videos/2tmC1VeQyr">https://soapbox.wistia.com/videos/2tmC1VeQyr</ext-link>
                </p>
            </sec>
            <sec id="sec8">
                <title>Step 5: Writing a narrative</title>
                <p>Writing material for a report or manuscript can be one of the motivations for profiling the literature associated with a given topic. To become successful in their future academic endeavors, early career scientists need to become proficient independent writers. The last hands-on activity of the workshop consists of capturing key information from published work and using it as a basis for writing a narrative about the gene of interest.</p>
                <p>A specific topic would be selected as the focus of the activity to limit the amount of literature that would need to be covered. One such topic could, for example, be the relevance of a gene (in this case 
                    <italic toggle="yes">ISG15</italic>) as a biomarker for diagnostic applications within the overall theme of &#x201c;Human diseases and pathogens&#x201d; defined in Step 2. Participants could then focus on the most prevalent diseases (e.g. in the case of 
                    <italic toggle="yes">ISG15</italic> literature, three diseases have &gt;50 articles and ten diseases have &gt;20 articles). Literature queries corresponding to the diseases that have been selected would then be modified to retrieve only those most likely to contain information about 
                    <italic toggle="yes">ISG15</italic> and its relevance as a biomarker.</p>
                <p>Practical activities for this step:
                    <list list-type="alpha-lower">
                        <list-item>
                            <p>Concepts most prevalent in the gene-associated literature for the theme of interest are selected (e.g. concept = Hepatitis C infection, within the theme = &#x201c;Human diseases and pathogens&#x201d;).</p>
                        </list-item>
                        <list-item>
                            <p>The query developed to retrieve literature for a given concept is modified to restrict the search in order to retrieve articles relevant to a given topic (e.g. relevance of the gene as a biomarker). The Boolean &#x201c;AND&#x201d; is appended at the end of the query along with relevant keywords in parentheses separated by the Boolean &#x201c;OR&#x201d; (e.g. &#x201c;AND (biomarker OR biomarkers OR diagnostic OR diagnosis OR prognostic OR prognosis)&#x201d;).</p>
                        </list-item>
                        <list-item>
                            <p>Articles are reviewed and those deemed relevant to the topic are selected (e.g., those in which changes in abundance of the gene product in clinical specimens are reported).</p>
                        </list-item>
                        <list-item>
                            <p>Relevant information is captured from the abstract and/or full text and recorded in a spreadsheet (e.g. analyte name, species, data generation methods, comparator groups, etc.). In some cases, the full text of articles may not be accessible (e.g., paywall), and the abstract contains only incomplete information. This can lead to the absence of important information in the &#x201c;capture&#x201d; spreadsheet, rendering the findings unusable. Participants can then be reminded of the importance of using best practices when reporting findings (e.g., mentioning the comparator group, species or the specific factor being measured &#x2013; whether protein or RNA). It may also be an opportunity to discuss the merits of publication in open access journals.</p>
                        </list-item>
                        <list-item>
                            <p>The information captured in the spreadsheet will serve as a synopsis that participants will be able to rely upon for developing a written narrative, which could be used in an introduction or review article.</p>
                        </list-item>
                    </list>
                </p>
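                <p>The query restriction and the capture spreadsheet described above can be sketched as follows. The function names and the example row are hypothetical; the column names follow the examples given in the text, and missing values are marked explicitly so that gaps in reporting remain visible. The concept query is wrapped in parentheses as an assumption.</p>
                <code language="python">
```python
import csv
import io

TOPIC_KEYWORDS = [
    "biomarker", "biomarkers", "diagnostic",
    "diagnosis", "prognostic", "prognosis",
]

CAPTURE_FIELDS = [
    "analyte name", "species",
    "data generation methods", "comparator groups",
]


def restrict_to_topic(concept_query, keywords=TOPIC_KEYWORDS):
    """Append AND followed by the topic keywords joined with OR."""
    return f"({concept_query}) AND ({' OR '.join(keywords)})"


def capture_sheet(rows):
    """Render captured information as CSV text, flagging missing fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=CAPTURE_FIELDS, restval="not reported"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical entry, invented for illustration only.
sheet = capture_sheet([{"analyte name": "ISG15 mRNA", "species": "human"}])
print(restrict_to_topic("ISG15 [tw]"))
print(sheet)
```
</code>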
                <p>
                    <italic toggle="yes">
                        <underline>Illustrative case:</underline>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">Gene of interest: ISG15</italic>
                </p>
                <p>
                    <italic toggle="yes">Theme: &#x201c;Human diseases and pathogens&#x201d;.</italic>
                </p>
                <p>
                    <italic toggle="yes">Topic: Biomarker/diagnostic relevance.</italic>
                </p>
                <p>
                    <italic toggle="yes">Focusing on ISG15 and &#x201c;human diseases and pathogens&#x201d; as a theme, with biomarker/diagnostic/prognostic as a topic, the main diseases identified in Step 3 were selected (&gt;20 articles; 
                        <xref ref-type="fig" rid="f2">Figure 2A</xref>, red circles and orange circles). Starting with the query developed for retrieving ISG15 literature relevant to Hepatitis C virus, the search is restricted by adding the expression &#x201c;AND (biomarker OR biomarkers OR diagnostic OR diagnosis OR prognostic OR prognosis)&#x201d;. Three articles were returned. From each article, relevant information available in the abstract, and in the full text where possible/necessary, is recorded in a spreadsheet (
                        <bold>Extended data File 3</bold>).
                        <sup>
                            <xref ref-type="bibr" rid="ref7">7</xref>
                        </sup> The information in question relates specifically to changes in abundance of ISG15 measured in clinical specimens (since it is directly relevant to the topic that was selected).</italic>
                </p>
                <p>
                    <italic toggle="yes">Narrative, based on the information captured in the spreadsheet:</italic>
                </p>
                <p>
                    <italic toggle="yes">&#x201c;Interferon-stimulated gene 15 (ISG15) is a member of the ubiquitin family, which includes ubiquitin and ubiquitin-like modifiers (Ubls). Ubiquitin and Ubls are involved in the regulation of a variety of cellular activities, including protein stability, intracellular trafficking, cell cycle control and immune modulation. ISG15 has been implicated as a biomarker with diagnostic relevance for a number of human disorders, including cancer (melanoma and breast cancer) and autoimmune diseases (SLE), as well as infection with pathogens such as HBV, HCV and HIV.</italic>
                </p>
                <p>
                    <italic toggle="yes">The expression of ISG15 has been implicated in a wide range of human cancers, although the roles of ISG15 in tumorigenesis and responses to anticancer treatments remain largely unknown. In patients with breast cancer, ISG15 is overexpressed at both the mRNA and protein levels in mammary tumor tissue compared to that in normal mammary tissue.
                        <sup>
                            <xref ref-type="bibr" rid="ref8">8</xref>,
                            <xref ref-type="bibr" rid="ref9">9</xref>
                        </sup> Increased ISG15 protein expression has also been detected in human melanoma cell lines.
                        <sup>
                            <xref ref-type="bibr" rid="ref10">10</xref>
                        </sup> These findings indicate the potential of ISG15 as a tumor biomarker. ISG15 protein expression is upregulated in immune cell markers from patients with invasive breast cancer
                        <sup>
                            <xref ref-type="bibr" rid="ref11">11</xref>
                        </sup> as well as dendritic cells from melanoma patients.
                        <sup>
                            <xref ref-type="bibr" rid="ref10">10</xref>
                        </sup> ISG15 is also highly expressed in breast cancer patients undergoing immunotherapy.
                        <sup>
                            <xref ref-type="bibr" rid="ref12">12</xref>
                        </sup> Short-hairpin RNA-mediated silencing of ISG15 expression has been shown to inhibit breast tumor growth and increase NK cell infiltration in a nude mouse model of tumorigenesis.
                        <sup>
                            <xref ref-type="bibr" rid="ref13">13</xref>
                        </sup> Furthermore, shRNA-mediated knockdown of ISG15 increased the resistance of human tumor cells to CPT.
                        <sup>
                            <xref ref-type="bibr" rid="ref14">14</xref>
                        </sup> These findings indicate that ISG15 is also a candidate biomarker of the responsiveness to immunotherapy among patients with cancer.</italic>
                </p>
                <p>
                    <italic toggle="yes">ISG15 plays a key role in the host antiviral response. As such, ISG15 is implicated as a diagnostic biomarker of viral infection. Studies in patients with HCC have shown high levels of ISG15 protein in human HBV-related HCC tissues compared to the levels detected in non-tumor tissues.
                        <sup>
                            <xref ref-type="bibr" rid="ref15">15</xref>
                        </sup> Furthermore, upregulated expression of ISG15 was observed at both the mRNA and protein levels in liver tissue samples from patients with HCV infection compared to the levels detected in uninfected controls.
                        <sup>
                            <xref ref-type="bibr" rid="ref16">16</xref>
                        </sup> Similarly, the abundance of ISG15 transcripts was found to be increased in human PBMCs and liver cells in patients with HCV infection who were unresponsive to IFN treatment compared to the levels in corresponding IFN-responsive patients.
                        <sup>
                            <xref ref-type="bibr" rid="ref17">17</xref>
                        </sup> In addition, high levels of ISG15 in the liver of patients with HCV infection were associated with an unfavourable HCV genotype 1, a high hepatic HCV load and a low antiviral response to IFN compared to patients who did not present such characteristics.
                        <sup>
                            <xref ref-type="bibr" rid="ref16">16</xref>
                        </sup> These findings indicate the potential of ISG15 as a biomarker of IFN treatment response in patients with HCV infection.</italic>
                </p>
                <p>
                    <italic toggle="yes">Increased ISG15 expression has also been reported in mouse models related to HIV infection. ISG15 is upregulated in transgenic mice overexpressing the HIV gp120 protein.
                        <sup>
                            <xref ref-type="bibr" rid="ref18">18</xref>
                        </sup> Moreover, in mouse models of acute and chronic neuronal injuries with HIV infection, ISG15 protein in the brain was increased compared to the levels detected in mice with global ischemia and traumatic brain injuries.
                        <sup>
                            <xref ref-type="bibr" rid="ref18">18</xref>
                        </sup> The importance of ISG15 as a biomarker of HIV infection and treatment response has also been revealed in human studies. TaqMan assays revealed higher ISG15 expression (mRNA and protein), as well as an increased frequency of ISG15 SNPs, in human PBMCs from untreated patients with HIV infection than in those from healthy donors.
                        <sup>
                            <xref ref-type="bibr" rid="ref19">19</xref>
                        </sup> Moreover, ISG15 expression decreased in PBMCs from patients with HIV infection after long-term antiretroviral therapy.
                        <sup>
                            <xref ref-type="bibr" rid="ref19">19</xref>
                        </sup>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">The key importance of ISG15 in the human immune system is also reflected in its role in autoimmune diseases, such as SLE. High levels of ISG15 transcripts were detected in human whole blood cell samples from 28 patients newly diagnosed with SLE compared with 10 patients with undifferentiated connective tissue disease and 22 healthy volunteers,
                        <sup>
                            <xref ref-type="bibr" rid="ref20">20</xref>
                        </sup> indicating the potential of ISG15 as a specific biomarker of SLE. Moreover, the upregulation of ISG15 in human plasmablasts/plasma cells from patients with active SLE indicates that ISG15 can be used as a marker of disease activation in patients in remission.
                        <sup>
                            <xref ref-type="bibr" rid="ref21">21</xref>
                        </sup>
                    </italic>
                </p>
                <p>
                    <italic toggle="yes">Thus, an increasing body of evidence supports a role of ISG15 as a biomarker with diagnostic relevance for a number of human disorders.&#x201d;</italic>
                </p>
            </sec>
        </sec>
        <sec id="sec9">
            <title>Implementation</title>
            <p>As proof of principle, a workshop was implemented using the step-by-step guide and supporting information provided with this paper. It was led by FAA and AKM, who participated in the development of the training curriculum but had no prior experience running such training activities. The workshop took place on January 26, 2021. In total, 29 graduate students from Hamad Bin Khalifa University took part. It was offered as a single three-hour class within an &#x201c;Introduction to Data Science&#x201d; course. Due to COVID-19 restrictions, the workshop was run remotely using Webex. As no information was collected from workshop participants (i.e., no surveys or questionnaires), the activity was not considered to constitute human research and therefore no ethical approval was required. The generic introductory presentation in 
                <bold>Extended data File 1</bold>
                <sup>
                    <xref ref-type="bibr" rid="ref5">5</xref>
                </sup> was adapted to provide more specific context and, notably, to explain how 
                <italic toggle="yes">CCR1</italic>, the target gene for this workshop, was selected (for illustrative purposes, this introductory presentation is also made available here: 
                <bold>Extended data File 4</bold>).
                <sup>
                    <xref ref-type="bibr" rid="ref22">22</xref>
                </sup> FAA and AKM, the two co-instructors, also prepared a handout to guide participants through the different steps (
                <bold>Extended data File 5</bold>),
                <sup>
                    <xref ref-type="bibr" rid="ref23">23</xref>
                </sup> specifically for the 
                <italic toggle="yes">CCR1</italic> use case. Several days prior to the workshop, participants were asked to create a Prezi &#x201c;Basic&#x201d; account, and a template was created and shared with each of them ahead of time. Following the introductory presentation (10 minutes), participants carried out hands-on activities following each of the tasks described in the handout, with each step corresponding to one working session. Given time constraints, material prepared ahead of time by FAA and AKM was provided to the participants at the end of each working session so that they could move on to the next. Ideally, sessions should span multiple days so that participants can complete the assignments. However, the more intensive schedule implemented here had the advantage of introducing literature profiling concepts and approaches to a large group of trainees in a time-effective manner. Participants also had the opportunity to continue the work later, on their own time.</p>
        </sec>
        <sec id="sec10" sec-type="conclusions">
            <title>Conclusions</title>
            <p>Effectively harnessing biomedical literature is one of the most fundamental skills required by biomedical researchers. Thus, early career researchers need to develop the ability to read the scientific literature at different levels: from &#x201c;scanning&#x201d; a large number of articles to reading the full-text version for a more in-depth understanding. Given the current rate at which articles are published, developing more systematic approaches to literature profiling based on defined principles may prove especially beneficial to early career biomedical researchers.</p>
            <p>Here, we present a training workflow as well as supporting material that may be reused or adapted for the organization of &#x2018;Gene literature retrieval, profiling and visualization&#x2019; training workshops. Hands-on activities range from literature retrieval and optimization of PubMed queries to the development of interactive resources and the authoring of original material focusing on a specific topic.</p>
            <p>A number of the steps described could easily be automated, for instance the concatenation of official gene names and aliases for retrieving gene-specific literature using PubMed (Step 1). Indeed, such tools are under development by our group and will be made available in the future. These tools would not only save time but also minimize user error. Nevertheless, some of the steps, such as optimization of search queries or extraction of information from abstracts, will always require some critical evaluation and decision making that cannot be automated. Refraining from the use of such automated tools in a training workshop may also be a deliberate choice on the part of the instructor if the emphasis is initially on developing competency in completing the steps involved throughout the process.</p>
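            <p>As a minimal sketch of the kind of automation mentioned above for Step 1, the Python snippet below joins an official gene symbol, its full name, and a list of aliases into a single PubMed query string. It is an illustration only and does not represent the tools referred to above: the function name build_gene_query, the use of the [tiab] (Title/Abstract) field tag, and the example ISG15 aliases are assumptions, and any alias list should be checked against a curated gene resource before use.</p>
            <code language="python"><![CDATA[
```python
def build_gene_query(symbol, name, aliases, field="tiab"):
    """Join a gene symbol, full name, and aliases into one PubMed OR query.

    Hypothetical helper for illustration; not part of any published tool.
    """
    seen = set()
    terms = []
    for term in [symbol, name, *aliases]:
        cleaned = term.strip()
        key = cleaned.lower()
        # Skip empty strings and case-insensitive duplicates.
        if not cleaned or key in seen:
            continue
        seen.add(key)
        # Quote every term so multi-word names are searched as phrases,
        # and restrict the match to the chosen PubMed field tag.
        terms.append(f'"{cleaned}"[{field}]')
    return "(" + " OR ".join(terms) + ")"


# Illustrative input only; verify aliases before running a real search.
query = build_gene_query(
    symbol="ISG15",
    name="ISG15 ubiquitin like modifier",
    aliases=["G1P2", "UCRP", "isg15"],
)
print(query)
```
]]></code>
            <p>The resulting string can be pasted into the PubMed search box and then refined by hand, for example by removing aliases that retrieve unrelated articles. That refinement remains the part of Step 1 that requires critical evaluation.</p>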
            <p>Here, we focused on gene-associated literature, simply because the training curriculum that is currently under development is based on the reuse of publicly available gene expression data. A similar approach could be employed to profile the literature associated with any given disease, pathway, molecular process, or drug, for instance.</p>
            <p>We aim to further develop an illustrative case that would lead to the publication of a review of the 
                <italic toggle="yes">ISG15</italic> literature employing the literature profiling and visualization approaches described herein. We also plan to develop and make available guides and supporting material for the implementation of hands-on training workshops focusing on other topics covered by our Collective Omics Data (COD) training curriculum, which includes three modules: COD1 focuses on retrieving, visualizing and interpreting gene-specific transcriptional profiles
                <sup>
                    <xref ref-type="bibr" rid="ref24">24</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref26">26</xref>
                </sup>; COD2 focuses on constituting and curating themed public dataset collections
                <sup>
                    <xref ref-type="bibr" rid="ref27">27</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref35">35</xref>
                </sup>; COD3 focuses on global analysis of large-scale omics data.
                <sup>
                    <xref ref-type="bibr" rid="ref36">36</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref38">38</xref>
                </sup> These hands-on training activities would build in part on the skills developed through the implementation of the literature profiling workshop that we have presented here. Progress to date towards this goal includes the publication of a study guide for COD1,
                <xref ref-type="bibr" rid="ref39">
                    <sup>39</sup>
                </xref> along with a use case that examines the relevance of CEACAM6 as a blood transcriptional marker.
                <xref ref-type="bibr" rid="ref40">
                    <sup>40</sup>
                </xref>
            </p>
        </sec>
        <sec id="sec11">
            <title>Data availability</title>
            <sec id="sec12">
                <title>Underlying data</title>
                <p>No newly generated data are associated with this article. Information was retrieved via the literature search engine PubMed (
                    <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/">https://pubmed.ncbi.nlm.nih.gov/</ext-link>).</p>
            </sec>
            <sec id="sec13">
                <title>Extended data</title>
                <p>This project contains the following extended data:</p>
                <p>Figshare: Literature profiling workshop, introduction session slides (S File 1), 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.13669070.v2">https://doi.org/10.6084/m9.figshare.13669070.v2</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref5">5</xref>
                    </sup>
                </p>
                <p>Figshare: Literature profiling workshop: steps 1-3 (S File 2), 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.14160329.v3">https://doi.org/10.6084/m9.figshare.14160329.v3</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>
                    </sup>
                </p>
                <p>Figshare: Literature Profiling Workshop: Step 5 (S File 3), 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.14161484.v1">https://doi.org/10.6084/m9.figshare.14161484.v1</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref7">7</xref>
                    </sup>
                </p>
                <p>Figshare: Literature profiling workshop: HBKU handout (S File 4), 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.14166395.v1">https://doi.org/10.6084/m9.figshare.14166395.v1</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref22">22</xref>
                    </sup>
                </p>
                <p>Figshare: Literature profiling workshop: HBKU intro presentation (S File 5), 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.14166500.v1">https://doi.org/10.6084/m9.figshare.14166500.v1</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref23">23</xref>
                    </sup>
                </p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license</ext-link> (CC BY 4.0).</p>
            </sec>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We wish to thank all the students from Hamad Bin Khalifa University (HBKU) for participating in the workshop that served as a proof-of-principle for the use of the implementation guide described in this paper. We also thank the Division of Genomics and Translational Biomedicine at the College of Health and Life Sciences at HBKU for hosting this workshop.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Margolis</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Derr</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dunn</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The National Institutes of Health&#x2019;s Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data.</article-title>
                    <source>

                        <italic toggle="yes">J Am Med Inform Assoc.</italic>
</source>
                    <year>2014 Dec</year>;<volume>21</volume>(<issue>6</issue>):<fpage>957</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">25008006</pub-id>
                    <pub-id pub-id-type="doi">10.1136/amiajnl-2014-002974</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4215061</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Van Horn</surname>
                            <given-names>JD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fierro</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kamdar</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Democratizing data science through data science training.</article-title>
                    <source>

                        <italic toggle="yes">Pac Symp Biocomput.</italic>
</source>
                    <year>2018</year>;<volume>23</volume>:<fpage>292</fpage>&#x2013;<lpage>303</lpage>.
                    <pub-id pub-id-type="pmid">29218890</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5731238</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Garmire</surname>
                            <given-names>LX</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gliske</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nguyen</surname>
                            <given-names>QC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The training of next generation data scientists in biomedicine.</article-title>
                    <source>

                        <italic toggle="yes">Pac Symp Biocomput.</italic>
</source>
                    <year>2017</year>;<volume>22</volume>:<fpage>640</fpage>&#x2013;<lpage>5</lpage>.
                    <pub-id pub-id-type="pmid">27897014</pub-id>
                    <pub-id pub-id-type="doi">10.1142/9789813207813_0059</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5425257</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Using &#x201c;collective omics data&#x201d; for biomedical research training.</article-title>
                    <source>

                        <italic toggle="yes">Immunology.</italic>
</source>
                    <year>2018</year>;<volume>155</volume>(<issue>1</issue>):<fpage>18</fpage>&#x2013;<lpage>23</lpage>.
                    <pub-id pub-id-type="pmid">29705995</pub-id>
                    <pub-id pub-id-type="doi">10.1111/imm.12944</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6099165</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Literature profiling workshop, introduction session slides (S File 1).</article-title>
                    <source>

                        <italic toggle="yes">figshare. Journal contribution.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.13669070.v2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Literature profiling workshop: steps 1-3 (S File 2).</article-title>
                    <source>

                        <italic toggle="yes">figshare. Journal contribution.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.14160329.v3</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Literature Profiling Workshop: Step 5 (S File 3).</article-title>
                    <source>

                        <italic toggle="yes">figshare. Journal contribution.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.14161484.v1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bektas</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Noetzel</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Veeck</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The ubiquitin-like molecule interferon-stimulated gene 15 (ISG15) is a potential prognostic marker in human breast cancer.</article-title>
                    <source>

                        <italic toggle="yes">Breast Cancer Res.</italic>
</source>
                    <year>2008</year>;<volume>10</volume>(<issue>4</issue>):<fpage>R58</fpage>.
                    <pub-id pub-id-type="pmid">18627608</pub-id>
                    <pub-id pub-id-type="doi">10.1186/bcr2117</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2575531</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tecalco-Cruz</surname>
                            <given-names>AC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cort&#x00e9;s-Gonz&#x00e1;lez</surname>
                            <given-names>CC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cruz-Ramos</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interplay between interferon-stimulated gene 15/ISGylation and interferon gamma signaling in breast cancer cells.</article-title>
                    <source>

                        <italic toggle="yes">Cell Signal.</italic>
</source>
                    <year>2019 Feb</year>;<volume>54</volume>:<fpage>91</fpage>&#x2013;<lpage>101</lpage>.
                    <pub-id pub-id-type="pmid">30500379</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cellsig.2018.11.021</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Padovan</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Terracciano</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Certa</surname>
                            <given-names>U</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interferon stimulated gene 15 constitutively produced by melanoma cells induces e-cadherin expression on human dendritic cells.</article-title>
                    <source>

                        <italic toggle="yes">Cancer Res.</italic>
</source>
                    <year>2002 Jun 15</year>;<volume>62</volume>(<issue>12</issue>):<fpage>3453</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">12067988</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kariri</surname>
                            <given-names>YA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Alsaleem</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Joseph</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The prognostic significance of interferon-stimulated gene 15 (ISG15) in invasive breast cancer.</article-title>
                    <source>

                        <italic toggle="yes">Breast Cancer Res Treat.</italic>
</source>
                    <year>2020 Oct 19</year>.
                    <pub-id pub-id-type="pmid">33073304</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s10549-020-05955-1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7867506</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wood</surname>
                            <given-names>LM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pan</surname>
                            <given-names>Z-K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Seavey</surname>
                            <given-names>MM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The ubiquitin-like protein, ISG15, is a novel tumor-associated antigen for cancer immunotherapy.</article-title>
                    <source>

                        <italic toggle="yes">Cancer Immunol Immunother.</italic>
</source>
                    <year>2012 May</year>;<volume>61</volume>(<issue>5</issue>):<fpage>689</fpage>&#x2013;<lpage>700</lpage>.
                    <pub-id pub-id-type="pmid">22057675</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s00262-011-1129-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4561532</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Burks</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Reed</surname>
                            <given-names>RE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Desai</surname>
                            <given-names>SD</given-names>
                        </name>
</person-group>:
                    <article-title>Free ISG15 triggers an antitumor immune response against breast cancer: a new perspective.</article-title>
                    <source>

                        <italic toggle="yes">Oncotarget.</italic>
</source>
                    <year>2015 Mar 30</year>;<volume>6</volume>(<issue>9</issue>):<fpage>7221</fpage>&#x2013;<lpage>31</lpage>.
                    <pub-id pub-id-type="pmid">25749047</pub-id>
                    <pub-id pub-id-type="doi">10.18632/oncotarget.3372</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4466680</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Desai</surname>
                            <given-names>SD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wood</surname>
                            <given-names>LM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsai</surname>
                            <given-names>Y-C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ISG15 as a novel tumor biomarker for drug sensitivity.</article-title>
                    <source>

                        <italic toggle="yes">Mol Cancer Ther.</italic>
</source>
                    <year>2008 Jun</year>;<volume>7</volume>(<issue>6</issue>):<fpage>1430</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">18566215</pub-id>
                    <pub-id pub-id-type="doi">10.1158/1535-7163.MCT-07-2345</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2561335</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Qiu</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hong</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yang</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ISG15 as a novel prognostic biomarker for hepatitis B virus-related hepatocellular carcinoma.</article-title>
                    <source>

                        <italic toggle="yes">Int J Clin Exp Med.</italic>
</source>
                    <year>2015</year>;<volume>8</volume>(<issue>10</issue>):<fpage>17140</fpage>&#x2013;<lpage>50</lpage>.
                    <pub-id pub-id-type="pmid">26770308</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4694208</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Broering</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Trippler</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Werner</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Hepatic expression of proteasome subunit alpha type-6 is upregulated during viral hepatitis and putatively regulates the expression of ISG15 ubiquitin-like modifier, a proviral host gene in hepatitis C virus infection.</article-title>
                    <source>

                        <italic toggle="yes">J Viral Hepat.</italic>
</source>
                    <year>2016 May</year>;<volume>23</volume>(<issue>5</issue>):<fpage>375</fpage>&#x2013;<lpage>86</lpage>.
                    <pub-id pub-id-type="pmid">26833585</pub-id>
                    <pub-id pub-id-type="doi">10.1111/jvh.12508</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Katsounas</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hubbard</surname>
                            <given-names>JJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>CH</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>High interferon-stimulated gene ISG-15 expression affects HCV treatment outcome in patients co-infected with HIV and HCV.</article-title>
                    <source>

                        <italic toggle="yes">J Med Virol.</italic>
</source>
                    <year>2013 Jun</year>;<volume>85</volume>(<issue>6</issue>):<fpage>959</fpage>&#x2013;<lpage>63</lpage>.
                    <pub-id pub-id-type="pmid">23588721</pub-id>
                    <pub-id pub-id-type="doi">10.1002/jmv.23576</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>R-G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kaul</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>D-X</given-names>
                        </name>
</person-group>:
                    <article-title>Interferon-stimulated gene 15 as a general marker for acute and chronic neuronal injuries.</article-title>
                    <source>

                        <italic toggle="yes">Sheng Li Xue Bao.</italic>
</source>
                    <year>2012 Oct 25</year>;<volume>64</volume>(<issue>5</issue>):<fpage>577</fpage>&#x2013;<lpage>83</lpage>.
                    <pub-id pub-id-type="pmid">23090498</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3587786</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Scagnolari</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Monteleone</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Selvaggi</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ISG15 expression correlates with HIV-1 viral load and with factors regulating T cell response.</article-title>
                    <source>

                        <italic toggle="yes">Immunobiology.</italic>
</source>
                    <year>2016 Feb</year>;<volume>221</volume>(<issue>2</issue>):<fpage>282</fpage>&#x2013;<lpage>90</lpage>.
                    <pub-id pub-id-type="pmid">26563749</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.imbio.2015.10.007</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yuan</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ma</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ye</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interferon-stimulated gene 15 expression in systemic lupus erythematosus: Diagnostic value and association with lymphocytopenia.</article-title>
                    <source>

                        <italic toggle="yes">Z Rheumatol.</italic>
</source>
                    <year>2018 Apr</year>;<volume>77</volume>(<issue>3</issue>):<fpage>256</fpage>&#x2013;<lpage>62</lpage>.
                    <pub-id pub-id-type="pmid">28204879</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s00393-017-0274-8</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Care</surname>
                            <given-names>MA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stephenson</surname>
                            <given-names>SJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Barnes</surname>
                            <given-names>NA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Network Analysis Identifies Proinflammatory Plasma Cell Polarization for Secretion of ISG15 in Human Autoimmunity.</article-title>
                    <source>

                        <italic toggle="yes">J Immunol.</italic>
</source>
                    <year>2016 Aug 15</year>;<volume>197</volume>(<issue>4</issue>):<fpage>1447</fpage>&#x2013;<lpage>59</lpage>.
                    <pub-id pub-id-type="pmid">27357150</pub-id>
                    <pub-id pub-id-type="doi">10.4049/jimmunol.1600624</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4974491</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Literature profiling workshop: HBKU handout (S File 4).</article-title>
                    <source>

                        <italic toggle="yes">figshare. Journal contribution.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.14166395.v1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Literature profiling workshop: HBKU intro presentation (S File 5).</article-title>
                    <source>

                        <italic toggle="yes">figshare. Journal contribution.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.6084/m9.figshare.14166500.v1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Toufiq</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Roelands</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Alfaki</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Annexin A3 in sepsis: novel perspectives from an exploration of public transcriptome data.</article-title>
                    <source>

                        <italic toggle="yes">Immunology.</italic>
</source>
                    <year>2020 Jul 18</year>.
                    <pub-id pub-id-type="pmid">32682335</pub-id>
                    <pub-id pub-id-type="doi">10.1111/imm.13239</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7692248</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kewcharoenwong</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kessler</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Increased abundance of ADAM9 transcripts in the blood is associated with tissue damage.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2015</year>;<volume>4</volume>:<fpage>89</fpage>.
                    <pub-id pub-id-type="pmid">27990250</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.6241.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5130078</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Roelands</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Garand</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hinchcliff</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Long-Chain Acyl-CoA Synthetase 1 Role in Sepsis and Immunity: Perspectives From a Parallel Review of Public Transcriptome Datasets and of the Literature.</article-title>
                    <source>

                        <italic toggle="yes">Front Immunol.</italic>
</source>
                    <year>2019</year>;<volume>10</volume>:<fpage>2410</fpage>.
                    <pub-id pub-id-type="pmid">31681299</pub-id>
                    <pub-id pub-id-type="doi">10.3389/fimmu.2019.02410</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6813721</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Presnell</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated compendium of monocyte transcriptome datasets of relevance to human monocyte immunobiology research.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>291</fpage>.
                    <pub-id pub-id-type="pmid">27158452</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.8182.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4856112</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Roelands</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Decock</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A collection of annotated and harmonized human breast cancer transcriptome datasets, including immunologic classification.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2017</year>;<volume>6</volume>:<fpage>296</fpage>.
                    <pub-id pub-id-type="pmid">29527288</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.10960.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5820610</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>SSY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Al Ali</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated collection of transcriptome datasets to investigate the molecular mechanisms of immunoglobulin E-mediated atopic diseases.</article-title>
                    <source>

                        <italic toggle="yes">Database (Oxford).</italic>
</source>
                    <year>2019</year>;<volume>2019</volume>.
                    <pub-id pub-id-type="pmid">31290545</pub-id>
                    <pub-id pub-id-type="doi">10.1093/database/baz066</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6616200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bougarn</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptome dataset collection to investigate the blood transcriptional response to viral respiratory tract infection and vaccination.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>:<fpage>284</fpage>.
                    <pub-id pub-id-type="pmid">31231515</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.18533.1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6567289</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Marr</surname>
                            <given-names>AK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Presnell</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptome dataset collection to investigate the development and differentiation of the human placenta and its associated pathologies.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>305</fpage>.
                    <pub-id pub-id-type="pmid">27303626</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.8210.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4897750</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mackeh</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2017</year>;<volume>6</volume>:<fpage>181</fpage>.
                    <pub-id pub-id-type="pmid">28413616</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.10877.1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5365227</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Blazkova</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Presnell</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptome dataset collection to investigate the immunobiology of HIV infection.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>327</fpage>.
                    <pub-id pub-id-type="pmid">27134731</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.8204.1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4838008</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bougarn</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptome dataset collection to investigate inborn errors of immunity.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>:<fpage>188</fpage>.
                    <pub-id pub-id-type="pmid">31559014</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.18048.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6749933</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rahman</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boughorbel</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Presnell</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A curated transcriptome dataset collection to investigate the functional programming of human hematopoietic cells in early life.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>414</fpage>.
                    <pub-id pub-id-type="pmid">27347375</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.8375.1</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4916988</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Presnell</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vidal</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Blood Interferon Signatures Putatively Link Lack of Protection Conferred by the RTS,S Recombinant Malaria Vaccine to an Antigen-specific IgE Response.</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2015</year>;<volume>4</volume>:<fpage>919</fpage>.
                    <pub-id pub-id-type="pmid">28883910</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.7093.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5580375</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref37">
                <label>37</label>
                <mixed-citation publication-type="web">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rawat</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Toufiq</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A Neutrophil-Driven Inflammatory Signature Characterizes the Blood Transcriptome Fingerprint of Psoriasis.</article-title>
                    <source>

                        <italic toggle="yes">Front Immunol.</italic>
</source>
                    <year>2020</year> [<date-in-citation content-type="access-date">cited 2020 Dec 7</date-in-citation>];<volume>11</volume>.
                    <pub-id pub-id-type="pmid">33329570</pub-id>
                    <pub-id pub-id-type="doi">10.3389/fimmu.2020.587946</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7732684</pub-id>
                    <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fimmu.2020.587946/full">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Altman</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Konza</surname>
                            <given-names>O</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Definition of erythroid cell-positive blood transcriptome phenotypes associated with severe respiratory syncytial virus infection.</article-title>
                    <source>

                        <italic toggle="yes">Clin Transl Med.</italic>
</source>
                    <year>2020</year>;<volume>10</volume>(<issue>8</issue>):<fpage>e244</fpage>.
                    <pub-id pub-id-type="pmid">33377660</pub-id>
                    <pub-id pub-id-type="doi">10.1002/ctm2.244</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7733317</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>A training curriculum for retrieving, structuring, and aggregating information derived from the biomedical literature and large-scale data repositories. [version 1; peer review: 2 approved with reservations].</article-title>
                    <source>

                        <italic toggle="yes">F1000Research.</italic>
</source>
                    <year>2022</year>;<volume>11</volume>:<fpage>994</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.122811.1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rinchai</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaussabel</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Assessing the potential relevance of CEACAM6 as a blood transcriptional biomarker [version 1; peer review: 1 approved with reservations].</article-title>
                    <source>

                        <italic toggle="yes">F1000Research.</italic>
</source>
                    <year>2022</year>;<volume>11</volume>:<fpage>1294</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.126721.1</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report173013">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.145671.r173013</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Matarese</surname>
                        <given-names>Valerie</given-names>
                    </name>
                    <xref ref-type="aff" rid="r173013a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2687-194X</uri>
                </contrib>
                <aff id="r173013a1">
                    <label>1</label>Authors' editor and editorial consultant, Self-employed, Vidor TV, Italy</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>25</day>
                <month>5</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Matarese V</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport173013" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.36395.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I have read the revised manuscript and am pleased to see that the authors acted on most of my comments. As a result, the manuscript has improved on many levels. The inclusion of this manuscript in the literature on scientific writing workshops will be useful to the scientific community, as it illustrates some unique approaches and viewpoints. I therefore give my approval to the manuscript and look forward to seeing it indexed in bibliographic databases.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>I am a former biomedical researcher with 25 years' experience as a biomedical editor in different capacities and 15 years' experience as an instructor of research writing at the PhD level. I am also a scholar of academic editing.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report96508">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.39457.r96508</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Matarese</surname>
                        <given-names>Valerie</given-names>
                    </name>
                    <xref ref-type="aff" rid="r96508a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2687-194X</uri>
                </contrib>
                <aff id="r96508a1">
                    <label>1</label>Authors' editor and editorial consultant, Self-employed, Vidor TV, Italy</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>20</day>
                <month>10</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Matarese V</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport96508" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.36395.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>reject</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>
                <bold>Overview</bold>
            </p>
            <p> This method article describes a short workshop (3 hours) with five stated teaching objectives: bibliographic research in PubMed on gene-related topics (&#x201c;retrieval&#x201d;); extraction of relevant keywords (&#x201c;concepts&#x201d;) from article titles; &#x201c;profiling&#x201d; (counting the number of articles per concept); using the data and online tools to create interactive graphics (&#x201c;visualization&#x201d;); and &#x201c;writing a narrative review&#x201d; using a semi-automated method in a spreadsheet. I welcome a workshop description for teaching bibliographic research, and I am sure that workshop trainees will enjoy learning to make treemaps and circle packing charts. However, I have substantial reservations about the last objective and thus, as peer reviewer, I cannot endorse the method as a whole.</p>
            <p> </p>
            <p> 
                <bold>Regarding F1000Research questions</bold>
            </p>
            <p> The rationale for the workshop is clearly given, but no valid justification is given for using a spreadsheet to write a narrative review (fifth objective) for the Introduction section of a research article or for a review article. The conclusions about the method and its performance are supported in part by the findings presented in the article, but the generated nonsense text is clear evidence that the proposed method for automated text generation should not be published.</p>
            <p> </p>
            <p> 
                <bold>Major comment</bold>
            </p>
            <p> The fifth objective, &#x201c;Writing a narrative&#x201d;, is associated with two skills that workshop trainees can expect to learn: &#x201c;Assimilation of biomedical knowledge (synthesis and presentation)&#x201d; and &#x201c;Scientific publication (optional)&#x201d;. However, as described in the manuscript, the workshop is unable to fulfill these promises at all. In this fifth part of the workshop, participants are instructed to read the &#x201c;abstract and/or full text and record in a spreadsheet&#x201d; a series of data. They then use the spreadsheet&#x2019;s concatenate function to generate &#x201c;standard sentences&#x201d; and collate these into a &#x201c;cohesive paragraph or section&#x201d;. Finally, the text is &#x201c;polished&#x201d; by the students, an instructor or a &#x201c;professional scientific copyeditor&#x201d;. There are enormous problems with all of this.</p>
            <p> </p>
            <p> First and foremost, the method is proposing the automated generation of text for publication based primarily on abstracts (&#x201c;full text where possible/necessary&#x201d;) and without critical appraisal. Only in the conclusions of the manuscript is there any minimal mention of this problem: &#x201c;Nevertheless, &#x2026; extraction of information 
                <italic>from abstracts</italic> will always require 
                <italic>some</italic> critical evaluation and decision making that cannot be automated&#x201d; (emphasis added). I cannot endorse a method for the computer-assisted writing of research articles and reviews, as it is unscholarly, wasteful to the reviewers and (if published) readers who will have to sift through substantial nonsense text, and damaging to knowledge advancement.</p>
            <p> </p>
            <p> That this method can generate nonsense text is seen in the two pages of &#x201c;Illustrative case&#x201d;. Part A (page 11) shows the computer-generated &#x201c;standard sentences&#x201d; which are close to gibberish, for example:</p>
            <p> </p>
            <p> 
                <italic>The abundance of ISG15 Proteins measured by immunohistochemistry is increased in vivo in human immune cell markers &#x201c;in patients with&#x201d; invasive BC patients with long-term follow-up compared to &#x2026; &#x2026; &#x2026;&#x2026;&#x2026;&#x2026; &#x2026; .. [33073304].</italic>
            </p>
            <p> </p>
            <p> Part B (page 12) shows the &#x201c;edited narrative&#x201d; which &#x201c;was prepared by a professional scientific copyeditor&#x201d;. Given both the poor science and&#x00a0;poor English grammar and style of this narrative, I doubt that a professional editor worked on it. Furthermore, the use of an editor to create meaningful text from the &#x201c;standard sentences&#x201d; in Part A would amount to ghost writing. Here is just one example from the supposedly edited text:</p>
            <p> </p>
            <p> 
                <italic>Compared to healthy donors, TaqMan assays revealed high ISG15 expression (mRNA and protein) as well as an increased frequency of ISG15-SNPs in human PBMC from untreated patients with HIV infection.</italic>
            </p>
            <p> </p>
            <p> TaqMan assays do not inform on protein levels. The frequency of a SNP cannot change unless there is gene duplication (what changes is the allele frequency). There is no need to use &#x201c;human&#x201d; to describe cells from HIV patients. There is a misplaced modifier before &#x201c;TaqMan&#x201d; and an unwanted hyphen.</p>
            <p> </p>
            <p> I therefore feel that, for a revised version of this manuscript to be approved, the fifth objective should be deleted along with its two associated skills (impossible to achieve in a short workshop). In addition, a paragraph should be added with comments on the need for early career researchers to learn to read the scientific literature at different levels (from scanning to critical appraisal
                <italic>of the full text</italic>) and to appreciate that scientific findings must be replicated and validated before they can be considered &#x201c;true&#x201d;.</p>
            <p> </p>
            <p> 
                <bold>Minor comments</bold>
            </p>
            <p> The title is a hard read, because there are six words between &#x201c;Organizing&#x201d; and what gets organized. I suggest: &#x201c;Organizing training workshops on gene literature &#x2026; for early career researchers&#x201d;.</p>
            <p> </p>
            <p> The word &#x201c;profiling&#x201d; is used to the point of being meaningless. It is used six times in the 20-line Introduction, in the context of literature, omics and transcriptome. It should be deleted after &#x201c;omics&#x201d; and &#x201c;transcriptome&#x201d;, as these words are best used as nouns, not adjectives describing the vague &#x201c;profiling&#x201d;. Regarding &#x201c;literature profiling&#x201d;, this non-standard expression should be defined early in the text, not in the Methods section (bottom page 3). Here, profiling means &#x201c;information extraction&#x201d; from titles and &#x201c;determination of keyword frequencies&#x201d;. This definition should be given in the Introduction as soon as &#x201c;literature profiling&#x201d; is mentioned.</p>
            <p> </p>
            <p> Throughout the text: The use of conditional mood (&#x201c;this session should present&#x201d;, &#x201c;the presentation should last&#x201d; and so on) is unnecessary, burdens the text and slows reading. The simple present tense can be used in most cases. A method is not a recommendation (&#x201c;should&#x201d;) but a description (&#x201c;is&#x201d;).</p>
            <p> </p>
            <p> Throughout the text: The use of italics in the &#x201c;illustrative cases&#x201d; makes reading difficult, especially because of the extensive use of quotation marks and parentheses. Italics should be reserved for emphasis on small pieces of text. Here, instead, a block quote (indented text) would be better. Text boxes can also help.</p>
            <p> </p>
            <p> Page 2: &#x201c;some independent preparation&#x201d;. Does this mean independent study, exercises, getting organized technically?</p>
            <p> </p>
            <p> Page 2: Ref. 34 is cited right after Ref. 4. It should be renumbered 5. I confused it with Refs. 3,4 (missing a comma). Similarly Ref. 35 is cited too early, on page 6.</p>
            <p> </p>
            <p> Page 3, line 3: &#x201c;on any given specific research topics&#x201d; should be changed to &#x201c;on any given research topic&#x201d;. 
                <italic>Topic</italic> should be singular. 
                <italic>Specific</italic> is unneeded and redundant with &#x201c;given topic&#x201d;.</p>
            <p> </p>
            <p> Page 3: &#x201c;for a gene with &#x00b1; 1000 associated articles and five participants&#x201d;. This phrase is too concise to have meaning, as a gene cannot have participants.</p>
            <p> </p>
            <p> Page 4, Step 1. No mention is made of NCBI&#x2019;s Advanced Search Builder; MeSH terms; or truncation by asterisk. The manuscript should explain why these advanced features are not taught.</p>
            <p> </p>
            <p> Page 4, bottom: &#x201c;The field restriction tag [tw] (or alternatively the more restrictive [tiab]) should be added after each term&#x2026; to limit the search to titles and abstracts&#x201d;. Actually, it is [tiab] that limits the search to titles and abstracts. The broad [tw] searches almost everything (see 
                <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/help/#tw">https://pubmed.ncbi.nlm.nih.gov/help/#tw</ext-link>).</p>
            <p> </p>
            <p> Page 5: &#x201c;examples provided above&#x201d;. Where? Vague cross-references like this make reading difficult.</p>
            <p> </p>
            <p> Page 6: &#x201c;For instance, when assessing the novelty &#x2026;&#x201d;. This incomplete sentence is lacking a subject-verb pair. Similar problem on page 13, bottom.</p>
            <p> </p>
            <p> Page 6, bottom: &#x201c;The Boolean argument AND is added&#x201d;. AND is a Boolean operator, not argument.</p>
            <p> </p>
            <p> Page 6, bottom: &#x2018;&#x2026; AND (&#x201c;Liver cancer&#x201d;[tw] OR &#x201c;Liver carcinoma&#x201d;[tw] OR &#x201c;Hepatic carcinoma&#x201d;[tw], etc.)&#x2019;. A query cannot contain &#x201c;, etc.&#x201d;. It should be removed, and a final period added to the sentence.</p>
            <p> </p>
            <p> Page 9, Step 5. This part is excessively wordy and suggests an uneasiness with the concepts. We read &#x201c;is the measure by which productivity is measured&#x201d;, &#x201c;it can be at first important to be able to identify information that is worth capturing&#x201d; and &#x201c;A specific topic would be selected as the focus &#x2026; to limit the amount of literature that would need to be covered&#x201d;.</p>
            <p> </p>
            <p> Page 13: Webex. Not WebEx.</p>
            <p> </p>
            <p> Punctuation 
                <list list-type="bullet">
                    <list-item>
                        <p>Comma after &#x201c;Supporting material&#x201d; in the third paragraph of Introduction. Comma after &#x201c;This is a skill that is worth developing&#x201d; (page 9).</p>
                    </list-item>
                    <list-item>
                        <p>Replace &#x00b1; with &#x201c;about&#x201d; (first paragraph of Methods).</p>
                    </list-item>
                    <list-item>
                        <p>Hyphens: no hyphen after -ly adverbs, e.g. &#x201c;early career researcher&#x201d;. Use hyphens in &#x201c;gene-associated literature&#x201d;, &#x201c;false-positive result&#x201d; and &#x201c;false-negative result&#x201d; (but not in &#x201c;false positives&#x201d; or &#x201c;false negatives&#x201d;).</p>
                    </list-item>
                    <list-item>
                        <p>There are many instances where the required space between words is missing, especially when a quotation mark is used, e.g. &#x201c;For example, under&#x2019;Human diseases and &#x2026; associated keywords could be&#x2019;liver cancer&#x2019;,&#x2019;hepatic carcinoma&#x2019;, &#x2026;&#x201d; (page 5). The entire manuscript should be checked and spaces added where needed. A hint is that the curly quotation mark is in the wrong direction before the word when a space is missing.</p>
                    </list-item>
                    <list-item>
                        <p>Double quotation marks should be used in all cases except when nesting is needed. Nesting should be used on page 5 where the sentence:</p>
                        <p> </p>
                        <p> 
                            <italic>This was rectified by adding the AND &#x201c;cross-reactive protein [tw] argument to each of the ambiguous acronym. </italic>
                        </p>
                        <p> </p>
                        <p> should be changed to:</p>
                        <p> </p>
                        <p> 
                            <italic>This was rectified by adding the &#x2018;AND &#x201c;cross-reactive protein [tw]&#x201d;&#x2019; argument to each of the ambiguous acronyms.</italic>
                        </p>
                        <p> </p>
                        <p> or more simply:</p>
                        <p> </p>
                        <p> 
                            <italic>This was rectified by adding &#x2018;AND &#x201c;cross-reactive protein [tw]&#x201d;&#x2019; to each of the ambiguous acronyms.</italic>
                        </p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>I am a former biomedical researcher with 25 years' experience as a biomedical editor in different capacities and 15 years' experience as an instructor of research writing at the PhD level. I am also a scholar of academic editing.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment9475-96508">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Chaussabel</surname>
                            <given-names>Damien</given-names>
                        </name>
                        <aff>The Jackson Laboratory for Genomic Medicine, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>The authors have no competing interests to declare.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>17</day>
                    <month>3</month>
                    <year>2023</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Overview</bold>
                </p>
                <p> 
                    <bold>This method article describes a short workshop (3 hours) with five stated teaching objectives: bibliographic research in PubMed on gene-related topics (&#x201c;retrieval&#x201d;); extraction of relevant keywords (&#x201c;concepts&#x201d;) from article titles; &#x201c;profiling&#x201d; (counting the number of articles per concept); using the data and online tools to create interactive graphics (&#x201c;visualization&#x201d;); and &#x201c;writing a narrative review&#x201d; using a semi-automated method in a spreadsheet. I welcome a workshop description for teaching bibliographic research, and I am sure that workshop trainees will enjoy learning to make treemaps and circle packing charts. However, I have substantial reservations about the last objective and thus, as peer reviewer, I cannot endorse the method as a whole.</bold>
                </p>
                <p> </p>
                <p> We appreciate the time spent reviewing this work and the constructive feedback given. We are returning to revise this article somewhat late because we first wanted to complete the development of a more rounded training curriculum. As a result, a study guide and a use case have recently been published in F1000Research (
                    <ext-link ext-link-type="uri" xlink:href="https://f1000research.com/articles/11-994">https://f1000research.com/articles/11-994</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://f1000research.com/articles/11-1294">https://f1000research.com/articles/11-1294</ext-link>). Literature profiling nonetheless remains a major component of this training curriculum and could be run as a stand-alone training workshop. We also wanted to return to this manuscript now to address all outstanding comments and add references to the more recent work.</p>
                <p> </p>
                <p> We have addressed the main comment raised in the review by eliminating the step in the workflow that involved automatically generating text from the information provided by workshop participants. We did not intend for this step to cause controversy and were simply trying to provide practical training to early-career scientists in countries where English literacy may not be as high as in more privileged countries. The question then becomes: how can these brilliant scientists overcome the language barrier and fulfil their potential? The approach we attempted, as an experiment, was only meant to be a first step towards trainees becoming independent writers, eventually able to produce written work without artificial crutches. However, we admit that the approach may have been flawed or misguided (or both!), and we acknowledge your point. We are happy to remove this step from the manuscript.</p>
                <p> </p>
                <p> 
                    <bold>Regarding F1000Research questions</bold>
                </p>
                <p> 
                    <bold>The rationale for the workshop is clearly given, but no valid justification is given for using a spreadsheet to write a narrative review (fifth objective) for the Introduction section of a research article or for a review article. The conclusions about the method and its performance are supported in part by the findings presented in the article, but the generated nonsense text is clear evidence that the proposed method for automated text generation should not be published.</bold>
                </p>
                <p> </p>
                <p> Thanks for raising these concerns. We have now removed the steps that pertain to automated text generation (5e and 5f). We feel that capturing relevant information and recording it in a spreadsheet is still a worthwhile exercise. Indeed, in the process participants can learn what information they should look for in the work of others and make sure to report it in their own work (e.g., abstracts often fail to mention the species in which the work was performed or whether protein or transcript abundance was measured). We now write instead as step 5e:</p>
                <p> </p>
                <p> &#x201c;The information captured in the spreadsheet will serve as a synopsis that participants will be able to rely upon for developing a written narrative, which could be used in an introduction or review article.&#x201d;</p>
                <p> The spreadsheet has also been amended accordingly by removing the functions used to generate text, and the automatically generated text has been removed from the manuscript. We retained the written narrative, simply prefacing the paragraph with the following statement:</p>
                <p> 
                    <italic>&#x201c;Narrative, based on the information captured in the spreadsheet:</italic>&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Major comment</bold>
                </p>
                <p> 
                    <bold>The fifth objective, &#x201c;Writing a narrative&#x201d;, is associated with two skills that workshop trainees can expect to learn: &#x201c;Assimilation of biomedical knowledge (synthesis and presentation)&#x201d; and &#x201c;Scientific publication (optional)&#x201d;. However, as described in the manuscript, the workshop is unable to fulfill these promises at all. In this fifth part of the workshop, participants are instructed to read the &#x201c;abstract and/or full text and record in a spreadsheet&#x201d; a series of data. They then use the spreadsheet&#x2019;s concatenate function to generate &#x201c;standard sentences&#x201d; and collate these into a &#x201c;cohesive paragraph or section&#x201d;. Finally, the text is &#x201c;polished&#x201d; by the students, an instructor or a &#x201c;professional scientific copyeditor&#x201d;. There are enormous problems with all of this.</bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> First and foremost, the method is proposing the automated generation of text for publication based primarily on abstracts (&#x201c;full text where possible/necessary&#x201d;) and without critical appraisal. Only in the conclusions of the manuscript is there any minimal mention of this problem: &#x201c;Nevertheless, &#x2026; extraction of information&#x00a0;
                        <italic>from abstracts</italic>&#x00a0;will always require&#x00a0;
                        <italic>some</italic>&#x00a0;critical evaluation and decision making that cannot be automated&#x201d; (emphasis added). I cannot endorse a method for the computer-assisted writing of research articles and reviews, as it is unscholarly, wasteful to the reviewers and (if published) readers who will have to sift through substantial nonsense text, and damaging to knowledge advancement.</bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> That this method can generate nonsense text is seen in the two pages of &#x201c;Illustrative case&#x201d;. Part A (page 11) shows the computer-generated &#x201c;standard sentences&#x201d; which are close to gibberish, for example:</bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> 
                        <italic>The abundance of ISG15 Proteins measured by immunohistochemistry is increased in vivo in human immune cell markers &#x201c;in patients with&#x201d; invasive BC patients with long-term follow-up compared to &#x2026; &#x2026; &#x2026;&#x2026;&#x2026;&#x2026; &#x2026; .. [33073304].</italic>
                    </bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> Part B (page 12) shows the &#x201c;edited narrative&#x201d; which &#x201c;was prepared by a professional scientific copyeditor&#x201d;. Given both the poor science and&#x00a0;poor English grammar and style of this narrative, I doubt that a professional editor worked on it. Furthermore, the use of an editor to create meaningful text from the &#x201c;standard sentences&#x201d; in Part A would amount to ghost writing. Here is just one example from the supposedly edited text:</bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> 
                        <italic>Compared to healthy donors, TaqMan assays revealed high ISG15 expression (mRNA and protein) as well as an increased frequency of ISG15-SNPs in human PBMC from untreated patients with HIV infection.</italic>
                    </bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> TaqMan assays do not inform on protein levels. The frequency of a SNP cannot change unless there is gene duplication (what changes is the allele frequency). There is no need to use &#x201c;human&#x201d; to describe cells from HIV patients. There is a misplaced modifier before &#x201c;TaqMan&#x201d; and an unwanted hyphen.</bold>
                </p>
                <p>
                    <bold> </bold>
                </p>
                <p>
                    <bold> I therefore feel that, for a revised version of this manuscript to be approved, the fifth objective should be deleted along with its two associated skills (impossible to achieve in a short workshop). In addition, a paragraph should be added with comments on the need for early career researchers to learn to read the scientific literature at different levels (from scanning to critical appraisal
                        <italic>&#x00a0;of the full text</italic>) and to appreciate that scientific findings must be replicated and validated before they can be considered &#x201c;true&#x201d;.</bold>
                </p>
                <p> </p>
                <p> We amended Step 5 as described above, removing altogether any notion of automated text generation while keeping the step that consists of capturing and recording key information in a spreadsheet. If necessary, even this could be removed, but, as stated above, we feel it is a worthwhile exercise.</p>
                <p> </p>
                <p> We amended our learning objectives, removing the two points mentioned in your critique and writing instead: 
                    <list list-type="bullet">
                        <list-item>
                            <p>Extraction of biomedical information (capturing information in a structured format).</p>
                        </list-item>
                    </list> We also added wording highlighting the need for participants to go beyond scanning titles and abstracts and read research papers:</p>
                <p> </p>
                <p> &#x201c;Thus, early career researchers need to develop the ability to read the scientific literature at different levels: from &#x201c;scanning&#x201d; a large number of articles to reading the full text version for a more in depth understanding.&#x201d;</p>
                <p> </p>
                <p> And for the record, since integrity is no small matter in science, we maintain that a professional and reputable science editor did work on this text, as we had stated earlier, even though, as pointed out, the service rendered was perhaps not of the highest quality in this case (we have otherwise been quite happy with them).</p>
                <p> </p>
                <p> 
                    <bold>Minor comments</bold>
                </p>
                <p> 
                    <bold>The title is a hard read, because there are six words between &#x201c;Organizing&#x201d; and what gets organized. I suggest: &#x201c;Organizing training workshops on gene literature &#x2026;. for early career researchers&#x201d;.</bold>
                </p>
                <p> </p>
                <p> Thank you for the helpful edits, which have been incorporated throughout the text to the best of our abilities. We also took your advice and changed the title of the article to improve readability.</p>
                <p> </p>
                <p> 
                    <bold>The word &#x201c;profiling&#x201d; is used to the point of being meaningless. It is used six times in the 20-line Introduction, in the context of literature, omics and transcriptome. It should be deleted after &#x201c;omics&#x201d; and &#x201c;transcriptome&#x201d;, as these words are best used as nouns, not adjectives describing the vague &#x201c;profiling&#x201d;. Regarding &#x201c;literature profiling&#x201d;, this non-standard expression should be defined early in the text, not in the Methods section (bottom page 3). Here, profiling means &#x201c;information extraction&#x201d; from titles and &#x201c;determination of keyword frequencies&#x201d;. This definition should be given in the Introduction as soon as &#x201c;literature profiling&#x201d; is mentioned.</bold>
                </p>
                <p> </p>
                <p> We have deleted &#x201c;profiling&#x201d; where it was associated with omics data, and we now define &#x201c;literature profiling&#x201d; early in the Introduction.</p>
                <p> </p>
                <p> 
                    <bold>Throughout the text: The use of conditional mood (&#x201c;this session should present&#x201d;, &#x201c;the presentation should last&#x201d; and so on) is unnecessary, burdens the text and slows reading. The simple present tense can be used in most cases. A method is not a recommendation (&#x201c;should&#x201d;) but a description (&#x201c;is&#x201d;).</bold>
                </p>
                <p> </p>
                <p> That makes sense and we have edited the text throughout.</p>
                <p> </p>
                <p> 
                    <bold>Throughout the text: The use of italics in the &#x201c;illustrative cases&#x201d; makes reading difficult, especially because of the extensive use of quotation marks and parentheses. Italics should be reserved for emphasis on small pieces of text. Here, instead, a block quote (indented text) would be better. Text boxes can also help.</bold>
                </p>
                <p> </p>
                <p> Text boxes sound ideal. We will reach out to the editor to see if this is feasible.</p>
                <p> </p>
                <p> 
                    <bold>Page 2: &#x201c;some independent preparation&#x201d;. Does this mean independent study, exercises, getting organized technically?</bold>
                </p>
                <p> </p>
                <p> We clarified this point by rewording this sentence, which now reads: &#x201c;Trainees can be asked to perform tasks between each of the sessions.&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Page 2: Ref. 34 is cited right after Ref. 4. It should be renumbered 5. I confused it with Refs. 3,4 (missing a comma). Similarly Ref. 35 is cited too early, on page 6.</bold>
                </p>
                <p> </p>
                <p> Indeed, we will communicate this to the editor so it can be addressed in the formatted document.</p>
                <p> </p>
                <p> 
                    <bold>Page 3, line 3: &#x201c;on any given specific research topics&#x201d; should be changed to &#x201c;on any given research topic&#x201d;.&#x00a0;
                        <italic>Topic</italic>&#x00a0;should be singular.&#x00a0;
                        <italic>Specific</italic>&#x00a0;is unneeded and redundant with &#x201c;given topic&#x201d;.</bold>
                </p>
                <p> </p>
                <p> This sentence has been edited accordingly.</p>
                <p> </p>
                <p> 
                    <bold>Page 3: &#x201c;for a gene with &#x00b1; 1000 associated articles and five participants&#x201d;. This phrase is too concise to have meaning, as a gene cannot have participants.</bold>
                </p>
                <p> </p>
                <p> We have edited this sentence to read:</p>
                <p> </p>
                <p> &#x201c;For instance, for a gene with about 1,000 associated articles, five participants could work together on this same gene, each focusing on a different theme.&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Page 4, Step 1. No mention is made of NCBI&#x2019;s Advanced Search Builder; MeSH terms; or truncation by asterisk. The manuscript should explain why these advanced features are not taught.</bold>
                </p>
                <p> </p>
                <p> Indeed, thanks for the suggestion. We have rewritten this section, which now mentions the availability of the Advanced Search Builder and provides the corresponding URL. We also point to training material available on the NCBI website. In our experience, truncation by asterisk can produce unwanted results and increase false positives, so we did not want to recommend using it in this case.</p>
                <p> </p>
                <p> 
                    <bold>Page 4, bottom: &#x201c;The field restriction tag [tw] (or alternatively the more restrictive [tiab]) should be added after each term&#x2026; to limit the search to titles and abstracts&#x201d;. Actually, it is [tiab] that limits the search to titles and abstracts. The broad [tw] searches almost everything (see&#x00a0;
                        <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/help/#tw">https://pubmed.ncbi.nlm.nih.gov/help/#tw</ext-link>).</bold>
                </p>
                <p> </p>
                <p> Yes, this is correct, and this entire section has been rewritten for clarity. We found the [tw] field to be effective in preventing PubMed from attempting best-guess matches when few entries are returned. [tiab] can be used when users wish to restrict the search further. Both should provide good results for our purpose.</p>
                <p> </p>
                <p> 
                    <bold>Page 5: &#x201c;examples provided above&#x201d;. Where? Vague cross-references like this make reading difficult.</bold>
                </p>
                <p> </p>
                <p> Thanks for pointing this out. We are now providing those examples instead.</p>
                <p> </p>
                <p> 
                    <bold>Page 6: &#x201c;For instance, when assessing the novelty &#x2026;&#x201d;. This incomplete sentence is lacking a subject-verb pair. Similar problem on page 13, bottom.</bold>
                </p>
                <p> </p>
                <p> This has been corrected.</p>
                <p> </p>
                <p> 
                    <bold>Page 6, bottom: &#x201c;The Boolean argument AND is added&#x201d;. AND is a Boolean operator, not argument.</bold>
                </p>
                <p> </p>
                <p> Indeed, this has been corrected.</p>
                <p> </p>
                <p> 
                    <bold>Page 6, bottom: &#x2018;&#x2026; AND (&#x201c;Liver cancer&#x201d;[tw] OR &#x201c;Liver carcinoma&#x201d;[tw] OR &#x201c;Hepatic carcinoma&#x201d;[tw], etc.)&#x2019;. A query cannot contain &#x201c;, etc.&#x201d;. It should be removed, and a final period added to the sentence.</bold>
                </p>
                <p> This has been corrected as well.</p>
                <p> </p>
                <p> 
                    <bold>Page 9, Step 5. This part is excessively wordy and suggests an uneasiness with the concepts. We read &#x201c;is the measure by which productivity is measured&#x201d;, &#x201c;it can be at first important to be able to identify information that is worth capturing&#x201d; and &#x201c;A specific topic would be selected as the focus &#x2026; to limit the amount of literature that would need to be covered&#x201d;.</bold>
                </p>
                <p> </p>
                <p> We rewrote and condensed this paragraph.</p>
                <p> </p>
                <p> 
                    <bold>Page 13: Webex. Not WebEx.</bold>
                </p>
                <p> This has been corrected.</p>
                <p> </p>
                <p> 
                    <bold>Punctuation</bold>
                </p>
                <p> </p>
                <p> Thank you, that was very useful, and we have now made the necessary edits throughout. 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <bold>Comma after &#x201c;Supporting material&#x201d; in the third paragraph of Introduction. Comma after &#x201c;This is a skill that is worth developing&#x201d; (page 9).</bold>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <bold>Replace &#x00b1; with &#x201c;about&#x201d; (first paragraph of Methods).</bold>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <bold>Hyphens: no hyphen after -ly adverbs, e.g. &#x201c;early career researcher&#x201d;. Use hyphens in &#x201c;gene-associated literature&#x201d;, &#x201c;false-positive result&#x201d; and &#x201c;false-negative result&#x201d; (but not in &#x201c;false positives&#x201d; or &#x201c;false negatives&#x201d;).</bold>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <bold>There are many instances where the required space between words is missing, especially when a quotation mark is used, e.g. &#x201c;For example, under&#x2019;Human diseases and &#x2026; associated keywords could be&#x2019;liver cancer&#x2019;,&#x2019;hepatic carcinoma&#x2019;, &#x2026;&#x201d; (page 5). The entire manuscript should be checked and spaces added where needed. A hint is that the curly quotation mark is in the wrong direction before the word when a space is missing.</bold>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <bold>Double quotation marks should be used in all cases except when nesting is needed. Nesting should be used on page 5 where the sentence: '
                                    <italic>This was rectified by adding the AND &#x201c;cross-reactive protein [tw] argument to each of the ambiguous acronym.'&#x00a0;</italic>should be changed to: '
                                    <italic>This was rectified by adding the &#x2018;AND &#x201c;cross-reactive protein [tw]&#x201d;&#x2019; argument to each of the ambiguous acronyms.'&#x00a0;</italic>or more simply: '
                                    <italic>This was rectified by adding &#x2018;AND &#x201c;cross-reactive protein [tw]&#x201d;&#x2019; to each of the ambiguous acronyms.'</italic>
                                </bold>
                            </p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report86220">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.39457.r86220</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>McDowell</surname>
                        <given-names>Gary S.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r86220a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-9470-3799</uri>
                </contrib>
                <aff id="r86220a1">
                    <label>1</label>Lightoller LLC, Chicago, IL, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>15</day>
                <month>6</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 McDowell GS</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport86220" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.36395.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors discuss the need for early career researchers to develop the ability to survey and synthesize the background literature of a field in an appropriate manner, and they provide a curriculum for doing so. This is demonstrated with publicly available &#x201c;omics&#x201d; data as an example.</p>
            <p> </p>
            <p> This article gives an excellent description of a training workflow that is suitable and adaptable for training in systematic literature profiling. The article provides clear illustrative use cases that are very helpful in demonstrating the results of the training workflow and that give others excellent materials to work from in developing similar materials.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Early career researcher experiences, training and professional development.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
