<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="brief-report" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.50857.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Brief Report</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Interactive SARS-CoV-2 mutation timemaps</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 3 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Warren</surname>
                        <given-names>Ren&#x00e9; L.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Birol</surname>
                        <given-names>Inanc</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, V5Z 4S6, Canada</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:rwarren@bcgsc.ca">rwarren@bcgsc.ca</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>2</month>
                <year>2021</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2021</year>
            </pub-date>
            <volume>10</volume>
            <elocation-id>68</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>29</day>
                    <month>1</month>
                    <year>2021</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Warren RL and Birol I</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/10-68/pdf"/>
            <abstract>
                <p>As 2020 came to a close, several new strains of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the agent responsible for the coronavirus disease 2019 (COVID-19) pandemic, were reported. However, it is difficult to comprehend the scale, in sequence space, geographical location and time, at which SARS-CoV-2 mutates and evolves in its human hosts. To convey the rapid evolution of the coronavirus, we built interactive scalable vector graphics (SVG) maps that show daily nucleotide variations in genomes from the six most populated continents compared to the genome of the initial, ground-zero SARS-CoV-2 isolate sequenced at the beginning of the year.</p>
                <p> 
                    <bold>Availability: </bold>The tool used to perform the reported mutation analysis results, ntEdit, is available from 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/bcgsc/ntedit">GitHub</ext-link>. Genome mutation reports are available for download from 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bcgsc.ca/downloads/btl/SARS-CoV-2/mutations/">BCGSC</ext-link>. Mutation time maps are available from 
                    <ext-link ext-link-type="uri" xlink:href="https://bcgsc.github.io/SARS2/">https://bcgsc.github.io/SARS2/</ext-link>.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>SARS-CoV-2</kwd>
                <kwd>COVID-19</kwd>
                <kwd>Mutation time maps</kwd>
                <kwd>GISAID</kwd>
                <kwd>Interactive SVG</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100008762">
                    <funding-source>Genome Canada</funding-source>
                    <award-id>281ANV</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/100000002">
                    <funding-source>National Institutes of Health</funding-source>
                    <award-id>2R01HG007182-04A1</award-id>
                </award-group>
                <funding-statement>This work was supported by Genome BC and Genome Canada [281ANV]; and the National Institutes of Health [2R01HG007182-04A1]. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funding organizations.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>In the last few weeks of 2020, new severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) mutations in the United Kingdom (UK) were reported
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup>. Although coronavirus genome mutations have been previously discovered and announced throughout the year, including the widely discussed D614G missense change in the spike protein
                <sup>
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup>
                <sup>,</sup>
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup>, the latest recurring surface protein mutations to be identified (e.g. N501Y, P681H) are cause for concern. The SARS-CoV-2 viral 
                <italic toggle="yes">S</italic> gene encodes a surface glycoprotein which, upon interaction with host ACE-2 receptors, makes it possible for the coronavirus to gain entry to host cells and propagate. The reported changes to its sequence may be associated with increased virulence
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup>, infectivity
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> and overall fitness
                <sup>
                    <xref ref-type="bibr" rid="ref5">5</xref>
                </sup>. The global response to those recent reports has been swift, with several countries shutting down air travel from the UK. This highlights the severity of the situation and the importance of tracking genomic variations and their predicted effects over time and space.</p>
            <p>The rapid evolution of the SARS-CoV-2 genome in human hosts has prompted us to map all nucleotide changes that have appeared in 2020, since the first genome sequence of a COVID-19 patient isolate from the outbreak epicentre in Wuhan, China was made public
                <sup>
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup>. For this, we leveraged the collaborative efforts of hundreds of institutions worldwide who have graciously shared over 260,000 SARS-CoV-2 genome sequences with the 
                <ext-link ext-link-type="uri" xlink:href="https://www.gisaid.org/">GISAID</ext-link> central repository since early January 2020
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup>. Our mutation time maps show the large number of nucleotide variants that have accumulated across the whole viral genome throughout the year, especially since fall 2020, in the six most populated continents. Here we present key features of these maps and how they may be of utility to researchers.</p>
        </sec>
        <sec id="sec2" sec-type="methods">
            <title>Methods</title>
            <p>We first downloaded all complete, high-coverage SARS-CoV-2 genomes from GISAID
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup> on January 23
                <sup>rd</sup>, 2021 (samples collected from human hosts). We then ran a genome polishing pipeline, which consists of ntHits
                <sup>
                    <xref ref-type="bibr" rid="ref8">8</xref>
                </sup> (v0.1.0 -b 36 -outbloom -c 1 -p seq -k 25) followed by ntEdit
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>
                </sup> (v1.3.4 -i 5 -d 5 -m 1 -r seq_k25.bf). The pipeline required at most 0.5 GB of RAM and ran in ~1 s per genome on a single CPU. We used the first published SARS-CoV-2 genome isolate
                <sup>
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup> (WH-Human 1 coronavirus, GenBank accession: MN908947.3) as the reference and each individual GISAID genome in turn as the source of k-mers to identify base variation relative to the reference. The variant call format (VCF) output files from ntEdit were parsed and we tallied, for each submitted GISAID genome, the complete list of nucleotide variations. We next organized each nucleotide variant by sample collection date and continent of origin and, when applicable, evaluated its effect on the gene product that harbours the change; the results were output as an interactive scalable vector graphics (SVG) file. The script we developed to generate the maps is written in Perl and distributed under GPLv3. Users wishing to generate custom maps can download the script from 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4469840">Zenodo</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup>.</p>
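            <p>The tallying step described above can be sketched as follows. This is a minimal Python illustration and not the authors' Perl script: the function names <monospace>parse_vcf_variants</monospace> and <monospace>tally_variants</monospace> are hypothetical, the input is assumed to be standard tab-delimited VCF text, and applying the five-genome threshold per collection date and continent is an assumption about how the daily filter is grouped.</p>
            <code language="python"><![CDATA[
```python
from collections import Counter


def parse_vcf_variants(vcf_text):
    """Return a list of (pos, ref, alt) tuples from VCF-formatted text.

    Assumes the standard VCF column order: CHROM, POS, ID, REF, ALT, ...
    """
    variants = []
    for line in vcf_text.splitlines():
        # Skip blank lines and header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete variant record
        pos, ref, alts = int(fields[1]), fields[3], fields[4]
        # A record may list several comma-separated alternate alleles.
        for alt in alts.split(","):
            variants.append((pos, ref, alt))
    return variants


def tally_variants(samples, min_genomes=5):
    """Count genomes carrying each variant per (date, continent).

    samples: iterable of (collection_date, continent, vcf_text), one
    entry per genome. Only variants seen in at least `min_genomes`
    genomes for a given date and continent are kept (assumed grouping).
    """
    counts = Counter()
    for collection_date, continent, vcf_text in samples:
        # Use a set so each genome contributes at most once per variant.
        for variant in set(parse_vcf_variants(vcf_text)):
            counts[(collection_date, continent, variant)] += 1
    return {key: n for key, n in counts.items() if n >= min_genomes}
```
]]></code>
            <p>In this sketch, each genome is represented by one VCF text and its metadata; the returned dictionary maps a (date, continent, variant) key to the number of contributing genomes, which is the quantity plotted on the maps.</p>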
        </sec>
        <sec id="sec3" sec-type="results|discussion">
            <title>Results and discussion</title>
            <p>We analyzed nucleotide variations over time in over 260,000 SARS-CoV-2 viral genomes, submitted to the GISAID initiative
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup> from around the globe, relative to that of the ground zero COVID-19 clinical isolate
                <sup>
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup>. We mapped each mutation that was observed in five or more genomes each day. The 2020 calendar year from January 1
                <sup>st</sup> 2020 (day 1) to December 31
                <sup>st</sup> 2020 (day 366) is organized in a circle where each radius represents a day (1 day = 0.98 degrees) and data points represent mutations along the reference genome sequence from 1 (closest to center) to 29,903 bp (near the outer rim). The size of each point scales with the log10 of the number of viral genomes collected on that day that carry the mutation, with colours indicating the continent where the mutation is observed. Hovering the mouse cursor over a data point reveals the collection date, the nucleotide variant, the continent, the number of contributing genome sequences (including the daily sample fraction) and, when applicable, the gene product and predicted amino acid change.</p>
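            <p>The circular layout described above can be illustrated with a short coordinate calculation. The Python sketch below is only an approximation of the idea under stated assumptions: the function name <monospace>point_geometry</monospace>, the canvas centre, the maximum radius and the size multiplier are all hypothetical values chosen for illustration, and the clockwise orientation from the upper vertical follows the Figure 1 caption.</p>
            <code language="python"><![CDATA[
```python
import math
from datetime import date

GENOME_LENGTH = 29903  # reference genome length in bp (MN908947.3)
DAYS_IN_2020 = 366     # 2020 was a leap year


def point_geometry(collection_date, genome_pos, n_genomes,
                   cx=500.0, cy=500.0, r_max=450.0, size_scale=2.0):
    """Return (x, y, size) for one mutation data point.

    cx, cy, r_max and size_scale are illustrative assumptions.
    n_genomes must be >= 1 (the maps use a minimum of 5 per day).
    """
    deg_per_day = 360.0 / DAYS_IN_2020  # about 0.98 degrees per day
    day = collection_date.timetuple().tm_yday  # January 1 is day 1
    angle = math.radians((day - 1) * deg_per_day)
    # Position 1 sits near the centre, position 29,903 near the rim.
    radius = r_max * genome_pos / GENOME_LENGTH
    # SVG y grows downward, so this runs clockwise from the top.
    x = cx + radius * math.sin(angle)
    y = cy - radius * math.cos(angle)
    size = size_scale * math.log10(n_genomes)  # log10-scaled point size
    return x, y, size
```
]]></code>
            <p>For example, under these assumed canvas settings, a variant at the last reference base observed in 10 genomes on January 1<sup>st</sup> 2020 would be drawn directly above the centre, on the outer rim.</p>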
            <p>From the SARS-CoV-2 genome mutation time map (
                <xref ref-type="fig" rid="f1">Figure 1A</xref>), we observe the first persistent mutations (&#x2265;5 genomes/day) appearing in late February 2020, including the prevalent D614G mutation in Europe on February 22
                <sup>nd</sup> (although it was present in fewer samples since January, 
                <xref ref-type="fig" rid="f1">Figure 1B</xref>). From there, the original coronavirus genome sustained many changes over time (5,468 distinct variants mapped in 2020 as of January 23
                <sup>rd</sup>, 2021), including a sizeable proportion (56.8%) of missense mutations. It is immediately evident from 
                <xref ref-type="fig" rid="f1">Figure 1A</xref> that variations from Europe account for the largest share (71.2%) of the variants mapped. Further, there appears to be a surge in variations identified in Europe from late summer through fall 2020. This may be explained by a disproportionate number of submissions of samples originating from this continent as the second wave hit. Thus, caution in interpreting the map is warranted. Of note, the spike protein gene variant N501Y, observed on our maps in the UK in late September 2020 (
                <xref ref-type="fig" rid="f1">Figure 1</xref>), is consistent with an earlier study reporting on its recurrent emergence within this time frame
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup>. We think these maps will be of utility to researchers in their exploration of SARS-CoV-2 mutations and their predicted effect over time.</p>
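            <p>To make the notion of a predicted amino acid change concrete, the following Python sketch derives labels such as D614G from a nucleotide substitution. It is an illustration only and not the authors' implementation: the function name <monospace>predict_aa_change</monospace> is hypothetical, and the spike coding start (21563 in MN908947.3) and the reference codons (GAT for residue 614, AAT for residue 501) are assumptions supplied here rather than values stated in this article. The sketch handles single-base substitutions in a simple coding region and ignores features such as the ribosomal frameshift in <italic toggle="yes">ORF1ab</italic>.</p>
            <code language="python"><![CDATA[
```python
from itertools import product

# Standard genetic code, built with the bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def predict_aa_change(cds_start, genome_pos, ref_codon, alt_base):
    """Return (label, kind) for a single-base substitution.

    cds_start: 1-based genome coordinate of the first coding base.
    genome_pos: 1-based genome coordinate of the substituted base.
    ref_codon: reference codon that contains genome_pos.
    alt_base: substituted nucleotide (A, C, G or T).
    """
    offset = genome_pos - cds_start   # 0-based offset within the CDS
    codon_number = offset // 3 + 1    # 1-based residue number
    frame = offset % 3                # position of the base in its codon
    alt_codon = ref_codon[:frame] + alt_base + ref_codon[frame + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return f"{ref_aa}{codon_number}{alt_aa}", kind
```
]]></code>
            <p>Under the assumptions above, calling <monospace>predict_aa_change(21563, 23403, "GAT", "G")</monospace> labels the A-to-G substitution at position 23,403 as the missense change D614G.</p>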
            <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                <label>Figure 1.</label>
                <caption>
                    <title>Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) evolution in human hosts.</title>
                    <p>ntEdit was used to map nucleotide variations between the first published coronavirus isolate from Wuhan, China in early January and over 260,000 SARS-CoV-2 genomes sampled from around the globe during the 2020 coronavirus disease 2019 (COVID-19) pandemic. The maps show missense mutations arising daily (A) in the world within the whole viral genome, with the reference genome represented by the vertical axis from bases 1 to 29.9 kbp and (B) in Europe within the spike protein gene. Alternating dark
                        <italic toggle="yes">/</italic>light grey vertical rectangles and associated tracks depict, starting from the center, SARS-CoV-2 genes 
                        <italic toggle="yes">
                            <sc>orf</sc>1
                            <sc>ab</sc>, S, ORF3
                            <sc>a</sc>, E, M, ORF6, ORF7
                            <sc>a</sc>, ORF8, N,</italic> and 
                        <italic toggle="yes">ORF10.</italic> Mutations identified daily are represented by circles in a given radius and are coloured by regions and sized relative to raw count (panel A) or ratio (panel B) of the daily samples. A stacked bar plot (center) shows sample count. The 2020 calendar year mutations are organized clockwise from the upper vertical. Hovering the mouse cursor over each data point reveals additional insights (not shown).</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/53946/42ea96a0-e743-458e-8730-c70b0d48762f_figure1a.gif"/>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/53946/42ea96a0-e743-458e-8730-c70b0d48762f_figure1b.gif"/>
            </fig>
        </sec>
        <sec id="sec4">
            <title>Data availability</title>
            <sec id="sec5">
                <title>Source data</title>
                <p>The SARS-CoV-2 genome sequences can be accessed via the 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gisaid.org/">GISAID</ext-link> central repository. Processed single nucleotide variant (SNV) data is available from 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bcgsc.ca/downloads/btl/SARS-CoV-2/mutations/">https://www.bcgsc.ca/downloads/btl/SARS-CoV-2/mutations/</ext-link>.</p>
            </sec>
        </sec>
        <sec id="sec6">
            <title>Maps availability</title>
            <p>
                <list list-type="simple">
                    <list-item>
                        <label>-</label>
                        <p>Maps are available from: 
                            <ext-link ext-link-type="uri" xlink:href="https://bcgsc.github.io/SARS2">https://bcgsc.github.io/SARS2</ext-link>
                        </p>
                    </list-item>
                    <list-item>
                        <label>-</label>
                        <p>SNV detection source code is available from: 
                            <ext-link ext-link-type="uri" xlink:href="https://github.com/bcgsc/ntedit">https://github.com/bcgsc/ntedit</ext-link>
                        </p>
                    </list-item>
                    <list-item>
                        <label>-</label>
                        <p>Archived source code at time of publication: 
                            <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4469840">https://doi.org/10.5281/zenodo.4469840</ext-link>
                            <sup>
                                <xref ref-type="bibr" rid="ref10">10</xref>
                            </sup>
                        </p>
                    </list-item>
                </list>
            </p>
            <p>Data are available under the terms of the 
                <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International license</ext-link> (CC-BY 4.0).</p>
        </sec>
        <sec id="sec7">
            <title>Author contributions</title>
            <p>Study design: RLW. Analysis: RLW. Both authors wrote the manuscript.</p>
        </sec>
    </body>
    <back>
        <ack id="ack1">
            <title>Acknowledgements</title>
            <p>We acknowledge Cecilia (Lingyu) Yang for her early work on SARS-CoV-2 variants.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="web">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rambaut</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations.</article-title>
                    <source>

                        <italic toggle="yes">Virological</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://virological.org/t/preliminary-genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-the-uk-defined-by-a-novel-set-of-spike-mutations">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dey</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identification and computational analysis of mutations in SARS-CoV-2.</article-title>
                    <source>

                        <italic toggle="yes">Comput Biol Med</italic>
</source>
                    <year>2021</year>;<volume>129</volume>:<fpage>104166</fpage>.
                    <pub-id pub-id-type="pmid">33383528</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.compbiomed.2020.104166</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7837166</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Korber</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus.</article-title>
                    <source>

                        <italic toggle="yes">Cell</italic>
</source>
                    <year>2020</year>;<volume>182</volume>:<fpage>812</fpage>&#x2013;<lpage>827</lpage>.
                    <pub-id pub-id-type="pmid">32697968</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cell.2020.06.043</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7332439</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gu</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Adaptation of SARS-CoV-2 in BALB/c Mice for Testing Vaccine Efficacy.</article-title>
                    <source>

                        <italic toggle="yes">Science</italic>
</source>
                    <year>2020</year>;<volume>369</volume>:<fpage>1603</fpage>&#x2013;<lpage>1607</lpage>.
                    <pub-id pub-id-type="pmid">32732280</pub-id>
                    <pub-id pub-id-type="doi">10.1126/science.abc4730</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7574913</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Plante</surname>
                            <given-names>JA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Spike mutation D614G alters SARS-CoV-2 fitness.</article-title>
                    <source>

                        <italic toggle="yes">Nature</italic>
                    </source>
                    <year>2020</year>.
                    <pub-id pub-id-type="pmid">33106671</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41586-020-2895-3</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A new coronavirus associated with human respiratory disease in China.</article-title>
                    <source>

                        <italic toggle="yes">Nature</italic>
</source>
                    <year>2020</year>;<volume>579</volume>:<fpage>265</fpage>&#x2013;<lpage>269</lpage>.
                    <pub-id pub-id-type="pmid">32015508</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41586-020-2008-3</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7094943</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="web">
                    <collab>re3data.org</collab>:
                    <article-title>GISAID; editing status.</article-title>
                    <source>

                        <italic toggle="yes">re3data.org - Registry of Research Data Repositories.</italic>
</source>
                    <year>2020-02-03</year>.
                    <pub-id pub-id-type="doi">10.17616/R3Q59F</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mohamadi</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ntHits: de novo repeat identification of genomics data using a streaming approach.</article-title>
                    <source>

                        <italic toggle="yes">BioRxiv</italic>
</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="doi">10.1101/2020.11.02.365809</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Warren</surname>
                            <given-names>RL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ntEdit: scalable genome sequence polishing.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics</italic>
</source>
                    <year>2019</year>;<volume>35</volume>:<fpage>4430</fpage>&#x2013;<lpage>4432</lpage>.
                    <pub-id pub-id-type="pmid">31095290</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btz400</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6821332</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Warren</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Birol</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Interactive SARS-CoV-2 mutation timemaps (Version v1.1).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo</italic>
</source>
                    <year>2021, January 26</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.4469840</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report85512">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.53946.r85512</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Moradi</surname>
                        <given-names>Jale</given-names>
                    </name>
                    <xref ref-type="aff" rid="r85512a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2050-1393</uri>
                </contrib>
                <aff id="r85512a1">
                    <label>1</label>Department of Microbiology, Faculty of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>1</day>
                <month>6</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Moradi J</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport85512" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.50857.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have geographically shown the nucleotide variations for global SARS-CoV-2 sequences in a time map. The sequences were downloaded, polished and analyzed with ntHits and ntEdit. The Wuhan-Hu-1-NC_045512/MN908947 sequence was set as the reference, and the variation output was then mapped by sample collection time using a script written in Perl. The results are shown in two circular maps covering &#x201c;whole viral genome&#x201d; and &#x201c;spike protein gene&#x201d; variations over time, from January 1
                <sup>st</sup> 2020 as day 1 to December 31
                <sup>st</sup> 2020 as day 366. Each radius in these circles represents a day and each spot on this radius shows a variation. The spots are shown in different colours, each indicating a specific geographical region (continent or country).</p>
            <p> </p>
            <p> It is a useful tool for an overview of the evolution of the virus since the beginning of the epidemic. Furthermore, one can see which parts of the genome have more variations, and the colouring of the map helps us to understand approximately how many mutations there are in different regions, or from which regions the mutations originated. If it were possible to identify the relevant mutation (exact mutation type) by clicking on each spot, the tool would be more helpful. Also, spots overlap in many places; the map would be more informative if it were possible to determine which spots an overlap includes.</p>
            <p> </p>
            <p> Overall, the developed script provides a useful map for viewing the pattern of virus evolution globally, although it would be more informative if the authors could improve this script to solve the mentioned issues.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Not applicable</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Partly</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Medical Microbiology, genomics, immunology</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment6757-85512">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Warren</surname>
                            <given-names>Ren&#x00e9;</given-names>
                        </name>
                        <aff>Canada's Michael Smith Genome Sciences Centre, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>no competing interests to declare</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>3</day>
                    <month>6</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank our Reviewer for their support of our work and insights. We also value the suggestions, as they help us improve upon the work and broaden the interest.</p>
                <p> </p>
                <p> We have just published a revised version of the manuscript (v2), which expands on the utility of the maps, situates them in the context of other similar work, and introduces new map features to increase interactivity and improve the overall experience.</p>
                <p> </p>
                <p> Some of the maps' new features (since original submission):</p>
                <p> </p>
                <p> 
                    <underline>Interactivity</underline> 
                    <list list-type="order">
                        <list-item>
                            <p>Maps are draggable.</p>
                        </list-item>
                        <list-item>
                            <p>Zoom/pan.</p>
                        </list-item>
                        <list-item>
                            <p>Tilt 90 degrees to make axis horizontal (this and above features implemented in a navigation wheel).</p>
                        </list-item>
                        <list-item>
                            <p>Colour highlight on mutation tooltip.</p>
                        </list-item>
                        <list-item>
                            <p>Gene/variant views have additional colour highlight (by region) on certain maps*.</p>
                        </list-item>
                    </list> *The added functionality comes at a cost, making the maps sluggish when views are too dense, which is why this feature is currently only used for individual gene/variant displays and not the whole genome.</p>
                <p> </p>
                <p> 
                    <underline>Improvements:</underline> 
                    <list list-type="order">
                        <list-item>
                            <p>Over 120 individual displays; all SARS-CoV-2 genes are now presented.</p>
                        </list-item>
                        <list-item>
                            <p>Better discrimination of close high-frequency mutations allows more information to show through by adjusting the spot ratio (r=sqrt(freq*factor/pi)); spots are no longer plotted on a log10 scale in ratio mode.</p>
                        </list-item>
                        <list-item>
                            <p>When percentages are equal, a secondary sort is applied such that the colour matches the first region labelled.</p>
                        </list-item>
                        <list-item>
                            <p>Better grouping/sorting of overlapping points.</p>
                        </list-item>
                        <list-item>
                            <p>Added the ability to switch year from the current view (2020&lt;-&gt;2021) and to switch between ratio (%) and raw (#) counts without having to return to the main menu and use the drop-down.</p>
                        </list-item>
                    </list> The mutation "spots" are also plotted incrementally (by coordinates) and by decreasing order of frequency, allowing most mutations to show interactively (and not be obscured by overlaps). Overlaps are unavoidable with displays that are too dense, and some data points may still be out of reach, but other individual maps (e.g. variant/gene levels) may provide a better view of the most important mutations.</p>
                <p> </p>
                <p> Improvements 2), 3) and 4) in particular are in response to our Reviewer's comment on spot overlap, and calculating the ratio in such a fashion (instead of log10) enables better resolution of close-by high-frequency mutations (such as D614G). Most displays show missense mutations only, to minimize display density, but we also offer representations by type (missense vs silent) and all-encompassing views. In the tooltip, the mutation type is shown either as its effect in amino acid space (e.g. N501Y) or as silent when the nucleotide variation has no predicted effect.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report85294">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.53946.r85294</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Koyama</surname>
                        <given-names>Takahiko</given-names>
                    </name>
                    <xref ref-type="aff" rid="r85294a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-1694-9061</uri>
                </contrib>
                <aff id="r85294a1">
                    <label>1</label>TJ Watson Research Center, IBM, Scarsdale, NY, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>17</day>
                <month>5</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Koyama T</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport85294" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.50857.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Authors have developed a web based visualization tool for longitudinal evolution of SARS-CoV-2 genomes.</p>
            <p> </p>
            <p> Although they have made a unique representation of longitudinal strain developments, the utility of the tool is not clear. For instance, while the concentric circle representation of daily genomes is visually appealing, it limits the duration to a year and the inner part inevitably becomes crowded compared with the outer area.</p>
            <p> </p>
            <p> Lack of interactivity is also an issue. There must have been a way to magnify the area.</p>
            <p> </p>
            <p> Furthermore, in mutation-prone loci, the dots overlap and it is not easy to see what is going on. For these reasons, the utility of the tool is limited; more improvements need to be made before it gains a large user base.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Partly</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Not applicable</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Genomics, bioinformatics, oncology, immunology, virology, and stem cell biology.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment6680-85294">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Warren</surname>
                            <given-names>Ren&#x00e9;</given-names>
                        </name>
                        <aff>Canada's Michael Smith Genome Sciences Centre, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>no competing interests</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>17</day>
                    <month>5</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>Authors have developed a web based visualization tool for longitudinal evolution of SARS-CoV-2 genomes.</italic>
                </p>
                <p>
                    <italic> </italic>
                </p>
                <p>
                    <italic> Although they have made a unique representation of longitudinal strain developments, the utility of the tool is not clear. For instance, while the concentric circle representation of daily genomes is visually appealing, it limits the duration to a year and the inner part inevitably becomes crowded compared with the outer area.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>We thank our Reviewer for the valuable insights provided and for spending the time to review our work. We acknowledge the limitations of the display, and we stress that our original work on this was done in December, on 200,000 GISAID genomes and one year's worth of data. Our preprint became public in January 2021 and we subsequently submitted this work to F1000Research, summarizing pandemic-associated SARS-CoV-2 variants for the year 2020. A circular representation is an aesthetic choice that gives a bird's-eye view of the breadth of mutations.</bold>
                </p>
                <p> </p>
                <p> 
                    <italic>Lack of interactivity is also an issue. There must have been a way to magnify the area.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>This is a great suggestion. We have now added the ability to pan and zoom on each map, making the maps more interactive.</bold>
                </p>
                <p> </p>
                <p> 
                    <italic>Furthermore, in mutation-prone loci, the dots overlap and it is not easy to see what is going on. For these reasons, the utility of the tool is limited; more improvements need to be made before it gains a large user base.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>The maps were first built to visually quantify the appreciable variability that exists in rapidly evolving SARS-CoV-2 genomes. Since then, we have added spike-specific views and variants of concern (VOCs) to the list of maps available to the community. We also provide the tools to generate the maps, such that advanced users may customize and generate additional views of interest, as needed.</bold>
                </p>
            </body>
        </sub-article>
        <sub-article article-type="response" id="comment6758-85294">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Warren</surname>
                            <given-names>Ren&#x00e9;</given-names>
                        </name>
                        <aff>Canada's Michael Smith Genome Sciences Centre, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>no competing interests to declare</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>3</day>
                    <month>6</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We wanted to add to our previous response to our Reviewer. Once again, we are grateful for your suggestions to improve the interactivity of the maps. Since your Review, we have worked to improve the user experience, and we list some of the new features below:</p>
                <p> </p>
                <p> 
                    <underline>Interactivity</underline> 
                    <list list-type="order">
                        <list-item>
                            <p>Maps are draggable.</p>
                        </list-item>
                        <list-item>
                            <p>Zoom/pan.</p>
                        </list-item>
                        <list-item>
                            <p>Tilt 90 degrees to make axis horizontal (this and above features implemented in a navigation wheel).</p>
                        </list-item>
                        <list-item>
                            <p>Colour highlight on mutation tooltip.</p>
                        </list-item>
                        <list-item>
                            <p>Gene/variant views have additional colour highlight (by region) on certain maps*.</p>
                        </list-item>
                    </list> *The added functionality comes at a cost, making the maps sluggish when views are too dense, which is why this feature is currently only used for individual gene/variant displays and not the whole genome.</p>
                <p> </p>
                <p> 
                    <underline>Overall improvements</underline> 
                    <list list-type="order">
                        <list-item>
                            <p>Over 120 individual displays; all SARS-CoV-2 genes are now presented.</p>
                        </list-item>
                        <list-item>
                            <p>Better discrimination of close high-frequency mutations allows more information to show through by adjusting the spot ratio (r=sqrt(freq*factor/pi)); spots are no longer plotted on a log10 scale in ratio mode.</p>
                        </list-item>
                        <list-item>
                            <p>When percentages are equal, a secondary sort is applied such that the colour matches the first region labelled.</p>
                        </list-item>
                        <list-item>
                            <p>Better grouping/sorting of overlapping points.</p>
                        </list-item>
                        <list-item>
                            <p>Added the ability to switch year from the current view (2020&lt;-&gt;2021) and to switch between ratio (%) and raw (#) counts without having to return to the main menu and use the drop-down.</p>
                        </list-item>
                    </list> Thanks again for spending the time to review our work.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report83795">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.53946.r83795</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Ebersberger</surname>
                        <given-names>Ingo</given-names>
                    </name>
                    <xref ref-type="aff" rid="r83795a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8187-9253</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Iruegas</surname>
                        <given-names>Ruben</given-names>
                    </name>
                    <xref ref-type="aff" rid="r83795a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r83795a1">
                    <label>1</label>Applied Bioinformatics Group, Institute for Cell Biology and Neuroscience, Goethe-University Frankfurt, Frankfurt, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>14</day>
                <month>5</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Ebersberger I and Iruegas R</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport83795" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.50857.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors present interactive mutation time maps for SARS-CoV-2, which provide a high-resolution view of when, where and how frequently a particular mutation was detected in the sampled SARS-CoV-2 genome sequences provided via GISAID. The manuscript itself is rather short. It briefly describes the methodological approach by which the mutations were detected and mapped to the reference genome. The combined Results and Discussion section is equally concise and comprises a description of what is seen in the interactive maps together with a few example observations that can be made with these maps. The Discussion section ends with the expression of the hope that the maps presented here &#x201c;will help researchers in their exploration of SARS-CoV-2 mutations and their predicted effect over time.&#x201d;</p>
            <p> </p>
            <p> Overall, the topic touched on in this manuscript is highly relevant, as variation in SARS-CoV-2 is, and will remain, of major concern. Here, the graphs present very nice access to the information that is represented by the ever-increasing amount of viral genome sequences world-wide. The data presentation is appealing, and it allows an overview of the general trends of SARS-CoV-2 evolution. However, we see considerable room for (essential) improvement.</p>
            <p> </p>
            <p> Major issues:</p>
            <p> The authors end the manuscript with the belief that the interactive maps will be of help for the research community working on SARS-CoV-2 variation. We miss two things here:</p>
            <p> </p>
            <p> First, it would be great if the authors showed how the data provided by the maps can indeed be used to reach new conclusions, in particular with respect to the &#x2018;predicted effect over time&#x2019;. For us, it is entirely unclear how such an analysis should be performed. Exploring the data is something that one can do nicely while looking at the plots, and some clear signals, e.g. the fate of D614G, can also be extracted. But how should one work with the data beyond this simple and straightforward &#x2018;looking&#x2019; at the plots? Please don&#x2019;t get us wrong here: we consider looking at data a very important aspect of data analysis. Still, the sheer amount of information, which results in very dense plots with many overlapping data points, makes it, in our opinion, very hard to identify emerging variants that should be monitored right from the start. Just to give you an example: D614G is represented by a very prominent circle in the plots. What would be the authors&#x2019; approach to identifying and monitoring a novel variant, say at position 615 of the reference strain? By looking at the plots, we consider this almost impossible, since the signal will be entirely covered by the prominent mutation at position 614.</p>
            <p> </p>
            <p> The analysis is presented using the &#x201c;ground-zero&#x201d; strain as a reference. But is this still timely? Numerous variants now have frequencies that go far beyond that of the original nucleotide at a certain position, for example the D614G variant. Using a more current reference would make it possible to &#x2018;purge&#x2019; the signal of very successful variants, helping to direct the focus to emerging variants.</p>
            <p> </p>
            <p> When it comes to the website itself, we see some room for improvement: 
                <list list-type="bullet">
                    <list-item>
                        <p>First and foremost, we think the plots are overcrowded with information. Although it is nice to see a global overview of the data across the entire genome, 365 days, and 6 continents, it is impossible (at least for us) to explore this information other than by randomly clicking individual data points, as we have outlined above. We think this approach would benefit from providing the information in more digestible data fractions. Thus far, the user can choose to focus on the spike, but not on the other proteins. It would be helpful, just as a suggestion, to be able to focus also on variants with a certain prevalence. But we are sure that the authors will have far better ideas than our proposals here, once they specify how a user should work with the plots and the data. Looking at 
                            <ext-link ext-link-type="uri" xlink:href="https://nextstrain.org">https://nextstrain.org</ext-link>, which also provides a very nice overview of SARS-CoV-2 variation, may give some hints.</p>
                    </list-item>
                    <list-item>
                        <p>It would be very convenient if the interactive plots were designed such that the user can toggle the information for display, instead of having to go back to the main menu and select a different display mode.</p>
                    </list-item>
                    <list-item>
                        <p>Trend lines that show the prevalence of a certain variant in a certain region over time would help a lot and should be easy to implement.</p>
                    </list-item>
                    <list-item>
                        <p>Orienting where in the genome a certain variant lies is very hard. Although the vertical bars at the 12 o&#x2019;clock position in the circular plot should indicate in which ORF a variant is located, this is really hard to track across the full plot, in particular because the bar-ORF assignment is not visible.</p>
                    </list-item>
                    <list-item>
                        <p>The animation of daily variant emergence is again a nice feature. However, it is a gif and not interactive. The time lapse does not allow the user to pause, fast forward, or skip to a particular time. Moreover, x-axis labels overlap, in particular for the spike. This makes the plot nice to look at, but the information that can be retrieved is limited.</p>
                    </list-item>
                    <list-item>
                        <p>The graph of weekly spike protein variant emergence is not interactive and is difficult to read, as the lines overlap with each other and some have similar colors. Some functionalities could be implemented, such as being able to toggle strains from the right menu, selecting a time range and continent/country, and being able to hover over a line to display the information. The 2020 and 2021 plots have layout inconsistencies and could be merged into a single graph.</p>
                    </list-item>
                    <list-item>
                        <p>The variant emergence graph heavily competes with the information in 
                            <ext-link ext-link-type="uri" xlink:href="https://nextstrain.org">https://nextstrain.org</ext-link>, which claims to be updated daily.</p>
                    </list-item>
                </list> In the outermost ring, we detected a variant that is assigned to na/na. What is this supposed to mean?</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Partly</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Not applicable</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Partly</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>BioSequence-Informatics</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment6670-83795">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Warren</surname>
                            <given-names>Ren&#x00e9;</given-names>
                        </name>
                        <aff>Canada's Michael Smith Genome Sciences Centre, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests to declare</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>5</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>The authors present interactive mutation time maps for SARS-CoV-2, which provide a highly resolving view of when, where and how frequent a particular mutation was detected in the sampled SARS-CoV-2 genome sequences provided via GISAID. The manuscript itself is rather short. It is briefly describing the methodological approach of how the mutations have been detected and mapped to the reference genome. The combined Results and Discussion section is equally concise and comprises a description of what is seen in the interactive maps together with few example observations that can be made with these maps. The Discussion section ends with the expression of the hope that the maps presented here &#x201c;will help researchers in their exploration of SARS-CoV-2 mutations and their predicted effect over time.&#x201d;</italic>
                </p>
                <p>
                    <italic> </italic>
                </p>
                <p>
                    <italic> Overall, the topic that is touched in this manuscript is highly relevant, as variations of SARS-CoV-2 is something that currently is and will be of major concern in the future. Here, the graphs present a very nice access to the information that is represented by the ever-increasing amount of viral genome sequences world-wide. The data presentation is appealing, and it allows to overview the general trends of SARS-CoV-2 evolution. However, we see considerable room for (essential) improvement.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>We thank our Reviewers for their comments, suggestions and diligence with their extensive report. Our responses can be found below, in boldface.</bold>
                </p>
                <p> </p>
                <p> 
                    <italic>Major issues:</italic>
                </p>
                <p>
                    <italic> The authors end the manuscript with the belief that the interactive maps will be of help for the research community working on SARS-CoV-2 variation. We miss two things here:</italic>
                </p>
                <p>
                    <italic> </italic>
                </p>
                <p>
                    <italic> First, it would be great if the authors show how the data provided by the maps can be used to indeed come up with new conclusions, in particular with respect to the &#x2018;predicted effect over time&#x2019;. For us, it is entirely unclear how such an analysis should be performed. Exploring the data, this is something that one nicely can do while looking at the plots, some clear signals, e.g. the fate of D614G, can also be extracted. But how to work with the data beyond this simple and straightforward &#x2018;looking&#x2019; at the plots? Please, don&#x2019;t get us wrong here, we consider looking at data a very important aspect of data analysis. Still, the sheer amount of information, which results in very dense plots with many overlapping data points, makes it, in our opinion, very hard to identify emerging variants that should be monitored right from the start. Just to give you an example: D614G is represented by a very prominent circle in the plots. What would be the authors approach to identify and monitor a novel variant, say at position 615 of the reference strain? By looking at the plots, we consider this almost impossible, since the signal will be entirely covered by the prominent mutation at position 614.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>We greatly appreciate community feedback on the potential usefulness of this work: not only the maps, but also the additional analyses we were able to provide after we submitted the paper (our Reviewers mention them below), using the wealth of information we were able to mine from the GISAID genomes (these secondary analysis results, which consist of nucleotide variants and their effects, are tallied each week from each individual SARS-CoV-2 genome).&#x00a0; We originally built the maps to be fairly qualitative, to simply gain a [visual] appreciation for the rapid coronavirus evolution on a year scale, factoring in sample regions of origin, and this is what we presented in the manuscript. In our conclusion we give an example of a mutation that is observable from the GISAID genomes, on our maps, at the time reported in published papers. Since submission, the GISAID catalogue has more than doubled in size and the maps quickly became dense, as our Reviewer indicated. To help remedy the problem and make the maps more useful, we have since started to provide additional genome and spike views of variants of concern (VOCs) and have added visualizations for 2021 (a more digestible data fraction, indicated below by our Reviewer). Another type of information that can be extracted from the maps is the speed at which mutations in VOCs have appeared and spread to additional jurisdictions, which can be readily observed without too much effort. Our Reviewers are correct that variations in close proximity are difficult to see, which is why we provide views for the spike-encoding gene. Still, it would be difficult to differentiate between positions 614 and 615, which is why we provide the SVG-generating script, so that interested parties can generate custom views should they choose to (ideally, a more flexible website could help; see response below).&#x00a0;</bold>
                </p>
                <p> </p>
                <p> 
                    <italic>The analysis is presented using the &#x201c;ground-zero&#x201d; strain as a reference. But is this still timely? Numerous variants have now frequencies that go far beyond that of the original nucleotide at a certain position, again, for example the D614G variant. This would allow to &#x2018;purge&#x2019; the signal of very successful variants, helping to direct the focus on emerging variants.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Our Reviewer is correct that the comparison is relative. When we started this project in December 2020, it made sense to use the "ground zero" strain genome. We could make the case for selecting another set of references to compare against, but this may lead to disagreements in scientific circles over which base genome sequence to use. Additional maps may be produced in the future to show evolution within each VOC, which may be an acceptable proposition.</bold>
                </p>
                <p> </p>
                <p> When it comes to the website itself, we see some room for improvement: 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>First and foremost, we think the plots are overcrowded with information. Although it is nice to see a global overview of the data across the entire genome, 365 days, and 6 continents, it is impossible (at least for me) to explore this information other than randomly clicking individual data points, as we have outlined above. we think, this approach would benefit from providing the information in more digestible data fractions. Thus far, the user can choose to focus on the spike, but not on the other proteins. It would be helpful, just as a suggestion, to focus also on variants with a certain prevalence. But we are sure that the authors will have way better ideas than our proposals here, once they specify how a user should work with the plots and the data. Looking at&#x00a0;
                                    <ext-link ext-link-type="uri" xlink:href="https://nextstrain.org/">https://nextstrain.org</ext-link>, which also provides a very nice overview of SARS-CoV-2 variation, may give some hints.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>It would be very convenient, if the interactive plots would be designed such that the user can toggle the information for display, instead of having to go back to the main menu and select a different display mode.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>Trend lines that show the prevalence of a certain variant in a certain region over time would help a lot and should be easy to implement.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>The orientation of where in a genome a certain variant exists is very hard. Although the vertical bars at 12 h in the circular plot should indicate in what ORF a variant is located, this is really hard to track across the full plot. In particular, because the bar-ORF assignment is not visible.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>Animation of daily variant emergence is again a nice feature. However, it is a gif and not interactive. The time lapse does not allow the user to pause, fast forward, or skip to a particular time. Moreover, x-axis labels overlap in particular for the spike. This makes the plot nice to look at, but the information that can be retrieved is only limited.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>Graph of weekly spike protein variant emergence is not interactive and difficult to read, as the lines overlap with each other and some have similar colors. Some functionalities could be implemented such as being able to toggle strains from the right menu, selecting a time range and continent/country, and being able to hover over to display the information. 2020 and 2021 plots have layout inconsistencies and could be merged into a single graph.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>The variant emergence graph heavily competes with the information in&#x00a0;
                                    <ext-link ext-link-type="uri" xlink:href="https://nextstrain.org/">https://nextstrain.org</ext-link>, which claims to be updated daily.</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>We thank our Reviewers for spending the time to navigate the website, which originally wasn't part of the project (it was built as a means to share the maps). We agree that a more modern and flexible web design would help with the customization and eventual uptake of these maps. Some of the plots were added to the website for convenience, to show users what can be done with the extensive mutation data we are compiling for this project (and available for 
                        <ext-link ext-link-type="uri" xlink:href="https://www.bcgsc.ca/downloads/btl/SARS-CoV-2/mutations/">download here</ext-link>).</bold>
                </p>
                <p> </p>
                <p> 
                    <italic>In the outermost ring, we detected a variant that is assigned to na/na. What is this supposed to mean?</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>These variants fall in untranslated regions (UTRs). Thank you for the feedback. In our next release of the maps, we will replace NA with UTR to indicate that the nucleotide variant, compared to the reference, is found outside coding regions. The last position indicates the possible effect in the protein space, which is not applicable in this case.</bold>
                </p>
            </body>
        </sub-article>
    </sub-article>
</article>
