<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.165281.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>DataMap: A Browser-based App for Visualizing High-Dimensional Data</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Ge</surname>
                        <given-names>Xijin</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7406-3782</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Mathematics and Statistics, South Dakota State University, Brookings, South Dakota, 57007, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:Xijin.Ge@sdstate.edu">Xijin.Ge@sdstate.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>10</day>
                <month>11</month>
                <year>2025</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2025</year>
            </pub-date>
            <volume>14</volume>
            <elocation-id>1234</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>27</day>
                    <month>10</month>
                    <year>2025</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Ge X</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/14-1234/pdf"/>
            <abstract>
                <sec>
                    <title>Background</title>
                    <p>Visualization and analysis of high-dimensional data are essential in biomedical research. There is a need for secure, scalable, and reproducible tools to facilitate data exploration and interpretation.</p>
                </sec>
                <sec>
                    <title>Results</title>
                    <p>We introduce DataMap, a browser-based application designed for the visualization of high-dimensional data using heatmaps and dimensionality reduction plots. DataMap operates directly in the web browser, ensuring data privacy without the requirement of installations or server setups. The application features an intuitive user interface for data transformation, annotation, and generation of reproducible R code.</p>
                </sec>
                <sec>
                    <title>Conclusions</title>
                    <p>Freely accessible as a GitHub page (<uri xlink:href="https://gexijin.github.io/datamap">https://gexijin.github.io/datamap</uri>), DataMap is a secure, user-friendly, and reproducible solution for visualizing high-dimensional data.</p>
                </sec>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Data visualization</kwd>
                <kwd>Heatmap</kwd>
                <kwd>PCA</kwd>
                <kwd>t-SNE</kwd>
                <kwd>Reproducibility</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="https://doi.org/10.13039/100000057">
                    <funding-source>National Institute of General Medical Sciences</funding-source>
                    <award-id>P20GM135008</award-id>
                    <award-id>R43GM153076</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="https://doi.org/10.13039/100000051">
                    <funding-source>National Human Genome Research Institute</funding-source>
                    <award-id>R01HG010805</award-id>
                    <award-id>R01HG013534</award-id>
                </award-group>
                <funding-statement>Supported by NIH grants (P20GM135008, R01HG010805, R01HG013534, and R43GM153076).</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec4" sec-type="intro">
            <title>1. Introduction</title>
            <p>High-dimensional datasets, such as expression matrices from RNA-seq or proteomics experiments, have become commonplace in biomedical research. Heatmaps are an effective visualization method, efficiently representing thousands of data points in a matrix through variations in color. Several heatmap-centric visualization tools, including Clustergrammer
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup>, Phantasus
                <sup>
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup>, and Morpheus
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup>, have been developed to facilitate the analysis of these expansive datasets. Phantasus and Morpheus operate entirely within the user&#x2019;s web browser, while Clustergrammer requires server-side processing.</p>
            <p>Our goal was to develop a secure, browser-based application capable of generating high-quality and reproducible visualizations. To this end, we developed DataMap, an R/Shiny application deployed via Shinylive. Shinylive relies on WebR, a version of R compiled to WebAssembly, so the application runs directly within the web browser. Because DataMap is hosted as static files on GitHub, this serverless design keeps sensitive data on the user&#x2019;s device and eliminates the need for server-side computation.</p>
            <p>DataMap supports multiple visualization methods, including hierarchical clustering with heatmaps, principal component analysis (PCA), and t-distributed stochastic neighbor embedding (t-SNE)
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup>. These methods enable researchers to uncover biologically significant patterns, clusters, and relationships within complex datasets. To facilitate usability, the application automatically recommends optimal file-parsing methods and data transformation settings by analyzing input files and assessing data distributions. Although optimized for omics datasets, DataMap remains broadly applicable to general high-dimensional data matrices across various research domains.</p>
        </sec>
        <sec id="sec5">
            <title>2. Implementation</title>
            <p>DataMap is implemented as an R/Shiny application compiled into WebAssembly via Shinylive, enabling entirely client-side execution within web browsers. The application is hosted on GitHub Pages as static files, automatically updated through GitHub Actions from the source code repository, ensuring continuous integration and continuous deployment (CI/CD).</p>
            <p>The app consists of several functional modules:
                <list list-type="order">
                    <list-item>
                        <label>1.</label>
                        <p>File Upload Module: Supports multiple file formats, including Excel, CSV, TSV, TXT, and other plain text files, with automatic delimiter detection for accurate parsing.</p>
                    </list-item>
                    <list-item>
                        <label>2.</label>
                        <p>Data Transformation Module: Offers essential preprocessing features, such as log transformation, normalization, missing value handling, outlier capping, and feature filtering.</p>
                    </list-item>
                    <list-item>
                        <label>3.</label>
                        <p>Visualization Modules: Produces high-quality, publication-ready visualizations, including hierarchical clustering heatmaps generated using the pheatmap package
                            <sup>
                                <xref ref-type="bibr" rid="ref5">5</xref>
                            </sup>, as well as PCA and t-SNE plots. The heatmap visualization supports dendrogram cutting, enabling clearer identification of distinct clusters.</p>
                    </list-item>
                    <list-item>
                        <label>4.</label>
                        <p>Code Generation Module: Automatically generates reproducible R code reflecting all analytical steps performed by the user, facilitating transparency and reproducibility.</p>
                    </list-item>
                </list>
            </p>
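            <p>As a minimal sketch of what the clustering and dendrogram-cutting steps in the visualization modules amount to, the base R code below clusters the rows of a simulated matrix, cuts the tree into a fixed number of groups, and computes PCA scores for the samples. DataMap itself draws heatmaps with pheatmap; the simulated matrix, the Euclidean distance, the complete linkage, and the use of hclust(), cutree(), and prcomp() are assumptions made for illustration and are not taken from the DataMap source.</p>
            <preformat preformat-type="code"><![CDATA[
```r
# Simulated matrix (an assumption): 30 features x 6 samples, two feature groups
set.seed(1)
mat <- rbind(
  matrix(rnorm(15 * 6, mean = 0), nrow = 15),
  matrix(rnorm(15 * 6, mean = 5), nrow = 15)
)
rownames(mat) <- paste0("gene", seq_len(nrow(mat)))
colnames(mat) <- paste0("sample", seq_len(ncol(mat)))

# Hierarchical clustering of rows (Euclidean distance, complete linkage)
row_tree <- hclust(dist(mat), method = "complete")

# Cut the dendrogram into k groups; returns a named integer vector
k <- 2
row_clusters <- cutree(row_tree, k = k)
print(table(row_clusters))

# PCA of the samples; prcomp() expects observations in rows, hence t(mat)
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)
pc_scores <- pca$x[, 1:2]
print(pc_scores)
```
]]></preformat>
            <p>The pheatmap package exposes the same idea through its cutree_rows and cutree_cols arguments, which split the drawn heatmap at the corresponding dendrogram cut.</p>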
            <p>This modular and serverless design ensures data security by processing exclusively on the user&#x2019;s device and facilitates easy maintenance and ongoing enhancements through automated deployment.</p>
        </sec>
        <sec id="sec6">
            <title>3. Operation</title>
            <p>To use DataMap, open the static GitHub page at 
                <ext-link ext-link-type="uri" xlink:href="https://gexijin.github.io/datamap">https://gexijin.github.io/datamap</ext-link> in a modern web browser such as Chrome, Edge, Safari, or Firefox. Alternatively, users can install and run DataMap locally as an R package using the following commands in R:</p>
            <disp-quote>
                <p>remotes::install_github("gexijin/datamap")</p>
                <p>datamap::run_app()</p>
            </disp-quote>
            <p>This flexibility ensures seamless operation both online and offline, accommodating diverse user needs.</p>
        </sec>
        <sec id="sec7" sec-type="results">
            <title>4. Results</title>
            <sec id="sec8">
                <title>4.1 Features and functionality</title>
                <p>

                    <bold>Secure local processing</bold>: DataMap securely processes all data directly within the user&#x2019;s web browser, safeguarding data privacy and removing dependency on external servers. This approach also ensures scalability without being limited by server resources.</p>
                <p>

                    <bold>Smart data import</bold>: It automatically detects file formats, delimiters, and annotations, streamlining the data upload process. The app also examines the data to identify the presence of row and column names. Row annotations can be uploaded separately or included in the data matrix. Column annotations, such as experimental design factors in omics datasets, must be uploaded separately using matching column names.</p>
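                <p>The delimiter detection described above can be sketched as follows. This is only one plausible heuristic, not DataMap&#x2019;s actual logic: guess_delimiter() is a hypothetical helper that counts each candidate delimiter in the first few lines and accepts a candidate only if it occurs the same non-zero number of times on every line. The candidate set and the number of lines inspected are likewise assumptions.</p>
                <preformat preformat-type="code"><![CDATA[
```r
# Hypothetical helper: guess the delimiter of a plain-text table by counting
# candidate characters in the first few non-empty lines.
guess_delimiter <- function(path, candidates = c(",", "\t", ";", " "),
                            n_lines = 10) {
  lines <- readLines(path, n = n_lines, warn = FALSE)
  lines <- lines[nzchar(lines)]
  score <- vapply(candidates, function(d) {
    counts <- lengths(regmatches(lines, gregexpr(d, lines, fixed = TRUE)))
    # Accept only a consistent, non-zero count across all inspected lines
    if (length(unique(counts)) == 1 && counts[1] > 0) counts[1] else 0L
  }, integer(1))
  if (all(score == 0)) return(NA_character_)
  candidates[which.max(score)]
}

# Usage: write a small tab-separated file, detect its delimiter, then parse it
tmp <- tempfile(fileext = ".txt")
writeLines(c("gene\ts1\ts2", "A\t1.5\t2.0", "B\t3.1\t4.2"), tmp)
delim <- guess_delimiter(tmp)
dat <- read.table(tmp, sep = delim, header = TRUE, row.names = 1)
print(dat)
```
]]></preformat>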
                <p>

                    <bold>Comprehensive data transformations</bold>: The data transformation workflow employs statistical heuristics to recommend appropriate settings for effective visualization. Missing data can remain unchanged or be imputed using row-wise or column-wise mean or median values. When high skewness (&gt;1) is detected and no negative values are present, the app recommends a log transformation, addressing common challenges associated with visualizing biological datasets. Matrix orientation is inferred by comparing row and column variability using the Median Absolute Deviation, followed by recommendations for centering or scaling. The mapping of data to colors in heatmaps is usually determined by the minimum and maximum values in the data matrix, which makes the mapping susceptible to outliers. To optimize heatmap color mapping, values more than three standard deviations from the mean are therefore capped by default, minimizing their influence. Additionally, users can filter out less-variable rows. These built-in strategies help users, including non-statisticians, produce robust visualizations with little effort.</p>
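                <p>A minimal base R sketch of three of these heuristics is given below: row-wise median imputation, the skewness-based log recommendation, and capping at three standard deviations. The function names are hypothetical, and the moment-based skewness formula, the use of the global mean and standard deviation for capping, and the pseudocount of 1 in the log transform are assumptions. DataMap&#x2019;s own implementation may differ in each of these details.</p>
                <preformat preformat-type="code"><![CDATA[
```r
# Moment-based sample skewness (one common definition; an assumption here)
skewness <- function(x) {
  x <- x[is.finite(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Recommend a log transform when values are right-skewed and non-negative
recommend_log <- function(mat, threshold = 1) {
  vals <- as.numeric(mat)
  vals <- vals[is.finite(vals)]
  isTRUE(skewness(vals) > threshold) && min(vals) >= 0
}

# Replace missing values with the median of the same row
impute_row_median <- function(mat) {
  for (i in seq_len(nrow(mat))) {
    miss <- is.na(mat[i, ])
    if (any(miss)) mat[i, miss] <- median(mat[i, ], na.rm = TRUE)
  }
  mat
}

# Cap values lying more than n_sd standard deviations from the global mean
cap_outliers <- function(mat, n_sd = 3) {
  mu <- mean(mat, na.rm = TRUE)
  s <- sd(as.numeric(mat), na.rm = TRUE)
  pmin(pmax(mat, mu - n_sd * s), mu + n_sd * s)
}

# Usage on simulated right-skewed, non-negative data with one missing value
set.seed(42)
counts <- matrix(rexp(200, rate = 0.01), nrow = 20)
counts[3, 4] <- NA
counts <- impute_row_median(counts)
if (recommend_log(counts)) counts <- log2(counts + 1)  # pseudocount assumed
capped <- cap_outliers(counts)
print(range(capped))
```
]]></preformat>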
                <p>

                    <bold>Publication-quality visualizations</bold>: Leveraging R visualization libraries, DataMap generates high-quality graphics suitable for publication, downloadable in PDF or PNG formats.</p>
                <p>

                    <bold>Reproducible analysis</bold>: DataMap promotes transparency, consistency, and collaborative analysis by automatically recording all user-selected settings and analytical steps, generating reproducible R code that replicates visualizations on local systems.</p>
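                <p>To illustrate the general idea of code generation, the sketch below turns a list of recorded settings into a standalone R script and then re-runs that script. It does not reproduce DataMap&#x2019;s generator: generate_script() and the setting names (file, log_transform, center_rows, linkage, n_clusters) are hypothetical, and the emitted steps are a simplified stand-in for the real workflow.</p>
                <preformat preformat-type="code"><![CDATA[
```r
# Hypothetical generator: map user-selected settings to lines of R code
generate_script <- function(settings) {
  c(
    sprintf("dat <- as.matrix(read.csv(%s, row.names = 1))",
            deparse(settings$file)),
    # Optional steps are emitted only when the corresponding setting is on
    if (isTRUE(settings$log_transform)) "dat <- log2(dat + 1)",
    if (isTRUE(settings$center_rows)) "dat <- dat - rowMeans(dat)",
    sprintf("tree <- hclust(dist(dat), method = %s)",
            deparse(settings$linkage)),
    sprintf("clusters <- cutree(tree, k = %d)",
            as.integer(settings$n_clusters))
  )
}

# Usage: a small example file plus the settings a user might have chosen
csv <- tempfile(fileext = ".csv")
example <- matrix(1:20, nrow = 5,
                  dimnames = list(paste0("g", 1:5), paste0("s", 1:4)))
write.csv(example, csv)
settings <- list(file = csv, log_transform = TRUE, center_rows = TRUE,
                 linkage = "average", n_clusters = 2)
script <- generate_script(settings)
cat(script, sep = "\n")

# The generated text is ordinary R code, so it can be re-run as-is
env <- new.env()
eval(parse(text = script), envir = env)
print(env$clusters)
```
]]></preformat>
                <p>Quoting values with deparse() rather than pasting them in directly keeps file paths and method names valid R string literals in the generated script.</p>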
            </sec>
            <sec id="sec9">
                <title>4.2 Comparison with existing tools</title>
                <p>DataMap complements existing visualization tools such as Clustergrammer, Phantasus, and Morpheus. Like Phantasus and Morpheus, DataMap employs client-side processing for enhanced data security. It extends their functionality by offering a broader set of preprocessing options, automatic generation of reproducible R scripts, and publication-quality graphics. However, DataMap is less interactive than native web applications built with JavaScript or other programming languages.</p>
                <p>

                    <bold>Use cases</bold>
                </p>
                <p>We used DataMap to visualize genes upregulated by ionizing radiation in mouse B cells with and without a functional p53 gene
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>
                    </sup>. 
                    <xref ref-type="fig" rid="f1">
Figure 1A</xref> shows the top genes specifically induced in B cells with p53. Experimental conditions (genotype and radiation exposure) are annotated by column annotations, while row color bars indicate genes involved in apoptosis. This heatmap clearly reveals genes strongly responsive to radiation only in wild-type B cells, highlighting the functional importance of p53. In 
                    <xref ref-type="fig" rid="f1">
Figure 1B</xref>, we visualize a t-SNE projection of 2,700 single-cell RNA-seq samples, color-coded by clusters corresponding to cell types. These examples highlight DataMap&#x2019;s capability to uncover insights within complex, high-dimensional omics datasets.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>
Figure 1. </label>
                    <caption>
                        <title>Example visualizations.</title>
                        <p>(A) Top 20 genes upregulated by ionizing radiation in mouse B cells with a functional p53 gene
                            <sup>
                                <xref ref-type="bibr" rid="ref6">6</xref>
                            </sup>, and (B) t-SNE projection of 2,700 single-cell RNA-seq profiles of peripheral blood mononuclear cells (PBMCs), available from 10x Genomics. Both datasets are included as built-in examples within the application.</p>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/181894/73562297-9f41-4250-b0e6-3bc1ba55d35e_figure1.gif"/>
                </fig>
            </sec>
        </sec>
        <sec id="sec10" sec-type="discussion">
            <title>5. Discussion</title>
            <p>When analyzing large datasets, browser-based execution is slower than native execution. For example, generating a hierarchical clustering heatmap of a 2700&#x00d7;50 matrix takes approximately 80 seconds when run in the browser, compared to just 5 seconds in native R on the same laptop (Intel 11th Gen Core i7-1185G7, 3.00 GHz). Users are encouraged to install DataMap locally as an R package for extremely large datasets. Another limitation stems from DataMap&#x2019;s reliance on WebR, which supports only a subset of R packages and receives package updates with some delay.</p>
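            <p>Readers who want to gauge this gap on their own hardware could time the clustering step with a sketch like the one below, run once in native R and once in a WebR session. It uses a simulated 2700&#x00d7;50 matrix and times only the distance calculation and hierarchical clustering of the 2700 rows, not heatmap rendering. The simulated data, the choice of rows as the clustered dimension, and the distance and linkage settings are assumptions, so the timings are not expected to match the figures quoted above.</p>
            <preformat preformat-type="code"><![CDATA[
```r
# Simulated matrix with the same shape as the example in the text (assumption)
set.seed(123)
mat <- matrix(rnorm(2700 * 50), nrow = 2700, ncol = 50)

# Time the distance computation and hierarchical clustering of the rows
timing <- system.time({
  tree <- hclust(dist(mat), method = "complete")
})

# Elapsed wall-clock seconds; varies by machine and by native R versus WebR
print(timing[["elapsed"]])
```
]]></preformat>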
            <p>In summary, DataMap combines secure client-side processing with robust data preprocessing and reproducible workflows. It complements existing web-based tools, equipping biomedical researchers with a powerful tool for exploratory analysis. Future development will focus on expanding visualization capabilities and incorporating additional analytical modules.</p>
        </sec>
        <sec id="sec11">
            <title>Software availability</title>
            <p>Software available from: 
                <ext-link ext-link-type="uri" xlink:href="https://gexijin.github.io/datamap">https://gexijin.github.io/datamap</ext-link>
            </p>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/gexijin/datamap">https://github.com/gexijin/datamap</ext-link>
            </p>
            <p>Archived source code at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.15361414">https://doi.org/10.5281/zenodo.15361414</ext-link>
            </p>
            <p>License: GNU General Public License v3.0</p>
        </sec>
    </body>
    <back>
        <sec id="sec14" sec-type="data-availability">
            <title>Data availability</title>
            <p>This work produces a software package. No data is generated or collected.</p>
        </sec>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Fernandez</surname>
                            <given-names>NF</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data.</article-title>
                    <source>

                        <italic toggle="yes">Sci Data.</italic>
</source>
                    <year>2017</year>;<volume>4</volume>:<fpage>170151</fpage>.
                    <pub-id pub-id-type="pmid">28994825</pub-id>
                    <pub-id pub-id-type="doi">10.1038/sdata.2017.151</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5634325</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kleverov</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Phantasus, a web application for visual and interactive gene expression analysis.</article-title>
                    <source>

                        <italic toggle="yes">eLife.</italic>
</source>
                    <year>2024</year>;<volume>13</volume>.
                    <pub-id pub-id-type="pmid">38742735</pub-id>
                    <pub-id pub-id-type="doi">10.7554/eLife.85722</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11147506</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Starruss</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Back</surname>
                            <given-names>W</given-names>
                            <prefix>de</prefix>
                        </name>

                        <name name-style="western">
                            <surname>Brusch</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Morpheus: a user-friendly modeling environment for multiscale and multicellular systems biology.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2014</year>;<volume>30</volume>:<fpage>1331</fpage>&#x2013;<lpage>1332</lpage>.
                    <pub-id pub-id-type="pmid">24443380</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btt772</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3998129</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Maaten</surname>
                            <given-names>L</given-names>
                            <prefix>van der</prefix>
                        </name>

                        <name name-style="western">
                            <surname>Hinton</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Visualizing data using t-SNE.</article-title>
                    <source>

                        <italic toggle="yes">J. Mach. Learn. Res.</italic>
</source>
                    <year>2008</year>;<volume>9</volume>:<fpage>2579</fpage>&#x2013;<lpage>2605</lpage>.</mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kolde</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>:
                    <article-title>pheatmap: Pretty Heatmaps.</article-title>
                    <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/raivokolde/pheatmap">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tonelli</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome-wide analysis of p53 transcriptional programs in B cells upon exposure to genotoxic stress in vivo.</article-title>
                    <source>

                        <italic toggle="yes">Oncotarget.</italic>
</source>
                    <year>2015</year>;<volume>6</volume>:<fpage>24611</fpage>&#x2013;<lpage>24626</lpage>.
                    <pub-id pub-id-type="pmid">26372730</pub-id>
                    <pub-id pub-id-type="doi">10.18632/oncotarget.5232</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4694782</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report440608">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.181894.r440608</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Yu</surname>
                        <given-names>Guangchuang</given-names>
                    </name>
                    <xref ref-type="aff" rid="r440608a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r440608a1">
                    <label>1</label>Southern Medical University, Guangzhou, China</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>12</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Yu G</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport440608" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.165281.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This article leverages the 
                <bold>webR</bold> framework to build a client-side web tool capable of generating heatmaps and dimensional-reduction plots. The approach is reasonable and technically sound. However, I remain uncertain about the necessity of such a tool, given that numerous R packages already exist for creating these types of visualizations. Moreover, with today&#x2019;s AI capabilities, users can easily obtain high-quality, ready-to-run code for producing these figures, which further reduces the need for a dedicated tool of this kind.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Bioinformatics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
</article>
