<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.54864.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>The need to reassess single-cell RNA sequencing datasets: more is not always better</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Ascensi&#x00f3;n</surname>
                        <given-names>Alex M.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-0013-3052</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Ara&#x00fa;zo-Bravo</surname>
                        <given-names>Marcos J.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                    <xref ref-type="aff" rid="a4">4</xref>
                    <xref ref-type="aff" rid="a5">5</xref>
                    <xref ref-type="aff" rid="a6">6</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Izeta</surname>
                        <given-names>Ander</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-1879-7401</uri>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Computational Biology and Systems Biomedicine Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain</aff>
                <aff id="a2">
                    <label>2</label>Tissue Engineering Group, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain</aff>
                <aff id="a3">
                    <label>3</label>Computational Biomedicine Data Analysis Platform, Biodonostia Health Research Institute, San Sebastian, Gipuzkoa, 20014, Spain</aff>
                <aff id="a4">
                    <label>4</label>IKERBASQUE, Basque Foundation for Science, Bilbao, Spain</aff>
                <aff id="a5">
                    <label>5</label>CIBER of Frailty and Healthy Aging (CIBERfes), Madrid, Spain</aff>
                <aff id="a6">
                    <label>6</label>Computational Biology and Bioinformatics Group, Max Planck Institute for Molecular Biomedicine, M&#x00fc;nster, Germany</aff>
                <aff id="a7">
                    <label>7</label>Department of Biomedical Engineering and Science, Tecnun- University of Navarra School of Engineering, San Sebastian, Gipuzkoa, 20009, Spain</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:mararabra@yahoo.co.uk">mararabra@yahoo.co.uk</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:ander.izeta@biodonostia.org">ander.izeta@biodonostia.org</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>6</day>
                <month>8</month>
                <year>2021</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2021</year>
            </pub-date>
            <volume>10</volume>
            <elocation-id>767</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>21</day>
                    <month>7</month>
                    <year>2021</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Ascensi&#x00f3;n AM et al.</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/10-767/pdf"/>
            <abstract>
                <p>

                    <bold>Background:</bold> The advent of single-cell RNA sequencing (scRNAseq) and additional single-cell omics technologies has provided scientists with unprecedented tools to explore biology at cellular resolution. However, reaching an appropriate number of good-quality reads per cell and reasonable numbers of cells within each of the populations of interest is key to drawing conclusions from otherwise limited analyses. For these reasons, scRNAseq studies are constantly increasing the number of cells analysed and the granularity of the resultant transcriptomics analyses.</p>
                <p> 
                    <bold>Methods:</bold> We aimed to identify previously described fibroblast subpopulations in healthy adult human skin by using the largest dataset published to date (528,253 sequenced cells) and an unsupervised population-matching algorithm.</p>
                <p> 
                    <bold>Results:</bold> Our reanalysis of this landmark resource demonstrates that a substantial proportion of cell transcriptomic signatures may be biased by cellular stress and response to hypoxic conditions.</p>
                <p> 
                    <bold>Conclusions:</bold> We postulate that the &#x201c;more is better&#x201d; approach, currently prevalent in the scientific community, might undermine the extent of the analysis, possibly due to long computational processing times inherent to large datasets.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>single-cell RNA-seq</kwd>
                <kwd>skin</kwd>
                <kwd>fibroblasts</kwd>
                <kwd>reproducibility</kwd>
                <kwd>computational analysis</kwd>
                <kwd>Python</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/501100004587">
                    <funding-source>Instituto de Salud Carlos III</funding-source>
                    <award-id>AC17/00012</award-id>
                    <award-id>PI19/01621</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/501100007601">
                    <funding-source>Horizon 2020</funding-source>
                    <award-id>Eracosysmed;GrantAgreementNo643271</award-id>
                </award-group>
                <award-group id="fund-3" xlink:href="http://dx.doi.org/10.13039/100015866">
                    <funding-source>Hezkuntza, Hizkuntza Politika Eta Kultura Saila, Eusko Jaurlaritza</funding-source>
                    <award-id>PRE_2019_2_0233</award-id>
                </award-group>
                <award-group id="fund-4" xlink:href="http://dx.doi.org/10.13039/501100008530">
                    <funding-source>European Regional Development Fund</funding-source>
                </award-group>
                <funding-statement>Instituto de Salud Carlos III grant AC17/00012, co-funded by the European Union (ERDF/ESF, &#x201c;Investing in your future&#x201d;) (MJAB); ERA-Net program EracoSysMed, JTC-2 2017 (MJAB); Instituto de Salud Carlos III grant PI19/01621, co-funded by the European Union (ERDF/ESF, &#x201c;Investing in your future&#x201d;) (AI); Basque Government PhD fellowship PRE_2019_2_0233 (AMA); Horizon 2020 Eracosysmed, Grant Agreement No 643271.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>The quest for deciphering the underlying biology of numerous phenomena at the single-cell level has exponentially increased the number of published single-cell RNA sequencing (scRNAseq) studies.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup> Additionally, individual studies are gradually increasing in scale, and in most tissues a correlation between the numbers of cells sequenced and the number of identified cell types is found.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup> Unfortunately, many (if not most) of the studies concentrate their efforts on individual dataset analyses and perform relatively little correlative study to meta-analyse previously published scRNAseq datasets. However, the amount of information that could be retrieved from the already existing corpus of literature is enormous.
                <sup>
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup>
            </p>
            <p>Within identified cell clusters (what we normally would define as &#x201c;cell types&#x201d;), the existing cell heterogeneity may be indicative of cell subsets that respond to particular conditions (such as cell cycle phase, cell stress, response to local signals, etc.) or reflect underlying functional/positional differences.
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>-
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup> It is thus of utmost importance that the scientific community interested in a specific tissue or cell type agrees on the existing subsets within particular cell types and their defining molecular profiles, so that a common reference atlas may be used to understand homeostasis and response to varying insults.
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup>
            </p>
            <p>In a re-analysis of 13,823 human adult dermal fibroblasts obtained from four independent scRNAseq studies,
                <sup>
                    <xref ref-type="bibr" rid="ref8">8</xref>-
                    <xref ref-type="bibr" rid="ref11">11</xref>
                </sup> we recently proposed that human skin presents a common set of fibroblast subsets, irrespective of donor area.
                <sup>
                    <xref ref-type="bibr" rid="ref12">12</xref>
                </sup> These subsets can be categorised into three main fibroblast types (type A, B, and C), with a total of 10 minor subpopulations (A1&#x2013;A4, B1&#x2013;B2, C1&#x2013;C4). In a recent landmark paper published in Science, Reynolds 
                <italic toggle="yes">et al</italic>. produced a dataset of 528,253 sequenced cells obtained from healthy adult skin (five female patients undergoing mammoplasty surgery) and fetal samples, as well as inflamed skin from atopic dermatitis and psoriasis patients.
                <sup>
                    <xref ref-type="bibr" rid="ref13">13</xref>
                </sup> In healthy dermal fibroblasts, the authors described three populations: a main cluster termed Fb1, and two minor subpopulations, Fb2 and Fb3. Fb2 was additionally described as enriched in fetal and inflamed skin samples.
                <sup>
                    <xref ref-type="bibr" rid="ref13">13</xref>
                </sup> We aimed to analyse whether the Fb1, Fb2 and Fb3 populations were consistent with the A&#x2013;C fibroblast types and subtypes that we had just described. More specifically, we reasoned that at least the most abundant subpopulations that we had defined, namely A1, A2, B1 and B2, should be clearly detected in a &gt;500k cell dataset, thus further validating our previous scRNAseq study. In contrast, we found that a substantial proportion of the Reynolds 
                <italic toggle="yes">et al</italic>. scRNAseq dataset appears to be biased by differential expression of stress and hypoxia-related genes. Thus, data extracted from this source should be interpreted in the light of this bias. It is possible that other existing large datasets suffer from similar methodological problems, which might be due to insufficient oversight.</p>
            <p>We conclude that the current high bar on the number of cells established by the scientific community might be counterproductive, possibly through undermining the extent of the analysis.</p>
        </sec>
        <sec id="sec2" sec-type="methods">
            <title>Methods</title>
            <sec id="sec3">
                <title>Preprocessing of fibroblast sample data</title>
                <p>Fibroblast sample data originated from five donors, as described by Reynolds 
                    <italic toggle="yes">et al</italic>.,
                    <sup>
                        <xref ref-type="bibr" rid="ref13">13</xref>
                    </sup> and were processed from raw fastq files (E-MTAB-8142). The sample IDs are 4820STDY7388991 [S1], 4820STDY7388999 [S2], 4820STDY7389007 [S3], SKN8104899 [S4], SKN8105197 [S5]. Fastq files were processed using the 
                    <monospace>loompy fromfq</monospace> pipeline described in 
                    <ext-link ext-link-type="uri" xlink:href="https://linnarssonlab.org/loompy/kallisto/index.html">https://linnarssonlab.org/loompy/kallisto/index.html</ext-link>. 
                    <italic toggle="yes">Loompy</italic> (RRID:SCR_016666) and 
                    <italic toggle="yes">kallisto</italic> (RRID:SCR_016582) versions 3.0.6 and 0.46.0, respectively, were used. The genome fasta index and annotations were based on GRCh38 Gencode v31 (RRID:SCR_014966). Additionally, for other annotations and the analysis of other populations, the processed h5ad file (AnnData object) from Reynolds <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref13">13</xref>
                    </sup> was downloaded from the Zenodo repository (ID: 4536165).</p>
                <p>Data from each individual sample (S1&#x2013;S5) were processed identically using the following 
                    <italic toggle="yes">scanpy</italic> (RRID:SCR_018139, v1.7.0rc1)
                    <sup>
                        <xref ref-type="bibr" rid="ref14">14</xref>
                    </sup> procedure. To map the clusters from the original publication, cells from the processed data set were extracted and mapped to the samples. Genes with fewer than 30 counts were rejected. The sample was normalised (
                    <monospace>sc.pp.normalize_per_cell</monospace>) and log-transformed. Then, Principal Component Analysis (PCA) with 30 components was calculated and feature selection was performed with 
                    <italic toggle="yes">triku</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref15">15</xref>
                    </sup> (RRID:SCR_020977, v1.3.1), and a 
                    <italic toggle="yes">k</italic>NN graph with cosine metric was computed. Finally, UMAP (RRID:SCR_018217, v0.4.6)
                    <sup>
                        <xref ref-type="bibr" rid="ref16">16</xref>
                    </sup> and leiden (v0.8.3)
                    <sup>
                        <xref ref-type="bibr" rid="ref17">17</xref>
                    </sup> were applied to detect the fibroblast populations.</p>
                <p>Most of the cells from the preprocessed adata were mapped to the raw dataset. However, additional unmapped cells appeared, some of them related to other cell types (e.g. keratinocytes, immune cells or perivascular cells). To assign unmapped cells to their corresponding cell types, a population matching algorithm was applied (described below). This algorithm requires a dictionary of cell types and markers. The markers used were the following: 
                    <list list-type="simple">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Fibroblast: 
                                <italic toggle="yes">LUM</italic>, 
                                <italic toggle="yes">PDGFRA</italic>, 
                                <italic toggle="yes">COL1A1</italic>, 
                                <italic toggle="yes">SFRP2</italic>, 
                                <italic toggle="yes">CCL19.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Perivascular cell: 
                                <italic toggle="yes">RGS5</italic>, 
                                <italic toggle="yes">MYL9</italic>, 
                                <italic toggle="yes">NDUFA4L2.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Erythrocyte: 
                                <italic toggle="yes">HBB</italic>, 
                                <italic toggle="yes">HBA2</italic>, 
                                <italic toggle="yes">HBA1.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Immune cell: 
                                <italic toggle="yes">TPSB2</italic>, 
                                <italic toggle="yes">TPSAB1</italic>, 
                                <italic toggle="yes">HLA-DRA</italic>, 
                                <italic toggle="yes">FCER1G</italic>, 
                                <italic toggle="yes">CD74.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Melanocyte: 
                                <italic toggle="yes">PMEL</italic>, 
                                <italic toggle="yes">MLANA.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Endothelial vascular cell: 
                                <italic toggle="yes">CLDN5</italic>, 
                                <italic toggle="yes">PECAM1.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Keratinocyte: 
                                <italic toggle="yes">DMKN</italic>, 
                                <italic toggle="yes">KRT1</italic>, 
                                <italic toggle="yes">KRT5.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Mitochondrial content (low quality): 
                                <italic toggle="yes">MTND2P8</italic>, 
                                <italic toggle="yes">MTND4P12</italic>, 
                                <italic toggle="yes">MTCO1P40</italic>, 
                                <italic toggle="yes">ADAM33</italic>, 
                                <italic toggle="yes">RN7SL2</italic>, 
                                <italic toggle="yes">MTRNR2L6.</italic>
                            </p>
                        </list-item>
                    </list>
                </p>
                <p>Once cell types had been assigned, non-fibroblast cells were discarded, and the PCA, triku, 
                    <italic toggle="yes">k</italic>NN, UMAP, leiden cycle was repeated to recalculate the new cell projection.</p>
                <p>The sample S5 was discarded from the analysis due to its lack of 
                    <italic toggle="yes">SFRP2</italic> expression, a well-established fibroblast marker that is expressed in the rest of the samples.
                    <sup>
                        <xref ref-type="bibr" rid="ref12">12</xref>
                    </sup>
                </p>
                <p>Then, we separated the Fb2 population from the Fb1 and Fb3 populations for each dataset and applied the population matching algorithm to annotate them with the labels defined in our previous study.
                    <sup>
                        <xref ref-type="bibr" rid="ref12">12</xref>
                    </sup> The genes used for the population assignment were the following: 
                    <list list-type="simple">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>A1: 
                                <italic toggle="yes">PI16</italic>, 
                                <italic toggle="yes">QPCT</italic>, 
                                <italic toggle="yes">SLPI</italic>, 
                                <italic toggle="yes">CCN5</italic>, 
                                <italic toggle="yes">CPE</italic>, 
                                <italic toggle="yes">CTHRC1</italic>, 
                                <italic toggle="yes">MFAP5</italic>, 
                                <italic toggle="yes">PCOLCE2</italic>, 
                                <italic toggle="yes">SCARA5</italic>, 
                                <italic toggle="yes">TSPAN8</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>A2: 
                                <italic toggle="yes">APCDD1</italic>, 
                                <italic toggle="yes">COL18A1</italic>, 
                                <italic toggle="yes">COMP</italic>, 
                                <italic toggle="yes">NKD2</italic>, 
                                <italic toggle="yes">F13A1</italic>, 
                                <italic toggle="yes">HSPB3</italic>, 
                                <italic toggle="yes">LEPR</italic>, 
                                <italic toggle="yes">TGFBI</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>B1: 
                                <italic toggle="yes">CXCL2</italic>, 
                                <italic toggle="yes">MYC</italic>, 
                                <italic toggle="yes">C7</italic>, 
                                <italic toggle="yes">SPSB1</italic>, 
                                <italic toggle="yes">ITM2A</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>B2: 
                                <italic toggle="yes">SOCS3</italic>, 
                                <italic toggle="yes">CCL19</italic>, 
                                <italic toggle="yes">CD74</italic>, 
                                <italic toggle="yes">RARRES2</italic>, 
                                <italic toggle="yes">CCDC146</italic>, 
                                <italic toggle="yes">IGFBP3</italic>, 
                                <italic toggle="yes">TNFSF13B</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>C: 
                                <italic toggle="yes">CRABP1</italic>, 
                                <italic toggle="yes">PLXDC1</italic>, 
                                <italic toggle="yes">RSPO4</italic>, 
                                <italic toggle="yes">ASPN</italic>, 
                                <italic toggle="yes">F2R</italic>, 
                                <italic toggle="yes">POSTN</italic>, 
                                <italic toggle="yes">TNN</italic>
                            </p>
                        </list-item>
                    </list>
                </p>
                <p>Next, all datasets with Fb1 and Fb3, or Fb2 populations were joined. We applied the previous processing routine and, to correct for batch effects, we used 
                    <italic toggle="yes">bbknn</italic> (v1.4.0)
                    <sup>
                        <xref ref-type="bibr" rid="ref18">18</xref>
                    </sup> with 
                    <monospace>metric=angular</monospace> and 
                    <monospace>neighbors_within_batch=2</monospace> parameters.</p>
                <p>To compare the transcriptomic profiles of the Fb1 and Fb3 populations with that of the Fb2 population, we joined the two datasets and applied the same processing pipeline as before. We first characterised the genes driving the differences by obtaining the differentially expressed genes (DEGs) between the two sets of populations, and running a Gene Ontology enrichment analysis (GOEA) with the top 150 DEGs of each category. The set of ontologies used was 
                    <italic toggle="yes">GO Biological Process 2018</italic> with the module 
                    <italic toggle="yes">gseapy</italic> (v0.10.4).
                    <sup>
                        <xref ref-type="bibr" rid="ref19">19</xref>
                    </sup> Then, to assess whether the differences were due to cellular stress in the Fb2 population, we downloaded the lists of genes mentioned in the Results section (gene lists are available in the GitHub repository below), and genes appearing in more than two lists were selected. Then, the population matching algorithm was run against this list, and clusters with scores lower than 0.55 were assigned as &#x201c;Non-stress&#x201d; clusters.</p>
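                <p>The list-selection step can be illustrated with a minimal Python sketch. It assumes that &#x201c;more than two lists&#x201d; means at least three lists; the function name and the gene identifiers are hypothetical placeholders, and the actual gene lists are those provided in the repository.</p>
                <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def genes_in_multiple_lists(gene_lists, min_lists=3):
    """Return the genes present in at least `min_lists` of the given lists."""
    # Count each gene once per list, even if a list repeats it.
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_lists)


# Placeholder identifiers only; these are not the published gene lists.
example_lists = [
    ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    ["GENE_A", "GENE_B", "GENE_E"],
    ["GENE_A", "GENE_C", "GENE_B", "GENE_F"],
    ["GENE_A", "GENE_G"],
]
selected = genes_in_multiple_lists(example_lists)
print(selected)
```
]]></preformat>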
                <p>To analyse the differences in transcriptomic profiles within Fb1 and Fb3 populations, we obtained the DEGs between the two sets of A2 populations, which were the easiest to separate in clusters. By using that list of DEGs, we applied the population matching algorithm and divided the Fb1 and Fb3 populations into two halves. We then obtained the DEGs between the two halves and ran GOEA with the top 150 DEGs of each category, which revealed a hypoxia pattern in one of the halves. To assess whether the differences were due to hypoxia, we downloaded the lists of hypoxia-related genes, and genes appearing in more than two lists were selected. Since some key genes (some glycolysis genes, or important genes appearing in only one list) were missing, they were manually added to obtain a more robust list. Then, the population matching algorithm was run against this list, as well as the list of stress-related genes, and clusters with scores lower than 0.5 were assigned as &#x201c;Normal&#x201d; clusters.</p>
                <p>To replicate the analysis on the rest of the cell types, we used the processed h5ad file.</p>
            </sec>
            <sec id="sec4">
                <title>Correction of stress and hypoxia cell states</title>
                <p>In order to correct for stress and hypoxia cell states we used the 
                    <monospace>sc.pp.regress_out</monospace> implementation from 
                    <italic toggle="yes">scanpy</italic> on the stress and hypoxia scores. We first created two sub-datasets, one containing stress and normal cells, and another one with hypoxia and normal cells, and then the scores were regressed out. Finally, the common processing pipeline was applied. Additional correction methods can be seen in the notebooks in the Zenodo repository.
                    <sup>
                        <xref ref-type="bibr" rid="ref20">20</xref>
                    </sup>
                </p>
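                <p>The idea of regressing out a score can be sketched with plain NumPy. This is not the scanpy implementation itself: it assumes an ordinary least-squares fit with an intercept for every gene and keeps the residuals. The function name and the simulated data are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def regress_out_score(expr, score):
    """Remove the linear effect of `score` from each gene (column) of `expr`."""
    # Design matrix with an intercept column and the per-cell score.
    design = np.column_stack([np.ones_like(score, dtype=float), score])
    # Least-squares fit for all genes at once; coef has shape (2, n_genes).
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    # The residuals are the part of the expression not explained by the score.
    return expr - design @ coef


# Simulated data: 50 cells, 3 genes, two of which depend on the score.
rng = np.random.default_rng(0)
score = rng.random(50)
noise = rng.normal(size=(50, 3))
expr = noise + np.outer(score, [2.0, -1.0, 0.0])
corrected = regress_out_score(expr, score)
```
]]></preformat>
                <p>After the correction, the residual expression of every gene is uncorrelated with the score, which is the property that the subsequent processing pipeline relies on.</p>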
            </sec>
            <sec id="sec5">
                <title>Population matching algorithm</title>
                <p>The aim of this algorithm is to assign a set of clusters to a set of labels, where each label contains a list of representative markers. For each label we extracted the matrix of counts of the genes belonging to the label. Then, we created a new matrix, where we assigned to each cell and gene the sum of the counts of the gene within its 
                    <italic toggle="yes">k</italic>NN, divided by the number of neighbours (that is, the mean over its <italic toggle="yes">k</italic> nearest neighbours). This step reduced the noise in the expression values, accentuating the locally concentrated expression of a gene and dampening the expression of sparsely expressed genes.</p>
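                <p>The neighbourhood averaging described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the function name <monospace>knn_smooth</monospace>, the input layout, and the toy neighbour lists are hypothetical, and whether a cell counts as its own neighbour is a choice made only for this example.</p>
                <preformat preformat-type="code">
```python
def knn_smooth(counts, neighbours):
    """Average each gene over every cell's k nearest neighbours.

    counts: list of per-cell lists of gene counts (cells x genes).
    neighbours: list of per-cell lists of neighbour indices (cells x k).
    """
    n_genes = len(counts[0])
    smoothed = []
    for cell_neighbours in neighbours:
        k = len(cell_neighbours)
        # Sum the counts of each gene over the neighbours, then divide by k.
        row = [
            sum(counts[j][g] for j in cell_neighbours) / k
            for g in range(n_genes)
        ]
        smoothed.append(row)
    return smoothed


# Toy example: 4 cells, 2 marker genes, k = 2 (neighbour lists are made up).
counts = [[4, 0], [2, 0], [0, 1], [0, 0]]
neighbours = [[0, 1], [1, 0], [2, 3], [3, 2]]
smoothed = knn_smooth(counts, neighbours)
print(smoothed)
```
                </preformat>
                <p>In the toy data, the first gene is expressed only in the first two cells, which are mutual neighbours, so its signal is kept, whereas the single count of the second gene is halved by averaging with a non-expressing neighbour.</p>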
                <p>For each gene, the expression values were replaced by their rank across cells, and the ranks were divided by the largest rank so that the maximum value was 1. Therefore, the cell with the highest expression had a value of 1 for that gene, while the cell with the lowest expression had a value near 0. After this normalisation was applied to the rest of the genes within the label, the mean of the normalised values across genes was computed, so that each cell had one value for that label.</p>
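                <p>A minimal sketch of the rank normalisation and per-label averaging is given below. It is illustrative only: the function names <monospace>rank_normalise</monospace> and <monospace>label_score</monospace> and the toy matrix are hypothetical, and ties are broken by cell order, which is an assumption of this example.</p>
                <preformat preformat-type="code">
```python
def rank_normalise(values):
    """Replace values by their rank (1 = lowest) divided by the largest rank."""
    # Sorting is stable, so tied values are ranked by cell order (assumption).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, cell in enumerate(order, start=1):
        ranks[cell] = rank
    return [r / len(values) for r in ranks]


def label_score(smoothed, gene_indices):
    """Mean rank-normalised value per cell over the genes of one label."""
    per_gene = [
        rank_normalise([cell[g] for cell in smoothed]) for g in gene_indices
    ]
    n_cells = len(smoothed)
    return [
        sum(gene[c] for gene in per_gene) / len(per_gene)
        for c in range(n_cells)
    ]


# Toy smoothed matrix: 4 cells x 2 genes belonging to the same label.
smoothed = [[3.0, 0.2], [1.0, 0.4], [0.5, 0.1], [2.0, 0.3]]
scores = label_score(smoothed, gene_indices=[0, 1])
print(scores)
```
                </preformat>
                <p>Because only ranks are used, a gene with a very large dynamic range contributes no more to the label score than a weakly expressed marker.</p>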
                <p>After the previous steps were computed for the rest of the labels, a new matrix with as many rows as clusters and as many columns as labels was computed. For each label and each cluster, a percentile of the normalised values of the cells in that cluster was computed (the 70th percentile by default). This helped reduce noise in the normalised values and yielded a single score per cluster and label.</p>
                <p>The algorithm also allowed the selection of intermediate states, that is, cell labels with a high similarity. By default, the label with the highest score per cluster was chosen. With the intermediate state option, labels whose score was similar to the highest score were also included. The maximum allowed difference was set as a threshold (0.05 by default), and labels whose score differed from the highest score by more than the threshold were not merged.</p>
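                <p>The cluster scoring and label assignment can be sketched as follows. This is a hedged pure-Python illustration rather than the published implementation: the names <monospace>percentile</monospace> and <monospace>assign_labels</monospace>, the linear-interpolation percentile, the slash-joined naming of intermediate states, and the toy scores are all assumptions; only the defaults (percentile 70, threshold 0.05) come from the text.</p>
                <preformat preformat-type="code">
```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a non-empty list."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def assign_labels(label_scores, clusters, q=70, threshold=0.05,
                  intermediate_states=True):
    """Map each cluster to its best label(s).

    label_scores: dict label -> list of per-cell scores.
    clusters: list with the cluster id of every cell.
    """
    assignment = {}
    for cluster in sorted(set(clusters)):
        members = [i for i, c in enumerate(clusters) if c == cluster]
        # One score per label: a percentile of the cells in this cluster.
        cluster_scores = {
            label: percentile([scores[i] for i in members], q)
            for label, scores in label_scores.items()
        }
        best = max(cluster_scores.values())
        if intermediate_states:
            # Keep every label within `threshold` of the best score.
            chosen = [
                label for label, s in cluster_scores.items()
                if threshold >= best - s
            ]
        else:
            chosen = [max(cluster_scores, key=cluster_scores.get)]
        assignment[cluster] = "/".join(sorted(chosen))
    return assignment


# Toy per-cell label scores for 6 cells in 2 clusters (values are made up).
label_scores = {
    "A1": [0.90, 0.80, 0.85, 0.20, 0.30, 0.25],
    "A2": [0.88, 0.79, 0.83, 0.70, 0.80, 0.75],
}
clusters = [0, 0, 0, 1, 1, 1]
print(assign_labels(label_scores, clusters))
print(assign_labels(label_scores, clusters, intermediate_states=False))
```
                </preformat>
                <p>In this toy example the two labels score almost equally in the first cluster, so the intermediate state option reports both, whereas the second cluster is assigned a single label in either mode.</p>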
            </sec>
        </sec>
        <sec id="sec6" sec-type="results">
            <title>Results</title>
            <sec id="sec7">
                <title>Reassessment of the main cell populations in a large skin dataset reveals the presence of clusters with stress- and hypoxia-related gene signatures</title>
                <p>By using an unsupervised population-matching algorithm (details in processed notebooks available online
                    <sup>
                        <xref ref-type="bibr" rid="ref20">20</xref>
                    </sup>) we observed that in each of the healthy donors analysed by Reynolds 
                    <italic toggle="yes">et al</italic>.,
                    <sup>
                        <xref ref-type="bibr" rid="ref13">13</xref>
                    </sup> at least two independent fibroblast clusters expressed signature markers of the A1, A2, B1 and B2 populations. One set of cells corresponded to the Fb2 population, and the second set corresponded to the Fb1 and Fb3 populations. A joint analysis of all donors after batch effect correction showed that the cluster duplication observed in each individual donor was reproduced in the combined data. We therefore hypothesised that some global effect was affecting the cells, i.e. that Fb2 might be a copy of the Fb1+Fb3 cells, albeit affected by some alteration. Differential gene expression analysis between Fb2 and Fb1+Fb3 revealed an enrichment in ontology terms associated with cell stress (e.g. unfolded protein response, regulation of apoptotic process, mRNA catabolic process). We then designed a signature gene list composed of 50 DEGs commonly associated with stress in very different scRNAseq settings (e.g. 
                    <italic toggle="yes">ATF3</italic>, 
                    <italic toggle="yes">BTG2</italic>, 
                    <italic toggle="yes">FOS</italic>, 
                    <italic toggle="yes">FOSB</italic>, 
                    <italic toggle="yes">GADD45B</italic>, 
                    <italic toggle="yes">HSPA1A/B</italic>, 
                    <italic toggle="yes">IER2/3</italic>, 
                    <italic toggle="yes">JUN</italic>, 
                    <italic toggle="yes">JUNB</italic>, 
                    <italic toggle="yes">NFKBIA</italic>, 
                    <italic toggle="yes">NR4A1/2</italic>, 
                    <italic toggle="yes">PPP1R15A</italic>, 
                    <italic toggle="yes">RHOB</italic>).
                    <sup>
                        <xref ref-type="bibr" rid="ref21">21</xref>-
                        <xref ref-type="bibr" rid="ref25">25</xref>
                    </sup> With this signature, we found that the Fb2 population over-expressed 
                    <italic toggle="yes">BTG2</italic>, 
                    <italic toggle="yes">EGR1</italic>, 
                    <italic toggle="yes">FOSB</italic>, 
                    <italic toggle="yes">IER2</italic>, 
                    <italic toggle="yes">SOCS3</italic>, and 
                    <italic toggle="yes">ZFP36</italic>, among others, indicating that these cells clustered together mainly due to cellular stress.</p>
                <p>In a further analysis of the Fb1 and Fb3 cells, we observed that the A1, A2, B1 and B2 populations appeared to be duplicated once more. A DEG analysis between each pair of duplicated populations revealed, in one of the split populations, genes related to glycolysis (
                    <italic toggle="yes">ALDOC</italic>, 
                    <italic toggle="yes">ENO2</italic>, 
                    <italic toggle="yes">GAPDH</italic>, 
                    <italic toggle="yes">PGK1</italic>, 
                    <italic toggle="yes">PDK1</italic>, 
                    <italic toggle="yes">PFKFB4</italic>, 
                    <italic toggle="yes">PYGL</italic>), cell integrity, hypoxia and apoptosis (
                    <italic toggle="yes">BNIP3</italic>, 
                    <italic toggle="yes">BNIP3L</italic>, 
                    <italic toggle="yes">ANGPTL4</italic>, 
                    <italic toggle="yes">LOX</italic>, 
                    <italic toggle="yes">HILPDA</italic>), whereas the second split population over-expressed subunits of the mitochondrial ATPase and complex I, indicating an active oxidative metabolism. It is well known that cells under hypoxic conditions switch from aerobic to anaerobic metabolism to maintain energy homeostasis.
                    <sup>
                        <xref ref-type="bibr" rid="ref26">26</xref>-
                        <xref ref-type="bibr" rid="ref28">28</xref>
                    </sup> We therefore generated a curated list of hypoxia-related genes, and were able to separate the non-hypoxic from the hypoxic group with the population-matching algorithm. Once stressed or hypoxic cells were removed on the basis of a set threshold of expression of signature genes, we mapped the main types of fibroblasts in what we termed the normal cell subset of Reynolds 
                    <italic toggle="yes">et al</italic>. (
                    <xref ref-type="fig" rid="f1">Figure 1A</xref>). Fibroblast A1, A2 and B2 populations were independently mapped, and we also found clusters that appeared to be mixtures of previously defined populations, e.g. B1/B2, A1/A2, or A2/B2. No type C fibroblasts were detected.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>A re-analysis of the Reynolds et al. dataset in search of dermal fibroblast subpopulations reveals the presence of substantial proportions of stressed and hypoxic cells.</title>
                        <p>(A) UMAP projection of normal fibroblasts (after removal of hypoxic and stressed cell subsets) reveals conservation of some, but not all, cell types previously described in independent datasets.
                            <sup>
                                <xref ref-type="bibr" rid="ref12">12</xref>
                            </sup> (B) UMAP projections of fibroblast, vascular endothelium, pericyte, keratinocyte, lymphoid and APC cell populations from healthy donors, labelled to highlight hypoxic and stressed cell subpopulations as characterised by overexpression of defined gene signatures.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/58386/67b96601-6cd3-4bb2-8cde-c8e6439ed76d_figure1.gif"/>
                </fig>
                <p>To understand whether the stress and hypoxic signatures were only present in fibroblast subsets or could also be traced to other populations within the Reynolds 
                    <italic toggle="yes">et al</italic>. dataset, we mapped the stress and hypoxia gene signatures to perivascular cell, keratinocyte, vascular endothelial cell, lymphoid cell, and antigen-presenting cell (APC) clusters. In our reanalysis of healthy donors, fibroblasts, perivascular cells, keratinocytes, and vascular endothelial cells showed clear hypoxia- and stress-related clusters (
                    <xref ref-type="fig" rid="f1">Figure 1B</xref>). For instance, the VE3 population, described by Reynolds 
                    <italic toggle="yes">et al</italic>. as increased in patients suffering from inflammatory conditions, presented a clear stress-related transcriptomic profile. On the other hand, most of the VE2 population over-expressed hypoxia-related genes. Among lymphoid cells, we observed a sub-cluster of stressed Tc/Th cells but no clear hypoxic profiles. Among APCs, an inflammatory macrophage cluster showed a hypoxia profile, and the M2 and DC2 clusters showed stress-related profiles. Some of these results may be expected under physiological conditions for immune cells, but others could be attributed to sample handling.</p>
                <p>Finally, we tested whether the aforementioned stress- and hypoxia-related signatures were present in the previously published scRNAseq datasets of human skin.
                    <sup>
                        <xref ref-type="bibr" rid="ref8">8</xref>-
                        <xref ref-type="bibr" rid="ref11">11</xref>
                    </sup> The levels of expression of these genes were clearly higher in the Reynolds 
                    <italic toggle="yes">et al</italic>. dataset as compared to other available resources (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>).</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Stress and hypoxia-related signatures in published human dermal fibroblast datasets.</title>
                        <p>(A) UMAP projection of normal fibroblasts (after removal of hypoxic and stressed cell subsets) reveals conservation of some, but not all, cell types previously described in independent datasets (1). (B) UMAP projections of human dermal fibroblast subsets as defined in
                            <sup>
                                <xref ref-type="bibr" rid="ref12">12</xref>
                            </sup> are shown here for five published datasets,
                            <sup>
                                <xref ref-type="bibr" rid="ref8">8</xref>-
                                <xref ref-type="bibr" rid="ref11">11</xref>,
                                <xref ref-type="bibr" rid="ref13">13</xref>
                            </sup> and depicted by the average levels of expression of stress and hypoxia gene signatures.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/58386/67b96601-6cd3-4bb2-8cde-c8e6439ed76d_figure2.gif"/>
                </fig>
            </sec>
            <sec id="sec8">
                <title>Correction of stress and hypoxia signatures reveals that stressed cells have a non-recoverable gene signature</title>
                <p>Since the stress- and hypoxia-related expression profiles are apparent, we were interested in studying the &#x201c;reversibility&#x201d; of the transcriptomic signatures, creating a normalised dataset in which hypoxic and stressed cells could merge with the normal cells, and classifying the whole dataset into the original cell types described previously.
                    <sup>
                        <xref ref-type="bibr" rid="ref12">12</xref>
                    </sup> To this end, we applied two approaches with similar results. On the one hand, we considered cell states as batches, and applied batch effect correction with 
                    <italic toggle="yes">bbknn</italic> and 
                    <italic toggle="yes">harmony.</italic> On the other hand, we applied regression on the stress and hypoxia scores shown in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>, using 
                    <italic toggle="yes">Seurat</italic>&#x2019;s linear regression function as implemented in 
                    <italic toggle="yes">scanpy.</italic> Because both approaches gave similar results, we show only those of the latter in 
                    <xref ref-type="fig" rid="f3">Figure 3</xref>.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Dataset merging of stress and hypoxia populations shows mixed degrees of integration with the &#x201c;normal&#x201d; dataset.</title>
                        <p>(A) UMAP projection of merged &#x201c;stress&#x201d; and &#x201c;normal&#x201d; cells. There is a low degree of integration between the two cell states. (B) UMAP projection of merged &#x201c;hypoxia&#x201d; and &#x201c;normal&#x201d; cells. There is a high degree of integration between the two cell states. (C) Unsupervised assignment of fibroblast types from (B) reveals, similar to results from 
                            <xref ref-type="fig" rid="f1">Figure 1A</xref>, major fibroblast types.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/58386/67b96601-6cd3-4bb2-8cde-c8e6439ed76d_figure3.gif"/>
                </fig>
                <p>To further study whether the stress and hypoxia transcriptomic profiles are &#x201c;recoverable&#x201d;, we generated two datasets, one combining stressed and normal cells and another combining hypoxic and normal cells. When we applied the correction to the stress + normal dataset, we observed that there was no integration between the two states (
                    <xref ref-type="fig" rid="f3">Figure 3A</xref>). On the other hand, there was a good integration between the hypoxia and normal cell states (
                    <xref ref-type="fig" rid="f3">Figure 3B</xref>), and the main fibroblast populations could be correctly mapped (
                    <xref ref-type="fig" rid="f3">Figure 3C</xref>). From these results we infer that the transcriptome from stressed cells is much more altered than the one from hypoxic cells, to the extent that stressed cells are in a computationally non-reversible state.</p>
            </sec>
        </sec>
        <sec id="sec9" sec-type="discussion">
            <title>Discussion</title>
            <p>The results of efforts to compare, correlate, and compile the information present in available scRNAseq datasets may be short-lived, since they can be superseded by new resources that appear almost daily. However, it is to be expected that, at some tipping point, robust cell types and subtypes will be fully defined for each tissue and organ. From then on, new scRNAseq datasets will only add information on the transcriptomically defined cell states of each of the robustly defined cell subpopulations, in response to specific perturbations such as injury or disease.</p>
            <p>Here, we aimed to use a large scRNAseq dataset including over half a million cells to validate results we had obtained with a few thousand cells. Instead, we found that clustering of this large dataset appears to be biased by differential expression of stress- and hypoxia-related genes. In our opinion, the origin of the stress- and hypoxia-related signatures in healthy donor cells might be related to the exhaustive and complex cell isolation protocol chosen by the authors. The top 200 
                <italic toggle="yes">&#x03bc;</italic>m-thick layer of the skin was cut with a dermatome and digested with dispase (1 h at 37&#x00b0;C) to separate the dermal and epidermal layers. Both layers were then digested in collagenase for 12 h at 37&#x00b0;C, and cells were filtered and subjected to FACS sorting before library generation and sequencing.
                <sup>
                    <xref ref-type="bibr" rid="ref13">13</xref>
                </sup> While this strategy ensures high purity of the obtained cell populations, the long processing times (&#x2265;16 h) and the use of heat for tissue dissociation might have significantly affected gene expression patterns in a relevant number of cells in this setting. In this sense, aiming to process large numbers of cells involves longer processing times. Long processing times (even &#x2265; 60 min) have previously been reported to generate significant transcriptomic alterations.
                <sup>
                    <xref ref-type="bibr" rid="ref21">21</xref>,
                    <xref ref-type="bibr" rid="ref25">25</xref>
                </sup> Additionally, the time required for each step of a computational analysis grows steeply with dataset size, thus limiting the number of iterations that can be computed to understand the underlying biology.</p>
            <p>In conclusion, understanding skin fibroblast heterogeneity is of great relevance not only in homeostasis, but also in ageing
                <sup>
                    <xref ref-type="bibr" rid="ref11">11</xref>,
                    <xref ref-type="bibr" rid="ref29">29</xref>
                </sup> and disease.
                <sup>
                    <xref ref-type="bibr" rid="ref30">30</xref>-
                    <xref ref-type="bibr" rid="ref34">34</xref>
                </sup> Further refinement of fibroblast subsets and their identity-defining features will provide a fruitful framework for the advancement of knowledge as well as for the development of novel therapeutic approaches in dermatological disease and skin cancer.</p>
        </sec>
        <sec id="sec10">
            <title>Data availability</title>
            <p>All data underlying the results are available as part of the article and no additional source data are required.</p>
        </sec>
        <sec id="sec11">
            <title>Software availability</title>
            <p>Notebooks to replicate this work can be found at: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/alexmascension/revisit_reynolds_fb">https://github.com/alexmascension/revisit_reynolds_fb</ext-link>.</p>
            <p>Processed notebooks and AnnData files can be found at: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4596374">https://doi.org/10.5281/zenodo.4596374</ext-link>.
                <sup>
                    <xref ref-type="bibr" rid="ref20">20</xref>
                </sup>
            </p>
            <p>License: Creative Commons Attribution 4.0 International.</p>
        </sec>
    </body>
    <back>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Svensson</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>da Veiga Beltrame</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pachter</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>A curated database reveals trends in single-cell transcriptomics.</article-title>
                    <source>

                        <italic toggle="yes">Database.</italic>
</source>
                    <year>2020</year>;<volume>2020</volume>.
                    <pub-id pub-id-type="pmid">33247933</pub-id>
                    <pub-id pub-id-type="doi">10.1093/database/baaa073</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7698659</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Phan</surname>
                            <given-names>QM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Driskell</surname>
                            <given-names>IM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Driskell</surname>
                            <given-names>RR</given-names>
                        </name>
</person-group>:
                    <article-title>The three Rs of single-cell RNA sequencing: reuse, refine, and resource.</article-title>
                    <source>

                        <italic toggle="yes">J Invest Dermatol.</italic>
</source>
                    <year>2021</year>;<volume>141</volume>(<issue>7</issue>):<fpage>1627</fpage>&#x2013;<lpage>1629</lpage>.
                    <pub-id pub-id-type="pmid">34167721</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jid.2021.01.002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schuster</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rockel</surname>
                            <given-names>JS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kapoor</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The inflammatory speech of fibroblasts.</article-title>
                    <source>

                        <italic toggle="yes">Immunol Rev.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="pmid">33987902</pub-id>
                    <pub-id pub-id-type="doi">10.1111/imr.12971</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sawant</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hinz</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sch&#x00f6;nborn</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A story of fibers and stress: Matrix-embedded signals for fibroblast activation in the skin.</article-title>
                    <source>

                        <italic toggle="yes">Wound Repair Regen.</italic>
</source>
                    <year>2021</year>;<volume>29</volume>:<fpage>515</fpage>&#x2013;<lpage>530</lpage>.
                    <pub-id pub-id-type="pmid">34081361</pub-id>
                    <pub-id pub-id-type="doi">10.1111/wrr.12950</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Correa-Gallegos</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rinkevich</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>Cutting into wound repair.</article-title>
                    <source>

                        <italic toggle="yes">FEBS J.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="pmid">34137168</pub-id>
                    <pub-id pub-id-type="doi">10.1111/febs.16078</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jiang</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rinkevich</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>Distinct fibroblasts in scars and regeneration.</article-title>
                    <source>

                        <italic toggle="yes">Curr Opin Genet Dev.</italic>
</source>
                    <year>2021</year>;<volume>70</volume>:<fpage>7</fpage>&#x2013;<lpage>14</lpage>.
                    <pub-id pub-id-type="pmid">34022662</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.gde.2021.04.005</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Puntambekar</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hesselberth</surname>
                            <given-names>JR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Riemondy</surname>
                            <given-names>KA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Cell-level metadata are indispensable for documenting single-cell sequencing datasets.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Biol.</italic>
</source>
                    <year>2021</year>;<volume>19</volume>(<issue>5</issue>).
                    <pub-id pub-id-type="pmid">33945522</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pbio.3001077</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8121533</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tabib</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Morse</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>SFRP2/DPP4 and FMO1/LSP1 define major fibroblast populations in human skin.</article-title>
                    <source>

                        <italic toggle="yes">J Invest Dermatol.</italic>
</source>
                    <year>2018</year>;<volume>138</volume>(<issue>4</issue>):<fpage>802</fpage>&#x2013;<lpage>810</lpage>.
                    <pub-id pub-id-type="pmid">29080679</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jid.2017.09.045</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7444611</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>He</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Suryawanshi</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Morozov</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell transcriptome analysis of human skin identifies novel fibroblast subpopulation and enrichment of immune subsets in atopic dermatitis.</article-title>
                    <source>

                        <italic toggle="yes">J Allergy Clin Immunol.</italic>
</source>
                    <year>2020</year>;<volume>145</volume>(<issue>6</issue>):<fpage>1615</fpage>&#x2013;<lpage>1628</lpage>.
                    <pub-id pub-id-type="pmid">32035984</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jaci.2020.01.042</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Vorstandlechner</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Laggner</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kalinina</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Deciphering the functional heterogeneity of skin fibroblasts using single-cell RNA sequencing.</article-title>
                    <source>

                        <italic toggle="yes">FASEB J.</italic>
</source>
                    <year>2020</year>;<volume>34</volume>(<issue>3</issue>):<fpage>3677</fpage>&#x2013;<lpage>3692</lpage>.
                    <pub-id pub-id-type="pmid">31930613</pub-id>
                    <pub-id pub-id-type="doi">10.1096/fj.201902001RR</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sol&#x00e9;-Boldo</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Raddatz</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sch&#x00fc;tz</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell transcriptomes of the human skin reveal age-related loss of fibroblast priming.</article-title>
                    <source>

                        <italic toggle="yes">Commun Biol.</italic>
</source>
                    <year>2020</year>;<volume>3</volume>(<issue>1</issue>).
                    <pub-id pub-id-type="pmid">32327715</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s42003-020-0922-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7181753</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ascensi&#x00f3;n</surname>
                            <given-names>AM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fuertes-&#x00c1;lvarez</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Iba&#x00f1;ez-Sol&#x00e9;</surname>
                            <given-names>O</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Human dermal fibroblast subpopulations are conserved across single-cell RNA sequencing studies.</article-title>
                    <source>

                        <italic toggle="yes">J Invest Dermatol.</italic>
</source>
                    <year>2021</year>;<volume>141</volume>(<issue>7</issue>):<fpage>1735</fpage>&#x2013;<lpage>1744</lpage>.
                    <pub-id pub-id-type="pmid">33385399</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jid.2020.11.028</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Reynolds</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vegh</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fletcher</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Developmental cell programs are co-opted in inflammatory skin disease.</article-title>
                    <source>

                        <italic toggle="yes">Science.</italic>
</source>
                    <year>2021</year>;<volume>371</volume>(<issue>6527</issue>).
                    <pub-id pub-id-type="pmid">33479125</pub-id>
                    <pub-id pub-id-type="doi">10.1126/science.aba6500</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wolf</surname>
                            <given-names>FA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Angerer</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Theis</surname>
                            <given-names>FJ</given-names>
                        </name>
</person-group>:
                    <article-title>Scanpy: large-scale single-cell gene expression data analysis.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2018</year>;<volume>19</volume>(<issue>1</issue>).
                    <pub-id pub-id-type="pmid">29409532</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-017-1382-0</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5802054</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ascensi&#x00f3;n</surname>
                            <given-names>AM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Iba&#x00f1;ez-Sol&#x00e9;</surname>
                            <given-names>O</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Inza</surname>
                            <given-names>I</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Triku: a feature selection method based on nearest neighbors for single-cell data.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.1101/2021.02.12.430764</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>McInnes</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Healy</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Melville</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>UMAP: Uniform manifold approximation and projection for dimension reduction.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:1802.03426.</italic>
</source>
                    <year>2018</year>.</mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Traag</surname>
                            <given-names>VA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Waltman</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>van Eck</surname>
                            <given-names>NJ</given-names>
                        </name>
</person-group>:
                    <article-title>From Louvain to Leiden: guaranteeing well-connected communities.</article-title>
                    <source>

                        <italic toggle="yes">Sci Rep.</italic>
</source>
                    <year>2019</year>;<volume>9</volume>:<fpage>5233</fpage>.
                    <pub-id pub-id-type="doi">10.1038/s41598-019-41695-z</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pola&#x0144;ski</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Young</surname>
                            <given-names>MD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Miao</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BBKNN: Fast batch alignment of single cell transcriptomes.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2020</year>;<volume>36</volume>(<issue>3</issue>):<fpage>964</fpage>&#x2013;<lpage>965</lpage>.
                    <pub-id pub-id-type="pmid">31400197</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btz625</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Fang</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wolf</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liao</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>zqfang/gseapy: gseapy-v0.10.3.</article-title>
                    <month>February</month> <year>2021</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.4553090</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ascensi&#x00f3;n</surname>
                            <given-names>AM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ara&#x00fa;zo-Bravo</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Izeta</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>The need to reassess single-cell RNA sequencing datasets: more is not always better.</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.4596374</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>van den Brink</surname>
                            <given-names>SC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sage</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>V&#x00e9;rtesy</surname>
                            <given-names>&#x00c1;</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2017</year>;<volume>14</volume>(<issue>10</issue>):<fpage>935</fpage>&#x2013;<lpage>936</lpage>.
                    <pub-id pub-id-type="pmid">28960196</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.4437</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>O&#x2019;Flanagan</surname>
                            <given-names>CH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Campbell</surname>
                            <given-names>KR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>AW</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Dissociation of solid tumor tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2019</year>;<volume>20</volume>(<issue>1</issue>).
                    <pub-id pub-id-type="pmid">31623682</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-019-1830-0</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6796327</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Denisenko</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Guo</surname>
                            <given-names>BB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jones</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>(<issue>1</issue>).
                    <pub-id pub-id-type="pmid">32487174</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-020-02048-6</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7265231</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Adam</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Potter</surname>
                            <given-names>AS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Potter</surname>
                            <given-names>SS</given-names>
                        </name>
</person-group>:
                    <article-title>Psychrophilic proteases dramatically reduce single-cell RNA-seq artifacts: a molecular atlas of kidney development.</article-title>
                    <source>

                        <italic toggle="yes">Development.</italic>
</source>
                    <year>2017</year>;<volume>144</volume>(<issue>19</issue>):<fpage>3625</fpage>&#x2013;<lpage>3632</lpage>.
                    <pub-id pub-id-type="pmid">28851704</pub-id>
                    <pub-id pub-id-type="doi">10.1242/dev.151142</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5665481</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Waise</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Parker</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rose-Zerilli</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>An optimised tissue disaggregation and data processing pipeline for characterising fibroblast phenotypes using single-cell RNA sequencing.</article-title>
                    <source>

                        <italic toggle="yes">Sci Rep.</italic>
</source>
                    <year>2019</year>;<volume>9</volume>:<fpage>9580</fpage>.
                    <pub-id pub-id-type="pmid">31270426</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41598-019-45842-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6610623</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xiao</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dai</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Locasale</surname>
                            <given-names>JW</given-names>
                        </name>
</person-group>:
                    <article-title>Metabolic landscape of the tumor microenvironment at single cell resolution.</article-title>
                    <source>

                        <italic toggle="yes">Nat Commun.</italic>
</source>
                    <year>2019</year>;<volume>10</volume>:<fpage>3763</fpage>.
                    <pub-id pub-id-type="pmid">31434891</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-019-11738-0</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6704063</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mohyeldin</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Garz&#x00f3;n-Muvdi</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Qui&#x00f1;ones-Hinojosa</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Oxygen in stem cell biology: A critical component of the stem cell niche.</article-title>
                    <source>

                        <italic toggle="yes">Cell Stem Cell.</italic>
</source>
                    <year>2010</year>;<volume>7</volume>(<issue>2</issue>):<fpage>150</fpage>&#x2013;<lpage>161</lpage>.
                    <pub-id pub-id-type="pmid">20682444</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.stem.2010.07.007</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Simon</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Keith</surname>
                            <given-names>B</given-names>
                        </name>
</person-group>:
                    <article-title>The role of oxygen availability in embryonic development and stem cell function.</article-title>
                    <source>

                        <italic toggle="yes">Nat Rev Mol Cell Biol.</italic>
</source>
                    <year>2008</year>;<volume>9</volume>(<issue>4</issue>):<fpage>285</fpage>&#x2013;<lpage>296</lpage>.
                    <pub-id pub-id-type="pmid">18285802</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrm2354</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2876333</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zou</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Long</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhao</surname>
                            <given-names>Q</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A single-cell transcriptomic atlas of human skin aging.</article-title>
                    <source>

                        <italic toggle="yes">Dev Cell.</italic>
</source>
                    <year>2021</year>;<volume>56</volume>(<issue>3</issue>):<fpage>1</fpage>&#x2013;<lpage>15</lpage>.
                    <pub-id pub-id-type="pmid">33238152</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.devcel.2020.11.002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rojahn</surname>
                            <given-names>TB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vorstandlechner</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Krausgruber</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell transcriptomics combined with interstitial fluid proteomics defines cell-type-specific immune regulation in atopic dermatitis.</article-title>
                    <source>

                        <italic toggle="yes">J Allergy Clin Immunol.</italic>
</source>
                    <year>2020</year>;<volume>146</volume>(<issue>5</issue>):<fpage>1056</fpage>&#x2013;<lpage>1069</lpage>.
                    <pub-id pub-id-type="pmid">32344053</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jaci.2020.03.041</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gao</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yao</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhai</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single cell transcriptional zonation of human psoriasis skin identifies an alternative immunoregulatory axis conducted by skin resident cells.</article-title>
                    <source>

                        <italic toggle="yes">Cell Death Dis.</italic>
</source>
                    <year>2021</year>;<volume>12</volume>:<fpage>450</fpage>.
                    <pub-id pub-id-type="pmid">33958582</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41419-021-03724-6</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8102483</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>HJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell transcriptomics applied to emigrating cells from psoriasis elucidate pathogenic vs. regulatory immune cell subsets.</article-title>
                    <source>

                        <italic toggle="yes">J Allergy Clin Immunol.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="pmid">33932468</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jaci.2021.04.021</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zeng</surname>
                            <given-names>Q</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell RNA-seq reveals lineage-specific regulatory changes of fibroblasts and vascular endothelial cells in keloid.</article-title>
                    <source>

                        <italic toggle="yes">J Invest Dermatol.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="pmid">34242659</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jid.2021.06.010</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chang</surname>
                            <given-names>H-W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>Z-M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell RNA sequencing of psoriatic skin identifies pathogenic TC17 cell subsets and reveals distinctions between CD8+ T cells in autoimmunity and cancer.</article-title>
                    <source>

                        <italic toggle="yes">J Allergy Clin Immunol.</italic>
</source>
                    <year>2021</year>;<volume>147</volume>(<issue>6</issue>):<fpage>2370</fpage>&#x2013;<lpage>2380</lpage>.
                    <pub-id pub-id-type="pmid">33309739</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jaci.2020.11.028</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report95141">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.58386.r95141</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Zhou</surname>
                        <given-names>(Jo) Huiqing</given-names>
                    </name>
                    <xref ref-type="aff" rid="r95141a1">1</xref>
                    <xref ref-type="aff" rid="r95141a2">2</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2434-3986</uri>
                </contrib>
                <aff id="r95141a1">
                    <label>1</label>Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands</aff>
                <aff id="r95141a2">
                    <label>2</label>Human Genetics, Radboudumc, Nijmegen, The Netherlands</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>20</day>
                <month>12</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Zhou (H</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport95141" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.54864.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>In this article, the authors re-analysed a large dataset published in a high-impact journal and compared the outcome of analyzing this dataset with several other previously published works. This effort is highly appreciated and should be encouraged in the field, to critically evaluate and make sense of published work, with the ultimate aim to understand biology.</p>
            <p> </p>
            <p> Specifically, the authors re-analysed the skin single-cell RNA-seq data published by Reynolds&#x00a0;
                <italic>et al</italic>., 2021. They initially focused on the healthy fibroblast population, using the authors&#x2019; previously published annotations, and identified an enhanced signature of stress- and hypoxia-related genes. They then further analysed several other cell populations of the dataset and reported that this enhanced stress and hypoxia signature is also present in other cell populations. Furthermore, using a normalised dataset where hypoxia and stressed cells could merge with the normal cells, they showed that the stress signature seems to be &#x2018;irreversible&#x2019;, in contrast to the hypoxia signature.</p>
            <p> </p>
            <p> The workflow and methods seem to be appropriate, and the conclusion on the enhanced stress and hypoxia gene signature in the analysed dataset is also convincing (for a non-computational biologist who has a good understanding of common bioinformatics analysis and interpretation). This is consistent with the authors&#x2019; discussions on the cell dissociation/processing methods. However, the title &#x2018;more is not always better&#x2019; can be discussed and reconsidered. The authors should probably give advice on careful dissociation methods when retrieving a large number of cells, rather than proposing that a large number of cells is not necessary or desirable. In addition, the authors did not give an extensive discussion on the data analysis effort involved in analyzing large datasets. It is also appropriate for the authors to comment on the data retrieval process of the analysed dataset, and to encourage the authors of the original paper to annotate their data properly and clearly (e.g., which methods were used to generate which batch of data). This is true for all publications and it is the only way for scientists to share and re-use the published data according to the FAIR principle (
                <ext-link ext-link-type="uri" xlink:href="https://www.go-fair.org/fair-principles/">https://www.go-fair.org/fair-principles/</ext-link>).</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>(Epidermal) stem cells; transcriptomics, epigenomics, developmental disease</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment7913-95141">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Izeta</surname>
                            <given-names>Ander</given-names>
                        </name>
                        <aff>Biodonostia Health Research Institute, Spain</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>2</day>
                    <month>3</month>
                    <year>2022</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank the reviewer for the helpful and insightful comments. We will now comment on the issues point-by-point.</p>
                <p> </p>
                <p> 
                    <italic>The title &#x2018;more is not always better&#x2019; can be discussed and reconsidered</italic>
                </p>
                <p> </p>
                <p> This comment is in line with that of the other reviewer. Therefore, we have changed the title to focus the paper mainly on biological sample processing.</p>
                <p> </p>
                <p> 
                    <italic>The authors should probably give advice on careful dissociation methods when retrieving a large number of cells.</italic>
                </p>
                <p> </p>
                <p> We agree with this excellent point. The original article had already cited the reference by Denisenko 
                    <italic>et al.</italic> (2020), where the importance of cold dissociation was demonstrated. The use of cold-active psychrophilic proteases has been proposed to avoid the use of enzymes that need incubation at 37&#x00b0;C (Potter and Potter, 2019). Both references are now included in the text.</p>
                <p> </p>
                <p> 
                    <italic>[...] the authors did not give an extensive discussion on data analysis effort in analyzing the large dataset</italic>
                </p>
                <p> </p>
                <p> We analyzed the source code of the original study, which is available as scripts in the GitHub repository https://github.com/haniffalab/HCA_skin as well as in the Zenodo repository https://zenodo.org/record/4249674. Although the scripts in the GitHub repository are conveniently ordered and legible, the Zenodo repository does not include the output results and intermediate figures from the scripts, so we were unable to check the values and intermediate results. Also, some parts of the scripts would have benefited from further commenting. Additionally, we observed that the file structures of the GitHub and Zenodo repositories were not comparable. For instance, the <italic>Pipeline</italic> folder with the structured analysis is missing from the Zenodo repository, although some of its scripts are scattered across different folders. Another issue with the GitHub repository is that the analysis scripts were uploaded in their final form (commits 1766cd, 9b520f, 9a361e and 123b7d, 8 Dec 2020). Considering that the scripts on Zenodo were uploaded in November, some efforts to tidy the GitHub repository were made afterwards. However, a quick look at the scripts from the different sections shows a lack of consistency in variables and file I/O, which undermines the reproducibility of the scripts. Additionally, despite the succinct explanation in the README file of the GitHub repository, the lack of comments in the scripts and the presence of entire scripts that are not reflected in the methods hinder any replication effort.</p>
                <p> </p>
                <p> Regarding the initial analysis of the dataset, from the Python scripts we observed a set of common thresholds for all batches (
                    <italic>n_genes &lt; 6000, n_genes &gt; 400, n_counts &gt; 1000, percent_mito &lt; 0.2</italic>). Although this is common practice, per-batch thresholding might have been more suitable. For instance, we observed that sample SKN8105197 does not show consistent fibroblast marker expression. For convenience, this sample was removed from the analysis.</p>
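                <p> </p>
                <p> As an illustration of the thresholds listed above, the following minimal Python sketch applies them to a few cells and shows how a per-batch override could be expressed. It is not taken from the original scripts and assumes that the listed inequalities describe the cells that are kept; the cell records, the batch names, the helper names passes_qc and filter_cells, and the stricter mitochondrial cutoff for batch B are hypothetical.</p>
                <code language="python"><![CDATA[
```python
# Global QC limits quoted in the text above (applied to every batch).
GLOBAL_QC = {"min_genes": 400, "max_genes": 6000, "min_counts": 1000, "max_mito": 0.2}


def passes_qc(cell, limits):
    """Return True if a cell satisfies all four QC limits."""
    return (
        limits["min_genes"] < cell["n_genes"] < limits["max_genes"]
        and cell["n_counts"] > limits["min_counts"]
        and cell["percent_mito"] < limits["max_mito"]
    )


def filter_cells(cells, per_batch=None):
    """Keep cells passing QC; per_batch maps a batch id to overriding limits."""
    per_batch = per_batch or {}
    kept = []
    for cell in cells:
        # Start from the global limits and apply any batch-specific override.
        limits = {**GLOBAL_QC, **per_batch.get(cell["batch"], {})}
        if passes_qc(cell, limits):
            kept.append(cell)
    return kept


# Hypothetical toy cells; batch names and values are invented for illustration.
cells = [
    {"id": "c1", "batch": "A", "n_genes": 2500, "n_counts": 8000, "percent_mito": 0.05},
    {"id": "c2", "batch": "A", "n_genes": 350, "n_counts": 1200, "percent_mito": 0.03},
    {"id": "c3", "batch": "B", "n_genes": 3000, "n_counts": 9000, "percent_mito": 0.15},
    {"id": "c4", "batch": "B", "n_genes": 7000, "n_counts": 30000, "percent_mito": 0.02},
]

# Same limits for every batch, as in the scripts described above.
global_ids = [c["id"] for c in filter_cells(cells)]
# A stricter, hypothetical mitochondrial cutoff for batch B only.
strict_ids = [c["id"] for c in filter_cells(cells, {"B": {"max_mito": 0.1}})]
```
]]></code>
                <p> With these toy values, the global limits keep cells c1 and c3, and the batch-specific cutoff additionally removes c3.</p>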
                <p> </p>
                <p> After QC and feature selection, bbknn is run with default parameters, although in other notebooks harmony has also been partially used. bbknn seems to be favored as the reference batch correction method. The lack of commit history does not allow us to look for previous attempts with other methods.</p>
                <p> </p>
                <p> In the Pipeline/02 folder we found some scripts that use a logistic regression model for label stability prediction. This analysis is not described in the methods section.</p>
                <p> </p>
                <p> In the Pipeline/03 folder we observed that an enhanced reclustering was performed with hand-picked DEGs which, according to the authors, favored the separation of clusters. Interestingly, some of these DEGs are found in the hypoxia/stress lists (ZFP36, HSPA1A, HSPA1B, DNAJB1, JUNB, ATF3, SOCS3, GADD45B, FOS), and others are ribosomal protein-associated genes (RPL22, RPL37, RPL34). In our opinion, the selected DEGs might bias the reclustering towards the segregation of hypoxic and stress populations.</p>
                <p> </p>
                <p> 
                    <italic>It is also appropriate for the authors to comment on the data retrieval process of the analyses dataset, and to encourage the authors of the original paper to annotate their data properly and clearly</italic>
                </p>
                <p> </p>
                <p> Regarding the data retrieval process, neither the GitHub nor the Zenodo repositories host the code for data retrieval and preprocessing described in materials and methods.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report98248">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.58386.r98248</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Long</surname>
                        <given-names>Xiao</given-names>
                    </name>
                    <xref ref-type="aff" rid="r98248a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Li</surname>
                        <given-names>Zhujun</given-names>
                    </name>
                    <xref ref-type="aff" rid="r98248a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r98248a1">
                    <label>1</label>Department of Plastic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>1</day>
                <month>12</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Long X and Li Z</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport98248" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.54864.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The study reanalyzed data from several single-cell RNA sequencing resources, proposing an interesting point that a larger number of cells could lead to compromised results. It is well designed and conducted. The bioinformatic and statistical methods were valid.</p>
            <p> </p>
            <p> The authors brought up an interesting point of view. However, large numbers of cells might not have a definite positive correlation with longer processing time. For example, as the authors discussed, the choice of the method during tissue processing could be an important contributing factor to the biased results. Therefore, the conclusion should focus more on the bias caused by processing time and choice of method, and maybe quality control measures, instead of larger cell numbers, because that would require further experiments and analysis to rule out these confounding factors.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Adipose derived stem cells</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment7912-98248">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Izeta</surname>
                            <given-names>Ander</given-names>
                        </name>
                        <aff>Biodonostia Health Research Institute, Spain</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>2</day>
                    <month>3</month>
                    <year>2022</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank the reviewer for the insight on focusing more on the biological processing of the sample. Although we are limited by the information from the article, we believe that poor execution of the sample processing is a leading cause of the artifacts observed in the analysis. Indeed, we added some references supporting the claim that a general stress transcriptomic profile is produced by the prolonged warm dissociation that the cells endured.</p>
                <p> </p>
                <p> However, we believe that the long processing time and its probable consequence, insufficient analytical insight into the dataset, cannot be ignored. In fact, we added a new figure (Figure 4) showing a relevant increase in running times as the number of cells analyzed increases. Additionally, we have observed a trend in publications in top-ranked journals that consists of presenting datasets with a diverse and complex set of cells, but which lack an in-depth analysis and do not produce sufficiently insightful biological results. This trend ultimately affects the quality of the datasets. In our opinion, this might be related to the limited time data scientists have had to iterate analyses with such a dataset.</p>
                <p> </p>
                <p> We thus postulate that prospective authors should revise sample processing strategies as well as data analysis protocols, so that sampling errors can be pinpointed and corrected upstream in the analysis pipeline. In conclusion, we have reformulated our statements to emphasize the importance of biological sample processing, as suggested by the reviewer.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
