<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.15809.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Comparison of clustering tools in R for medium-sized 10x Genomics single-cell RNA-sequencing data</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 3 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Freytag</surname>
                        <given-names>Saskia</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2185-7068</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Tian</surname>
                        <given-names>Luyi</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3420-3685</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>L&#x00f6;nnstedt</surname>
                        <given-names>Ingrid</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Ng</surname>
                        <given-names>Milica</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Bahlo</surname>
                        <given-names>Melanie</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5132-0774</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Parkville, Australia</aff>
                <aff id="a2">
                    <label>2</label>Department of Medical Biology, University of Melbourne, Parkville, Australia</aff>
                <aff id="a3">
                    <label>3</label>Molecular Medicine Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Australia</aff>
                <aff id="a4">
                    <label>4</label>Bio21 Institute, CSL Limited, Parkville, Australia</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:freytag.s@wehi.edu.au">freytag.s@wehi.edu.au</email>
                </corresp>
                <fn fn-type="con">
                    <p>Freytag S: Conceptualization, Data Curation, Funding Acquisition, Formal Analysis, Investigation, Methodology, Software, Visualization, Writing &#x2013; Original Draft Preparation, Writing &#x2013; Review &amp; Editing</p>
                    <p>Tian L: Investigation, Writing &#x2013; Review &amp; Editing</p>
                    <p>L&#x00f6;nnstedt I: Conceptualization, Methodology, Writing &#x2013; Review &amp; Editing</p>
                    <p>Ng M: Conceptualization, Investigation, Funding Acquisition, Methodology, Writing &#x2013; Review &amp; Editing</p>
                    <p>Bahlo M: Supervision, Conceptualization, Investigation, Funding Acquisition, Methodology, Writing &#x2013; Review &amp; Editing</p>
                </fn>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>19</day>
                <month>12</month>
                <year>2018</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2018</year>
            </pub-date>
            <volume>7</volume>
            <elocation-id>1297</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>14</day>
                    <month>12</month>
                    <year>2018</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Freytag S et al.</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/7-1297/pdf"/>
            <abstract>
                <p>
                    <bold>Background:</bold> The commercially available 10x Genomics protocol to generate droplet-based single-cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster identical or similar cells into non-overlapping groups. Many competing methods have been proposed for this task, but there is currently little guidance with regard to which method to use.</p>
                <p>
                    <bold>Methods:</bold> Here we use one gold standard 10x Genomics dataset, generated from a mixture of three cell lines, as well as three silver standard 10x Genomics datasets generated from peripheral blood mononuclear cells, to examine not only the accuracy but also the robustness of a dozen methods.</p>
                <p>
                    <bold>Results:</bold> We found that some methods, including Seurat and Cell Ranger, outperform other methods, although performance seems to be dependent on the complexity of the studied system. Furthermore, we found that solutions produced by different methods have little in common with each other.</p>
                <p>
                    <bold>Conclusions:</bold> In light of this we conclude that the choice of clustering tool crucially determines interpretation of scRNA-seq data generated by 10x Genomics. Hence practitioners and consumers should remain vigilant about the outcome of 10x Genomics scRNA-seq analysis.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Clustering</kwd>
                <kwd>Single-Cell RNA-seq</kwd>
                <kwd>Benchmarking</kwd>
                <kwd>10x Genomics</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/501100000925">
                    <funding-source>National Health and Medical Research Council</funding-source>
                    <award-id>SeniorResearchFellowship110297</award-id>
                    <award-id>ProgramGrant1054618</award-id>
                    <award-id>IRIIS</award-id>
                </award-group>
                <funding-statement>We would like to thank the Australian Genome Research Facility and the Genomics Innovation Hub for their generous support of this project, including funding. This work was also supported by the Victorian Government&#x2019;s Operational Infrastructure Support Program and Australian Government NHMRC IRIIS. MB is funded by NHMRC Senior Research Fellowship 110297 and NHMRC Program Grant 1054618.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>We thank all three reviewers for reviewing our manuscript and for their constructive comments. In response, we have made the following modifications to the manuscript:</p>
                <list list-type="bullet">
                    <list-item>
                        <p>Added a discussion that summarizes the overall performance of all methods</p>
                    </list-item>
                    <list-item>
                        <p>Clarified the design underlying the gold standard dataset</p>
                    </list-item>
                    <list-item>
                        <p>Included 2 further datasets generated from fresh PBMCs available in the TENxPBMCsData package</p>
                    </list-item>
                    <list-item>
                        <p>Clarified the Cell Ranger approach to preprocessing</p>
                    </list-item>
                    <list-item>
                        <p>Elaborated on failed methods</p>
                    </list-item>
                    <list-item>
                        <p>Elaborated on the use of performance metrics</p>
                    </list-item>
                    <list-item>
                        <p>Summarized the use of method and similarity metric by different clustering tools</p>
                    </list-item>
                    <list-item>
                        <p>Investigated whether different similarity metrics relate to performance</p>
                    </list-item>
                    <list-item>
                        <p>Included boxplots for stability assessment using ARI_truth</p>
                    </list-item>
                    <list-item>
                        <p>Added a table explaining the various performance assessments</p>
                    </list-item>
                    <list-item>
                        <p>Changed the stability assessment with regard to genes to be more realistic</p>
                    </list-item>
                    <list-item>
                        <p>Included a list with all parameters in the code repository</p>
                    </list-item>
                </list>
                <p>In addition, the text has been clarified in several places. Detailed responses to all points raised by the reviewers are available below.</p>
                <p>Since the inclusion of the TENxPBMCsData package required an update of the R version, we decided to assess all clustering methods for a second time using their newer versions.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Single-cell RNA-sequencing (scRNA-seq) studies have opened the way for new data-driven definitions of cell identity and function. No longer is a cell&#x2019;s type determined by arbitrary hierarchies and their respective predefined markers. Instead, a cell&#x2019;s transcriptional and epigenomic profile can now be used
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup> to accomplish this task. This is achieved using computational methods for scRNA-seq that characterize cells into novel and known cell types. Characterization consists of two steps: (i) unsupervised or semi-supervised clustering of identical or similar cells into non-overlapping groups, and (ii) labeling clusters, i.e. determining the cell type, or related cell types, represented by each cluster. Here, we focus on the first step of this process.</p>
            <p>Research into clustering has produced many algorithms for the task, including over 90 tools specifically designed for scRNA-seq
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>. Due to the relative youth of the field, there are currently no rules guiding the application of these clustering algorithms. Where tools have been tested outside synthetic scenarios, testing appears to be confined to scenarios with limited biological variability. Furthermore, most tools were developed and consequently tested only on the Fluidigm C1 protocol, despite considerable differences in throughput capabilities and sensitivities
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup> between the different scRNA-seq platforms. Here we focus solely on clustering performance on medium-sized scRNA-seq data generated by 10x Genomics, as it is currently the most widely used platform. Commercially available scRNA-seq platforms, like 10x Genomics&#x2019; Chromium, are being widely adopted due to their ease of use and relatively low cost per cell
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>. The 10x Genomics protocol uses a droplet-based system to isolate single cells. Each droplet contains all the necessary reagents for cell lysis, barcoding, reverse transcription and molecular tagging. This is followed by pooled PCR amplification and 3&#x2019; library preparation, after which standard Illumina short-read sequencing can be applied
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>. Unlike other commercially available scRNA-seq protocols, such as Fluidigm C1, 10x Genomics allows for sequencing of thousands of cells, albeit at much shallower read depths per cell and without allowing the use of fluorescence markers to establish cell identity. As such, the 10x Genomics platform is particularly suited to detailed characterization of heterogeneous tissues.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <p>In this study, we performed a comprehensive evaluation of a dozen clustering methods (
                <xref ref-type="table" rid="T1">Table 1</xref>). We focused on analysis methods available in the R language, as this is one of the most commonly used programming languages for scRNA-seq data analysis. The exception to this is the 10x Genomics software 
                <monospace>Cell Ranger</monospace>. Since many methods are still being actively developed, we include assessment of program versions available in October 2017 and April 2018. Our evaluation comprised four core aspects: (i) accuracy of clustering solutions compared to a gold standard (near absolute truth, limited variability and complexity), (ii) performance of clustering methods using silver standard data (no absolute truth, realistic variability and complexity), (iii) stability of clustering solutions, and (iv) miscellaneous characteristics, such as time and practicality.</p>
            <table-wrap id="T1" orientation="portrait" position="anchor">
                <label>Table 1. </label>
                <caption>
                    <title>Overview of the clustering tools included in this study, and several characteristics thereof.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Software</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Similarity Metric</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Clustering Method</th>
                            <th colspan="1" rowspan="1">Ref</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>ascend</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2017</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Euclidean distance</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Hierarchical clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-7">7</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>Cell Ranger</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Euclidean distance</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Graph-based clustering</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>CIDR</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2017</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Imputed dissimilarity</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Hierarchical clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-8">8</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>countClust</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2014</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">none</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Grade of membership models</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-9">9</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/dgrun/StemID">RaceID</ext-link>
                                </monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2015</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Pearson correlation</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">K-means clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-10">10</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/dgrun/RaceID">RaceID2</ext-link>
                                </monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Pearson correlation</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">K-means clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-11">11</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>RCA</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2017</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Pearson correlation</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Supervised hierarchical clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-12">12</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>SC3</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Euclidean distance</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Consensus clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-13">13</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>scran</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Euclidean distance</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Hierarchical clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-14">14</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>Seurat</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2015</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Euclidean distance</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Graph-based clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-15">15</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>SIMLR</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Multikernel learning</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Spectral clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-16">16</xref>
                            </td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <monospace>TSCAN</monospace>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2016</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">none</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Model-based clustering</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <xref ref-type="bibr" rid="ref-17">17</xref>
                            </td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <sec>
                <title>Data</title>
                <p>

                    <bold>

                        <italic toggle="yes">Gold standard.</italic>
</bold> Three human lung adenocarcinoma cell lines, HCC827, H1975 and H2228, were cultured separately
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>. The cell lines were obtained from ATCC and cultured in Roswell Park Memorial Institute 1640 medium with 10% fetal bovine serum (FBS, catalog number: 11875-176; Thermo Fisher Gibco) and 1% penicillin-streptomycin. The cells were grown independently at 37&#x00b0;C with 5% carbon dioxide until near 100% confluence. Before mixing cell lines, cells were dissociated into single-cell suspensions in FACS buffer (phosphate-buffered saline (PBS), catalog number: 14190-144; Thermo Fisher Gibco) with 5% FBS (catalog number: 35-076-CV; Corning) and stained with propidium iodide (catalog number: P21493; Thermo Fisher FluoroPure). For each cell line, 120,000 live cells were then sorted by FACS (BD FACSAria III flow cytometer, BD FACSDiva software version 7.0; BD Biosciences) to obtain an accurate, equal mixture of live cells from the three cell lines. The resulting mixture was then processed by the Chromium Controller (10x Genomics) using the Single Cell 3&#x2019; Reagent Kit v2 (Chromium Single Cell 3&#x2019; Library &amp; Gel Bead Kit v2, catalog number: 120237; Chromium Single Cell A Chip Kit, 48 runs, catalog number: 120236; 10x Genomics) (see 
                    <xref ref-type="table" rid="T2">Table 2</xref>). Afterwards, the library was sequenced using an Illumina NextSeq 500 and V4 chemistry (NextSeq 500/550 High Output Kit v2.5, 150 Cycles, catalog number: 20024907; Illumina) with 100 bp paired-end reads. RTA (version 1.18.66.3; Illumina) was used for base calling.</p>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Properties of all benchmarking datasets used in the study.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">
                                    <bold>Benchmark standard</bold>
                                </th>
                                <th align="left" colspan="1" rowspan="1">Gold</th>
                                <th align="left" colspan="5" rowspan="1">Silver</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Dataset</th>
                                <th align="left" colspan="1" rowspan="1"/>
                                <th align="left" colspan="1" rowspan="1">Dataset 1</th>
                                <th align="left" colspan="1" rowspan="1">Dataset 2/2a</th>
                                <th align="left" colspan="1" rowspan="1">Dataset 3/3a</th>
                                <th align="left" colspan="1" rowspan="1">Dataset 4</th>
                                <th align="left" colspan="1" rowspan="1">Dataset 5</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Tissue</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Cell lines</td>
                                <td align="left" colspan="1" rowspan="1">PBMCs</td>
                                <td align="left" colspan="1" rowspan="1">PBMCs</td>
                                <td align="left" colspan="1" rowspan="1">PBMCs</td>
                                <td align="left" colspan="1" rowspan="1">PBMCs</td>
                                <td align="left" colspan="1" rowspan="1">PBMCs</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Source</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">GSE111108</td>
                                <td align="left" colspan="1" rowspan="1">GSE115189</td>
                                <td align="left" colspan="1" rowspan="1">Website
                                    <xref ref-type="other" rid="FN1">*</xref>/
                                    <sup>
                                        <xref ref-type="other" rid="FN2">&amp;</xref>
                                    </sup>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Website
                                    <sup>
                                        <xref ref-type="other" rid="FN3">+</xref>
                                    </sup>/
                                    <sup>
                                        <xref ref-type="other" rid="FN4">&#x2212;</xref>
                                    </sup>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Website
                                    <sup>
                                        <xref ref-type="other" rid="FN5">#</xref>
                                    </sup>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Website
                                    <sup>
                                        <xref ref-type="other" rid="FN6">$</xref>
                                    </sup>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Instrument</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Chromium</td>
                                <td align="left" colspan="1" rowspan="1">Chromium</td>
                                <td align="left" colspan="1" rowspan="1">GemCode</td>
                                <td align="left" colspan="1" rowspan="1">Chromium</td>
                                <td align="left" colspan="1" rowspan="1">GemCode</td>
                                <td align="left" colspan="1" rowspan="1">Chromium</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Number of cells</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">1,039</td>
                                <td align="left" colspan="1" rowspan="1">3,372</td>
                                <td align="left" colspan="1" rowspan="1">2,691/2,700</td>
                                <td align="left" colspan="1" rowspan="1">4,337/4,340</td>
                                <td align="left" colspan="1" rowspan="1">5,419</td>
                                <td align="left" colspan="1" rowspan="1">8,381</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Total genes detected</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">29,451</td>
                                <td align="left" colspan="1" rowspan="1">24,654</td>
                                <td align="left" colspan="1" rowspan="1">20,693/16,634</td>
                                <td align="left" colspan="1" rowspan="1">25,820/19,773</td>
                                <td align="left" colspan="1" rowspan="1">28,117</td>
                                <td align="left" colspan="1" rowspan="1">21,425</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="7" rowspan="1">
                                    <italic toggle="yes">After preprocessing</italic>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Number of cells</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">925</td>
                                <td align="left" colspan="1" rowspan="1">3,205</td>
                                <td align="left" colspan="1" rowspan="1">2,590/2,592</td>
                                <td align="left" colspan="1" rowspan="1">4,292/4,310</td>
                                <td align="left" colspan="1" rowspan="1">5,310</td>
                                <td align="left" colspan="1" rowspan="1">8,352</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Mean counts per cell</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">114,426</td>
                                <td align="left" colspan="1" rowspan="1">3,818</td>
                                <td align="left" colspan="1" rowspan="1">2,605/2,432</td>
                                <td align="left" colspan="1" rowspan="1">4,528/4,368</td>
                                <td align="left" colspan="1" rowspan="1">2,057</td>
                                <td align="left" colspan="1" rowspan="1">4,650</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <bold>Median genes detected</bold>
                                    <break/>
                                    <bold>per cell</bold>
                                </td>
                                <td align="left" colspan="1" rowspan="1">8,499</td>
                                <td align="left" colspan="1" rowspan="1">1,158</td>
                                <td align="left" colspan="1" rowspan="1">877/824</td>
                                <td align="left" colspan="1" rowspan="1">1,318/1,237</td>
                                <td align="left" colspan="1" rowspan="1">721</td>
                                <td align="left" colspan="1" rowspan="1">1,299</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn id="FN1">
                            <p>*
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.0.0/pbmc3k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.0.0/pbmc3k</ext-link>
                            </p>
                        </fn>
                        <fn id="FN2">
                            <p>
                                <sup>&amp;</sup>
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k</ext-link>
                            </p>
                        </fn>
                        <fn id="FN3">
                            <p>
                                <sup>+</sup>
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.2.0/pbmc4k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.2.0/pbmc4k</ext-link>
                            </p>
                        </fn>
                        <fn id="FN4">
                            <p>
                                <sup>&#x2212;</sup>
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc4k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc4k</ext-link>
                            </p>
                        </fn>
                        <fn id="FN5">
                            <p>
                                <sup>#</sup>
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc6k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc6k</ext-link>
                            </p>
                        </fn>
                        <fn id="FN6">
                            <p>
                                <sup>$</sup>
                                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k</ext-link>
                            </p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
                <p>

                    <bold>
                        <italic toggle="yes">Silver standard.</italic>
</bold> We consider five scRNA-seq datasets of fresh human peripheral blood mononuclear cells (PBMCs) to be the silver standard (
                    <xref ref-type="table" rid="T2">Table 2</xref>). All datasets were generated using the 10x Genomics droplet system combined with Illumina sequencing. The Australian Genome Research Facility in partnership with CSL generated one dataset using the 10x Genomics Chromium system (Dataset 1). Four datasets were generated by 10x Genomics and are publicly available (Datasets 2-5). Of these, Datasets 2 and 4 were generated with an earlier version of the microfluidics instrument, the 10x Genomics GemCode Controller (
                    <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k">Dataset 2</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc6k">Dataset 4</ext-link>). Datasets 3 and 5 were generated with the latest instrument, the 10x Genomics Chromium Controller (
                    <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.2.0/pbmc4k">Dataset 3</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k">Dataset 5</ext-link>).</p>
                <p>For Dataset 1, PBMCs were isolated from whole blood obtained through the Australian Red Cross Blood Service in the following manner. First, 50 ml of blood was diluted using 50 ml of PBS (catalog number: D8537-500ml; Sigma-Aldrich). We then added 30 ml of Ficoll-Paque medium (catalog number: 17-1440-03; GE Healthcare). We then centrifuged at room temperature for 20 minutes at 400 g (Heraeus Multifuge 3 S-R Centrifuge, Thermo Fisher Scientific) and carefully removed the interface layer containing PBMCs, located between the top plasma layer and the middle layer. To remove the supernatant, we further centrifuged at 400 g for 10 minutes at room temperature. This process was repeated to remove the contaminating Ficoll medium or platelets. Finally, cells were resuspended in 20 ml of cell culture media with 5% FBS (RPMI-1640 Medium, catalog number: R0884-500ml, Sigma-Aldrich) and counted (Nikon Eclipse TS100 Microscope, Nikon). The resulting mixture was then processed by the Chromium Controller (10x Genomics) using the Single Cell 3&#x2019; Reagent Kit v2 (Chromium Single Cell 3&#x2019; Library &amp; Gel Bead Kit v2, catalog number: 120237; Chromium Single Cell A Chip Kit, 48 runs, catalog number: 120236; 10x Genomics). Afterwards, the library was sequenced on a HiSeq 2500 (Illumina) with V4 chemistry (HiSeq PE Cluster Kit v4 cBot, catalog number: PE-401-4001; HiSeq SBS Kit V4 50 cycles, catalog number: FC-401-4002; Illumina) and 101 bp paired-end reads. RTA (version 1.18.66.3, Illumina) was used for base calling.</p>
            </sec>
            <sec>
                <title>Preprocessing</title>
                <p>For Datasets 1-3, we used the 10x Genomics software 
                    <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest">Cell Ranger</ext-link> (version 2.0.0) to align reads against the GRCh38 (version 90) genome annotation, de-duplicate, filter barcodes and quantify genes. Note that Cell Ranger filters out any barcode that contains less than 10% of the 99
                    <sup>
                        <italic toggle="yes">th</italic>
                    </sup> percentile of total UMI counts per barcode, as these are considered to be barcodes associated with empty droplets. By design, a barcode can take one of 737,000 different sequences that comprise a whitelist. This design allows error correction when the observed barcode does not match any barcode on the whitelist due to sequencing error. Using the Bioconductor package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/bioc/html/scater.html">scater</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup> (version 1.6.3), we then removed low-quality cells, that is, cells with a low library size or a low number of expressed genes. We also removed cells with a high proportion of mitochondrial reads, as this can indicate apoptosis (programmed cell death). Stressed cells undergoing apoptosis have an aberrant transcriptome profile compared with living cells and are known to adversely influence transcriptome studies
                    <sup>
                        <xref ref-type="bibr" rid="ref-14">14</xref>
                    </sup>.</p>
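                <p>The barcode filtering rule stated above can be sketched in a few lines. The sketch is written in Python purely for illustration and follows only the rule as worded in this section; it is not Cell Ranger's implementation, which may differ in detail. The function names <monospace>umi_threshold</monospace> and <monospace>keep_barcodes</monospace>, the nearest-rank percentile and the toy barcodes are assumptions made for this example.</p>
                <code language="python" xml:space="preserve">
```python
import math


def umi_threshold(total_umis, percentile=99.0, fraction=0.10):
    """Return the UMI cutoff: `fraction` of the given percentile of per-barcode totals."""
    if not total_umis:
        raise ValueError("need at least one barcode")
    ordered = sorted(total_umis)
    # Nearest-rank percentile (an assumption; other percentile definitions exist).
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return fraction * ordered[rank - 1]


def keep_barcodes(umis_by_barcode, percentile=99.0, fraction=0.10):
    """Keep barcodes whose total UMI count reaches the cutoff."""
    cutoff = umi_threshold(list(umis_by_barcode.values()), percentile, fraction)
    return {bc: n for bc, n in umis_by_barcode.items() if n >= cutoff}


# Hypothetical total UMI counts per barcode (not taken from any dataset in this study).
toy_counts = {"AAAC": 5000, "AAAG": 4200, "AACT": 3900,
              "ACGT": 350, "AGTC": 40, "ATTG": 12}
kept = keep_barcodes(toy_counts)
print(sorted(kept))
```
</code>
                <p>In this toy example the 99th percentile is 5,000 UMIs, so the cutoff is 500 UMIs and the three barcodes with 350 or fewer UMIs are treated as empty droplets and discarded.</p>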
                <p>Preprocessed versions of Datasets 2-5 were available in the R package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/devel/data/experiment/html/TENxPBMCData.html">TENxPBMCData</ext-link>. However, their preprocessing was conducted with a Cell Ranger-modified version of the GRCh38 (version 90) genome annotation, resulting in slightly different versions of Datasets 2 and 3, which we refer to as Dataset 2a and Dataset 3a.</p>
            </sec>
            <sec>
                <title>Criteria for inclusion of clustering tool</title>
                <p>We based our selection of methods on the online list at 
                    <monospace>
                        <ext-link ext-link-type="uri" xlink:href="https://www.scrna-tools.org/">www.scRNA-tools.org</ext-link>
                    </monospace>
                    <sup>
                        <xref ref-type="bibr" rid="ref-2">2</xref>
                    </sup> in October 2017. We only considered methods with an R package that had sufficient documentation to enable easy installation and execution and that had at least one associated preprint or publication. Note that for some of the R packages the primary focus is not clustering, but the package authors explicitly describe how their packages can be applied to cluster scRNA-seq data. We excluded any methods that required extensive prior information not provided in the package, as well as any methods that continually failed to run (e.g. Linnorm
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup> because computation would time out and Monocle
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup> because calculation of dispersion resulted in errors). This resulted in the evaluation of 12 methods (see 
                    <xref ref-type="table" rid="T1">Table 1</xref> and for further details see 
                    <xref ref-type="other" rid="ST1">Supplementary Table 1</xref>) in the first evaluation (R version 3.4.3). During the second evaluation of the methods (R version 3.5.0), only 11 methods were still functional: 
                    <monospace>SIMLR</monospace> caused R to abort and had to be excluded.</p>
                <p>The aim of this study is to provide non-experts with guidance on the use of clustering methods. Hence, we used all clustering methods with their default parameters, as this represents the most common use case. In the case of 
                    <monospace>countClust</monospace> and 
                    <monospace>SIMLR</monospace>, the required parameters included the number of clusters, which we set to 3 for the gold standard dataset, 8 for the silver standard datasets in evaluation 1 (R version 3.4.3) and 20 for the silver standard datasets in evaluation 2 (R version 3.5.0). Marker genes were required for the analysis with 
                    <monospace>scran</monospace>, which we obtained by performing differential expression analyses on GSE86337 and an in-house dataset of isolated cell types in PBMCs
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup> for the gold standard and silver standard datasets, respectively. We also followed the upstream data handling, such as filtering of genes and normalization, described in the documentation of the respective clustering method. We concede that more care in the upstream data handling and selection of parameters could yield different results. However, given the extremely large number of possible parameter choices, we believe that this evaluation suffices to identify the strengths and weaknesses of each method.</p>
            </sec>
            <sec>
                <title>Methods for the comparison of clustering solutions</title>
                <p>To evaluate the similarity of different clustering solutions, we rely on two different metrics. We use the adjusted Rand index (ARI)
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup> and the normalized mutual information (NMI)
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>, two metrics routinely applied in the field of clustering to assess the similarity of clustering solutions or their similarity to a known truth. The NMI takes values from 0 to 1, with 0 signifying no shared information between two groupings and 1 signifying complete overlap. The ARI also equals 1 for complete overlap, has an expected value of 0 for random groupings and can become negative when the overlap is lower than expected by chance. These metrics are also applicable in the absence of known cluster labels. Furthermore, they share the following advantages: bounded ranges, symmetry and no assumptions regarding cluster structures.</p>
                <p>To evaluate the performance of the different clustering methods with regard to an underlying truth, we use the ARI as well as a homogeneity score
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>. The homogeneity score takes the value 1 when every cluster contains only data points that are members of a single known group. Values closer to 0 indicate that clusters contain a mixture of known groups. Unlike the ARI, this score does not penalize members of a single group being split into several clusters and thus serves as a complementary score to the ARI. Furthermore, both the ARI with regard to ground truth and the homogeneity score have bounded ranges and make no assumptions regarding cluster structures.</p>
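                <p>The behaviour of the homogeneity score described above can be illustrated with a short sketch, written in Python purely for illustration. It assumes the commonly used entropy-based definition, 1 &#x2212; H(known groups | clusters) / H(known groups); whether this matches the exact implementation used in this study is an assumption, as are the function name <monospace>homogeneity</monospace> and the toy labels.</p>
                <code language="python" xml:space="preserve">
```python
import math
from collections import Counter


def homogeneity(truth, clusters):
    """Entropy-based homogeneity: 1 - H(truth | clusters) / H(truth)."""
    if len(truth) != len(clusters):
        raise ValueError("label vectors must have equal length")
    n = len(truth)
    if n == 0:
        raise ValueError("need at least one cell")
    class_sizes = Counter(truth)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(truth, clusters))
    # Entropy of the known groups.
    h_truth = -sum(c / n * math.log(c / n) for c in class_sizes.values())
    if h_truth == 0.0:
        return 1.0  # a single known group: every cluster is trivially pure
    # Conditional entropy of the known groups given the cluster assignment.
    h_cond = -sum(m / n * math.log(m / cluster_sizes[k])
                  for (_, k), m in joint.items())
    return 1.0 - h_cond / h_truth


# Hypothetical labels for four cells from two known groups.
truth = ["B", "B", "T", "T"]
print(homogeneity(truth, [1, 2, 3, 3]))  # pure clusters, group "B" split in two
print(homogeneity(truth, [1, 1, 1, 1]))  # one cluster mixing both groups
```
</code>
                <p>The first call returns a score of 1 even though group B is split across two clusters, whereas the second call returns a score of 0 because the single cluster mixes both groups in equal proportion.</p>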
                <p>Let 
                    <italic toggle="yes">X</italic> be a finite set of size 
                    <italic toggle="yes">n</italic>. A clustering solution 
                    <italic toggle="yes">C</italic> is a set 
                    <italic toggle="yes">C</italic>
                    <sub>1</sub>, . . . , 
                    <italic toggle="yes">C
                        <sub>k</sub>
                    </italic> of non-empty disjoint subsets of 
                    <italic toggle="yes">X</italic> such that their union equals 
                    <italic toggle="yes">X</italic>. Let 
                    <inline-formula>
                        <mml:math display="inline" id="M1">
                            <mml:mrow>
                                <mml:mi>C</mml:mi>
                                <mml:mo>&#x2032;</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:msubsup>
                                    <mml:mi>C</mml:mi>
                                    <mml:mn>1</mml:mn>
                                    <mml:mo>&#x2032;</mml:mo>
                                </mml:msubsup>
                                <mml:mo>,</mml:mo>
                                <mml:mo>&#x2026;</mml:mo>
                                <mml:mo>,</mml:mo>
                                <mml:msubsup>
                                    <mml:mi>C</mml:mi>
                                    <mml:mi>l</mml:mi>
                                    <mml:mo>&#x2032;</mml:mo>
                                </mml:msubsup>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> be a second clustering solution or the supervised labeling solution with the same properties. The contingency table 
                    <italic toggle="yes">M</italic> = (
                    <italic toggle="yes">m
                        <sub>ij</sub>
                    </italic> ) of the pair of sets 
                    <italic toggle="yes">C</italic>, 
                    <italic toggle="yes">C</italic>&#x2032; is a 
                    <italic toggle="yes">k</italic> &#x00d7; 
                    <italic toggle="yes">l</italic> matrix whose 
                    <italic toggle="yes">i</italic>, 
                    <italic toggle="yes">j</italic>-th entry equals the number of elements in the intersection of clusters 
                    <italic toggle="yes">C
                        <sub>i</sub>
                    </italic> and 
                    <inline-formula>
                        <mml:math display="inline" id="M3">
                            <mml:mrow>
                                <mml:msubsup>
                                    <mml:mi>C</mml:mi>
                                    <mml:mi>j</mml:mi>
                                    <mml:mo>&#x2032;</mml:mo>
                                </mml:msubsup>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>: 
                    <disp-formula id="e1">
                        <mml:math display="block" id="math1">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>m</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>j</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mrow>
                                    <mml:mo>|</mml:mo>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>C</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                        <mml:mo>&#x2229;</mml:mo>
                                        <mml:msubsup>
                                            <mml:mi>C</mml:mi>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>&#x2032;</mml:mo>
                                        </mml:msubsup>
                                    </mml:mrow>
                                    <mml:mo>|</mml:mo>
                                </mml:mrow>
                                <mml:mo>,</mml:mo>
                                <mml:mn>1</mml:mn>
                                <mml:mo>&#x2264;</mml:mo>
                                <mml:mi>i</mml:mi>
                                <mml:mo>&#x2264;</mml:mo>
                                <mml:mi>k</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mn>1</mml:mn>
                                <mml:mo>&#x2264;</mml:mo>
                                <mml:mi>j</mml:mi>
                                <mml:mo>&#x2264;</mml:mo>
                                <mml:mi>l</mml:mi>
                                <mml:mo>.</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula>
                </p>
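                <p>As an illustration of the definition above, the contingency table can be built directly from two label vectors over the same cells. The following Python sketch is illustrative only; the function name <monospace>contingency_table</monospace> and the toy labels are assumptions made for this example.</p>
                <code language="python" xml:space="preserve">
```python
def contingency_table(c, c_prime):
    """Cross-tabulate two labelings of the same cells.

    Returns (row_labels, col_labels, m), where m[i][j] is the number of
    cells with label row_labels[i] in c and col_labels[j] in c_prime.
    """
    if len(c) != len(c_prime):
        raise ValueError("both labelings must cover the same cells")
    # Keep labels in order of first appearance (works for any hashable label).
    row_labels = list(dict.fromkeys(c))
    col_labels = list(dict.fromkeys(c_prime))
    row_of = {label: i for i, label in enumerate(row_labels)}
    col_of = {label: j for j, label in enumerate(col_labels)}
    m = [[0] * len(col_labels) for _ in row_labels]
    for a, b in zip(c, c_prime):
        m[row_of[a]][col_of[b]] += 1
    return row_labels, col_labels, m


# Hypothetical labelings of five cells: k = 3 clusters versus l = 2 known groups.
clusters = [1, 1, 2, 2, 3]
known = ["a", "a", "a", "b", "b"]
rows, cols, m = contingency_table(clusters, known)
print(rows, cols, m)
```
</code>
                <p>Here the table is the 3 &#x00d7; 2 matrix [[2, 0], [1, 1], [0, 1]]. Its entries sum to the number of cells, and its row and column sums give the sizes of the clusters in each of the two solutions.</p>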
                <p>
                    <bold>
                        <italic toggle="yes">ARI</italic>
                    </bold>
                </p>
                <p>
                    <disp-formula id="e2">
                        <mml:math display="block" id="math3">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>d</mml:mi>
                                        <mml:mi>j</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>C</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msup>
                                    <mml:mi>C</mml:mi>
                                    <mml:mo>&#x2032;</mml:mo>
                                </mml:msup>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mstyle displaystyle="true">
                                            <mml:msubsup>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>i</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                                <mml:mi>k</mml:mi>
                                            </mml:msubsup>
                                            <mml:mrow>
                                                <mml:mstyle displaystyle="true">
                                                    <mml:msubsup>
                                                        <mml:mo>&#x2211;</mml:mo>
                                                        <mml:mrow>
                                                            <mml:mi>j</mml:mi>
                                                            <mml:mo>=</mml:mo>
                                                            <mml:mn>1</mml:mn>
                                                        </mml:mrow>
                                                        <mml:mi>l</mml:mi>
                                                    </mml:msubsup>
                                                    <mml:mrow>
                                                        <mml:mrow>
                                                            <mml:mo>(</mml:mo>
                                                            <mml:mrow>
                                                                <mml:mtable>
                                                                    <mml:mtr>
                                                                        <mml:mtd>
                                                                            <mml:mrow>
                                                                                <mml:msub>
                                                                                    <mml:mi>m</mml:mi>
                                                                                    <mml:mrow>
                                                                                        <mml:mi>i</mml:mi>
                                                                                        <mml:mi>j</mml:mi>
                                                                                    </mml:mrow>
                                                                                </mml:msub>
                                                                            </mml:mrow>
                                                                        </mml:mtd>
                                                                    </mml:mtr>
                                                                    <mml:mtr>
                                                                        <mml:mtd>
                                                                            <mml:mn>2</mml:mn>
                                                                        </mml:mtd>
                                                                    </mml:mtr>
                                                                </mml:mtable>
                                                            </mml:mrow>
                                                            <mml:mo>)</mml:mo>
                                                        </mml:mrow>
                                                    </mml:mrow>
                                                </mml:mstyle>
                                            </mml:mrow>
                                        </mml:mstyle>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:msub>
                                            <mml:mi>t</mml:mi>
                                            <mml:mn>3</mml:mn>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mrow>
                                            <mml:mo>(</mml:mo>
                                            <mml:mrow>
                                                <mml:mfrac>
                                                    <mml:mn>1</mml:mn>
                                                    <mml:mn>2</mml:mn>
                                                </mml:mfrac>
                                                <mml:mrow>
                                                    <mml:mo>(</mml:mo>
                                                    <mml:mrow>
                                                        <mml:msub>
                                                            <mml:mi>t</mml:mi>
                                                            <mml:mn>1</mml:mn>
                                                        </mml:msub>
                                                        <mml:mo>+</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>t</mml:mi>
                                                            <mml:mn>2</mml:mn>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                    <mml:mo>)</mml:mo>
                                                </mml:mrow>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>t</mml:mi>
                                                    <mml:mn>3</mml:mn>
                                                </mml:msub>
                                            </mml:mrow>
                                            <mml:mo>)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo>,</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula>
                </p>
                <p> where 
                    <italic toggle="yes">t</italic>
                    <sub>1</sub> = 
                    <inline-formula>
                        <mml:math display="inline" id="M2">
                            <mml:mrow>
                                <mml:mstyle displaystyle="true">
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>i</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>k</mml:mi>
                                    </mml:msubsup>
                                    <mml:mrow>
                                        <mml:mrow>
                                            <mml:mo>(</mml:mo>
                                            <mml:mrow>
                                                <mml:mtable>
                                                    <mml:mtr>
                                                        <mml:mtd>
                                                            <mml:mrow>
                                                                <mml:mrow>
                                                                    <mml:mo>|</mml:mo>
                                                                    <mml:mrow>
                                                                        <mml:msub>
                                                                            <mml:mi>C</mml:mi>
                                                                            <mml:mi>i</mml:mi>
                                                                        </mml:msub>
                                                                    </mml:mrow>
                                                                    <mml:mo>|</mml:mo>
                                                                </mml:mrow>
                                                            </mml:mrow>
                                                        </mml:mtd>
                                                    </mml:mtr>
                                                    <mml:mtr>
                                                        <mml:mtd>
                                                            <mml:mn>2</mml:mn>
                                                        </mml:mtd>
                                                    </mml:mtr>
                                                </mml:mtable>
                                            </mml:mrow>
                                            <mml:mo>)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:mspace width="0.2em"/>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>t</mml:mi>
                                    <mml:mn>2</mml:mn>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>l</mml:mi>
                                    </mml:msubsup>
                                    <mml:mrow>
                                        <mml:mrow>
                                            <mml:mo>(</mml:mo>
                                            <mml:mrow>
                                                <mml:mtable>
                                                    <mml:mtr>
                                                        <mml:mtd>
                                                            <mml:mrow>
                                                                <mml:mrow>
                                                                    <mml:mo>|</mml:mo>
                                                                    <mml:mrow>
                                                                        <mml:msubsup>
                                                                            <mml:mi>C</mml:mi>
                                                                            <mml:mi>j</mml:mi>
                                                                            <mml:mo>&#x2032;</mml:mo>
                                                                        </mml:msubsup>
                                                                    </mml:mrow>
                                                                    <mml:mo>|</mml:mo>
                                                                </mml:mrow>
                                                            </mml:mrow>
                                                        </mml:mtd>
                                                    </mml:mtr>
                                                    <mml:mtr>
                                                        <mml:mtd>
                                                            <mml:mn>2</mml:mn>
                                                        </mml:mtd>
                                                    </mml:mtr>
                                                </mml:mtable>
                                            </mml:mrow>
                                            <mml:mo>)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mstyle>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline" id="M4">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>t</mml:mi>
                                    <mml:mn>3</mml:mn>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mn>2</mml:mn>
                                        <mml:msub>
                                            <mml:mi>t</mml:mi>
                                            <mml:mn>1</mml:mn>
                                        </mml:msub>
                                        <mml:mspace width="0.2em"/>
                                        <mml:msub>
                                            <mml:mi>t</mml:mi>
                                            <mml:mn>2</mml:mn>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>n</mml:mi>
                                        <mml:mrow>
                                            <mml:mo>(</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>n</mml:mi>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mn>1</mml:mn>
                                            </mml:mrow>
                                            <mml:mo>)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo>.</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> For ease of notation, this is referred to as ARI in the text, dropping the reference to specific pairs of sets. We further distinguish between ARI_truth, which compares a clustering solution to an underlying known or suspected truth, and ARI_comp, which compares two clustering solutions.</p>
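                <p>As an illustration only (this is not the code used in this study), the ARI can be computed directly from two label vectors. The Python sketch below follows the definitions of 
                    <italic toggle="yes">t</italic>
                    <sub>1</sub>, 
                    <italic toggle="yes">t</italic>
                    <sub>2</sub> and 
                    <italic toggle="yes">t</italic>
                    <sub>3</sub> given above and assumes that the leading term of the numerator is the usual sum over the contingency counts of pairs of cells grouped together in both clusterings. The function name and the handling of the degenerate case are assumptions.</p>
                <code language="python"><![CDATA[
```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same cells (illustrative sketch)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two cells are required")

    # Contingency counts: cells in cluster i of the first labeling
    # and cluster j of the second labeling.
    contingency = Counter(zip(labels_a, labels_b))

    # t1 and t2: within-cluster pairs for each labeling.
    t1 = sum(comb(size, 2) for size in Counter(labels_a).values())
    t2 = sum(comb(size, 2) for size in Counter(labels_b).values())
    # t3: expected number of jointly grouped pairs under chance.
    t3 = 2 * t1 * t2 / (n * (n - 1))

    # Pairs of cells grouped together in both labelings.
    pairs_together = sum(comb(count, 2) for count in contingency.values())

    denominator = 0.5 * (t1 + t2) - t3
    if denominator == 0:
        # Degenerate case, e.g. both labelings contain a single cluster.
        # Returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return (pairs_together - t3) / denominator
```
]]></code>
                <p>Because only co-membership of pairs of cells enters the calculation, relabeling the clusters leaves the value unchanged.</p>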
                <p>
                    <bold>
                        <italic toggle="yes">NMI</italic>
                    </bold> 
                    <disp-formula id="e3">
                        <mml:math display="block" id="math6">
                            <mml:mrow>
                                <mml:mi>N</mml:mi>
                                <mml:mi>M</mml:mi>
                                <mml:msub>
                                    <mml:mi>I</mml:mi>
                                    <mml:mn>1</mml:mn>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>I</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>C</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:msup>
                                            <mml:mi>C</mml:mi>
                                            <mml:mo>&#x2032;</mml:mo>
                                        </mml:msup>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:msqrt>
                                            <mml:mrow>
                                                <mml:mi>H</mml:mi>
                                                <mml:mo stretchy="false">(</mml:mo>
                                                <mml:mi>C</mml:mi>
                                                <mml:mo stretchy="false">)</mml:mo>
                                                <mml:mi>H</mml:mi>
                                                <mml:mo stretchy="false">(</mml:mo>
                                                <mml:msup>
                                                    <mml:mi>C</mml:mi>
                                                    <mml:mo>&#x2032;</mml:mo>
                                                </mml:msup>
                                                <mml:mo stretchy="false">)</mml:mo>
                                            </mml:mrow>
                                        </mml:msqrt>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo>,</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> where 
                    <italic toggle="yes">H</italic>(
                    <italic toggle="yes">C</italic>) = 
                    <italic toggle="yes">I</italic>(
                    <italic toggle="yes">C</italic>, 
                    <italic toggle="yes">C</italic>) is the entropy of 
                    <italic toggle="yes">C</italic>. Note that 
                    <disp-formula id="e4">
                        <mml:math display="block" id="math7">
                            <mml:mrow>
                                <mml:mi>I</mml:mi>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>C</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msup>
                                    <mml:mi>C</mml:mi>
                                    <mml:mo>&#x2032;</mml:mo>
                                </mml:msup>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>i</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>k</mml:mi>
                                    </mml:munderover>
                                    <mml:mrow>
                                        <mml:mstyle displaystyle="true">
                                            <mml:munderover>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>j</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                                <mml:mi>l</mml:mi>
                                            </mml:munderover>
                                            <mml:mrow>
                                                <mml:mi>P</mml:mi>
                                                <mml:mo stretchy="false">(</mml:mo>
                                                <mml:mi>i</mml:mi>
                                                <mml:mo>,</mml:mo>
                                                <mml:mi>j</mml:mi>
                                                <mml:mo stretchy="false">)</mml:mo>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mi>log</mml:mi>
                                                        <mml:mo>&#x2061;</mml:mo>
                                                    </mml:mrow>
                                                    <mml:mn>2</mml:mn>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:mstyle>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>i</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:mi>j</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>i</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>j</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo>,</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> where 
                    <inline-formula>
                        <mml:math id="math8">
                            <mml:mrow>
                                <mml:mi>P</mml:mi>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>i</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>j</mml:mi>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>m</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>i</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mi>n</mml:mi>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math id="math9">
                            <mml:mrow>
                                <mml:mi>P</mml:mi>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>i</mml:mi>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mrow>
                                            <mml:mo>|</mml:mo>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mi>C</mml:mi>
                                                    <mml:mi>i</mml:mi>
                                                </mml:msub>
                                            </mml:mrow>
                                            <mml:mo>|</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                    <mml:mi>n</mml:mi>
                                </mml:mfrac>
                                <mml:mo>,</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> is the mutual information of 
                    <italic toggle="yes">C</italic> and 
                    <italic toggle="yes">C</italic>&#x2018;.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Homogeneity.</italic>
                    </bold> Now let us assume 
                    <italic toggle="yes">C</italic>&#x2032; is the known and correct grouping of the cells. Then, 
                    <disp-formula id="e5">
                        <mml:math display="block" id="math10">
                            <mml:mrow>
                                <mml:mtext>Homogeneity</mml:mtext>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>I</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>C</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:msup>
                                            <mml:mi>C</mml:mi>
                                            <mml:mo>&#x2032;</mml:mo>
                                        </mml:msup>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>H</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:msup>
                                            <mml:mi>C</mml:mi>
                                            <mml:mo>&#x2032;</mml:mo>
                                        </mml:msup>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo>.</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula>
                </p>
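                <p>To illustrate how homogeneity behaves, the following Python sketch (an illustration under stated assumptions, not the benchmark code) evaluates the formula above on toy labels. A clustering that splits each true group into two subclusters keeps a homogeneity of 1, whereas a clustering that merges all cells into one cluster has a homogeneity of 0. The labels, the function name and the convention for a single true group are hypothetical.</p>
                <code language="python"><![CDATA[
```python
from collections import Counter
from math import log2


def homogeneity(clusters, truth):
    """I(C, C') / H(C'), where `truth` plays the role of C'."""
    if len(clusters) != len(truth):
        raise ValueError("both labelings must cover the same cells")
    n = len(truth)
    joint = Counter(zip(clusters, truth))
    p_cluster = {i: size / n for i, size in Counter(clusters).items()}
    p_truth = {j: size / n for j, size in Counter(truth).items()}

    mutual_info = sum(
        (m_ij / n) * log2((m_ij / n) / (p_cluster[i] * p_truth[j]))
        for (i, j), m_ij in joint.items()
    )
    entropy_truth = -sum(p * log2(p) for p in p_truth.values())
    if entropy_truth == 0:
        # Only one true group: every cluster is trivially pure.
        # Returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return mutual_info / entropy_truth


# Toy data: two true groups of four cells each.
truth = ["A"] * 4 + ["B"] * 4
over_split = [0, 0, 1, 1, 2, 2, 3, 3]  # each true group split in two
merged = [0] * 8                       # all cells in a single cluster

homogeneity_over_split = homogeneity(over_split, truth)
homogeneity_merged = homogeneity(merged, truth)
```
]]></code>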
            </sec>
            <sec>
                <title>Performance assessment</title>
                <p>We evaluated accuracy, robustness and running time for all methods (for the detailed benchmarking plan see 
                    <xref ref-type="other" rid="ST1">Supplementary Table 2</xref>). Some assessments were run in both R version 3.4.3 and R version 3.5.0; others were performed in only one R version.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Gold standard.</italic>
                    </bold> The gold standard dataset consists of a mixture of three human lung adenocarcinoma cell lines in equal proportions. As the library preparation requires mixing these cells, the origin of each sequenced cell is technically unknown. By exploiting the genetic differences between the three different cell lines we were able to establish the cell line of origin for each cell in the gold standard dataset. To this end we first called single nucleotide variants (SNVs) in publicly available bulk RNA-seq of the same cell lines (
                    <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE86337">GSE86337</ext-link>)
                    <sup>
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>. Drawing on these SNVs, we then applied 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/statgen/demuxlet">demuxlet</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup> (version 0.0.1), which harnesses the natural genetic variation between the cell lines to determine the most likely identity of each cell. We observed almost complete concordance between the demuxlet results and the clustering of cells seen in dimension reduction visualizations of the data (compare 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 1</xref>). Note that the gold standard dataset was only used during the first evaluation (R version 3.4.3).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Silver standard.</italic>
                    </bold> For the silver standard data, we compared clustering solutions to a cell labeling approach by 10x Genomics
                    <sup>
                        <xref ref-type="bibr" rid="ref-5">5</xref>
                    </sup> for PBMCs. This approach finds the cell type in a reference dataset whose expression most closely resembles the expression in the cell. The reference dataset contains 11 isolated cell types sequenced using the 10x Genomics system. While this labeling does not constitute truth, it has been found to perform well in comparison with marker-based classification
                    <sup>
                        <xref ref-type="bibr" rid="ref-5">5</xref>
                    </sup>. Furthermore, the proportions of cells assigned to the 11 cell types by the supervised labeling approach were consistent with the literature (see 
                    <xref ref-type="other" rid="ST1">Supplementary Table 3</xref>)
                    <sup>
                        <xref ref-type="bibr" rid="ref-27">27</xref>,
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>.</p>
                <p>Note that the first evaluation (R version 3.4.3) was performed with Datasets 1&#x2013;3. The second evaluation (R version 3.5.0) was performed on Datasets 2&#x2013;5, as these were available in the R package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/devel/data/experiment/html/TENxPBMCData.html">TENxPBMCData</ext-link>.</p>
            </sec>
            <sec>
                <title>Stability assessment</title>
                <p>To test the robustness of different clustering methods we pursued a sampling strategy in terms of cells. We also investigated the robustness of different methods with regard to the stringency of gene filtering. Finally, the impact of different aligners and preprocessing was assessed using all combinations of programs that could be run (some clustering methods did not run with scPipe output).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Cells.</italic>
                    </bold> In the first evaluation (R version 3.4.3) we used Dataset 3 to evaluate robustness with regard to cells. We randomly sampled 3,000 cells from Dataset 3 (out of the 4,292 available after filtering) five times, generating five (non-independent) datasets. For every pair of these datasets (10 pairs in total) and for each clustering method separately, we then used ARI_comp to assess how consistently cells contained in all five sampled datasets were assigned to the same cluster. In the second evaluation (R version 3.5.0) we used Dataset 5. Here, we randomly sampled 4,000 cells (out of the 8,381 available after filtering), again generating five (non-independent) datasets, and repeated the evaluation procedure described above. We also investigated the variability of ARI_truth for all methods in both evaluations.</p>
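                <p>The sampling scheme can be sketched as follows. This Python outline is an illustration under stated assumptions, not the code used in this study: the clustering method and the agreement score are passed in as functions, and the toy clustering, the toy score and the seed are hypothetical stand-ins. In practice the score would be ARI_comp and the clustering function would wrap one of the benchmarked methods.</p>
                <code language="python"><![CDATA[
```python
import random
from itertools import combinations


def pairwise_stability(cell_ids, cluster_fn, score_fn,
                       n_samples=5, sample_size=3000, seed=1):
    """Score the agreement of clusterings obtained on random subsamples.

    cluster_fn maps a list of cell ids to a dict {cell_id: cluster_label}.
    score_fn compares two label lists of equal length (e.g. an ARI function).
    """
    rng = random.Random(seed)
    cells = list(cell_ids)
    # Draw the (non-independent) subsamples without replacement.
    samples = [rng.sample(cells, sample_size) for _ in range(n_samples)]
    # Only cells contained in every subsample are compared.
    shared = sorted(set.intersection(*(set(s) for s in samples)))
    # Cluster each subsample separately.
    solutions = [cluster_fn(sample) for sample in samples]

    scores = []
    for a, b in combinations(range(n_samples), 2):
        labels_a = [solutions[a][cell] for cell in shared]
        labels_b = [solutions[b][cell] for cell in shared]
        scores.append(score_fn(labels_a, labels_b))
    return scores


def toy_cluster(sample):
    # Hypothetical deterministic "clustering" used only to run the sketch.
    return {cell: cell % 3 for cell in sample}


def toy_agreement(labels_a, labels_b):
    # Hypothetical score: fraction of cells with identical labels.
    return sum(x == y for x, y in zip(labels_a, labels_b)) / len(labels_a)


# 4,292 cells and subsamples of 3,000 mirror the first evaluation.
scores = pairwise_stability(range(4292), toy_cluster, toy_agreement)
```
]]></code>
                <p>Five subsamples yield 10 pairwise comparisons per clustering method, matching the number of combinations stated above.</p>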
                <p>
                    <bold>
                        <italic toggle="yes">Genes.</italic>
                    </bold> The impact of gene filtering was investigated only for methods available in R version 3.5.0, during the second evaluation. We analyzed Dataset 4, as it had the most detected genes, retaining the 10%, 20%, 30%, 40% and 50% most highly expressed genes (ranked by total counts). For each filtered version we investigated both ARI_truth and the ARI_comp relative to the clustering solution produced on the dataset without gene filtering.</p>
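                <p>A minimal Python sketch of such a filter is given below for illustration. It is not the code used in this study. Genes are ranked by their total counts across cells and the requested fraction is retained. The rounding rule, the tie-breaking by gene name and the toy counts are assumptions.</p>
                <code language="python"><![CDATA[
```python
def top_expressed_genes(counts_by_gene, fraction):
    """Return the names of the top `fraction` of genes by total counts.

    counts_by_gene maps a gene name to its per-cell counts.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    totals = {gene: sum(counts) for gene, counts in counts_by_gene.items()}
    # Highest total first; ties are broken by gene name (an assumption).
    ranked = sorted(totals, key=lambda gene: (-totals[gene], gene))
    # Keep at least one gene; the rounding rule is an assumption.
    n_keep = max(1, round(fraction * len(ranked)))
    return ranked[:n_keep]


# Toy data: 10 hypothetical genes in two cells; g0 has the highest total.
toy_counts = {f"g{i}": [10 - i, 2 * (10 - i)] for i in range(10)}
kept = {
    percent: top_expressed_genes(toy_counts, percent / 100)
    for percent in (10, 20, 30, 40, 50)
}
```
]]></code>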
                <p>
                    <bold>
                        <italic toggle="yes">Aligners and preprocessing pipelines.</italic>
                    </bold> In order to assess the effect of using different preprocessing pipelines on the data, we applied the Bioconductor package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/bioc/html/scPipe.html">scPipe</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> (version 1.0.6) to the raw data. Like Cell Ranger, scPipe can be used to align, de-duplicate, filter barcodes and quantify genes. Since scPipe is modular, we tried it with both the 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/alexdobin/STAR">STAR</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup> (version 020201) and 
                    <ext-link ext-link-type="uri" xlink:href="http://subread.sourceforge.net/">Subread</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup> (version 1.5.2) aligners. In order to ensure comparability we aligned reads to the same GRCh38 genome annotation and repeated quality control with scater. We investigated the similarity of clustering solutions applied to the differently preprocessed and aligned versions of the same dataset by ARI_comp. Note that this was only done for the evaluation with methods available in R version 3.4.3.</p>
            </sec>
            <sec>
                <title>Run time assessment</title>
                <p>Each execution of a method on a dataset was performed in a separate R session. Each task was allocated as many CPU cores of a 24-core Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz as specified by the default parameters, but fewer than 10 cores. A seed was set with 
                    <monospace>base::set.seed</monospace> for all steps involving stochasticity (i.e. dimension reduction and clustering). Timings for each method include any preprocessing steps.</p>
            </sec>
            <sec>
                <title>Influence assessment</title>
                <p>We also investigated what properties of each cell&#x2019;s data were driving the clustering solutions produced by the different methods as well as the inferred cell labels. Properties of a cell&#x2019;s data refer to features such as the number of total reads that included the cell&#x2019;s barcode, the total number of detected genes found for this cell, etc. To this end, we used linear mixed models where cell data properties were predicted using the indicators for cluster membership. We predicted cell data properties and not cluster membership for modeling ease. The adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> of these models was used to assess which properties influenced the clustering solutions. Properties investigated included: (i) the total number of detected genes, (ii) the total read count, and (iii) the percentages of reads aligning respectively to ribosomal proteins, mitochondrial genes and ribosomal RNA (only Datasets 1&#x2013;3).</p>
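                <p>The idea of relating a cell data property to cluster membership can be illustrated with a simplified fixed-effects analogue. The Python sketch below is not the linear mixed model used in this study. It fits an ordinary least-squares model with one indicator per cluster, which is equivalent to predicting each cell by its cluster mean, and reports the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup>. The function name and the toy values are hypothetical.</p>
                <code language="python"><![CDATA[
```python
from collections import defaultdict


def adjusted_r2_by_cluster(values, clusters):
    """Adjusted R^2 of a least-squares fit of `values` on cluster indicators."""
    if len(values) != len(clusters):
        raise ValueError("one cluster label is required per value")
    n = len(values)
    groups = defaultdict(list)
    for value, cluster in zip(values, clusters):
        groups[cluster].append(value)
    k = len(groups)
    if n <= k:
        raise ValueError("need more cells than clusters")

    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("the property is constant across cells")

    # With indicator predictors the fitted value is the cluster mean.
    ss_residual = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_residual += sum((v - mean) ** 2 for v in members)

    r2 = 1 - ss_residual / ss_total
    # Intercept plus (k - 1) indicators leaves n - k residual degrees of freedom.
    return 1 - (1 - r2) * (n - 1) / (n - k)
```
]]></code>
                <p>A value close to 1 indicates that the property is almost fully explained by cluster membership, whereas values near or below 0 indicate no association.</p>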
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <sec>
                <title>Evaluation of clustering tools</title>
                <p>
                    <bold>
                        <italic toggle="yes">Gold standard dataset.</italic>
                    </bold> For the gold standard dataset consisting of three cell types, half of the tested clustering methods overestimated the true number of different cell types in the data. Methods with cluster number estimations close to the correct number of different cell types included methods with prior information, such as 
                    <monospace>SIMLR</monospace>, 
                    <monospace>countClust</monospace> and 
                    <monospace>scran</monospace>, as well as 
                    <monospace>ascend</monospace>, 
                    <monospace>Cell Ranger</monospace>, 
                    <monospace>RaceID</monospace> and 
                    <monospace>CIDR</monospace> (
                    <xref ref-type="fig" rid="f1">Figure 1</xref>). The clustering solutions produced by these methods, with the exception of 
                    <monospace>countClust</monospace>, largely reflected cell types, as indicated by an ARI_truth &gt; 0.8. The remaining methods overestimated the number of clusters by 2 to 85, with 
                    <monospace>SC3</monospace> and 
                    <monospace>RaceID2</monospace> representing the extremes, both estimating more than 20 clusters (see t-SNE plots in 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 1</xref> for the impact). As a consequence of the greater number of estimated clusters, the ARI_truth of the other clustering methods was lower than 0.8. To see whether these methods split cell types into several clusters or instead assigned cell types randomly to clusters, we also investigated the homogeneity of the clustering solutions with respect to the known labeling. Apart from 
                    <monospace>countClust</monospace> and 
                    <monospace>RCA</monospace>, all methods had extremely high homogeneity, indicating that they split cell types into further subtypes rather than randomly creating additional clusters, which is reassuring.</p>
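                <p>The contrast between a low ARI_truth and a high homogeneity can be reproduced on a toy example. The sketch below assumes that ARI_truth is the standard adjusted Rand index computed against the reference labels and that homogeneity is the entropy-based score (one minus the conditional entropy of the labels given the clusters, divided by the entropy of the labels); both definitions, the function names and the toy labels are assumptions made for illustration, not code from this study.</p>
                <code language="python">
```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """Standard adjusted Rand index from the contingency table.

    Degenerate inputs (e.g. a single cluster in both partitions) are not
    handled in this sketch.
    """
    n = len(truth)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_truth = sum(comb(c, 2) for c in Counter(truth).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_truth * sum_pred / comb(n, 2)
    max_index = (sum_truth + sum_pred) / 2
    return (sum_cells - expected) / (max_index - expected)


def homogeneity(truth, pred):
    """1 - H(truth | pred) / H(truth); equals 1 when every cluster is pure."""
    n = len(truth)
    h_truth = -sum(c / n * log(c / n) for c in Counter(truth).values())
    if h_truth == 0:
        return 1.0
    pred_sizes = Counter(pred)
    h_cond = -sum(
        c / n * log(c / pred_sizes[p])
        for (t, p), c in Counter(zip(truth, pred)).items()
    )
    return 1.0 - h_cond / h_truth


# Hypothetical labels: three cell types with four cells each.
truth = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

# Over-splitting: every cell type is divided into two pure clusters.
over_split = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(adjusted_rand_index(truth, over_split))  # 8/19, about 0.42
print(homogeneity(truth, over_split))          # 1.0

# Mixing: three clusters that each contain cells of all three types.
mixed = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
print(adjusted_rand_index(truth, mixed))
print(homogeneity(truth, mixed))
```
                </code>
                <p>In the over-split case the adjusted Rand index drops well below 1 although every cluster contains a single cell type, which is the pattern described above for methods that subdivide cell types rather than mix them.</p>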
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Performance on the gold standard dataset.</title>
                        <p>(
                            <bold>a</bold>) ARI_truth of each method with regard to the truth versus the number of clusters. The dashed line indicates the true number of clusters. (
                            <bold>b</bold>) Homogeneity of clusters of each method, given the truth.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure1.gif"/>
                </fig>
                <p>
                    <bold>
                        <italic toggle="yes">Silver standard datasets.</italic>
                    </bold> We labeled the cells in each of the silver standard datasets as one of 11 different PBMC cell populations. When using the ARI_truth to compare the likeness of the clustering solutions and the labels, no method produced solutions that were uniformly the most similar to the inferred labels (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>) in either the first or second evaluation. In both evaluations, 
                    <monospace>ascend</monospace> tended to estimate a smaller number of clusters and consequently did not agree with the labeling. Only 
                    <monospace>Seurat</monospace>, 
                    <monospace>SC3</monospace> and 
                    <monospace>Cell Ranger</monospace> achieved an ARI_truth above 0.4 for at least two datasets in each of the evaluations. All methods considerably improved their ARI_truth when we subset to more confidently labeled cells (see 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 2</xref>). 
                    <monospace>RCA</monospace> and 
                    <monospace>SC3</monospace> were particularly affected, showing much greater similarity for more confidently labeled cells. We also calculated the homogeneity of each method in each dataset with respect to the inferred labeling (compare 
                    <xref ref-type="fig" rid="f3">Figure 3</xref>). Generally, most methods exhibited significantly lower performance on datasets generated with the older version of the 10x Genomics technology. Most methods had much lower homogeneity than for the gold standard data, indicating that most clusters represent mixtures of different inferred cell types. The exceptions are 
                    <monospace>SC3</monospace>&#x2019;s clustering solution of Dataset 3 in the first evaluation and 
                    <monospace>Seurat</monospace>&#x2019;s clustering solution on Datasets 3a and 5 in the second evaluation, which all achieved a homogeneity score above 0.7.</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>ARI_truth of each method in each dataset, as indicated by different shapes, with regard to the supervised cell labeling versus the number of clusters.</title>
                        <p>The dashed line indicates the number of cell populations estimated by the supervised cell labeling approach. (
                            <bold>a</bold>) First evaluation with methods available in R 3.4.3. (
                            <bold>b</bold>) Second evaluation with methods available in R 3.5.0.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure2.gif"/>
                </fig>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Homogeneity of clusters with regard to the inferred cell labeling for each method and each dataset.</title>
                        <p>Different datasets are indicated by transparency. (
                            <bold>a</bold>) First evaluation with methods available in R 3.4.3. (
                            <bold>b</bold>) Second evaluation with methods available in R 3.5.0.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure3.gif"/>
                </fig>
                <p>Interestingly, similar performance when compared to the labeling did not imply that cluster solutions were similar (compare 
                    <xref ref-type="fig" rid="f4">Figure 4</xref>). Furthermore, similar algorithms did not result in more similar solutions. This is probably due to the vast differences in filtering and data normalization between the methods.</p>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <p>Similarity of all combinations of clustering methods as estimated by ARI_comp (lower triangle) and NMI (upper triangle) averaged over all datasets in (
                            <bold>a</bold>) evaluation 1 (R version 3.4.3) and (
                            <bold>b</bold>) evaluation 2 (R version 3.5.0). The similarity is indicated by the color; yellow indicating no similarity and purple indicating complete overlap. The diagonals give the average number of clusters estimated by each respective method. Note that methods are ordered according to similarity.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure4.gif"/>
                </fig>
                <p>Most methods had comparable performance on Datasets 2/2a and 3/3a in the first and second evaluation. Consistent performance increases were only noted for 
                    <monospace>countClust</monospace> and 
                    <monospace>Seurat</monospace> (compare 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 3</xref>).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Stability.</italic>
                    </bold> We evaluated the stability of the clustering methods by examining three different features: (i) filtering of cells (
                    <xref ref-type="fig" rid="f5">Figure 5</xref>), (ii) filtering of genes (
                    <xref ref-type="fig" rid="f6">Figure 6</xref> and 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 4</xref>), and (iii) use of different aligners (
                    <xref ref-type="other" rid="SF1">Supplementary Figure 5</xref>). When assessing stability with regard to cell input in both evaluations 1 and 2, 
                    <monospace>RaceID</monospace> and 
                    <monospace>RaceID2</monospace> did not appear very robust. Due to its reliance on reference profiles, 
                    <monospace>RCA</monospace> was extremely robust, consistently achieving an ARI_comp above 0.9 in both evaluations. In contrast, changes to gene filtering seemed to result in method-specific effects, probably owing to individual filtering and normalization procedures. The performance of 
                    <monospace>Seurat</monospace> improved dramatically with the inclusion of more genes, whereas it deteriorated for 
                    <monospace>RaceID</monospace>. In contrast, both 
                    <monospace>Cell Ranger</monospace> and 
                    <monospace>SC3</monospace> exhibited stable performance when the percentage of highly expressed genes was varied.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <p>(
                            <bold>a</bold>) Tukey boxplots of ARI_comp results from the comparison of clustering solutions of the same method when cell input was varied in Dataset 5. (
                            <bold>b</bold>) Tukey boxplots of ARI_truth of clustering solutions of the same method when cell input was varied in Dataset 5. Results shown are for evaluation 2 (R version 3.5.0); for results of evaluation 1 (R version 3.4.3), see 
                            <xref ref-type="other" rid="SF1">Supplementary Figure 4</xref>.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure5.gif"/>
                </fig>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <p>(
                            <bold>a</bold>) ARI_comp of clustering solutions on Dataset 4 using 10%, 20%, 30%, 40% and 50% of the most highly expressed genes, relative to the clustering of Dataset 4 with all genes by the same method. (
                            <bold>b</bold>) ARI_truth of clustering solutions on Dataset 4 using 10%, 20%, 30%, 40% and 50% of the most expressed genes. Note that many methods could not cluster the data when few genes were available. In particular, 
                            <monospace>ascend</monospace> did not run.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure6.gif"/>
                </fig>
                <p>We also investigated how the stability of the clustering method was affected by the use of different aligners (
                    <xref ref-type="other" rid="SF1">Supplementary Figure 5</xref>) in evaluation 1 (R version 3.4.3). In particular, we used Cell Ranger and scPipe
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> with Subread
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup>, or STAR
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup>. We found that different aligners largely resulted in the same gene counts, with some notable exceptions for processed pseudogenes (see 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 6</xref>, 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 7</xref> and 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 8</xref>). Not all methods could be used in conjunction with scPipe. The exceptions were 
                    <monospace>ascend</monospace> and 
                    <monospace>SIMLR</monospace>, which failed to run, and 
                    <monospace>Cell Ranger</monospace>, which requires output from its own preprocessing pipeline. However, we were able to evaluate eight methods. Apart from 
                    <monospace>RaceID2</monospace> and 
                    <monospace>RCA</monospace>, all tested methods appeared robust.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Miscellaneous properties.</italic>
                    </bold> Running time varied substantially between different methods. 
                    <monospace>RaceID2</monospace> took prohibitively long and thus does not lend itself to interactive analysis when applied to 10x Genomics data (
                    <xref ref-type="fig" rid="f7">Figure 7</xref>). The fastest method was 
                    <monospace>RCA</monospace>, taking less than 25 seconds on average for the entire dataset analysis. Considerably faster running times in evaluation 2 (R version 3.5.0) than in evaluation 1 (R version 3.4.3) were observed for 
                    <monospace>Seurat</monospace> and 
                    <monospace>SC3</monospace> (compare 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 9</xref>). They were the second and third fastest methods in evaluation 2 respectively, despite offering more intermediate steps than most methods. Also note that methods differed in the quality of their documentation. For example, tools like 
                    <monospace>Cell Ranger</monospace> and 
                    <monospace>Seurat</monospace> offer detailed documentation, with many different use cases as well as tutorials (compare 
                    <xref ref-type="other" rid="ST1">Supplementary Table 1</xref>). Tools that are not available on Bioconductor, such as 
                    <monospace>RaceID2</monospace>, 
                    <monospace>ascend</monospace> and 
                    <monospace>RCA</monospace> have more limited documentation.</p>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>The bars indicate the average log10 run time (in seconds) of all 11 methods on Dataset 5 with 3,000 genes over 5 iterations.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure7.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Factors influencing clustering solutions</title>
                <p>The variation in the percentage of reads aligning to ribosomal protein genes strongly predicted all clustering solutions as well as the inferred cell labels (see 
                    <xref ref-type="fig" rid="f8">Figure 8</xref>, 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 10</xref>, 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 11</xref>). Expression of ribosomal protein genes has been successfully used to discriminate cell types belonging to different hematopoietic lineages
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup>. Hence, it may be the case that overall mRNA amount of ribosomal protein genes can also serve as a discriminator. Furthermore, differences in abundance of ribosomal protein genes are likely to drive variation in PBMC scRNA-seq datasets, as they typically account for a large proportion of reads (around 40% in all three datasets). In combination with ribosomal protein genes being less affected by dropout due to their relatively high expression, it is perhaps unsurprising that clustering solutions of all methods foremost reflect differences in the amount of ribosomal protein genes between cells.</p>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>Radial plots describing the average effect of 5 cell features on the clustering solutions of different methods across the three silver standard datasets in evaluation 1 (R version 3.4.3).</title>
                        <p>For every method and every feature the adjusted 
                            <italic toggle="yes">R</italic>
                            <sup>2</sup> of the linear model fitting the feature by the clustering solution is presented.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure8.gif"/>
                </fig>
                <p>Most methods&#x2019; solutions were driven much more strongly by the total number of detected genes and total number of counts than the inferred cell labels were. 
                    <monospace>TSCAN</monospace> was particularly affected (
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> = 0.52 in evaluation 1 and 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> = 0.68 in evaluation 2), but for 
                    <monospace>RaceID2</monospace> similar effects were observed. We speculate that this strong influence of the total number of detected genes and total number of counts on their clustering solutions points to a failure to appropriately normalize the data.</p>
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion</title>
            <p>We also summarized the performance of each method across all evaluations (see 
                <xref ref-type="fig" rid="f9">Figure 9</xref>). This summary suggests that 
                <monospace>Seurat</monospace> provides the best clustering solutions for 10x Genomics scRNA-seq data in terms of running time, robustness and accuracy. The next best performing methods were 
                <monospace>RCA</monospace>, 
                <monospace>SC3</monospace>, 
                <monospace>Cell Ranger</monospace> and 
                <monospace>CIDR</monospace>. However, it should be noted that 
                <monospace>RCA</monospace> performed particularly poorly on the gold standard dataset. This highlights that 
                <monospace>RCA</monospace>&#x2019;s performance hinges on the studied cell types being represented in the reference used during the supervised clustering approach. These results closely mirror the benchmarking results reported by Du&#x00f2; 
                <italic toggle="yes">et al.</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-33">33</xref>
                </sup> on independent silver standard and simulated datasets across multiple single cell technologies.</p>
            <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                <label>Figure 9. </label>
                <caption>
                    <title>Summary of the performance of each method across all evaluations.</title>
                    <p>Note that 1 refers to evaluation 1 (R version 3.4.3) and 2 refers to evaluation 2 (R version 3.5.0).</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/19170/0c3bdf83-c73e-4ed2-aa63-29ada26b1301_figure9.gif"/>
            </fig>
            <p>We also investigated whether properties of the clustering methods correlated with their performance. We found that neither the type of clustering method used nor the similarity metric used seemed to correlate with performance. However, our ability to identify patterns might have been limited by the small sample size. A recent paper by Kim 
                <italic toggle="yes">et al.</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-34">34</xref>
                </sup>, which systematically studied the effect of different similarity metrics on performance of scRNA-seq clustering methods, found that correlation-based similarity metrics outperformed distance-based metrics.</p>
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion</title>
            <p>Most biological conclusions obtained from droplet-based scRNA-seq data crucially rely on accurate clustering of cells into homogeneous groups. Indeed, one can argue that it is the very act of clustering that unlocks the technology&#x2019;s potential for discovery. Therefore, it is not surprising that according to several repositories, such as 
                <monospace>
                    <ext-link ext-link-type="uri" xlink:href="http://www.omicstools.org/">www.omicstools.org</ext-link>
                </monospace> and 
                <monospace>
                    <ext-link ext-link-type="uri" xlink:href="https://www.scrna-tools.org/">www.scRNA-tools.org</ext-link>
                </monospace>
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, many of the tools developed for scRNA-seq specifically focus on clustering. With so many choices, it is thus important to evaluate their performance specifically for droplet-based protocols, such as 10x Genomics.</p>
            <p>In this study, we presented our evaluation of a dozen clustering methods on scRNA-seq 10x Genomics data. The results of our investigations will be useful for method users, as we provide practical guidelines. Nonetheless, our evaluation has several limitations: 
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022; </label>
                        <p>Inclusion of methods limited to R packages and methods published before October 2017</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>Parameter selection limited to defaults</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No assessment of robustness to noise and parameter changes</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No assessment of ability to discover rare cell populations</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No evaluation of silver standard datasets from systems other than PBMCs</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No evaluation of ability to deal with batch effects or other more complex designs</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No evaluation of quality of code and documentation</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022; </label>
                        <p>No assessment of scalability of methods</p>
                    </list-item>
                </list>
</p>
            <p>While 
                <monospace>Seurat</monospace> performed slightly better than the next best methods, in our opinion, the choice of clustering method should be informed by the user&#x2019;s familiarity with statistical concepts and R programming. Many methods, including 
                <monospace>Seurat</monospace>, require the user to make informed parameter choices and occasionally troubleshoot code. Methods requiring no parameter choices, like 
                <monospace>Cell Ranger</monospace>, may offer a better choice for non-experts.</p>
            <p>In general, we recommend that practitioners and consumers of results generated from 10x Genomics scRNA-seq data alike remain vigilant about the outcome of their analysis, and acknowledge the variability and likelihood of undesired influences. The choice of clustering tool for scRNA-seq data generated by the 10x Genomics platform crucially determines interpretation. Hence, we suggest using several clustering methods ideally with multiple parameter choices on 10x Genomics scRNA-seq data in order to ensure that biological results are not artifacts of method or parameter choice. This should help guard against subjective interpretation of the data and thus increase robustness of and confidence in results.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>Repository: Gold Standard Dataset. Single cell profiling of 3 Human Lung Adenocarcinoma cell lines, GSE111108</p>
            <p>Repository: Silver Standard Dataset 1. Single cell profiling of peripheral blood mononuclear cells from healthy human donor, GSE115189</p>
            <p>Repository: Silver Standard Dataset 2. 3k PBMCs from a Healthy Donor, Version 1.0.0: 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.0.0/pbmc3k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.0.0/pbmc3k</ext-link>, Version 1.1.0: 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc3k</ext-link>
            </p>
            <p>Repository: Silver Standard Dataset 3. 4k PBMCs from a Healthy Donor, Version 1.2.0 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.2.0/pbmc4k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.2.0/pbmc4k</ext-link>, Version 2.1.0 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc4k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc4k</ext-link>
            </p>
            <p>Repository: Silver Standard Dataset 4. 6k PBMCs from a Healthy Donor, 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc6k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/pbmc6k</ext-link>
            </p>
            <p>Repository: Silver Standard Dataset 5. 8k PBMCs from a Healthy Donor, 
                <ext-link ext-link-type="uri" xlink:href="https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k">https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k</ext-link>
            </p>
            <p>We also provide versions of all datasets in the R SingleCellExperiment format at 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/bahlolab/cluster_benchmark_data">https://github.com/bahlolab/cluster_benchmark_data</ext-link>
            </p>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>
                <bold>All code is available for download at:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/SaskiaFreytag/cluster_benchmarking_code">https://github.com/SaskiaFreytag/cluster_benchmarking_code</ext-link>.</p>
            <p>
                <bold>Archived code at time of publication: </bold> 
                <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.2008645">10.5281/zenodo.2008645</ext-link>
            </p>
            <p>
                <bold>License:</bold> MIT License</p>
        </sec>
        <sec>
            <title>Consent</title>
            <p>Written informed consent for publication of the participant&#x2019;s transcriptomic information was obtained (Australian Red Cross Blood Service Supply Agreement 1803VIC-07).</p>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We gratefully acknowledge the constructive comments and experimental work of Azadeh Seidi, Mark Biondo and Nicolas J. Wilson. Additionally, we want to acknowledge Mark Robinson for his great advice.</p>
        </ack>
        <sec sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p id="SF1">
                <bold>Supplementary Figures 1&#x2013;11.</bold>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15809/ab8d23e4-f9e2-41e4-a057-43630939796b_Supp_Figures.pdf">Click here to access the data</ext-link>.</p>
            <p id="ST1">
                <bold>Supplementary Tables 1&#x2013;3.</bold>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15809/34ca038d-e372-4e49-835f-90acce20d669_Supp_Tables.pdf">Click here to access the data</ext-link>.</p>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tanay</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Regev</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Scaling single-cell genomics from phenomenology to mechanism.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2017</year>;<volume>541</volume>(<issue>7637</issue>):<fpage>331</fpage>&#x2013;<lpage>338</lpage>.
                    <pub-id pub-id-type="pmid">28102262</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature21350</pub-id>
                    <pub-id pub-id-type="pmcid">5438464</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zappia</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Phipson</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oshlack</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Comput Biol.</italic>
</source>
                    <year>2018</year>;<volume>14</volume>(<issue>6</issue>):<fpage>e1006245</fpage>.
                    <pub-id pub-id-type="pmid">29939984</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pcbi.1006245</pub-id>
                    <pub-id pub-id-type="pmcid">6034903</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ziegenhain</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vieth</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Parekh</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Comparative Analysis of Single-Cell RNA Sequencing Methods.</article-title>
                    <source>

                        <italic toggle="yes">Mol Cell.</italic>
</source>
                    <year>2017</year>;<volume>65</volume>(<issue>4</issue>):<fpage>631</fpage>&#x2013;<lpage>643.e4</lpage>.
                    <pub-id pub-id-type="pmid">28212749</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.molcel.2017.01.023</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Haque</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Engel</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Teichmann</surname>
                            <given-names>SA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications.</article-title>
                    <source>

                        <italic toggle="yes">Genome Med.</italic>
</source>
                    <year>2017</year>;<volume>9</volume>(<issue>1</issue>):<fpage>75</fpage>.
                    <pub-id pub-id-type="pmid">28821273</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13073-017-0467-4</pub-id>
                    <pub-id pub-id-type="pmcid">5561556</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zheng</surname>
                            <given-names>GX</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Terry</surname>
                            <given-names>JM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Belgrader</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Massively parallel digital transcriptional profiling of single cells.</article-title>
                    <source>

                        <italic toggle="yes">Nat Commun.</italic>
</source>
                    <year>2017</year>;<volume>8</volume>:<fpage>14049</fpage>.
                    <pub-id pub-id-type="pmid">28091601</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ncomms14049</pub-id>
                    <pub-id pub-id-type="pmcid">5241818</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tian</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dong</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Freytag</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>scRNA-seq mixology: towards better benchmarking of single cell RNA-seq protocols and analysis methods.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2018</year>;<fpage>433102</fpage>.
                    <pub-id pub-id-type="doi">10.1101/433102</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Senabouth</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lukowski</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Alquicira</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ascend: R package for analysis of single cell RNA-seq data.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2017</year>;<fpage>207704</fpage>.
                    <pub-id pub-id-type="doi">10.1101/207704</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Troup</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ho</surname>
                            <given-names>JW</given-names>
                        </name>
</person-group>:
                    <article-title>CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2017</year>;<volume>18</volume>(<issue>1</issue>):<fpage>59</fpage>.
                    <pub-id pub-id-type="pmid">28351406</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-017-1188-0</pub-id>
                    <pub-id pub-id-type="pmcid">5371246</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dey</surname>
                            <given-names>KK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hsiao</surname>
                            <given-names>CJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stephens</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Visualizing the structure of RNA-seq expression data using grade of membership models.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Genet.</italic>
</source>
                    <year>2017</year>;<volume>13</volume>(<issue>3</issue>):<fpage>e1006599</fpage>.
                    <pub-id pub-id-type="pmid">28333934</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pgen.1006599</pub-id>
                    <pub-id pub-id-type="pmcid">5363805</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gr&#x00fc;n</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lyubimova</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kester</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Single-cell messenger RNA sequencing reveals rare intestinal cell types.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2015</year>;<volume>525</volume>(<issue>7568</issue>):<fpage>251</fpage>&#x2013;<lpage>255</lpage>.
                    <pub-id pub-id-type="pmid">26287467</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature14966</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gr&#x00fc;n</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Muraro</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boisset</surname>
                            <given-names>JC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>
                        <italic toggle="yes">De Novo</italic> Prediction of Stem Cell Identity using Single-Cell Transcriptome Data.</article-title>
                    <source>

                        <italic toggle="yes">Cell Stem Cell.</italic>
</source>
                    <year>2016</year>;<volume>19</volume>(<issue>2</issue>):<fpage>266</fpage>&#x2013;<lpage>277</lpage>.
                    <pub-id pub-id-type="pmid">27345837</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.stem.2016.05.010</pub-id>
                    <pub-id pub-id-type="pmcid">4985539</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Courtois</surname>
                            <given-names>ET</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sengupta</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors.</article-title>
                    <source>

                        <italic toggle="yes">Nat Genet.</italic>
</source>
                    <year>2017</year>;<volume>49</volume>(<issue>5</issue>):<fpage>708</fpage>&#x2013;<lpage>718</lpage>.
                    <pub-id pub-id-type="pmid">28319088</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ng.3818</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kiselev</surname>
                            <given-names>VY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kirschner</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schaub</surname>
                            <given-names>MT</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>SC3: consensus clustering of single-cell RNA-seq data.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2017</year>;<volume>14</volume>(<issue>5</issue>):<fpage>483</fpage>&#x2013;<lpage>486</lpage>.
                    <pub-id pub-id-type="pmid">28346451</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.4236</pub-id>
                    <pub-id pub-id-type="pmcid">5410170</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lun</surname>
                            <given-names>AT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bach</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Marioni</surname>
                            <given-names>JC</given-names>
                        </name>
</person-group>:
                    <article-title>Pooling across cells to normalize single-cell RNA sequencing data with many zero counts.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2016</year>;<volume>17</volume>(<issue>1</issue>):<fpage>75</fpage>.
                    <pub-id pub-id-type="pmid">27122128</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-016-0947-7</pub-id>
                    <pub-id pub-id-type="pmcid">4848819</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Butler</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hoffman</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smibert</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Integrating single-cell transcriptomic data across different conditions, technologies, and species.</article-title>
                    <source>

                        <italic toggle="yes">Nat Biotechnol.</italic>
</source>
                    <year>2018</year>;<volume>36</volume>(<issue>5</issue>):<fpage>411</fpage>&#x2013;<lpage>420</lpage>.
                    <pub-id pub-id-type="pmid">29608179</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.4096</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ramazzotti</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>De Sano</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning.</article-title>
                    <source>

                        <italic toggle="yes">Proteomics.</italic>
</source>
                    <year>2018</year>;<volume>18</volume>(<issue>2</issue>):<fpage>1700232</fpage>.
                    <pub-id pub-id-type="pmid">29265724</pub-id>
                    <pub-id pub-id-type="doi">10.1002/pmic.201700232</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ji</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ji</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <article-title>TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2016</year>;<volume>44</volume>(<issue>13</issue>):<fpage>e117</fpage>.
                    <pub-id pub-id-type="pmid">27179027</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkw430</pub-id>
                    <pub-id pub-id-type="pmcid">4994863</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>DJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Campbell</surname>
                            <given-names>KR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lun</surname>
                            <given-names>AT</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2017</year>;<volume>33</volume>(<issue>8</issue>):<fpage>1179</fpage>&#x2013;<lpage>1186</lpage>.
                    <pub-id pub-id-type="pmid">28088763</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btw777</pub-id>
                    <pub-id pub-id-type="pmcid">5408845</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yip</surname>
                            <given-names>SH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kocher</surname>
                            <given-names>JA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Linnorm: improved statistical analysis for single cell RNA-seq expression data.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2017</year>;<volume>45</volume>(<issue>22</issue>):<fpage>e179</fpage>.
                    <pub-id pub-id-type="pmid">28981748</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkx828</pub-id>
                    <pub-id pub-id-type="pmcid">5727406</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Trapnell</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cacchiarelli</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Grimsby</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells.</article-title>
                    <source>

                        <italic toggle="yes">Nat Biotechnol.</italic>
</source>
                    <year>2014</year>;<volume>32</volume>(<issue>4</issue>):<fpage>381</fpage>&#x2013;<lpage>386</lpage>.
                    <pub-id pub-id-type="pmid">24658644</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.2859</pub-id>
                    <pub-id pub-id-type="pmcid">4122333</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>de Graaf</surname>
                            <given-names>CA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Choi</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Baldwin</surname>
                            <given-names>TM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Haemopedia: An Expression Atlas of Murine Hematopoietic Cells.</article-title>
                    <source>

                        <italic toggle="yes">Stem Cell Reports.</italic>
</source>
                    <year>2016</year>;<volume>7</volume>(<issue>3</issue>):<fpage>571</fpage>&#x2013;<lpage>582</lpage>.
                    <pub-id pub-id-type="pmid">27499199</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.stemcr.2016.07.007</pub-id>
                    <pub-id pub-id-type="pmcid">5031953</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hubert</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Arabie</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>Comparing partitions.</article-title>
                    <source>

                        <italic toggle="yes">J Classif.</italic>
</source>
                    <year>1985</year>;<volume>2</volume>(<issue>1</issue>):<fpage>193</fpage>&#x2013;<lpage>218</lpage>.
                    <pub-id pub-id-type="doi">10.1007/BF01908075</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Studholme</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hill</surname>
                            <given-names>DLG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hawkes</surname>
                            <given-names>DJ</given-names>
                        </name>
</person-group>:
                    <article-title>An overlap invariant entropy measure of 3D medical image alignment.</article-title>
                    <source>

                        <italic toggle="yes">Pattern Recogn.</italic>
</source>
                    <year>1999</year>;<volume>32</volume>(<issue>1</issue>):<fpage>71</fpage>&#x2013;<lpage>86</lpage>.
                    <pub-id pub-id-type="doi">10.1016/S0031-3203(98)00091-0</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rosenberg</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hirschberg</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>V-measure: A conditional entropy-based external cluster evaluation measure.</article-title> In:
                    <italic toggle="yes">Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).</italic> <year>2007</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.aclweb.org/anthology/D07-1043">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Holik</surname>
                            <given-names>AZ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Law</surname>
                            <given-names>CW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>RNA-seq mixology: designing realistic control experiments to compare protocols and analysis methods.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2017</year>;<volume>45</volume>(<issue>5</issue>):<fpage>e30</fpage>.
                    <pub-id pub-id-type="pmid">27899618</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkw1063</pub-id>
                    <pub-id pub-id-type="pmcid">5389713</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kang</surname>
                            <given-names>HM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Subramaniam</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Targ</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Multiplexed droplet single-cell RNA-sequencing using natural genetic variation.</article-title>
                    <source>

                        <italic toggle="yes">Nat Biotechnol.</italic>
</source>
                    <year>2018</year>;<volume>36</volume>(<issue>1</issue>):<fpage>89</fpage>&#x2013;<lpage>94</lpage>.
                    <pub-id pub-id-type="pmid">29227470</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.4042</pub-id>
                    <pub-id pub-id-type="pmcid">5784859</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sasaki</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Darmochwal-Kolarz</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Suzuki</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Proportion of peripheral blood and decidual CD4
                        <sup>+</sup> CD25
                        <sup>bright</sup> regulatory T cells in pre-eclampsia.</article-title>
                    <source>

                        <italic toggle="yes">Clin Exp Immunol.</italic>
</source>
                    <year>2007</year>;<volume>149</volume>(<issue>1</issue>):<fpage>139</fpage>&#x2013;<lpage>145</lpage>.
                    <pub-id pub-id-type="pmid">17459078</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1365-2249.2007.03397.x</pub-id>
                    <pub-id pub-id-type="pmcid">1942015</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jing</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gravenstein</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chaganty</surname>
                            <given-names>NR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Aging is associated with a rapid decline in frequency, alterations in subset composition, and enhanced Th2 response in CD1d-restricted NKT cells from human peripheral blood.</article-title>
                    <source>

                        <italic toggle="yes">Exp Gerontol.</italic>
</source>
                    <year>2007</year>;<volume>42</volume>(<issue>8</issue>):<fpage>719</fpage>&#x2013;<lpage>732</lpage>.
                    <pub-id pub-id-type="pmid">17368996</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.exger.2007.01.009</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tian</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Su</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dong</surname>
                            <given-names>X</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>scPipe: a flexible R/Bioconductor preprocessing pipeline for single-cell RNA-sequencing data.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Comput Biol.</italic>
</source>
                    <year>2018</year>;<volume>14</volume>(<issue>8</issue>):<fpage>e1006361</fpage>.
                    <pub-id pub-id-type="doi">10.1371/journal.pcbi.1006361</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dobin</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Davis</surname>
                            <given-names>CA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schlesinger</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>STAR: ultrafast universal RNA-seq aligner.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2013</year>;<volume>29</volume>(<issue>1</issue>):<fpage>15</fpage>&#x2013;<lpage>21</lpage>.
                    <pub-id pub-id-type="pmid">23104886</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts635</pub-id>
                    <pub-id pub-id-type="pmcid">3530905</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liao</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smyth</surname>
                            <given-names>GK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shi</surname>
                            <given-names>W</given-names>
                        </name>
</person-group>:
                    <article-title>The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2013</year>;<volume>41</volume>(<issue>10</issue>):<fpage>e108</fpage>.
                    <pub-id pub-id-type="pmid">23558742</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkt214</pub-id>
                    <pub-id pub-id-type="pmcid">3664803</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Guimaraes</surname>
                            <given-names>JC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zavolan</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Patterns of ribosomal protein expression specify normal and malignant human cells.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2016</year>;<volume>17</volume>(<issue>1</issue>):<fpage>236</fpage>.
                    <pub-id pub-id-type="pmid">27884178</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-016-1104-z</pub-id>
                    <pub-id pub-id-type="pmcid">5123215</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Du&#x00f2;</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
</person-group>:
                    <article-title>A systematic performance evaluation of clustering methods for single-cell RNA-seq data [version 1; referees: 2 approved with reservations].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2018</year>;<volume>7</volume>:<fpage>1141</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.15666.1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>IR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Impact of similarity metrics on single-cell RNA-seq data clustering.</article-title>
                    <source>

                        <italic toggle="yes">Brief Bioinform.</italic>
</source>
                    <year>2018</year>.
                    <pub-id pub-id-type="pmid">30137247</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bib/bby076</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report42132">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.19170.r42132</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Ghazanfar</surname>
                        <given-names>Shila</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42132a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7861-6997</uri>
                </contrib>
                <aff id="r42132a1">
                    <label>1</label>School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>11</day>
                <month>1</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Ghazanfar S</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport42132" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15809.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have provided excellent responses to reviewer comments, leading to a more comprehensive and useful manuscript.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Statistics, statistical bioinformatics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report42131">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.19170.r42131</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Hicks</surname>
                        <given-names>Stephanie C</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42131a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7858-0231</uri>
                </contrib>
                <aff id="r42131a1">
                    <label>1</label>Johns Hopkins Bloomberg School of Public Health&#x00a0;(JHSPH), Baltimore, MD, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>8</day>
                <month>1</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Hicks SC</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport42131" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15809.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you to the authors for their thoughtful responses. I appreciated the detail in version 2 of the manuscript.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Statistics, genomics, analysis of single-cell data</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report37228">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17256.r37228</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Hicks</surname>
                        <given-names>Stephanie C</given-names>
                    </name>
                    <xref ref-type="aff" rid="r37228a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7858-0231</uri>
                </contrib>
                <aff id="r37228a1">
                    <label>1</label>Johns Hopkins Bloomberg School of Public Health&#x00a0;(JHSPH), Baltimore, MD, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>31</day>
                <month>8</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Hicks SC</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport37228" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15809.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Freytag et al. have produced a nice research article on assessing methods for clustering scRNA-seq data from the 10x&#x00a0;Genomics platform. I was excited to read the article to learn about what they recommend using. I have made some suggestions below for improvements that are mostly related to providing more intuition and higher-level summaries. This is mostly because as a user of these methods, at the end of the paper, I still felt a little confused about which method the authors would recommend using. I hope the authors can update the article with some of the suggestions:</p>
            <p> </p>
            <p> While the authors have provided detailed comparisons (running time, cluster stability, use of different aligners, different genes, etc.), my biggest suggestion is that the authors provide a higher-level summary of what they would suggest a user use to cluster his/her data. At the end of reading this paper, I felt a little overwhelmed by the number of comparisons across various datasets. It's hard to look at Figs 1-7 and get an overall summary of which method to use. The authors do state in the abstract "We found that some methods, including Seurat and Cell Ranger, outperform other methods, although performance seems to be dependent on the complexity of the studied system", but it would be great if the authors could somehow provide a visual high-level summary of how they came to that conclusion, or elaborate in the discussion on that.&#x00a0;</p>
            <p> </p>
            <p> For the "gold standard" data, what was the percentage of each of the human lung cell lines (HCC827, H1975, H2228) that were mixed together? Equal proportions? Did you need to use demuxlet because the cell lines&#x00a0;were mixed up for sequencing? It would be great if the authors could elaborate on the experimental design.&#x00a0;</p>
            <p> </p>
            <p> Is the "gold standard" data available with the SNVs called for each cell? It would be useful to have this count matrix and corresponding phenotypic information about each cell in a SingleCellExperiment object for others to have access to.&#x00a0;</p>
            <p> </p>
            <p> It would be great if the authors could include another example dataset with a batch effect in it or something with a slightly less clean design, given most datasets are not quite this "clean". Also, maybe different clustering methods would perform better / worse depending on whether the data contained rare vs common cell types or included more or less diversity.&#x00a0;</p>
            <p> </p>
            <p> There is a TENxPBMCData package (https://github.com/kasperdanielhansen/TENxPBMCData) that has been submitted to Bioconductor (similar to the TENxBrainData). This includes all PBMC 10X datasets currently listed on their site and&#x00a0;loads a SingleCellExperiment object into R.&#x00a0;For the Silver Standard Datasets, you might incorporate this into your workflow.&#x00a0;</p>
            <p> </p>
            <p> How did you (or Cell Ranger)&#x00a0;deal with empty droplets or swapped barcodes on the 10x platform? This seems relevant for discovering cell types using some form of clustering.</p>
            <p> </p>
            <p> Supplemental Table 1 could use a caption and a label at the top saying "Supplemental Table 1". I had many tabs open with different supplemental figures and tables, and was getting confused about which was which.&#x00a0;</p>
            <p> </p>
            <p> Why did Linnorm and Monocle "continually failed to run"? Did the authors contact the original authors of Linnorm and Monocle to determine if there was a problem with the actual software or if it was a problem with the implementation of the software? It would be great if the authors could elaborate.&#x00a0;</p>
            <p> </p>
            <p> I agree with this statement: "&#x00a0;We concede that it is possible that more care in the upstream data handling and selection of parameters could result in different results." This is true for almost all benchmarking papers. Given the authors are working within the R/Bioconductor framework, it would be great if the authors could use something like SummarizedBenchmark (http://bioconductor.org/packages/release/bioc/vignettes/SummarizedBenchmark/inst/doc/SummarizedBenchmark.html) to keep track of these parameters.&#x00a0;</p>
            <p> </p>
            <p> Could the authors elaborate on how they decided which performance metrics to use?&#x00a0;</p>
            <p> </p>
            <p> What does this mean: "The impact of different aligners and preprocessing was assessed using all appropriate combinations of programs"? Could the authors be more specific?</p>
            <p> </p>
            <p> I'm a little concerned about how much the solutions differ between methods and parameter choices. I understand the point of this paper is to make comparisons between already published methods, but as the authors are now very familiar with these methods, it would be great if they could provide some more practical guidance. What would the authors suggest using?&#x00a0;</p>
            <p> </p>
            <p> Fig 1 -- Could the authors hypothesize on why Seurat, TSCAN, RCA, SC3, RaceID, RaceID2 are estimating so many clusters? Also, why does countClust tend to underestimate the number of clusters? It would be great if the authors could provide some intuition.&#x00a0;</p>
            <p> </p>
            <p> Fig 3 -- If I'm understanding, ascend and countClust produce clusters that are very different than the rest?&#x00a0;</p>
            <p> </p>
            <p> Thank you to the authors for making their code publicly available!</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Statistics, genomics, analysis of single-cell data</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment4300-37228">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Freytag</surname>
                            <given-names>Saskia</given-names>
                        </name>
                        <aff>Walter and Eliza Hall Institute of Medical Research, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>11</day>
                    <month>12</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>We would like to thank the reviewer for reviewing our manuscript and for their constructive comments. Below are point-by-point responses to the individual comments.</italic>
                </p>
                <p> </p>
                <p> While the authors have provided detailed comparisons (running time, cluster stability, use of different aligners, different genes, etc.), my biggest suggestion is that the authors provide a higher-level summary of what they would suggest a user use to cluster his/her data. At the end of reading this paper, I felt a little overwhelmed by the number of comparisons across various datasets. It's hard to look at Figs 1-7 and get an overall summary of which method to use. The authors do state in the abstract "We found that some methods, including Seurat and Cell Ranger, outperform other methods, although performance seems to be dependent on the complexity of the studied system", but it would be great if the authors could somehow provide a visual high-level summary of how they came to that conclusion, or elaborate in the discussion on that.&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>We have added a discussion section in which we summarize the results across all evaluations. This discussion section includes a visual high-level summary (Figure 9).</italic>
                </p>
                <p> </p>
                <p> For the "gold standard" data, what was the percentage of each of the human lung cell lines (HCC827, H1975, H2228) that were mixed together? Equal proportions? Did you need to use demuxlet because the cell lines&#x00a0;were mixed up for sequencing? It would be great if the authors could elaborate on the experimental design.&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>We mixed the cell lines in equal proportions. Because the cells were processed together using 10x Genomics technology, the cell lines were mixed up in the process but could be deconvoluted using demuxlet (ref?). We have elaborated on this further in the manuscript to clarify the experimental design.</italic>
                </p>
                <p> </p>
                <p> Is the "gold standard" data available with the SNVs called for each cell. It would be useful to have this count matrix and corresponding phenotypic information about each cell in a SingleCellExperiment object for others to have access to.&#x00a0;</p>
                <p> 
                    <italic>We have made all datasets available as SingleCellExperiment objects, including their phenotypic information, on GitHub at https://github.com/bahlolab/cluster_benchmark_data. We have added information regarding the availability of all processed datasets to the manuscript.</italic>
                </p>
                <p> </p>
                <p> It would be great if the authors could include another example dataset with a batch effect in it or something with a slightly less clean design, given most datasets are not quite this "clean". Also, maybe different clustering methods would perform better / worse depending on they data contained rare vs common cell types or included more or less diversity.&#x00a0;</p>
                <p> 
                    <italic>We agree that investigating the performance of clustering approaches on &#x201c;messy&#x201d; scRNA-seq designs would be very interesting. However, this is beyond the scope of this paper, as it requires the application of sophisticated batch correction methods. Such methods should generally be performed by experts rather than beginners, who were the target audience of this paper. We have added a discussion to this effect.</italic>
                </p>
                <p> 
                    <italic>Finally, in order to investigate whether methods perform better or worse in more or less diverse situations, one requires either simulations or mixture experiments. These were beyond the scope of this paper. However, we now refer the readers of our manuscript to the recent benchmarking study of scRNA-seq clustering methods by Du&#x00f3; et al, which investigates just such a scenario. For most methods they did not observe overt differences.</italic>
                </p>
                <p> </p>
                <p> There is a TENxPBMCsData package (https://github.com/kasperdanielhansen/TENxPBMCData) that has been submitted to Bioconductor (similar to the TENxBrainData). This includes all PBMC 10X datasets currently listed on their site and&#x00a0;loads in a SingleCellExperiment object into R.&#x00a0;For the Silver Standard Datasets, you might incorporate this into your workflow.&#x00a0;</p>
                <p> 
                    <italic>We decided to incorporate all moderately large fresh PBMC samples included in the TENxPBMCs into our workflow. This also provided us with an opportunity to update the package versions for the individual clustering tools for our silver standard benchmarking and stability analyses. </italic>
                </p>
                <p> </p>
                <p> How did you (or Cell Ranger)&#x00a0;deal with empty droplets or swapped barcodes on the 10x platform? This seems relevant for discovering cell types using some form of clustering.</p>
                <p> </p>
                <p> 
                    <italic>We added the following explanation: &#x201c;Cell Ranger filters any barcode that contains less than 10% of the 99th percentile of total UMI counts per barcode, as these are considered to be barcodes associated with empty droplets. The barcode by design can take one of 737,000 different sequences that comprise a whitelist. This feature allows the performance of error correction when the observed barcode does not match any barcode on the whitelist due to sequencing error.&#x201d;</italic>
                </p>
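                <p>The thresholding rule quoted above can be illustrated with a minimal Python sketch. This is not Cell Ranger's implementation: it assumes that the 99th percentile is taken over all supplied barcodes using the nearest-rank method, and the function names and toy counts are hypothetical.</p>
                <preformat><![CDATA[
```python
import math


def percentile_nearest_rank(values, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_barcodes(umi_totals, q=99, fraction=0.10):
    """Keep barcodes whose total UMI count is at least `fraction` of the q-th percentile.

    Assumption: the percentile is computed over every barcode passed in.
    """
    cutoff = fraction * percentile_nearest_rank(list(umi_totals.values()), q)
    return {bc: n for bc, n in umi_totals.items() if n >= cutoff}


# Hypothetical toy input: barcode -> total UMI count.
toy = {"AAAC": 5000, "AAAG": 4200, "AACT": 300, "AAGT": 12, "ACGT": 3}
kept = filter_barcodes(toy)
```
]]></preformat>
                <p>In this toy input the 99th percentile is 5000 UMIs, so the cutoff is 500 and only the two barcodes with the highest counts are retained; the remaining three would be treated as empty droplets.</p>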
                <p> </p>
                <p> Supplemental Table 1 could use a caption and a label at the top saying "Supplemental Table 1". I had many tabs open with different supplemental figures and tables, and was getting confused about which was which one.&#x00a0;</p>
                <p> 
                    <italic>We added a caption on the top of all Supplemental Tables.</italic>
                </p>
                <p> </p>
                <p> Why did Linnorm and Monocle "continually failed to run"? Did the authors contact the original authors of Linnorm and Monocle to determine if there was a problem with the actual software or if it was a problem with the implementation of the software? It would be great if the authors could elaborate.&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>Linnorm failed because its calculations would time out. Monocle failed because the dispersion could not be calculated. However, neither program was tried using the newer package versions corresponding to R version 3.5.0, nor were any of the packages&#x2019; authors contacted. We have included a statement to this effect in the manuscript.</italic>
                </p>
                <p> </p>
                <p> I agree with this statement: "&#x00a0;We concede that it is possible that more care in the upstream data handling and selection of parameters could result in different results." This is true for almost all benchmarking papers. Given the authors are working within the R/Bioconductor framework, it would be great if the authors could use something like SummarizedBenchmark (http://bioconductor.org/packages/release/bioc/vignettes/SummarizedBenchmark/inst/doc/SummarizedBenchmark.html) to keep track of these parameters.&#x00a0;</p>
                <p> 
                    <italic>We did take a look at the SummarizedBenchmark package, but did not find it suitable for our needs. However, we understand the need to provide all parameters (including defaults) used in the individual analyses and thus have added additional files providing this information to the GitHub repository.</italic>
                </p>
                <p> </p>
                <p> Could the authors elaborate on how they decided which performance metrics to use?&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>We used performance metrics commonly used in the clustering literature. We also made sure that the selected metrics were applicable in the absence of known cluster labels. Furthermore, they share the advantages of bounded ranges and no assumptions regarding cluster structures. Additionally they offer complementary insights. We have added this explanation to the manuscript.</italic>
                </p>
                <p> </p>
                <p> What does this mean: "The impact of different aligners and preprocessing was assessed using all appropriate combinations of programs"? Could the authors be more specific?</p>
                <p> </p>
                <p> 
                    <italic>We meant to say that we assessed the impact of combinations of different aligners and preprocessing (i.e. CellRanger or scPipe) for all possible clustering methods. Some clustering methods, like ascend, failed to run on scPipe-generated output, and it was too challenging to run the CellRanger clustering approach on scPipe-generated output.</italic>
                </p>
                <p> </p>
                <p> I'm a little concerned about how much the solutions differ between methods and parameter choices. I understand the point of this paper is to make comparisons between already published methods, but as the authors are now very familiar with these methods, it would be great if they could provide some more practical guidance. What would the authors suggest using?&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>We suggest using several clustering methods, ideally with multiple parameter choices, to ensure that biological results are not artifacts of method or parameter choice. Unfortunately, we do not feel in a position to give practical advice on the use of individual methods, as optimal parameter choices depend on many factors, including the type of biological system studied.</italic>
                </p>
                <p> </p>
                <p> Fig 1 -- Could the authors hypothesize on why Seurat, TSCAN, RCA, SC3, RaceID, RaceID2 are estimating so many clusters? Also, why does countClust tend to underestimate the number of clusters? It would be great if the authors could provide some intuition.&#x00a0;</p>
                <p> </p>
                <p> 
                    <italic>We believe that many methods tended to overestimate the number of clusters in the gold standard dataset, because the cell lines may be heterogeneous with regards to other biological factors, such as cell state. Consequently, in such a scenario methods may split cells of the same population but in different cell states into multiple clusters.</italic>
                </p>
                <p> 
                    <italic>We have no intuition as to why countClust underestimates the number of clusters.</italic>
                </p>
                <p> </p>
                <p> Fig 3 -- If I'm understanding, ascend and countClust produce clusters that are very different than the rest?&#x00a0;</p>
                <p> 
                    <italic>Yes, that is correct.</italic>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report37231">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17256.r37231</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Ghazanfar</surname>
                        <given-names>Shila</given-names>
                    </name>
                    <xref ref-type="aff" rid="r37231a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7861-6997</uri>
                </contrib>
                <aff id="r37231a1">
                    <label>1</label>School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>8</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Ghazanfar S</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport37231" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15809.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Freytag and colleagues provide a comprehensive comparison of clustering methods - specifically designed for scRNA-Seq data - on data collected using the popular droplet-based 10x Genomics platform. A total of four datasets, comprising a Gold standard mixture of cell lines as well as three Silver standard PBMC datasets, were compared in terms of accuracy, stability as well as other metrics like runtime and ease of use. Freytag et al also perform an analysis to try to determine the factors influencing the resulting clusterings for the Silver standard datasets.</p>
            <p> </p>
            <p> It is a very challenging task to perform a comprehensive characterization and comparison of clustering methods on such types of high-dimensional data, due to the sheer number of choices that need to be made, the difficulty in establishing ideal performance, and the relative lack of ground truth. Freytag et al do a great job of addressing these challenges and working towards providing an overall recommendation of clustering methods for non-expert practitioners, while stressing the need for careful interpretation of such results.&#x00a0;</p>
            <p> </p>
            <p> With this in mind, I have some comments/suggestions, as well as a number of minor comments/suggestions, as follows:</p>
            <p> </p>
            <p> <bold>Comments to authors</bold></p>
            <p> </p>
            <p> Linnorm and Monocle failed - expand on why? I understand that this is indeed a limitation especially for a non-expert practitioner, but it would be good to have an understanding towards what the issue might have been.</p>
            <p> </p>
            <p> Could use a flowchart to summarise the study and various comparisons, as well which methods could no longer be compared (e.g. methods that could not work within the scPipe framework).</p>
            <p> </p>
            <p> Different upstream data handling was performed for each clustering method. How much of a difference was observed just due to this preprocessing, as opposed to the actual clustering step? I understand that each method provides their own preprocessing as *part* of the method, but at least some of these methods would have been developed with plate-based and/or non-UMI-based scRNA-Seq in mind, so may not be intended for the context of 10x Genomics data. Again I understand that you're comparing methods 'out of the box' but it would be insightful to see what differences there are. I suggest a figure like an upsetR plot for the genes/cells filtered and a correlation heatmap of the expression values themselves.</p>
            <p> </p>
            <p> Could you summarise the distance metrics used in the clustering and if there is a general flavour to the clustering algorithm? e.g. hierarchical, k-means, density-based etc. How do these relate in terms of overall accuracy, stability and other metrics?</p>
            <p> </p>
            <p> Stability assessment - mentions that half of the 58,302 genes were randomly selected, but Table 1 says 24,654 total genes detected. There's a big discrepancy between these two so please clarify; if half of the 58,302 genes were selected then a large proportion of genes would have identically zero rows. Also Table 1 shows Dataset 3 had the highest number of 'total genes detected', so how was Dataset 1 the one with "most number of non-zero genes after filtering"?</p>
            <p> </p>
            <p> Run time section - What do you mean by 'overridden'? And for which aspects of the analysis steps was this done?</p>
            <p> </p>
            <p> Figure 4 - These boxplots show ARI among multiple clustering solutions, so a method that gives a consistently bad result is still high (e.g. in this case the RCA method). Suggest an analogous set of boxplots but with ARI_truth, is there a similar variability observed, as seen in these boxplots?</p>
            <p> </p>
            <p> Gene-wise stability analysis - I'm actually unsure how realistic this particular comparison is. It would be insightful to assess clusterings depending on different levels of gene filtering stringency (in the initial Cell Ranger read processing), or stringency on selection of features based on various criteria like highly variable genes.</p>
            <p> </p>
            <p> Figure 7 - Please clarify how 'total number of features' is a cell-specific quantity. Do you mean total number of non-zero features? Was this analysis also performed on the Gold Dataset and what overall similarities could be observed?</p>
            <p> </p>
            <p> Factors influencing clustering solutions - It would be interesting to consider the factors associated with 'correct' cluster assignment for cells. Optionally suggest to perform this for either the Gold Dataset or the Silver datasets and perform a logistic regression with the response being success/failure of a cell to belong to the cluster most associated with the 'true' cell type group. There is an added subtlety as far as matching clusters with cell type groups goes, but I think there are a few reasonable ways to perform this (e.g. assign candidate clusters to the 'true' groups by taking the higher proportion of cell overlap, and allow multiple candidate clusters to match to a single true group). Performing this kind of analysis could shed light on properties of cells that don't tend to cluster correctly, and if there is consistency in this across multiple disparate datasets.</p>
            <p> </p>
            <p> <bold>Minor comments</bold></p>
            <p> </p>
            <p> Table 1 - countClust 'version' formatted with verbatim.</p>
            <p> </p>
            <p> Table 1 - I would suggest the 'properties' column could be better presented in a checklist format, with ticks/crosses for fulfilling various criteria listed.</p>
            <p> </p>
            <p> Section beginning "silver standard" - 10x is capitalised.</p>
            <p> </p>
            <p> Supplementary Figure 1 - legend fallen off panel a), needs a higher resolution or larger points</p>
            <p> </p>
            <p> NMI definition - trailing parenthesis in denominator</p>
            <p> </p>
            <p> typo - assess the effect**</p>
            <p> </p>
            <p> Figure 2a - I found this quite busy, hard to interpret. Suggest to add shading that covers the points for same method or to facet by dataset. I don't believe the ARI values are particularly comparable between datasets so I would prefer facetting by dataset.</p>
            <p> </p>
            <p> Figure 3 - rows/columns are ordered differently between panels, what's driving this difference?</p>
            <p> </p>
            <p> Supplementary Figure 3 was not mentioned in the main text</p>
            <p> </p>
            <p> Supplementary Figure 4 is a two page pdf, with the first page blank</p>
            <p> </p>
            <p> Figure 6 - Figure caption says Dataset 1 but reports 29,151 genes. Do you mean the Gold Dataset and 29,451 genes? If not, please clarify which data and how many genes.</p>
            <p> </p>
            <p> Discussion - One instance of "Seurat" is missing verbatim format</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Statistics, statistical bioinformatics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment4301-37231">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Freytag</surname>
                            <given-names>Saskia</given-names>
                        </name>
                        <aff>Walter and Eliza Hall Institute of Medical Research, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>11</day>
                    <month>12</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>We would like to thank the reviewer for reviewing our manuscript and for their constructive comments. Below are point-by-point responses to the individual comments.</italic>
                </p>
                <p> </p>
                <p> Linnorm and Monocle failed - expand on why? I understand that this is indeed a limitation especially for a non-expert practitioner, but it would be good to have an understanding towards what the issue might have been.</p>
                <p> 
                    <italic>Linnorm failed because its calculations would time out. Monocle failed because the dispersion could not be calculated. However, neither program was tried using the newer package versions corresponding to R version 3.5.0. We have included a statement to this effect in the manuscript.</italic>
                </p>
                <p> Could use a flowchart to summarize the study and various comparisons, as well which methods could no longer be compared (e.g. methods that could not work within the scPipe framework).</p>
                <p> 
                    <italic>While we were unable to summarize our study design effectively in a flowchart, we have summarized it in a table (see Supplementary Table 2). We hope that this will clarify the various assessments performed in this paper.</italic>
                </p>
                <p> Different upstream data handling was performed for each clustering method. How much of a difference was observed just due to this preprocessing, as opposed to the actual clustering step? I understand that each method provides their own preprocessing as *part* of the method, but at least some of these methods would have been developed with plate-based and/or non-UMI-based scRNA-Seq in mind, so may not be intended for the context of 10x Genomics data. Again I understand that you're comparing methods 'out of the box' but it would be insightful to see what differences there are. I suggest a figure like an upsetR plot for the genes/cells filtered and a correlation heatmap of the expression values themselves.</p>
                <p> 
                    <italic>We agree that different data handling influences the performance of the clustering methods, which were indeed designed with different single-cell technologies in mind, and that this effect would be interesting to investigate further. However, the &#x201c;black-box&#x201d; nature of some of the investigated methods means that even recording these differences is challenging. Taking Seurat as an example, it is unclear whether to report the number of genes passing the filtering step or the number of genes that are used in the clustering. Instead, we would like to refer you to the recent benchmarking study of clustering methods for scRNA-seq by Du&#x00f3; et al, where the authors investigated the effects of different gene filtering on clustering solutions.</italic>
                </p>
                <p> Could you summarise the distance metrics used in the clustering and if there is a general flavour to the clustering algorithm? e.g. hierarchical, k-means, density-based etc. How do these relate in terms of overall accuracy, stability and other metrics?</p>
                <p> 
                    <italic>Thank you for the suggestion. We have updated the table summarizing the properties of the different clustering methods and added a discussion regarding how different flavors of clustering methods relate to overall performance (see Table 1 and Discussion).</italic>
                </p>
                <p> Stability assessment - mentions that half of the 58,302 genes were randomly selected, but Table 1 says 24,654 total genes detected. There's a big discrepancy between these two so please clarify; if half of the 58,302 genes were selected then a large proportion of genes would have identically zero rows. Also Table 1 shows Dataset 3 had the highest number of 'total genes detected', so how was Dataset 1 the one with "most number of non-zero genes after filtering"?</p>
                <p> 
                    <italic>You are correct. We randomly selected half of 58,302 genes of which many were zero. We have since replaced this analysis, as per your suggestion, with an analysis that assesses stability when keeping only the top 10th, 20th, 30th, 40th, and 50th percentile of all genes including the ones not detected. </italic>
                </p>
                <p> 
                    <italic>With regard to the number of detected genes in dataset 1 and dataset 3, dataset 3 did indeed have more detected genes. Thank you for correcting this.</italic>
                </p>
                <p> Run time section - What do you mean by 'overridden'? And for which aspects of the analysis steps was this done?</p>
                <p> 
                    <italic>We meant to say that a seed had been set to provide reproducibility of all parts of the analysis that involve randomness. This has been corrected in the manuscript.</italic>
                </p>
                <p> Figure 4 - These boxplots show ARI among multiple clustering solutions, so a method that gives a consistently bad result is still high (e.g. in this case the RCA method). Suggest an analogous set of boxplots but with ARI_truth, is there a similar variability observed, as seen in these boxplots?</p>
                <p> 
                    <italic>Thank you for the suggestion, we have included a boxplot with ARI_truth.</italic>
                </p>
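                <p>The reviewer's point, that agreement between repeated clustering runs can be high even when agreement with the truth is low, can be illustrated with a minimal Python sketch of the adjusted Rand index (ARI). The labels below are hypothetical toy data and are not taken from the manuscript.</p>
                <preformat><![CDATA[
```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label vectors of equal length."""
    n = len(a)
    # Pairs of cells that share a cluster in both partitions.
    sum_ij = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    # Pairs of cells that share a cluster in each partition separately.
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions place all cells in one cluster.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical toy labels: two runs that agree with each other but not with the truth.
truth = [0, 0, 0, 1, 1, 1]
run1 = [0, 1, 0, 1, 0, 1]
run2 = [1, 0, 1, 0, 1, 0]  # same partition as run1 with the labels swapped

stability = adjusted_rand_index(run1, run2)
accuracy = adjusted_rand_index(run1, truth)
```
]]></preformat>
                <p>Here the two runs define the same partition, so their mutual ARI is 1.0, while the ARI of either run against the true labels is below zero. A method can therefore look stable without being accurate.</p>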
                <p> </p>
                <p> Gene-wise stability analysis - I'm actually unsure how realistic this particular comparison is. It would be insightful to assess clusterings depending on different levels of gene filtering stringency (in the initial Cell Ranger read processing), or stringency on selection of features based on various criteria like highly variable genes.</p>
                <p> 
                    <italic>We have replaced the gene-wise stability analysis with an assessment of the performance when keeping only the top 10th, 20th, 30th, 40th, and 50th percentile of all genes (compare Figure 6). We think that this is more insightful as it is closer to filtering performed during analysis.</italic>
                </p>
                <p> </p>
                <p> Figure 7 - Please clarify how 'total number of features' is a cell-specific quantity. Do you mean total number of non-zero features? Was this analysis also performed on the Gold Dataset and what overall similarities could be observed?</p>
                <p> 
                    <italic>Indeed, we do mean the number of non-zero genes, and we have replaced this in the figure with &#x201c;number of detected genes&#x201d;. We also include the same analysis on the gold standard dataset in the Supplementary Material (Supplementary Figure 11).</italic>
                </p>
                <p> Factors influencing clustering solutions - It would be interesting to consider the factors associated with 'correct' cluster assignment for cells. Optionally suggest to perform this for either the Gold Dataset or the Silver datasets and perform a logistic regression with the response being success/failure of a cell to belong to the cluster most associated with the 'true' cell type group. There is an added subtlety as far as matching clusters with cell type groups goes, but I think there are a few reasonable ways to perform this (e.g. assign candidate clusters to the 'true' groups by taking the higher proportion of cell overlap, and allow multiple candidate clusters to match to a single true group). Performing this kind of analysis could shed light on properties of cells that don't tend to cluster correctly, and if there is consistency in this across multiple disparate datasets.</p>
                <p> </p>
                <p> 
                    <italic>We did perform the suggested analysis. However, results from this analysis did not give any insights beyond the already conducted analysis (see </italic>
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/SaskiaFreytag/cluster_benchmarking_code/tree/master/revision_figure">https://github.com/SaskiaFreytag/cluster_benchmarking_code/tree/master/revision_figure</ext-link>
                    <italic>). Hence, we chose not to include this in the manuscript. </italic>
                </p>
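                <p>The cluster-to-truth matching described by the reviewer can be sketched as follows. This is a minimal Python illustration, not the analysis code in the linked repository: it assumes that each cluster is assigned to the true group contributing the largest share of its cells, so several clusters may map to the same group, and all names and labels are hypothetical.</p>
                <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def match_clusters_to_truth(clusters, truth):
    """Map each cluster to the true group that holds the largest share of its cells.

    Ties are broken in favour of the group encountered first in the input.
    """
    members = defaultdict(Counter)
    for cluster, group in zip(clusters, truth):
        members[cluster][group] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in members.items()}


def per_cell_success(clusters, truth):
    """Return 1 for cells whose cluster maps to their own true group, else 0."""
    mapping = match_clusters_to_truth(clusters, truth)
    return [int(mapping[c] == t) for c, t in zip(clusters, truth)]


# Hypothetical toy labels for eight cells.
clusters = [1, 1, 1, 2, 2, 3, 3, 3]
truth = ["H1975", "H1975", "HCC827", "HCC827", "HCC827", "H2228", "H2228", "H1975"]
success = per_cell_success(clusters, truth)
```
]]></preformat>
                <p>The resulting 0/1 vector is the kind of per-cell response that could then be modelled with a logistic regression against cell-level covariates.</p>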
                <p> </p>
                <p> Table 1 - countClust 'version' formatted with verbatim.</p>
                <p> 
                    <italic>Thank you for noticing, this has been corrected.</italic>
                </p>
                <p> </p>
                <p> Table 1 - I would suggest the 'properties' column could be better presented in a checklist format, with ticks/crosses for fulfilling various criteria listed.</p>
                <p> 
                    <italic>Table 1 is now Supplementary Table 1. Unfortunately, properties differ too much to adequately represent these in a checklist. </italic>
                </p>
                <p> </p>
                <p> Section beginning "silver standard" - 10x is capitalised.</p>
                <p> 
                    <italic>Thank you for noticing, this has been corrected.</italic>
                </p>
                <p> </p>
                <p> Supplementary Figure 1 - legend fallen off panel a), needs a higher resolution or larger points</p>
                <p> 
                    <italic>We have increased the resolution.</italic>
                </p>
                <p> </p>
                <p> NMI definition - trailing parenthesis in denominator</p>
                <p> </p>
                <p> 
                    <italic>Thank you for noticing, this has been corrected.</italic>
                </p>
                <p> </p>
                <p> typo - assess the effect**</p>
                <p> 
                    <italic>Thank you for noticing, this has been corrected.</italic>
                </p>
                <p> </p>
                <p> Figure 2a - I found this quite busy, hard to interpret. Suggest to add shading that covers the points for same method or to facet by dataset. I don't believe the ARI values are particularly comparable between datasets so I would prefer facetting by dataset.</p>
                <p> </p>
                <p> 
                    <italic>We agree with the reviewer and now use faceting.</italic>
                </p>
                <p> </p>
                <p> Figure 3 - rows/columns are ordered differently between panels, what's driving this difference?</p>
                <p> </p>
                <p> 
                    <italic>The difference is driven by clustering on the similarity across methods, i.e. more similar methods are placed closer to each other. We have included a statement explaining this in the figure description.</italic>
                </p>
                <p> </p>
                <p> Supplementary Figure 3 was not mentioned in the main text</p>
                <p> </p>
                <p> 
                    <italic>We now mention this Supplementary Figure.</italic>
                </p>
                <p> </p>
                <p> Supplementary Figure 4 is a two page pdf, with the first page blank</p>
                <p> 
                    <italic>We have corrected this error.</italic>
                </p>
                <p> </p>
                <p> Figure 6 - Figure caption says Dataset 1 but reports 29,151 genes. Do you mean the Gold Dataset and 29,451 genes? If not, please clarify which data and how many genes.</p>
                <p> 
                    <italic>Note that this figure has been replaced. We indeed meant Dataset 1, but with only half the genes.</italic>
                </p>
                <p> </p>
                <p> Discussion - One instance of "Seurat" is missing verbatim format</p>
                <p> 
                    <italic>Thank you for noticing, this has been corrected.</italic>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report37232">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17256.r37232</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Ho</surname>
                        <given-names>Joshua W. K.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r37232a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2331-7011</uri>
                </contrib>
                <aff id="r37232a1">
                    <label>1</label>Victor Chang Cardiac Research Institute&#x00a0;(VCCRI), Darlinghurst, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>8</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Ho JWK</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport37232" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15809.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper presents a well-designed and comprehensive evaluation of widely used clustering algorithms for medium-sized 10x Genomics scRNA-seq data. Clustering is a highly active area of research in scRNA-seq data analysis. With so many published clustering tools available, it is often difficult to choose the most appropriate tool. This paper attempts to address this problem by systematically comparing the performance of 12 commonly used clustering tools.&#x00a0;The evaluation results should serve as an important guide to bioinformatics practitioners. This paper is a very useful&#x00a0;contribution to the field.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Bioinformatics, single-cell transcriptomics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
