<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.15398.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 3 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Love</surname>
                        <given-names>Michael I.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8401-0545</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Soneson</surname>
                        <given-names>Charlotte</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3833-2169</uri>
                    <xref ref-type="aff" rid="a3">3</xref>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Patro</surname>
                        <given-names>Rob</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8463-1675</uri>
                    <xref ref-type="aff" rid="a5">5</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA</aff>
                <aff id="a2">
                    <label>2</label>Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA</aff>
                <aff id="a3">
                    <label>3</label>Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland</aff>
                <aff id="a4">
                    <label>4</label>SIB Swiss Institute of Bioinformatics, Zurich, Switzerland</aff>
                <aff id="a5">
                    <label>5</label>Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:michaelisaiahlove@gmail.com">michaelisaiahlove@gmail.com</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>6</month>
                <year>2018</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2018</year>
            </pub-date>
            <volume>7</volume>
            <elocation-id>952</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>22</day>
                    <month>6</month>
                    <year>2018</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Love MI et al.</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/7-952/pdf"/>
            <abstract>
                <p>Detection of differential transcript usage (DTU) from RNA-seq data is an important bioinformatic analysis that complements differential gene expression analysis. Here we present a simple workflow using a set of existing R/Bioconductor packages for analysis of DTU. We show how these packages can be used downstream of RNA-seq quantification using the Salmon software package. The entire pipeline is fast, benefiting from inference steps by Salmon to quantify expression at the transcript level. The workflow includes live, runnable code chunks for analysis using DRIMSeq and DEXSeq, as well as for performing two-stage testing of DTU using the stageR package, a statistical framework to screen at the gene level and then confirm which transcripts within the significant genes show evidence of DTU. We evaluate these packages and other related packages on a simulated dataset with parameters estimated from real data.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>RNA-seq</kwd>
                <kwd>workflow</kwd>
                <kwd>differential transcript usage</kwd>
                <kwd>Salmon</kwd>
                <kwd>DRIMSeq</kwd>
                <kwd>DEXSeq</kwd>
                <kwd>stageR</kwd>
                <kwd>tximport</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000051">
                    <funding-source>National Human Genome Research Institute</funding-source>
                    <award-id>HG009125</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/501100008982">
                    <funding-source>National Science Foundation</funding-source>
                    <award-id>CCF-1750472</award-id>
                    <award-id>BIO-1564917</award-id>
                </award-group>
                <award-group id="fund-3" xlink:href="http://dx.doi.org/10.13039/100000066">
                    <funding-source>National Institute of Environmental Health Sciences</funding-source>
                    <award-id>ES010126</award-id>
                </award-group>
                <award-group id="fund-4" xlink:href="http://dx.doi.org/10.13039/100000054">
                    <funding-source>National Cancer Institute</funding-source>
                    <award-id>CA142538</award-id>
                </award-group>
                <funding-statement>The work of MIL on this workflow was supported by the National Human Genome Research Institute [R01HG009125], the National Cancer Institute [P01CA142538], and the National Institute of Environmental Health Sciences [P30 ES010126]. CS declared that no grants were involved in supporting this work. The work of RP on this workflow was supported by the National Science Foundation [BIO-1564917 and CCF-1750472].</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>RNA-seq experiments can be analyzed to detect differences across groups of samples in total gene expression &#x2013; the total expression produced by all isoforms of a gene &#x2013; and additionally differences in transcript usage within a gene. If the amount of expression switches among two or more isoforms of a gene, then the total gene expression may not change by a detectable amount, but the differential transcript usage is nevertheless biologically relevant. While many tutorials and workflows in the Bioconductor project address differential gene expression, there are fewer workflows for performing a differential transcript usage analysis, which provides critical and complementary information to a gene-level analysis. Some of the existing Bioconductor packages and functions that can be used to detect differential transcript usage include 
                <italic toggle="yes">BitSeq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>, 
                <italic toggle="yes">DEXSeq</italic> (originally designed for differential exon usage)
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, 
                <monospace>diffSpliceDGE</monospace> from the 
                <italic toggle="yes">edgeR</italic> package
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>,
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>, 
                <monospace>diffSplice</monospace> from the 
                <italic toggle="yes">limma</italic> package
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>,
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>, 
                <italic toggle="yes">DRIMSeq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>, 
                <italic toggle="yes">stageR</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>, and 
                <italic toggle="yes">SGSeq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-9">9</xref>
                </sup>. The Bioconductor package 
                <italic toggle="yes">IsoformSwitchAnalyzeR</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup> is well documented and can be seen as an alternative to this workflow; 
                <italic toggle="yes">IsoformSwitchAnalyzeR</italic> allows for import of data from various quantification methods, including 
                <italic toggle="yes">Salmon</italic>, and allows for statistical inference using 
                <italic toggle="yes">DRIMSeq</italic>, as well as a rank-based statistical test of transcript proportions. In addition, 
                <italic toggle="yes">IsoformSwitchAnalyzeR</italic> includes functions for obtaining the nucleotide and amino acid sequence consequences of isoform switching, which is not covered in this workflow. Other packages related to splicing can be found at the 
                <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/BiocViews.html#___DifferentialSplicing">DifferentialSplicing BiocViews</ext-link>. For more information about the Bioconductor project and its core infrastructure, please refer to the overview by Huber 
                <italic toggle="yes">et al</italic>.
                <sup>
                    <xref ref-type="bibr" rid="ref-11">11</xref>
                </sup>.</p>
            <p>We note that there are numerous other methods for detecting differential transcript usage outside of the Bioconductor project. The 
                <italic toggle="yes">DRIMSeq</italic> publication is a good reference for these, having descriptions and comparisons with many current methods
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>. This workflow will build on the methods and vignettes from three Bioconductor packages: 
                <italic toggle="yes">DRIMSeq</italic>, 
                <italic toggle="yes">DEXSeq</italic>, and 
                <italic toggle="yes">stageR</italic>.</p>
            <p>Previously, some of the developers of the Bioconductor packages 
                <italic toggle="yes">edgeR</italic> and 
                <italic toggle="yes">DESeq2</italic> have collaborated to develop the 
                <italic toggle="yes">tximport</italic> package
                <sup>
                    <xref ref-type="bibr" rid="ref-12">12</xref>
                </sup> for summarizing the output of fast transcript-level quantifiers, such as 
                <italic toggle="yes">Salmon</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>, 
                <italic toggle="yes">Sailfish</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>
                </sup>, and 
                <italic toggle="yes">kallisto</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-15">15</xref>
                </sup>. The 
                <italic toggle="yes">tximport</italic> package focuses on preparing estimated transcript-level counts, abundances and effective transcript lengths, for gene-level statistical analysis using 
                <italic toggle="yes">edgeR</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, 
                <italic toggle="yes">DESeq2</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup> or 
                <italic toggle="yes">limma-voom</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>. 
                <italic toggle="yes">tximport</italic> produces an offset matrix to accompany gene-level counts. This offset accounts for a number of RNA-seq biases, as well as for differences in transcript usage among transcripts of different length, which would otherwise bias an estimator of gene fold change based on the gene-level counts
                <sup>
                    <xref ref-type="bibr" rid="ref-17">17</xref>
                </sup>. 
                <italic toggle="yes">tximport</italic> can alternatively produce a matrix of data that is roughly on the scale of counts, by scaling transcript-per-million (TPM) abundances to add up to the total number of mapped reads. This counts-from-abundance approach directly corrects for technical biases and differential transcript usage across samples, obviating the need for the accompanying offset matrix.</p>
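            <p>As a rough sketch of the counts-from-abundance scaling described above, in notation introduced here only for illustration (it is not taken from the <italic toggle="yes">tximport</italic> documentation): let <italic>A</italic><sub>ij</sub> be the TPM abundance of transcript <italic>i</italic> in sample <italic>j</italic>, and let <italic>N</italic><sub>j</sub> be the total number of mapped reads in sample <italic>j</italic>. Scaling the abundances so that each sample adds up to <italic>N</italic><sub>j</sub> gives the scaled counts
                <disp-formula>
                    <tex-math>\tilde{K}_{ij} = A_{ij} \cdot \frac{N_j}{\sum_{i'} A_{i'j}}, \qquad \sum_{i} \tilde{K}_{ij} = N_j .</tex-math>
                </disp-formula>
            </p>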
            <p>Complementary to an analysis of differential gene expression, one can use 
                <italic toggle="yes">tximport</italic> to import transcript-level estimated counts, and then pass these counts to packages such as 
                <italic toggle="yes">DRIMSeq</italic> or 
                <italic toggle="yes">DEXSeq</italic> for statistical analysis of differential transcript usage. Following a transcript-level analysis, one can aggregate evidence of differential transcript usage to the gene level. The 
                <italic toggle="yes">stageR</italic> package in Bioconductor provides a statistical framework to 
                <italic toggle="yes">screen</italic> at the gene-level for differential transcript usage with gene-level adjusted p-values, followed by 
                <italic toggle="yes">confirmation</italic> of which transcripts within the significant genes show differential usage with transcript-level adjusted p-values
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>. The method controls the 
                <italic toggle="yes">overall false discovery rate</italic> (OFDR)
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup> for such a two-stage procedure, which will be discussed in more detail later in the workflow. We believe that 
                <italic toggle="yes">stageR</italic> represents a principled approach to analyzing transcript usage changes, as the methods can be evaluated against a target error rate in a manner that mimics how the methods will be used in practice. That is, following rejection of the null hypothesis at the gene-level, investigators would likely desire to know which transcripts within a gene participated in the differential usage.</p>
            <p>Here we provide a basic workflow for detecting differential transcript usage using Bioconductor packages, following quantification of transcript abundance using the 
                <italic toggle="yes">Salmon</italic> method. This workflow includes live, runnable code chunks for analysis using 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic>, as well as for performing stage-wise testing of differential transcript usage using the 
                <italic toggle="yes">stageR</italic> package. For the workflow, we use data that is simulated, so that we can also evaluate the performance of methods for differential transcript usage, as well as differential gene and transcript expression. The simulation was constructed using distributional parameters estimated from the GEUVADIS project RNA-seq dataset
                <sup>
                    <xref ref-type="bibr" rid="ref-19">19</xref>
                </sup> quantified by the 
                <italic toggle="yes">recount2</italic> project
                <sup>
                    <xref ref-type="bibr" rid="ref-20">20</xref>
                </sup>, including the expression levels of the transcripts, the amount of biological variability of gene expression levels across samples, and realistic coverage of reads along the transcript.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Simulation</title>
                <p>First we describe details of the simulated data, which will be used in the following workflow. Understanding the details of the simulation will be useful for assessing the methods in the later sections. All of the code used to simulate RNA-seq experiments and write paired-end reads to FASTQ files can be found at an associated GitHub repository for the simulation code
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>, and the reads and quantification files can be downloaded from Zenodo
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>&#x2013;
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>. 
                    <italic toggle="yes">Salmon</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup> was used to estimate transcript-level abundances for a single sample (
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/data/view/ERR188297">ERR188297</ext-link>) of the GEUVADIS project
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>, and this was used as a baseline for transcript abundances in the simulation. Transcripts with estimated counts below 10 had their abundance thresholded to 0; all other transcripts were considered &#x201c;expressed&#x201d;. 
                    <italic toggle="yes">alpine</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup> was used to estimate realistic fragment GC bias from 12 samples from the GEUVADIS project, all from the same sequencing center (the first 12 samples from CNAG-CRG in Supplementary Table 2 from Love 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup>). 
                    <italic toggle="yes">DESeq2</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-16">16</xref>
                    </sup> was used to estimate mean and dispersion parameters for a Negative Binomial distribution for gene-level counts for 458 GEUVADIS samples provided by the 
                    <italic toggle="yes">recount2</italic> project
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>. An example of 
                    <italic toggle="yes">DESeq2</italic>-generated estimates of dispersion per gene can be seen in 
                    <xref ref-type="other" rid="SF1">Supplementary Figure 1</xref>. Note that, while gene-level dispersion estimates were used to generate underlying transcript-level counts, additional uncertainty on the transcript-level data is a natural consequence of the simulation, as the transcript-level counts must be estimated (the underlying transcript counts are not provided to the methods).</p>
                <p>
                    <italic toggle="yes">polyester</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-27">27</xref>
                    </sup> was used to simulate paired-end RNA-seq reads for two groups of 12 samples each, with realistic fragment GC bias, and with dispersion on transcript-level counts drawn from the joint distribution of mean and dispersion values estimated from the GEUVADIS samples. To compare 
                    <italic toggle="yes">DRIMSeq</italic> and 
                    <italic toggle="yes">DEXSeq</italic> in further detail, we generated an additional simulation in which dispersion parameters were assigned to genes via matching on the gene-level count, and then all transcripts of a gene had counts generated using the same per-gene dispersion. The first sample for group 1 and the first sample for group 2 followed the realistic GC bias profile of the same GEUVADIS sample, and so on for all 12 samples. This pairing of the samples was used to generate balanced data, but not used in the statistical analysis. 
                    <italic toggle="yes">countsimQC</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup> was used to examine the properties of the simulation relative to the dataset used for parameter estimation, and the full report can be accessed at the associated GitHub repository for simulation code
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>.</p>
                <p>Differential expression across two groups was generated as follows: 70% of the genes were set as null genes, where abundance was not changed across the two groups. For 10% of genes, all isoforms were differentially expressed at a log2 fold change between 1 and 2.58 (fold change between 2 and 6). The set of transcripts in these genes was classified as DGE (differential gene expression) by construction, and the expressed transcripts were also DTE (differential transcript expression), but they did not count as DTU (differential transcript usage), as the proportions within the gene remained constant. To simulate balanced differential expression, one of the two groups was randomly chosen to be the baseline, and the other group had its counts multiplied by the fold change. For 10% of genes, a single expressed isoform was differentially expressed at a log2 fold change between 1 and 2.58. This set of transcripts was DTE by construction. If the chosen transcript was the only expressed isoform of a gene, this also counted as DGE and not as DTU, but if other isoforms were expressed, this counted as both DGE and DTU, as the proportion of expression among the isoforms was affected. For 10% of genes, differential transcript usage was constructed by exchanging the TPM abundance of two expressed isoforms, or, if only one isoform was expressed, exchanging the abundance of the expressed isoform with a non-expressed one. This counted for DTU and DTE, but not for DGE. An MA plot of the simulated transcript abundances for the two groups is shown in 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>MA plot of simulated abundances.</title>
                        <p>Each point depicts a transcript, with the average log2 abundance in transcripts-per-million (TPM) on the x-axis and the difference between the two groups on the y-axis. Of the transcripts which are expressed with TPM &gt; 1 in at least one group, 77% are null transcripts (grey), which fall by construction on the M=0 line, and 23% are differentially expressed (green, orange, or purple). As transcripts can belong to multiple categories of differential gene expression (DGE), differential transcript expression (DTE), and differential transcript usage (DTU), here the transcripts are colored by which genes they belong to (those selected to be DGE-, DTE-, or DTU-by-construction).</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure1.gif"/>
                </fig>
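                <p>The three constructions can be illustrated with a small worked example; the abundances below are made up for illustration and are not values from the simulation. The stated fold change bounds of 2 and 6 correspond to log fold changes of 1 and 2.58 in base 2,
                    <disp-formula>
                        <tex-math>2^{1} = 2, \qquad 2^{2.58} \approx 6 .</tex-math>
                    </disp-formula>
                    Suppose a gene has two expressed isoforms with 30 and 10 TPM in group 1. If both isoforms are doubled, to 60 and 20 TPM, the gene total doubles while the proportions stay at 0.75 and 0.25 (DGE and DTE, but no DTU). If instead the two abundances are exchanged, giving 10 and 30 TPM in group 2, each isoform changes (DTE) and the proportions change (DTU), but the gene total does not (no DGE),
                    <disp-formula>
                        <tex-math>30 + 10 = 10 + 30 = 40, \qquad \frac{30}{40} = 0.75 \;\rightarrow\; \frac{10}{40} = 0.25 .</tex-math>
                    </disp-formula>
                    Finally, if only the first isoform is doubled, to 60 TPM, the gene total and the proportions both change (DGE and DTU),
                    <disp-formula>
                        <tex-math>30 + 10 = 40 \;\rightarrow\; 60 + 10 = 70, \qquad \frac{30}{40} = 0.75 \;\rightarrow\; \frac{60}{70} \approx 0.86 .</tex-math>
                    </disp-formula>
                </p>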
            </sec>
            <sec>
                <title>Operation</title>
                <p>This workflow was designed to work with R 3.5 or higher, and the 
                    <italic toggle="yes">DRIMSeq</italic>, 
                    <italic toggle="yes">DEXSeq</italic>, 
                    <italic toggle="yes">stageR</italic>, and 
                    <italic toggle="yes">tximport</italic> packages for Bioconductor version 3.7 or higher. Bioconductor packages should always be installed following the 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/install/">official instructions</ext-link>. The workflow uses a subset of all genes to speed up the analysis, but the Bioconductor packages can easily be run for this dataset on all human genes on a laptop in less than an hour. Timing for the various packages is included within each section.</p>
            </sec>
        </sec>
        <sec>
            <title>Quantification and data import</title>
            <sec>
                <title>Salmon quantification</title>
                <p>We used 
                    <italic toggle="yes">Salmon</italic> version 0.10.0 to quantify abundance and effective transcript lengths for all of the 24 simulated samples. For this workflow, we will use the first six samples from each group. We quantified against the 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gencodegenes.org/releases/current.html">GENCODE</ext-link> human annotation version 28, which was the same reference used to generate the simulated reads. We used the transcript sequences FASTA file that contains &#x201c;Nucleotide sequences of all transcripts on the reference chromosomes&#x201d;. When downloading the FASTA file, it is useful to download the corresponding GTF file, as this will be used in later sections.</p>
                <p>To build the 
                    <italic toggle="yes">Salmon</italic> index, we used the following command. Recent versions of 
                    <italic toggle="yes">Salmon</italic> will discard transcripts whose sequences are exact duplicates of another transcript, and keep a log of these within the index directory.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">salmon index -t gencode.v28.transcripts.fa -i gencode.v28_salmon-0.10.0</styled-content>
                    </preformat>
                </p>
                <p>To quantify each sample, we used the following command, which quantifies with six threads against the GENCODE index, treats the reads as inward and unstranded paired-end reads, applies fragment GC bias correction, writes output to the directory 
                    <monospace>sample</monospace>, and takes the two read files as input. The library type is specified by 
                    <monospace>-l IU</monospace> (inward and unstranded) and the options are discussed in the 
                    <ext-link ext-link-type="uri" xlink:href="http://salmon.readthedocs.io/en/latest/library_type.html">Salmon documentation</ext-link>. Recent versions of Salmon can automatically detect the library type by setting 
                    <monospace>-l A</monospace>. Such a command can be automated in a bash loop using bash variables, or one can use more advanced workflow management systems such as Snakemake
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> or Nextflow
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">salmon quant -p 6 -i gencode.v28_salmon-0.10.0 -l IU \</styled-content>
      
                        <styled-content style="font-size:15px;color:#000000;">--gcBias -o sample -1 sample_1.fa.gz -2 sample_2.fa.gz</styled-content>
                    </preformat>
                </p>
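                <p>As a loosely equivalent alternative to the bash loop mentioned above, the per-sample commands can also be generated from R. The following is only a sketch: the sample names and the read file naming pattern (sample name followed by 
                    <monospace>_1.fa.gz</monospace> and 
                    <monospace>_2.fa.gz</monospace>) are assumptions for illustration, and the commands are printed rather than executed.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Hypothetical sample names; replace with the identifiers of your own samples
samples <- c("s1_1", "s2_1", "s3_1")
index <- "gencode.v28_salmon-0.10.0"

# Build one 'salmon quant' command per sample, using the same flags as above.
# The read file naming pattern is an assumption for this sketch.
cmds <- sprintf(
  "salmon quant -p 6 -i %s -l IU --gcBias -o %s -1 %s_1.fa.gz -2 %s_2.fa.gz",
  index, samples, samples, samples
)

# Print the commands for inspection
writeLines(cmds)

# To run them (requires Salmon on the PATH), one could use:
# for (cmd in cmds) system(cmd)
```
]]></preformat>
                </p>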
            </sec>
            <sec>
                <title>Importing counts into R/Bioconductor</title>
                <p>We can use 
                    <italic toggle="yes">tximport</italic> to import the estimated counts, abundances and effective transcript lengths into R. We recommend constructing a CSV file that keeps track of the sample identifiers and any relevant variables, e.g. condition, time point, batch, and so on. Here we have made a sample CSV file and provided it along with this workflow&#x2019;s R package.</p>
                <p>In order to find this file, we first need to know where on the machine this workflow package lives, so we can point to the 
                    <monospace>extdata</monospace> directory where the CSV file is located. The following two lines of code load the workflow package and locate this directory. Because they are specific to this workflow package, they would not be part of a 
                    <italic toggle="yes">typical</italic> workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(rnaseqDTU)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">csv.dir &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">system.file</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"extdata"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">package=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"rnaseqDTU"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
                <p>The CSV file records which samples are condition 1 and which are condition 2. The columns of this CSV file can have any names, although 
                    <monospace>sample_id</monospace> will be used later by 
                    <italic toggle="yes">DRIMSeq</italic>, and so using this column name allows us to pass this 
                    <italic toggle="yes">data.frame</italic> directly to 
                    <italic toggle="yes">DRIMSeq</italic> at a later step.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">samps &lt;- </styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">read.csv</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">file.path</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(csv.dir,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"samples.csv"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(samps)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##   sample_id condition
## 1      s1_1         1
## 2      s2_1         1
## 3      s3_1         1
## 4      s4_1         1
## 5      s5_1         1
## 6      s6_1         1
</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">samps</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">condition &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(samps</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">condition)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(samps</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">condition)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##
## 1 2
## 6 6</styled-content>
                    </preformat>
                </p>
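                <p>For readers preparing their own sample table, the following sketch shows one way to create a CSV file with the same layout and read it back in. The sample identifiers mirror the naming seen above but are otherwise an assumption, and a temporary file stands in for a real path.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Build a toy sample table: 6 samples in each of 2 conditions
# (identifiers are hypothetical and only mimic the naming used above)
samps_toy <- data.frame(
  sample_id = paste0("s", rep(1:6, times = 2), "_", rep(1:2, each = 6)),
  condition = rep(1:2, each = 6)
)

# Write the table to a temporary CSV file and read it back
csv_file <- tempfile(fileext = ".csv")
write.csv(samps_toy, csv_file, row.names = FALSE)
samps_in <- read.csv(csv_file)

# Treat condition as a categorical variable, as in the workflow
samps_in$condition <- factor(samps_in$condition)
table(samps_in$condition)
```
]]></preformat>
                </p>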
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">files &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">file.path</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"/path/to/dir"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">, samps</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">sample_id,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"quant.sf"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(files) &lt;- samps</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">sample_id</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(files)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##                         s1_1                         s2_1
## "/path/to/dir/s1_1/quant.sf" "/path/to/dir/s2_1/quant.sf"
##                         s3_1                         s4_1
## "/path/to/dir/s3_1/quant.sf" "/path/to/dir/s4_1/quant.sf"
##                         s5_1                         s6_1
## "/path/to/dir/s5_1/quant.sf" "/path/to/dir/s6_1/quant.sf"</styled-content>
                    </preformat>
                </p>
                <p>We can then import transcript-level counts using 
                    <italic toggle="yes">tximport</italic>. For DTU analysis, we suggest generating counts from abundance using the 
                    <monospace>scaledTPM</monospace> method described by Soneson 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. The 
                    <monospace>countsFromAbundance</monospace> option of 
                    <italic toggle="yes">tximport</italic> uses estimated abundances to generate roughly count-scaled data, such that each column will sum to the number of reads mapped for that library. We recommend 
                    <monospace>scaledTPM</monospace> for differential transcript usage so that the estimated proportions fit by 
                    <italic toggle="yes">DRIMSeq</italic> in the following sections correspond to the proportions of underlying abundance.</p>
                <p>If instead of 
                    <monospace>scaledTPM</monospace>, we used the original estimated transcript counts (
                    <monospace>countsFromAbundance="no"</monospace>), or if we used 
                    <monospace>lengthScaledTPM</monospace> transcript counts, then a change in transcript usage among transcripts of different length could result in a changed total count for the gene, even if there is no change in total gene expression. This is because the original transcript counts and 
                    <monospace>lengthScaledTPM</monospace> transcript counts scale with transcript length, while 
                    <monospace>scaledTPM</monospace> transcript counts do not. For testing DTU using 
                    <italic toggle="yes">DRIMSeq</italic> and 
                    <italic toggle="yes">DEXSeq</italic>, it is convenient if the count-scale data do 
                    <italic toggle="yes">not</italic> scale with transcript length within a gene. Note that this dependence on length could in principle be corrected by an offset, but offsets are not easily implemented in the current DTU analysis packages. While this workflow only considers existing software features, we are considering developing a new 
                    <monospace>countsFromAbundance</monospace> method which would scale abundance for all transcripts of a gene by a fixed gene length, then each sample by its number of mapped reads, therefore balancing between the benefits of 
                    <monospace>scaledTPM</monospace> and 
                    <monospace>lengthScaledTPM</monospace>.</p>
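                <p>The effect described above can be illustrated with a small toy calculation in base R. This is a simplified sketch, not a call to 
                    <italic toggle="yes">tximport</italic>: all abundances, lengths and library sizes are made-up values, and length-dependent counts are modeled simply as abundance multiplied by effective length and rescaled to the library size. The two samples have the same total abundance for gene g1 but opposite isoform usage.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Toy abundances (TPM-like; each column sums to 1e6). Gene g1 has two
# isoforms (tx1, tx2); gene g2 has one (tx3). All values are hypothetical.
tpm <- cbind(A = c(tx1 = 80000, tx2 = 20000, tx3 = 900000),
             B = c(tx1 = 20000, tx2 = 80000, tx3 = 900000))
eff_len  <- c(tx1 = 1000, tx2 = 4000, tx3 = 2000)  # effective lengths
lib_size <- c(A = 1e6, B = 1e6)                    # mapped reads per sample

# Rescale each column of a matrix so that it sums to the library size
scale_to_lib <- function(m) sweep(m, 2, lib_size / colSums(m), "*")

# Length-dependent counts: expected reads proportional to abundance x length
# (a simplified stand-in for the original counts and for lengthScaledTPM)
len_counts <- scale_to_lib(tpm * eff_len)

# scaledTPM-style counts: abundance rescaled to the library size only
scaled_tpm <- scale_to_lib(tpm)

g1 <- c("tx1", "tx2")
colSums(len_counts[g1, ])   # gene total differs between samples A and B
colSums(scaled_tpm[g1, ])   # gene total is identical in samples A and B

# Isoform proportions from scaledTPM-style counts match the abundances
prop.table(scaled_tpm[g1, ], margin = 2)
```
]]></preformat>
                </p>
                <p>In this toy setting, the switch towards the longer isoform in sample B inflates the length-dependent gene total even though the total abundance of the gene is unchanged, whereas the scaledTPM-style total stays constant.</p>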
                <p>The following code chunk is not evaluated; instead, we will load a pre-constructed matrix of counts. The actual quantification files for this dataset have been made publicly available; see the 
                    <italic toggle="yes">Data availability</italic> section at the end of this workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tximport)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">txi &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tximport</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(files,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">type=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"salmon"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">txOut=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                  
                        <styled-content style="font-size:15px;color:#214A87;">countsFromAbundance=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"scaledTPM"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">cts &lt;- txi</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">counts</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">cts &lt;- cts[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rowSums</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&gt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,]</styled-content>
                    </preformat>
                </p>
            </sec>
            <sec>
                <title>Transcript-to-gene mapping</title>
                <p>Bioconductor offers numerous approaches for building a 
                    <italic toggle="yes">TxDb</italic> object, a transcript database that can be used to link transcripts to genes (among other uses). We ran the following unevaluated code chunks to generate a 
                    <italic toggle="yes">TxDb</italic>, and then used the 
                    <monospace>select</monospace> function with the 
                    <italic toggle="yes">TxDb</italic> to produce a corresponding 
                    <italic toggle="yes">data.frame</italic> called 
                    <monospace>txdf</monospace> which links transcript IDs to gene IDs. In this 
                    <italic toggle="yes">TxDb</italic>, the transcript IDs are called 
                    <monospace>TXNAME</monospace> and the gene IDs are called 
                    <monospace>GENEID</monospace>. The version 28 human GTF file was downloaded from the GENCODE website at the same time as the transcripts FASTA file.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(GenomicFeatures)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">gtf &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"gencode.v28.annotation.gtf.gz"</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">txdb.filename &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"gencode.v28.annotation.sqlite"</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">txdb &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">makeTxDbFromGFF</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(gtf)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">saveDb</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdb, txdb.filename)</styled-content>
                    </preformat>
                </p>
                <p>Once the 
                    <italic toggle="yes">TxDb</italic> database has been generated and saved, it can be quickly reloaded:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">txdb &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">loadDb</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdb.filename)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">txdf &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">select</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdb,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">keys</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdb,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"GENEID"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"TXNAME"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"GENEID"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">tab &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">GENEID)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">ntx &lt;- tab[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">GENEID,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tab))]</styled-content>
                    </preformat>
                </p>
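                <p>The last two statements of the chunk above add a column 
                    <monospace>ntx</monospace> giving the number of transcripts annotated to each gene. A minimal sketch of the same idiom on a toy mapping table is shown here; the gene and transcript IDs are hypothetical, and the result is converted to a plain integer vector for readability.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Toy transcript-to-gene table with hypothetical IDs
txdf_toy <- data.frame(
  GENEID = c("g1", "g1", "g1", "g2", "g3", "g3"),
  TXNAME = paste0("tx", 1:6),
  stringsAsFactors = FALSE
)

# Count the transcripts per gene, then look up the count for every row
tab <- table(txdf_toy$GENEID)
txdf_toy$ntx <- as.integer(tab[match(txdf_toy$GENEID, names(tab))])

txdf_toy
```
]]></preformat>
                </p>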
            </sec>
        </sec>
        <sec>
            <title>Statistical analysis of differential transcript usage</title>
            <sec>
                <title>DRIMSeq</title>
                <p>We load the 
                    <monospace>cts</monospace> object as created in the 
                    <italic toggle="yes">tximport</italic> code chunks. This contains count-scale data, generated from abundance using the 
                    <monospace>scaledTPM</monospace> method. The column sums are equal to the number of mapped paired-end reads per experiment. The experiments have between 31 and 38 million paired-end reads that were mapped to the transcriptome using 
                    <italic toggle="yes">Salmon</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(salmon_cts)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">cts[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">]</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##                         s1_1       s2_1       s3_1
## ENST00000488147.1 179.798908 184.437348 229.046306
## ENST00000469289.1   0.000000   0.000000   0.000000
## ENST00000466430.5   5.004159   3.627831   9.463167</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">range</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">colSums</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1e6</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] 31.37738 38.47173</styled-content>
                    </preformat>
                </p>
                <p>We also have the 
                    <monospace>txdf</monospace> object giving the transcript-to-gene mappings (for construction, see previous section). This object is stored in a file called 
                    <monospace>simulate.rda</monospace>, which contains a number of R objects with information about the simulation that we will use later to assess the methods&#x2019; performance.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(simulate)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txdf)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##               GENEID            TXNAME ntx
## 1 ENSG00000000003.14 ENST00000612152.4   5
## 2 ENSG00000000003.14 ENST00000373020.8   5
## 3 ENSG00000000003.14 ENST00000614008.4   5
## 4 ENSG00000000003.14 ENST00000496771.5   5
## 5 ENSG00000000003.14 ENST00000494424.1   5
## 6  ENSG00000000005.5 ENST00000373031.4   2</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">all</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TXNAME)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] TRUE</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">txdf &lt;- txdf[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>(
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts),txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TXNAME),]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">all</styled-content>(
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">==</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TXNAME)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] TRUE</styled-content>
                    </preformat>
                </p>
                <p>In order to run 
                    <italic toggle="yes">DRIMSeq</italic>, we build a 
                    <italic toggle="yes">data.frame</italic> with the gene ID, the feature (transcript) ID, and then columns for each of the samples:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">counts &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">data.frame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">gene_id=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">GENEID,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#214A87;">feature_id=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txdf</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TXNAME,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#000000;">cts)</styled-content>
                    </preformat>
                </p>
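                <p>The alignment and construction steps above can be sketched on toy data as follows. The count values and IDs are hypothetical; the point is only that the mapping table is reordered to match the rows of the count matrix before the two are combined into the 
                    <monospace>gene_id</monospace>, 
                    <monospace>feature_id</monospace> and per-sample columns described above.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Toy count matrix: 3 transcripts x 2 samples (hypothetical values and IDs)
cts_toy <- matrix(c(10, 0, 5, 12, 1, 7), nrow = 3,
                  dimnames = list(c("tx2", "tx3", "tx1"), c("s1_1", "s1_2")))

# Mapping table listed in a different order than the rows of cts_toy
txdf_toy <- data.frame(GENEID = c("g1", "g1", "g2"),
                       TXNAME = c("tx1", "tx2", "tx3"),
                       stringsAsFactors = FALSE)

# Every quantified transcript must be present in the mapping table
stopifnot(all(rownames(cts_toy) %in% txdf_toy$TXNAME))

# Reorder the mapping table so its rows line up with the count matrix
txdf_toy <- txdf_toy[match(rownames(cts_toy), txdf_toy$TXNAME), ]
stopifnot(all(rownames(cts_toy) == txdf_toy$TXNAME))

# Combine gene ID, transcript ID and one column of counts per sample
counts_toy <- data.frame(gene_id = txdf_toy$GENEID,
                         feature_id = txdf_toy$TXNAME,
                         cts_toy)
counts_toy
```
]]></preformat>
                </p>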
                <p>We can now load the 
                    <italic toggle="yes">DRIMSeq</italic> package and create a 
                    <italic toggle="yes">dmDSdata</italic> object, with our 
                    <monospace>counts</monospace> and 
                    <monospace>samps</monospace> 
                    <italic toggle="yes">data.frames</italic>. Typing in the object name and pressing return will print a summary that includes the number of genes and samples:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(DRIMSeq)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmDSdata</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">counts=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">counts,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">samples=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">samps)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">d</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## An object of class dmDSdata</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">## with 16612 genes and 12 samples</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">## * data accessors: counts(), samples()</styled-content>
                    </preformat>
                </p>
                <p>The 
                    <italic toggle="yes">dmDSdata</italic> object has a number of specific methods. Note that the object is indexed by gene, so extracting the first 
                    <italic toggle="yes">row</italic> returns all of the transcripts of the first gene:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">methods</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">class=class</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d))</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] [           coerce      counts      dmFilter     dmPrecision length</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">## [7] names       plotData    show</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">## see &#x2019;?methods&#x2019; for accessing help and source code</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,])[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">4</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">]</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##              gene_id        feature_id       s1_1        s2_1
## 1 ENSG00000000419.12 ENST00000371588.9 1394.71411  1210.12539
## 2 ENSG00000000419.12 ENST00000466152.5  135.15850    18.20031
## 3 ENSG00000000419.12 ENST00000371582.8  154.77943    35.39425
## 4 ENSG00000000419.12 ENST00000371584.8   42.85733    86.04958
## 5 ENSG00000000419.12 ENST00000413082.1    0.00000     0.00000</styled-content>
                    </preformat>
                </p>
                <p>It will be useful to first filter the object, before running procedures to estimate model parameters. This greatly speeds up the fitting and removes transcripts that may be troublesome for parameter estimation, e.g. estimating the proportion of expression among the transcripts of a gene when the total count is very low. We first define 
                    <monospace>n</monospace> to be the total number of samples, and 
                    <monospace>n.small</monospace> to be the sample size of the smallest group. We use all three of the possible filters: for a transcript to be retained in the dataset, we require that (1) it has a count of at least 10 in at least 
                    <monospace>n.small</monospace> samples, (2) it has a relative abundance proportion of at least 0.1 in at least 
                    <monospace>n.small</monospace> samples, and (3) the total count of the corresponding gene is at least 10 in all 
                    <monospace>n</monospace> samples. By contrast, the example code in the 
                    <italic toggle="yes">DRIMSeq</italic> vignette uses only the two count filters.</p>
                <p>It is important to consider what types of transcripts may be removed by the filters, and to adjust the filters depending on the dataset. If 
                    <monospace>n</monospace> were large, it would make sense to allow a few samples to have very low counts, by lowering 
                    <monospace>min_samps_gene_expr</monospace> to some fraction (
                    <italic toggle="yes">&lt;</italic> 1) of 
                    <monospace>n</monospace>, and likewise lowering the sample thresholds of the first two filters to some fraction of 
                    <monospace>n.small</monospace>. The second filter means that if a transcript does not make up at least 10% of the gene&#x2019;s expression in at least 
                    <monospace>n.small</monospace> samples, it will be removed. If this proportion seems too high, for example, if very lowly expressed isoforms are of particular interest, then the filter can be omitted or 
                    <monospace>min_feature_prop</monospace> lowered. As a concrete example, a transcript that goes from a proportion of 0% in the control group to a proportion of 9% in the treatment group would be removed by the above 10% filter. After filtering, this dataset has 7,764 genes.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">n &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">12</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">n.small &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">6</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmFilter</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content>
                
                        <styled-content style="font-size:15px;color:#214A87;">min_samps_feature_expr=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">n.small,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">min_feature_expr=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                
                        <styled-content style="font-size:15px;color:#214A87;">min_samps_feature_prop=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">n.small,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">min_feature_prop=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                
                        <styled-content style="font-size:15px;color:#214A87;">min_samps_gene_expr=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">n,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">min_gene_expr=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">d</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## An object of class dmDSdata</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">## with 7764 genes and 12 samples
## * data accessors: counts(), samples()</styled-content>
                    </preformat>
                </p>
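                <p>To make the three filtering rules concrete, the following base R sketch applies them to a single made-up gene with three transcripts and four samples. It is a simplified illustration of the rules as described in the text, not the 
                    <italic toggle="yes">DRIMSeq</italic> implementation: the toy counts and all object names prefixed with 
                    <monospace>toy.</monospace> are assumptions for illustration, 
                    <monospace>dmFilter</monospace> may apply the rules in a different order, and the handling of genes left with a single transcript is not shown.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># Toy counts for one gene: 3 transcripts (rows) x 4 samples (columns).
# All values are made up for illustration.
toy.cts &lt;- matrix(c(50, 60, 55, 70,
                     2,  3, 12,  9,
                    40, 45, 30, 20),
                  nrow=3, byrow=TRUE,
                  dimnames=list(c("txA","txB","txC"), paste0("s", 1:4)))
toy.n &lt;- 4        # total number of samples
toy.n.small &lt;- 2  # sample size of the smallest group

toy.gene.total &lt;- colSums(toy.cts)                      # gene-level count per sample
toy.prop &lt;- sweep(toy.cts, 2, toy.gene.total, "/")      # per-sample transcript proportions

# (1) transcript count of at least 10 in at least toy.n.small samples
toy.keep.expr &lt;- rowSums(toy.cts &gt;= 10) &gt;= toy.n.small
# (2) transcript proportion of at least 0.1 in at least toy.n.small samples
toy.keep.prop &lt;- rowSums(toy.prop &gt;= 0.1) &gt;= toy.n.small
# (3) gene-level count of at least 10 in all toy.n samples
toy.keep.gene &lt;- sum(toy.gene.total &gt;= 10) &gt;= toy.n

toy.keep &lt;- toy.keep.expr &amp; toy.keep.prop &amp; toy.keep.gene
toy.keep</preformat>
                </p>
                <p>In this toy gene, 
                    <monospace>txB</monospace> reaches a count of 10 and a proportion of 0.1 in only one sample, so it fails filters (1) and (2), while 
                    <monospace>txA</monospace> and 
                    <monospace>txC</monospace> are retained.</p>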
                <p>The 
                    <italic toggle="yes">dmDSdata</italic> object only contains genes that have more than one isoform, which makes sense as we are testing for differential transcript usage. We can find out how many of the remaining genes have 
                    <italic toggle="yes">N</italic> isoforms by tabulating the number of times we see a gene ID, then tabulating the output again:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id))</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##
##    2    3    4    5    6    7
## 4062 2514  931  222   34    1</styled-content>
                    </preformat>
                </p>
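                <p>The double tabulation may be easier to follow on a small made-up vector of gene IDs (the IDs and the object names 
                    <monospace>toy.ids</monospace> and 
                    <monospace>toy.per.gene</monospace> are hypothetical). The inner call counts transcripts per gene, and the outer call counts how many genes share each transcript count:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># Hypothetical gene IDs, one entry per retained transcript
toy.ids &lt;- c("g1", "g1", "g2", "g2", "g2", "g3", "g3")

# Inner table: number of transcripts for each gene (g1 = 2, g2 = 3, g3 = 2)
toy.per.gene &lt;- table(toy.ids)

# Outer table: number of genes with each transcript count
# (two genes with 2 transcripts, one gene with 3 transcripts)
table(toy.per.gene)</preformat>
                </p>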
                <p>We create a design matrix, using a design formula and the sample information contained in the object, accessed via 
                    <italic toggle="yes">samples</italic>. Here we use a simple design with just two groups, but more complex designs are possible. For some discussion of complex designs, one can refer to the vignettes of the 
                    <italic toggle="yes">limma</italic>, 
                    <italic toggle="yes">edgeR</italic>, or 
                    <italic toggle="yes">DESeq2</italic> packages.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">design_full &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">model.matrix</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">~</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">condition,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">data=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">DRIMSeq</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">samples</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">colnames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(design_full)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] "(Intercept)" "condition2"</styled-content>
                    </preformat>
                </p>
                <p>Solely to speed up the live code chunks in this workflow, we subset to the first 250 genes, which represent about one thirtieth of the dataset. This step would not be run in a typical workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">d &lt;- d[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">250</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,]</styled-content>

                        <styled-content style="font-size:15px;color:#0000CF;">7764</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">250</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] 31.056</styled-content>
                    </preformat>
                </p>
                <p>We then use the following three functions to estimate the model parameters and test for DTU. We first estimate the 
                    <italic toggle="yes">precision</italic>, which is related to the dispersion in the Dirichlet-multinomial model via the formula below. Because precision is in the denominator of the right-hand side of the equation, the two quantities are inversely related. Higher 
                    <italic toggle="yes">dispersion</italic> &#x2013; counts more variable around their expected value &#x2013; is associated with lower 
                    <italic toggle="yes">precision</italic>. For full details about the 
                    <italic toggle="yes">DRIMSeq</italic> model, one should read both the detailed software vignette and the publication
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup>. After estimating the precision, we fit regression coefficients and perform null hypothesis testing on the coefficient of interest. Because we have a simple two-group model, we test the coefficient associated with the difference between condition 2 and condition 1, called 
                    <monospace>condition2</monospace>. The following code takes about half a minute, and so a full analysis on this dataset takes about 15 minutes on a laptop.</p>
                <p>
                    <disp-formula id="e1">
                        <mml:math display="block" id="math1">
                            <mml:mrow>
                                <mml:mtext>dispersion</mml:mtext>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mn>1</mml:mn>
                                    <mml:mrow>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>+</mml:mo>
                                        <mml:mtext>precision</mml:mtext>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">set.seed</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">system.time</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">({</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmPrecision</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">design=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">design_full)</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmFit</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">design=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">design_full)</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmTest</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">coef=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"condition2"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>   

                        <styled-content style="font-size:15px;color:#000000;">})</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## ! Using a subset of 0.1 genes to estimate common precision !</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## ! Using common_precision = 21.2862 as prec_init !</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## ! Using 0 as a shrinkage factor !</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##   user   system  elapsed
## 34.213    0.450   35.846</styled-content>
                    </preformat>
                </p>
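                <p>As a quick numerical check of the relationship between dispersion and precision, the common precision reported in the message above (21.2862) can be converted to a dispersion by hand. This is only a worked example of the formula; the gene-wise precisions estimated by 
                    <monospace>dmPrecision</monospace> will differ from this common value, and the variable names below are chosen for illustration.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># Common precision taken from the message printed by dmPrecision above
precision &lt;- 21.2862

# dispersion = 1 / (1 + precision)
dispersion &lt;- 1 / (1 + precision)
dispersion   # roughly 0.045

# A larger precision corresponds to a smaller dispersion
1 / (1 + 100) &lt; dispersion</preformat>
                </p>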
                <p>To build a results table, we run the 
                    <monospace>results</monospace> function. We can generate a single p-value per gene, which tests whether there is any differential transcript usage within the gene, or a single p-value per transcript, which tests whether the proportion of that transcript within the gene has changed:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">res &lt;- DRIMSeq</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">results</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##              gene_id        lr df       pvalue    adj_pvalue
## 1 ENSG00000000457.13  1.493561  4 8.277814e-01  9.120246e-01
## 2 ENSG00000000460.16  1.068294  3 7.847330e-01  9.101892e-01
## 3 ENSG00000000938.12  4.366806  2 1.126575e-01  2.750169e-01
## 4 ENSG00000001084.11  1.630085  3 6.525877e-01  8.643316e-01
## 5 ENSG00000001167.14 28.402587  1 9.853354e-08  5.007113e-07
## 6 ENSG00000001461.16  9.815460  1 1.730510e-03  6.732766e-03</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">res.txp &lt;- DRIMSeq</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">results</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">level=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"feature"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##              gene_id         feature_id          lr df    pvalue adj_pvalue
## 1 ENSG00000000457.13 ENST00000367771.10  0.16587607  1 0.6838032  0.9171007
## 2 ENSG00000000457.13  ENST00000367770.5  0.01666448  1 0.8972856  0.9788571
## 3 ENSG00000000457.13  ENST00000367772.8  1.02668495  1 0.3109386  0.6667146
## 4 ENSG00000000457.13  ENST00000423670.1  0.06046507  1 0.8057624  0.9323782
## 5 ENSG00000000457.13  ENST00000470238.1  0.28905766  1 0.5908250  0.8713427
## 6 ENSG00000000460.16  ENST00000496973.5  0.83415788  1 0.3610730  0.7232298</styled-content>
                    </preformat>
                </p>
                <p>Because the 
                    <monospace>pvalue</monospace> column may contain 
                    <monospace>NA</monospace> values, we use the following function to turn these into 1&#x2019;s. The 
                    <monospace>NA</monospace> values would otherwise cause problems for the stage-wise analysis.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">no.na &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">function</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x)</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ifelse</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">is.na</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x),</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">, x)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">no.na</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res.txp</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">no.na</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue)</styled-content>
                    </preformat>
                </p>
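                <p>As a brief illustration on a made-up vector (the values are hypothetical), the helper replaces only the missing entries and leaves the other p-values unchanged. Setting a missing p-value to 1 means that the corresponding gene or transcript can never be called significant:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># Same helper as defined above, repeated so that this chunk is self-contained
no.na &lt;- function(x) ifelse(is.na(x), 1, x)

# The NA in the second position becomes 1; 0.01 and 0.5 are returned as-is
no.na(c(0.01, NA, 0.5))</preformat>
                </p>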
                <p>We can plot the estimated proportions for one of the significant genes, where we can see evidence of switching (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">idx &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">which</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">adj_pvalue</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">]</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res[idx,]</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##              gene_id       lr df       pvalue    adj_pvalue
## 5 ENSG00000001167.14 28.40259  1 9.853354e-08  5.007113e-07</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">plotProportions</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d, res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id[idx],</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"condition"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Estimated proportions for one of the significant genes.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure2.gif"/>
                </fig>
            </sec>
            <sec>
                <title>stageR following DRIMSeq</title>
                <p>Because we have been working with only a subset of the data, we now load the results tables that would have been generated by running 
                    <italic toggle="yes">DRIMSeq</italic> functions on the entire dataset.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(drim_tables)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] 7764</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">## [1] 20711</styled-content>
                    </preformat>
                </p>
                <p>A typical analysis of differential transcript usage asks two questions in sequence: first, &#x201c;which genes contain any evidence of DTU?&#x201d;, and second, &#x201c;which transcripts in the genes that contain some evidence may be participating in the DTU?&#x201d; Note that a gene may pass the first stage without exhibiting enough evidence to identify one or more transcripts that are participating in the DTU. The 
                    <italic toggle="yes">stageR</italic> package is designed to allow for such two-stage testing procedures, where the first stage is called a 
                    <italic toggle="yes">screening</italic> stage and the second stage a 
                    <italic toggle="yes">confirmation</italic> stage
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>. The methods are general, and can also be applied to testing, for example, changes across a time series followed by investigation of individual time points, as shown in the 
                    <italic toggle="yes">stageR</italic> package vignette. We show below how 
                    <italic toggle="yes">stageR</italic> is used to detect DTU and how to interpret its output.</p>
                <p>We first construct a vector of p-values for the screening stage. Because of how the 
                    <italic toggle="yes">stageR</italic> package will combine transcript and gene names, we need to strip the gene and transcript version numbers from their Ensembl IDs (this is done by keeping only the first 15 characters of the gene and transcript IDs).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">pScreen &lt;- res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">strp &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">function</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x)</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">substr</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">15</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(pScreen) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id)</styled-content>
                    </preformat>
                </p>
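                <p>As a quick check of the version-stripping step, the sketch below applies the same helper to two versioned identifiers whose version suffixes are made up for illustration. It also shows a regular-expression alternative (the name 
                    <monospace>strp_regex</monospace> is hypothetical and not part of this workflow), which may be more robust when identifiers are not exactly 15 characters long.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Same helper as in the workflow: keep the first 15 characters of each ID
strp <- function(x) substr(x, 1, 15)

# Illustrative versioned Ensembl-style IDs (the version suffixes are made up)
ids <- c("ENSG00000001167.14", "ENST00000341376.10")
stripped <- strp(ids)
print(stripped)

# Hypothetical alternative: drop everything from the first "." onward,
# which does not rely on the ID being exactly 15 characters long
strp_regex <- function(x) sub("\\..*$", "", x)
stripped_regex <- strp_regex(ids)

# Both approaches agree for these 15-character IDs
stopifnot(identical(stripped, stripped_regex))
```
]]></preformat>
                </p>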
                <p>We construct a one-column matrix of the confirmation p-values:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">pConfirmation &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">matrix</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ncol=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(pConfirmation) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">feature_id)</styled-content>
                    </preformat>
                </p>
                <p>We arrange a two-column 
                    <italic toggle="yes">data.frame</italic> with the transcript and gene identifiers.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">tx2gene &lt;- res.txp[,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"feature_id"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"gene_id"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">for</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">(i</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">in</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">) tx2gene[,i] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tx2gene[,i])</styled-content>
                    </preformat>
                </p>
                <p>The following functions then perform the 
                    <italic toggle="yes">stageR</italic> analysis. We must specify an 
                    <monospace>alpha</monospace>, which will be the 
                    <italic toggle="yes">overall false discovery rate</italic> target for the analysis, defined below. Unlike with typical adjusted p-values or q-values, we cannot choose a different threshold later: after specifying 
                    <monospace>alpha=0.05</monospace>, we must use 5% as the target in all downstream steps. There are also convenience functions 
                    <italic toggle="yes">getSignificantGenes</italic> and 
                    <italic toggle="yes">getSignificantTx</italic>, which are demonstrated in the 
                    <italic toggle="yes">stageR</italic> vignette.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageR)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">stageRObj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stageRTx</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">pScreen=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pScreen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pConfirmation=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pConfirmation,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">pScreenAdjusted=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tx2gene=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">tx2gene)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">stageRObj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stageWiseAdjustment</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageRObj,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">method=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"dtu"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">alpha=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content> 

                        <styled-content style="font-size:15px;color:#214A87;">suppressWarnings</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">({</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">drim.padj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getAdjustedPValues</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageRObj,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">order=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                                      
                        <styled-content style="font-size:15px;color:#214A87;">onlySignificantGenes=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">})</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(drim.padj)</styled-content>


                        <styled-content style="font-size:15px;color:#000000;">
##            geneID            txID         gene transcript
## 1 ENSG00000001167 ENST00000341376 1.446731e-05   0.000000
## 2 ENSG00000001167 ENST00000353205 1.446731e-05   0.000000
## 3 ENSG00000001461 ENST00000003912 8.263160e-03   0.000000
## 4 ENSG00000001461 ENST00000339255 8.263160e-03   0.000000
## 5 ENSG00000001631 ENST00000394507 1.287012e-04   0.060474
## 6 ENSG00000001631 ENST00000475770 1.287012e-04   1.000000</styled-content>
                    </preformat>
                </p>
                <p>The final table with adjusted p-values summarizes the information from the two-stage analysis. Only genes that passed the screening stage are included in the table, so the table already represents 
                    <italic toggle="yes">screened</italic> genes. The transcripts with values in the 
                    <monospace>transcript</monospace> column less than 0.05 pass the 
                    <italic toggle="yes">confirmation</italic> stage at a target 5% 
                    <italic toggle="yes">overall false discovery rate</italic>, or OFDR. This means that, in expectation, no more than 5% of the genes that pass screening will either (1) not contain any DTU, and so be falsely screened genes, or (2) contain a transcript with a transcript-level adjusted p-value less than 0.05 that does not participate in DTU, and so contain a falsely confirmed transcript. The 
                    <italic toggle="yes">stageR</italic> procedure allows us to look at both the genes that passed the screening stage and the transcripts with adjusted p-values less than our target 
                    <monospace>alpha</monospace>, and to understand what kind of 
                    <italic toggle="yes">overall</italic> error rate this procedure entails. This cannot be said for an arbitrary procedure of looking at standard gene adjusted p-values and transcript adjusted p-values, where the adjustments were performed independently.</p>
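                <p>To illustrate how such a table might be read, the following sketch rebuilds the six rows printed above as a small 
                    <italic toggle="yes">data.frame</italic> (named 
                    <monospace>padj.example</monospace> here, a hypothetical name) and counts screened genes and confirmed transcripts at the same 5% target. It uses only base R and is not a substitute for the 
                    <italic toggle="yes">stageR</italic> convenience functions.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# The six rows shown in the output above, re-entered by hand for illustration
padj.example <- data.frame(
  geneID = c("ENSG00000001167", "ENSG00000001167", "ENSG00000001461",
             "ENSG00000001461", "ENSG00000001631", "ENSG00000001631"),
  txID = c("ENST00000341376", "ENST00000353205", "ENST00000003912",
           "ENST00000339255", "ENST00000394507", "ENST00000475770"),
  gene = c(1.446731e-05, 1.446731e-05, 8.263160e-03,
           8.263160e-03, 1.287012e-04, 1.287012e-04),
  transcript = c(0, 0, 0, 0, 0.060474, 1),
  stringsAsFactors = FALSE
)

# Must match the alpha that was given to stageWiseAdjustment
alpha <- 0.05

# Every gene in the table has already passed the screening stage
n.screened <- length(unique(padj.example$geneID))

# Transcripts that also pass the confirmation stage
confirmed <- padj.example[padj.example$transcript < alpha, ]

# Screened genes for which no individual transcript could be confirmed
unconfirmed.genes <- setdiff(padj.example$geneID, confirmed$geneID)
print(unconfirmed.genes)
```
]]></preformat>
                </p>
                <p>In these six rows, one gene passes screening without any of its transcripts passing confirmation, which is the situation described at the start of this section.</p>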
            </sec>
            <sec>
                <title>Post-hoc filtering on the standard deviation in proportions</title>
                <p>We found that 
                    <italic toggle="yes">DRIMSeq</italic> was sensitive in detecting DTU, but could exceed its false discovery rate (FDR) bounds, particularly on the transcript-level tests, and that a post-hoc, non-specific filtering of the 
                    <italic toggle="yes">DRIMSeq</italic> transcript p-values improved the FDR control. We considered the standard deviation (SD) of the per-sample proportions as a filtering statistic. This statistic does not use the information about which samples belong to which condition group. We set the p-values for transcripts with a small per-sample proportion SD (below 0.1 in the code that follows) to 1 and then re-computed the adjusted p-values using the method of Benjamini and Hochberg
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup>. Excluding transcripts with small SD of the per-sample proportions brought the observed FDR closer to its nominal target in the simulation considered here, as shown below.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">res.txp.filt &lt;- DRIMSeq</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">results</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">level=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"feature"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">getSampleProportions &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">function</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d) {</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">cts &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.matrix</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">subset</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">select=</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(gene_id, feature_id)))</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">gene.cts &lt;-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rowsum</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts,</styled-content>  
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id)</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">total.cts &lt;- gene.cts[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(gene.cts)),]</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">cts</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">total.cts</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">}</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">prop.d &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getSampleProportions</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res.txp.filt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">prop.sd &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sqrt</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rowVars</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(prop.d))</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res.txp.filt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue[res.txp.filt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">prop.sd</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">.</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res.txp.filt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">adj_pvalue &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">p.adjust</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp.filt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">method=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"BH"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
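                <p>The effect of this filter can be seen on a toy example. The sketch below uses a made-up matrix of per-sample proportions and made-up p-values (none of these numbers come from the simulation, and all object names are hypothetical). It computes the per-transcript SD with base R rather than 
                    <monospace>rowVars</monospace>, sets the p-values of transcripts with SD below 0.1 to 1, and re-adjusts with the Benjamini-Hochberg method.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Made-up per-sample proportions: 4 transcripts (rows) x 4 samples (columns)
prop.toy <- rbind(
  tx1 = c(0.50, 0.52, 0.49, 0.51),  # nearly constant across samples
  tx2 = c(0.20, 0.25, 0.70, 0.75),  # varies strongly across samples
  tx3 = c(0.80, 0.75, 0.30, 0.25),  # varies strongly across samples
  tx4 = c(0.10, 0.11, 0.10, 0.09)   # nearly constant across samples
)

# Made-up transcript-level p-values
pvalue.toy <- c(tx1 = 0.001, tx2 = 0.002, tx3 = 0.003, tx4 = 0.04)

# SD of the proportions per transcript; condition labels are not used
prop.sd <- apply(prop.toy, 1, sd)

# Post-hoc filter: transcripts with small SD get a p-value of 1
pvalue.filt <- pvalue.toy
pvalue.filt[prop.sd < 0.1] <- 1

# Re-compute adjusted p-values after filtering
padj.filt <- p.adjust(pvalue.filt, method = "BH")
print(padj.filt)
```
]]></preformat>
                </p>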
                <p>The above post-hoc filter is not part of the 
                    <italic toggle="yes">DRIMSeq</italic> modeling steps, and to avoid interfering with the modeling, we run it after 
                    <italic toggle="yes">DRIMSeq</italic>. The other three filters used before have been tested by the 
                    <italic toggle="yes">DRIMSeq</italic> package authors, and are therefore a recommended part of an analysis before the modeling begins.</p>
            </sec>
            <sec>
                <title>DEXSeq</title>
                <p>The 
                    <italic toggle="yes">DEXSeq</italic> package was originally designed for detecting differential exon usage
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup>, but can also be adapted to run on estimated transcript counts, in order to detect DTU. Using 
                    <italic toggle="yes">DEXSeq</italic> on transcript counts was evaluated by Soneson 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref-33">33</xref>
                    </sup>, showing the benefits in FDR control from filtering lowly expressed transcripts for a transcript-level analysis. We benchmarked 
                    <italic toggle="yes">DEXSeq</italic> here, beginning with the 
                    <italic toggle="yes">DRIMSeq</italic> filtered object, as these filters are intuitive, they greatly speed up the analysis, and such filtering was shown to be beneficial in FDR control.</p>
                <p>The two factors of (1) working on isoform counts rather than individual exons and (2) using the 
                    <italic toggle="yes">DRIMSeq</italic> filtering procedure dramatically increase the speed of 
                    <italic toggle="yes">DEXSeq</italic>, compared to running an exon-level analysis. Another advantage is that we benefit from the sophisticated bias models of 
                    <italic toggle="yes">Salmon</italic>, which account for drops in coverage on alternative exons that can otherwise throw off estimates of transcript abundance
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup>. A disadvantage relative to the exon-level analysis is that we must know in advance all of the possible isoforms that can be generated from a gene locus, all of which are assumed to be contained in the annotation files (FASTA and GTF).</p>
                <p>We first load the 
                    <italic toggle="yes">DEXSeq</italic> package and then build a 
                    <italic toggle="yes">DEXSeqDataSet</italic> from the data contained in the 
                    <italic toggle="yes">dmDStest</italic> object (the class of the 
                    <italic toggle="yes">DRIMSeq</italic> object changes as the results are added). The design formula of the 
                    <italic toggle="yes">DEXSeqDataSet</italic> here uses the language &#x201c;exon&#x201d; but this should be read as &#x201c;transcript&#x201d; for our analysis. 
                    <italic toggle="yes">DEXSeq</italic> will test &#x2013; after accounting for total gene expression for this sample and for the proportion of this transcript relative to the others &#x2013; whether there is a condition-specific difference in the transcript proportion relative to the others. The testing of &#x201c;this&#x201d; vs &#x201c;others&#x201d; in 
                    <italic toggle="yes">DEXSeq</italic> enables it to be much faster than its original published version, which involved fitting coefficients for each exon within a gene (here it would have been for each transcript within a gene).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(DEXSeq)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">sample.data &lt;- DRIMSeq</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">samples</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">count.data &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">round</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">as.matrix</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)[,</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)]))</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">dxd &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DEXSeqDataSet</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">countData=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">count.data,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#214A87;">sampleData=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">sample.data,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#214A87;">design=</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">~</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">sample</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">exon</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">condition</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">exon,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#214A87;">featureID=counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">feature_id,</styled-content>
                       
                        <styled-content style="font-size:15px;color:#214A87;">groupID=counts</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(d)</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id)</styled-content>
                    </preformat>
                </p>
                <p>The following functions run the 
                    <italic toggle="yes">DEXSeq</italic> analysis. Although we are working with only a subset of the data here, the full analysis for this dataset took less than 3 minutes on a laptop.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">system.time</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">({</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">dxd &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">estimateSizeFactors</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxd)</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">dxd &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">estimateDispersions</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxd,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">quiet=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">dxd &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">nbinomLRT</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxd,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">reduced=</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">~</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">sample</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">exon)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">})</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">
##    user system elapsed
##   7.084  0.064   7.184</styled-content>
                    </preformat>
                </p>
                <p>We then extract the results table, not filtering on mean counts (as we have already conducted filtering via 
                    <italic toggle="yes">DRIMSeq</italic> functions). We compute a per-gene adjusted p-value, using the 
                    <italic toggle="yes">perGeneQValue</italic> function, which aggregates evidence from multiple tests within a gene to a single p-value for the gene and then corrects for multiple testing across genes
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup>. Other methods for aggregating evidence from the multiple tests within genes have been discussed in a recent publication and may be substituted at this step
                    <sup>
                        <xref ref-type="bibr" rid="ref-34">34</xref>
                    </sup>. Finally, we build a simple results table with the per-gene adjusted p-values.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">dxr &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DEXSeqResults</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxd,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">independentFiltering=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">qval &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">perGeneQValue</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">dxr.g &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">data.frame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">gene=names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(qval),qval)</styled-content>
                    </preformat>
                </p>
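                <p>To make the idea of per-gene aggregation concrete, the following base R sketch applies a simple stand-in procedure: the smallest transcript p-value within each gene is corrected for the number of tests in that gene (Šidák correction), and the resulting per-gene p-values are then adjusted across genes with the Benjamini-Hochberg method. This is an illustration of the general approach only and is not the algorithm implemented in 
                    <monospace>perGeneQValue</monospace>; the p-values, gene labels and object names are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># toy transcript-level p-values and gene labels (hypothetical values)
pvals &lt;- c(0.001, 0.20, 0.03, 0.50, 0.70, 0.04)
gene &lt;- c("g1", "g1", "g2", "g2", "g2", "g3")

# smallest p-value and number of tests within each gene
p.min &lt;- tapply(pvals, gene, min)
n.tx &lt;- tapply(pvals, gene, length)

# Sidak correction of the minimum p-value for the number of tests per gene
p.gene &lt;- 1 - (1 - p.min)^n.tx

# Benjamini-Hochberg correction across genes
q.gene &lt;- p.adjust(p.gene, method="BH")</styled-content>
                    </preformat>
                </p>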
                <p>To keep the workflow R package small, we also reduce the transcript-level results table to a simple 
                    <italic toggle="yes">data.frame</italic>:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">columns &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"featureID"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"groupID"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"pvalue"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">dxr &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.data.frame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr[,columns])</styled-content>
                    </preformat>
                </p>
            </sec>
            <sec>
                <title>stageR following DEXSeq</title>
                <p>Again, as we have been working with only a subset of the data, we now load the results tables that would have been generated by running 
                    <italic toggle="yes">DEXSeq</italic> functions on the entire dataset.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dex_tables)</styled-content>
                    </preformat>
                </p>
                <p>If the 
                    <italic toggle="yes">stageR</italic> package has not already been loaded, we make sure to load it, and run code very similar to that used above for 
                    <italic toggle="yes">DRIMSeq</italic> two-stage testing, with a target 
                    <monospace>alpha=0.05</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageR)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">strp &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">function</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x)</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">substr</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">15</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">pConfirmation &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">matrix</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pvalue,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">ncol=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">dimnames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(pConfirmation) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">list</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr</styled-content>
                        <styled-content style="font-size:15px;color: #CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">featureID),</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"transcript"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">pScreen &lt;- qval</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(pScreen) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(pScreen))</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">tx2gene &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.data.frame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr[,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"featureID"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"groupID"</styled-content>)])

                        <styled-content style="font-size:15px;color:#214A87;">for</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">(i</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">in</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">) tx2gene[,i] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tx2gene[,i])</styled-content>
                    </preformat>
                </p>
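                <p>The helper 
                    <monospace>strp</monospace> defined above keeps the first 15 characters of each identifier. For Ensembl gene and transcript identifiers, which consist of a four-letter prefix followed by 11 digits, this removes the trailing version suffix. A minimal sketch of its effect is shown here; the version numbers in the toy identifiers are hypothetical, and the 
                    <monospace>sub</monospace> call is an alternative that does not assume a fixed identifier width.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">strp &lt;- function(x) substr(x, 1, 15)

# toy identifiers with hypothetical version suffixes
ids &lt;- c("ENSG00000001167.14", "ENST00000341376.10")

# fixed-width trimming, as used above
ids.fixed &lt;- strp(ids)

# alternative: drop everything from the first period onwards
ids.regex &lt;- sub("\\..*", "", ids)

stopifnot(identical(ids.fixed, ids.regex))</styled-content>
                    </preformat>
                </p>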
                <p>The following three functions provide a table of adjusted p-values with the OFDR control described above. To repeat, among the genes that pass screening, no more than 5% in total should be either genes that in fact have no DTU or genes that contain a transcript with an adjusted p-value less than 5% that does not participate in DTU.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">stageRObj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stageRTx</styled-content>(
                        <styled-content style="font-size:15px;color:#214A87;">pScreen=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pScreen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pConfirmation=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pConfirmation,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">pScreenAdjusted=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tx2gene=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">tx2gene)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">stageRObj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stageWiseAdjustment</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageRObj,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">method=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"dtu"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">alpha=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">suppressWarnings</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">({</styled-content>
  
                        <styled-content style="font-size:15px;color:#000000;">dex.padj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getAdjustedPValues</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(stageRObj,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">order=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                                     
                        <styled-content style="font-size:15px;color:#214A87;">onlySignificantGenes=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">})</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dex.padj)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">##            geneID            txID         gene transcript
## 1 ENSG00000001167 ENST00000341376 0.0000877079          0
## 2 ENSG00000001167 ENST00000353205 0.0000877079          0
## 3 ENSG00000001461 ENST00000003912 0.0051524663          0
## 4 ENSG00000001461 ENST00000339255 0.0051524663          0
## 5 ENSG00000001630 ENST00000003100 0.0234729668          0
## 6 ENSG00000001630 ENST00000450723 0.0234729668          0</styled-content>
                    </preformat>
                </p>
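                <p>As a sketch of how such a table might be used downstream, the following base R code selects the transcripts whose stage-wise adjusted p-value falls below the target level and counts them per gene. The toy table only mimics the column names of the output above; its values and the object names 
                    <monospace>padj</monospace>, 
                    <monospace>sig.tx</monospace> and 
                    <monospace>n.sig</monospace> are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># toy table with the same columns as the stageR output (hypothetical values)
padj &lt;- data.frame(
  geneID=c("geneA", "geneA", "geneB", "geneB"),
  txID=c("txA1", "txA2", "txB1", "txB2"),
  gene=c(0.001, 0.001, 0.020, 0.020),
  transcript=c(0.000, 0.300, 0.010, 0.040)
)

alpha &lt;- 0.05

# transcripts confirmed at the second stage
sig.tx &lt;- padj[padj$transcript &lt; alpha, ]

# number of confirmed transcripts per gene
n.sig &lt;- table(sig.tx$geneID)</styled-content>
                    </preformat>
                </p>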
            </sec>
            <sec>
                <title>SUPPA2</title>
                <p>
                    <italic toggle="yes">SUPPA2</italic> is a command-line software package written in Python that also takes as input 
                    <italic toggle="yes">Salmon</italic> quantification, and so, for completeness, we also show example commands and evaluate its performance on the simulated data
                    <sup>
                        <xref ref-type="bibr" rid="ref-35">35</xref>
                    </sup>. 
                    <italic toggle="yes">SUPPA2</italic> offers a number of distinct features, including the ability to translate from 
                    <italic toggle="yes">Salmon</italic> transcript-level quantifications to individual splicing events, which are cataloged using a specific vocabulary described in the 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/comprna/SUPPA">SUPPA2 software usage guide</ext-link>. 
                    <italic toggle="yes">SUPPA2</italic> additionally offers differential analysis on the splicing events, which may be more valuable to investigators than per-transcript results, depending on the research goals (similar to the exon-level primary use case of 
                    <italic toggle="yes">DEXSeq</italic>).</p>
                <p>Here, as our DTU simulation involved switching between expressed transcripts without assessing whether they were separated by one or more splice events, and as the other two Bioconductor methods for detecting DTU involve transcript-level analysis, we ran 
                    <italic toggle="yes">SUPPA2</italic> in its differential transcript usage mode. We chose to filter on transcripts with TPM larger than 1; TPM filtering is a command-line option available during the 
                    <monospace>diffSplice</monospace> step of 
                    <italic toggle="yes">SUPPA2</italic>, and it improves the running time. We did not use gene-correction, as we wanted to apply the aggregation and correction method 
                    <monospace>perGeneQValue</monospace> from 
                    <italic toggle="yes">DEXSeq</italic> to obtain an FDR-bounded set of genes and transcripts as output. We did not perform the stage-wise analysis of 
                    <italic toggle="yes">SUPPA2</italic> output, although this could be done by small modifications to the above code for either 
                    <italic toggle="yes">DRIMSeq</italic> or 
                    <italic toggle="yes">DEXSeq</italic>.</p>
                <p>We used the following R code to prepare two files containing TPM estimates for each of the two groups, using the 
                    <italic toggle="yes">tximport</italic> object defined above:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">x &lt;- txi</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">abundance</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">x[x</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0.01</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content> 
                        <styled-content style="font-size:15px;color:#8F5903;"># eliminate very small TPMs</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">n &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">6</styled-content> 
                        <styled-content style="font-size:15px;color:#8F5903;"># sample size per group</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">write.table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">n],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">file=paste0</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"suppa/group1.tpm"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">quote=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>, 
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"\t"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">write.table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(x[,n</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">n],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">file=paste0</styled-content>(
                        <styled-content style="font-size:15px;color:#4F9905;">"suppa/group2.tpm"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">quote=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>, 
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"\t"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
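                <p>The column index 
                    <monospace>n + 1:n</monospace> in the second 
                    <monospace>write.table</monospace> call relies on operator precedence in R: the colon operator binds more tightly than addition, so the expression is evaluated as 
                    <monospace>n + (1:n)</monospace> and selects the second group of 
                    <monospace>n</monospace> columns. The short check below illustrates this; the names 
                    <monospace>idx.group1</monospace>, 
                    <monospace>idx.group2</monospace> and 
                    <monospace>idx.explicit</monospace> are introduced only for this example.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">n &lt;- 6 # sample size per group, as above

idx.group1 &lt;- 1:n
idx.group2 &lt;- n + 1:n # evaluated as n + (1:n)

# an equivalent, more explicit spelling of the second index
idx.explicit &lt;- (n + 1):(2 * n)

stopifnot(all(idx.group2 == idx.explicit))</styled-content>
                    </preformat>
                </p>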
                <p>The 
                    <italic toggle="yes">SUPPA2</italic> example code can be found at the software homepage, but we include here the code used for the 6 vs 6 analysis. The first line generates a set of isoforms from the GTF file. The second and third lines generate PSI (percent spliced in) estimates for each transcript from files containing the TPMs for each group. The final line performs the differential analysis.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">python suppa.py generateEvents -f ioi -i gencode.v28.annotation.gtf \
  -o suppa/isoforms
python suppa.py psiPerIsoform -g gencode.v28.annotation.gtf \
  -e suppa/group1.tpm -o suppa/group1
python suppa.py psiPerIsoform -g gencode.v28.annotation.gtf \
  -e suppa/group2.tpm -o suppa/group2
python suppa.py diffSplice -m empirical -th 1 -i suppa/isoforms.ioi \
  -p suppa/group1_isoform.psi suppa/group2_isoform.psi \
  -e suppa/group1.tpm suppa/group2.tpm -o suppa/diff_empirical</styled-content>
                    </preformat>
                </p>
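                <p>For transcript-level analysis, the PSI of an isoform can be read as its relative abundance within the gene: its TPM divided by the summed TPM of all isoforms of that gene. The following base R sketch computes this quantity, and the difference in PSI between two groups, on toy values. It is meant only to illustrate the quantities involved and does not reproduce the 
                    <italic toggle="yes">SUPPA2</italic> implementation; all names and numbers are hypothetical, and the direction of the difference (second group minus first) is an assumption made for the example.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># toy gene labels and per-group TPMs for four transcripts (hypothetical)
gene &lt;- c("g1", "g1", "g2", "g2")
tpm1 &lt;- c(tx1=8, tx2=2, tx3=5, tx4=5) # group 1
tpm2 &lt;- c(tx1=5, tx2=5, tx3=5, tx4=5) # group 2

# PSI: transcript TPM divided by the total TPM of its gene
psi1 &lt;- tpm1 / ave(tpm1, gene, FUN=sum)
psi2 &lt;- tpm2 / ave(tpm2, gene, FUN=sum)

# change in PSI between groups (assumed here: group 2 minus group 1)
dpsi &lt;- psi2 - psi1</styled-content>
                    </preformat>
                </p>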
                <p>We imported the analysis results into R:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">suppa &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">read.delim</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"suppa/diff_empirical.dpsi"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(suppa) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"txp.gene"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"dpsi"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"pval"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sub</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">";.*"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">""</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txp.gene)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txp &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sub</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">".*;"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">""</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txp.gene)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa &lt;- suppa[</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">!</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">is.nan</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">dpsi),]</styled-content>
                    </preformat>
                </p>
                <p>The following lines were used to compute transcript-level adjusted p-values. We noticed that 
                    <italic toggle="yes">SUPPA2</italic> had a large gain in sensitivity, while still controlling its FDR, if the set of transcripts examined was limited to those that passed the 
                    <italic toggle="yes">DRIMSeq</italic> filtering steps above. Therefore, before running any multiple testing correction, we filtered to this subset of transcripts. We also assessed whether the TPM 
                    <italic toggle="yes">&gt;</italic> 1 filtering step made a difference in the sensitivity and false discovery rate for 
                    <italic toggle="yes">SUPPA2</italic> when combined with the 
                    <italic toggle="yes">DRIMSeq</italic> filtering; it did not.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">suppa &lt;- suppa[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.txp</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">feature_id, suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txp),]</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">padj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">p.adjust</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pval,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">method=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"BH"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
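                <p>As a minimal sketch of why the order of filtering and correction matters, the toy example here applies the Benjamini-Hochberg adjustment with and without first subsetting. The p-values and the 
                    <monospace>keep</monospace> indicator are invented for illustration and are not taken from the simulation. In this example the adjusted p-values are smaller when the subset is taken first, because fewer tests enter the correction; this is not guaranteed in general, as it depends on which p-values are removed.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# toy p-values for six transcripts (invented for illustration)
pval &lt;- c(t1=0.001, t2=0.01, t3=0.02, t4=0.2, t5=0.5, t6=0.8)

# hypothetical filter: TRUE for transcripts that pass filtering
keep &lt;- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# adjust over all transcripts, then look at the kept ones
padj.all &lt;- p.adjust(pval, method="BH")

# subset first, then adjust over the kept transcripts only
padj.filt &lt;- p.adjust(pval[keep], method="BH")

# side-by-side comparison for the kept transcripts
cbind(all=padj.all[keep], filtered=padj.filt)
</preformat>
                </p>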
                <p>We generated per-gene adjusted p-values, using 
                    <italic toggle="yes">perGeneQValue</italic> from 
                    <italic toggle="yes">DEXSeq</italic>:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(DEXSeq)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa.dxr &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">DataFrame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">groupID=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene,</styled-content>
                             
                        <styled-content style="font-size:15px;color:#214A87;">pvalue=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">suppa</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">pval,</styled-content>
                             
                        <styled-content style="font-size:15px;color:#214A87;">padj=rep</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(suppa))),</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"DEXSeqResults"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">qval &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">perGeneQValue</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(suppa.dxr)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">suppa.g &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">data.frame</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">gene=names</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(qval),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">qval=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">qval)</styled-content>
                    </preformat>
                </p>
            </sec>
            <sec>
                <title>Citing methods in published research</title>
                <p>This concludes the DTU section of the workflow. If you use 
                    <italic toggle="yes">DRIMSeq</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup>, 
                    <italic toggle="yes">DEXSeq</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup>, 
                    <italic toggle="yes">SUPPA2</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-35">35</xref>
                    </sup>, 
                    <italic toggle="yes">stageR</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>, 
                    <italic toggle="yes">tximport</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>, or 
                    <italic toggle="yes">Salmon</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup> in published research, please cite the relevant methods publications, which can be found in the References section of this workflow.</p>
            </sec>
        </sec>
        <sec>
            <title>Evaluation of methods for DTU</title>
            <p>We begin the evaluation by noting that all of the methods correctly avoided calling most of the DGE genes as DTU. The object 
                <monospace>dge.genes</monospace> contains the names of all the genes in which all the isoforms were differentially expressed by an equal amount (and which therefore do not exhibit DTU). 
                <italic toggle="yes">SUPPA2</italic> output is not included in the workflow, but it reported only one of the DGE genes as DTU out of 851 with an adjusted p-value less than 0.05.</p>
            <p>The number of DGE genes called in DTU analysis with 
                <italic toggle="yes">DRIMSeq</italic>:</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">res</styled-content>
                    <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">dge &lt;- res</styled-content>
                    <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">gene_id</styled-content> 
                    <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                    <styled-content style="font-size:15px;color:#000000;">dge.genes</styled-content>

                    <styled-content style="font-size:15px;color:#214A87;">with</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">(res,</styled-content> 
                    <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                    <styled-content style="font-size:15px;color:#214A87;">sig=</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">adj_pvalue</styled-content> 
                    <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                    <styled-content style="font-size:15px;color:#000000;">.</styled-content>
                    <styled-content style="font-size:15px;color:#0000CF;">05</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">, dge))</styled-content>
                </preformat>
            </p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">##        dge
## sig     FALSE TRUE
##   FALSE  5375  754
##   TRUE   1590   17</styled-content>
                </preformat>
            </p>
            <p>The number of DGE genes called in DTU analysis with 
                <italic toggle="yes">DEXSeq</italic>:</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">dxr.g</styled-content>
                    <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">dge &lt;- dxr.g</styled-content>
                    <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">gene</styled-content> 
                    <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                    <styled-content style="font-size:15px;color:#000000;">dge.genes</styled-content>

                    <styled-content style="font-size:15px;color:#214A87;">with</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">(dxr.g,</styled-content> 
                    <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                    <styled-content style="font-size:15px;color:#214A87;">sig=</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">qval</styled-content> 
                    <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                    <styled-content style="font-size:15px;color:#000000;">.</styled-content>
                    <styled-content style="font-size:15px;color:#0000CF;">05</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">, dge))</styled-content>
                </preformat>
            </p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">##        dge
## sig     FALSE TRUE
##   FALSE  5538  769
##   TRUE   1455    2</styled-content>
                </preformat>
            </p>
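            <p>The two tables can be summarized as the fraction of DGE genes that were called as DTU, which comes to roughly 2% for 
                <italic toggle="yes">DRIMSeq</italic> and well under 1% for 
                <italic toggle="yes">DEXSeq</italic>. A minimal sketch of this arithmetic follows, with the printed counts re-entered by hand; the names 
                <monospace>drim.tab</monospace> and 
                <monospace>dex.tab</monospace> are illustrative and are not objects from the workflow.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# counts copied from the printed tables (rows: sig, columns: dge)
dn &lt;- list(sig=c("FALSE","TRUE"), dge=c("FALSE","TRUE"))
drim.tab &lt;- matrix(c(5375, 1590, 754, 17), nrow=2, dimnames=dn)
dex.tab &lt;- matrix(c(5538, 1455, 769, 2), nrow=2, dimnames=dn)

# fraction of DGE genes with adjusted p-value or q-value below 0.05
drim.frac &lt;- drim.tab["TRUE","TRUE"] / sum(drim.tab[,"TRUE"])
dex.frac &lt;- dex.tab["TRUE","TRUE"] / sum(dex.tab[,"TRUE"])

# expressed as percentages
round(100 * c(DRIMSeq=drim.frac, DEXSeq=dex.frac), 1)
</preformat>
            </p>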
            <p>The 
                <italic toggle="yes">iCOBRA</italic> package
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup> was used to construct plots to assess the true positive rate over the false discovery rate at three nominal FDR thresholds: 1%, 5%, and 10%. The code for evaluating all methods and constructing the 
                <italic toggle="yes">iCOBRA</italic> plots is included in the simulation repository
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>
                </sup>. Above, we showed an analysis for a comparison of 6 vs 6 samples. As we were interested in the performance at various sample sizes, we performed the entire analysis for 
                <italic toggle="yes">DRIMSeq</italic>, 
                <italic toggle="yes">DEXSeq</italic>, and 
                <italic toggle="yes">SUPPA2</italic> at per-group sample sizes of 3, 6, 9, and 12.</p>
            <p>At the gene level, in terms of controlling the nominal FDR, 
                <italic toggle="yes">SUPPA2</italic> always controlled its FDR, even for the smallest sample size, 
                <italic toggle="yes">DEXSeq</italic> controlled except for the 1% threshold in the smallest sample size case, and 
                <italic toggle="yes">DRIMSeq</italic> exceeded its FDR but approached the target for larger sample sizes (
                <xref ref-type="fig" rid="f3">Figure 3</xref>). A small excess over the nominal FDR level should be weighed against a method&#x2019;s sensitivity relative to other methods. For example, for the 6 vs 6 comparison, 
                <italic toggle="yes">DRIMSeq</italic> had an observed FDR of 12% at the nominal 10% level, meaning that for every 100 genes reported as containing DTU, about 12 were false discoveries, 2 more than the target allows. 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic> were the most sensitive methods in recovering gene-level DTU in this simulation.</p>
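            <p>To make the comparison between nominal and observed FDR concrete, the following sketch computes the observed FDR and the true positive rate at a given threshold. The function 
                <monospace>fdr_tpr</monospace> and the toy vectors 
                <monospace>padj</monospace> and 
                <monospace>truth</monospace> are illustrative assumptions; they are not part of the workflow or of 
                <italic toggle="yes">iCOBRA</italic>. The toy data are constructed so that 100 genes are called, of which 12 are false positives.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# observed FDR and TPR for calls made at adjusted p-value below alpha
fdr_tpr &lt;- function(padj, truth, alpha) {
  called &lt;- !is.na(padj) &amp; padj &lt; alpha
  fp &lt;- sum(called &amp; !truth)   # called, but not truly DTU
  tp &lt;- sum(called &amp; truth)    # called, and truly DTU
  c(FDR = if (any(called)) fp / sum(called) else 0,
    TPR = tp / sum(truth))
}

# toy data: the first 100 genes are called, 12 of them falsely;
# 200 genes in total are truly DTU
padj &lt;- c(rep(0.01, 100), rep(0.5, 900))
truth &lt;- c(rep(TRUE, 88), rep(FALSE, 12), rep(TRUE, 112), rep(FALSE, 788))

fdr_tpr(padj, truth, alpha=0.1)
</preformat>
            </p>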
            <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                <label>Figure 3. </label>
                <caption>
                    <title>Gene-level screening for differential transcript usage (DTU).</title>
                    <p>True positive rate (y-axis) over false discovery rate (FDR) (x-axis) for DEXSeq, DRIMSeq, and SUPPA2. The four panels shown are for per-group sample sizes: (
                        <bold>A</bold>) 3, (
                        <bold>B</bold>) 6, (
                        <bold>C</bold>) 9, and (
                        <bold>D</bold>) 12. Circles indicate thresholds of 1%, 5%, and 10% nominal FDR, which are filled if the observed value is less than the target (dashed vertical lines).</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure3.gif"/>
            </fig>
            <p>We assessed the overall false discovery rate (OFDR) procedure implemented with 
                <italic toggle="yes">stageR</italic> using gene- and transcript-level p-values from 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic>. For 
                <italic toggle="yes">DRIMSeq</italic>, we assessed whether raising the p-values for transcripts with small proportion SD helped to recover OFDR control. 
                <italic toggle="yes">DEXSeq</italic> input to 
                <italic toggle="yes">stageR</italic> tended to stay within the 5% OFDR target, and the observed OFDR for 
                <italic toggle="yes">DRIMSeq</italic> with proportion SD filtering dropped to around 15% at per-group sample sizes of 6 and higher (
                <xref ref-type="fig" rid="f4">Figure 4</xref>). Without the filtering, the observed OFDR for 
                <italic toggle="yes">DRIMSeq</italic> was around 25%.</p>
            <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                <label>Figure 4. </label>
                <caption>
                    <title>Number of true positives and observed overall false discovery rate (OFDR) using stageR for 5% target.</title>
                    <p>Each method is drawn as a line, and the numbers to the right of the points indicate the per-group sample size. Adjusted p-values for a nominal 5% OFDR (dashed vertical line) were generated for DEXSeq and DRIMSeq (with and without post-hoc filtering) from gene- and transcript-level p-values using the stageR framework for stage-wise testing.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure4.gif"/>
            </fig>
            <p>Finally, we assessed the transcript-level adjusted p-values for DTU directly from 
                <italic toggle="yes">DRIMSeq</italic>, 
                <italic toggle="yes">DEXSeq</italic>, and 
                <italic toggle="yes">SUPPA2</italic>. This analysis did not use 
                <italic toggle="yes">stageR</italic> for stage-wise testing, and so we computed the standard FDR, where the unit of false discovery is the 
                <italic toggle="yes">transcript</italic>, in contrast to the OFDR where the unit of false discovery is the 
                <italic toggle="yes">gene</italic>. In general, we recommend using the 
                <italic toggle="yes">stageR</italic> results, because stage-wise testing provides error control for the natural procedure of first looking across genes, and then looking within the significant genes for the transcripts that participate in DTU. 
                <italic toggle="yes">SUPPA2</italic> again tended to control its FDR, as did 
                <italic toggle="yes">DEXSeq</italic> (
                <xref ref-type="fig" rid="f5">Figure 5</xref>). 
                <italic toggle="yes">DRIMSeq</italic> with proportion SD filtering approached the target FDR as sample size increased for the 5% and 10% targets, while without filtering, the observed FDR was always higher than the target.</p>
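            <p>The difference in the unit of false discovery can be illustrated on a toy table. This is a simplified sketch and not the 
                <italic toggle="yes">stageR</italic> procedure itself: the data frame 
                <monospace>toy</monospace> is invented, a gene is treated as reported when at least one of its transcripts is called, and a reported gene is counted as a false discovery when any of its called transcripts is not truly DTU.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# toy calls for four genes with two transcripts each (invented values)
toy &lt;- data.frame(
  gene   = c("g1","g1","g2","g2","g3","g3","g4","g4"),
  called = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  truth  = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)
)

# transcript as the unit: falsely called transcripts among called transcripts
txp.fdr &lt;- with(toy, sum(called &amp; !truth) / sum(called))

# gene as the unit: genes with any false call among genes with any call
gene.called &lt;- tapply(toy$called, toy$gene, any)
gene.false &lt;- tapply(toy$called &amp; !toy$truth, toy$gene, any)
gene.ofdr &lt;- sum(gene.false) / sum(gene.called)

c(transcript=txp.fdr, gene=gene.ofdr)
</preformat>
            </p>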
            <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                <label>Figure 5. </label>
                <caption>
                    <title>Transcript-level differential transcript usage (DTU) analysis without stage-wise testing.</title>
                    <p>True positive rate (y-axis) over false discovery rate (x-axis) for DEXSeq, DRIMSeq (with and without post-hoc filtering), and SUPPA2. The four panels shown are for per-group sample sizes: (
                        <bold>A</bold>) 3, (
                        <bold>B</bold>) 6, (
                        <bold>C</bold>) 9, and (
                        <bold>D</bold>) 12. Circles indicate thresholds of 1%, 5%, and 10% nominal FDR.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure5.gif"/>
            </fig>
            <p>In 
                <xref ref-type="table" rid="T1">Table 1</xref> we include the timing for each method at various sample sizes. Timing includes only the 
                <monospace>diffSplice</monospace> step of 
                <italic toggle="yes">SUPPA2</italic> (the other steps take less than a minute). For 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic>, we include the timing of the estimation steps (importing counts with 
                <italic toggle="yes">tximport</italic> and filtering takes only a few seconds).</p>
            <table-wrap id="T1" orientation="portrait" position="anchor">
                <label>Table 1. </label>
                <caption>
                    <title>Timing of methods for differential transcript usage (DTU) in hours:minutes by per-group sample size.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Method</th>
                            <th align="left" colspan="1" rowspan="1">n=3</th>
                            <th align="left" colspan="1" rowspan="1">n=6</th>
                            <th align="left" colspan="1" rowspan="1">n=9</th>
                            <th align="left" colspan="1" rowspan="1">n=12</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">DRIMSeq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">0:15</td>
                            <td align="left" colspan="1" rowspan="1">0:15</td>
                            <td align="left" colspan="1" rowspan="1">0:18</td>
                            <td align="left" colspan="1" rowspan="1">0:18</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">DEXSeq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">0:01</td>
                            <td align="left" colspan="1" rowspan="1">0:02</td>
                            <td align="left" colspan="1" rowspan="1">0:04</td>
                            <td align="left" colspan="1" rowspan="1">0:07</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">SUPPA2</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">0:16</td>
                            <td align="left" colspan="1" rowspan="1">0:18</td>
                            <td align="left" colspan="1" rowspan="1">3:48</td>
                            <td align="left" colspan="1" rowspan="1">5:33</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
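            <p>The entries of Table 1 can be converted to minutes to compare how run time scales with sample size. The following is a small sketch with the table values re-entered by hand; the helper 
                <monospace>to_minutes</monospace> is an illustrative name and not part of the workflow.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# convert "hours:minutes" strings to minutes
to_minutes &lt;- function(x) {
  parts &lt;- strsplit(x, ":", fixed=TRUE)
  vapply(parts, function(p) 60 * as.numeric(p[1]) + as.numeric(p[2]),
         numeric(1))
}

# timings copied from Table 1
timing &lt;- rbind(DRIMSeq=c("0:15", "0:15", "0:18", "0:18"),
                DEXSeq=c("0:01", "0:02", "0:04", "0:07"),
                SUPPA2=c("0:16", "0:18", "3:48", "5:33"))
colnames(timing) &lt;- paste0("n=", c(3, 6, 9, 12))

# minutes per method and sample size
mins &lt;- apply(timing, c(1, 2), to_minutes)
mins

# fold increase in run time from n=3 to n=12
round(mins[, "n=12"] / mins[, "n=3"], 1)
</preformat>
            </p>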
        </sec>
        <sec>
            <title>Evaluation with fixed per-gene dispersion</title>
            <p>In order to further investigate performance differences between 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic>, we generated an additional simulation in which genes were assigned Negative Binomial dispersion parameters by matching the gene-level count to the joint distribution of means and dispersions in the GEUVADIS dataset. Transcript-level counts were then generated with all transcripts of a gene assigned the same Negative Binomial dispersion parameter. This contrasts with the main simulation, in which each transcript was assigned its own dispersion parameter, resulting in heterogeneity of dispersion within a gene. As we do not know the degree to which transcripts of a gene would have correlated biological variability in an experimental dataset, we also include the results on this additional simulation for the count-based methods that estimate precision or dispersion, 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic>.</p>
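            <p>The distinction between the two simulation settings can be illustrated in a few lines. This is a toy sketch only and not the simulation code from the repository: the means and dispersions are invented, and counts are drawn with 
                <monospace>rnbinom</monospace> using 
                <monospace>size = 1/dispersion</monospace>, so that the variance is the mean plus the dispersion times the squared mean.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
set.seed(1)
n &lt;- 12                  # number of samples
mu &lt;- c(200, 100, 50)    # toy mean counts for three transcripts of one gene

# fixed per-gene dispersion: one value shared by all transcripts
disp.gene &lt;- 0.1
cts.fixed &lt;- t(sapply(mu, function(m) rnbinom(n, mu=m, size=1/disp.gene)))

# heterogeneous dispersion: each transcript has its own value
disp.txp &lt;- c(0.01, 0.1, 0.5)
cts.het &lt;- t(mapply(function(m, d) rnbinom(n, mu=m, size=1/d),
                    mu, disp.txp))

# both are transcripts x samples count matrices
dim(cts.fixed)
dim(cts.het)
</preformat>
            </p>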
            <p>
                <italic toggle="yes">DRIMSeq</italic>, which estimates a single precision parameter per gene, performed slightly better on this simulation at the gene level (
                <xref ref-type="fig" rid="f6">Figure 6</xref>), although we note that 
                <italic toggle="yes">DRIMSeq</italic> nearly controlled FDR at the gene level already in the main simulation. 
                <italic toggle="yes">DEXSeq</italic> models a separate dispersion parameter for every transcript, and its performance changed less across the two simulations. More improvement was seen for 
                <italic toggle="yes">DRIMSeq</italic> with proportion SD filtering, in the OFDR analysis (
                <xref ref-type="fig" rid="f7">Figure 7</xref>) and in the transcript-level analysis without screening (
                <xref ref-type="fig" rid="f8">Figure 8</xref>). Again, we caveat our comparative evaluation of 
                <italic toggle="yes">DRIMSeq</italic> and 
                <italic toggle="yes">DEXSeq</italic> by noting that we do not know whether various real RNA-seq experiments will more closely reflect within-gene heterogeneous dispersion or fixed dispersion, or something in between.</p>
            <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                <label>Figure 6. </label>
                <caption>
                    <title>Gene-level screening for differential transcript usage (DTU), on the simulation with fixed per-gene dispersions.</title>
                    <p>The four panels shown are for per-group sample sizes: (
                        <bold>A</bold>) 3, (
                        <bold>B</bold>) 6, (
                        <bold>C</bold>) 9, and (
                        <bold>D</bold>) 12. Circles indicate thresholds of 1%, 5%, and 10% nominal FDR.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure6.gif"/>
            </fig>
            <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                <label>Figure 7. </label>
                <caption>
                    <title>Number of true positives and observed overall false discovery rate (OFDR) using stageR for 5% target, on the simulation with fixed per-gene dispersions.</title>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure7.gif"/>
            </fig>
            <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                <label>Figure 8. </label>
                <caption>
                    <title>Transcript-level differential transcript usage (DTU) analysis without stage-wise testing, on the simulation with fixed per-gene dispersions.</title>
                    <p>The four panels shown are for per-group sample sizes: (
                        <bold>A</bold>) 3, (
                        <bold>B</bold>) 6, (
                        <bold>C</bold>) 9, and (
                        <bold>D</bold>) 12. Circles indicate thresholds of 1%, 5%, and 10% nominal FDR.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure8.gif"/>
            </fig>
        </sec>
        <sec>
            <title>DTU analysis complements DGE analysis</title>
            <sec>
                <title>DGE analysis with DESeq2</title>
                <p>In the final section of the workflow containing live code examples, we demonstrate how differential transcript usage, summarized to the gene level, can be visualized alongside differential gene expression analysis results. We use 
                    <italic toggle="yes">tximport</italic> to summarize counts to the gene level and to compute an average transcript length offset for count-based methods
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. We will then show code for using 
                    <italic toggle="yes">DESeq2</italic> and 
                    <italic toggle="yes">edgeR</italic> to assess differential gene expression. Because we have simulated the genes according to three different categories, we can color the final plot by the true simulated state of the genes. We note that we will pair 
                    <italic toggle="yes">DEXSeq</italic> with 
                    <italic toggle="yes">DESeq2</italic> results in the following plot, and 
                    <italic toggle="yes">DRIMSeq</italic> with 
                    <italic toggle="yes">edgeR</italic> results. However, this pairing is arbitrary, and any DTU method can reasonably be paired with any DGE method.</p>
                <p>The following line of code is unevaluated, but was used to generate an object 
                    <monospace>txi.g</monospace> which contains the gene-level counts, abundances and average transcript lengths.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">txi.g &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tximport</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(files,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">type=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"salmon"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tx2gene=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">txdf[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">])</styled-content>
                    </preformat>
                </p>
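                <p>The 
                    <monospace>tx2gene</monospace> argument expects a two-column table with the transcript ID in the first column and the gene ID in the second. Assuming that 
                    <monospace>txdf</monospace> stores the gene ID first and the transcript name second, 
                    <monospace>txdf[,2:1]</monospace> simply swaps the two columns. The following toy illustration uses made-up identifiers and column names, and sums toy counts per gene to show conceptually what summarization to the gene level does; 
                    <italic toggle="yes">tximport</italic> itself additionally handles abundances and transcript lengths.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# toy table with gene ID first, transcript name second (assumed layout)
txdf.toy &lt;- data.frame(GENEID=c("geneA", "geneA", "geneB"),
                       TXNAME=c("txA1", "txA2", "txB1"),
                       stringsAsFactors=FALSE)

# reverse the column order: transcript first, gene second
tx2gene.toy &lt;- txdf.toy[, 2:1]
colnames(tx2gene.toy)

# toy transcript-level counts, summed within each gene
cts &lt;- c(txA1=10, txA2=5, txB1=7)
gene.cts &lt;- tapply(cts[tx2gene.toy$TXNAME], tx2gene.toy$GENEID, sum)
gene.cts
</preformat>
                </p>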
                <p>For the workflow, we load the 
                    <monospace>txi.g</monospace> object, which is saved in the file 
                    <monospace>salmon_gene_txi.rda</monospace>. We then load the 
                    <italic toggle="yes">DESeq2</italic> package and build a 
                    <italic toggle="yes">DESeqDataSet</italic> from 
                    <monospace>txi.g</monospace>, providing also the sample information and a design formula.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(salmon_gene_txi)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(DESeq2)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">dds &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DESeqDataSetFromTximport</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(txi.g, samps,</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">~</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">condition)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">## using counts and average transcript lengths from tximport</preformat>
                </p>
                <p>The following two lines of code run the 
                    <italic toggle="yes">DESeq2</italic> analysis
                    <sup>
                        <xref ref-type="bibr" rid="ref-16">16</xref>
                    </sup>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">dds &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DESeq</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dds)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">dres &lt;- DESeq2</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">::</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">results</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dds)</styled-content>
                    </preformat>
                </p>
                <p>We can confirm that most of the DTU genes are correctly absent from the significant DGE results, although some are included: of the 2,689 genes with an adjusted p-value below 0.05 in the output below, 102 are DTU genes.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">length</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dtu.genes)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">
## [1] 1501</styled-content>
                    </preformat>
</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres)[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">which</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">padj</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">.</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">05</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)]</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dtu.genes)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">
##
## FALSE TRUE
##  2587  102</styled-content>
                    </preformat>
                </p>
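                <p>Wrapping the comparison in <monospace>which</monospace> matters because <italic toggle="yes">DESeq2</italic> can report an adjusted p-value of <monospace>NA</monospace> for some genes. The toy sketch below, with invented gene names and values, shows how <monospace>which</monospace> drops the missing entries, whereas plain logical subsetting would carry them through as <monospace>NA</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># toy adjusted p-values (hypothetical), one of which is missing
toy.padj &lt;- c(g1 = 0.01, g2 = NA, g3 = 0.20, g4 = 0.03)
toy.dtu &lt;- c("g4", "g5")

# which() returns only the positions where the test is TRUE, dropping the NA
sig &lt;- names(toy.padj)[which(toy.padj &lt; .05)]
table(sig %in% toy.dtu)

# without which(), the NA propagates into the subset
sig.na &lt;- names(toy.padj)[toy.padj &lt; .05]
sig.na</preformat>
                </p>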
                <p>Because we happen to know the true status of each of the genes, we can make a scatterplot of the results, coloring the genes by their status (whether DGE, DTE, or DTU by construction).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">all</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres))</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">
## [1] TRUE</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">dres &lt;- dres[dxr.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene,]</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># we can only color because we simulated...</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rep</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres))</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dge.genes] &lt;-</styled-content>  
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dte.genes] &lt;-</styled-content>  
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dtu.genes] &lt;-</styled-content>  
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                    </preformat>
                </p>
                <p>
                    <xref ref-type="fig" rid="f9">Figure 9</xref> plots the evidence for differential transcript usage (y-axis) against that for differential gene expression (x-axis). The DTU genes cluster along the y-axis (mostly not captured in the DGE analysis), and the DGE genes cluster along the x-axis (mostly not captured in the DTU analysis). The DTE genes fall in the middle, as all of them represent DGE, and some of them additionally represent DTU (if the gene had other expressed transcripts). Because 
                    <italic toggle="yes">DEXSeq</italic> outputs an adjusted p-value of 0 for some of the genes, we set these instead to a jittered value around 10
                    <sup>&#x2212;20</sup>, so that their number and location on the x-axis could be visualized. These jittered values should only be used for visualization.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">bigpar</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">()</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># here cap the smallest DESeq2 adj p-value</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">cap.padj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pmin</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dres</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">padj),</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">100</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># this vector only used for plotting</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">jitter.padj &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(dxr.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">qval</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1e-20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">jp.idx &lt;- jitter.padj</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">==</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">jitter.padj[jp.idx] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rnorm</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">sum</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(jp.idx),</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,.</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">25</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plot</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cap.padj, jitter.padj,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">col,</styled-content>
      
                        <styled-content style="font-size:15px;color:#214A87;">xlab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Gene expression"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
      
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Transcript usage"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topright"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DGE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DTE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DTU"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"null"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">col=c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pch=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bty=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"n"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
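                <p>The capping and jittering steps can be tried on toy values, as in the sketch below. All objects here are invented for illustration. As a small variation on the code above, the zeros are identified directly with <monospace>toy.qval == 0</monospace> rather than by testing the transformed value for equality with 20, which avoids relying on exact floating-point equality.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">set.seed(1)

# hypothetical stand-ins for the DESeq2 and DEXSeq adjusted p-values
toy.padj &lt;- c(1e-150, 1e-30, 0.5)
toy.qval &lt;- c(0, 1e-8, 0)

# cap -log10 at 100 so that one extreme gene does not stretch the x-axis
toy.cap &lt;- pmin(-log10(toy.padj), 100)

# find exact zeros, then replace them with values jittered around 20
zero.idx &lt;- toy.qval == 0
toy.jit &lt;- -log10(toy.qval + 1e-20)
toy.jit[zero.idx] &lt;- rnorm(sum(zero.idx), 20, .25)

# toy.jit is for plotting only; toy.qval remains the value to report
cbind(toy.cap, toy.jit)</preformat>
                </p>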
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>Transcript usage over gene expression plot.</title>
                        <p>Each point represents a gene, and plotted are -log10 adjusted p-values for DEXSeq&#x2019;s test of differential transcript usage (y-axis) and DESeq2&#x2019;s test of differential gene expression (x-axis). Because we simulated the data we can color the genes according to their true category.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure9.gif"/>
                </fig>
            </sec>
            <sec>
                <title>DGE analysis with edgeR</title>
                <p>We can repeat the same analysis using 
                    <italic toggle="yes">edgeR</italic> as the inference engine
                    <sup>
                        <xref ref-type="bibr" rid="ref-3">3</xref>
                    </sup>. The following code incorporates the average transcript length matrix as an offset for an 
                    <italic toggle="yes">edgeR</italic> analysis.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(edgeR)
cts.g &lt;- txi.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">counts
normMat &lt;- txi.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">length
normMat &lt;- normMat</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">exp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rowMeans</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(normMat)))
o &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">log</styled-content>(
                        <styled-content style="font-size:15px;color:#214A87;">calcNormFactors</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">normMat))</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">log</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">colSums</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts.g</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">/</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">normMat))
y &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DGEList</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(cts.g)
y &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">scaleOffset</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(y,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">t</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">t</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(normMat))</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">o))
keep &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">filterByExpr</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(y)
y &lt;- y[keep,]</styled-content>
                    </preformat>
                </p>
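                <p>Two steps in the offset calculation above rely on how R recycles vectors over matrices, and a toy example may make them easier to follow. In this sketch, which uses invented lengths and offsets, each row of the length matrix is divided by its geometric mean, so that only between-sample differences in length remain, and a per-sample vector is then added to each column through a double transpose.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"># toy average transcript lengths (hypothetical): 3 genes (rows) by 2 samples
toy.len &lt;- matrix(c(1000, 2000, 500, 1500, 2000, 400), ncol = 2)

# divide each row by its geometric mean; the vector is recycled down the rows
toy.norm &lt;- toy.len / exp(rowMeans(log(toy.len)))

# the log values of each gene now average to zero, up to rounding
rowMeans(log(toy.norm))

# t(t(M) + o) adds the per-sample value o[j] to column j of M
toy.o &lt;- c(0.1, -0.2)
toy.off &lt;- t(t(log(toy.norm)) + toy.o)
toy.off</preformat>
                </p>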
                <p>The basic 
                    <italic toggle="yes">edgeR</italic> model fitting and results extraction can be accomplished with the following lines:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">y &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">estimateDisp</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(y, design_full)
fit &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">glmFit</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(y, design_full)
lrt &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">glmLRT</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(fit)
tt &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">topTags</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(lrt,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">n=nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(y),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sort=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"none"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)[[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">]]</styled-content>
                    </preformat>
                </p>
                <p>We confirm that most of the DTU genes are correctly not reported as DGE. In the output below, only 31 of the 2,311 genes with an FDR below 0.05 are DTU genes:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt)[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">which</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">FDR</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">&lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">.</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">05</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)]</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dtu.genes)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">
##
## FALSE TRUE
##  2280   31</styled-content>
                    </preformat>
                </p>
                <p>Again, we can color the genes by their true status in the simulation:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">common &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">intersect</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt))</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">tt &lt;- tt[common,]</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">res.sub &lt;- res[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(common, res</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">gene_id),]</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># we can only color because we simulated...</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rep</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt))
col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dge.genes] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dte.genes] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>

                        <styled-content style="font-size:15px;color:#000000;">col[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt)</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">%in%</styled-content> 
                        <styled-content style="font-size:15px;color:#000000;">dtu.genes] &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                    </preformat>
                </p>
                <p>
                    <xref ref-type="fig" rid="f10">Figure 10</xref> displays the evidence for differential transcript usage over that for differential gene expression, now using 
                    <italic toggle="yes">DRIMSeq</italic> and 
                    <italic toggle="yes">edgeR</italic>. One obvious contrast with 
                    <xref ref-type="fig" rid="f9">Figure 9</xref> is that 
                    <italic toggle="yes">DRIMSeq</italic> outputs lower non-zero adjusted p-values than 
                    <italic toggle="yes">DEXSeq</italic> does, whereas 
                    <italic toggle="yes">DEXSeq</italic> instead outputs an adjusted p-value of 0 for many genes. The plots look more similar when zooming in on the 
                    <italic toggle="yes">DRIMSeq</italic> y-axis, as can be seen in 
                    <xref ref-type="fig" rid="f11">Figure 11</xref>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">bigpar</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">()</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plot</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">FDR),</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.sub</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">adj_pvalue),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">col,</styled-content>
     
                        <styled-content style="font-size:15px;color:#214A87;">xlab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Gene expression"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
     
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Transcript usage"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topright"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DGE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DTE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DTU"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"null"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">col=c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pch=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bty=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"n"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">bigpar</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">()</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plot</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(tt</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">FDR),</styled-content> 
                        <styled-content style="font-size:15px;color:#CF5C00;">-</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">log10</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(res.sub</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">$</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">adj_pvalue),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">col,</styled-content>
     
                        <styled-content style="font-size:15px;color:#214A87;">xlab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Gene expression"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
     
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Transcript usage"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylim=c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topright"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DGE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
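                        <styled-content style="font-size:15px;color:#4F9905;">"DTE"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DTU"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"null"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">col=c</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">(</styled-content>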
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;color:#CF5C00;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pch=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">20</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bty=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"n"</styled-content>
                        <styled-content style="font-size:15px;color:#000000;">)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f10" orientation="portrait" position="float">
                    <label>Figure 10. </label>
                    <caption>
                        <title>Transcript usage over gene expression plot, as previously, but for DRIMSeq and edgeR.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure10.gif"/>
                </fig>
                <fig fig-type="figure" id="f11" orientation="portrait" position="float">
                    <label>Figure 11. </label>
                    <caption>
                        <title>Transcript usage over gene expression plot, zooming in on the DRIMSeq adjusted p-values.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure11.gif"/>
                </fig>
            </sec>
        </sec>
        <sec>
            <title>Evaluation of methods for DGE</title>
            <p>We additionally assessed Bioconductor and other R packages for differential gene expression, to determine their true positive rate and control of the false discovery rate on the simulated dataset. In this analysis, the simulated &#x201c;DTE&#x201d; genes (where a single transcript was chosen to be differentially expressed) should count as differentially expressed at the gene level, while the simulated &#x201c;DTU&#x201d; genes should not, as the total expression of the gene remains constant.</p>
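            <p>The distinction between DTE and DTU genes at the gene level can be illustrated with a minimal base R sketch. The table <monospace>tx</monospace>, the gene names, and all abundance values below are hypothetical and are not taken from the simulation; the sketch only shows that doubling one transcript changes the gene total, while swapping the abundances of two transcripts does not.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Toy transcript-level abundances in two conditions (hypothetical values).
# Each gene has two transcripts; names and numbers are illustrative only.
tx <- data.frame(
  gene  = c("geneDTE", "geneDTE", "geneDTU", "geneDTU"),
  cond1 = c(100, 50, 100, 50),
  cond2 = c(200, 50, 50, 100)  # DTE: first transcript doubles; DTU: abundances swap
)

# Sum transcript abundances within each gene to obtain gene-level totals
gene.totals <- rowsum(tx[, c("cond1", "cond2")], group = tx$gene)

# Gene-level fold change: above 1 for the DTE gene, exactly 1 for the DTU gene
gene.fc <- gene.totals$cond2 / gene.totals$cond1
names(gene.fc) <- rownames(gene.totals)
gene.fc
```
]]></preformat>
            </p>
            <p>In this toy case a gene-level method has nothing to detect for the DTU gene, which is why a call on such a gene is scored as a false positive in the gene-level evaluation.</p>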
            <p>We compared 
                <italic toggle="yes">DESeq2</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup>, 
                <italic toggle="yes">EBSeq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-37">37</xref>
                </sup>, 
                <italic toggle="yes">edgeR</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, 
                <italic toggle="yes">edgeR-QL</italic> (using the quasi-likelihood functions)
                <sup>
                    <xref ref-type="bibr" rid="ref-38">38</xref>
                </sup>, 
                <italic toggle="yes">limma</italic> with 
                <italic toggle="yes">voom</italic> transformation
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>, 
                <italic toggle="yes">SAMseq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-39">39</xref>
                </sup>, and 
                <italic toggle="yes">sleuth</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-40">40</xref>
                </sup>. We used 
                <italic toggle="yes">tximport</italic> to summarize 
                <italic toggle="yes">Salmon</italic> abundances to the gene level, and provided all methods other than 
                <italic toggle="yes">DESeq2</italic> and 
                <italic toggle="yes">sleuth</italic> with the 
                <monospace>lengthScaledTPM</monospace> count matrix. 
                <italic toggle="yes">sleuth</italic> takes as input the quantification from 
                <italic toggle="yes">kallisto</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-15">15</xref>
                </sup>, which was run with 30 bootstrap samples and bias correction. For gene-level analysis in 
                <italic toggle="yes">sleuth</italic>, the argument 
                <monospace>aggregation_column="gene_id"</monospace> was used. As 
                <italic toggle="yes">DESeq2</italic> has specially designed import functions for taking in estimated gene counts and an offset from 
                <italic toggle="yes">tximport</italic>, we used this approach to provide 
                <italic toggle="yes">Salmon</italic> summarized gene-level counts and an offset. 
                <italic toggle="yes">edgeR</italic> and 
                <italic toggle="yes">edgeR-QL</italic> had the same performance with either the counts-and-offset approach or the 
                <monospace>lengthScaledTPM</monospace> approach, so we used the latter for code simplicity. The exact code used to run the different methods can be found at the simulation code repository
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>
                </sup>. Timings for the different gene-level methods are presented in 
                <xref ref-type="table" rid="T2">Table 2</xref>.</p>
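            <p>To make the <monospace>lengthScaledTPM</monospace> input more concrete, the following base R sketch mimics the scaling on toy matrices. It assumes, following our reading of the <italic toggle="yes">tximport</italic> documentation, that abundances are scaled by the average feature length over samples and then to the library size of each sample. The matrices <monospace>abundance</monospace> and <monospace>eff.len</monospace> and the vector <monospace>lib.size</monospace> are hypothetical; in practice the matrix would be obtained by calling <monospace>tximport</monospace> with <monospace>countsFromAbundance="lengthScaledTPM"</monospace> rather than computed by hand.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Hypothetical toy inputs: 3 features x 2 samples
abundance <- matrix(c(10, 20, 70, 30, 30, 40), nrow = 3,
                    dimnames = list(paste0("g", 1:3), c("s1", "s2")))  # TPM-like
eff.len <- matrix(c(1000, 2000, 500, 1200, 1800, 500), nrow = 3,
                  dimnames = dimnames(abundance))  # effective lengths
lib.size <- c(s1 = 1e6, s2 = 2e6)  # total estimated counts per sample

# Scale abundance by the average length of each feature over samples
scaled <- abundance * rowMeans(eff.len)

# Rescale each column so that it sums to the library size of that sample
lstpm <- sweep(scaled, 2, lib.size / colSums(scaled), `*`)
colSums(lstpm)
```
]]></preformat>
            </p>
            <p>Because the same average length is applied to a feature in every sample, the resulting counts no longer carry sample-specific length differences, which is what allows them to be passed to count-based methods without a separate offset.</p>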
            <table-wrap id="T2" orientation="portrait" position="anchor">
                <label>Table 2. </label>
                <caption>
                    <title>Timing of methods for differential gene expression (DGE), in minutes, rounded to the nearest minute, by per-group sample size.</title>
                    <p>Timing includes data import and summarization to gene-level quantities using one core.</p>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Method</th>
                            <th align="left" colspan="1" rowspan="1">n=3</th>
                            <th align="left" colspan="1" rowspan="1">n=6</th>
                            <th align="left" colspan="1" rowspan="1">n=9</th>
                            <th align="left" colspan="1" rowspan="1">n=12</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">DESeq2</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">EBSeq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">1</td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">3</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">edgeR</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">edgeR-QL</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">limma</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">SAMseq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">sleuth</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">4</td>
                            <td align="left" colspan="1" rowspan="1">5</td>
                            <td align="left" colspan="1" rowspan="1">7</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <italic toggle="yes">iCOBRA</italic> plots with true positive rate over false discovery rate for gene-level analysis across four different per-group sample sizes are presented in 
                <xref ref-type="fig" rid="f12">Figure 12</xref>. For the smallest per-group sample size of 3, all methods except 
                <italic toggle="yes">DESeq2</italic> and 
                <italic toggle="yes">EBSeq</italic> tended to control the FDR, while those two methods had, for example, an observed FDR of 15% at the nominal 10% rate. 
                <italic toggle="yes">SAMseq</italic>, with so few samples, did not have any sensitivity to detect DGE. At the per-group sample size of 6, all methods except 
                <italic toggle="yes">DESeq2</italic> and 
                <italic toggle="yes">SAMseq</italic> tended to control the FDR. At this sample size, 
                <italic toggle="yes">EBSeq</italic> controlled its FDR. For the largest per-group sample sizes, 9 and 12, the performance of many methods remained similar to that at the smaller sample sizes, except that 
                <italic toggle="yes">sleuth</italic> did not control the nominal 5% or 10% FDR. We performed additional experiments to see if the performance of 
                <italic toggle="yes">sleuth</italic> at higher sample sizes was related to the realistic GC bias parameters used in the simulation, but simulating fragments uniformly from the transcripts revealed the same performance at per-group sample sizes 9 and 12 (
                <xref ref-type="other" rid="SF2">Supplementary Figure 2</xref>). Reducing the proportion of DGE, DTE and DTU genes from 10% to 5% each, however, did recover control of the FDR at the nominal 5% and 10% levels for 
                <italic toggle="yes">sleuth</italic> (
                <xref ref-type="other" rid="SF3">Supplementary Figure 3</xref>).</p>
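            <p>The quantities discussed above, the observed true positive rate and the observed FDR at a nominal FDR threshold, can be computed from adjusted p-values and the simulation truth. The following base R sketch shows the calculation on hypothetical toy data; the helper <monospace>tprFdr</monospace> and the vectors <monospace>padj</monospace> and <monospace>truth</monospace> are illustrative names and are not part of <italic toggle="yes">iCOBRA</italic>, which was used to produce the actual plots.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Hypothetical toy data: adjusted p-values and simulation truth for 10 genes
padj  <- c(0.001, 0.02, 0.04, 0.08, 0.09, 0.20, 0.50, 0.03, 0.70, NA)
truth <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)

# Observed TPR and FDR when calling genes at a nominal FDR threshold alpha
tprFdr <- function(padj, truth, alpha) {
  called <- !is.na(padj) & padj < alpha  # genes without a p-value are not called
  c(TPR = sum(called & truth) / sum(truth),
    FDR = if (any(called)) sum(called & !truth) / sum(called) else 0)
}

# One column per nominal threshold
sapply(c("0.05" = 0.05, "0.10" = 0.10), function(a) tprFdr(padj, truth, a))
```
]]></preformat>
            </p>
            <p>A method controls the FDR at a given nominal level when the observed FDR in such a table does not exceed the threshold used to call genes; in the toy data above it does not.</p>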
            <fig fig-type="figure" id="f12" orientation="portrait" position="float">
                <label>Figure 12. </label>
                <caption>
                    <title>True positive rate over false discovery rate for differential gene expression of the simulated dataset.</title>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure12.gif"/>
            </fig>
        </sec>
        <sec>
            <title>Evaluation of methods for DTE</title>
            <p>Finally, we assessed the Bioconductor and R packages for differential transcript expression analysis. While we believe the separation of differential transcript usage and differential gene expression described in the earlier sections of the workflow represents an easily interpretable approach, some investigators may prefer to assess differential expression on a per-transcript basis. For this assessment, all of the simulated non-null transcripts count as DTE, whether from the simulated DGE-, DTE-, or DTU-by-construction genes. For most of the methods, we simply provided the transcript-level data to the same functions as for the DGE analysis. 
                <italic toggle="yes">EBSeq</italic> was provided with the number of isoforms per gene. The timing of the methods is presented in 
                <xref ref-type="table" rid="T3">Table 3</xref>.</p>
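            <p>As an illustration of the per-gene isoform counts mentioned above, the following base R sketch derives the number of isoforms for each gene from a transcript-to-gene table. The table <monospace>txdf</monospace> and its identifiers are hypothetical, and the sketch does not show how the counts were passed to <italic toggle="yes">EBSeq</italic>; the exact code is in the simulation code repository.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve"><![CDATA[
```r
# Hypothetical transcript-to-gene table; identifiers are illustrative only
txdf <- data.frame(
  TXNAME = c("tx1", "tx2", "tx3", "tx4", "tx5", "tx6"),
  GENEID = c("geneA", "geneA", "geneA", "geneB", "geneC", "geneC")
)

# Number of isoforms for each gene
n.iso <- table(txdf$GENEID)

# Per-transcript vector giving the isoform count of its parent gene
# (as.character guards against factor columns being used as integer indices)
txdf$ntx <- as.integer(n.iso[as.character(txdf$GENEID)])
txdf
```
]]></preformat>
            </p>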
            <table-wrap id="T3" orientation="portrait" position="anchor">
                <label>Table 3. </label>
                <caption>
                    <title>Timing of methods for differential transcript expression (DTE) rounded to the nearest minute by per-group sample size.</title>
                    <p>Timing includes data import.</p>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Method</th>
                            <th align="left" colspan="1" rowspan="1">n=3</th>
                            <th align="left" colspan="1" rowspan="1">n=6</th>
                            <th align="left" colspan="1" rowspan="1">n=9</th>
                            <th align="left" colspan="1" rowspan="1">n=12</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">DESeq2</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">EBSeq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">5</td>
                            <td align="left" colspan="1" rowspan="1">11</td>
                            <td align="left" colspan="1" rowspan="1">18</td>
                            <td align="left" colspan="1" rowspan="1">22</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">edgeR</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">edgeR-QL</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">limma</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">SAMseq</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">&lt;1</td>
                            <td align="left" colspan="1" rowspan="1">1</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <italic toggle="yes">sleuth</italic>
                            </td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                            <td align="left" colspan="1" rowspan="1">2</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <italic toggle="yes">iCOBRA</italic> plots with the true positive rate over false discovery rate for the transcript-level analysis are shown in 
                <xref ref-type="fig" rid="f13">Figure 13</xref>. The performance at a per-group sample size of 3 was similar to the gene-level analysis, except 
                <italic toggle="yes">DESeq2</italic> came closer to controlling the FDR and 
                <italic toggle="yes">EBSeq</italic> performed slightly worse than before, while the rest of the methods tended to control their FDR. At a per-group sample size of 6, all of the evaluated methods tended to control the FDR, though 
                <italic toggle="yes">DESeq2</italic>, 
                <italic toggle="yes">EBSeq</italic>, 
                <italic toggle="yes">SAMseq</italic>, and 
                <italic toggle="yes">sleuth</italic> tended to have higher sensitivity than 
                <italic toggle="yes">edgeR</italic>, 
                <italic toggle="yes">edgeR-QL</italic> and 
                <italic toggle="yes">limma</italic>. The same issue of FDR control for 
                <italic toggle="yes">sleuth</italic> was seen in the transcript-level analysis as in the gene-level analysis, for per-group sample sizes 9 and 12.</p>
            <fig fig-type="figure" id="f13" orientation="portrait" position="float">
                <label>Figure 13. </label>
                <caption>
                    <title>True positive rate over false discovery rate for differential transcript expression of the simulated dataset.</title>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/16780/ce9cbfd5-272b-4fa1-83e0-1ea31a26c9ba_figure13.gif"/>
            </fig>
        </sec>
        <sec>
            <title>Discussion</title>
            <p>Here we presented a workflow for analyzing RNA-seq experiments for differential transcript usage across groups of samples. The Bioconductor packages used, 
                <italic toggle="yes">DRIMSeq</italic>, 
                <italic toggle="yes">DEXSeq</italic>, and 
                <italic toggle="yes">stageR</italic>, are simple to use and fast when run on transcript-level data. We showed how these packages can be used downstream of transcript abundance quantification with 
                <italic toggle="yes">Salmon</italic>. We evaluated these methods on a simulated dataset and showed how the transcript usage results complement a gene-level analysis, which can also be run on output from 
                <italic toggle="yes">Salmon</italic>, using the 
                <italic toggle="yes">tximport</italic> package to aggregate quantification to the gene level. We also used the simulated dataset to evaluate Bioconductor and other R packages for differential gene expression and differential transcript expression. We recommend the use of 
                <italic toggle="yes">stageR</italic> for its formal statistical procedure involving a screening and confirmation stage, as this closely matches what we expect a typical analysis to entail. 
                <italic toggle="yes">stageR</italic> then provides error control for an overall false discovery rate, assuming that the underlying tests are well calibrated.</p>
            <p>One potential limitation of this workflow is that, in contrast to other methods such as the standard 
                <italic toggle="yes">DEXSeq</italic> analysis, 
                <italic toggle="yes">SUPPA2</italic>, or 
                <italic toggle="yes">LeafCutter</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-41">41</xref>
                </sup>, here we considered and detected only expression switching between annotated transcripts. Other methods such as 
                <italic toggle="yes">DEXSeq</italic> (exon-based), 
                <italic toggle="yes">SUPPA2</italic>, or 
                <italic toggle="yes">LeafCutter</italic> may benefit in terms of power and interpretability from performing statistical analysis directly on exon usage or splice events. Methods such as 
                <italic toggle="yes">DEXSeq</italic> (exon-based) and 
                <italic toggle="yes">LeafCutter</italic> also have the ability to detect unannotated events. The workflow presented here would require further processing to attribute transcript usage changes to specific splice events, and is limited to considering the estimated abundance of annotated transcripts.</p>
        </sec>
        <sec>
            <title>Session information</title>
            <p>The following provides the session information used when compiling this document.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">devtools</styled-content>
                    <styled-content style="font-size:15px;color:#CF5C00 ;">::</styled-content>
                    <styled-content style="font-size:15px;color:#214A87;">session_info</styled-content>
                    <styled-content style="font-size:15px;color:#000000;">()</styled-content>
                </preformat>
            </p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#000000;">## Session info -------------------------------------------------------------

##  setting  value
##  version  R version 3.5.0 (2018-04-23)
##  system   x86_64, darwin15.6.0
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  tz       America/New_York
##  date     2018-06-17

## Packages -----------------------------------------------------------------

##  package              * version   date       source
##  acepack                1.4.1     2016-10-29 CRAN (R 3.5.0)
##  annotate               1.58.0    2018-05-01 Bioconductor
##  AnnotationDbi        * 1.42.1    2018-05-08 Bioconductor
##  assertthat             0.2.0     2017-04-11 CRAN (R 3.5.0)
##  backports              1.1.2     2017-12-13 cran (@1.1.2)
##  base                 * 3.5.0     2018-04-24 local
##  base64enc              0.1-3     2015-07-28 CRAN (R 3.5.0)
##  Biobase              * 2.40.0    2018-05-01 Bioconductor
##  BiocGenerics         * 0.26.0    2018-05-01 Bioconductor
##  BiocInstaller        * 1.30.0    2018-05-04 Bioconductor
##  BiocParallel         * 1.14.1    2018-05-06 Bioconductor
##  BiocStyle              2.8.0     2018-05-01 Bioconductor
##  BiocWorkflowTools      1.6.1     2018-05-24 Bioconductor
##  biomaRt                2.36.0    2018-05-01 Bioconductor
##  Biostrings             2.48.0    2018-05-01 Bioconductor
##  bit                    1.1-12    2014-04-09 CRAN (R 3.5.0)
##  bit64                  0.9-7     2017-05-08 CRAN (R 3.5.0)
##  bitops                 1.0-6     2013-08-17 CRAN (R 3.5.0)
##  blob                   1.1.1     2018-03-25 CRAN (R 3.5.0)
##  bookdown               0.7       2018-02-18 CRAN (R 3.5.0)
##  checkmate              1.8.5     2017-10-24 CRAN (R 3.5.0)
##  cluster                2.0.7-1   2018-04-13 CRAN (R 3.5.0)
##  codetools              0.2-15    2016-10-05 CRAN (R 3.5.0)
##  colorspace             1.3-2     2016-12-14 CRAN (R 3.5.0)
##  compiler               3.5.0     2018-04-24 local
##  data.table             1.11.2    2018-05-08 CRAN (R 3.5.0)
##  datasets             * 3.5.0     2018-04-24 local
##  DBI                    1.0.0     2018-05-02 CRAN (R 3.5.0)
##  DelayedArray         * 0.6.0     2018-05-01 Bioconductor
##  DESeq2               * 1.20.0    2018-05-01 Bioconductor
##  devtools             * 1.13.5    2018-02-18 CRAN (R 3.5.0)
##  DEXSeq               * 1.26.0    2018-05-01 Bioconductor
##  digest                 0.6.15    2018-01-28 cran (@0.6.15)
##  DRIMSeq              * 1.8.0     2018-05-01 Bioconductor
##  edgeR                * 3.22.2    2018-05-24 cran (@3.22.2)
##  evaluate               0.10.1    2017-06-24 CRAN (R 3.5.0)
##  foreign                0.8-70    2017-11-28 CRAN (R 3.5.0)
##  Formula                1.2-3     2018-05-03 CRAN (R 3.5.0)
##  genefilter             1.62.0    2018-05-01 Bioconductor
##  geneplotter            1.58.0    2018-05-01 Bioconductor
##  GenomeInfoDb         * 1.16.0    2018-05-01 Bioconductor
##  GenomeInfoDbData       1.1.0     2018-01-10 Bioconductor
##  GenomicRanges        * 1.32.2    2018-05-06 Bioconductor
##  ggplot2                2.2.1     2016-12-30 CRAN (R 3.5.0)
##  git2r                  0.21.0    2018-01-04 CRAN (R 3.5.0)
##  graphics             * 3.5.0     2018-04-24 local
##  grDevices            * 3.5.0     2018-04-24 local
##  grid                   3.5.0     2018-04-24 local
##  gridExtra              2.3       2017-09-09 CRAN (R 3.5.0)
##  gtable                 0.2.0     2016-02-26 CRAN (R 3.5.0)
##  Hmisc                  4.1-1     2018-01-03 CRAN (R 3.5.0)
##  htmlTable              1.11.2    2018-01-20 CRAN (R 3.5.0)
##  htmltools              0.3.6     2017-04-28 CRAN (R 3.5.0)
##  htmlwidgets            1.2       2018-04-19 CRAN (R 3.5.0)
##  httr                   1.3.1     2017-08-20 CRAN (R 3.5.0)
##  hwriter                1.3.2     2014-09-10 CRAN (R 3.5.0)
##  IRanges              * 2.14.9    2018-05-15 Bioconductor
##  knitr                * 1.20      2018-02-20 CRAN (R 3.5.0)
##  labeling               0.3       2014-08-23 CRAN (R 3.5.0)
##  lattice                0.20-35   2017-03-25 CRAN (R 3.5.0)
##  latticeExtra           0.6-28    2016-02-09 CRAN (R 3.5.0)
##  lazyeval               0.2.1     2017-10-29 CRAN (R 3.5.0)
##  limma                * 3.36.1    2018-05-05 Bioconductor
##  locfit                 1.5-9.1   2013-04-20 CRAN (R 3.5.0)
##  magrittr               1.5       2014-11-22 CRAN (R 3.5.0)
##  Matrix                 1.2-14    2018-04-13 CRAN (R 3.5.0)
##  matrixStats          * 0.53.1    2018-02-11 CRAN (R 3.5.0)
##  memoise                1.1.0     2017-04-21 CRAN (R 3.5.0)
##  methods              * 3.5.0     2018-04-24 local
##  munsell                0.4.3     2016-02-13 CRAN (R 3.5.0)
##  nnet                   7.3-12    2016-02-02 CRAN (R 3.5.0)
##  parallel             * 3.5.0     2018-04-24 local
##  pillar                 1.2.2     2018-04-26 CRAN (R 3.5.0)
##  plyr                   1.8.4     2016-06-08 CRAN (R 3.5.0)
##  prettyunits            1.0.2     2015-07-13 CRAN (R 3.5.0)
##  progress               1.1.2     2016-12-14 CRAN (R 3.5.0)
##  R6                     2.2.2     2017-06-17 CRAN (R 3.5.0)
##  rafalib              * 1.0.0     2015-08-09 CRAN (R 3.5.0)
##  RColorBrewer         * 1.1-2     2014-12-07 CRAN (R 3.5.0)
##  Rcpp                   0.12.17   2018-05-18 cran (@0.12.17)
##  RCurl                  1.95-4.10 2018-01-04 CRAN (R 3.5.0)
##  reshape2               1.4.3     2017-12-11 CRAN (R 3.5.0)
##  rlang                  0.2.1     2018-05-30 cran (@0.2.1)
##  rmarkdown            * 1.9       2018-03-01 CRAN (R 3.5.0)
##  rnaseqDTU            * 0.1.0     2018-06-18 local (mikelove/rnaseqDTU@NA)
##  rpart                  4.1-13    2018-02-23 CRAN (R 3.5.0)
##  rprojroot              1.3-2     2018-01-03 cran (@1.3-2)
##  Rsamtools              1.32.0    2018-05-01 Bioconductor
##  RSQLite                2.1.1     2018-05-06 CRAN (R 3.5.0)
##  rstudioapi             0.7       2017-09-07 CRAN (R 3.5.0)
##  S4Vectors            * 0.18.1    2018-05-02 Bioconductor
##  scales                 0.5.0     2017-08-24 CRAN (R 3.5.0)
##  splines                3.5.0     2018-04-24 local
##  stageR               * 1.2.22    2018-06-14 cran (@1.2.22)
##  statmod                1.4.30    2017-06-18 CRAN (R 3.5.0)
##  stats                * 3.5.0     2018-04-24 local
##  stats4               * 3.5.0     2018-04-24 local
##  stringi                1.2.2     2018-05-02 CRAN (R 3.5.0)
##  stringr                1.3.1     2018-05-10 CRAN (R 3.5.0)
##  SummarizedExperiment * 1.10.1    2018-05-11 Bioconductor
##  survival               2.42-3    2018-04-16 CRAN (R 3.5.0)
##  tibble                 1.4.2     2018-01-22 CRAN (R 3.5.0)
##  tinytex                0.5       2018-04-16 CRAN (R 3.5.0)
##  tools                  3.5.0     2018-04-24 local
##  utils                * 3.5.0     2018-04-24 local
##  withr                  2.1.2     2018-03-15 CRAN (R 3.5.0)
##  xfun                   0.1       2018-01-22 CRAN (R 3.5.0)
##  XML                    3.98-1.11 2018-04-16 CRAN (R 3.5.0)
##  xtable                 1.8-2     2016-02-05 CRAN (R 3.5.0)
##  XVector                0.20.0    2018-05-01 Bioconductor
##  yaml                   2.1.19    2018-05-01 CRAN (R 3.5.0)
##  zlibbioc               1.26.0    2018-05-01 Bioconductor</styled-content>
                </preformat>
            </p>
        </sec>
        <sec>
            <title>Software versions</title>
            <p>The statistical methods were evaluated using the following software versions: 
                <italic toggle="yes">DRIMSeq</italic> - 1.8.0, 
                <italic toggle="yes">DEXSeq</italic> - 1.26.0, 
                <italic toggle="yes">stageR</italic> - 1.2.21, 
                <italic toggle="yes">tximport</italic> - 1.8.0, 
                <italic toggle="yes">DESeq2</italic> - 1.20.0, 
                <italic toggle="yes">EBSeq</italic> - 1.20.0, 
                <italic toggle="yes">edgeR</italic> - 3.22.2, 
                <italic toggle="yes">limma</italic> - 3.36.1, 
                <italic toggle="yes">samr</italic> - 2.0, 
                <italic toggle="yes">sleuth</italic> - 0.29.0, 
                <italic toggle="yes">SUPPA2</italic> - 2.3. The samples were quantified with 
                <italic toggle="yes">Salmon</italic> version 0.10.0 and 
                <italic toggle="yes">kallisto</italic> version 0.44.0. 
                <italic toggle="yes">polyester</italic> version 1.16.0 and 
                <italic toggle="yes">alpine</italic> version 1.6.0 were used in generating the simulated dataset.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>The simulated paired-end read FASTQ files have been uploaded in three batches of eight samples each to Zenodo -</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1291375">https://doi.org/10.5281/zenodo.1291375</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-22">22</xref>
                </sup>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1291404">https://doi.org/10.5281/zenodo.1291404</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-23">23</xref>
                </sup>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1291443">https://doi.org/10.5281/zenodo.1291443</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-24">24</xref>
                </sup>
            </p>
            <p>The quantification files are also available as a separate Zenodo dataset - 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1291522">https://doi.org/10.5281/zenodo.1291522</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-25">25</xref>
                </sup>
            </p>
            <p>The scripts used to generate the simulated dataset are available at the simulation GitHub repository (
                <ext-link ext-link-type="uri" xlink:href="https://github.com/mikelove/swimdown/tree/v1.0">https://github.com/mikelove/swimdown/tree/v1.0</ext-link>) and archived here - 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1293899">https://doi.org/10.5281/zenodo.1293899</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>
                </sup>. All data are available under a CC BY 4.0 license.</p>
        </sec>
        <sec>
            <title>Software availability</title>
            <list list-type="bullet">
                <list-item>
                    <label>1. </label>
                    <p>All software used in this workflow is available as part of Bioconductor version 3.7.</p>
                </list-item>
                <list-item>
                    <label>2. </label>
                    <p>Source code for the workflow: 
                        <ext-link ext-link-type="uri" xlink:href="https://github.com/mikelove/rnaseqDTU">https://github.com/mikelove/rnaseqDTU</ext-link>
                    </p>
                </list-item>
                <list-item>
                    <label>3. </label>
                    <p>Link to archived source code as at time of publication: 
                        <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1293914">https://doi.org/10.5281/zenodo.1293914</ext-link>
                        <sup>
                            <xref ref-type="bibr" rid="ref-42">42</xref>
                        </sup>
                    </p>
                </list-item>
                <list-item>
                    <label>4. </label>
                    <p>License: Artistic-2.0</p>
                </list-item>
            </list>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgments</title>
            <p>The authors thank Koen Van den Berge and Malgorzata Nowicka for helpful comments on the workflow.</p>
        </ack>
        <sec id="SM1" sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p id="SF1">Supplementary File 1 - PDF file containing the following supplementary figures -</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15398/04e7b725-1377-4771-9d14-da8a17e63883.pdf">Click here to access the data</ext-link>.</p>
            <p>Supplementary Figure 1: Dispersion-over-mean comparison plot produced by 
                <italic toggle="yes">countsimQC</italic>. The left panel shows 
                <italic toggle="yes">DESeq2</italic> estimates of dispersion per gene over the mean of normalized counts from the GEUVADIS project, provided by the Recount2 project (n = 458 non-duplicated samples). The right panel shows estimates of dispersion per transcript over the mean of normalized counts for 
                <italic toggle="yes">Salmon</italic> estimated transcript counts for the simulated dataset (the 12 vs 12 comparison), showing only the transcripts where the mean of counts over samples was greater than 5. Black points indicate maximum likelihood estimates (Cox-Reid adjusted), blue points indicate posterior estimates, and the red line indicates the parametric trend line. Points at the bottom of the plot indicate maximum likelihood estimates of 10
                <sup>-8</sup>. The design formula included sequencing center and population for GEUVADIS, and the condition variable for the simulated dataset. The simulated dataset was constructed by drawing mean and dispersion parameters from the joint distribution of the estimates from the GEUVADIS project. The full 
                <italic toggle="yes">countsimQC</italic> report can be found at 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/mikelove/swimdown/tree/master/countsimqc">https://github.com/mikelove/swimdown/tree/master/countsimqc</ext-link>.</p>
            <p id="SF2">Supplementary Figure 2: We performed additional experiments to assess the false discovery rate (FDR) control for 
                <italic toggle="yes">sleuth</italic> at per-group sample sizes of 9 (left column) and 12 (right column), at the gene level (top row) and the transcript level (bottom row). To determine whether the excess observed FDR was due to the inclusion of realistic fragment GC coverage in the main simulation, for this experiment fragments were instead drawn uniformly from positions on the transcripts. The dispersion-mean relationship was kept the same, drawing from the joint distribution of estimates on the GEUVADIS dataset (n = 458).</p>
            <p id="SF3">Supplementary Figure 3: As in 
                <xref ref-type="other" rid="SF2">Supplementary Figure 2</xref>, shown is the result of an additional experiment to assess the false discovery rate (FDR) control for 
                <italic toggle="yes">sleuth</italic> for the two largest sample sizes in the simulation. For this experiment, realistic fragment GC bias was used in the simulation, but the percentage of genes with DGE, DTE and DTU was lowered from 10% to 5% each. This modification of the simulation helped to regain control of FDR for 
                <italic toggle="yes">sleuth</italic>.</p>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Glaus</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Honkela</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Rattray</surname>
                            <given-names>M</given-names>
                        </name>
					</person-group>:
                    <article-title>Identifying differentially expressed transcripts from RNA-seq data with biological variation.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2012</year>;<volume>28</volume>(<issue>13</issue>):<fpage>1721</fpage>&#x2013;<lpage>1728</lpage>.
                    <pub-id pub-id-type="pmid">22563066</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts260</pub-id>
                    <pub-id pub-id-type="pmcid">3381971</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Anders</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Reyes</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
					</person-group>:
                    <article-title>Detecting differential usage of exons from RNA-seq data.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Res.</italic>
					</source>
                    <year>2012</year>;<volume>22</volume>(<issue>10</issue>):<fpage>2008</fpage>&#x2013;<lpage>2017</lpage>.
                    <pub-id pub-id-type="pmid">22722343</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.133744.111</pub-id>
                    <pub-id pub-id-type="pmcid">3460195</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>DJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Smyth</surname>
                            <given-names>GK</given-names>
                        </name>
					</person-group>:
                    <article-title>edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2010</year>;<volume>26</volume>(<issue>1</issue>):<fpage>139</fpage>&#x2013;<lpage>140</lpage>.
                    <pub-id pub-id-type="pmid">19910308</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btp616</pub-id>
                    <pub-id pub-id-type="pmcid">2796818</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>DJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Smyth</surname>
                            <given-names>GK</given-names>
                        </name>
					</person-group>:
                    <article-title>Differential expression analysis of multifactor RNA-seq experiments with respect to biological variation.</article-title>
                    <source>
						
                        <italic toggle="yes">Nucleic Acids Res.</italic>
					</source>
                    <year>2012</year>;<volume>40</volume>(<issue>10</issue>):<fpage>4288</fpage>&#x2013;<lpage>4297</lpage>.
                    <pub-id pub-id-type="pmid">22287627</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gks042</pub-id>
                    <pub-id pub-id-type="pmcid">3378882</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Smyth</surname>
                            <given-names>GK</given-names>
                        </name>
					</person-group>:
                    <article-title>Linear models and empirical Bayes methods for assessing differential expression in microarray experiments.</article-title>
                    <source>
						
                        <italic toggle="yes">Stat Appl Genet Mol Biol.</italic>
					</source>
                    <year>2004</year>;<volume>3</volume>(<issue>1</issue>): Article3.
                    <pub-id pub-id-type="pmid">16646809</pub-id>
                    <pub-id pub-id-type="doi">10.2202/1544-6115.1027</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Law</surname>
                            <given-names>CW</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Shi</surname>
                            <given-names>W</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2014</year>;<volume>15</volume>(<issue>2</issue>):<fpage>R29</fpage>.
                    <pub-id pub-id-type="pmid">24485249</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gb-2014-15-2-r29</pub-id>
                    <pub-id pub-id-type="pmcid">4053721</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Nowicka</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
					</person-group>:
                    <article-title>DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics [version 2; referees: 2 approved].</article-title>
                    <source>
						
                        <italic toggle="yes">F1000Res.</italic>
					</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>1356</fpage>.
                    <pub-id pub-id-type="pmid">28105305</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.8900.2</pub-id>
                    <pub-id pub-id-type="pmcid">5200948</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Van den Berge</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>stageR: a general stage-wise method for controlling the gene-level false discovery rate in differential expression and differential transcript usage.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2017</year>;<volume>18</volume>(<issue>1</issue>):<fpage>151</fpage>.
                    <pub-id pub-id-type="pmid">28784146</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-017-1277-0</pub-id>
                    <pub-id pub-id-type="pmcid">5547545</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Goldstein</surname>
                            <given-names>LD</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Cao</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Pau</surname>
                            <given-names>G</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Prediction and Quantification of Splice Events from RNA-Seq Data.</article-title>
                    <source>
						
                        <italic toggle="yes">PLoS One.</italic>
					</source>
                    <year>2016</year>;<volume>11</volume>(<issue>5</issue>):<fpage>e0156132</fpage>.
                    <pub-id pub-id-type="pmid">27218464</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0156132</pub-id>
                    <pub-id pub-id-type="pmcid">4878813</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Vitting-Seerup</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Sandelin</surname>
                            <given-names>A</given-names>
                        </name>
					</person-group>:
                    <article-title>The landscape of isoform switches in human cancers.</article-title>
                    <source>
						
                        <italic toggle="yes">Mol Cancer Res.</italic>
					</source>
                    <year>2017</year>;<volume>15</volume>(<issue>9</issue>):<fpage>1206</fpage>&#x2013;<lpage>1220</lpage>.
                    <pub-id pub-id-type="pmid">28584021</pub-id>
                    <pub-id pub-id-type="doi">10.1158/1541-7786.MCR-16-0459</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Carey</surname>
                            <given-names>VJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Gentleman</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Orchestrating high-throughput genomic analysis with Bioconductor.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Methods.</italic>
					</source>
                    <year>2015</year>;<volume>12</volume>(<issue>2</issue>):<fpage>115</fpage>&#x2013;<lpage>121</lpage>.
                    <pub-id pub-id-type="pmid">25633503</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3252</pub-id>
                    <pub-id pub-id-type="pmcid">4509590</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
					</person-group>:
                    <article-title>Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences [version 2; referees: 2 approved].</article-title>
                    <source>
						
                        <italic toggle="yes">F1000Res.</italic>
					</source>
                    <year>2016</year>;<volume>4</volume>:<fpage>1521</fpage>.
                    <pub-id pub-id-type="pmid">26925227</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.7563.2</pub-id>
                    <pub-id pub-id-type="pmcid">4712774</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Patro</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Duggal</surname>
                            <given-names>G</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Salmon provides fast and bias-aware quantification of transcript expression.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Methods.</italic>
					</source>
                    <year>2017</year>;<volume>14</volume>(<issue>4</issue>):<fpage>417</fpage>&#x2013;<lpage>419</lpage>.
                    <pub-id pub-id-type="pmid">28263959</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.4197</pub-id>
                    <pub-id pub-id-type="pmcid">5600148</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Patro</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Mount</surname>
                            <given-names>SM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Kingsford</surname>
                            <given-names>C</given-names>
                        </name>
					</person-group>:
                    <article-title>Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2014</year>;<volume>32</volume>(<issue>5</issue>):<fpage>462</fpage>&#x2013;<lpage>464</lpage>.
                    <pub-id pub-id-type="pmid">24752080</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.2862</pub-id>
                    <pub-id pub-id-type="pmcid">4077321</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Bray</surname>
                            <given-names>NL</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Pimentel</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Melsted</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Near-optimal probabilistic RNA-seq quantification.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2016</year>;<volume>34</volume>(<issue>5</issue>):<fpage>525</fpage>&#x2013;<lpage>527</lpage>.
                    <pub-id pub-id-type="pmid">27043002</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3519</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Anders</surname>
                            <given-names>S</given-names>
                        </name>
					</person-group>:
                    <article-title>Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2014</year>;<volume>15</volume>(<issue>12</issue>):<fpage>550</fpage>.
                    <pub-id pub-id-type="pmid">25516281</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-014-0550-8</pub-id>
                    <pub-id pub-id-type="pmcid">4302049</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Trapnell</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hendrickson</surname>
                            <given-names>DG</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Sauvageau</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Differential analysis of gene regulation at transcript resolution with RNA-seq.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2013</year>;<volume>31</volume>(<issue>1</issue>):<fpage>46</fpage>&#x2013;<lpage>53</lpage>.
                    <pub-id pub-id-type="pmid">23222703</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.2450</pub-id>
                    <pub-id pub-id-type="pmcid">3869392</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Heller</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Manduchi</surname>
                            <given-names>E</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Grant</surname>
                            <given-names>GR</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>A flexible two-stage procedure for identifying gene sets that are differentially expressed.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2009</year>;<volume>25</volume>(<issue>8</issue>):<fpage>1019</fpage>&#x2013;<lpage>1025</lpage>.
                    <pub-id pub-id-type="pmid">19213738</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btp076</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Lappalainen</surname>
                            <given-names>T</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Sammeth</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Friedl&#x00e4;nder</surname>
                            <given-names>MR</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Transcriptome and genome sequencing uncovers functional variation in humans.</article-title>
                    <source>
						
                        <italic toggle="yes">Nature.</italic>
					</source>
                    <year>2013</year>;<volume>501</volume>(<issue>7468</issue>):<fpage>506</fpage>&#x2013;<lpage>511</lpage>.
                    <pub-id pub-id-type="pmid">24037378</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature12531</pub-id>
                    <pub-id pub-id-type="pmcid">3918453</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Collado-Torres</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Nellore</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Kammers</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Reproducible RNA-seq analysis using 
                        <italic toggle="yes">recount2</italic>.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2017</year>;<volume>35</volume>(<issue>4</issue>):<fpage>319</fpage>&#x2013;<lpage>321</lpage>.
                    <pub-id pub-id-type="pmid">28398307</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3838</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
					</person-group>:
                    <article-title>Scripts used in constructing and evaluating the simulated data for Swimming Downstream</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1293899">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
					</person-group>:
                    <article-title>Simulation data (1) for Swimming Downstream: pairs of samples 1-4</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1291375">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
					</person-group>:
                    <article-title>Simulation data (2) for Swimming Downstream: pairs of samples 5-8</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1291404">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
					</person-group>:
                    <article-title>Simulation data (3) for Swimming Downstream, pairs of samples 9-12</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1291443">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
					</person-group>:
                    <article-title>Quantification files for Swimming Downstream</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1291522">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hogenesch</surname>
                            <given-names>JB</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Irizarry</surname>
                            <given-names>RA</given-names>
                        </name>
					</person-group>:
                    <article-title>Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2016</year>;<volume>34</volume>(<issue>12</issue>):<fpage>1287</fpage>&#x2013;<lpage>1291</lpage>.
                    <pub-id pub-id-type="pmid">27669167</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3682</pub-id>
                    <pub-id pub-id-type="pmcid">5143225</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Frazee</surname>
                            <given-names>AC</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Jaffe</surname>
                            <given-names>AE</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Langmead</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>
                        <italic toggle="yes">Polyester</italic>: simulating RNA-seq datasets with differential transcript expression.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2015</year>;<volume>31</volume>(<issue>17</issue>):<fpage>2778</fpage>&#x2013;<lpage>2784</lpage>.
                    <pub-id pub-id-type="pmid">25926345</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv272</pub-id>
                    <pub-id pub-id-type="pmcid">4635655</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
					</person-group>:
                    <article-title>Towards unified quality verification of synthetic count data with 
                        <italic toggle="yes">countsimQC</italic>.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2018</year>;<volume>34</volume>(<issue>4</issue>):<fpage>691</fpage>&#x2013;<lpage>692</lpage>.
                    <pub-id pub-id-type="pmid">29028961</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btx631</pub-id>
                    <pub-id pub-id-type="pmcid">5860609</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>K&#x00f6;ster</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Rahmann</surname>
                            <given-names>S</given-names>
                        </name>
					</person-group>:
                    <article-title>Snakemake--a scalable bioinformatics workflow engine.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2012</year>;<volume>28</volume>(<issue>19</issue>):<fpage>2520</fpage>&#x2013;<lpage>2522</lpage>.
                    <pub-id pub-id-type="pmid">22908215</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts480</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Di Tommaso</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Chatzou</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Floden</surname>
                            <given-names>EW</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Nextflow enables reproducible computational workflows.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Biotechnol.</italic>
					</source>
                    <year>2017</year>;<volume>35</volume>(<issue>4</issue>):<fpage>316</fpage>&#x2013;<lpage>319</lpage>.
                    <pub-id pub-id-type="pmid">28398311</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.3820</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Benjamini</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hochberg</surname>
                            <given-names>Y</given-names>
                        </name>
					</person-group>:
                    <article-title>Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.</article-title>
                    <source>
						
                        <italic toggle="yes">J R Stat Soc Series B Stat Methodol.</italic>
					</source>
                    <year>1995</year>;<volume>57</volume>(<issue>1</issue>):<fpage>289</fpage>&#x2013;<lpage>300</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.jstor.org/stable/2346101?seq=1#page_scan_tab_contents">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Anders</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Reyes</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
					</person-group>:
                    <article-title>Detecting differential usage of exons from RNA-seq data.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Res.</italic>
					</source>
                    <year>2012</year>;<volume>22</volume>(<issue>10</issue>):<fpage>2008</fpage>&#x2013;<lpage>2017</lpage>.
                    <pub-id pub-id-type="pmid">22722343</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.133744.111</pub-id>
                    <pub-id pub-id-type="pmcid">3460195</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Matthes</surname>
                            <given-names>KL</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Nowicka</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Isoform prefiltering improves performance of count-based methods for analysis of differential transcript usage.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2016</year>;<volume>17</volume>(<issue>1</issue>):<fpage>12</fpage>.
                    <pub-id pub-id-type="pmid">26813113</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-015-0862-3</pub-id>
                    <pub-id pub-id-type="pmcid">4729156</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Yi</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Pimentel</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bray</surname>
                            <given-names>NL</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Gene-level differential analysis at transcript-level resolution.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2018</year>;<volume>19</volume>(<issue>1</issue>):<fpage>53</fpage>.
                    <pub-id pub-id-type="pmid">29650040</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-018-1419-z</pub-id>
                    <pub-id pub-id-type="pmcid">5896116</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Trincado</surname>
                            <given-names>JL</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Entizne</surname>
                            <given-names>JC</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hysenaj</surname>
                            <given-names>G</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions.</article-title>
                    <source>
						
                        <italic toggle="yes">Genome Biol.</italic>
					</source>
                    <year>2018</year>;<volume>19</volume>(<issue>1</issue>):<fpage>40</fpage>.
                    <pub-id pub-id-type="pmid">29571299</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-018-1417-1</pub-id>
                    <pub-id pub-id-type="pmcid">5866513</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robinson</surname>
                            <given-names>MD</given-names>
                        </name>
					</person-group>:
                    <article-title>iCOBRA: open, reproducible, standardized and live method benchmarking.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Methods.</italic>
					</source>
                    <year>2016</year>;<volume>13</volume>(<issue>4</issue>):<fpage>283</fpage>.
                    <pub-id pub-id-type="pmid">27027585</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3805</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Leng</surname>
                            <given-names>N</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Dawson</surname>
                            <given-names>JA</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Thomson</surname>
                            <given-names>JA</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2013</year>;<volume>29</volume>(<issue>8</issue>):<fpage>1035</fpage>&#x2013;<lpage>1043</lpage>.
                    <pub-id pub-id-type="pmid">23428641</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btt087</pub-id>
                    <pub-id pub-id-type="pmcid">3624807</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Lund</surname>
                            <given-names>SP</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Nettleton</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>McCarthy</surname>
                            <given-names>DJ</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates.</article-title>
                    <source>
						
                        <italic toggle="yes">Stat Appl Genet Mol Biol.</italic>
					</source>
                    <year>2012</year>;<volume>11</volume>(<issue>5</issue>): pii: /j/sagmb.2012.11.issue-5/1544-6115.1826/1544-6115.1826.xml.
                    <pub-id pub-id-type="pmid">23104842</pub-id>
                    <pub-id pub-id-type="doi">10.1515/1544-6115.1826</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Tibshirani</surname>
                            <given-names>R</given-names>
                        </name>
					</person-group>:
                    <article-title>Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-seq data.</article-title>
                    <source>
						
                        <italic toggle="yes">Stat Methods Med Res.</italic>
					</source>
                    <year>2013</year>;<volume>22</volume>(<issue>5</issue>):<fpage>519</fpage>&#x2013;<lpage>536</lpage>.
                    <pub-id pub-id-type="pmid">22127579</pub-id>
                    <pub-id pub-id-type="doi">10.1177/0962280211428386</pub-id>
                    <pub-id pub-id-type="pmcid">4605138</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Pimentel</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bray</surname>
                            <given-names>NL</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Puente</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Differential analysis of RNA-seq incorporating quantification uncertainty.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Methods.</italic>
					</source>
                    <year>2017</year>;<volume>14</volume>(<issue>7</issue>):<fpage>687</fpage>&#x2013;<lpage>690</lpage>.
                    <pub-id pub-id-type="pmid">28581496</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.4324</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-41">
                <label>41</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>YI</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Knowles</surname>
                            <given-names>DA</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Humphrey</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Annotation-free quantification of RNA splicing using LeafCutter.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Genet.</italic>
					</source>
                    <year>2018</year>;<volume>50</volume>(<issue>1</issue>):<fpage>151</fpage>&#x2013;<lpage>158</lpage>.
                    <pub-id pub-id-type="pmid">29229983</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41588-017-0004-9</pub-id>
                    <pub-id pub-id-type="pmcid">5742080</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-42">
                <label>42</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Love</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Soneson</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Patro</surname>
                            <given-names>R</given-names>
                        </name>
					</person-group>:
                    <article-title>Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.1293914">Data Source</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report35682">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.16780.r35682</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Schurch</surname>
                        <given-names>Nick</given-names>
                    </name>
                    <xref ref-type="aff" rid="r35682a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9068-9654</uri>
                </contrib>
                <aff id="r35682a1">
                    <label>1</label>Division&#x00a0;of&#x00a0;Computational Biology, School of Life Sciences, University of Dundee, Dundee, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>I am first author of a paper for a DTU tool (RATs https://www.biorxiv.org/content/early/2017/05/02/132761) that is currently going through the publication process. In it we clearly highlight that existing DTU tools including those used here do not perform well.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>13</day>
                <month>8</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Schurch N</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport35682" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15398.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>In 'Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification' Love, Soneson &amp; Patro present both 1) a workflow for identifying the signatures of differential transcript usage between RNA-seq samples in two conditions, based on a suite of tools, and 2) a benchmarking analysis of the performance of these tools based on simulated data. The aims of this work are laudable and I have no doubt it will be a valuable addition to the literature; however, the resulting paper suffers from several flaws and needs considerable additional work, in my opinion.</p>
            <p> </p>
            <p> 
                <bold>Major comments:</bold>
            </p>
            <p> </p>
            <p> 1) The intermingling of the benchmarking and workflow sections of this manuscript makes the text confusing and difficult to read. I'd suggest that the authors either restructure the manuscript, beginning with the workflow section and then following with the benchmarking section, or split the work into two and concentrate separately on the two areas.</p>
            <p> </p>
            <p> 2) This work is listed as a Method article. I am not convinced that an example of stringing existing tools together fits the description required for this section (that is: "Method Articles describe a new experimental, observational, or computational method, test or procedure (basic or clinical research)."). The benchmarking part of the work is better suited to a Research Article, whilst the workflow part is more like a computational protocol and might be better suited for publication as a Study Protocol.</p>
            <p> </p>
            <p> 3) Quantifying transcript expression from RNA-seq data is challenging but has become common-place and relatively straight-forward thanks to the development of high-performance tools such as Salmon and Kallisto. These tools typically provide a transcripts-per-million (TPM) estimate of a transcript's expression. With these quantifications in place the inevitable, and even more challenging, next step is to identify those transcripts whose expression is changing between samples. To date there has not been a clear data-driven exploration of the underlying statistical properties of TPM quantifications (or estimated transcript counts from TPMs) as a function of biological and technical replication. Instead, much as was the case for differential gene expression from RNA-seq data until relatively recently, the tools for identifying DTE are built on the strong assumption of a distribution for the quantifications and, typically, assume a negative binomial distribution. Although this looks to be a good assumption in the case of gene expression, it is far from clear to me that a negative binomial distribution is a good assumption for a transcript's TPM, or for estimated counts derived from TPMs, across biological replicates, particularly given that, in the context of biological DTU, the expression of a transcript can be strongly correlated with the other child transcripts of the gene. The fixed per-gene dispersion section seems like the beginnings of an exploration in this area but this assumption too is without any justification. Perhaps the authors could use some highly replicated data from a complex eukaryote to actually measure these distributions and give clarity on whether these assumptions are valid? Or, failing that, explore the impact of different potential distributions on the tool performance?</p>
            <p> </p>
            <p> 4) The entire discussion section of the benchmarking results is essentially missing and the current discussion section is more like a brief conclusion. Points that I would like to see the authors discuss in detail include: 
                <list list-type="bullet">
                    <list-item>
                        <p>The low overall TPRs exhibited by all the tools; 25-80% for DTU, 50-80% for DGE &amp; only 20-50% for DTE. What does this mean for the field, and how might these be improved?</p>
                    </list-item>
                    <list-item>
                        <p>The TPR/FPR performance of the tools not only as a function of the sample size, but also as a function of the annotation used in the original transcript quantitations, as a function of the effect-size threshold used and as a function of the low-count-rate filtering used for each tool. These are all critical parameters in the tools' performance.</p>
                    </list-item>
                    <list-item>
                        <p>An expanded discussion of the extremely poor FPR performance of DRIMSeq, which is largely glossed over in the current text. Why is DRIMSeq performing so poorly? Is it more or less dependent on the specific parameters used, or the details of the simulated data, than the other tools, or is it just generically over-sensitive across the whole parameter space?</p>
                    </list-item>
                    <list-item>
                        <p>The overlap between the sets of DTU, DGE &amp; DTE identified by each tool; at present the authors just give us some numbers and the TPR/FPR performance metrics. Are these tools reliably identifying the same thing or are they finding wildly different sets of results? (But please, no Venn diagrams! I can respectfully recommend UpSetR for this kind of plot.)</p>
                    </list-item>
                    <list-item>
                        <p>The use of p-values, adjusted or not, as a threshold for subsetting these results for scientific relevance, particularly given Blume et al. 2018
                            <sup>
                                <xref ref-type="bibr" rid="rep-ref-35682-1">1</xref>
                            </sup>.</p>
                    </list-item>
                    <list-item>
                        <p>Some discussion of why the authors limit themselves to discussing DRIMSeq, DEXSeq and SUPPA2 despite listing five additional alternative methods in the introduction. Alternatively, the authors could include these tools in their benchmarking, particularly if they decided to split the work into two papers with one of these focussing on the benchmarking.</p>
                    </list-item>
                    <list-item>
                        <p>Some discussion of the impact that the development of long-read sequencing of native RNAs will have on this field, these tools, and their results in the next few years. Perhaps the authors could even use some of the publicly available data from the Oxford Nanopore RNA consortium (https://github.com/nanopore-wgs-consortium/NA12878/blob/master/RNA.md) to contrast the performance of this new technology with the tools they examine here for detecting DTE and DTU.</p>
                    </list-item>
                    <list-item>
                        <p>How do these tools cope with RNA-seq experiments with more complex designs? For example, what if there are 7 conditions, or a time-series (see for example Calixto et al., 2018
                            <sup>
                                <xref ref-type="bibr" rid="rep-ref-35682-2">2</xref>
                            </sup>)? What approaches would the authors then recommend?</p>
                    </list-item>
                </list> </p>
            <p> 5) No effort has been made to test these workflows with real data with validated instances of DTU. These exist in the published literature. For a workflow description this is fine, but for the benchmarking aspect of the work I would like to see the authors use this pipeline in anger, with real data, and see what the results are and how they match up with the validated results.</p>
            <p> </p>
            <p> 6) The introduction does not motivate the importance of identifying DTU in biology. I'd like to see the introduction present the biological relevance of DTU, the relative sparsity of existing validated DTU instances, and the scope DTU has for being an explored layer of regulation for basic biological processes.</p>
            <p> </p>
            <p> 7) The only conclusion from the paper seems to be that the authors recommend the use of stageR, based largely on the fact that its two-stage model matches what the authors think a typical analysis workflow is. This conclusion may be sound advice but a) this paper does not present any compelling *evidence* that this is a typical workflow, and b) stageR is not really what this paper is about. Indeed, here stageR is used as a framework to assist with assessing the performance of the other tools. I'd like to see the authors instead draw some clear conclusions about which tools are the best to use for identifying DTU.</p>
            <p> </p>
            <p> 
                <bold>Minor Comments:</bold>
            </p>
            <p> </p>
            <p> 1) The workflow section really needs some workflow diagrams to highlight the chain for each tool and where they are similar and different.</p>
            <p> </p>
            <p> 2) The plots in the paper are not as high quality as I'd expect:</p>
            <p> &#x00a0;- Figures need to be higher resolution (this may be the journal's fault, not the authors')</p>
            <p> &#x00a0;- Figures 3, 5, 6, 8, 12 &amp; 13 are multi-panel figures with the same axes on each panel. They would benefit from being plotted with shared axes, allowing the performance between different sample sizes to be more clearly visible to the reader.</p>
            <p> &#x00a0;- Figures 9-11: perhaps consider using a multi-panel 2d histogram to show the density profiles for each group, or at least using a better point symbol.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>No</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Bioinformatics, RNA-seq, transcriptomics tools, benchmarking</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-35682-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Second-generation p-values: Improved rigor, reproducibility, &amp; transparency in statistical analyses.</article-title>
                        <source>
                            <italic>PLoS One</italic>
                        </source>.<year>2018</year>;<volume>13</volume>(<issue>3</issue>) :
                        <elocation-id>10.1371/journal.pone.0188299</elocation-id>
                        <fpage>e0188299</fpage>
                        <pub-id pub-id-type="pmid">29565985</pub-id>
                        <pub-id pub-id-type="doi">10.1371/journal.pone.0188299</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-35682-2">
                    <label>2</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Rapid and Dynamic Alternative Splicing Impacts the Arabidopsis Cold Response Transcriptome.</article-title>
                        <source>
                            <italic>Plant Cell</italic>
                        </source>.<year>2018</year>;<volume>30</volume>(<issue>7</issue>) :
                        <elocation-id>10.1105/tpc.18.00177</elocation-id>
                        <fpage>1424</fpage>-<lpage>1444</lpage>
                        <pub-id pub-id-type="pmid">29764987</pub-id>
                        <pub-id pub-id-type="doi">10.1105/tpc.18.00177</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment3966-35682">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Love</surname>
                            <given-names>Michael</given-names>
                        </name>
                        <aff>University of North Carolina at Chapel Hill, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>11</day>
                    <month>9</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank all reviewers for their insightful comments and suggestions that we feel have greatly improved the readability and usefulness of the workflow. We summarize the main changes and then address reviewer-specific comments point-by-point: 
                    <list list-type="bullet">
                        <list-item>
                            <p>We have addressed all minor text or grammatical suggestions by the reviewers.</p>
                        </list-item>
                        <list-item>
                            <p>We have re-organized the article into distinct, clearly separated Workflow and Evaluation sections, as suggested by all reviewers. We begin the article with a clear outline, titled "Structure of this article", which covers the Workflow part and the Evaluation part. This outline has direct links to the relevant sections and subsections which follow. We have also included an overview diagram of the methods and packages included in the Workflow section, and how they are interconnected.</p>
                        </list-item>
                        <list-item>
                            <p>We have added to the Introduction more motivational text on why a DTU analysis is relevant for biology and biomedical research.</p>
                        </list-item>
                        <list-item>
                            <p>We have added a large section describing the methods DEXSeq and DRIMSeq, before the Workflow section.</p>
                        </list-item>
                        <list-item>
                            <p>We have expanded the original sections discussing counts-from-abundance and their use in the workflow, to make our use of the tximport method clearer.</p>
                        </list-item>
                        <list-item>
                            <p>For the DEXSeq section, we have corrected an earlier incorrect use of nbinomLRT(), which is now replaced with the correct testForDEU(). The practical result is that DEXSeq performs somewhat less conservatively, but the original code was incorrect, and the fix is necessary. The incorrect use of nbinomLRT() in this context will now produce an error in future releases of Bioconductor, to avoid possible incorrect usage.</p>
                        </list-item>
                        <list-item>
                            <p>We have added RATs to the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We now apply stageR to all DTU methods that are evaluated: DRIMSeq, DEXSeq, RATs, and SUPPA2. The RATs and SUPPA2 methods are described, but the code is not provided, as these packages are not part of the Workflow.</p>
                        </list-item>
                        <list-item>
                            <p>We use consistent x-axes and y-axes whenever possible, and use PDF instead of JPG to reduce compression artifacts. When a consistent x-axis is not used in the main text, we include Supplementary Figures with the same plots with outlying methods dropped to keep the x-axis consistent.</p>
                        </list-item>
                        <list-item>
                            <p>We use a palette in which colors are more discernible for color-blind readers.</p>
                        </list-item>
                        <list-item>
                            <p>In the Evaluation sections, we include additional plots which examine the simulated gene type source of false positives for the DTU, DGE, and DTE analyses.</p>
                        </list-item>
                        <list-item>
                            <p>We added a new evaluation to examine performance differences between DRIMSeq and DEXSeq, using the identical simulated data that was used in Soneson et al. (2016) and Nowicka and Robinson (2016).</p>
                        </list-item>
                        <list-item>
                            <p>We have added a 2 vs 2 simulation for the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We added a brief overview description of all methods assessed in the DGE and DTE Evaluations.</p>
                        </list-item>
                        <list-item>
                            <p>We have added more recommendations in the Discussion.</p>
                        </list-item>
                    </list> Reviewer-specific comments:</p>
                <p> </p>
                <p> </p>
                <p> 1) We have followed the reviewer's suggestion, and have separated the Workflow and Evaluation sections, with an outline at the beginning clearly delineating the two sections, and an overview diagram.</p>
                <p> </p>
                <p> 2) We originally submitted our Bioconductor workflow as a "Research" article, but the Editorial Office recommended changing the categorization to "Method", which is the categorization of many of the other Bioconductor workflows. Bioconductor workflows are not intended to introduce new computational methods or new software packages, but to demonstrate, with live code that resides in an Rmarkdown vignette within an R package structure, how to use a number of different existing Bioconductor packages to analyze a dataset.</p>
                <p> </p>
                <p> We asked for comment from the Editorial Office on the recommended categorization of Bioconductor workflows under the F1000Research article types, and they provided us with the following statement:</p>
                <p> </p>
                <p> "
                    <italic>In general, Bioconductor workflows are classified as Method articles in F1000Research, since Research articles must present novel research findings, and Software Tool articles must present novel software tools. Since this article by Love et al neither presented novel research findings nor a new software tool, the F1000Research editorial office felt that classifying this article as a Method article was most appropriate. The majority of workflows submitted to the Bioconductor gateway will fall into this article type.</italic>" -F1000Research Editorial Office</p>
                <p> </p>
                <p> 3) We have followed the reviewer's suggestion and included, in addition to the fixed per-gene dispersion simulation, an additional simulation from Soneson et al. (2016), to assess differences between DRIMSeq and DEXSeq, the two methods that are the focus of the workflow. This simulation involved generating Negative Binomial gene counts, after which the expression was distributed from genes to transcripts by per-sample draws from a Dirichlet distribution, with a minority of genes undergoing DTU across conditions. Analysis of additional datasets, and a final determination of which type of data-generating process is closer to various real RNA-seq datasets, is beyond the scope of this workflow, but we feel that the existing simulations cover a range of possibilities and are useful to the readers of the workflow. We comment in a number of places on the limitations of the simulation, including in the overview:</p>
                <p> </p>
                <p> "
                    <italic>While the evaluations rely on simulated data, and are therefore relevant only to the extent that the simulation model and parameters reflect real data, we feel the evaluations are useful for a rough comparison of method performance, and for observing relative changes in performance for a given method as sample size increases.</italic>"</p>
                <p> </p>
                <p> Also at the end of the DTU Evaluation:</p>
                <p> </p>
                <p> "
                    <italic>Again, a caveat of all of our comparative evaluations of DRIMSeq and DEXSeq is that we do not know whether various real RNA-seq experiments will more closely reflect heterogeneous dispersion or fixed dispersion within genes, or if the counts within gene are better modeled by distributing gene-level abundance to transcripts via a Dirichlet distribution as in Soneson et al (2016). However, we have examined simulations reflecting each of these cases, and confirmed that minimum count and minimum proportion filtering benefit both DRIMSeq and DEXSeq.</italic>"</p>
                <p> </p>
                <p> 4) We now include more discussion on the results of the evaluations in the Discussion, including a comment on statistical power. We include a breakdown of false positives by the simulated gene type. A further breakdown of all methods' performance by incomplete annotation, effect size filters, and various count or proportion filters is beyond the scope of the article. A complete analysis of the overlap of calls across the various simulations and analyses is also beyond the scope of the article.</p>
                <p> </p>
                <p> We now explore DRIMSeq's performance in the "main" and "fixed per-gene dispersion" simulations, wherein we see that many of the excess false positives at the transcript level arise from simulated DTU genes: within these genes, other transcripts not participating in DTU were reported as significant. In the &#x201c;main&#x201d; simulation, where DRIMSeq has the most difficulty with FDR control, it only slightly exceeds a target 10% FDR at the gene level at per-group sample sizes of 6 and higher. With proportion SD filtering, DRIMSeq at the transcript level also shows only a small inflation of the target 10% FDR for per-group sample sizes of 6 and higher.</p>
                <p> </p>
                <p> We now include RATs as an additional method evaluated on the "main" simulation for DTU analysis. RATs performs similarly to SUPPA2, in that it nearly always controls the FDR, although in some cases it displays higher gene-level sensitivity than SUPPA2. We do not intend the article to be a complete evaluation of all existing methods for DTU, but to compare the two Bioconductor methods that are the focus of the workflow with a few key DTU methods.</p>
                <p> </p>
                <p> Extended discussion of long-read sequencing is beyond the scope of the article, although we added the following comment to the workflow section on importing counts:</p>
                <p> </p>
                <p> "
                    <italic>If a different experiment is performed and a different quantification method used to produce counts per transcript which do not scale with transcript length, then the recommendation would be to use these counts per transcript directly. Examples of experiments producing counts per transcript that would potentially not scale with transcript length include counts of full-transcript-length or nearly-full-transcript-length reads, or counts of 3' tagged RNA-seq reads aggregated to transcript groups. In either case, the statistical methods for DTU could be provided directly with the transcript counts.</italic>"</p>
                <p> </p>
                <p> A relevant quote from Nowicka and Robinson (2016) is:</p>
                <p> </p>
                <p> "
                    <italic>With emerging technologies that sequence longer DNA fragments (either truly or synthetically), we may see in the near future more direct counting of full-length transcripts, making transcript-level quantification more robust and accurate.</italic>"</p>
                <p> </p>
                <p> In the "DTU testing" section, we now discuss how DEXSeq and DRIMSeq can be used to evaluate experiments with complex designs, with few limitations as long as the experimental design can be encoded as a design matrix multiplied by a vector of coefficients.</p>
                <p> </p>
                <p> 5) Comprehensive evaluation of the methods on additional datasets is beyond the scope of the article.</p>
                <p> </p>
                <p> 6) Following the suggestion of this and other reviewers, we have now added motivation to the first part of the Introduction as to why DTU is relevant for biological or biomedical research.</p>
                <p> </p>
                <p> 7) We have revised some of our description of the stageR framework to be clearer about why we recommend its use in a DTU workflow:</p>
                <p> </p>
                <p> "
                    <italic>It is likely that an investigator would want both a list of statistically significant genes and transcripts participating in DTU, and stageR provides error control on this pair of lists, assuming that the underlying tests are well calibrated.</italic>"</p>
                <p> </p>
                <p> We also provide some more details in the Discussion regarding the various methods and their performance.</p>
                <p> </p>
                <p> Minor Comments:</p>
                <p> </p>
                <p> 1) We have added an overview diagram as Figure 1.</p>
                <p> </p>
                <p> 2) We have updated figures to be PDF instead of JPG, and made the axes more consistent when possible.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report35546">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.16780.r35546</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Oshlack</surname>
                        <given-names>Alicia</given-names>
                    </name>
                    <xref ref-type="aff" rid="r35546a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9788-5690</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Cmero</surname>
                        <given-names>Marek</given-names>
                    </name>
                    <xref ref-type="aff" rid="r35546a1">1</xref>
                    <role>Co-referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7783-5530</uri>
                </contrib>
                <aff id="r35546a1">
                    <label>1</label>Murdoch Children's Research Institute, Royal Children's Hospital, Parkville, Vic, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>30</day>
                <month>7</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Oshlack A and Cmero M</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport35546" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15398.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>A workflow to enable more people to perform differential transcript usage on their RNA-seq data set is a useful addition to the literature. Benchmarking methods and combinations of workflows are also an important part of the literature. In this manuscript, both things have been attempted, which unfortunately makes the manuscript a little blurred in its focus.</p>
            <p> </p>
            <p> We view a workflow as an instructional manuscript in which a step-by-step analysis can be reproduced with a new data set that a user wants to bring to the analysis. This is presented in the sections "Quantification and data import" and "Statistical analysis of differential transcript usage" and, in our view, should be the focus of the manuscript. These are complex analyses combining several packages with several alternative paths. It would really help the user if a flowchart for this analysis could be made that shows the common parts of the workflow (e.g. starting with Salmon, importing into R), how the alternatives split, and which packages are used for alternative parts of the workflow. For example, DRIMSeq is an alternative to DEXSeq, which can then be followed by stageR, and SUPPA2 is a complete (parallel) workflow.</p>
            <p> </p>
            <p> The evaluation sections are somewhat useful and interesting in their own right, but rely on simulated data and are therefore not directly applicable to readers who are looking for workflows to guide them in their own data analysis. However, they do help users decide which workflows to choose in their own analysis.</p>
            <p> </p>
            <p> Overall, we wonder if this manuscript could be two separate manuscripts: a workflow for DTU and an evaluation of methods based on simulated data. Another (preferable) alternative would be to focus only on DTU in the evaluation and keep the section "Evaluation of methods for DTU" as a guide to help the user choose the workflow (with this clearly stated). We felt there were too many additional analyses introduced after this point which relied on a more in-depth understanding of the DGE literature, which was not really the focus of the workflow.</p>
            <p> </p>
            <p> 
                <bold>Minor comments:</bold>
            </p>
            <p> Several sections should be edited for clarity and flow of ideas. Specifically, 
                <list list-type="bullet">
                    <list-item>
                        <p>page 6: "We recommend scaledTPM for differential transcript usage so that the estimated proportions fit by DRIMSeq in the following sections correspond to the proportions of underlying abundance." Could the authors please rewrite/break up this sentence to improve readability?</p>
                    </list-item>
                    <list-item>
                        <p>page 6, section 'Import counts into R/Bioconductor': the authors should clarify whether the referenced R package is for demonstration purposes only (i.e. should the user install the rnaseqDTU to perform any of the workflow?).</p>
                    </list-item>
                    <list-item>
                        <p>page 6: could the concept of using counts from abundance be introduced/explained before referring to specific package parameters and settings?</p>
                    </list-item>
                    <list-item>
                        <p>page 6: "The following code chunk is not evaluated, but instead we will load a pre-constructed matrix of counts". Could the authors please clarify this sentence? We assume this means that instead of constructing a matrix of counts (as in a typical workflow), pre-constructed data is loaded.</p>
                    </list-item>
                    <list-item>
                        <p>page 7 "We ran the following unevaluated code chunks": does 'unevaluated' refer to not run in a typical workflow?</p>
                    </list-item>
                    <list-item>
                        <p>page 7, 'Statistical analysis of differential transcript usage', second paragraph: could the description of txdf be moved to the previous section where it is constructed? This would help improve the flow.</p>
                    </list-item>
                    <list-item>
                        <p>page 12: "(2) contain a transcript with a transcript adjusted p-value less than 0.05 which does not participate in DTU, so contain a falsely confirmed transcript": could the authors please rewrite this sentence for clarity.</p>
                    </list-item>
                    <list-item>
                        <p>page 13: the sentence "The testing of &#x201c;this&#x201d; vs &#x201c;others&#x201d;..." could be improved for clarity, e.g.: "DEXSeq in its original version requires fitting of coefficients for each exon within a gene. Running DEXSeq at the transcript level considerably improves performance, as fewer features per gene require fitting of coefficients."</p>
                    </list-item>
                    <list-item>
                        <p>page 14, after the line "dxr &lt;- as.data.frame(dxr[,columns])": showing head(dxr) could help in clarifying the output.</p>
                    </list-item>
                    <list-item>
                        <p>page 15, in the code "paste0("suppa/group1.tpm")": the paste function is not necessary here.</p>
                    </list-item>
                    <list-item>
                        <p>Section 'Evaluation of methods for DTU': could the authors offer an explanation why SUPPA2 only reported one DGE gene as DTU?</p>
                    </list-item>
                    <list-item>
                        <p>Could the x and y axes on the plots on pages 17-20 and 25 be made consistent with each other? Also, a very minor point, but these plots have some JPEG artefacts. Could PDF or PNG plots be used instead?</p>
                    </list-item>
                    <list-item>
                        <p>page 19: "DRIMSeq [...] performed slightly better": could the authors reference the metric by which the package performed better?</p>
                    </list-item>
                    <list-item>
                        <p>page 22: "We can repeat the same analysis...": 'same analysis' is misleading as this section tests only DGE.</p>
                    </list-item>
                    <list-item>
                        <p>page 24: could the authors formally introduce or describe EBSeq and SAMseq packages, preferably earlier in the manuscript?</p>
                    </list-item>
                    <list-item>
                        <p>page 26: could the authors use 'compute time' instead of 'timing'?</p>
                    </list-item>
                </list> </p>
            <p> We identified the following typographical errors and grammatical issues: 
                <list list-type="bullet">
                    <list-item>
                        <p>page 5: "We recommend [constructing] a CSV file..."</p>
                    </list-item>
                    <list-item>
                        <p>page 6: "We suggest for DTU analysis to generate counts from abundance..." reword to "For DTU analysis, we suggest generating counts from abundance..."</p>
                    </list-item>
                    <list-item>
                        <p>page 16: "DEXSeq controlled [the FDR] except for..."</p>
                    </list-item>
                    <list-item>
                        <p>page 16: "DRIMSeq had [an] observed FDR.."</p>
                    </list-item>
                    <list-item>
                        <p>page 16: "...reported 2 extra genes more than..." change to "reported two more genes than"</p>
                    </list-item>
                    <list-item>
                        <p>page 16: "...DEXseq were the most sensitive methods [for] recovering"</p>
                    </list-item>
                    <list-item>
                        <p>page 19 "...DRIMSeq and DEXSeq[,] [in] this additional simulation"</p>
                    </list-item>
                    <list-item>
                        <p>page 19: "Again, we caveat our comparative evaluation of DRIMSeq and DEXSeq by noting that we do not know..." change to "Again, a caveat of our comparative evaluation of DRIMSeq and DEXSeq is that we do not know..."</p>
                    </list-item>
                    <list-item>
                        <p>page 24: "did not have [adequate] sensitivity to detect DGE"</p>
                    </list-item>
                    <list-item>
                        <p>page 24: "while those two method[s] had"</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment3964-35546">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Love</surname>
                            <given-names>Michael</given-names>
                        </name>
                        <aff>University of North Carolina at Chapel Hill, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>11</day>
                    <month>9</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank all reviewers for their insightful comments and suggestions that we feel have greatly improved the readability and usefulness of the workflow. We summarize the main changes and then address reviewer-specific comments point-by-point: 
                    <list list-type="bullet">
                        <list-item>
                            <p>We have addressed all minor text or grammatical suggestions by the reviewers.</p>
                        </list-item>
                        <list-item>
                            <p>We have re-organized the article into distinct and more separated Workflow and Evaluation sections, which was suggested by all reviewers. We begin the article with a clear outline, titled: "Structure of this article", which outlines the Workflow part and the Evaluation part. This outline has direct links to relevant sections and subsections which follow. We have also included an overview diagram of the methods and packages included in the Workflow section, and how they are interconnected.</p>
                        </list-item>
                        <list-item>
                            <p>We have added to the Introduction more motivational text on why a DTU analysis is relevant for biology and biomedical research.</p>
                        </list-item>
                        <list-item>
                            <p>We have added a large section describing the methods DEXSeq and DRIMSeq, before the Workflow section.</p>
                        </list-item>
                        <list-item>
                            <p>We have expanded the original sections discussing counts-from-abundance and their use in the workflow, to make our use of the tximport method more clear.</p>
                        </list-item>
                        <list-item>
                            <p>For the DEXSeq section, we have corrected an earlier incorrect use of nbinomLRT(), which is now replaced with the correct testForDEU(). The practical result is that DEXSeq performs somewhat less conservatively, but the original code was incorrect, and the fix is necessary. The incorrect use of nbinomLRT() in this context will now produce an error in future releases of Bioconductor, to avoid possible incorrect usage.</p>
                        </list-item>
                        <list-item>
                            <p>We have added RATs to the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We now apply stageR to all DTU methods that are evaluated: DRIMSeq, DEXSeq, RATs, and SUPPA2. The RATs and SUPPA2 methods are described, but the code is not provided, as these packages are not part of the Workflow.</p>
                        </list-item>
                        <list-item>
                            <p>We use consistent x-axes and y-axes whenever possible, and use PDF instead of JPG to reduce compression artifacts. When a consistent x-axis is not used in the main text, we include Supplementary Figures with the same plots with outlying methods dropped to keep the x-axis consistent.</p>
                        </list-item>
                        <list-item>
                            <p>We use a palette in which colors are more discernible for color-blind readers.</p>
                        </list-item>
                        <list-item>
                            <p>In the Evaluation sections, we include additional plots which examine the simulated gene type source of false positives for the DTU, DGE, and DTE analyses.</p>
                        </list-item>
                        <list-item>
                            <p>We added a new evaluation to examine performance differences between DRIMSeq and DEXSeq, using the identical simulated data that was used in Soneson et al (2016) and Nowicka and Robinson (2016).</p>
                        </list-item>
                        <list-item>
                            <p>We have added a 2 vs 2 simulation for the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We added a brief overview description of all methods assessed in the DGE and DTE Evaluations.</p>
                        </list-item>
                        <list-item>
                            <p>We have added more recommendations in the Discussion.</p>
                        </list-item>
                    </list> Reviewer-specific comments: 
                    <list list-type="bullet">
                        <list-item>
                            <p>We have tried to separate and clarify the Workflow section and the Evaluation section. We now include an overview diagram, as helpfully suggested here.</p>
                        </list-item>
                        <list-item>
                            <p>We have expanded the section on counts-from-abundance, added a section before the counts are imported, and clarified the sentences highlighted by the reviewers.</p>
                        </list-item>
                        <list-item>
                            <p>We have clarified a number of the "not evaluated" sentences in the original workflow.</p>
                        </list-item>
                        <list-item>
                            <p>The description of txdf is given in the section where it is constructed, under the heading "Transcript-to-gene mapping".</p>
                        </list-item>
                        <list-item>
                            <p>We have clarified the OFDR description in the sentence highlighted by the reviewers, and have removed the "this" vs "other" sentence, as the history of DEXSeq method development is not necessary or useful for the readers of this workflow.</p>
                        </list-item>
                        <list-item>
                            <p>We have added `head(dxr)` to demonstrate the output.</p>
                        </list-item>
                        <list-item>
                            <p>We have removed the SUPPA2 code, as the workflow now focuses on the Bioconductor packages DRIMSeq and DEXSeq, which have live code examples (SUPPA2 is a Python package and so cannot have live code examples in a Bioconductor workflow).</p>
                        </list-item>
                        <list-item>
                            <p>We have made the x- and y-axes consistent whenever possible.</p>
                        </list-item>
                        <list-item>
                            <p>We have revised the Workflow and Evaluation sections following all of the reviewers' helpful comments, error spotting, and suggestions on improved wording.</p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report35548">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.16780.r35548</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Vitting-Seerup</surname>
                        <given-names>Kristoffer</given-names>
                    </name>
                    <xref ref-type="aff" rid="r35548a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6450-0608</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Thodberg</surname>
                        <given-names>Malte</given-names>
                    </name>
                    <xref ref-type="aff" rid="r35548a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r35548a1">
                    <label>1</label>Department of Biology, Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>24</day>
                <month>7</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Vitting-Seerup K and Thodberg M</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport35548" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15398.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>
                <bold>Summary</bold>
            </p>
            <p> In &#x201c;Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification&#x201d; Love 
                <italic>et al</italic> present a combined workflow and benchmark for differential transcript usage. This is a vital paper as there is no consensus on which differential transcript usage tools work better (here addressed by the benchmark part) and very few people analyze differential transcript usage &#x2013; something the workflow can hopefully help with. Of special note is the extent to which open source has been embraced by Love 
                <italic>et al</italic> &#x2013; an approach that is commendable (and worth copying). Although the manuscript has a lot of potential, it can, in its current form, be challenging to read, and the benchmark of differential transcript usage needs to be extended. Revisions are therefore required.</p>
            <p> </p>
            <p> 
                <bold>Preface</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>Malte Thodberg helped me review this paper &#x2013; thanks Malte!</p>
                    </list-item>
                    <list-item>
                        <p>Since neither of us is a native English speaker/writer, we have not attempted to correct potential grammar and/or spelling mistakes.</p>
                    </list-item>
                    <list-item>
                        <p>I'm the developer of IsoformSwitchAnalyzeR.</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>General comments</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The article switches between describing a workflow, which users can follow to perform differential transcript usage on their own data, and a benchmark of differential expression/usage tools. The two sections should be much more clearly separated and each should be more concisely written. 
                            <list list-type="bullet">
                                <list-item>
                                    <p>One solution would be to have the benchmark first and the workflow afterwards. It would then be natural for the workflow to use the tool(s) deemed better by the benchmark.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>The main problem with the workflow part of the manuscript is the intermixing of the workflow and benchmarking (and the intro/methods) sections, which makes it necessary to include a lot of callouts, omissions and special cases. This has the unintended effect of cluttering the workflow, making it hard to read and/or follow. This would, however, be solved by the restructuring suggested above. If such a restructuring were implemented, it would also seem more natural for the workflow to consistently use only a small dataset (either a subset of the simulated data or another dataset entirely), whereby the workflow could be simplified a lot.</p>
                    </list-item>
                    <list-item>
                        <p>Although the benchmark is of high quality it still needs to be a bit more exhaustive.</p>
                    </list-item>
                    <list-item>
                        <p>(Even with the suggested restructuring) The whole article would benefit greatly from an overview paragraph and/or figure that gives the reader a high-level outline before jumping into the details (something like a table/figure/description of contents). This could also be a table of contents (with links included to enable easy jumping within the article).</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Title</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The title should reflect that it is a workflow and/or benchmark. The current title suggests the authors developed a new tool for differential transcript usage which was specifically designed to integrate with Salmon. Furthermore, the authors could consider changing the title so that it also indicates the differential gene/transcript expression analysis performed in the manuscript.</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Introduction</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The introduction lacks a section describing why differential transcript usage is of interest in the first place.</p>
                    </list-item>
                    <list-item>
                        <p>Large parts of what would normally be in the introduction and methods have been moved into the results. Introduction to tools and methods including descriptions of how they work belongs in the introduction. Description of parameter choice for e.g. scaling during tximport also belongs in intro/methods. 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Optional suggestion: include a layman's introduction to how the tools work (the technical details are in the original papers for people who are interested).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>In the section where tools for DTU are mentioned, please remove (or argue for the inclusion of) BitSeq and stageR. stageR is for post-analysis of p-values (it performs no test). Although BitSeq is listed under alternative splicing in BiocViews, neither the article nor the vignette shows anything but DTE (i.e. no DTU). Mention that SGSeq wraps DEXSeq.</p>
                    </list-item>
                    <list-item>
                        <p>The test built into IsoformSwitchAnalyzeR is not rank-based&#x00a0;&#x2013; but it is obsolete and will be removed in the next update &#x2013; so it could be skipped entirely (along with the other non-maintained tests).</p>
                    </list-item>
                    <list-item>
                        <p>Please reference IsoformSwitchAnalyzeR for its main purpose: the downstream analysis of functional consequences of identified isoform switches. Consider also mentioning other tools for downstream analysis (some can be found at&#x00a0;
                            <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/devel/BiocViews.html#___AlternativeSplicing">https://www.bioconductor.org/packages/devel/BiocViews.html#___AlternativeSplicing</ext-link> ).</p>
                    </list-item>
                    <list-item>
                        <p>To be more user-friendly please insert a link when mentioning the IsoformSwitchAnalyzeR vignette.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Methods</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>Please add the number of transcripts considered expressed (&gt;= 10 estimated fragment counts).</p>
                    </list-item>
                    <list-item>
                        <p>The simulations performed should either be named or numbered to allow for clear reference to which of the simulated datasets are used.</p>
                    </list-item>
                    <list-item>
                        <p>In the countsimQC report, please compare the simulated data to the 12 samples which were used as the basis of the simulation (comparing 12 samples to hundreds of samples is not easy to interpret).</p>
                    </list-item>
                    <list-item>
                        <p>Please elaborate on the discussion of the different options for scaling from TPM to counts. It is unclear what the difference is and when it matters. Furthermore, you write &#x201c;if we used lengthScaledTPM transcript counts, then a change in transcript usage among transcripts of different length could result in a changed total count for the gene, even if there is no change in total gene expression&#x201d;. Is there a mix-up here? If not, why do you then use lengthScaledTPM in the DGE/DTU section? Please include a recommendation of which option to use for analysis of DGE/DTE, for DTU, and when both are present in the data.</p>
                    </list-item>
                    <list-item>
                        <p>Modifications 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Include a paragraph on quantification before introducing the modifications. If any expression filtering was done (as Figure 1 indicates and as mentioned above), it should be clearly stated.</p>
                                </list-item>
                                <list-item>
                                    <p>Currently it is unclear how many genes were modified in which way. To remedy that, please provide a table indicating the number of genes modified for DTU or DGE by each of the changes you introduce (as well as the total number of genes modified).</p>
                                </list-item>
                                <list-item>
                                    <p>Why simulate DTU both with a modification of a single isoform and with a switch of two isoforms if you do not investigate whether it makes a difference? It seems redundant (more on that in the DGE benchmark).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>In the workflow</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>Please add a comment on why DRIMSeq returns NA for some p-values (that will confuse many people).</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Post-hoc filtering on DRIMSeq</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>What is the reasoning behind this filtering step? And is it statistically valid to do this filtering, given that the proportions and p-values are not independent? Is the modified p-value distribution still uniform in the interval [0.05-1[, enabling proper FDR correction?</p>
                    </list-item>
                    <list-item>
                        <p>If the filtering is statistically sound why not also do it for the other methods?</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Evaluation of methods for DTU</bold>. This is the major selling point of the article and the part that requires the most work. 
                <list list-type="bullet">
                    <list-item>
                        <p>To reflect a very common use case, the benchmark should also be performed with 2 replicates. Since the benchmark presented here shows quite subtle differences (in TPR vs FDR) between 9 and 12 replicates, the 2-replicate scenario could replace either of them.</p>
                    </list-item>
                    <list-item>
                        <p>The benchmark simulation should not be performed only once, as the exact samples used in that run will have a large effect (especially for the smaller comparisons). Instead, 25 simulations should be performed and the average iCOBRA plot could be shown (possibly extended to also show the variation across the simulations).</p>
                    </list-item>
                    <list-item>
                        <p>The benchmark must also include a run on unmodified simulated data to test how many false positives are found if there truly is no DTU (which might be the case for some datasets).</p>
                    </list-item>
                    <list-item>
                        <p>Be consistent and concise in the use of stageR. Either use it with no tools or use it with all tools (or both, to also enable a benchmark of stageR); otherwise the transcript-level FDRs are not comparable between tools. Highlight the difference between perGeneQValue and stageR (or only use one of them), or highlight where each is used. For example, it is not clear whether stageR was used in figure 3 and, if it was, whether it was used for all tools.</p>
                    </list-item>
                    <list-item>
                        <p>Given the success of repurposing DEXSeq to DTU, and the good performance of limma for DTE/DGE, the current benchmark could also test a repurposing of limma&#x2019;s (and edgeR&#x2019;s) differential exon usage test. This is optional &#x2013; but it would be a huge step forward for testing differential isoform usage as it would bring a lot of clarity to the field.</p>
                    </list-item>
                    <list-item>
                        <p>Use the same axes for the 4 iCOBRA plots to illustrate the improvement with increasing number of samples. Please include group sizes (e.g. 3 vs 3, 6 vs 6 etc.) in the figure to make it easier to read; they could replace the rather uninformative &#x201c;overall&#x201d; facet title.</p>
                    </list-item>
                    <list-item>
                        <p>Please comment: 
                            <list list-type="bullet">
                                <list-item>
                                    <p>On the large performance increase from &#x201c;Kallisto + DEXSeq&#x201d; in Soneson et al, Genome Biology 2016 (where FDR performance was quite poor) to the current &#x201c;Salmon + DEXSeq&#x201d;, which performs rather well.</p>
                                </list-item>
                                <list-item>
                                    <p>On the differences between your benchmark (indicating DEXSeq works better) and the benchmark performed by Nowicka et al in the DRIMSeq paper (indicating DRIMSeq works better).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Please move the evaluation with fixed per-gene dispersion to supplementary material as it is just a sanity check.</p>
                    </list-item>
                    <list-item>
                        <p>Please end section with a recommendation of what tool to use.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Evaluation of DTU vs DGE</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>This section belongs in the workflow part of the article.</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Evaluation of DGE/DTE</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The reason for (re)doing a DGE/DTU benchmark here needs to be clearly described (which is to test how tools perform when there is also underlying DTU, as hinted in Soneson 2016, F1000Research).</p>
                    </list-item>
                    <list-item>
                        <p>To reflect a very common use case, the benchmark should also be performed with 2 replicates. The 2-replicate scenario could replace either the 9 or the 12 replicates.</p>
                    </list-item>
                    <list-item>
                        <p>Table with runtime should be moved to supplementary as it can be summarized as &#x201c;sleuth is slower&#x201d;.</p>
                    </list-item>
                    <list-item>
                        <p>The TPR vs FDR figures are unreadable because too many lines are drawn on top of one another &#x2013; this must be fixed. Furthermore, use the same axes for the 4 iCOBRA plots to show the improvement with increasing number of samples. Please include group sizes in the figure to make it easier to read; they could replace the &#x201c;overall&#x201d; facet title.</p>
                    </list-item>
                    <list-item>
                        <p>The DGE results are quite surprising &#x2013; in other recent benchmarks most tools handle FDR quite well &#x2013; which is not the case here. 
                            <list list-type="bullet">
                                <list-item>
                                    <p>I suspect this might be due to the DGE where only a single isoform was changed (meaning the overall gene expression could change only marginally). Therefore, the authors should investigate how the benchmark results differ when considering only the DGE introduced with one isoform upregulated or only the DGE where all isoforms were upregulated.</p>
                                </list-item>
                                <list-item>
                                    <p>If the results hold up, a comment on how this compares to recent DGE benchmarks is necessary.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>If the problem rather seems to be the presence of DTU this should be highlighted and discussed.</p>
                    </list-item>
                    <list-item>
                        <p>For figure S2, please also include the sleuth result on the main simulated data; otherwise a direct comparison (to judge the effect of the GC content) is not feasible.</p>
                    </list-item>
                    <list-item>
                        <p>Please end section with a recommendation of what tools to use.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Discussion</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>There also needs to be a discussion around the benchmark part of the paper &#x2013; it is currently completely missing.</p>
                    </list-item>
                </list> </p>
            <p> Please don't hesitate to contact me if anything was unclear.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Bioinformatics with a focus on isoform usage analysis.</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment3965-35548">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Love</surname>
                            <given-names>Michael</given-names>
                        </name>
                        <aff>University of North Carolina at Chapel Hill, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>11</day>
                    <month>9</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank all reviewers for their insightful comments and suggestions that we feel have greatly improved the readability and usefulness of the workflow. We summarize the main changes and then address reviewer-specific comments point-by-point: 
                    <list list-type="bullet">
                        <list-item>
                            <p>We have addressed all minor text or grammatical suggestions by the reviewers.</p>
                        </list-item>
                        <list-item>
                            <p>We have re-organized the article into distinct and more separated Workflow and Evaluation sections, which was suggested by all reviewers. We begin the article with a clear outline, titled: "Structure of this article", which outlines the Workflow part and the Evaluation part. This outline has direct links to relevant sections and subsections which follow. We have also included an overview diagram of the methods and packages included in the Workflow section, and how they are interconnected.</p>
                        </list-item>
                        <list-item>
                            <p>We have added to the Introduction more motivational text on why a DTU analysis is relevant for biology and biomedical research.</p>
                        </list-item>
                        <list-item>
                            <p>We have added a large section describing the methods DEXSeq and DRIMSeq, before the Workflow section.</p>
                        </list-item>
                        <list-item>
                            <p>We have expanded the original sections discussing counts-from-abundance and their use in the workflow, to make our use of the tximport method more clear.</p>
                        </list-item>
                        <list-item>
                            <p>For the DEXSeq section, we have corrected an earlier incorrect use of nbinomLRT(), which is now replaced with the correct testForDEU(). The practical result is that DEXSeq performs somewhat less conservatively, but the original code was incorrect, and the fix is necessary. The incorrect use of nbinomLRT() in this context will now produce an error in future releases of Bioconductor, to avoid possible incorrect usage.</p>
                        </list-item>
                        <list-item>
                            <p>We have added RATs to the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We now apply stageR to all DTU methods that are evaluated: DRIMSeq, DEXSeq, RATs, and SUPPA2. The RATs and SUPPA2 methods are described, but the code is not provided, as these packages are not part of the Workflow.</p>
                        </list-item>
                        <list-item>
                            <p>We use consistent x-axes and y-axes whenever possible, and use PDF instead of JPG to reduce compression artifacts. When a consistent x-axis is not used in the main text, we include Supplementary Figures with the same plots with outlying methods dropped to keep the x-axis consistent.</p>
                        </list-item>
                        <list-item>
                            <p>We use a palette in which colors are more discernible for color-blind readers.</p>
                        </list-item>
                        <list-item>
                            <p>In the Evaluation sections, we include additional plots which examine the simulated gene type source of false positives for the DTU, DGE, and DTE analyses.</p>
                        </list-item>
                        <list-item>
                            <p>We added a new evaluation to examine performance differences between DRIMSeq and DEXSeq, using the identical simulated data that was used in Soneson et al (2016) and Nowicka and Robinson (2016).</p>
                        </list-item>
                        <list-item>
                            <p>We have added a 2 vs 2 simulation for the DTU Evaluation.</p>
                        </list-item>
                        <list-item>
                            <p>We added a brief overview description of all methods assessed in the DGE and DTE Evaluations.</p>
                        </list-item>
                        <list-item>
                            <p>We have added more recommendations in the Discussion.</p>
                        </list-item>
                    </list> Reviewer-specific comments:</p>
                <p> </p>
                <p> 
                    <underline>General comments</underline>
                </p>
                <p> </p>
                <p> We believe we have made the separation between Workflow and Evaluation much more clear now, and have added an outline to the beginning of the article with hyperlinks to subsections and with an overview diagram, as usefully suggested here.</p>
                <p> </p>
                <p> 
                    <underline>Title</underline>
                </p>
                <p> </p>
                <p> We believe the title is appropriate and does not suggest a new tool. The fact that existing tools are leveraged in the workflow is clear from the abstract and the main text.</p>
                <p> </p>
                <p> 
                    <underline>Introduction</underline>
                </p>
                <p> </p>
                <p> The Bioconductor workflows do not have a typical structure with Introduction, Methods, Results and Discussion, but instead a prolonged section where relevant concepts are typically introduced as needed. See, for example, the DESeq2 workflow: https://bioconductor.org/packages/rnaseqGene. We have now added overview descriptions of the methods DEXSeq and DRIMSeq before the Workflow section begins.</p>
                <p> </p>
                <p> We have removed BitSeq. We believed earlier that cjBitSeq, which is a new DTU method, was implemented in the Bioconductor package BitSeq, but it is a separate GitHub package (https://github.com/mqbssppe/cjBitSeq). Since we are listing Bioconductor packages that can be used for DTU, we now do not list BitSeq. We now have a separate sentence describing stageR and its connection to the DTU methods, and SGSeq (and we mention its leveraging of DEXSeq or limma).&#x00a0;</p>
                <p> </p>
                <p> We no longer mention the statistical test from Vitting-Seerup and Sandelin (2017). We use the suggested purpose description for IsoformSwitchAnalyzeR, link to the AlternativeSplicing BiocViews, and include a link to the IsoformSwitchAnalyzeR vignette.</p>
                <p> </p>
                <p> 
                    <underline>Methods</underline>
                </p>
                <p> </p>
                <p> We now include the number of transcripts with estimated counts greater than 10 in the Simulation. We name the various simulations, and use their name when referring to them in the main text or captions.</p>
                <p> </p>
                <p> Our purpose in using the countsimQC report is to compare the joint distribution of estimated parameters (mean, dispersion) from the simulation and from the dataset from which the estimates were derived. We therefore compare the 24 simulated samples to the 458 non-duplicated GEUVADIS samples that were used for the estimation of the mean and dispersion parameters. We have made this more clear in the caption of the countsimQC Supplementary Figure.</p>
                <p> </p>
                <p> We have elaborated on the discussion of the different options for counts-from-abundance, including the sentence about change in total counts. We include details on the recommended counts-from-abundance options throughout the text and in the overview diagram, Figure 1.</p>
                <p> </p>
                <p> We state whenever any expression filtering was done. The only expression filtering in the DTU section is performed by the filtering functions in DRIMSeq, and the TPM &gt; 1 filter to speed up SUPPA2 on the command line. We mention the various expression filters used by the different DGE and DTE methods in the Evaluation section for those methods. We include in the Simulation section the exact number of genes modified by simulated DGE, simulated DTE, and simulated DTU.</p>
                <p> We have added a comment on the NA p-values for DRIMSeq in the section in the workflow where they are replaced with a p-value of 1. The text now reads:</p>
                <p> </p>
                <p> "From investigating these NA p-value cases for DRIMSeq, they all occur when one condition group has all zero counts for a transcript, but sufficient counts from the other condition group, and sufficient counts for the gene. DRIMSeq will not estimate a precision for such a gene. These all happen to be true positive genes for DTU in the simulation, where the isoform switch is total or nearly total. DEXSeq, shown in a later section, does not produce NA p-values for any genes. A potential fix would be to use a plug-in common or trended precision for such genes, but this is not implemented in the current version of DRIMSeq."</p>
                <p> We now perform post-hoc proportion SD filtering on the adjusted transcript p-values for DRIMSeq directly, which has little effect on the results. The SD of proportions and the p-values may possibly be independent under the null hypothesis of no DTU, which is the requirement for proper Type I error control of an independent filter [Bourgon (2010)], but we do not attempt to provide empirical evidence to support this. Importantly, we apply the post-hoc filtering because we have empirical evidence that DRIMSeq was not providing uniform p-values for null transcripts on the simulated data explored in this article. Therefore, we begin with a non-uniform distribution of p-values for the null transcripts. The filtering is shown empirically to improve the FDR control.</p>
                <p> We do not perform the simulation multiple times, and we have not extended iCOBRA to support multiple iterations on a single plot, which is beyond the scope of this article. We are most interested in the relative performance of the various methods, and their general location on the TPR-FDR plots, which is achieved with the current evaluation. We did explore running DEXSeq 25 times on the 3 vs 3 "main" simulation, and the inter-simulation variation in the TPR-FDR plot was minimal. We have uploaded all 24 of the simulated paired-end reads to Zenodo, and the dataset is already quite large. We do not run the methods on entirely null datasets, which is beyond the scope of this article.</p>
                <p> </p>
                <p> We have now used stageR on all methods. stageR accepts gene-level p-values (or adjusted p-values) and transcript-level p-values. If gene-level p-values are not provided by a method then DEXSeq's perGeneQValue was used to generate gene-level adjusted p-values, for use with stageR.</p>
                <p> </p>
                <p> We do not evaluate other methods for exon usage, as we focus in the workflow on Bioconductor methods that have been already proposed and evaluated for DTU analysis in publications.</p>
                <p> </p>
                <p> We now use consistent axes, and include the group size in the strip titles.</p>
                <p> </p>
                <p> We now evaluate DRIMSeq and DEXSeq on the identical simulation dataset used in both Soneson et al (2016) and Nowicka and Robinson (2016). We find similar performance of DEXSeq as reported in those papers using a less stringent transcript filter, but when we use DRIMSeq count and proportion filters as recommended in this workflow, the performance of DEXSeq is greatly improved, to levels consistent with what we see in the "main" simulation.</p>
                <p> </p>
                <p> 
                    <underline>Evaluation of DGE/DTE</underline>
                </p>
                <p> </p>
                <p> We clarify why a DGE and DTE evaluation is included.</p>
                <p> </p>
                <p> We do not perform a 2 replicate DGE or DTE evaluation, as this is beyond the scope of the article.</p>
                <p> </p>
                <p> We now break down the DGE and DTE results by simulated gene type. We do not see any strong enrichment of one simulated gene type in the false positive breakdown plots. We believe our evaluation may differ from others in exploring the consistency of results as sample size increases.</p>
                <p> </p>
                <p> 
                    <underline>Discussion</underline>
                </p>
                <p> </p>
                <p> We now include in the Discussion some recommendations on tool usage and performance.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
