Software Tool Article
Revised

RNAtor: an Android-based application for biologists to plan RNA sequencing experiments

[version 2; peer review: 1 approved, 1 approved with reservations]
PUBLISHED 16 Nov 2017

Abstract

RNA sequencing (RNA-seq) is a powerful technology that allows one to assess the RNA levels in a sample. Analysis of these levels can help in identifying novel transcripts (coding, non-coding and splice variants), understanding transcript structures, and estimating gene/allele expression. Biologists face specific challenges while designing RNA-seq experiments. The nature of these challenges lies in determining the total number of sequenced reads and technical replicates required for detecting marginally differentially expressed transcripts. Despite previous attempts to address these challenges, easily-accessible and biologist-friendly mobile applications do not exist. Thus, we developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data.

Keywords

RNA-seq, Android-based, simulations, mobile application, recommendations, experimental design

Revised Amendments from Version 1

Keeping in view the reviewers’ suggestions, we have made the following changes in the revised version of the manuscript.

  1. Portions of Abstract, Results and Discussion were re-written to correctly reflect the advantages and limitations of the tool, compared to the other existing web-based tools like EDDA and Scotty.
  2. A legend for Figure 3, which was missing earlier, was added, and the legends for the other figures were revised to correctly reflect the data.
  3. Provided a better description of the data presented in Figure 2.
  4. Described the method by which the DEGs were calculated.
  5. Defined replicates.
  6. Described the use of simulated data.
  7. Defined true and false positives.
  8. Defined transcript recovery.
  9. Supplementary Figure 4 was replaced with a higher-resolution version.

See the authors' detailed response to the review by Niranjan Nagarajan
See the authors' detailed response to the review by Daisuke Komura

Introduction

RNA-seq offers several advantages over low-throughput technologies such as quantitative PCR and annotation-dependent methods such as microarrays. Designing RNA-seq experiments accurately, however, poses a challenge to biologists. This is particularly true when prior knowledge of the genome or transcriptome of the organism of choice is not available. To estimate subtle differences between the expression levels of transcripts, it is important to determine the number of technical replicates and the number of sequencing reads, and to choose the right analytical tool.

Two web-based tools, Scotty (Busby et al., 2013) and EDDA (Luo et al., 2014), have set a precedent in aiding RNA-seq design. While Scotty relies solely on pilot or prototype data, EDDA relies on either pilot data or a simulate-and-test paradigm to account for variability across experimental conditions. Scotty has a built-in t-test based module, whereas EDDA has been linked to five other DE tools, following mode-normalization of the data. Both can detect DEGs up to a 2-fold difference.

In the current manuscript, we describe RNAtor, an Android app with a user-friendly graphical user interface (GUI) that helps biologists design RNA-seq experiments. A mobile application offers more flexibility, easier navigation, greater user-friendliness, and offline features compared to a web-based tool, even when the latter can also be accessed from a mobile device. RNAtor can be linked to any existing differential expression analysis tool, and can help design experiments to estimate expression differences with fold changes as low as 0.8–1.2X. RNAtor’s recommendations are based on an exhaustive combination of discovery with simulated reads for transcriptomes of varying sizes (3 to 100 Mb). These recommendations were subsequently validated with sequenced data from Saccharomyces cerevisiae, by comparing expression profiles of wild-type and mutant strains.

Methods

Implementation

We simulated varying numbers of Illumina-like reads with technical replicates, with fold changes ranging from 1.2–5X between the control and treatment samples, in both directions, on a 3 Mb human chr14 (hg19) transcriptome, using Polyester (Frazee et al., 2015). We detected differentially expressed genes (DEGs) in all the simulations using two approaches. The first was a genome-guided workflow based on Tophat v2.1.1-Cufflinks v2.2.1 (Trapnell et al., 2012), followed by differential expression analyses using five tools: Deseq v1.28.0 (Anders & Huber, 2010); Deseq2 v1.16.1 (Love et al., 2014); EdgeR v3.18.1 (Robinson et al., 2010); Cuffdiff-Cufflinks v2.2.1 (Trapnell et al., 2012); and Kallisto v0.43.1 (Bray et al., 2016). The second was a de novo assembly-based tool, Trinity v2.3.2 (Grabherr et al., 2011), followed by differential expression analyses using Kallisto v0.43.1 (Bray et al., 2016). Thus, Kallisto was used twice: first with the genome-guided paradigm, and second with de novo assembly using Trinity. In the first scenario, the Tophat-Cufflinks alignments (.bam) were converted to reads (.fastq) to be used with Kallisto, with the 3 Mb transcriptome as the reference. In the second scenario, Kallisto was run on the simulated reads with the de novo assembled transcriptome as the reference. All differential expression analysis tools were run with default cut-offs.

We studied the results from these simulations in terms of the number of DEGs detected reliably and the extent of recovery of those DEGs. Transcript recovery refers to the length of the transcript as assembled by Tophat, and found to be differentially expressed by EdgeR, CuffDiff or DESeq2, relative to its actual length as per the simulations. This parameter can be estimated only for these three tools, since they offer a handle to the actual transcript IDs. Based on these simulations, we arrived at recommendations on the number of reads, the number of replicates, and the tool(s) needed to identify DEGs reliably.

We validated these recommendations using simulated reads from larger transcriptomes (10Mb, 30Mb and 100Mb), created by combining transcriptomes from more than one hg19 chromosome, and using a real Saccharomyces cerevisiae dataset (ENA accession: ERP004763) comprising 48 biological replicates for two conditions: wild-type (WT) and a snf2 knock-out (KO) mutant (Schurch et al., 2016).
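
As a rough illustration of the transcript recovery measure defined above, the ratio of assembled length to simulated length can be computed as in the following Python sketch. The function name and the example lengths are hypothetical and are not taken from the RNAtor source code; the sketch only restates the definition given in the text.

```python
def transcript_recovery(assembled_length: int, simulated_length: int) -> float:
    """Return the assembled length as a fraction of the simulated (actual) length.

    Both lengths are in bases. This mirrors the textual definition only;
    it is not the authors' implementation.
    """
    if simulated_length <= 0:
        raise ValueError("simulated_length must be positive")
    if assembled_length < 0:
        raise ValueError("assembled_length must not be negative")
    return assembled_length / simulated_length


# Hypothetical example: a 1000-base simulated transcript of which
# 500 bases were assembled.
recovery = transcript_recovery(500, 1000)
print(f"Transcript recovery: {recovery:.0%}")
```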

Operation

The size of the transcriptome (or of the genome, if the transcriptome size is not known), either entered by the user or taken from a backend database, the number of replicates to use, and the fold change of DEGs are user-defined parameters in RNAtor (Figure 1). An RNAtor flowchart highlighting simulation conditions and analytical tools used is provided in Supplementary Figure S1.

Figure 1. Screenshots of the RNAtor mobile application.

Results

RNAtor was evaluated using questions that a biologist would typically ask before starting an experiment, followed by the recommendations provided by RNAtor.

Read requirements for optimal DEG detection

One, 1.5, 6, 10, 14 and 20 million reads are needed for detection of differential expression of DEGs at 5-fold, 4-fold, 3-fold, 2-fold, 1.5-fold and 1.2-fold change, respectively, for a 3Mb transcriptome with 3 technical replicates.

We simulated 0.2–20 million reads for human chromosome 14 (~3Mb) and observed that the number of detected DEGs simulated at a given fold change peaked at a certain coverage before plateauing (Figure 2). This observation remained valid for the real data (Figure 3) and the large simulated transcriptomes (10Mb, 30Mb and 100Mb) (Supplementary Figure S2). Increasing the number of sequencing reads increased the sensitivity of detection. The final recommendations from RNAtor correspond to the number of reads at which the number of DEGs peaks, and are therefore a good compromise between sensitivity and sequencing cost. Changing the number of technical replicates changes the recommendation. For example, with more than three replicates, RNAtor suggests producing fewer reads to obtain the same information (Table 1).
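
The idea of recommending the read depth at which the DEG count peaks can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the DEG counts are made-up numbers rather than results from the paper, and breaking ties by taking the smallest depth is our assumption, chosen because it keeps the sequencing cost low.

```python
def recommended_depth(deg_counts: dict[float, int]) -> float:
    """Return the smallest read depth (in millions) with the maximal DEG count.

    deg_counts maps read depth (millions of reads) to the number of DEGs
    detected at that depth for one fold change and replicate number.
    """
    if not deg_counts:
        raise ValueError("deg_counts must not be empty")
    peak = max(deg_counts.values())
    # Several depths may reach the peak once the curve plateaus;
    # the cheapest (smallest) one is returned.
    return min(depth for depth, count in deg_counts.items() if count == peak)


# Made-up counts for illustration only (not data from the paper).
example_counts = {0.2: 5, 1: 20, 6: 48, 10: 60, 14: 60, 20: 58}
print(recommended_depth(example_counts))
```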

Figure 2. Number of differentially expressed genes (DEGs) detected for simulated datasets (hg19 chr14) by Deseq, Deseq2, EdgeR, Cuffdiff, Kallisto-Sleuth and Trinity-Kallisto tools.

Figure 3. Number of differentially expressed genes (DEGs) detected using a real dataset (Saccharomyces cerevisiae) with the Kallisto-Sleuth pipeline.

Table 1. RNAtor output on the number of sequencing reads (in millions) to be produced for 2–5 technical replicates to detect differentially expressed genes at a given fold change.

Fold change   2 replicates   3 replicates   4 replicates   5 replicates
5-fold        6              2              1.5            1.5
4-fold        10             6              2              1.5
3-fold        10             6              6              6
2-fold        14             10             10             6
1.5-fold      30             20             20             14
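
For readers who want to script against these values, Table 1 can be encoded as a simple lookup. The Python sketch below transcribes the values as read from Table 1; the names `TABLE_1` and `reads_needed` are hypothetical, and the sketch deliberately returns only tabulated combinations, since how RNAtor handles other fold changes or transcriptome sizes is not described here.

```python
# Reads (in millions) as read from Table 1: fold change -> replicates -> reads.
TABLE_1 = {
    5.0: {2: 6, 3: 2, 4: 1.5, 5: 1.5},
    4.0: {2: 10, 3: 6, 4: 2, 5: 1.5},
    3.0: {2: 10, 3: 6, 4: 6, 5: 6},
    2.0: {2: 14, 3: 10, 4: 10, 5: 6},
    1.5: {2: 30, 3: 20, 4: 20, 5: 14},
}


def reads_needed(fold_change: float, replicates: int) -> float:
    """Look up the tabulated number of reads (millions) for one combination."""
    try:
        return TABLE_1[float(fold_change)][replicates]
    except KeyError:
        raise ValueError(
            f"No Table 1 entry for fold change {fold_change} "
            f"with {replicates} replicates"
        ) from None


print(reads_needed(2, 3))  # 2-fold change, 3 technical replicates
```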

Detection sensitivity of DE tools

Kallisto detected the optimal number of DEGs with the highest sensitivity. Judged purely on the number of DEGs detected between WT and KO, Kallisto outperformed the other tools tested (Figure 2 and Supplementary Figure S3).

Detection specificity and transcript recovery by DE tools

Cuffdiff can be used for high specificity, and DeSeq2 and EdgeR for high transcript recovery. Although Kallisto-Sleuth was fast and produced results with high sensitivity, we observed that this was at the expense of specificity of detection (Supplementary Figure S3). Cuffdiff produced results with high specificity, albeit with a loss of sensitivity (Supplementary Figure S3). Among the 3 tools tested (CuffDiff, DeSeq2 and EdgeR; Supplementary Figure S4), transcript recovery was best with EdgeR for shorter (<742 bases) and medium-sized (742–1456 bases) transcripts, and best with CuffDiff for longer transcripts (>1456 bases).
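
The trade-off between sensitivity and specificity discussed above can be made concrete with a short Python sketch. It uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), which we assume match the usage here; the function name and the gene identifiers are hypothetical and the example sets are made up for illustration.

```python
def sensitivity_specificity(all_genes, true_degs, called_degs):
    """Return (sensitivity, specificity) for one set of DEG calls.

    all_genes: every gene in the simulated transcriptome.
    true_degs: genes simulated as differentially expressed.
    called_degs: genes reported as differentially expressed by a tool.
    """
    universe = set(all_genes)
    truth = set(true_degs) & universe
    called = set(called_degs) & universe

    tp = len(truth & called)             # simulated DEGs that were called
    fn = len(truth - called)             # simulated DEGs that were missed
    fp = len(called - truth)             # calls that are not simulated DEGs
    tn = len(universe - truth - called)  # non-DEGs correctly left uncalled

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example: 10 genes, 4 simulated DEGs, 4 calls of which 3 are correct.
genes = [f"g{i}" for i in range(1, 11)]
sens, spec = sensitivity_specificity(
    genes, {"g1", "g2", "g3", "g4"}, {"g1", "g2", "g3", "g5"}
)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```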

Performance of assembly-based pipeline over that of genome-guided tools

The assembly-based pipeline yields more DEGs with higher sensitivity and specificity. Using Trinity (Grabherr et al., 2011) as an assembly pipeline along with Kallisto enhanced the number of DEGs detected when compared with the genome-guided Kallisto-Sleuth pipeline (Figure 2). While the sensitivity of Trinity-Kallisto was marginally better, its specificity was visibly better when compared to the Kallisto-Sleuth pipeline (Supplementary Figure S3).

Discussion

Although some of the challenges with RNA-seq experiments have been addressed previously (Busby et al., 2013; Luo et al., 2014), there is currently no easy-to-use, biologist-friendly mobile phone-based app. Scotty, a previously reported interactive web-based tool, aids RNA-seq experimental design. However, it depends on pilot or prototype data that closely match the actual experimental conditions (Busby et al., 2013). EDDA, another interactive web-based tool for RNA-seq experimental design, offers more flexibility, since it can use either pilot data or a simulate-and-test paradigm tailored to the desired experimental conditions (Luo et al., 2014). Both can detect genes or transcripts of only up to 2X fold change in the test condition relative to the control. RNAtor addresses some of these gaps as a user-friendly mobile app. However, it has certain limitations. For example, it does not take into account the dynamic nature of any transcriptome (where the exact size of the transcriptome is not known and cannot simply be derived from the genome size), the throughput of different sequencing instruments, the presence of splice variants, or the relative abundance of a transcript, e.g. in relation to a control gene or any other gene of interest. We also recognize that RNAtor v1.0 is based on simple assumptions that can affect the recommendations. Nevertheless, the recommendations, which were derived from simulated RNA-seq data that do not yet incorporate various biological biases, were validated with real data from Saccharomyces cerevisiae; this provides strong evidence that our assumptions do not significantly impact RNAtor's guidance to users. That said, there is a prevailing need for a simple tool for biologists who have simple questions. RNA-seq is not always used to answer complex questions; it is also often used as a superior substitute for qPCR.

We intend to expand the scope of the tool in future releases by introducing biases that mimic various experimental conditions into the simulation phase.

Data availability

The Android version of RNAtor is available on Google Play Store.

Latest source code: https://github.com/binaypanda/RNAtor.

Archived source code as at the time of publication: https://doi.org/10.5281/zenodo.814905 (Panda, 2017).

License: RNAtor v1.0 is distributed under GNU GPLv3 licence.

how to cite this article
Kane S, Garg H, Krishnan NM et al. RNAtor: an Android-based application for biologists to plan RNA sequencing experiments [version 2; peer review: 1 approved, 1 approved with reservations]. F1000Research 2017, 6:997 (https://doi.org/10.12688/f1000research.11982.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Version 2
PUBLISHED 16 Nov 2017
Reviewer Report 27 Dec 2017
Daisuke Komura, Department of Genomic Pathology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan 
Approved
The quality of the revised manuscript has greatly improved. The authors addressed most of the remarks mentioned during the first review. I have one additional minor comment.

1) Supplementary Figure 2 : #reads=0 data points in the ...
HOW TO CITE THIS REPORT
Komura D. Reviewer Report For: RNAtor: an Android-based application for biologists to plan RNA sequencing experiments [version 2; peer review: 1 approved, 1 approved with reservations]. F1000Research 2017, 6:997 (https://doi.org/10.5256/f1000research.14320.r28051)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Reviewer Report 29 Nov 2017
Niranjan Nagarajan, Genome Institute of Singapore, Singapore, Singapore 
Approved with Reservations
I thank the authors for carefully considering my comments and revising accordingly. I have a few major comments that still remain to be addressed:

1) I believe the manuscript needs at least a few convincing examples to ...
HOW TO CITE THIS REPORT
Nagarajan N. Reviewer Report For: RNAtor: an Android-based application for biologists to plan RNA sequencing experiments [version 2; peer review: 1 approved, 1 approved with reservations]. F1000Research 2017, 6:997 (https://doi.org/10.5256/f1000research.14320.r28052)
Version 1
PUBLISHED 26 Jun 2017
Reviewer Report 30 Aug 2017
Niranjan Nagarajan, Genome Institute of Singapore, Singapore, Singapore 
Not Approved
RNAtor looks like a simple and easy to use application. However this could also be its drawback in that it may be too simplistic. The authors should show some evidence that the parts they omit do not significantly impact RNAtor's ...
HOW TO CITE THIS REPORT
Nagarajan N. Reviewer Report For: RNAtor: an Android-based application for biologists to plan RNA sequencing experiments [version 2; peer review: 1 approved, 1 approved with reservations]. F1000Research 2017, 6:997 (https://doi.org/10.5256/f1000research.12955.r23782)
  • Author Response 16 Nov 2017
    Binay Panda, Ganit Labs, Bio-IT Centre, Institute of Bioinformatics and Applied Biotechnology, Bangalore, India
    16 Nov 2017
    Author Response
    We thank Dr. Niranjan Nagarajan for reviewing the manuscript and providing his comments. Following his suggestions, we have revised the manuscript and have responded to all his queries below. We ...
Reviewer Report 06 Jul 2017
Daisuke Komura, Department of Genomic Pathology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan 
Approved with Reservations
The authors developed a new mobile application named RNAtor to assist in the designing of RNA-seq experiments. It provides users with the number of reads that is required for optimal detection of differentially expressed genes at a given fold-change threshold ...
HOW TO CITE THIS REPORT
Komura D. Reviewer Report For: RNAtor: an Android-based application for biologists to plan RNA sequencing experiments [version 2; peer review: 1 approved, 1 approved with reservations]. F1000Research 2017, 6:997 (https://doi.org/10.5256/f1000research.12955.r24054)
  • Author Response 16 Nov 2017
    Binay Panda, Ganit Labs, Bio-IT Centre, Institute of Bioinformatics and Applied Biotechnology, Bangalore, India
    16 Nov 2017
    Author Response
    We thank Dr. Daisuke Komura for his time and timely review. We have addressed all his queries below and have incorporated his suggestions in the revised version of the manuscript, ...
