CircSeqAlignTk: An R package for end-to-end analysis of RNA-seq data for circular genomes

Jianqiang Sun; Xi Fu; Wei Cao

doi:10.12688/f1000research.127348.2


Software Tool Article


[version 2; peer review: 1 approved, 2 approved with reservations]

Jianqiang Sun¹, Xi Fu², Wei Cao¹

PUBLISHED 30 Apr 2024


¹ Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan
² Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, 113-8657, Japan

Jianqiang Sun
Roles: Conceptualization, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Software, Writing – Original Draft Preparation, Writing – Review & Editing

Xi Fu
Roles: Writing – Review & Editing

Wei Cao
Roles: Writing – Review & Editing


This article is included in the RPackage gateway.

This article is included in the Bioinformatics gateway.

This article is included in the Japan Institutional Gateway gateway.

Abstract

RNA sequencing (RNA-seq) technology has become one of the standard tools for studying biological mechanisms at the transcriptome level. Advances in RNA-seq technology have led to the development of numerous publicly available tools for RNA-seq data analysis. Most of these tools target linear genome sequences despite the necessity of studying organisms with circular genome sequences. For example, studying the infection mechanisms of viroids, which are circular RNAs of 246–401 nucleotides that infect plants, may help prevent tremendous economic and agricultural damage. Unfortunately, using the available tools to construct workflows for the analysis of circular genome sequences is difficult, especially for non-bioinformaticians. To overcome this limitation, we present CircSeqAlignTk, an easy-to-use and richly documented R package. CircSeqAlignTk offers both command line and graphical user interfaces for end-to-end RNA-seq data analysis of circular genome sequences, spanning alignment to visualisation, via a series of functions. Moreover, it includes a feature to generate synthetic sequencing data that mirrors real RNA-seq data from biological experiments. CircSeqAlignTk not only provides an easy-to-use analysis interface for novice users but also allows developers to evaluate the performance of alignment tools and new workflows.

Keywords

R package, alignment, visualisation, small RNA-seq, circular genome sequence, viroid.

Corresponding author: Jianqiang Sun

Competing interests: No competing interests were disclosed.

Grant information: This work was supported by JSPS KAKENHI [21K05608 and 22H05179].
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2024 Sun J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Sun J, Fu X and Cao W. CircSeqAlignTk: An R package for end-to-end analysis of RNA-seq data for circular genomes [version 2; peer review: 1 approved, 2 approved with reservations]. F1000Research 2024, 11:1221 (https://doi.org/10.12688/f1000research.127348.2) First published: 27 Oct 2022, 11:1221 (https://doi.org/10.12688/f1000research.127348.1) Latest published: 30 Apr 2024, 11:1221 (https://doi.org/10.12688/f1000research.127348.2)

Amendments from Version 1

1. The explanation about GUI usage was added into the manuscript.
2. Improvement of the language used in the manuscript.
3. Figure 1 was updated.
4. Figure 3 was added into the manuscript.
5. We added a reference Chang et al., 2024 (shiny) into the reference section.

See the authors' detailed response to the review by Xueyi Dong
See the authors' detailed response to the review by Alexander Zelikovsky and Bikram Sahoo
See the authors' detailed response to the review by Eric Soler and Mohammad Salma

Introduction

RNA sequencing (RNA-seq) technology provides insights into various biological mechanisms, including gene stress responses and plant viral infection mechanisms (Vihervaara et al., 2018; Zanardo et al., 2019). The two essential processes for analysing RNA-seq data are aligning sequence reads to the genome sequence and summarising the alignment coverage. The widespread use of RNA-seq has encouraged the development of numerous tools for data analysis. For example, Bowtie2 (Langmead & Salzberg, 2012) and HISAT2 (Kim et al., 2019) are well-known tools for read alignment, whereas SAMtools (Li et al., 2009) and BEDtools (Quinlan & Hall, 2010) are used for coverage calculations.

Applying RNA-seq technology to various organisms, including those with circular genome sequences like bacteria, viruses, and viroids, offers insights into addressing crucial biological and social challenges. For instance, delving into the infection mechanisms of viroids, known as one of the simplest infectious agents with single-stranded circular non-coding RNAs comprising 246–401 nucleotides (Hull, 2014), has the potential to avert significant economic and agricultural losses (Soliman et al., 2012; Sastry, 2013). Nonetheless, the majority of current tools cater exclusively to RNA-seq data from organisms with linear genome sequences, such as animals and plants. Early efforts in developing tools for circular genomes often involved intricate workflows that integrated numerous tools written in diverse programming languages, making them less accessible, especially for non-bioinformaticians. While recent advancements have introduced tools for aligning reads to circular genomes (Ayad & Pissis, 2017; Adkar-Purushothama et al., 2021), sophisticated programming skills are still needed owing to limited documentation and illustrative examples.

Here, we introduce CircSeqAlignTk, an accessible R package designed as a circular sequence alignment toolkit. CircSeqAlignTk offers both command line interface (CLI) and graphical user interface (GUI) options for end-to-end analysis of RNA-seq data targeting circular genomes, with a primary emphasis on viroids. Furthermore, CircSeqAlignTk integrates with other R packages, so the whole analysis can be carried out within a single programming language environment.

Methods

Operation

CircSeqAlignTk is an R package registered in the Bioconductor repository, with its source code available on GitHub and archived on Zenodo (Sun, Fu & Cao, 2022). The package requires R (≥ 4.2) and runs on most popular operating systems (OSs) including Linux, macOS, and Windows.

Implementation

Workflow analysis using CircSeqAlignTk (Figure 1) begins with the preparation of two types of data. The first type is RNA-seq data in FASTQ format which can be obtained from biological experiments; for example, researchers may sequence small RNAs from plants that may be infected by pathogens using high-throughput sequencing platforms. Alternatively, data can be downloaded from public databases such as the Sequence Read Archive (Leinonen et al., 2011), which are typically published by other researchers worldwide and can be used for re-analysis and meta-analysis. The second type is organism genome sequence data (e.g., the circular RNA sequence of a viroid) in the FASTA format, which can be obtained from public databases such as GenBank (Benson et al., 2013).

Figure 1. Overview of workflow analyses and functions implemented in the CircSeqAlignTk package.

After the preparation step, the build_index function in CircSeqAlignTk constructs two types of reference sequences from the input genome sequence for alignment: (i) type 1, the input genome sequence itself, and (ii) type 2, generated by restoring the type 1 reference sequence to a circle and reopening the circle at the position opposite to the original cut. Once the two reference sequences are constructed, the align_reads function aligns reads in two stages: (i) aligning reads to the type 1 reference and (ii) collecting the unaligned reads and aligning them to the type 2 reference, so that reads spanning the junction of the type 1 reference can still be aligned. The align_reads function allows users to select either Bowtie2 (Langmead & Salzberg, 2012) or HISAT2 (Kim et al., 2019). Alignment is executed by directly calling Bowtie2 or HISAT2 when they are installed on the OS. If these tools are unavailable, align_reads automatically calls the Bioconductor packages Rbowtie2 (Wei et al., 2018) or Rhisat2 (Soneson, 2022) for alignment. Rbowtie2 and Rhisat2 are installed automatically as dependencies of CircSeqAlignTk. The alignment coverage can be calculated separately for reads aligned to the forward and reverse strands with the calc_coverage function. The calc_coverage function internally calls the coverage function implemented in the IRanges package to calculate the number of reads covering each position of the reference sequence.

Lastly, the plot function visualises the alignment coverage by the length and strand of the aligned reads.
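The two-reference scheme described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper rotate_half, the toy 20-nucleotide sequence, and the exact-match search are assumptions made for illustration only. The sketch shows that a read spanning the junction of the type 1 reference is found only in the type 2 reference, and how its position could be converted back to type 1 coordinates.

```r
# Hypothetical helper (not part of CircSeqAlignTk): reopen a circular
# sequence at the position opposite to the original cut.
rotate_half <- function(seq) {
  n <- nchar(seq)
  cut <- n %/% 2
  paste0(substr(seq, cut + 1, n), substr(seq, 1, cut))
}

# Toy circular genome of 20 nt (made-up sequence)
type1 <- "ACGTTGCAAGGCTTACCGAT"
type2 <- rotate_half(type1)

# A read spanning the junction of type 1:
# the last 4 nt of type1 followed by its first 4 nt
junction_read <- paste0(substr(type1, 17, 20), substr(type1, 1, 4))

# Exact substring search stands in for a real aligner here
hit1 <- grepl(junction_read, type1, fixed = TRUE)  # not found in type 1
hit2 <- grepl(junction_read, type2, fixed = TRUE)  # found in type 2

# Convert the 1-based start position on type 2 back to type 1 coordinates
n <- nchar(type1)
cut <- n %/% 2
pos2 <- as.integer(regexpr(junction_read, type2, fixed = TRUE))
pos1 <- (pos2 - 1 + cut) %% n + 1
```

In this toy case the read starts at position 17 of the type 1 reference and wraps around to position 4, which is why a single linear reference cannot capture it.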

The GUI of CircSeqAlignTk is an application based on the shiny package (Chang et al., 2024). It allows users to proceed with the whole analysis without writing any code. In practice, users can select FASTA and FASTQ files, perform alignment, and visualise the results intuitively by mouse operation. Additionally, quality control of FASTQ files (e.g., trimming adapter sequences and low-quality bases) is implemented to support the integrity of end-to-end data analysis.

In addition to conducting end-to-end RNA-seq data analysis, CircSeqAlignTk incorporates a function, generate_reads, designed to generate synthetic sequence reads that emulate RNA-seq data obtained from circular genome sequences. This function allows developers to validate the performance of new alignment algorithms and analysis workflows. To generate synthetic reads, users can specify specific circular genome sequences for read sampling and include adapter sequences and mismatches by adjusting arguments.

Note that although CircSeqAlignTk provides a user-friendly analysis tool and offers a way to adjust the important parameters that may affect the analysis results, some minor parameter adjustments are not possible. For example, when using the GUI for FASTQ quality control, the user can only specify the (1) adapter sequence, (2) read length range, (3) minimum Phred score, and (4) minimum number of Ns in a read. Therefore, users who need more fine-grained quality control of FASTQ files should perform it in advance using other software.

Use cases

The aim of the use cases is to give a brief overview of the fundamental usage of CircSeqAlignTk functions. We introduce two use-case examples: (i) the analysis of small RNA-seq data sequenced from a viroid infection experiment and (ii) the analysis of synthetic small RNA-seq data created by CircSeqAlignTk. The detailed usage of CircSeqAlignTk is documented in the package vignette, accessible via the browseVignettes function.

browseVignettes('CircSeqAlignTk')

Analysis of small RNA-seq data sequenced from a viroid infection experiment

For a practical CircSeqAlignTk use case, we analysed a subset of small RNA-seq data sequenced from tomato plants experimentally infected with the potato spindle tuber viroid (PSTVd) isolate Cen-1. Herein, we demonstrate the alignment of RNA-seq reads onto the genome sequence of PSTVd isolate Cen-1 and visualisation of alignment coverage with CircSeqAlignTk. The sample RNA-seq data and genome sequence of PSTVd isolate Cen-1 are included in CircSeqAlignTk and can be accessed with the system.file function.

library(CircSeqAlignTk)
fq <- system.file(package = 'CircSeqAlignTk', 'extdata', 'srna.fq.gz')
genome_seq <- system.file(package = 'CircSeqAlignTk', 'extdata', 'FR851463.fa')

Given that the majority of reads in these RNA-seq data include adapters bearing the sequence “AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC”, we employed AdapterRemoval (Schubert et al., 2016), which is available through the R package Rbowtie2 (Wei et al., 2018), to trim the adapters prior to analysis with CircSeqAlignTk.

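library(R.utils)
library(Rbowtie2)
# decompress the FASTQ file; remove = FALSE keeps the packaged .gz file
gunzip(fq, destname = 'srna.fq', remove = FALSE)
# AdapterRemoval options: allow at most one N, trim low-quality bases,
# and keep only reads of 21-24 nt after trimming
params <- '--maxns 1 --trimqualities --minquality 30 --minlength 21 --maxlength 24'
remove_adapters(file1 = 'srna.fq',
                adapter1 = 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC',
                adapter2 = NULL,
                output1 = 'srna_trimmed.fq',
                params,
                overwrite = TRUE)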
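To make the effect of the length and N filters above concrete, the following base R sketch applies comparable rules to a few made-up reads. It is only an illustration under simplifying assumptions: the helpers trim_adapter and count_n are hypothetical, the adapter is located by exact matching, and quality trimming is omitted, so it does not reproduce how AdapterRemoval works internally.

```r
adapter <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

# Toy reads (made-up): an insert followed by the adapter
reads <- c(
  paste0("GGAACCGCAGTTGGTTCCTCGGAA", adapter),  # 24-nt insert, kept
  paste0("TTGCGGCCCGGAGGAGCGCTT", adapter),     # 21-nt insert, kept
  paste0("ACGTACGT", adapter),                  # 8-nt insert, too short
  paste0("GGAACNGCAGTTNGTTCCTCGGAA", adapter)   # two Ns, dropped
)

# Hypothetical helper: cut the read at the first exact adapter match
trim_adapter <- function(read, adapter) {
  pos <- regexpr(adapter, read, fixed = TRUE)
  if (pos > 0) substr(read, 1, pos - 1) else read
}

# Hypothetical helper: count ambiguous bases (N) in each sequence
count_n <- function(x) nchar(x) - nchar(gsub("N", "", x, fixed = TRUE))

trimmed <- vapply(reads, trim_adapter, character(1),
                  adapter = adapter, USE.NAMES = FALSE)

# Keep inserts of 21-24 nt that contain at most one N
keep <- nchar(trimmed) >= 21 & nchar(trimmed) <= 24 & count_n(trimmed) <= 1
filtered <- trimmed[keep]
```

Only the first two toy reads survive, mirroring the intent of the --minlength, --maxlength, and --maxns options used in the call above.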

Following adapter removal, we constructed indices of the PSTVd isolate Cen-1 genome sequences using the build_index function and executed alignment with the align_reads function. Subsequently, we summarised the alignment coverage using the calc_coverage function and visualised the result using the plot function (Figure 2A).

ref_index <- build_index(input = genome_seq,
                output = 'index')
aln <- align_reads(input = 'srna_trimmed.fq',
            index = ref_index,
            output = 'align_results')
alncov <- calc_coverage(aln)
plot(alncov)

Figure 2. Visualisation of alignment coverage.

A. Alignment coverage of RNA-seq data from viroid-infected tomato plants. The x-axis represents the position of the reference sequence. The upper and lower y-axes represent the alignment coverage of reads with forward and reverse strands, respectively. Colours indicate the length of reads aligned on the reference sequence. B. Alignment coverage of synthetic RNA-seq data generated by the CircSeqAlignTk functions.
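As a rough illustration of what the coverage values in Figure 2 represent, the sketch below counts, for each position of a reference, how many aligned reads overlap it, separately for each strand. The package itself relies on the coverage function of the IRanges package; the base R loop, the helper position_coverage, and the toy alignments here are assumptions for illustration only.

```r
# Toy alignments (made-up): 1-based inclusive coordinates on a 10-nt reference
genome_len <- 10
aln <- data.frame(
  start  = c(1, 3, 2, 6),
  end    = c(4, 6, 5, 9),
  strand = c("+", "+", "-", "-")
)

# Hypothetical helper: number of reads covering each reference position
position_coverage <- function(start, end, genome_len) {
  cov <- integer(genome_len)
  for (i in seq_along(start)) {
    idx <- start[i]:end[i]
    cov[idx] <- cov[idx] + 1L
  }
  cov
}

# Coverage is summarised separately for forward and reverse strands
fwd_cov <- with(aln[aln$strand == "+", ],
                position_coverage(start, end, genome_len))
rev_cov <- with(aln[aln$strand == "-", ],
                position_coverage(start, end, genome_len))
```

Positions 3 and 4 are covered by both forward reads and therefore have a forward coverage of 2, whereas positions 7 to 10 have a forward coverage of 0.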

Analysis of synthetic small RNA-seq data

A distinctive feature of CircSeqAlignTk is its capability to generate synthetic small RNA-seq data that emulate real RNA-seq data obtained from biological experiments. Herein, we utilised the generate_reads function to generate 10,000 small RNA-seq reads, each comprising 150 nucleotides and the adapter sequence “AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC,” simulating genuine RNA-seq reads from plants infected by the PSTVd isolate Cen-1. Furthermore, we introduced two mismatches in each read with respective probabilities of 0.1 and 0.01.

set.seed(1)
genome_seq <- system.file(package = 'CircSeqAlignTk', 'extdata', 'FR851463.fa')
# generate 10,000 reads of 150 nt, matching the description in the text
sim <- generate_reads(n = 10000,
                      seq = genome_seq,
                      adapter = 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC',
                      output = 'synthetic_reads.fq.gz',
                      read_length = 150,
                      mismatch_prob = c(0.1, 0.1 * 0.1))

The above function generates synthetic reads by repeating the following operations: randomly cutting substrings from the whole genome sequence of the PSTVd isolate Cen-1, adding the adapter, and introducing two mismatches based on the given probability. Both the location of random cutting and the length of the reads can be stored into a variable, enabling users to review this information and visualise the ground truth of alignment coverage of these synthetic reads (Figure 2B).

head(slot(sim, 'read_info'))
##   mean std strand    prob  start end            sRNA   length
## 1  341   4    +  0.1079135   704 727 GGAACCGCAGTTGGTTCCTCGGAA     24
## 2   74   4    +  0.1946800   431 454 CTCGGAGGAGCGCTTCAGGGATCC     24
## 3  227   4    +  0.1104790   588 611 CCCCTCGCCCCCTTTGCGCTGTCG     24
## 4   65   4    +  0.1496360   425 445 TTGCGGCCCGGAGGAGCGCTT        21
## 5  341   4    +  0.1079135   702 724 TTGGAACCGCAGTTGGTTCCGCG      23
## 6  239   3    +  0.1342126   599 622 CTTTGCGCTGTCGCTTCGGCTACT     24
alncov <- slot(sim, 'coverage')
plot(alncov)
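The sampling operations described above (cutting a substring from the circular genome, introducing a mismatch, and appending the adapter) can be sketched in base R as follows. This is a simplified illustration rather than the generate_reads implementation: the helpers sample_circular_read and add_mismatch are hypothetical, start positions are drawn uniformly, at most one mismatch is introduced per read, and the 30-nucleotide genome is made up.

```r
set.seed(1)

# Hypothetical helper: sample one substring from a circular sequence
sample_circular_read <- function(genome, read_len) {
  n <- nchar(genome)
  start <- sample.int(n, 1)
  # Doubling the sequence lets a substring run past the end and wrap around
  doubled <- paste0(genome, genome)
  substr(doubled, start, start + read_len - 1)
}

# Hypothetical helper: substitute one base at a random position
add_mismatch <- function(read) {
  bases <- c("A", "C", "G", "T")
  chars <- strsplit(read, "")[[1]]
  i <- sample.int(length(chars), 1)
  chars[i] <- sample(setdiff(bases, chars[i]), 1)
  paste(chars, collapse = "")
}

genome  <- "ACGTTGCAAGGCTTACCGATGGATCCAATG"  # made-up 30-nt circular sequence
adapter <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

# Sample 100 inserts of 21-24 nt from random positions on the circle
inserts <- vapply(1:100, function(i) {
  sample_circular_read(genome, sample(21:24, 1))
}, character(1))

# Introduce a single mismatch with probability 0.1 (simplified model)
mutated <- vapply(inserts, function(x) {
  if (runif(1) < 0.1) add_mismatch(x) else x
}, character(1), USE.NAMES = FALSE)

# Append the adapter to obtain the synthetic reads
reads <- paste0(mutated, adapter)
```

Because the inserts and their origins are known before mismatches and adapters are added, the true coverage can be computed directly, which is what makes such synthetic reads useful for benchmarking.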

The generated reads are saved in FASTQ format. Users can utilise these reads to evaluate the performance of the workflow analysis by calculating the root mean squared error between the ground truth and workflow outputs.

gunzip('synthetic_reads.fq.gz', destname='synthetic_reads.fq')
params <- '--maxns 1 --trimqualities --minquality 30 --minlength 21 --maxlength 24'
remove_adapters(file1 = 'synthetic_reads.fq',
                adapter1 = 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC',
                adapter2 = NULL,
                output1 = 'synthetic_reads_trimmed.fq',
                params,
                overwrite = TRUE)
ref_index <- build_index(input = genome_seq,
                output = 'index')
aln <- align_reads(input = 'synthetic_reads_trimmed.fq',
            index = ref_index,
            output = 'align_results')
alncov <- calc_coverage(aln)
plot(alncov)

# coverage of reads in forward strand
fwd_pred <- slot(alncov, 'forward')
fwd_true <- slot(slot(sim, 'coverage'), 'forward')
sqrt(sum((fwd_pred - fwd_true) ^ 2) / length(fwd_true))
## [1] 0.2201737

# coverage of reads in reverse strand
rev_pred <- slot(alncov, 'reverse')
rev_true <- slot(slot(sim, 'coverage'), 'reverse')
sqrt(sum((rev_pred - rev_true) ^ 2) / length(rev_true))
## [1] 0.1262061
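The root mean squared error used above can be wrapped in a small helper so that the same calculation is applied to both strands. The function name rmse and the toy vectors are illustrative assumptions; the formula is equivalent to sqrt(sum((pred - true) ^ 2) / length(true)) as written in the listing.

```r
# Hypothetical helper: root mean squared error between two coverage objects
# (numeric vectors or matrices of identical size)
rmse <- function(pred, truth) {
  stopifnot(length(pred) == length(truth))
  sqrt(mean((pred - truth)^2))
}

# Toy coverage values (made-up)
pred  <- c(0, 2, 4, 4, 1)
truth <- c(0, 2, 3, 5, 1)
err <- rmse(pred, truth)
```

A value of 0 would indicate that the workflow reproduces the ground-truth coverage exactly; larger values indicate larger deviations.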

GUI usage

To use the GUI of CircSeqAlignTk, start R, create an application with the build_app function, and run the application with the runApp function. For example, executing the following code will start the web browser and launch the application as shown in Figure 3. Users can specify the FASTA and FASTQ files according to the on-screen instructions and click on the run button for quality control of FASTQ file, alignment, and visualisation. The alignment results are saved in the folder where the application was launched and are also displayed at the bottom of the application screen.

library(shiny)
library(CircSeqAlignTk)
app <- build_app()
shiny::runApp(app)

Figure 3. GUI of CircSeqAlignTk.

The GUI allows selection of input files (FASTA and FASTQ). After selecting the input file, quality control and alignment can be executed by clicking on the execute button.

Conclusions

The R package CircSeqAlignTk demonstrates significant potential for conducting end-to-end analysis of RNA-seq data from circular genomes, including bacteria, viruses, and viroids. In addition, its applicability can be expanded to encompass other organisms and organelles with circular genomes. Owing to its simple installation, straightforward usage in both command line interface and graphical user interface modes, and detailed documentation, the package will substantially reduce the barriers associated with analysing RNA-seq data of this nature.

Software availability

Software available from: https://doi.org/10.18129/B9.bioc.CircSeqAlignTk

Source code available from: https://github.com/jsun/CircSeqAlignTk

Archived source code at the time of publication: https://doi.org/10.5281/zenodo.7218032 (Sun, Fu & Cao, 2022).

License: MIT

Data availability

Zenodo: CircSeqAlignTk. https://doi.org/10.5281/zenodo.7218032 (Sun, Fu & Cao, 2022).

- The datasets analysed in this study are stored in the inst/extdata directory of the CircSeqAlignTk package.

Acknowledgements

We thank Yosuke Matsushita for insightful discussions and inputs.

References

Adkar-Purushothama CR, Sridharan Iyer P, Sano T, et al.: sRNA Profiler: a user-focused interface for small RNA mapping and profiling. Cells. 2021; 10(7): 1771.
Ayad LAK, Pissis SP: MARS: improving multiple circular sequence alignment using refined sequences. BMC Genomics. 2017; 18: 86.
Benson DA, Cavanaugh M, Clark K, et al.: GenBank. Nucleic Acids Res. 2013; 41: D36–D42.
Chang W, Cheng J, Allaire JJ, et al.: shiny: Web Application Framework for R. R package version 1.8.0. 2024.
Hull R: Plant Virology (fifth edition). Cambridge, Massachusetts, US: Academic Press; 2014.
Kim D, Paggi JM, Park C, et al.: Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 2019; 37: 907–915.
Langmead B, Salzberg S: Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012; 9: 357–359.
Leinonen R, Sugawara H, Shumway M, on behalf of the International Nucleotide Sequence Database Collaboration: The sequence read archive. Nucleic Acids Res. 2011; 39: D19–D21.
Li H, Handsaker B, Wysoker A, et al.: The sequence alignment/map format and SAMtools. Bioinformatics. 2009; 25(16): 2078–2079.
Quinlan AR, Hall IM: BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010; 26(6): 841–842.
Sastry KS: Impact of virus and viroid diseases on crop yields. Plant virus and viroid diseases in the tropics. Dordrecht: Springer; 2013.
Schubert M, Lindgreen S, Orlando L: AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes. 2016; 9: 88.
Soliman T, Mourits MCM, Oude Lansink AGJM, et al.: Quantitative economic impact assessment of an invasive plant disease under uncertainty – A case study for potato spindle tuber viroid (PSTVd) invasion into the European Union. Crop Prot. 2012; 40: 28–35.
Soneson C: Rhisat2: R wrapper for HISAT2 aligner. R package version 1.12.0. GitHub. [October 6, 2022 accessed].
Sun J, Fu X, Cao W: CircSeqAlignTk. Zenodo. [Dataset]. 2022.
Vihervaara A, Duarte FM, Lis JT: Molecular mechanisms driving transcriptional stress responses. Nat. Rev. Genet. 2018; 19: 385–397.
Wei Z, Zhang W, Fang H, et al.: esATAC: an easy-to-use systematic pipeline for ATAC-seq data analysis. Bioinformatics. 2018; 34(15): 2664–2665.
Zanardo LG, de Souza GB, Alves MS: Transcriptomics of plant–virus interactions: a review. Theor. Exp. Plant Physiol. 2019; 31: 103–125.



Open Peer Review


Reviewer Report 28 Apr 2024

Alexander Zelikovsky, Department of Computer Science, Georgia State University, Atlanta, Georgia, USA

Bikram Sahoo, Department of Computer Science, Georgia State University, Atlanta, Georgia, USA

Approved

https://doi.org/10.5256/f1000research.139847.r267951

The article provides a comprehensive overview of the urgent need for packages tailored for non-bioinformaticians, addressing the challenges in analyzing circular genome data. However, there are some minor comments:

The authors primarily focus on the use of Bowtie2 and HISAT2 for alignment, neglecting to mention pseudoalignment algorithms like kallisto. Including a brief description of such tools in the introduction would enhance the readers' understanding. Additionally, a comparative analysis of different alignment algorithms, along with the option to choose between them, would augment the robustness of the tool.
While the package utilizes SAMtools and BEDtools for coverage computation, incorporating downstream analysis tools compatible with circular genomes would enrich the utility of the tool for users.
Each figure in the article requires thorough explanation to aid interpretation, as they may be challenging to decipher at first glance. For instance, the coverage plot's significance is not easy to comprehend.
To attract non-computational users, it would be advantageous for the package to feature an easy-to-use graphical user interface (GUI), aligning with the intended accessibility for this audience.

Overall, these suggestions aim to enhance the comprehensiveness and usability of the CircSeqAlignTk package for a wider range of users.

Is the rationale for developing the new software tool clearly explained?
Yes

Is the description of the software tool technically sound?
Yes

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?
Yes

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?
Partly

Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?
Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Computational genomics, RNA-seq and DNA-seq data analysis

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


Author Response 30 Apr 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for reviewing our manuscript and for your constructive comments. According to reviewers’ comments, we have largely revised the manuscript and updated the software. Please find our point-by-point detailed responses to the reviewers’ comments, below.

We will update the software to support kallisto and update the documentation of this package in the next scheduled release. Thank you for your suggestion.

We plan to add functions for downstream analysis based on future needs.

We modified the captions of figures in the revised manuscript.

Thanks to your and the other reviewers’ suggestions, we have implemented the GUI and released it.
Competing Interests: No competing interests were disclosed.

Reviewer Report 20 Mar 2024

Xueyi Dong, ACRF Cancer Biology and Stem Cells Division Institution, Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia; Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia

Approved with Reservations

https://doi.org/10.5256/f1000research.139847.r254597

In the manuscript "CircSeqAlignTk: An R package for end-to-end analysis of RNA-seq data for circular genomes" by Jianqiang Sun, Xi Fu and Wei Cao, the authors introduced an R package CircSeqAlignTk for RNA-seq data analysis related to circular genome, including read alignment, coverage visualization and data simulation. The authors have clearly introduced the rationale for developing this package, how the alignment to circular genome works, and how to use this package. I only have several minor comments:

The authors should add more details on the definition of alignment coverage and explain how it is calculated in this package. More descriptions should be added to the captions of Figure 2 to help the readers interpret this figure.
While the analyses are based on circular genomes, the visualization of alignment coverage was still on linear axes. I found it a bit hard to determine the end of the sequence coordinate in Figure 2 (i.e. are the blank space on the right end of x-axis regions with no coverage, or are they outside of the range of the genome?). Have you considered using circular plots?
If applicable, the performance of CircSeqAlignTk should be compared to other tools for the same or similar tasks.

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: bioinformatics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 30 Apr 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for reviewing our manuscript and for your constructive comments. We have substantially revised the manuscript in response to your comments and those of the other reviewers; please see the revised version. Our responses to the individual comments are given below.

Coverage is calculated separately for forward- and reverse-strand aligned reads from the BAM file using the `IRanges::coverage` function, which counts the number of reads covering each position of the viroid genome. Thank you for pointing this out. We have revised the manuscript and Figure 2.
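
The per-strand counting described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the package's R implementation: the function name `strand_coverage`, the read tuples and the 10-nt toy genome are all hypothetical, and the reads are assumed to be already placed on linear, 1-based inclusive coordinates.

```python
from itertools import accumulate


def strand_coverage(reads, genome_length):
    """Return per-position read depth for each strand.

    reads: iterable of (start, end, strand) with 1-based inclusive
    coordinates and strand given as "+" or "-" (hypothetical input format).
    """
    # One difference array per strand; the extra slot absorbs end + 1.
    diff = {"+": [0] * (genome_length + 1), "-": [0] * (genome_length + 1)}
    for start, end, strand in reads:
        if not (1 <= start <= end <= genome_length):
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        diff[strand][start - 1] += 1  # depth rises at the read start
        diff[strand][end] -= 1        # and falls just past the read end
    # Prefix sums turn the difference arrays into depths.
    return {s: list(accumulate(d[:genome_length])) for s, d in diff.items()}


# Hypothetical toy data: a 10-nt genome with three aligned reads.
toy_reads = [(1, 4, "+"), (3, 6, "+"), (8, 10, "-")]
coverage = strand_coverage(toy_reads, genome_length=10)
print(coverage["+"])
print(coverage["-"])
```

Each list holds one depth value per genome position, so the forward and reverse profiles can be plotted against the same coordinate axis.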

Because the expression profiles of viroid small RNAs are usually visualised on linear axes, the package plots coverage linearly by default. However, the package vignette includes an example of visualisation on circular axes (https://www.bioconductor.org/packages/release/bioc/html/CircSeqAlignTk.html).

Thank you for the constructive comments. Although many alignment tools have been developed for circular genomes, few specialise in viroid expression analysis. Among these, sRNA Profiler (Adkar-Purushothama et al.) is an alignment tool with a similar purpose to CircSeqAlignTk. However, neither in the manual nor in the source code could we find functionality to perform alignment and visualisation starting from FASTQ files. A comparative study was therefore not possible, because no other software works on the same basis as CircSeqAlignTk.
Competing Interests: The authors have no competing interests to disclose.

Reviewer Report 13 Feb 2024

Eric Soler, University Montpellier & Université de Paris, Paris & Montpellier, France

Mohammad Salma, University Montpellier & Université de Paris, Paris & Montpellier, France

Approved with Reservations

https://doi.org/10.5256/f1000research.139847.r228243

The manuscript titled “CircSeqAlignTk: An R package for end-to-end analysis of RNA-seq data for circular genomes” by Jianqiang Sun, Xi Fu, and Wei Cao describes a new tool dedicated to circular genome mapping in deep sequencing applications. While this tool could be of interest to scientists dealing with (small) circular genome sequencing, its accessibility to non-bioinformaticians/non-specialists could be improved (see comments below). We provide several suggestions to enhance the content of the manuscript.

Minor comments:

Comparative Analysis: A comparative analysis of CircSeqAlignTk with existing tools would greatly enhance the manuscript. This comparison could focus on performance metrics, user-friendliness, and specific advantages of CircSeqAlignTk. Such analysis would provide a clearer picture of the tool’s place in the current landscape of bioinformatics software.
Discussion of Limitations: A balanced discussion of any potential limitations of CircSeqAlignTk, or scenarios where it might not be the optimal choice, would provide a more comprehensive view of the tool. This discussion could guide users in making informed decisions about when to use this package.
Future Work and Enhancements: Suggestions for future enhancements would be beneficial. This could include potential areas of expansion or integration with other bioinformatics tools and workflows.
Implementation of an R Shiny Interface: Lastly, we strongly recommend the implementation of an R Shiny interface for CircSeqAlignTk, complemented by Docker integration. An R Shiny interface would significantly enhance the accessibility of the tool, especially for researchers who are not bioinformaticians. This user-friendly interface would allow a broader range of scientists to engage with and benefit from the tool, making it not just a powerful resource but also an accessible one. The ability to interact with CircSeqAlignTk through a graphical interface would streamline the analysis process and potentially increase the adoption and impact of the tool in the research community. Incorporating Docker into this solution would offer several advantages. It would not only make CircSeqAlignTk more accessible but also ensure its robustness, reproducibility, and compatibility with diverse computational environments. This approach could significantly expand the user base of CircSeqAlignTk and enhance its overall utility in the scientific community.

In conclusion, the manuscript presents a valuable tool for the analysis of RNA-seq data from circular genomes. The implementation of the suggested enhancements would, in our opinion, greatly increase the manuscript’s impact and the utility of CircSeqAlignTk to a broader scientific community.

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Genomics, epigenetics, bioinformatics

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Author Response 20 Jun 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for reviewing our manuscript and for your constructive comments. We have substantially revised the manuscript in response to your comments and those of the other reviewers; please see the revised version. Our responses to the individual comments are given below.

Thank you for the constructive comments. Although many alignment tools have been developed for circular genomes, few specialise in viroid expression analysis. Among these, sRNA Profiler (Adkar-Purushothama et al.) is an alignment tool with a similar purpose to CircSeqAlignTk. However, neither in the manual nor in the source code could we find functionality to perform alignment and visualisation starting from FASTQ files. A comparative study was therefore not possible, because no other software works on the same basis as CircSeqAlignTk.

We have added a discussion of the limitations of CircSeqAlignTk to the revised manuscript. In brief, CircSeqAlignTk is designed as a user-friendly tool and therefore exposes the important parameters that may affect the analysis results, but some finer parameter adjustments are not possible.

Currently, research on viroid expression profiling focuses on mapping viroid-derived sRNAs to viroid genomes and visualising their coverage. Our package covers all of these operations and can serve as a stand-alone tool at present. Moreover, because CircSeqAlignTk is a Bioconductor package, it can easily be combined with other R/Bioconductor packages. We also intend to update the package in line with developments in the field and the needs of future research.

Thank you for the constructive remarks. We have implemented a Shiny interface that gives access to CircSeqAlignTk for end-to-end data analysis. Users can now create an app instance with the `build_app` function and start the GUI with the `runApp` function. Regarding Docker, RNA-seq data sets are generally large, and using Docker requires mounting a data directory, which is inconvenient for users unfamiliar with character user interfaces (CUI). For this reason, we decided not to provide a Docker image at this stage.
Competing Interests: The authors have no competing interests to disclose.

Version 2

VERSION 2 PUBLISHED 27 Oct 2022

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1	2	3
Version 2 (revision) 30 Apr 24
Version 1 27 Oct 22	read	read	read

Eric Soler, University Montpellier & Université de Paris, Paris & Montpellier, France

Mohammad Salma, University Montpellier & Université de Paris, Paris & Montpellier, France
Xueyi Dong, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Alexander Zelikovsky, Georgia State University, Atlanta, USA

Bikram Sahoo, Georgia State University, Atlanta, USA

Reviewer Report

28 Apr 2024 | for Version 1

Alexander Zelikovsky, Department of Computer Science, Georgia State University, Atlanta, Georgia, USA

Bikram Sahoo, Department of Computer Science, Georgia State University, Atlanta, Georgia, USA

Approved

The authors primarily focus on the use of Bowtie2 and HISAT2 for alignment, neglecting to mention pseudoalignment tools such as kallisto. Including a brief description of such tools in the introduction would enhance the readers' understanding. Additionally, a comparative analysis of different alignment algorithms, along with the option to choose between them, would augment the robustness of the tool.
While the package utilizes SAMtools and BEDtools for coverage computation, incorporating downstream analysis tools compatible with circular genomes would enrich the utility of the tool for users.
Each figure in the article requires thorough explanation to aid interpretation, as they may be challenging to decipher at first glance. For instance, the coverage plot's significance is not easy to comprehend.
To attract non-computational users, it would be advantageous for the package to feature an easy-to-use graphical user interface (GUI), aligning with the intended accessibility for this audience.

Overall, these suggestions aim to enhance the comprehensiveness and usability of the CircSeqAlignTk package for a wider range of users.

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Computational genomics, RNA-seq and DNA-seq data analysis

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Author Response

30 Apr 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for reviewing our manuscript and for your constructive comments. In response to the reviewers’ comments, we have substantially revised the manuscript and updated the software. Our point-by-point responses are given below.

We will update the software to support kallisto and update the package documentation in the next scheduled release. Thank you for your suggestion.
We plan to add functions for downstream analysis based on future needs.
We have modified the figure captions in the revised manuscript.
Thank you for this suggestion, which other reviewers also made. We have implemented the GUI and released it.

Competing Interests

No competing interests were disclosed.

Back to all reports

Reviewer Report

23 Views

20 Mar 2024 | for Version 1

23 Views Cite this report Responses(1)

Approved With Reservations

The authors should add more details on the definition of alignment coverage and explain how it is calculated in this package. More descriptions should be added to the captions of Figure 2 to help the readers interpret this figure.
While the analyses are based on circular genomes, the visualization of alignment coverage was still on linear axes. I found it a bit hard to determine the end of the sequence coordinate in Figure 2 (i.e. are the blank space on the right end of x-axis regions with no coverage, or are they outside of the range of the genome?). Have you considered using circular plots?
If applicable, the performance of CircSeqAlignTk should be compared to other tools for the same or similar tasks.

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

bioinformatics

Respond to this report

Responses (1)

Author Response

30 Apr 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for reviewing our manuscript and for your constructive comments. We have largely revised the manuscript according to your and other reviewers’ comments. Please check the revised manuscript. Additionally, below are our responses to the individual comments.

The coverage is calculated separately for forward and reverse aligned-reads from the BAM file using the `IRanges::coverage` function which counts the number of reads covering each position of viroid-genome. Thank you for pointing this out. We have revised the manuscript and Figure 2.
Because the expression profile of viroids’ small RNA is often visualised in linear axes, this package visualises in a linear fashion by default. However, the vignette of this package introduces an example of visualisation with circular axes (https://www.bioconductor.org/packages/release/bioc/html/CircSeqAlignTk.html).
Thank you for the constructive comments. Although many alignment tools have been developed for circular genomes, few specialise in viroid expression analysis. Among these, the sRNA Profiler (Adkar-Purushothama et al) is an alignment tool with similar purpose to CircSeqAlignTk. However, neither by checking the manual nor by reading the code we could not find the functionality to perform alignment and visualisation from FASTQ files. Thus, comparative studies were not possible due to the lack of software on the same basis as CircSeqAlignTk.

View more View less

Competing Interests

The authors have no competing interests to disclose.

Back to all reports

Reviewer Report

32 Views

13 Feb 2024 | for Version 1

Eric Soler, University Montpellier & Université de Paris, Paris & Montpellier, France

Mohammad Salma, University Montpellier & Université de Paris, Paris & Montpellier, France

32 Views Cite this report Responses(1)

Approved With Reservations

Comparative Analysis: A comparative analysis of CircSeqAlignTk with existing tools would greatly enhance the manuscript. This comparison could focus on performance metrics, user-friendliness, and specific advantages of CircSeqAlignTk. Such analysis would provide a clearer picture of the tool’s place in the current landscape of bioinformatics software.
Discussion of Limitations: A balanced discussion of any potential limitations of CircSeqAlignTk, or scenarios where it might not be the optimal choice, would provide a more comprehensive view of the tool. This discussion could guide users in making informed decisions about when to use this package.
Future Work and Enhancements: Suggestions for future enhancements would be beneficial. This could include potential areas of expansion or integration with other bioinformatics tools and workflows.
Implementation of an R Shiny Interface: Lastly, we strongly recommend the implementation of an R Shiny interface for CircSeqAlignTk completed by Docker integration. An R Shiny interface would significantly enhance the accessibility of the tool, especially for researchers who are not bioinformaticians. This user-friendly interface would allow a broader range of scientists to engage with and benefit from the tool, making it not just a powerful resource but also an accessible one. The ability to interact with CircSeqAlignTk through a graphical interface would streamline the analysis process and potentially increase the adoption and impact of the tool in the research community. Incorporating Docker into this solution would offer several advantages. It would not only make CircSeqAlignTk more accessible but also ensure its robustness, reproducibility, and compatibility with diverse computational environments. This approach could significantly expand the user base of CircSeqAlignTk and enhance its overall utility in the scientific community.

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Genomics, epigenetics, bioinformatics

Respond to this report

Responses (1)

Author Response

20 Jun 2024

Jianqiang Sun, Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, 305-8604, Japan

Thank you for the constructive comments. Although many alignment tools have been developed for circular genomes, few specialise in viroid expression analysis. Among these, sRNA Profiler (Adkar-Purushothama et al.) is an alignment tool with a purpose similar to that of CircSeqAlignTk. However, neither the manual nor the source code revealed any functionality for performing alignment and visualisation starting from FASTQ files. A comparative study was therefore not possible, because no available software operates on the same basis as CircSeqAlignTk.
We have added a discussion of the limitations of CircSeqAlignTk to the revised manuscript. In brief, CircSeqAlignTk is designed to be user-friendly and offers a way to adjust the important parameters that may affect the analysis results, but some minor parameters cannot be adjusted.
Currently, research on viroid expression profiling focuses on mapping viroid-derived sRNAs to viroids and visualising their coverage. Our package covers all of these operations and is sufficient as a stand-alone tool at present. Moreover, because CircSeqAlignTk is a Bioconductor package, it can easily be used together with other R/Bioconductor packages. We also intend to update the package in line with the direction and needs of future research.
Thank you for the constructive remarks. We have implemented a Shiny interface that gives access to CircSeqAlignTk for end-to-end data analysis. Users can now create an app instance with the `build_app` function and start the GUI with the `runApp` function. Regarding Docker, RNA-seq data sets are generally large, and using Docker requires mounting a data directory, which is quite inconvenient for users unfamiliar with a character user interface (CUI). For this reason, we decided not to provide a Docker image at this stage.
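
As an illustration of the two-step GUI start-up described above, a minimal sketch in R follows. It assumes that the CircSeqAlignTk and shiny packages are installed and that `build_app` can be called without arguments; the wrapper name `launch_circseqaligntk_gui` is hypothetical and is used only to keep the example from failing where the packages are missing.

```r
# Minimal sketch: start the CircSeqAlignTk GUI via build_app() and runApp().
# Assumptions: CircSeqAlignTk and shiny are installed, and build_app() needs
# no arguments. launch_circseqaligntk_gui() is a hypothetical helper name.
launch_circseqaligntk_gui <- function() {
    has_pkgs <- requireNamespace("CircSeqAlignTk", quietly = TRUE) &&
        requireNamespace("shiny", quietly = TRUE)
    if (!has_pkgs) {
        message("CircSeqAlignTk and shiny must be installed to start the GUI.")
        return(invisible(FALSE))
    }
    # Step 1: create the Shiny app instance.
    app <- CircSeqAlignTk::build_app()
    # Step 2: start the GUI; this call blocks until the app is closed.
    shiny::runApp(app)
    invisible(TRUE)
}

# Only start the GUI in an interactive R session.
if (interactive()) {
    launch_circseqaligntk_gui()
}
```

Because the GUI runs in a local R session, input files are read directly from the local file system, so no directory mounting is involved.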

Competing Interests

The authors have no competing interests to disclose.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] Adkar-Purushothama CR, Sridharan Iyer P, Sano T, et al.: sRNA Profiler: a user-focused interface for small RNA mapping and profiling. Cells. 2021; 10(7): 1771.

[2] Ayad LAK, Pissis SP: MARS: improving multiple circular sequence alignment using refined sequences. BMC Genomics. 2017; 18: 86.

[3] Benson DA, Cavanaugh M, Clark K, et al.: GenBank. Nucleic Acids Res. 2013; 41: D36–D42.

[4] Chang W, Cheng J, Allaire JJ, et al.: shiny: Web Application Framework for R. R package version 1.8.0. 2024.

[5] Hull R: Plant Virology (fifth edition). Cambridge, Massachusetts, US: Academic Press; 2014.

[6] Kim D, Paggi JM, Park C, et al.: Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 2019; 37: 907–915.

[7] Langmead B, Salzberg S: Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012; 9: 357–359.

[8] Leinonen R, Sugawara H, Shumway M, on behalf of the International Nucleotide Sequence Database Collaboration: The sequence read archive. Nucleic Acids Res. 2011; 39: D19–D21.

[9] Li H, Handsaker B, Wysoker A, et al.: The sequence alignment/map format and SAMtools. Bioinformatics. 2009; 25(16): 2078–2079.

[10] Quinlan AR, Hall IM: BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010; 26(6): 841–842.

[11] Sastry KS: Impact of virus and viroid diseases on crop yields. Plant virus and viroid diseases in the tropics. Dordrecht: Springer; 2013.

[12] Schubert M, Lindgreen S, Orlando L: AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes. 2016; 9: 88.

[13] Soliman T, Mourits MCM, Oude Lansink AGJM, et al.: Quantitative economic impact assessment of an invasive plant disease under uncertainty – A case study for potato spindle tuber viroid (PSTVd) invasion into the European Union. Crop Prot. 2012; 40: 28–35.

[14] Soneson C: Rhisat2: R wrapper for HISAT2 aligner. R package version 1.12.0. GitHub. [accessed October 6, 2022].

[15] Sun J, Fu X, Cao W: CircSeqAlignTk. Zenodo. [Dataset]. 2022.

[16] Vihervaara A, Duarte FM, Lis JT: Molecular mechanisms driving transcriptional stress responses. Nat. Rev. Genet. 2018; 19: 385–397.

[17] Wei Z, Zhang W, Fang H, et al.: esATAC: an easy-to-use systematic pipeline for ATAC-seq data analysis. Bioinformatics. 2018; 34(15): 2664–2665.

[18] Zanardo LG, de Souza GB, Alves MS: Transcriptomics of plant–virus interactions: a review. Theor. Exp. Plant Physiol. 2019; 31: 103–125.
