Software Tool Article

iMutSig: a web application to identify the most similar mutational signature using shiny

[version 1; peer review: 2 approved with reservations]
PUBLISHED 10 Jun 2020

This article is included in the RPackage gateway.

Abstract

Two frameworks are commonly used to characterize mutational signatures, which describe the nucleotide patterns that arise from mutational processes. Estimated mutational signatures obtained by fitting these two methods to human cancer data can be found online, on the Catalogue Of Somatic Mutations In Cancer (COSMIC) website or in a GitHub repository. The two frameworks make differing assumptions regarding the independence of base pairs and for that reason may produce different results. Consequently, there is a need to compare and contrast the results of the two methods, but no such tool currently exists. In this paper, we provide a simple and intuitive interface that allows such comparisons to be easily performed. When using our software, the user may download published mutational signatures of either type. Mutational signatures from the pmsignature data source are expanded to probabilistic vectors over the 96 possible mutation types, the same model specification used by COSMIC, and then compared to COSMIC signatures. Cosine similarity measures the extent of signature similarity. iMutSig provides a simple and user-friendly web application allowing researchers to compare signatures from COSMIC to those from pmsignature, and vice versa. Furthermore, iMutSig allows users to input a self-defined mutational signature and examine its similarity to published signatures from both data sources. iMutSig is accessible online and source code is available for download on GitHub.

Keywords

Mutational Signatures, pmsignature, COSMIC, Web interface, Shiny, R

Introduction

Each human is subject to a variety of mutational processes throughout their lifetime, which result in a catalog of somatic mutations that forms their unique mutational profile1. A mutational signature captures the pattern of mutations and the contexts in which those mutations occur (i.e., the neighboring bases). Examples of important mutational processes with distinct mutational signatures include aging and ultraviolet (UV) radiation. Additionally, many research groups are performing analyses to discover de novo mutational signatures in cancer1–4.

Currently, there are two frameworks used to characterize and visualize mutational signatures5,6. The first, proposed by Alexandrov et al., uses a vector of 96 probabilities to capture the possible combinations of the six nucleotide substitutions (C>A, C>T, C>G, T>A, T>C, T>G) and the neighboring base immediately on each of the 5′ and 3′ sides of the mutated base1. A list of published mutational signatures can be downloaded from the Catalogue Of Somatic Mutations In Cancer (COSMIC) website7 (version 2, v2). Later, Alexandrov et al. published an expanded set of mutational signatures, which is referred to as version 3 (v3)8. The 67 COSMIC v3 Single Base Substitution (SBS) signatures include 30 v2 signatures. Based on the signature concept, but using different model assumptions, Shiraishi et al. proposed a mixed-membership model, pmsignature, which substantially reduces the number of parameters needed to characterize a signature9. They achieved this by assuming independence across bases9, thereby reducing the number of parameters from 6*4*4-1 = 95 to (6-1)+(4-1)+(4-1) = 11. The reduction in the number of parameters is greater if more flanking bases are included. All of Shiraishi et al.’s signatures can be downloaded from their GitHub repository9. In this paper, we will refer to signatures resulting from these two methods as “COSMIC signatures” with version numbers (for those resulting from Alexandrov et al.’s method) and “PM signatures” (for those resulting from Shiraishi et al.’s method).
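
The parameter counts quoted above can be checked with a short sketch. It is only an illustration of the arithmetic, not code from pmsignature or iMutSig; the function names are hypothetical, and the transcription-strand feature is deliberately left out.

```python
def full_model_params(n_bases: int) -> int:
    """Free parameters when every (substitution, flanking context) pair
    gets its own probability, as in the 96-type COSMIC representation.

    n_bases counts the mutated base plus its flanking bases (3 = one
    flanking base on each side).
    """
    n_flanking = n_bases - 1
    # 6 substitutions times 4 choices per flanking base, minus 1 because
    # the probabilities must sum to one.
    return 6 * 4 ** n_flanking - 1


def independent_model_params(n_bases: int) -> int:
    """Free parameters when the substitution and each flanking position
    are modelled independently, as pmsignature assumes."""
    n_flanking = n_bases - 1
    # 5 free parameters for the substitution, 3 per flanking position.
    return (6 - 1) + n_flanking * (4 - 1)


for n_bases in (3, 5):
    print(n_bases, full_model_params(n_bases), independent_model_params(n_bases))
```

With one flanking base on each side the counts are 95 and 11, as stated in the text; with two flanking bases on each side the gap widens to 1535 versus 17.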

A large number of researchers have published scientific findings resulting from the COSMIC signature-based method10–12, which was defined as the “gold standard” in the field by Baez-Ortega et al.6. Meanwhile, an increasing number of researchers are using the pmsignature-based method for samples with lower numbers of somatic variants, because it requires fewer parameters9,13,14. Given that both methods are widely used, investigators need the ability to compare results from their analysis with those reported in earlier databases, which may have been produced using the alternate method. For example, researchers have adopted both tools for gastric cancer and tried to compare and integrate the information from the two data sources in a somewhat ad hoc manner15. No rigorous tool exists for this task. In this paper we present iMutSig, an easy-to-use tool that allows users to identify the most similar mutational signatures across methods and to compare the information characterizing those signatures using simple point-and-click navigation.

Methods

Implementation

In order to measure the similarity between mutational signatures across the two databases, we need to represent PM signatures in a way that is comparable with those from COSMIC. To do this we convert the PM signature into a probabilistic vector with the same length as the COSMIC signature, i.e., 96. To calculate each of the 96 resulting probabilities in the vector, we take the constituent components that make up a COSMIC mutation type (the nucleotide substitution and the two flanking bases at the -1 and +1 positions), calculate the probability of each component for the given PM signature, and then multiply those probabilities using pmsignature’s assumption of independence. For example, to calculate the probability of the COSMIC mutation type C[C>A]T we multiply three of the PM signature’s probabilities: P(C at pos -1), P(C>A), and P(T at pos +1). This example is shown in Table 1 and Equation 1.

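$$P(\mathrm{C[C{>}A]T}) = P(\text{C at pos } {-1}) \times P(\mathrm{C{>}A}) \times P(\text{T at pos } {+1}) = 0.052 \times 0.012 \times 0.116 = 7.24 \times 10^{-5} \tag{1}$$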

Table 1. An example of PM signatures. Entries marked with an asterisk are the three probabilities used in Equation 1.

Nucleotide substitution
C>A      C>G      C>T      T>A      T>C      T>G
0.012*   0.003    0.879    0.003    0.090    0.014

Flanking bases
Position   A        C        G        T
-2         0.159    0.042    0.486    0.314
-1         0.044    0.052*   0.870    0.034
+1         0.076    0.237    0.571    0.116*
+2         0.245    0.247    0.256    0.252

Transcription strand
Plus     Minus
0.511    0.489
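
The expansion described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the iMutSig implementation: the dictionary and function names are hypothetical, the numbers are the substitution and -1/+1 flanking probabilities of the example PM signature (with the three values 0.052, 0.012 and 0.116 taken from Equation 1), and the -2/+2 positions and the strand feature are simply ignored, which under the independence assumption is equivalent to summing over them.

```python
from itertools import product

BASES = "ACGT"
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# Example PM signature (values from Table 1 and Equation 1).
substitution_prob = {"C>A": 0.012, "C>G": 0.003, "C>T": 0.879,
                     "T>A": 0.003, "T>C": 0.090, "T>G": 0.014}
flank_minus1 = {"A": 0.044, "C": 0.052, "G": 0.870, "T": 0.034}
flank_plus1 = {"A": 0.076, "C": 0.237, "G": 0.571, "T": 0.116}


def expand_pm_signature(sub, minus1, plus1):
    """Return a dict mapping each of the 96 mutation types, labelled like
    'C[C>A]T', to the product of its three component probabilities."""
    expanded = {}
    for s, five_prime, three_prime in product(SUBSTITUTIONS, BASES, BASES):
        label = f"{five_prime}[{s}]{three_prime}"
        expanded[label] = minus1[five_prime] * sub[s] * plus1[three_prime]
    return expanded


vector = expand_pm_signature(substitution_prob, flank_minus1, flank_plus1)
print(len(vector))                   # number of mutation types
print(f"{vector['C[C>A]T']:.3g}")    # the entry worked out in Equation 1
print(round(sum(vector.values()), 3))
```

The entries sum to about 1.001 rather than exactly 1 because the published probabilities are rounded to three decimals. No renormalization is applied here, since cosine similarity does not depend on the overall scale of a vector.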

Now that we have represented both forms of signature using probabilistic vectors of length n = 96, P and C say, we can directly compare the two signature types. In order to measure the similarity between them we use cosine similarity, CS, defined as shown in Equation 2:

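$$\mathrm{CS}(P, C) = \frac{P \cdot C}{\lVert P \rVert \, \lVert C \rVert} = \frac{\sum_{i=1}^{n} P_i C_i}{\sqrt{\sum_{i=1}^{n} P_i^2} \, \sqrt{\sum_{i=1}^{n} C_i^2}} \tag{2}$$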

Intuitively speaking, cosine similarity is the cosine of the angle between the two vectors. Because all entries of a signature are non-negative probabilities, cosine similarity here ranges from 0 to 1 (inclusive). In our context, if two mutational signatures have a cosine similarity of 1, the angle between them is 0° and, since both vectors sum to one, they must be identical; in contrast, if two mutational signatures have a cosine similarity of 0, they are maximally dissimilar (i.e., orthogonal). We compute the cosine similarity between the input signature and each of the candidate signatures, sort the similarities from highest to lowest, and identify the candidate signature with the highest cosine similarity as the most similar mutational signature.
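
The ranking step can be illustrated with a minimal sketch of Equation 2. The function names and the three-element toy vectors are assumptions made for brevity; real signatures have 96 entries, and this is not the code used inside iMutSig.

```python
import math


def cosine_similarity(p, c):
    """Cosine similarity between two equal-length numeric vectors (Equation 2)."""
    if len(p) != len(c):
        raise ValueError("vectors must have the same length")
    dot = sum(pi * ci for pi, ci in zip(p, c))
    norm_p = math.sqrt(sum(pi * pi for pi in p))
    norm_c = math.sqrt(sum(ci * ci for ci in c))
    if norm_p == 0 or norm_c == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_p * norm_c)


def rank_candidates(query, candidates):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scores = {name: cosine_similarity(query, vec) for name, vec in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: hypothetical 3-element "signatures", not real COSMIC or PM values.
query = [0.7, 0.2, 0.1]
candidates = {
    "toy_A": [0.7, 0.2, 0.1],
    "toy_B": [0.1, 0.2, 0.7],
    "toy_C": [0.0, 0.0, 1.0],
}
ranking = rank_candidates(query, candidates)
best_name, best_score = ranking[0]
print(best_name, round(best_score, 3))
```

The first element of the returned list is the best match; the remaining entries give the sorted table of similarities.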

Operation

iMutSig is built in R with its key features depending on the R package, pmsignature9. As shown in Figure 1, the Shiny app currently supports three possible workflows for users to choose from, depending on the type of signatures they have already obtained: 1) starting with a COSMIC signature; 2) starting with a PM signature; 3) starting with a self-defined signature that could follow either the COSMIC or PM format.

Figure 1. Overview of three workflows in the iMutSig interface.

The first two tabs allow users to find the most similar PM signature to an input COSMIC signature (highlighted in green) and vice versa (highlighted in orange). In addition, users can identify the most similar signatures from both data sources to an input signature (highlighted in blue).

The first tab in the Shiny app window, “COSMIC to pmsignature”, allows users to select an input COSMIC signature via a drop-down list and returns the best-matched PM signature. The returned results are organized separately in the top and bottom portions of the page. The top half of the tab summarizes background information regarding the input signature by presenting: 1) plots of the input signature and its membership among all cancer types, i.e., in which kinds of cancer the mutational signature has been found; 2) a table showing the cosine similarity between this signature and all PM signatures, sorted in decreasing order, along with a similarity heatmap whose color and intensity are proportional to the assessed similarity. The bottom half of the tab presents plots and descriptions of the input COSMIC signature, the most similar PM signature, and a second PM signature that the user can select. Thus, users can easily access all the vital information and results regarding these signatures rather than having to manually gather and organize information from publications. The top half of the tab is automatically updated via a control panel in the middle section of the tab, which enables users to select a signature to start with and also highlights information about the currently selected signature, the most similar signature from the alternate model framework, and the cosine similarity.

The second tab was designed in a similar manner to the first tab, but for the case in which we are starting with a PM signature and looking for the most similar COSMIC signature. For the first two tabs, users can choose which version of COSMIC signatures to input from the sub-menus, i.e., v2 or v3.

Unlike the first two tabs, the third tab enables users to enter a user-supplied signature, which can be in either PM or COSMIC format, and then identify the most similar signature from each online database. The user is asked to choose a sub-menu based on the type of the input signature and to upload a comma-separated values (CSV) file containing a single signature. A sample CSV file is provided for download to give the user a better sense of the format of the input file. The tab is then updated to display three tables, one for each data source (COSMIC v2, COSMIC v3 and PM), listing the signatures from that data source and the cosine similarity of each signature with the user-uploaded signature. The tables are ordered from the most similar to the least similar signature. In addition, the user is able to view figures of the best-matched signatures (i.e., those with the highest cosine similarity) from each data source, allowing users to observe any similarities and dissimilarities. Below the figures, users will see a list of cancer types that contain the best-matched signature.
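
Before a user-supplied COSMIC-format signature is compared, it is worth checking that the file really holds one valid probability vector. The sketch below shows one way to do that in Python. The column names `mutation_type` and `probability` are a hypothetical layout chosen for illustration; the actual format expected by iMutSig is defined by the sample CSV file provided in the app, which is not reproduced here.

```python
import csv
import io


def read_signature_csv(handle, expected_length=96, tol=1e-6):
    """Read a single signature from a CSV file object and validate it.

    Assumes a header row with the hypothetical columns 'mutation_type'
    and 'probability'. Returns a dict mapping mutation type to probability.
    """
    signature = {}
    for row in csv.DictReader(handle):
        signature[row["mutation_type"]] = float(row["probability"])
    if len(signature) != expected_length:
        raise ValueError(f"expected {expected_length} entries, got {len(signature)}")
    if any(value < 0 for value in signature.values()):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(signature.values()) - 1.0) > tol:
        raise ValueError("probabilities must sum to 1")
    return signature


# Tiny two-row example so the sketch runs without an external file;
# a real COSMIC-format signature would have 96 rows.
sample = io.StringIO("mutation_type,probability\nA[C>A]A,0.5\nA[C>A]C,0.5\n")
signature = read_signature_csv(sample, expected_length=2)
print(signature)
```

A signature that passes these checks can then be compared against each published signature by cosine similarity.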

Use cases

We use iMutSig to identify the most similar signature for a given PM/COSMIC signature or a user-supplied signature. Figure 2 shows the input panel after inputting COSMIC v3 signature SBS1 and Figure 3 shows the input panel after inputting PM signature P1. If users provide their own signature in either COSMIC or PM format, the results appear as shown in Figure 4 and Figure 5. Consider the example shown in Figure 4, where we input COSMIC v2 signature C1. iMutSig returned the most similar signatures, COSMIC v2 signature C1, COSMIC v3 signature SBS1, and PM signature P7 (similarity = 1.000, 0.947, and 0.948, respectively), along with the names of their associated cancer types. When provided with PM signature P1, iMutSig returned COSMIC v2 signature C10, v3 signature C10a and PM signature P1 (similarity = 0.816, 0.957, 1.0, respectively).

Figure 2. Input a COSMIC v3 signature, SBS1.

Figure 3. Input a PM signature, P1.

Figure 4. Input a user-supplied COSMIC signature.

Figure 5. Input a user-supplied PM signature.

Conclusions

iMutSig is a user-friendly interactive browser-based application that allows users who have a signature that they have discovered in an analysis of their own data to identify the best-matched existing mutational signature from the COSMIC and PM databases. It also allows users to directly compare signatures between the two databases. It does this in an interactive way, and also allows straightforward visualization of results. iMutSig enables researchers to easily identify the most similar mutational signature and to easily access characteristic information from both data sources without additional software installation and programming of their own.

Data availability

All data underlying the results are available as part of the article and no additional source data are required.

Software availability

Software available from: https://zhiyang.shinyapps.io/iMutSig/

Source code available from: http://www.github.com/USCbiostats/iMutSig

Archived source code at time of publication16: https://doi.org/10.5281/zenodo.3873888

License: MIT

How to cite this article
Yang Z, Pandey P, Marjoram P and Siegmund KD. iMutSig: a web application to identify the most similar mutational signature using shiny [version 1; peer review: 2 approved with reservations]. F1000Research 2020, 9:586 (https://doi.org/10.12688/f1000research.24435.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 15 Jul 2020
Vittorio Perduca, Université de Paris, CNRS, MAP5 UMR 8145, F-75006, Paris, France 
Approved with Reservations
This paper presents an original online tool for comparing mutational signatures represented according to two alternative formats, namely COSMIC vectors with the relative frequencies of the 96 types of substitutions on one side1, and lower dimensional "pmsignature" vectors on the other side2. The ...
HOW TO CITE THIS REPORT
Perduca V. Reviewer Report For: iMutSig: a web application to identify the most similar mutational signature using shiny [version 1; peer review: 2 approved with reservations]. F1000Research 2020, 9:586 (https://doi.org/10.5256/f1000research.26954.r64570)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 19 Nov 2020
    Zhi Yang, Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, 2001 N.Soto Street, Los Angeles, 91003, USA
    This paper presents an original online tool for comparing mutational signatures represented according to two alternative formats, namely COSMIC vectors with the relative frequencies of the 96 types of substitutions on ...
Reviewer Report 22 Jun 2020
Adrian Baez-Ortega, Transmissible Cancer Group, Department of Veterinary Medicine, University of Cambridge, Cambridge, UK 
Approved with Reservations
Yang et al. present an interactive software tool, iMutSig, which allows comparison between two alternative mathematical representations of mutational signatures. Both of these representations are widely used, but are remarkably different in their visual aspect, making intuitive comparisons difficult. To ...
HOW TO CITE THIS REPORT
Baez-Ortega A. Reviewer Report For: iMutSig: a web application to identify the most similar mutational signature using shiny [version 1; peer review: 2 approved with reservations]. F1000Research 2020, 9:586 (https://doi.org/10.5256/f1000research.26954.r64568)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 19 Nov 2020
    Zhi Yang, Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, 2001 N.Soto Street, Los Angeles, 91003, USA
    We are very grateful to the reviewers for their comments and suggestions. We give detailed responses to each of those comments below.

    MAJOR COMMENTS

    1. Implementation, paragraph 2: ...