Automation of ReactomeFIViz via CyREST API

Pathway- and network-based approaches project seemingly unrelated genes onto the context of pathways and networks, providing analytical power that cannot be achieved with gene-based approaches alone. Pathway and network approaches are routinely applied in large-scale data analysis for cancer and other complex diseases. ReactomeFIViz is a Cytoscape app that provides features for researchers to perform pathway- and network-based data analysis and visualization by leveraging manually curated Reactome pathways and the highly reliable Reactome functional interaction network. To facilitate adoption of this app in bioinformatics software pipeline and workflow development, we have developed a CyREST API for ReactomeFIViz that exposes some of the app's major features. We describe a use case to demonstrate the use of this API in a Python-based notebook, and believe the new API will provide the community with a convenient and powerful tool to perform pathway- and network-based data analysis and visualization using our app in an automated way.


Introduction
Pathway- and network-based computational approaches are now routinely used in large-scale data analysis to uncover hidden patterns that are otherwise impossible to discover. These approaches project significant genes, proteins, metabolites, and other kinds of biological entities collected from other approaches onto the context of pathways and networks, which embody knowledge produced by many years of experimental studies. Cytoscape 1 is the most popular biological network visualization and analysis platform, widely used in the research community to perform pathway and network analysis and visualization. The release of the CyREST app 2 makes Cytoscape an integrative and indispensable tool for building automated software pipelines and workflows, via a RESTful API, in programming languages widely used by the bioinformatics and computational biology community, including Python and R. The standalone Java-based Cytoscape application thereby functions as a microservice servlet exposing the major features of Cytoscape.
Reactome 3 is the most comprehensive open source biological pathway knowledgebase, widely used in the research community, with its web site accessed by roughly 60,000 unique IP addresses per month. To perform genome-scale network-based data analysis and visualization, we have also constructed a highly reliable Reactome functional interaction (FI) network by extracting FIs from manually curated pathways in Reactome and other popular large-scale pathway databases and by predicting FIs with a machine learning approach 4. Based on this FI network and the high-quality Reactome pathways, we have developed a Cytoscape app called "ReactomeFIViz" 5, which is one of the most popular Cytoscape apps, downloaded over 30,000 times since it was released in the Cytoscape app store in September 2013.
ReactomeFIViz provides a suite of features to help users perform pathway- and network-based data analysis and visualization for cancer and other complex diseases. Users can construct an FI subnetwork based on a set of genes, perform network clustering, annotate the resulting network modules using Reactome pathways and Gene Ontology terms, and, if clinical data is available, perform survival analysis to search for gene signatures related to patient overall survival. Users can also explore Reactome pathways directly inside Cytoscape, perform pathway enrichment analysis for a set of genes, and conduct pathway modeling using multiple types of omics data based on factor graphs converted automatically from Reactome pathways. Recently we added new features for users to visualize FDA-approved cancer drugs and their target interactions in the context of Reactome pathways and the FI network, and to perform fuzzy logic based modeling to study the effects of drug application on pathway activities (see the ReactomeFIViz wiki page).
To help third-party software tool developers integrate the powerful network and pathway analysis features provided in ReactomeFIViz, we implemented a new CyREST API. The current version of this API is focused on FI network construction and Reactome pathway enrichment analysis for a set of genes.

Methods
To develop the CyREST API for ReactomeFIViz, we followed the recommended procedures described in the Cytoscape wiki on adding Automation to existing apps. To handle the complex data models used in ReactomeFIViz, we chose the Functions over Commands approach by adding a JAX-RS resource to the existing ReactomeFIViz code base. In brief, we added two new Java packages, org.reactome.cytoscape.rest and org.reactome.cytoscape.rest.tasks, and refactored the original tasks into ObservableTasks. All refactored ObservableTasks are grouped into package org.reactome.cytoscape.rest.tasks, and their execution is managed by a SynchronousTaskManager object and monitored by their respective TaskObserver objects.
The ReactomeFIViz CyREST API is specified in a Java interface, ReactomeFIVizResource, and documented using Swagger UI as Java annotations for methods defined in the interface. The implementation of ReactomeFIVizResource is provided in class ReactomeFIVizResourceImp. Both the interface and the implementation are placed in package org.reactome.cytoscape.rest.
The ReactomeFIViz CyREST API is powered by the CyREST app using its embedded lightweight Grizzly HTTP server. CyREST delegates all RESTful API calls to ReactomeFIViz, which then calls the ReactomeFIViz RESTful server via its RESTful API. The ReactomeFIViz server fetches the Reactome content from databases hosted in a MySQL database engine via a Hibernate API and the in-house built Reactome Java API (Figure 1).

Operation
Currently, the ReactomeFIViz CyREST API provides 8 methods (Table 1). These methods allow third-party workflow and pipeline developers to construct an FI sub-network based on a set of genes listed in a variety of file formats, annotate the displayed network using collected pathways or GO biological process, molecular function, or cellular component terms, perform network clustering, and then annotate network modules. These methods also allow them to perform Reactome pathway enrichment analysis for a set of genes and then export pathway diagrams. The CyREST API document for ReactomeFIViz, which is accessed via menu Help/Automation/CyREST API, provides detailed description about all these resources.

Amendments from Version 1
The major changes we have made according to comments from reviewers are below:
1. Added a new table (Table 2) listing detailed information about actions performed in the use case workflow.
2. We have updated Figure 2 by changing words "Download Diagram" to "Export Diagram" in the figure.
3. Added some new words in the main text (the second paragraph in the Workflow section) to provide ReactomeFIViz CyREST API function names, assisting readers to understand how to reproduce the workflow without going to the Jupyter notebook.
4. Added a sentence to describe the overlapped genes between OV modules and two BRCA modules, 13 and 17, which are not significantly overlapped with any OV module.

Results
ReactomeFIViz CyREST API provides a set of URL-based language-neutral functions, which accept parameters and return results in the JSON format. As with any other CyREST API, it can be easily integrated into Python, R, or any other programming language as long as it supports HTTP-based function calling.
Here we describe a use case based on a Jupyter notebook to showcase the use of this API in workflow development.
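As an illustration of the URL-plus-JSON calling convention described above, the following minimal Python sketch prepares a request for the buildFISubNetwork function without sending it. Several details are assumptions rather than documented facts: the "reactomefiviz/v1" path prefix, the use of POST, and the "file" parameter name are hypothetical placeholders, while port 1234 is the CyREST default. The exact resource paths and parameter names should be taken from the CyREST API document (menu Help/Automation/CyREST API).

```python
import json
from urllib.request import Request

# Assumption: CyREST listens on its default port 1234. The "reactomefiviz/v1"
# prefix is a hypothetical placeholder; check the Swagger document for the real path.
BASE_URL = "http://localhost:1234/reactomefiviz/v1"


def build_request(function_name, params=None, base_url=BASE_URL):
    """Prepare (but do not send) a JSON request for one API function."""
    body = json.dumps(params or {}).encode("utf-8")
    return Request(
        url=f"{base_url}/{function_name}",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",  # assumption: the function is exposed as a POST resource
    )


# "format" and "fiVersion" follow the values quoted for the use case;
# the "file" key and its path are hypothetical.
fi_request = build_request(
    "buildFISubNetwork",
    {"format": "MAF", "fiVersion": "2016", "file": "/path/to/OV.maf"},
)
```

With Cytoscape, CyREST, and ReactomeFIViz running, such a request could be sent with urllib.request.urlopen(fi_request) and the JSON response decoded with json.load. The same pattern carries over to R or any other language with an HTTP client.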

Workflow
A previous study 6 has shown the genomic similarity between high-grade serous ovarian tumors and basal-like breast tumors based on multiple omics data types, including copy number variants (CNVs), somatic mutations, and mRNA gene expression. To demonstrate use of the ReactomeFIViz CyREST API, we perform a network- and pathway-based comparison analysis between genes having somatic mutations in TCGA ovarian cancer and breast cancer. Our analysis is focused on showing the utility of our API. Therefore, we use all TCGA breast cancer samples without subtyping to simplify our workflow. Figure 2 shows the workflow of this use case, and Table 2 lists the detailed information for actions performed in the workflow.
The TCGA BRCA (breast invasive carcinoma) and OV (ovarian serous cystadenocarcinoma) mutation data were downloaded from the Broad firehose web site in the mutation annotation file (MAF) format using its RESTful API, and stored in two local files, one for each cancer type. Each MAF file was then loaded into Cytoscape via the CyREST API function buildFISubNetwork to construct an FI sub-network, after choosing a sample cutoff that selects genes forming a network of about 500 genes. Each FI network was then subjected to network clustering analysis using the cluster call. The two sets of network modules from network clustering were compared in the Python notebook to find modules shared and not shared between these two cancers. Pathway enrichment analysis was performed using ReactomePathwayEnrichment to collect pathways not shared between them. These results suggest common and cancer-specific network and pathway patterns, helping researchers understand shared and specific oncogenesis mechanisms in these two cancers.
To visualize mutated genes in the context of Reactome pathways, the notebook also generated two pathway diagrams, one for each cancer, and saved them into the working directory as PDF files. Entities in the pathway diagrams composed of mutated genes are highlighted in purple.
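Two notebook-side steps of this workflow, choosing a sample cutoff and comparing modules between cancers, can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the code used in the notebook: the cutoff rule (smallest cutoff that leaves at most the target number of genes) and the use of a hypergeometric test to score module overlap are our assumptions, and the function names and input shapes (a mapping from gene to the set of mutated samples, and modules as collections of gene symbols) are hypothetical.

```python
from math import comb


def pick_sample_cutoff(gene_to_samples, target_size=500):
    """Return the smallest cutoff that keeps at most target_size genes.

    A gene passes cutoff k when it is mutated in at least k samples.
    (Assumed rule; the notebook may select the cutoff differently.)
    """
    counts = [len(samples) for samples in gene_to_samples.values()]
    for cutoff in range(1, max(counts, default=0) + 2):
        if sum(c >= cutoff for c in counts) <= target_size:
            return cutoff


def overlap_pvalue(module_a, module_b, universe_size):
    """Hypergeometric upper-tail probability of the observed gene overlap.

    universe_size is the number of genes that could appear in a module,
    for example the number of genes in the FI network (an assumption).
    """
    a, b = set(module_a), set(module_b)
    shared = len(a & b)
    total = comb(universe_size, len(b))
    tail = sum(
        comb(len(a), k) * comb(universe_size - len(a), len(b) - k)
        for k in range(shared, min(len(a), len(b)) + 1)
    )
    return tail / total
```

In a comparison of this kind, each BRCA module would be scored against each OV module, and a module with no significant overlap in the other cancer would be treated as cancer-specific. Any multiple-testing correction is left out of the sketch.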

Discussion
Reactome provides a large set of high-quality manually curated pathways. The Reactome FI network provides a genome-scale, highly reliable functional interaction network covering over 60% of total human genes. The ReactomeFIViz CyREST API delivers language-neutral REST-based functions for third-party software developers to leverage high-value resources provided by the Reactome project in their own software tools.
The current set of functions implemented in this version of the ReactomeFIViz CyREST API focuses on some major features implemented in ReactomeFIViz, related to FI network construction, clustering, and Reactome pathway enrichment analysis. As shown in the above use case, the API is easy to combine with other CyREST APIs and to integrate into a Python or R programming environment to perform Reactome-related pathway and network analysis and visualization.
We will expose other ReactomeFIViz features in the CyREST API, including gene expression data analysis, network module-based survival analysis, pathway modeling based on Boolean networks and probabilistic graphical models, and cancer drug visualization and simulation. We will also develop a Python package for easy third-party tool integration.

Competing interests
No competing interests were disclosed.

Grant information
This project is supported by an NIH grant (5U41HG003751).

John H. Morris
Resource for Biocomputing, Visualization and Informatics, Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
In this article, the authors have described their extensions to the popular ReactomeFIViz Cytoscape App to support automation via CyREST. They do an excellent job providing the motivation for their extension. The manuscript was clear and well-written, and I believe that the new REST APIs will be quite useful.
I do have a problem with the manuscript, however, in that in the Workflow section, they state: "The MAF was then loaded into Cytoscape via the CyREST API to construct a FI sub-network after choosing a sample cutoff to select genes forming a network composed of about 500 genes. The FI-network was then subject to network clustering analysis. Two sets of network modules from network clustering were compared to find modules shared and not shared between these two cancers. Pathway enrichment analysis was performed to collect pathways not shared between them." My problem with this is that they don't provide enough detail on the actual CyREST calls and parameters they used to allow us to understand how to reproduce it. Which parts of their analysis were done in Python, and which were done in Cytoscape through the GUI? It is true that all of my answers exist in the Jupyter notebook they provided, but it takes a lot of work to figure out that they used the "buildFISubNetwork" method with the "MAF" format and the fiVersion "2016". My suggestion is to add at least enough information either in the figure or the text to help others understand the REST flow for their App without referring to the notebook.

Is the rationale for developing the new software tool clearly explained? Yes
Is the description of the software tool technically sound? Yes

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 20 May 2018
Guanming Wu, Oregon Health & Science University, USA
Dear Dr. Morris,
Thanks a lot for reviewing our paper and for your comment. In version 2 of our paper, we have added a new table listing detailed information about actions performed in the use case workflow (Table 2), and added new words in the main text (the second paragraph in the Workflow section) to help readers understand how to reproduce the workflow without going to the Jupyter notebook. We hope our revisions have addressed your concern.
No competing interests were disclosed.

The authors describe the development of CyREST API for ReactomeFIViz, a Cytoscape app, which exposes some Reactome functions as REST APIs for external software components, to process pathway and network data in automatic and reproducible workflows built using almost any programming language. The source code provided for ReactomeFIViz and for the Python use case notebook makes the use of CyREST API easier and simpler to understand.
The manuscript is well-organized and the described app will be highly valuable for users working with big data related to enrichment analysis and visualization of networks.
I have only some minor comments:
In the Introduction section (pag.2): the sentence "These approaches project significant genes, proteins, metabolites, and other kinds of biological entities collected from pre-analysis onto pathways and networks, knowledge produced by many years' experimental studies.", is not quite understandable, I think it should be rephrased.
The Methods section (pag.2) is clearly explained and sufficient details are provided to allow replication of the method development and its use by others.
The section Operation (pag.2) I think is unnecessary, the authors can add it as the subsection of Methods because here they are explaining the methods available in the CyREST API developed.
Thanks a lot for reviewing our paper and for your comments. We have made changes to address some of your comments. Please see details below:
1). In the Introduction section (pag.2): the sentence "These approaches project significant genes, proteins, metabolites, and other kinds of biological entities collected from pre-analysis onto pathways and networks, knowledge produced by many years' experimental studies.", is not quite understandable, I think it should be rephrased.

We have rephrased this sentence and hope the meaning is clear now.
2). The Methods section (pag.2) is clearly explained and sufficient details are provided to allow replication of the method development and its use by others.

Thanks!
3). The section Operation (pag.2) I think is unnecessary, the authors can add it as the subsection of Methods because here they are explaining the methods available in the CyREST API developed.

This section is suggested by the article template provided by the journal editor.
4). In the Analysis results subsection (pag.3): I think it's possible to add a figure (such as a Venn Diagram) to better explain the results obtained.
-What is the way, in which the enriched signaling pathways are involved in BRCA and/or OV carcinoma?)

This is a great question. However, this short software article is written to showcase how to use the new ReactomeFIViz CyREST API. We have refrained from discussing the biological implications of our analysis results. However, the exported pathway diagrams may answer questions like this, which will be addressed in our future projects.
6). More references to supporting literature must be provided in the manuscript in order to further substantiate the claims.

Thanks for the comment. Along with listed references, we have also linked to external resources using hyper-links in the main text. We feel that we have cited enough supporting references for this on-line short software article.
Best regards, Guanming Wu, Ph.D.
No competing interests were disclosed.
