eXamine: Visualizing annotated networks in Cytoscape

eXamine is a Cytoscape app that displays set membership as contours on top of a node-link layout of a small graph. In addition to facilitating interpretation of enriched gene sets of small biological networks, eXamine can be used in other domains such as the visualization of communities in small social networks. eXamine was made available on the Cytoscape App Store in March 2014, has since registered more than 7,700 downloads, and has been highly rated by more than 25 users. In this paper, we present eXamine's new automation features that enable researchers to compose reproducible analysis workflows to generate visualizations of small, set-annotated graphs.

This article is included in the Cytoscape Apps gateway.

Introduction
The Cytoscape app eXamine visualizes a small graph and a collection of node sets. The main purpose of eXamine is to aid in the interpretation of a small subnetwork module extracted from a large biological network 1 . Cytoscape apps like jActiveModules 2 or external tools like Heinz 3 extract subnetwork modules from a protein-protein interaction network given gene expression data. To interpret the identified subnetwork module, a frequent follow-up analysis is to compute enrichment of the nodes of the identified module in terms of known annotations such as from the Gene Ontology (GO) 4,5 or from the Kyoto Encyclopedia of Genes and Genomes (KEGG) 6 . These annotations are a collection of node sets. eXamine provides a visual analysis approach that facilitates interpretation of the subnetwork module and the identified node sets by biologists. Given a small collection of node sets, eXamine generates a visualization of a small graph as a node-link layout together with contours for the selected sets ( Figure 1). More specifically, the layout is computed using a variation of the algorithm described in 7. This algorithm preserves topological distances along inter-module links as much as possible, while making sure that none of the nodes overlap. In addition, spanning graphs are derived for those sets that have been selected by the user. These graphs are included in the computation of topological distances between nodes, pulling the nodes closer together. The spanning graphs are also used to draw the set contours, by adjusting the associated links to form the rounded shapes that visually encompass and connect nodes.
As an alternative to eXamine, the existing group layout of Cytoscape can be used to show node partitions by visualizing disjoint sets in separate circles. The Venn and Euler diagram app 8 for Cytoscape visualizes overlapping sets. However, in both cases network and group analysis are visually separate. Finally, Cytoscape provides a group viewer I that aggregates groups into meta-nodes, without making group overlaps explicit. Here, we present a new version of eXamine that uses Cytoscape's recently introduced automation features. With these new features it becomes possible to create reproducible analysis workflows that generate appealing visualizations of small, set-annotated graphs. We demonstrate eXamine's new automation features using two use cases. The first use case replicates the case study provided in the original eXamine publication 1 . The second use case, the analysis of a social network, demonstrates that eXamine is applicable to other domains beyond computational biology.

Methods
Implementation
eXamine is implemented in Java and is available as an app for Cytoscape 3. We used WebCola algorithms II to simultaneously lay out nodes, links, and set contours. We refer to the original publication 1 for additional implementation details regarding the visualization techniques used. Since the typical analysis workflows of eXamine consist of relatively simple commands that do not require streaming of complex data, we implemented the automation features through the 'Command' interface rather than the 'Function' interface. As a result, eXamine's commands can be executed either directly from Cytoscape or through Cytoscape's REST (CyREST) interface from a Jupyter notebook or from the programming language R.

Operation
As eXamine requires Cytoscape 3.6 to run, eXamine has the same system requirements as Cytoscape 3.6. eXamine can be operated through the Cytoscape graphical user interface (GUI) or through the new 'Command' interface of Cytoscape. We refer to the original publication 1 for GUI instructions. In the following, we will describe and use the new 'Command' interface of eXamine. Table 1 provides a summary of the API of the eXamine commands and their parameters. To enable workflow authors to use eXamine's automation most effectively, we also generated Swagger-based documentation that describes the commands and arguments. This documentation can be accessed in the Cytoscape menu: Help → Automation → CyREST Command API. Figure 1 shows a typical workflow of eXamine analysis, where the commands provided are in the 'examine' namespace (red) and the commands provided by Cytoscape are colored blue.

Figure 1. The graphs generated in the two use cases using eXamine's automation features. (c) The first step in the workflow consists of importing a network, followed by importing node annotations that associate each node with a set of groups. Next, we optionally select a smaller subnetwork to visualize. We generate an internal representation of the groups, and import additional group annotations. After selecting the groups to visualize, we export an image of the visualization. Alternatively, we can launch a window that allows the user to select different groups.

Use cases
To illustrate the new 'Command' interface of eXamine, we present two use cases. In the first use case, we describe a workflow to study a small subnetwork module extracted from the KEGG mouse network 6 . In the second use case, we describe a workflow to study Zachary's karate club, a well-known social network. Both workflows are available as Jupyter notebooks and as R markdown documents. The workflows require Cytoscape (≥ v3.6) and a recent Python version (Python v2.7 or Python v3.6) or R (≥ v3.4). We note that the commands described in the two use cases can also be directly executed from Cytoscape via the 'Commands dialog' or a separate 'Commands script' III .

Table 1. Figure 1 shows an example workflow using the below commands.
Command | Argument (type) | Description
examine generate groups | | Generates eXamine groups from a given set of columns of the node

Use case 1: Dysregulated signaling in Human Cytomegalovirus
The Human Cytomegalovirus (HCMV) is a highly-contagious herpes virus. Previously 1 , we used eXamine to interpret a small subnetwork module (17 nodes and 18 edges) extracted from the KEGG mouse network using Heinz 3 given gene expression data of an HCMV-infected mouse cell line. Node sets of this subnetwork module were annotated using enriched pathways from KEGG and enriched terms from GO. Below, we provide Python code IV that uses Cytoscape's and eXamine's automation features to generate Figure 1. An R markdown document for this use case is available on the git repository V .
1. To begin, we define a helper function that creates HTTP POST requests.
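The listing for this helper is not preserved in this copy. A minimal sketch of such a helper is given here, assuming that Cytoscape is running locally with CyREST on its default port 1234 and using only the Python standard library. The name buildRestRequest is hypothetical, and executeRestCommand follows the call signature used in the remaining steps rather than reproducing the authors' original code.

```python
import json
import urllib.parse
import urllib.request

# Assumption: CyREST is reachable on the default port of a local Cytoscape session.
REST_ENDPOINT = "http://localhost:1234/v1/commands"


def buildRestRequest(namespace, command, args=None):
    """Build (without sending) the HTTP POST request for one Cytoscape command."""
    # Commands such as "generate groups" contain spaces, so percent-encode the path parts.
    url = "/".join([
        REST_ENDPOINT,
        urllib.parse.quote(namespace),
        urllib.parse.quote(command),
    ])
    body = json.dumps(args or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def executeRestCommand(namespace, command, args=None):
    """Send the command to Cytoscape and return the response body as text."""
    request = buildRestRequest(namespace, command, args)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Separating request construction from sending makes it possible to inspect the URL and JSON payload of a command without a running Cytoscape session.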
executeRestCommand("network", "select", {"nodeList": "Module:small"})
5. Using the 'examine generate groups' command, we generate group nodes for each member of the specified sets that occur in the module (the nodes we previously selected).
Alternatively, using the 'examine interact' command, we can launch an interactive visualization window that allows us to select different groups.
executeRestCommand("examine", "interact", {})

Use case 2: Zachary's karate club
We consider the graph 'Zachary's karate club', which is an undirected social network of friendships between 34 members of a karate club at a US university in the 1970s 10 . Here, we use eXamine to visualize the six overlapping communities of this network identified in 11. We provide Python code below VI . The corresponding R markdown document is available on the git repository VII .
1. We use the same helper function as defined in the previous use case.
2. We import the network using the 'network import url' command provided by Cytoscape.
executeRestCommand("network", "select", {"nodeList": "all"})
5. Using the 'examine generate groups' command, we generate group nodes for each member of the specified sets that annotate the nodes of the network.
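Only part of the code for these steps is preserved in this copy. As a rough sketch, the surviving command sequence can be written down as data and inspected without a running Cytoscape session. The network URL and the column name "Community" are placeholders, and the argument names "url" and "selectedGroupColumns" are assumptions that should be checked against Table 1 or the Swagger documentation; only the 'network select' call is taken from the text. The helper describeStep is hypothetical.

```python
import json
import urllib.parse

# Assumption: default address of CyREST in a local Cytoscape session.
REST_ENDPOINT = "http://localhost:1234/v1/commands"

# Placeholder: the actual dataset URL is not given in this text.
NETWORK_URL = "https://example.org/karate-club-network"

# Each step is (namespace, command, arguments).
workflow = [
    # 'url' is an assumed argument name for the 'network import url' command.
    ("network", "import url", {"url": NETWORK_URL}),
    # Taken from the text: select all nodes of the imported network.
    ("network", "select", {"nodeList": "all"}),
    # Assumed argument name and column name for 'examine generate groups'.
    ("examine", "generate groups", {"selectedGroupColumns": "Community"}),
]


def describeStep(namespace, command, args):
    """Return a one-line description of the POST request for one step."""
    endpoint = "{}/{}/{}".format(
        REST_ENDPOINT,
        urllib.parse.quote(namespace),
        urllib.parse.quote(command),
    )
    return "POST {} {}".format(endpoint, json.dumps(args, sort_keys=True))


# Dry run: print what would be sent, in order, without contacting Cytoscape.
for step in workflow:
    print(describeStep(*step))
```

Keeping the workflow as a list of steps makes the order of commands explicit and lets the same list be replayed through the helper function once the argument names have been verified.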

Discussion and conclusions
eXamine is limited to visualizing small, relatively sparse networks. It is not possible to use eXamine to construct a comprehensive layout if the network consists of hundreds of nodes or if there are dozens of annotation sets to visualize at the same time. This is a natural limitation of any visualization approach based on node-link diagrams and set contours.
eXamine currently uses Cytoscape's CyREST API to import networks and their annotations. If apps that provide gene enrichment analysis functionality, such as BiNGO 12 , were to expose this functionality through the CyREST API, we envision updating the workflow to include this type of upstream analysis. Finally, it would be good to enhance the API to return richer R and Python data types rather than a flag indicating whether the command succeeded. For instance, upon a 'generate groups' command it would be good to actually return a dataframe containing the groups. This, however, will require switching from the 'Command' interface to the 'Function' interface.

Summary
eXamine is a Cytoscape app for a set-oriented visual analysis approach for small annotated graphs that displays set membership as contours on top of a node-link layout ( Figure 1). In this paper, we presented new automation features for eXamine that are accessible through Cytoscape's REST API ( Table 1). As such, researchers can embed eXamine in reproducible and well-documented workflows that generate appealing visualizations of small, set-annotated graphs ( Figure 1). We demonstrated two such workflows in the context of computational network biology and social network analysis.
Data and software availability

Author contributions
KD and PS implemented the automation features of eXamine. MEK and GWK supervised the project. GWK, KD, MEK and PS wrote the paper. All authors read and approved the manuscript.

Competing interests
No competing interests were disclosed.

Grant information
The author(s) declared that no grants were involved in supporting this work.

Many researchers arrive at a point in enrichment analysis where they have tables of enriched ontology terms and pathways, but don't know how to go beyond presenting those data as tables or perhaps bar graphs. The use of networks is an intuitive and powerful visualization option. The eXamine app for Cytoscape provides a nice approach for displaying a simplified version of enrichment results. A couple of steps are missing, however, and a few minor edits are suggested.

Missing steps:
All sources of enrichment that I'm aware of provide rows of terms and a list of genes in a single field per row. It would be nice if you provided a bit of Python to transform that common input data type to that required by eXamine.
I was wondering about genes that belong to more than one term (often the case). It looks like your tool handles these via a piped list. Is that right? Please describe the correct syntax in the text so readers don't have to guess.

Groups in Cytoscape are no longer provided as a separate plugin, but rather are now integrated into the core of Cytoscape. Your reference to "RBVI Cytoscape plugins" should be updated to simply mention the Cytoscape manual and point to the section on groups: http://manual.cytoscape.org/en/stable/Creating_Networks.html#grouping-nodes

The LaTeX formatting for your Python code is unfortunately rendering single quotes in a way that throws a SyntaxError in Python. Can you update the rendering so one can simply copy/paste snippets?

In the next version of this paper, I'd highly recommend using py2cytoscape (https://github.com/cytoscape/py2cytoscape). This will make the code you have to write much more concise and easier to maintain or adapt. Likewise, I'd really like to see R examples, and you can leverage RCy3 to make that code easy to write as well (http://bioconductor.org/packages/release/bioc/html/RCy3.html).

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 01 Jul 2018
Mohammed El-Kebir, University of Illinois at Urbana-Champaign, USA

We thank the referee for the positive feedback. Below is our point-by-point response to the referee's comments.
Regarding the missing details on generating the enrichment input, the DAVID webtool (https://david.ncifcrf.gov/) yields a file where the rows are genes containing a list of terms. Implementing a generic Python function that transforms enrichment output, which comes in various formats, to our required format is a challenging task. Ideally, the enrichment step should be part of the workflow, as described in Discussion.

We handle genes with multiple terms using lists, and the default separator used by Cytoscape's import table functionality is indeed a pipe character. We updated the tutorials and text to describe this.

We updated the reference to "RBVI Cytoscape plugins".

We replaced all single quotes by double quotes, which are retained by the LaTeX package used for displaying the code (package is 'listing'). Simple copy-paste of the commands into the Python interpreter should work now.

We thank the referee for the suggestions regarding R and Python. We now use the RCy3 library for the two use cases in R. As for py2cytoscape, we found that the functionality that we need is not supported directly and requires separate script files for each command. We found that using the REST interface directly was easier.
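For the simple case in which enrichment output has one row per term with a delimited gene list, the inversion into one row per gene with a pipe-delimited term list can be sketched as follows. The function name, the comma as gene separator and the sample rows are illustrative assumptions, not part of eXamine; real enrichment tools differ in their output formats, so this covers only one such layout.

```python
from collections import defaultdict


def termsToNodeAnnotations(enrichmentRows, geneSeparator=",", listSeparator="|"):
    """Invert (term, gene list) rows into a gene -> delimited term list mapping."""
    termsPerGene = defaultdict(list)
    for term, genes in enrichmentRows:
        for gene in genes.split(geneSeparator):
            gene = gene.strip()
            # Skip empty fields and avoid listing the same term twice for a gene.
            if gene and term not in termsPerGene[gene]:
                termsPerGene[gene].append(term)
    return {gene: listSeparator.join(terms) for gene, terms in termsPerGene.items()}


# Made-up example rows: one enriched term per row, genes in a single field.
rows = [
    ("TermA", "Gene1, Gene2, Gene3"),
    ("TermB", "Gene1,Gene3"),
]

annotations = termsToNodeAnnotations(rows)
# Print a two-column, tab-separated node table: gene, then its terms joined by pipes.
for gene in sorted(annotations):
    print(gene + "\t" + annotations[gene])
```

A table of this shape, with one row per node and list values joined by the pipe character, matches the list separator described in the response above.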
Competing Interests: No competing interests.

In their manuscript "eXamine: Visualizing Annotated Networks in Cytoscape", Spohr et al. describe a new version of their Cytoscape tool eXamine. eXamine allows users to enrich the visualization of small networks via colored contours which represent specific annotations of subnetworks, e.g. functional annotation. eXamine is very useful because it not only highlights group properties in a visually appealing way but also allows the representation to be changed interactively. The software always attempts to achieve an optimal layout in which overlaps are kept at a minimum. Here, the authors have adapted eXamine to make use of the new Cytoscape automation features, which allow for programmatic control of the plotting process and thus for generating reproducible results that can further be embedded in scripts.
The manuscript is well written and describes the new features and their usage clearly.
The code of eXamine is hosted on github and contains the two use cases from the manuscript in the form of popular Jupyter notebooks. Each step of the examples is well described and illustrates to the user how eXamine can be used in a scripting environment such as Python. In particular less obvious steps like extracting network ids from a JSON result help new users to achieve results fairly quickly.

I have only minor comments:
Unfortunately the second use case (Jupyter notebook) in the git repository did not work due to the use of a local file path instead of a public URL.

Richa Batra
Institute of Computational Biology, Helmholtz Center Munich, Neuherberg, Germany

It would have been great to have a small dataset and a step-by-step guide through the Cytoscape app.
Then a small R script that I can run to generate the same results. Could you extend it for R users? Do you plan it in future work?
If the paper is about automation of eXamine via Python, then that should be reflected in the title. The current name of the paper is misleading.

Is the description of the software tool technically sound? Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others? Yes

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.

We thank the referee for the positive feedback. Below is our point-by-point response to the referee's comments.
We have merged Figures 1 and 2

This article describes the use of new Commands exposed by the eXamine app.
The article is well written and adequately describes the Commands. My comments are advisory, and I hope they can improve the paper.
The procedure for using Commands should be well known to a proficient Cytoscape user. However, there's an important class of user that wants to use Automation but isn't a Cytoscape expert. To help him/her along, it would be good to reference the Cytoscape manual (Commands Tool), and to point out that the commands can be used from the Commands dialog, a separate Commands script (via the Cytoscape command line or the Tools | Execute Command File menu item), or via Python. (See last paragraph of Methods section.) Good, too, to remind the user that eXamine commands are in the examine namespace, and that different namespaces resolve to Cytoscape and other apps.
This paper is about Python usage. Good to remind the user that similar calls can be made from R.
In the Introduction, it would be helpful to justify eXamine automation by giving an example of workflows that become possible. This motivates the user. This could be as simple as summarizing Use cases.
For Discussion, it may be worth speculating on the value of providing Python and R libraries that act as cover functions for the eXamine REST calls. Python and R programmers really want to think in terms of Python and R, not REST.
In Use Cases, good to say where the Jupyter notebooks are ... even if you identify them at the end of the paper.
As a side note, I notice that all eXamine endpoints are GET. The Commands convention we're using now is POST, with GET being deprecated. In this case, it doesn't matter much, as the parameter list isn't long and there isn't any return result that would benefit from JSON encoding. For a future release, good to consider POST versions, too. That way, if an error occurs, you'll be able to return it in a CIResponse structure.
For the BASE_URL, the datasets are necessary to run the examples. Can you list them at the end of the paper? Of course, they must be persistently available for the life of the paper, correct??
In Use Case 1, step 2, it would be good to give a sentence or two explaining how/why a user would have created this dataset.
That's it ... nice job!

Is the rationale for developing the new software tool clearly explained? Partly

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 01 Jul 2018
Mohammed El-Kebir, University of Illinois at Urbana-Champaign, USA

We thank the referee for the positive feedback. Below is our point-by-point response to the referee's comments.
We included the following sentences along with a reference to the Cytoscape documentation: 'Figure 1 shows a typical workflow of eXamine analysis, where the commands provided are in the 'examine' namespace (red) and the commands provided by Cytoscape are colored blue.' 'We note that the commands described in the two use cases can also be directly executed from Cytoscape via the 'Commands dialog' or a separate 'Commands script'.'

We now provide R code for the two use cases.

We reworded the last paragraph of the introduction to more clearly describe the benefits of eXamine's new automation features.

We wrote the following sentences in Discussion: 'Finally, it would be good to enhance the API to return richer R and Python data types rather than a flag indicating whether the command succeeded. For instance, upon a 'table import' it would be good to actually return a dataframe. This, however, will require switching from the 'Command' interface to the 'Function' interface.'

Thank you for the excellent suggestion about the coloring. We colored eXamine commands red and Cytoscape commands blue.

We included URLs to the Jupyter and R markdown documents.

We now use POST commands.

We now include links to the datasets. They are indeed persistently archived using zenodo.

We included the following sentences in step 2 of use case 1: 'This a protein-protein interaction network that is typically used for the analysis of high-throughput biological data in the context of a biological network. The Cytoscape app KEGGscape provides functionality for importing pathways from KEGG [9].'

Competing Interests: No competing interests.