Cyrface: An interface from Cytoscape to R that provides a user interface to R packages [v1; ref status: indexed, http://f1000r.es/1tv]
The European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, CB10 1SD, UK
We acknowledge with thanks the financial support from the EU through the project “BioPreDyn” (ECFP7-KBBE-2011-5 Grant number 289434).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
There is an increasing number of software packages to analyse biological experimental
data in the R environment. In
particular, Bioconductor, a
repository of curated R packages, is one of the most comprehensive resources for
bioinformatics and biostatistics. The use of these packages is increasing, but
it requires a basic understanding of the R language, as well as the syntax of
the specific package used. The availability of graphical user interfaces for
these packages would flatten the learning curve and broaden their application.
Here we present a Cytoscape plug-in termed
Cyrface that allows Cytoscape plug-ins to connect to any function
and package developed in R. Cyrface can be used to run R packages from within the Cytoscape environment making use of a
graphical user interface. Moreover, it links the R packages with the
capabilities of Cytoscape and its plug-ins, in particular network visualization
and analysis. Cyrface’s utility has been demonstrated for two Bioconductor
packages (CellNOptR and DrugVsDisease), and here we further illustrate its
usage by implementing a workflow of data analysis and visualization. Download
links, installation instructions and user guides can be accessed from the Cyrface homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/).
The availability of high-throughput experimental data has led to the development of multiple computational methods to analyse these data. Arguably, one of the most used environments is the statistical programming language R1. Multiple R packages for computational biology and bioinformatics are available in various resources such as the Comprehensive R Archive Network (CRAN). Furthermore, Bioconductor2 provides a comprehensive collection of R packages to analyse biological data. These packages are subject to stringent quality control in terms of functionality and documentation. Bioconductor is an open-source project that hosted 671 active and curated software packages as of September 2013.
For those not familiar with computational programming, learning R and running packages can be a time-consuming task, and intuitive graphical interfaces can therefore enhance the usability of these tools. Cytoscape3,4 is a Java open-source framework with an intuitive graphical interface devoted to the visualization and analysis of networks. It is arguably one of the most used tools in bioinformatics, and has a variety of plug-ins to solve numerous computational biology problems. Therefore, we developed Cyrface, a plug-in for Cytoscape that provides an interface between any R package and Cytoscape. Cyrface is designed to integrate the major strengths of the R and Cytoscape environments by providing a general Java to R interface. By linking these two environments, Cyrface allows one to use Cytoscape as a user interface for R packages, and allows Cytoscape plug-ins to reach the wealth of methods implemented in R.
Workflow management systems such as Taverna5 and Galaxy6–8 can call R packages from a graphical user interface (GUI). Taverna is a standalone Java open-source tool for the general development and execution of workflows. Galaxy is an open-source web platform to assemble workflows for the analysis of genomic experimental data. Cyrface complements Taverna and Galaxy by providing GUIs for R within a different environment with complementary features.
RCytoscape9 is another tool that links R and Cytoscape. It is a Bioconductor R package that establishes a connection between R and Java in the opposite direction to Cyrface: it supports the connection from R to Java, whereas Cyrface allows a connection from Java to R. A typical use of RCytoscape is to handle experimental data in R and transfer the biological network to Cytoscape while controlling it from R. Hence, RCytoscape and Cyrface provide complementary features.
This paper is structured as follows: Firstly, we provide a description of the implementation of Cyrface. Then, to illustrate the applicability of Cyrface, we show two existing packages, CytoCopteR10 and DrugVsDisease (DvD)11, that make use of Cyrface, and we create a simplified version of the DataRail12 workflow to process and visualize experimental data using methods available in R. Finally, we discuss on-going and future developments.
Cyrface is a Java open-source framework developed to establish the connection between Cytoscape and R. Interaction between these two environments (invoking R from within Java) is not natively supported by Java. To achieve this, Cyrface uses the external libraries RCaller (https://code.google.com/p/rcaller/) and Rserve (http://www.rforge.net/Rserve/).
On the one hand, to support the communication between Java and R, RCaller uses an R package called Runiversal that converts the R objects into an XML format, thus allowing the R objects to be read by Java.
On the other hand, Rserve establishes a TCP/IP server allowing other programs from various languages to connect to an R session and access its features. Rserve is currently being used by several mature projects, among them the Taverna workflow management system5.
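To make the TCP/IP connection described above more concrete, the following minimal sketch shows the first thing a client sees after connecting to an Rserve instance. It assumes the handshake documented for Rserve, in which the server sends a 32-byte ID string made of 4-byte ASCII attributes ("Rsrv", a protocol version such as "0103", and the protocol name "QAP1"). The class and method names are hypothetical and are not part of Cyrface or of the Rserve client library; a real client would normally use the Rserve Java client rather than parse the handshake by hand.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class RserveIdSketch {

    // Split the 32-byte ID string into {protocol version, protocol name}.
    static String[] parseId(byte[] id) {
        if (id.length != 32) {
            throw new IllegalArgumentException("ID string must be 32 bytes");
        }
        String s = new String(id, StandardCharsets.US_ASCII);
        if (!s.startsWith("Rsrv")) {
            throw new IllegalArgumentException("Not an Rserve ID string");
        }
        // Bytes 4..7 hold the version, bytes 8..11 the protocol name.
        return new String[] { s.substring(4, 8), s.substring(8, 12) };
    }

    // Read the ID string from a stream, e.g. the input stream of a socket
    // opened to the Rserve host and port (assumed default port: 6311).
    static String[] readId(InputStream in) {
        byte[] id = new byte[32];
        try {
            new DataInputStream(in).readFully(id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parseId(id);
    }
}
```

In use, `readId(socket.getInputStream())` would be called once, immediately after opening the socket and before any R command is sent.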
Support for Rserve and RCaller libraries in Cyrface is implemented by the RserveHandler and RCallerHandler Java classes, respectively. Both classes extend the abstract class RHandler that contains the signature of all the necessary methods to establish and maintain a connection with R. Figure 1 depicts the hierarchical structure of the Java classes responsible for handling the connection between Java and R. Moreover, it depicts the connection points between these two different environments.
Figure 1. Diagram of the Cyrface interaction layer with R.
The grey box shows the hierarchy of the classes responsible for establishing the connection between Cytoscape and R. RHandler is an abstract Java class that is extended by the RserveHandler and RCallerHandler classes, which add support for the Rserve and RCaller libraries, respectively. The connection from Java to R can be achieved using either the RserveHandler or RCallerHandler class, or any other class that successfully extends RHandler.
The Cyrface software architecture can be extended to support other Java libraries that facilitate the connection between Java and R. This structure allows one to take advantage of the particular strengths of different libraries and to adapt to particular requirements of the users, for instance executing R commands automatically without first having to start an R session manually.
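The extension mechanism described above can be sketched as follows. This is an illustration under stated assumptions rather than Cyrface's actual code: the class name RHandlerSketch and the method names connect, execute and close are hypothetical, and the concrete subclass only records commands instead of talking to R, so that the example runs without any R installation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an abstract handler; names are illustrative only.
abstract class RHandlerSketch {
    // Open a connection to an R session.
    abstract void connect();
    // Evaluate one R command and return its output as text.
    abstract String execute(String command);
    // Release the connection.
    abstract void close();
}

// A new back end is added by extending the abstract class.
// This one records commands instead of sending them to R.
class RecordingHandler extends RHandlerSketch {
    private final List<String> log = new ArrayList<>();
    private boolean connected = false;

    @Override
    void connect() { connected = true; }

    @Override
    String execute(String command) {
        if (!connected) {
            throw new IllegalStateException("connect() must be called first");
        }
        log.add(command);
        return "recorded: " + command;
    }

    @Override
    void close() { connected = false; }

    List<String> commands() { return log; }
}

class HandlerDemo {
    // Client code depends only on the abstract type,
    // so back ends can be swapped without changing it.
    static String runScript(RHandlerSketch handler, List<String> script) {
        StringBuilder out = new StringBuilder();
        handler.connect();
        try {
            for (String line : script) {
                out.append(handler.execute(line)).append('\n');
            }
        } finally {
            handler.close();
        }
        return out.toString();
    }
}
```

Because `runScript` only sees the abstract type, a handler backed by Rserve, RCaller or another library could be passed in without changes to the calling code.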
Cyrface uses another Cytoscape plug-in termed CommandTool. CommandTool offers users the ability to script basic commands in Cytoscape, such as importing, displaying or modifying networks, through a simple command line. The integration allows users to use the simple command line of CommandTool to execute R commands within Cytoscape and visualise the output directly. On Cyrface’s homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/) we provide an example using the CommandTool console to plot several characteristics of the iris data set using the ggplot13 plotting library.
Results and discussion
A typical use of Cyrface is to provide a graphical user interface to R packages within Cytoscape. Cyrface is currently being used by two Cytoscape plug-ins, CytoCopteR10 and DvD11.
CytoCopteR10 provides a simple step-by-step interface allowing users without any experience in R to use the CellNOptR (www.cellnopt.org) package and handle the input and output networks in Cytoscape. CellNOptR is an open-source software package that provides methods for building predictive logic models from signalling networks using experimental measurements.
DvD11, Drug vs. Disease, is an R package that provides a workflow for the comparison of drug and disease gene expression profiles. It provides dynamic access to databases, such as Array Express14, to compare drug and disease signatures to generate hypotheses of drug-repurposing.
The packages mentioned above are two examples of the usefulness of Cyrface in capturing the strengths of two environments. On one side, R provides a wealth of bioinformatics and biostatistics packages with very comprehensive resources such as Bioconductor and CRAN. On the other side, Cytoscape facilitates a user-friendly graphical interface for network visualisation and analysis, complemented with a variety of plug-ins addressing different computational biological problems. Cyrface links these two environments by providing a way to develop user-friendly interfaces for R packages by embedding them within Cytoscape.
As an illustrative example, Cyrface provides a simple version of the DataRail12 workflow using methods implemented in R. DataRail is an open-source MATLAB toolbox that handles experimental data in a tabular format and provides methods to maximize and extract information using internal or external tools. Saez-Rodriguez et al.12 also proposed an experimental data storing format termed Minimum Information for Data Analysis in Systems Biology (MIDAS). This is a tabular format based upon the minimum-information standards that specifies the layout of experimental data files. A typical use of DataRail is to import, store and process the input information from instruments using the MIDAS format, and export it to other MIDAS compliant software.
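As a rough illustration of how a tabular format such as MIDAS can be read, the sketch below groups the column names of a header line by their prefix. It assumes the convention used in MIDAS files handled by CellNOptR, where column names carry a prefix such as TR: (treatment), DA: (data acquisition time) or DV: (data value); this detail is not stated in the text above and should be checked against the MIDAS specification. The class name, method name and example header are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class MidasHeaderSketch {

    // Group comma-separated column names by the prefix before the first colon.
    // Columns without a prefix are collected under the empty string.
    static Map<String, List<String>> groupByPrefix(String headerLine) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String column : headerLine.split(",")) {
            String c = column.trim();
            int colon = c.indexOf(':');
            String prefix = colon > 0 ? c.substring(0, colon) : "";
            String name = colon > 0 ? c.substring(colon + 1) : c;
            groups.computeIfAbsent(prefix, k -> new ArrayList<>()).add(name);
        }
        return groups;
    }
}
```

For a header such as `TR:EGF,TR:TNFa,DA:Akt,DV:Akt`, the treatments, time columns and measured values end up in separate groups, which is the kind of separation a loading step needs before any normalisation.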
The simplified version of the DataRail workflow implemented in Cyrface is structured in several sequential steps that allow users to import, normalise and visualise experimental data sets stored in the MIDAS format (see Figure 2). At any stage users are able to export and visualise the transformed data set.
Figure 2. The Cyrface implementation of the DataRail workflow.
The rounded rectangles represent the MIDAS files containing the experimental data at a given state. Hexagon nodes represent functions such as load or normalise. Green identifies steps that were successfully executed and grey identifies those that have not yet been run.
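The sequential structure of the workflow, with executed and pending steps, can be sketched as below. This is a simplified model, not the Cyrface implementation: the class WorkflowSketch and its methods are hypothetical, the data are reduced to a plain array of numbers, and min-max scaling is used only as a placeholder for whatever normalisation the R methods perform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical model of a linear workflow; names are illustrative only.
class WorkflowSketch {
    // NOT_RUN corresponds to grey and DONE to green in Figure 2.
    enum Status { NOT_RUN, DONE }

    static final class Step {
        final String name;
        final UnaryOperator<double[]> function;
        Status status = Status.NOT_RUN;

        Step(String name, UnaryOperator<double[]> function) {
            this.name = name;
            this.function = function;
        }
    }

    private final List<Step> steps = new ArrayList<>();
    private double[] data;

    WorkflowSketch(double[] initial) { data = initial.clone(); }

    WorkflowSketch add(String name, UnaryOperator<double[]> function) {
        steps.add(new Step(name, function));
        return this;
    }

    // Run the first pending step; return false if every step has been run.
    boolean runNext() {
        for (Step s : steps) {
            if (s.status == Status.NOT_RUN) {
                data = s.function.apply(data);
                s.status = Status.DONE;
                return true;
            }
        }
        return false;
    }

    // The current data can be exported at any stage.
    double[] currentData() { return data.clone(); }

    Status statusOf(String name) {
        for (Step s : steps) {
            if (s.name.equals(name)) return s.status;
        }
        throw new IllegalArgumentException("Unknown step: " + name);
    }

    // Placeholder normalisation (assumption): min-max scaling to [0, 1].
    static double[] minMax(double[] v) {
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            lo = Math.min(lo, x);
            hi = Math.max(hi, x);
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = hi > lo ? (v[i] - lo) / (hi - lo) : 0.0;
        }
        return out;
    }
}
```

Keeping the status on each step is what lets a graphical view colour the nodes, while `currentData` mirrors the ability to export the data set between steps.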
An extension to the workflow was subsequently added to support the CellNOptR10 model training function. CellNOptR uses the experimental data and a corresponding prior-knowledge network to generate a logic model and train it to maximise the fit with the experimental measurements. Thereby, through an intuitive graphical interface, users are able to visualise a biological network, modify it and assess the quality of its fit to a corresponding experimental data set.
The workflow supports any network format that is supported by Cytoscape, for example the SIF format. Moreover, the workflow was extended to support the Systems Biology Markup Language (SBML) Qualitative Models (Qual) format15. SBML Qual is an extension of the SBML level 3 standard and is proposed to provide a standard representation for logic and qualitative models of biological networks. The latest specification document for SBML Qual can be found on the package homepage (http://sbml.org/Documents/Specifications/SBML_Level_3/Packages/Qualitative_Models_(qual)). Support for importing models stored in SBML Qual format is achieved using the jSBML library16 and the respective SBML Qual package. Supplementary material 1 provides a step-by-step tutorial and an example on how to use the workflow.
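For readers unfamiliar with the SIF format mentioned above, the following minimal sketch parses SIF text into edges. It assumes the common whitespace-delimited form, in which each line reads "source interaction target [target ...]" and a line with a single token declares an isolated node; tab-delimited files with spaces inside node names are not handled. The class and method names are hypothetical, and Cytoscape's own importer should be used in practice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser, for illustration only.
class SifSketch {

    static final class Edge {
        final String source;
        final String interaction;
        final String target;

        Edge(String source, String interaction, String target) {
            this.source = source;
            this.interaction = interaction;
            this.target = target;
        }
    }

    // Parse whitespace-delimited SIF text into a list of edges.
    static List<Edge> parse(String text) {
        List<Edge> edges = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            // Lines with fewer than three tokens contribute no edges;
            // each token from the third onwards is a separate target.
            for (int i = 2; i < fields.length; i++) {
                edges.add(new Edge(fields[0], fields[1], fields[i]));
            }
        }
        return edges;
    }
}
```

For example, the line `EGFR 1 Ras Raf` yields two edges that share the source EGFR and the interaction label 1.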
Here, we present Cyrface, a bioinformatics Java library that provides a general interaction between Cytoscape and R. Cyrface offers a way to combine a friendly graphical interface within the Cytoscape environment with any R package. A GUI should benefit beginners and occasional users; as well as being useful for training and illustration purposes, it extends the accessibility of the tool to those not familiar with the R command line interface.
The Cyrface homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/) contains the link to download Cyrface, together with installation instructions and user guides. A few examples demonstrating the usefulness of the tool and the different supported libraries are also shown and explained. The source code of Cyrface is publicly available on its Sourceforge webpage (https://sourceforge.net/projects/cyrface/) and permanently available at doi: 10.5281/zenodo.7096.
Future features for Cyrface will include the extension to the new version of Cytoscape, Cytoscape 3, and improvements to the DataRail workflow. These will include increasing its modularity and supporting other features, such as cutting and selecting specific regions of the data.