FeatureViewer, a BioJS component for visualization of position-based annotations in protein sequences

Summary: FeatureViewer is a BioJS component that lays out, maps, orients, and renders position-based annotations for protein sequences. This component is highly flexible and customizable, allowing the presentation of annotations by rows, all centered, or distributed in non-overlapping tracks. It uses either lines or shapes for sites and rectangles for regions. The result is a powerful visualization tool that can be easily integrated into web applications as well as documents as it provides an export-to-image functionality. Availability: https://github.com/biojs/biojs/blob/master/src/main/javascript/Biojs.FeatureViewer.js; http://dx.doi.org/10.5281/zenodo.7719


Amendments from Version 1
To the reviewers. We are grateful for your detailed reviews; they have helped us improve both our work and our manuscript. Based on the reviewers' comments, we have introduced some modifications to the original version. The summary of those modifications is included here: (i) we have simplified the abstract; (ii) we have included a new paragraph and figure in the Introduction section summarizing other approaches related to web-based visualization of protein sequence annotations; (iii) we have emphasized the novelty of this work at the end of the Introduction section; (iv) we have added a figure showing the relation across the main component and its extensions at the beginning of the Extensions section; and (v) we have corrected the typos and grammar errors from the first version.

Introduction
Position-based annotation is one of the cornerstones of bioinformatics. A great number of databases, analysis and prediction methods are geared towards providing data mapped to specific sequence coordinates. In the case of proteins, the Pfam 1 database identifies, marks-up, and characterizes different functional regions within a given protein. The coordinates of these domains are often given in terms of the start and end position within the protein. The largest pool of reviewed and automatically annotated proteins is provided by the UniProt Consortium 2 . It contains position-based annotations for structural regions, modified residues, and functional sites among others. Finally, protein feature prediction methods such as those integrated into PredictProtein 3 provide position-based annotations such as secondary structure states, buried and exposed residues, coiled-coil stretches, and disordered regions. PredictProtein also maps functional regions such as protein-protein binding sites and protein-DNA binding sites onto positions within the sequence.
Visualization of protein sequence features has already been used in different projects, some of which are shown in Figure 1. For instance, Pfam renders Pfam domains as well as some sites of interest, such as metals, active binding sites, and also disulphide bonds. It supports uncertainty for the start and end positions of the features by means of variations of rectangular-based shapes. Dasty 4 displays protein features from different sources as well as sequences and 3D structures, providing an overview of the visualized protein. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB-PDB) 5 mainly focuses on 3D structures for proteins but also includes feature visualization, showing the relationship between UniProt and PDB coordinates.
BioJS 6 is an open source JavaScript collection of components for visualization of biological data on the web. Here, we present FeatureViewer and its current extensions: SimpleFeatureViewer, which simplifies the input data format, and DasProteinFeatureViewer, which retrieves the input data from a web service. The FeatureViewer is a standard, portable BioJS component designed to easily render position-based annotations, a.k.a. features. The FeatureViewer component can be easily integrated into and controlled from other applications. As the FeatureViewer and its extensions have been developed within the BioJS framework, they result in a set of modular visual components displaying position-based annotations that can be integrated with other web applications in a standard manner. Modularity and easy integration differentiate these components from previous web-based efforts to visualize protein features.
The FeatureViewer component
The FeatureViewer component extensively uses the Raphaël JavaScript library 7 that renders Scalable Vector Graphics (SVG) objects in modern web browsers. The use of SVG allows the graphics to scale to any requested resolution and makes them portable across different computing platforms and viewing software. The FeatureViewer component can be easily integrated into any web application by including its dependencies in the head section, e.g., jQuery 8 and Raphaël, and then instantiating the component within a JavaScript section. A special dependency for some images is required as they are used for the pop-up dialogue controls. The code below shows how to instantiate the component to create the visualization shown in Figure 2. A complete example and more information can be found at http://www.ebi.ac.uk/Tools/biojs/registry/Biojs.FeatureViewer.html. The FeatureViewer component has been tested with modern browsers such as Mozilla, Chrome, and Internet Explorer (IE); however, the image export option is not available in IE.
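A minimal sketch of the instantiation step, under stated assumptions. The constructor name and the two mandatory options, target and json, follow the component's description; the keys placed inside json (segment, featuresArray, legend, configuration) and the helper buildFeatureViewerOptions are hypothetical placeholders, not the documented schema, which is listed on the registry page. The constructor call is guarded so that the snippet also runs where the BioJS scripts are not loaded.

```javascript
// Sketch only: the keys inside `json` are illustrative assumptions,
// not the documented FeatureViewer schema.
function buildFeatureViewerOptions(divId, proteinId, features) {
    return {
        target: divId,               // id of the DIV that will hold the rendering
        json: {
            segment: proteinId,      // hypothetical key: protein identifier
            featuresArray: features, // hypothetical key: annotations with rendering details
            legend: {},              // hypothetical key: legend description
            configuration: {}        // hypothetical key: sizes, styles, layout
        }
    };
}

var options = buildFeatureViewerOptions("YourOwnDivId", "A4_HUMAN", []);

// Instantiate only when the BioJS scripts (and their dependencies) are loaded.
var myViewer = null;
if (typeof Biojs !== "undefined" && Biojs.FeatureViewer) {
    myViewer = new Biojs.FeatureViewer(options);
}
```

Keeping the options in a plain object makes it possible to inspect or generate them before the component is created.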

Options and data
In order to instantiate the FeatureViewer component, some options should be defined. The mandatory options correspond to (i) a place holder named target, i.e., a DIV element in the web page where the annotations will be rendered, and (ii) a JSON object, named json, with the configuration, the protein identifier, the annotations, and the legend. FeatureViewer is a dummy component in the sense that it does not make any calculations about where to render the annotations, not even when the rendering style is changed; all the rendering information is provided in the json option. A comprehensive list of the elements in the json option is available at http://www.ebi.ac.uk/Tools/biojs/registry/Biojs.FeatureViewer.html. The FeatureViewer component includes three different layouts to display the features: all features centered, features grouped by type, and features organized in non-overlapping tracks, as shown in Figure 3.
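Because FeatureViewer expects precomputed rendering information, whatever prepares the json option has to decide where each feature goes. One common way to obtain non-overlapping tracks is a greedy interval assignment. The sketch below illustrates that general technique and is not taken from the component's source; the assignTracks helper and the start/end property names are hypothetical.

```javascript
// Greedy interval partitioning: place each feature on the first track whose
// last feature ends before this one starts. Positions are inclusive.
function assignTracks(features) {
    var sorted = features.slice().sort(function (a, b) {
        return a.start - b.start || a.end - b.end;
    });
    var trackEnds = []; // trackEnds[i] = end position of the last feature on track i
    return sorted.map(function (feature) {
        var track = 0;
        while (track < trackEnds.length && trackEnds[track] >= feature.start) {
            track += 1;
        }
        trackEnds[track] = feature.end;
        return { start: feature.start, end: feature.end, track: track };
    });
}

// Made-up intervals: the first and second overlap, as do the third and fourth.
var placed = assignTracks([
    { start: 1, end: 50 },
    { start: 40, end: 90 },
    { start: 60, end: 120 },
    { start: 95, end: 130 }
]);
```

The track index could then be turned into a vertical offset when the rendering coordinates are written into the json option.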

User controls
Additional options can also be specified in order to customize the user controls as well as interaction with the features. User controls include the zooming slider and the export-to-image button, as shown in Figure 3a. The zooming slider allows users to home in on a region of interest and view it in greater detail, making it possible to move from an overview aspect into a detailed one without navigating to a different page. The export-to-image button allows users to export the rendered features into an image that can be embedded into a paper or presentation.
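The effect of zooming can be pictured as a linear mapping from sequence positions to horizontal pixels that is recomputed for the visible range. The sketch below only illustrates that idea; the positionToPixel helper, its parameters, and the inclusive-range convention are assumptions and do not describe the component's internals.

```javascript
// Map a residue position to the left edge (in pixels) of its slot,
// given the currently visible range and the drawing width.
function positionToPixel(position, visibleStart, visibleEnd, widthPx) {
    var residues = visibleEnd - visibleStart + 1; // inclusive range
    var pixelsPerResidue = widthPx / residues;
    return (position - visibleStart) * pixelsPerResidue;
}

// A hypothetical 770-residue sequence drawn in 700 px (overview),
// then zoomed so that only residues 100..199 are visible.
var overviewX = positionToPixel(386, 1, 770, 700);
var zoomedX = positionToPixel(150, 100, 199, 700);
```

In the overview each residue gets less than one pixel, whereas in the zoomed range each residue gets seven, which is what makes individual sites distinguishable.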
Different kinds of interaction are also possible. Events bound to rendered annotations include a mouse-over action that highlights and colors the "focused" feature. Click action is also supported. Clicking on a feature selects it so it will remain highlighted until another feature is selected; clicking an already selected feature will deselect it. Tooltips tied to each shape pop up and reveal additional information about the rendered annotations. Either shapes or lines can be used to display features covering one single amino acid; currently metal bindings can be rendered as circles, active sites as diamonds, lipidation as waves, glycosylation as hexagons, and other post-translational modifications, i.e., modified residues, as triangles. When shapes are used, it is possible to drag them, making it easier to distinguish one from another when they are clustered.
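The correspondence between single-residue feature types and shapes described above can be summarized as a lookup table. This is a sketch: the shape assignments follow the text, but the type labels used as keys, the glyphFor helper, and the fallback to a line for unlisted types are assumptions.

```javascript
// Shape names follow the text; the feature-type keys are hypothetical labels.
var SITE_SHAPES = {
    "metal binding": "circle",
    "active site": "diamond",
    "lipidation": "wave",
    "glycosylation": "hexagon",
    "modified residue": "triangle"
};

// Return the glyph for a single-residue feature. A line is used when shapes
// are disabled or (an assumption) when the type has no dedicated glyph.
function glyphFor(featureType, useShapes) {
    if (!useShapes) {
        return "line";
    }
    var key = String(featureType).toLowerCase();
    return Object.prototype.hasOwnProperty.call(SITE_SHAPES, key)
        ? SITE_SHAPES[key]
        : "line";
}
```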

Extensions
In order to make it easier for both developers and users to work with the FeatureViewer, two extensions are also provided. SimpleFeatureViewer simplifies the required features data while DasProteinFeatureViewer uses a web service to retrieve the features data. Figure 4 shows the three components in the FeatureViewer family.
As FeatureViewer requires highly detailed information in order to display the features, a simpler version, the SimpleFeatureViewer component, builds on top of it. This simplified version takes care of calculating the configuration options as well as the localization of the features; thus, developers using this version can focus on the actual data rather than on intricate details regarding styles, pixels, and coordinates. However, only the non-overlapping tracks style is supported by this component. The main advantage of this component is that its data structure is much simpler than the one required for FeatureViewer, as observed in the following code excerpt. This component requires a place holder, a sequence identifier, a sequence length, and a features array; the width in pixels used to render the protein features can also be defined by using the option imageWidth. The features array contains information for each feature to be displayed including, for instance, identifier, start and end positions in the sequence, label, and color among others. More information is available at http://www.ebi.ac.uk/Tools/biojs/registry/Biojs.SimpleFeatureViewer.html.
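A sketch of what such a simplified input could look like, under stated assumptions. Only target and imageWidth are option names given in the text; sequenceId, sequenceLength, features, the per-feature property names, the featuresFitSequence helper, and all values are hypothetical placeholders, so the registry page remains the reference for the actual names.

```javascript
// Sketch only: `target` and `imageWidth` are named in the text; the other
// option and feature property names, and all values, are made-up placeholders.
var simpleOptions = {
    target: "YourOwnDivId",      // place holder: id of the DIV to draw into
    sequenceId: "EXAMPLE_ID",    // hypothetical name: sequence identifier
    sequenceLength: 100,         // hypothetical name: sequence length
    imageWidth: 600,             // width in pixels used to render the features
    features: [                  // hypothetical name: one entry per feature
        { id: "f1", start: 5, end: 40, label: "Example region", color: "#0F8292" },
        { id: "f2", start: 62, end: 62, label: "Example site", color: "#FF0000" }
    ]
};

// Basic sanity check before handing the data over: every feature must lie
// inside the sequence and have start <= end.
function featuresFitSequence(options) {
    return options.features.every(function (f) {
        return f.start >= 1 && f.start <= f.end && f.end <= options.sequenceLength;
    });
}

// Instantiate only when the data is consistent and the BioJS scripts are loaded.
var mySimpleViewer = null;
if (featuresFitSequence(simpleOptions) &&
        typeof Biojs !== "undefined" && Biojs.SimpleFeatureViewer) {
    mySimpleViewer = new Biojs.SimpleFeatureViewer(simpleOptions);
}
```

Note that no pixel coordinates or styles appear in the input; in this design they are derived by the component from the sequence length and the image width.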
A second extension, the DasProteinFeatureViewer, makes use of a web service that retrieves data from Distributed Annotation System (DAS) sources. DAS defines a communication protocol used to exchange annotations on gene or protein sequences 9 . Multiple protein databases provide their data following the DAS principles, for instance UniProt and InterPro 10 . For this extension, no information about the features themselves is required as such details will be retrieved from the web service, as shown in the code below.
var myPainter = new Biojs.DasProteinFeatureViewer({
    target: "YourOwnDivId",
    segment: "a4_human"
});
Additional options allow developers to specify the protein identifier, the DAS sources, the feature types (e.g., domain, chain, variant), the rendering style, the image width, and some others. In order to avoid cross-domain problems, a proxy can also be specified. The feature types used by this component are those defined by UniProt, which is also used as the reference DAS source, i.e., the one providing the protein sequence. More information is available at http://www.ebi.ac.uk/Tools/biojs/registry/Biojs.DasProteinFeatureViewer.html.

Use case
The PredictProtein service 3 integrates multiple algorithms that either retrieve from curated databases or automatically predict aspects of protein structure and function. Many of the predictions provided by the methods are mapped to positions within the protein. In order to easily highlight patterns, compare predictions, and cross-validate results, the PredictProtein interface lays out the predicted annotations in data tracks, i.e., in separate rows, each row presenting different predicted features. Data tracks are laid one under the other and enable a quick overview of some of the prominent features of the protein, e.g., a cluster of binding sites close to the N-terminal or the count of trans-membrane regions. Figure 5 shows the implementation of the FeatureViewer component used for the PredictProtein service.

Conclusions
The FeatureViewer component and its extensions, SimpleFeatureViewer and DasProteinFeatureViewer, provide a platform to visualize position-based biological data easily and efficiently. FeatureViewer, like any other BioJS component, can be easily integrated with other web components or extended to have greater functionality than the one shown here. We expect this component to be particularly useful to developers and users alike, requiring little technical knowledge for its full functioning.
LG thanks Pablo Moreno who spotted a bug related to multiple instantiation and contributed to fixing it, as well as Mark Bingley, Claire O'Donovan, Sangya Pundir, and Xavier Watkins in the UniProt EMBL-EBI team and Rafael Jiménez and John Gómez in the IntAct EMBL-EBI team for their comments and suggestions about the FeatureViewer component and its extensions.
The authors thank all those who funded our research as well as researchers who deposited data into publicly available datasets and programmers who provided their work under a free license: our work stands upon their shoulders and would not have been possible without them.
There are currently only a few approaches available for visualising protein features on the web. The BioJS FeatureViewer is a welcome addition to the set of tools available to web developers. The fact that it is available as a BioJS component should make it interesting as a re-usable web site element. However, I find that some of the already existing approaches provide a representation of features that I (subjectively) find graphically more refined and that go beyond simple boxes, circles, and triangles to represent sequence annotations.

Minor modifications:
I feel that the current version of this manuscript does not adequately summarise the already available approaches for visualising protein features. To list a few already available protein feature viewers:
- Pfam (already mentioned in the manuscript) shows the location of Pfam domains on protein sequences. It currently has one of the graphically most appealing views and can show more than one sequence per page (e.g. http://pfam.sanger.ac.uk/family/Piwi#tabview=tab1).
- PDBsum provides a graphical summary of secondary structure and Pfam domains for protein sequences (e.g. https://www.ebi.ac.uk/pdbsum/1cdg and in large: https://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=1cdg&template=wirpfam.html&r=w).
- DASty has been developed for several years at the EBI (in fact there seems to be some shared co-authorship) and provides a feature view that is based on DAS and can also map annotations to 3D (e.g.
This article describes an ensemble of JavaScript components designed for bioinformatics web developers that allow the retrieval, layout and display of positional annotation on a 1D coordinate system, such as a protein or nucleotide sequence. Special support is provided for the display of protein positional annotation retrieved from the Distributed Annotation System (requires some server-side configuration), and the system provides standard glyphs and shading styles to allow active sites and common types of post-translational modifications to be effectively displayed. Importantly, the components employ the BioJS framework, which allows them to exchange messages with other BioJS biological data visualization components (as well as any jquery based module) to facilitate the creation of rich, interactive web interfaces.

Using the component
The feature viewer component example page demonstrates that it provides a clean static visualization of the positional annotation as it appears on a coordinate system (complete with annotation legend). Online documentation supports the authors' statement that the plugin is highly configurable: attributes are provided to control practically every aspect of an annotation's appearance, the surrounding coordinate space, and additional decorators. A few minutes spent interacting with the demonstration, however, shows that there is room for improvement.
(1) Suggested improvement. Once the zoom control is adjusted to focus in on a smaller region, it doesn't allow the user to sweep across the coordinate space (i.e. by click-dragging the visible range). From a developer perspective, I would also expect the component to raise BioJS events to inform other components about the change in 'Region Of Interest' to allow them to respond in kind.
(2) Suggested improvement. The authors mention that feature glyphs can be dragged to facilitate inspection in crowded regions. It is unclear how useful this capability is, since once dragged, any change in view results in the feature glyph's shape being reset, and in crowded regions, the user must, by definition, move many features to examine the precise location of annotation. User modified glyph layout should, at the very least, be preserved on changes of scale. There may also be wisdom in including a force-directed layout algorithm to better optimize placement of nearby glyphs in response to manual adjustment of any particular one.

Developing with the component(s)
Apart from choosing appropriate values for the vast array of attributes and annotation display settings that FeatureViewer supports (more on that later), the most onerous aspect of deploying FeatureViewer is keeping track of the array of dependencies it requires. However, JavaScript module dependency management is a moving target, and I make the recommendation below with that in mind.
(3) Suggested improvement. I strongly recommend that the authors provide examples that employ a client-side JavaScript package management system such as 'RequireJS' (see http://jquerysbestfriends.com for some other tips about more sanitary methods of deploying jquery). Dependency management is important, and demonstrating good practice will allow these tools to be more widely adopted, and more effectively maintained.
One aspect of this paper that was not immediately clear on first reading was the functional relationship between FeatureViewer, which renders annotation layouts into SVG, and the helper components SimpleFeatureRenderer and DASFeatureRenderer.
(4) Suggested revision/reviewer contribution. A diagram such as the one I uploaded to FigShare here: http://figshare.com/articles/The_BioJS_FeatureViewer_components_for_sequence_annotation_retrieval_layout_and_d may help readers of this article more effectively grasp the different functionalities provided by the components in this system.

Specific revisions/corrections
(5) Required revision. In the abstract, the authors claim: "To our knowledge, this is the first client-side modular component to visualize position-based annotations that can be integrated into other web applications in a standard manner." This is not quite correct. There have been other client-side modular web components that have been created for the visualisation of positional annotation, although some of the co-authors may have, for modesty reasons, omitted mentioning the fact. In this case, Dasty3 and its forerunner, Dasty2, both provided a self-contained (i.e. modular) component that could be controlled through JavaScript and embedded in a more complex web page.
Here, I recommend the authors emphasise more strongly the key novelty in this work. For example: "because these components are built with the BioJS framework, they are the first modular visualization components for the display of position-based annotation that can be integrated with other web applications in a standard manner."
Point 1. Suggested revision in bold italics: "Either shapes or lines can be used to display features covering one single amino acid; currently metal bindings can be rendered as circles, active sites as diamonds, lipidation as waves, glycosylation as hexagons, and [other] post translational modifications as triangles." The reason for this revision is that lipidation and glycosylation are, of course, both PTMs. The authors could also mention here that triangles are used as the default for other types of 'MOD_RES' type PTM (glycation, hydroxylation, phosphorylation, etc).
Point 2. In the following sentence (in the Introduction section): "The largest pool of reviewed and automatically annotated proteins provided by the UniProt Consortium also contains position-based annotations for structural regions, modified residues, and functional sites among others." The authors probably mean to make the following statement (revision in bold italics): "The largest pool of reviewed and automatically annotated proteins is provided by the UniProt Consortium. It also contains position-based annotations .."
Point 3. Again, revising for clarity (in the Introduction section) (revision in bold italics): "Finally, protein feature prediction methods such as those integrated into PredictProtein provide position-based feature.."
Point 4. In the next paragraph: "and DasProteinFeatureViewer that retrieves the input data from a web service." There is a missing space between 'and' and 'DasProteinFeatureViewer'.
Point 5. In the second paragraph of "The FeatureViewer component" section: "The code below shows how to instantiate the component; the corresponding visualization is shown in Figure 1." This sentence is perhaps more cleanly stated as: "The code below shows how to instantiate the component to create the visualization shown in Figure 1".
Dear James, Thanks for your review; it has been useful to improve our work. We have tried to address all your comments; however, those related to the component itself, i.e., JavaScript code, will be taken into account for a new version of the software, and those related to BioJS in general will be sent to BioJS core developers. Please see our responses below.

(9) Typos in Web Resources
Response to (1) and (2): We are currently working on an improved component to visualize protein sequence annotations. We have made notes about these suggestions and will take them into account for the new visualization. Unfortunately, such improvements are not yet ready to be integrated into the public BioJS GitHub repository.
Response to (3): We agree with the reviewer on the convenience of using dependency management for BioJS components. As we think it will impact not just this component but any other currently in BioJS, we have initiated a discussion on how to improve this aspect in BioJS, but no decision has been made yet. We will take it into account for the new visualization we are working on.
Response to (4): We have included a figure similar to the one proposed by the reviewer at the beginning of the 'Extensions' section. It introduces the extensions and indeed, as mentioned by the reviewer, helps the reader to understand the relation across the three components.
Response to (5): We have included a new paragraph in the Introduction in order to cover other efforts on web visualization for protein sequence annotations, covering not only Dasty but others as well. We have also emphasized the novelty of this work as suggested by the reviewer.