Software Tool Article
Revised

haploR: an R package for querying web-based annotation tools

[version 2; peer review: 3 approved]
PUBLISHED 15 May 2017

This article is included in the RPackage gateway.

Abstract

We developed haploR, an R package for querying the web-based genome annotation tools HaploReg and RegulomeDB. haploR gathers the retrieved information in a data frame suitable for downstream bioinformatic analyses. This will streamline post-genome-wide association study (post-GWAS) analysis and enable rapid discovery and interpretation of genetic associations.

Keywords

R, databases, genomics, genetic variants, genome annotation, data mining

Revised Amendments from Version 1

This version takes into account the reviewers' comments on the applicability of haploR and its comparison with analogous tools, and corrects points missed in the first version. It addresses most of the comments raised by the reviewers.

The major changes in version 2 are:

- Altered the Abstract and Introduction sections.
- Updated the 'Methods' section: only the basic examples are kept; other examples were moved to the haploR vignette (see Supplementary File S1).
- Altered the 'Conclusion and Future Work' section: we emphasised the advantages of haploR and clarified the planned addition of the Regulatory Elements Database.

Version 2 also includes an updated haploR vignette as Supplementary File S1.


Introduction

Genome-wide association studies (GWAS) have produced a large amount of data. To better understand the biological mechanisms involved in the regulation of complex traits, web-based tools such as HaploReg [1] and RegulomeDB [2] were proposed. These tools link detected genetic variants to additional post-GWAS information about linkage disequilibrium (LD), expression quantitative trait loci (eQTL), allele frequencies (AF), protein functions, and chromatin states for annotated single-nucleotide polymorphisms (SNPs). Because these tools are web-based, the user must open a web page, manually enter information, and then retrieve the results. Some tasks require extra effort, for example saving the results in different file formats (TXT, CSV, XLSX, etc.) or using the tools' highly optimized search engines from custom scripts. Among the many annotation packages on Bioconductor (www.bioconductor.org) and CRAN (cran.r-project.org), myvariant [3], biomaRt [4], and rentrez [5] can retrieve information about annotated SNPs. However, even the rich outputs of these packages lack information about LD, eQTL, AF and haplotype blocks.

We present an R package, haploR, which allows querying the HaploReg and RegulomeDB web-based tools from the R environment. The package connects to the web site, queries the database, and downloads the results into a data frame. haploR can easily be included in bioinformatics pipelines, which will facilitate the search for SNP-phenotype associations.

Methods

Implementation

haploR relies on the HTTP methods POST and GET to query and download the content of web pages. The functions queryHaploreg(...) and queryRegulome(...) are designed to query HaploReg (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php) and RegulomeDB (http://www.regulomedb.org/), respectively. The structure of the retrieved data is described on the package website and in the corresponding vignette.
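
To make the GET mechanism concrete, the sketch below shows how a set of rs-IDs might be encoded into a query string using base R only. It is not haploR's internal code and sends no request: the helper build_get_url, the host example.org, and the field names query and format are all hypothetical placeholders, since the real services define their own form fields.

```r
# Hypothetical helper (not part of haploR): build a GET URL from a base
# address and a named list of query parameters.
build_get_url <- function(base_url, params) {
  stopifnot(is.list(params), !is.null(names(params)))
  pairs <- vapply(
    names(params),
    function(key) {
      value <- as.character(params[[key]])
      # Percent-encode both the field name and its value
      paste0(
        utils::URLencode(key, reserved = TRUE), "=",
        utils::URLencode(value, reserved = TRUE)
      )
    },
    character(1)
  )
  paste0(base_url, "?", paste(pairs, collapse = "&"))
}

snps <- c("rs10048158", "rs4791078")

# "query" and "format" are placeholder field names, and example.org is a
# placeholder host; the real services define their own form fields.
url <- build_get_url(
  "http://example.org/annotate",
  list(query = paste(snps, collapse = ","), format = "text")
)
print(url)
```

A POST request would carry the same name-value pairs in the request body instead of the URL; in practice an HTTP client package would handle the encoding and transfer.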

Operation

The package is cross-platform (Windows, macOS and Linux) and has no specific computer hardware requirements. A standard computer with the most recent version of R will handle most applications of the haploR package. Installation instructions and a list of prerequisites are provided on the package web page.

Use cases

Querying HaploReg

To query HaploReg, the user needs to call queryHaploreg(query, file, study, ...). This function accepts three different inputs: (1) a vector of SNPs (query); (2) a text file (file); or (3) a study (study) that can be obtained from HaploReg using getHaploregStudyList(). The parameters of these functions correspond directly to the options on the HaploReg web page and are described in the package user manual. The example below uses a vector of SNPs. For other examples, please refer to the package vignette.

library(haploR)
# Query HaploReg for two SNPs; the result is returned as a data frame
x <- queryHaploreg(query=c("rs10048158","rs4791078"))

Here, the parameter query is a vector of SNPs identified by their rs-IDs.
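
The file input mentioned above can be sketched in a similar way. The snippet below assumes that the text file holds one rs-ID per line, which is an assumption about the expected layout rather than something stated here; the package vignette should be consulted for the exact format. The query itself needs the haploR package and network access, so it is only attempted in an interactive session where the package is installed.

```r
# Write the rs-IDs to a temporary text file, one per line (assumed layout).
snps <- c("rs10048158", "rs4791078")
snp_file <- tempfile(fileext = ".txt")
writeLines(snps, snp_file)

# Sanity check: the file round-trips to the same identifiers.
stopifnot(identical(readLines(snp_file), snps))

# Only run the query interactively, and only if haploR is available.
if (interactive() && requireNamespace("haploR", quietly = TRUE)) {
  x <- haploR::queryHaploreg(file = snp_file)
  str(x)  # inspect the structure of the returned data frame
}
```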

Querying RegulomeDB

The RegulomeDB project also allows exploration of the properties of SNPs and presents results in different formats: (1) plain text (a vector of rs-IDs), (2) BED, and (3) GFF. The function queryRegulome(query, ...) is used to query RegulomeDB:

x <- queryRegulome(query=c("rs4791078","rs10048158"))

Here the query is a vector of rs-IDs. The output is similar to that of the queryHaploreg function in terms of the type of information retrieved, but is specific to RegulomeDB. For detailed explanations of the format, refer to the RegulomeDB web site.
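
Because both queries yield tabular results, they can be combined with ordinary data frame operations. The sketch below illustrates the idea with small mock tables rather than real query output: the rs-IDs, the values, and the column names (rsID, r2, is_query_snp, rsid, score) are placeholders chosen for illustration and should be checked against the actual output described in the vignette.

```r
# Mock stand-ins for query results; IDs, values and column names are
# illustrative placeholders, not real annotation data.
haploreg_mock <- data.frame(
  rsID         = c("rs_example_1", "rs_example_2", "rs_example_3"),
  r2           = c(1.00, 1.00, 0.85),
  is_query_snp = c(1, 1, 0),
  stringsAsFactors = FALSE
)
regulome_mock <- data.frame(
  rsid  = c("rs_example_2", "rs_example_1"),
  score = c("5", "1f"),
  stringsAsFactors = FALSE
)

# Keep only the SNPs that were part of the original query,
# dropping variants reported only because they are in LD with them.
query_rows <- haploreg_mock[haploreg_mock$is_query_snp == 1, ]

# Join the two annotation sources on the rs-ID.
combined <- merge(query_rows, regulome_mock, by.x = "rsID", by.y = "rsid")
print(combined)
```

The merged table can then be filtered, summarised, or exported with the usual R tools, which is the main convenience of having both sources available as data frames.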

Conclusion and future work

haploR can easily be included in a bioinformatics pipeline to streamline the process and reduce analysis time. Its advantages over using the original databases directly include shorter retrieval time, the ability to present results in a user-friendly form (allowing for a more streamlined workflow), and convenient use of the retrieved information in reports, presentations and publications. We plan to add other tools, such as Regulatory Elements (http://dnase.genome.duke.edu/index.php), which provides data from the DNaseI hypersensitivity and microarray experiments performed in [6]. Understanding the factors that modulate gene expression and protein yield across individuals and cell types may help discover novel mechanisms of genetic associations.

Software availability

Tool available from: https://cran.r-project.org/package=haploR

Source code available from: https://github.com/izhbannikov/haploR

Archived source as at time of publication: https://cran.r-project.org/src/contrib/haploR_1.4.4.tar.gz, doi: https://doi.org/10.5281/zenodo.570956

License: GPL-3

Data availability

The example script and output files for the package are available at: https://doi.org/10.5281/zenodo.570960

Comments on this article (1)

Reader Comment 08 Feb 2017 (on Version 1)
Shaun Lehmann, Australian National University, Australia
While the value of tools that allow for the more ready accession of existing databases is apparent, I have difficulty understanding precisely how the use of haploR might benefit me.
...
how to cite this article
Zhbannikov IY, Arbeev K, Ukraintseva S and Yashin AI. haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.12688/f1000research.10742.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.

Version 2 (published 15 May 2017)

Reviewer Report 03 Jul 2017
Claudia Vitolo, European Centre for Medium-Range Weather Forecasts, Reading, UK
Estibaliz Gascon, European Centre for Medium-Range Weather Forecasts, Reading, UK
Fatima Pillosu, European Centre for Medium-Range Weather Forecasts, Reading, UK
Approved
The authors have addressed my concerns.

I only have a few minor comments:
  • There is a repetition in the last part of the introduction (The package connects to the web site...)
...
How to cite this report:
Vitolo C, Gascon E and Pillosu F. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.12496.r22714)

Reviewer Report 31 May 2017
Garrett M. Dancik, Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA
Approved
The authors have ...
How to cite this report:
Dancik GM. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.12496.r22712)

Reviewer Report 30 May 2017
Stephanie M. Gogarten, Department of Biostatistics, University of Washington, Seattle, WA, USA
Approved
The authors have addressed my concerns. My only additional comment is that the last two ...
How to cite this report:
Gogarten SM. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.12496.r22713)

Version 1 (published 01 Feb 2017)

Reviewer Report 03 Mar 2017
Stephanie M. Gogarten, Department of Biostatistics, University of Washington, Seattle, WA, USA
Approved with Reservations
This paper describes an R package, haploR, which queries bioinformatics databases. The benefit of the package is an ability to incorporate these queries into workflows in R, rather than using a web interface.

The haploR package seems useful, ...
How to cite this report:
Gogarten SM. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r20081)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    We thank the reviewer for careful reading of our paper and constructive remarks. We believe that the comments have identified important areas which required improvement. After completion of the suggested ...

Reviewer Report 23 Feb 2017
Claudia Vitolo, European Centre for Medium-Range Weather Forecasts, Reading, UK
Estibaliz Gascon, European Centre for Medium-Range Weather Forecasts, Reading, UK
Fatima Pillosu, European Centre for Medium-Range Weather Forecasts, Reading, UK
Approved with Reservations
This paper describes the implementation of the haploR R package, which is used to retrieve information from web-based genome annotation tools. This R package aims to simplify the reproducibility of bioinformatics pipelines.

Overall, we think the structure of ...
How to cite this report:
Vitolo C, Gascon E and Pillosu F. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r19826)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    We thank the reviewers for their careful reading of the manuscript, package testing and their constructive remarks. We have taken the comments on board to improve and clarify the manuscript. ...

Reviewer Report 13 Feb 2017
Garrett M. Dancik, Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA
Approved with Reservations
The authors describe an R package named haploR for querying the HaploReg and RegulomeDB web-based databases. Because querying can be carried out in R, haploR adds convenience for querying these databases when subsequent downstream analyses in R are desired.

...
How to cite this report:
Dancik GM. Reviewer Report For: haploR: an R package for querying web-based annotation tools [version 2; peer review: 3 approved]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r19824)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    We thank the reviewer for insightful and thorough feedback. It was clear from those comments that our original paper did not emphasize clearly enough the unique contribution of the R ...
