Software Tool Article

haploR: an R-package for querying web-based annotation tools

[version 1; peer review: 3 approved with reservations]
PUBLISHED 01 Feb 2017

This article is included in the RPackage gateway.

Abstract

A number of web-based tools exist for integrating and exploring information linked to annotated genetic variants. We developed haploR, an R-package for querying such web-based genome annotation tools (currently HaploReg and RegulomeDB) and gathering the results in a format suitable for downstream bioinformatic analyses. This will streamline post-genome wide association study (post-GWAS) analysis for rapid discovery and interpretation of genetic associations.

Keywords

R, databases, genomics, genetic variants, genome annotation, data mining

Introduction

Genomic experiments, including genome wide association studies (GWAS), have produced and continue to produce a huge amount of data. To better understand the biological mechanisms involved in the regulation of complex traits, this information requires further analysis. Large projects, such as ENCODE [1], are devoted to bringing together accumulated knowledge about the functional and regulatory elements that control cell function. These projects manage such data to facilitate collaboration between researchers working on the genetics of complex traits.

A number of web-based tools, such as HaploReg [2] and RegulomeDB [3], link detected genetic variants to additional post-GWAS information. This includes information about linkage disequilibrium (LD), expression quantitative trait loci (eQTL), allele frequencies, protein functions, chromatin states, etc., for annotated genetic variants. Because these tools are web-based, the user must open a web page, enter information manually and obtain the results in a predefined format.

In a number of situations, a user needs additional flexibility in working with such tools, for example, to save the results of such analyses in different file formats for further use. This flexibility can be provided by the programming languages commonly used in modern bioinformatics and computational biology, including R, Python, Perl and other high-level languages and computational platforms. Among them, R is one of the leaders, since it is free and offers a large set of packages that facilitate bioinformatics analysis.

We present an R-package, haploR, which allows for querying the HaploReg and RegulomeDB web-based tools. The package connects to the corresponding web site, queries the database and downloads the results in the form of a data frame or a file. The package can easily be included in bioinformatics pipelines, which will, in turn, facilitate analysis for rapid discovery of single nucleotide polymorphism (SNP)/gene-phenotype associations.

Methods

Implementation

The R-package haploR relies on HTTP methods POST and GET to query, download and parse the content of web pages. Functions queryHaploreg(...) and queryRegulome(...) are designed to obtain data from the resources HaploReg (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php) and RegulomeDB (http://www.regulomedb.org/), respectively.
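As a rough illustration of the query-and-parse pattern described above, the sketch below builds a GET-style query string and parses a tab-delimited response into a data frame using base R only. The endpoint https://example.org/annotate, the parameter names query and format, the helper build_query_url() and the canned response text are all hypothetical; they do not represent the actual HaploReg or RegulomeDB interface, nor the internal code of haploR.

```r
# Hypothetical helper: assemble a GET URL from a base address and named parameters.
build_query_url <- function(base_url, params) {
  # Percent-encode each value so that commas, spaces, etc. are safe in a URL.
  encoded <- vapply(params,
                    function(p) utils::URLencode(as.character(p), reserved = TRUE),
                    character(1))
  paste0(base_url, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

snps <- c("rs10048158", "rs4791078")

# Placeholder endpoint and parameter names (assumptions, not a real service).
url <- build_query_url("https://example.org/annotate",
                       list(query = paste(snps, collapse = ","), format = "text"))

# Canned tab-delimited text standing in for the body of an HTTP response.
# Column names and values are invented for illustration only.
response_text <- "rsID\tchr\tpos\nrs10048158\t17\t100\nrs4791078\t17\t200\n"

# Parse the text into a data frame, keeping strings as character vectors.
result <- read.delim(text = response_text, stringsAsFactors = FALSE)

print(url)
print(result)
```

In a real query the response text would come from an HTTP client rather than a hard-coded string; the parsing step would stay the same as long as the service returns delimited text.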

Operation

The package is cross-platform (Windows, macOS and Linux) and has no specific hardware requirements. A standard computer with the most recent version of R (3.3.2 at the time of writing) will handle most applications of the haploR package.

Use cases

Querying HaploReg

To query HaploReg and download the results, the user calls the function queryHaploreg(query, file, study, ...). This function accepts one of three inputs: (1) a vector of SNPs (query); (2) a text file (file); or (3) a study (study). The other parameters correspond directly to the query options on the HaploReg web page and are described in the package user manual. The output of this function is a table with column names identical to those used in HaploReg. The examples below show how each input is used.

Input vector of SNPs

library(haploR)

queryHaploreg(query=c("rs10048158","rs4791078"))

Here, the parameter query is a vector of rs-IDs.
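Since the query is a plain character vector, it may be worth checking its contents before sending it to the web service. The following is a minimal sketch of such a check; the helper clean_rsids() is hypothetical and is not part of haploR, and the pattern only tests for the common "rs" plus digits form. The query call is guarded because it needs the haploR package and network access.

```r
# Hypothetical helper: keep only unique strings that look like rs-IDs ("rs" + digits).
clean_rsids <- function(ids) {
  ids <- unique(trimws(ids))
  ok <- grepl("^rs[0-9]+$", ids)
  if (any(!ok)) {
    warning("Dropping malformed IDs: ", paste(ids[!ok], collapse = ", "))
  }
  ids[ok]
}

# Example input with stray whitespace, a non-rs identifier and a duplicate.
raw_ids <- c("rs10048158", " rs4791078", "chr17:123", "rs10048158")
snps <- suppressWarnings(clean_rsids(raw_ids))
print(snps)

# Run the actual query only in an interactive session with haploR installed.
if (interactive() && requireNamespace("haploR", quietly = TRUE)) {
  res <- tryCatch(haploR::queryHaploreg(query = snps), error = function(e) NULL)
}
```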

Input text file with SNPs

In this example, SNPs are stored in a text file, one SNP per line. In this case, to call queryHaploreg, the user has to execute the following command:

queryHaploreg(file=system.file("extdata/snps.txt", package="haploR"))

Here file represents a path to the file with SNPs.
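For users who do not yet have such a file, the sketch below shows one way to create a file in the one-SNP-per-line layout described above and to confirm its contents before passing the path on. The temporary file is only a stand-in for the user's own file, and the query call is guarded because it needs the haploR package and network access.

```r
snps <- c("rs10048158", "rs4791078")

# Write one rs-ID per line to a temporary file (a stand-in for the user's own file).
snp_file <- tempfile(fileext = ".txt")
writeLines(snps, snp_file)

# Read the file back to confirm that it holds one rs-ID per line.
file_contents <- readLines(snp_file)
print(file_contents)

# The path can then be supplied via the file argument.
if (interactive() && requireNamespace("haploR", quietly = TRUE)) {
  res <- tryCatch(haploR::queryHaploreg(file = snp_file), error = function(e) NULL)
}
```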

Using a particular study

HaploReg offers an option to use data from a previously conducted study. To use this option, the user should first obtain a list of studies and then pass a particular study as a parameter:

# Get a list of studies

studies <- getStudyList()

# Query HaploReg using the second study in the list

queryHaploreg(study=studies[[2]])

Other options, such as a source for epigenomes, mammalian conservation algorithm, and others are also available; see the package’s user manual (https://cran.r-project.org/web/packages/haploR/haploR.pdf) and vignette (https://cran.r-project.org/web/packages/haploR/vignettes/haplor-vignette.html) for correct use.

Querying RegulomeDB

The RegulomeDB project also allows exploration of the properties of SNPs and presents results in different formats: (1) plain text, (2) BED and (3) GFF. The function queryRegulome(query, format) is used to query RegulomeDB:

queryRegulome(query=c("rs4791078","rs10048158"), format="full")

Here, query is a vector of rs-IDs and format is an output format provided by the RegulomeDB web site. The output of this function is similar to that of the queryHaploreg function, but its columns correspond to the RegulomeDB output.
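Because both query functions return tabular results, the output can be written to disk for later steps of a pipeline. The sketch below illustrates this with a toy data frame in place of a real query result; the column names rsID and score, the values and the file names are invented for illustration and are not the actual RegulomeDB or HaploReg output.

```r
# Toy table standing in for a query result; column names and values are invented.
res <- data.frame(rsID = c("rs10048158", "rs4791078"),
                  score = c("5", "4"),
                  stringsAsFactors = FALSE)

# Write to a temporary directory so the example leaves no files behind in the project.
out_dir <- tempdir()
csv_path <- file.path(out_dir, "annotation.csv")
rds_path <- file.path(out_dir, "annotation.rds")

# CSV is convenient for sharing; RDS preserves R column types exactly.
write.csv(res, csv_path, row.names = FALSE)
saveRDS(res, rds_path)

# Reload the RDS copy, as a later analysis step might do.
res_back <- readRDS(rds_path)
print(res_back)
```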

Conclusion and future work

Here, we present a new package, haploR, which currently allows querying the web tools HaploReg and RegulomeDB. We plan to add other web-based tools, such as Regulatory Elements DB (http://dnase.genome.duke.edu/index.php), which provides data from the DNaseI-hypersensitivity and Affymetrix microarray experiments reported in [4].

Software and data availability

Tool available from: https://cran.r-project.org/package=haploR

Source code available from: https://github.com/izhbannikov/haploR

Archived source as at time of publication: https://doi.org/10.5281/zenodo.2599965; https://cran.r-project.org/src/contrib/haploR_1.4.1.tar.gz

License: GPL-2 | GPL-3

The example script and output files for the package are available at: https://doi.org/10.5281/zenodo.2600396

Comments on this article (1)

Version 2 (revised) published 15 May 2017
Version 1 published 01 Feb 2017
Discussion is closed on this version, please comment on the latest version above.
  • Reader Comment 08 Feb 2017
    Shaun Lehmann, Australian National University, Australia
    While the value of tools that allow for the more ready accession of existing databases is apparent, I have difficulty understanding precisely how the use of haploR might benefit me.
    ...
How to cite this article
Zhbannikov IY, Arbeev K and Yashin AI. haploR: an R-package for querying web-based annotation tools [version 1; peer review: 3 approved with reservations]. F1000Research 2017, 6:97 (https://doi.org/10.12688/f1000research.10742.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 03 Mar 2017
Stephanie M. Gogarten, Department of Biostatistics, University of Washington, Seattle, WA, USA 
Approved with Reservations
This paper describes an R-package, haploR, which queries bioinformatics databases. The benefit of the package is an ability to incorporate these queries into workflows in R, rather than using a web interface.

The haploR package seems useful, ...
Gogarten SM. Reviewer Report For: haploR: an R-package for querying web-based annotation tools [version 1; peer review: 3 approved with reservations]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r20081)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    15 May 2017
    Author Response
    We thank the reviewer for careful reading of our paper and constructive remarks. We believe that the comments have identified important areas which required improvement. After completion of the suggested ...
Reviewer Report 23 Feb 2017
Claudia Vitolo, European Centre for Medium-Range Weather Forecasts, Reading, UK 
Estibaliz Gascon, European Centre for Medium-Range Weather Forecasts, Reading, UK 
Fatima Pillosu, European Centre for Medium-Range Weather Forecasts, Reading, UK 
Approved with Reservations
This paper describes the implementation of the haploR R-package, which is used to retrieve information from web-based genome annotation tools. This R-package aims to simplify the reproducibility of bioinformatics pipelines.

Overall, we think the structure of ...
Vitolo C, Gascon E and Pillosu F. Reviewer Report For: haploR: an R-package for querying web-based annotation tools [version 1; peer review: 3 approved with reservations]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r19826)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    15 May 2017
    Author Response
    We thank the reviewers for their careful reading of the manuscript, package testing and their constructive remarks. We have taken the comments on board to improve and clarify the manuscript. ...
Reviewer Report 13 Feb 2017
Garrett M. Dancik, Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA 
Approved with Reservations
The authors describe an R package named haploR for querying the HaploReg and RegulomeDB web-based databases. Because querying can be carried out in R, haploR adds convenience for querying these databases when subsequent downstream analyses in R are desired.

...
Dancik GM. Reviewer Report For: haploR: an R-package for querying web-based annotation tools [version 1; peer review: 3 approved with reservations]. F1000Research 2017, 6:97 (https://doi.org/10.5256/f1000research.11583.r19824)
  • Author Response 15 May 2017
    Ilya Zhbannikov, Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, USA
    15 May 2017
    Author Response
    We thank the reviewer for insightful and thorough feedback. It was clear from those comments that our original paper did not emphasize clearly enough the unique contribution of the R ...