<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.10742.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                    <subj-group>
                        <subject>Bioinformatics</subject>
                    </subj-group>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>
                    <italic>haploR</italic>: an R package for querying web-based annotation tools</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 3 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Zhbannikov</surname>
                        <given-names>Ilya Y.</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6502-6514</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Arbeev</surname>
                        <given-names>Konstantin</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4195-7832</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Ukraintseva</surname>
                        <given-names>Svetlana</given-names>
                    </name>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Yashin</surname>
                        <given-names>Anatoliy I.</given-names>
                    </name>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Biodemography of Aging Research Unit (BARU) at Social Science Research Institute, Duke University, Durham, NC, USA</aff>
                <aff id="a2">
                    <label>2</label>Duke Population Research Institute, Duke University, Durham, NC, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:ilya.zhbannikov@duke.edu">ilya.zhbannikov@duke.edu</email>
                </corresp>
                <fn fn-type="con">
                    <p>IYZ developed the package and the evaluation/validation tests, and wrote the manuscript. KA, SU and AIY contributed to the development of the package and revised the manuscript. All authors read and approved the final manuscript.</p>
                </fn>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>15</day>
                <month>5</month>
                <year>2017</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2017</year>
            </pub-date>
            <volume>6</volume>
            <elocation-id>97</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>10</day>
                    <month>5</month>
                    <year>2017</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Zhbannikov IY et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/6-97/pdf"/>
            <abstract>
                <p>We developed 
                    <italic toggle="yes">haploR</italic>, an R package for querying the web-based genome annotation tools HaploReg and RegulomeDB. 
                    <italic toggle="yes">haploR</italic> gathers the retrieved information in a data frame that is suitable for downstream bioinformatic analyses. This streamlines post-genome-wide association study (post-GWAS) analysis and supports rapid discovery and interpretation of genetic associations.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>R</kwd>
                <kwd>databases</kwd>
                <kwd>genomics</kwd>
                <kwd>genetic variants</kwd>
                <kwd>genome annotation</kwd>
                <kwd>data mining</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000049">
                    <funding-source>National Institute on Aging</funding-source>
                    <award-id>P01AG043352</award-id>
                    <award-id>R01AG046860</award-id>
                    <award-id>P30AG034424</award-id>
                </award-group>
                <funding-statement>This work was supported by the National Institute on Aging of the National Institutes of Health (NIA/NIH) under Award Numbers P01AG043352, R01AG046860, and P30AG034424. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIA/NIH.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>This version addresses most of the reviewers' comments, including those on the applicability of 
                    <italic>haploR</italic> and its comparison with analogous tools, and corrects points missed in the first version. Major changes in version 2 are:
                    <list list-type="bullet">
                        <list-item>
                            <p>Altered the Abstract and Introduction sections.</p>
                        </list-item>
                        <list-item>
                            <p>Updated the &#x2018;Methods&#x2019; section: only the basic examples are kept; other examples were moved to the 
                                <italic>haploR</italic>-vignette (see Supplementary File S1).</p>
                        </list-item>
                        <list-item>
                            <p>Altered the &#x2018;Conclusion and Future Work&#x2019; section: we emphasised the advantages of 
                                <italic>haploR</italic> and clarified the planned addition of the Regulatory Elements Database.</p>
                        </list-item>
                    </list> Version 2 also includes an updated 
                    <italic>haploR</italic>-vignette as Supplementary File S1.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Genome-wide association studies (GWAS) have produced a large amount of data. To better understand the biological mechanisms involved in the regulation of complex traits, web-based tools, such as HaploReg
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup> and RegulomeDB
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, have been proposed. These tools link detected genetic variants to additional post-GWAS information about linkage disequilibrium (LD), expression quantitative trait loci (eQTL), allele frequencies (AF), protein functions, and chromatin states for annotated single-nucleotide polymorphisms (SNPs). They are all web-based and require the user to open a web page, enter information manually, and retrieve the results. Some tasks therefore require extra effort, for example saving the results in different file formats (TXT, CSV, XLSX, etc.) or using the tools' highly optimized search engines from custom scripts. Among the many annotation packages on Bioconductor (
                <ext-link ext-link-type="uri" xlink:href="http://www.bioconductor.org">www.bioconductor.org</ext-link>) and CRAN (
                <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org">cran.r-project.org</ext-link>), 
                <italic toggle="yes">myvariant</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, 
                <italic toggle="yes">biomaRt</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>, and 
                <italic toggle="yes">rentrez</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup> can retrieve information about annotated SNPs. However, even the rich outputs of these packages lack information about LD, eQTL, AF and haplotype blocks. We present an R package, 
                <italic toggle="yes">haploR</italic>, which allows querying the HaploReg and RegulomeDB web-based tools from the R environment. The package connects to the web site, queries the database, and downloads the results into a data frame. <italic toggle="yes">haploR</italic> can easily be included in bioinformatics pipelines, which will facilitate the search for SNP-phenotype associations.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Implementation</title>
                <p>
                    <italic toggle="yes">haploR</italic> relies on HTTP methods 
                    <monospace>POST</monospace> and 
                    <monospace>GET</monospace> to query and download the content of web pages. Functions 
                    <monospace>queryHaploreg(...)</monospace> and 
                    <monospace>queryRegulome(...)</monospace> are designed to query HaploReg (
                    <ext-link ext-link-type="uri" xlink:href="http://archive.broadinstitute.org/mammals/haploreg/haploreg.php">http://archive.broadinstitute.org/mammals/haploreg/haploreg.php</ext-link>) and RegulomeDB (
                    <ext-link ext-link-type="uri" xlink:href="http://www.regulomedb.org/">http://www.regulomedb.org/</ext-link>), respectively. The structure of the retrieved data is described on the package website and corresponding vignette.</p>
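                <p>As a rough illustration of this request pattern, the following sketch builds a form body and submits it with 
                    <monospace>POST</monospace>, then reads back the page text. It is not the 
                    <italic toggle="yes">haploR</italic> source code: it assumes the 
                    <italic toggle="yes">httr</italic> package is installed, and the helper names, the form field 
                    <monospace>query</monospace> and the comma separator are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># Illustrative sketch only; not the haploR implementation.
# Assumes the 'httr' package; the field name "query" is hypothetical.

# Build the form body for a vector of rs-IDs (no network access needed).
build_form_body &lt;- function(snps) {
  stopifnot(is.character(snps), length(snps) &gt; 0)
  list(query = paste(snps, collapse = ","))
}

# Send the form with HTTP POST and return the raw page text.
post_form &lt;- function(url, body, timeout_sec = 10) {
  resp &lt;- httr::POST(url, config = httr::timeout(timeout_sec),
                     body = body, encode = "form")
  httr::stop_for_status(resp)  # raise an R error on HTTP 4xx/5xx
  httr::content(resp, as = "text", encoding = "UTF-8")
}

body &lt;- build_form_body(c("rs10048158", "rs4791078"))</styled-content>
                    </preformat>
                </p>
                <p>Keeping the construction of the request body separate from the network call means the first step can be checked offline, while the second step fails early on an HTTP error or a timeout.</p>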
            </sec>
            <sec>
                <title>Operation</title>
                <p>The package is cross-platform (Windows, macOS and Linux) and has no specific hardware requirements. A standard computer with the most recent version of R will handle most applications of the 
                    <italic toggle="yes">haploR</italic> package. Installation instructions and a list of prerequisites are provided on the package web page.</p>
            </sec>
        </sec>
        <sec>
            <title>Use cases</title>
            <sec>
                <title>Querying HaploReg</title>
                <p>To query HaploReg, the user needs to call 
                    <monospace>queryHaploreg(query, file, study, ...)</monospace>. This function accepts three different inputs: (1) a vector of SNPs (
                    <monospace>query</monospace>); (2) a text file (
                    <monospace>file</monospace>); or (3) a study (
                    <monospace>study</monospace>) that can be obtained from HaploReg using 
                    <monospace>getHaploregStudyList()</monospace>. The parameters of these functions correspond directly to the options provided on the HaploReg web page and are described in the package user manual. The example below uses a vector of SNPs; for other examples, please refer to the package vignette.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
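                        <styled-content style="font-size:15px;"># Load the package and query HaploReg for two SNPs given by rs-ID
library(haploR)
# x holds the retrieved annotations as a data frame
x &lt;- queryHaploreg(query=c("rs10048158","rs4791078"))</styled-content>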
                    </preformat>
                </p>
                <p>Here, the parameter 
                    <monospace>query</monospace> is a vector of SNPs identified by their rs-IDs.</p>
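                <p>Because the result is a data frame, such a call can be wrapped in a small helper for pipeline use. The sketch below splits a long SNP vector into chunks, queries each chunk, skips chunks that fail, and stacks the results. The helper 
                    <monospace>query_in_chunks</monospace>, its 
                    <monospace>chunk_size</monospace> parameter and the offline stand-in 
                    <monospace>fake_query</monospace> are hypothetical and not part of 
                    <italic toggle="yes">haploR</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># Illustrative sketch: query a long SNP vector in chunks and stack the results.
# 'query_fun' is any function that accepts 'query' (a character vector of
# rs-IDs) and returns a data frame; 'chunk_size' is a hypothetical parameter.
query_in_chunks &lt;- function(snps, query_fun, chunk_size = 100) {
  stopifnot(is.character(snps), chunk_size &gt;= 1)
  snps &lt;- unique(snps)
  chunks &lt;- split(snps, ceiling(seq_along(snps) / chunk_size))
  results &lt;- lapply(chunks, function(chunk) {
    tryCatch(query_fun(query = chunk),
             error = function(e) {
               warning("Chunk failed: ", conditionMessage(e))
               NULL  # a failed chunk is skipped rather than aborting the run
             })
  })
  results &lt;- Filter(Negate(is.null), results)
  if (length(results) == 0) return(data.frame())
  out &lt;- do.call(rbind, unname(results))
  rownames(out) &lt;- NULL
  out
}

# Offline stand-in for a real query function, for demonstration only.
fake_query &lt;- function(query) {
  data.frame(rsID = query, stringsAsFactors = FALSE)
}
demo &lt;- query_in_chunks(c("rs10048158", "rs4791078", "rs10048158"),
                        fake_query, chunk_size = 1)</styled-content>
                    </preformat>
                </p>
                <p>With 
                    <italic toggle="yes">haploR</italic> loaded, the same helper could be called as 
                    <monospace>query_in_chunks(snps, queryHaploreg)</monospace>, assuming the data frames returned for each chunk share the same columns.</p>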
            </sec>
            <sec>
                <title>Querying RegulomeDB</title>
                <p>The RegulomeDB project also allows exploration of the properties of SNPs and presents results in different formats: (1) plain text (a vector of rs-IDs), (2) BED and (3) GFF. The function 
                    <monospace>queryRegulome(query, ...)</monospace> is used to query RegulomeDB:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># Query RegulomeDB for two SNPs given by rs-ID (haploR must be loaded)
x &lt;- queryRegulome(query=c("rs4791078","rs10048158"))</styled-content>
                    </preformat>
                </p>
                <p>Here, 
                    <monospace>query</monospace> is a vector of rs-IDs. The output is similar to that of the 
                    <monospace>queryHaploreg</monospace> function in the type of information retrieved, but is specific to RegulomeDB. For a detailed explanation of the format, refer to the RegulomeDB web site.</p>
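                <p>Since both query functions take rs-IDs, it can be useful to screen the input before sending it. The following is a minimal sketch, assuming that a valid identifier is the lower-case prefix 
                    <monospace>rs</monospace> followed by digits only; the helper 
                    <monospace>is_rsid</monospace> is hypothetical and not part of 
                    <italic toggle="yes">haploR</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># Illustrative sketch: keep only strings that look like rs-IDs.
# Assumption: an rs-ID is "rs" followed by one or more digits.
is_rsid &lt;- function(x) {
  grepl("^rs[0-9]+$", x)
}

candidates &lt;- c("rs4791078", "rs10048158", "chr1:12345", "RS99", "")
valid &lt;- candidates[is_rsid(candidates)]</styled-content>
                    </preformat>
                </p>
                <p>The filtered vector 
                    <monospace>valid</monospace> can then be passed as the 
                    <monospace>query</monospace> argument.</p>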
            </sec>
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion and future work</title>
            <p>
                <italic toggle="yes">haploR</italic> can easily be included in a bioinformatics pipeline to streamline the process and reduce analysis time. Its advantages over the original databases include shorter retrieval time, presentation of results in a user-friendly form (allowing for a more streamlined workflow), and convenient use of the retrieved information in reports, presentations and publications. We plan to add other tools, such as Regulatory Elements (
                <ext-link ext-link-type="uri" xlink:href="http://dnase.genome.duke.edu/index.php">http://dnase.genome.duke.edu/index.php</ext-link>), which provides data from the DNaseI hypersensitivity and microarray experiments reported in reference 
                <xref ref-type="bibr" rid="ref-6">6</xref>. Understanding the factors that modulate gene expression and protein yield across individuals and cell types may help uncover novel mechanisms of genetic associations.</p>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>Tool available from: 

                <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=haploR">https://cran.r-project.org/package=haploR</ext-link>
            </p>
            <p>Source code available from: 

                <ext-link ext-link-type="uri" xlink:href="https://github.com/izhbannikov/haploR">https://github.com/izhbannikov/haploR</ext-link>
            </p>
            <p>Archived source as at time of publication: 

                <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/src/contrib/haploR_1.4.4.tar.gz">https://cran.r-project.org/src/contrib/haploR_1.4.4.tar.gz</ext-link>, doi: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.570956">https://doi.org/10.5281/zenodo.570956</ext-link>
            </p>
            <p>License: GPL-3</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>The data referenced by this article are under copyright with the following copyright statement: Copyright: &#x00a9; 2017 Zhbannikov IY et al.</p>
            <p>Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication): 
                <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
            </p>
            <p>The example script and output files for the package are available at: 

                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.570960">https://doi.org/10.5281/zenodo.570960</ext-link>
            </p>
        </sec>
    </body>
    <back>
        <sec id="SM1" sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p>
                <bold>
                    <italic toggle="yes">haploR</italic>-vignette. Using haploR, an R package for querying HaploReg and RegulomeDB.</bold> This file describes post-GWAS analysis and the unique contribution of haploR to it. It includes an example of a typical analysis workflow using haploR, a description of the post-GWAS web databases (HaploReg, RegulomeDB) used in the package with comprehensive usage examples, and a description of the data structures used in haploR.</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/10742/c102caad-f2d8-4643-b379-9217c7a685b3.pdf">Click here to access the data</ext-link>.</p>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ward</surname>
                            <given-names>LD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kellis</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2012</year>;<volume>40</volume>(<issue>Database issue</issue>):<fpage>D930</fpage>&#x2013;<lpage>4</lpage>.
                    <pub-id pub-id-type="pmid">22064851</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkr917</pub-id>
                    <pub-id pub-id-type="pmcid">3245002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Boyle</surname>
                            <given-names>AP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hong</surname>
                            <given-names>EL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hariharan</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Annotation of functional variation in personal genomes using RegulomeDB.</article-title>
                    <source>

                        <italic toggle="yes">Genome Res.</italic>
</source>
                    <year>2012</year>;<volume>22</volume>(<issue>9</issue>):<fpage>1790</fpage>&#x2013;<lpage>1797</lpage>.
                    <pub-id pub-id-type="pmid">22955989</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.137323.112</pub-id>
                    <pub-id pub-id-type="pmcid">3431494</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mark</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>myvariant: Accesses MyVariant.info variant query and annotation services</article-title>. R package version 1.4.0, <year>2015</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.statistik.tu-dortmund.de/packages/3.4/bioc/manuals/myvariant/man/myvariant.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Durinck</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Moreau</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kasprzyk</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2005</year>;<volume>21</volume>(<issue>16</issue>):<fpage>3439</fpage>&#x2013;<lpage>40</lpage>.
                    <pub-id pub-id-type="pmid">16082012</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bti525</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Winter</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>rentrez: Entrez in R</article-title>. R package version 1.0.4, <year>2016</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/rentrez/index.html">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sheffield</surname>
                            <given-names>NC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thurman</surname>
                            <given-names>RE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Song</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Patterns of regulatory activity across diverse human cell types predict tissue identity, transcription factor binding, and long-range interactions.</article-title>
                    <source>

                        <italic toggle="yes">Genome Res.</italic>
</source>
                    <year>2013</year>;<volume>23</volume>(<issue>5</issue>):<fpage>777</fpage>&#x2013;<lpage>88</lpage>.
                    <pub-id pub-id-type="pmid">23482648</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gr.152140.112</pub-id>
                    <pub-id pub-id-type="pmcid">3638134</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report22714">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12496.r22714</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Vitolo</surname>
                        <given-names>Claudia</given-names>
                    </name>
                    <xref ref-type="aff" rid="r22714a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4252-1176</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gascon</surname>
                        <given-names>Estibaliz</given-names>
                    </name>
                    <xref ref-type="aff" rid="r22714a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Pillosu</surname>
                        <given-names>Fatima</given-names>
                    </name>
                    <xref ref-type="aff" rid="r22714a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r22714a1">
                    <label>1</label>European Centre for Medium-Range Weather Forecasts, Reading, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>7</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Vitolo C et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport22714" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have addressed my concerns.</p>
            <p>I only have a few minor comments: 
                <list list-type="bullet">
                    <list-item>
                        <p>There is a repetition in the last part of the introduction (The package connects to the web site...)</p>
                    </list-item>
                    <list-item>
                        <p>In the text you mention the 'package website'. If I understand well, this is actually the package repository on GitHub, right? Just make that clearer in the text.</p>
                    </list-item>
                    <list-item>
                        <p>R CMD check shows that there is a mismatch between documentation and code for function queryRegulome(), please fix argument 'timeout' (default value&#x00a0;in Code: 100 while in Docs: 10)</p>
                    </list-item>
                </list> Many thanks to the author for their hard work on this revision.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report22712">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12496.r22712</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Dancik</surname>
                        <given-names>Garrett M.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r22712a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1391-8641</uri>
                </contrib>
                <aff id="r22712a1">
                    <label>1</label>Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>31</day>
                <month>5</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Dancik GM</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport22712" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have addressed my concerns.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report22713">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12496.r22713</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gogarten</surname>
                        <given-names>Stephanie M.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r22713a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7231-9745</uri>
                </contrib>
                <aff id="r22713a1">
                    <label>1</label>Department of Biostatistics, University of Washington, Seattle, WA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>30</day>
                <month>5</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Gogarten SM</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport22713" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have addressed my concerns. My only additional comment is that the last two sentences of the Introduction are now redundant with the previous paragraph and should be deleted.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report20081">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.11583.r20081</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gogarten</surname>
                        <given-names>Stephanie M.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r20081a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7231-9745</uri>
                </contrib>
                <aff id="r20081a1">
                    <label>1</label>Department of Biostatistics, University of Washington, Seattle, WA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>3</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Gogarten SM</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport20081" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper describes an R-package, 
                <italic>haploR</italic>, which queries bioinformatics databases. The benefit of the package is the ability to incorporate these queries into workflows in R, rather than using a web interface.</p>
            <p> </p>
            <p> The 
                <italic>haploR</italic> package seems useful, but the paper is lacking sufficient detail in several areas.&#x00a0; 
                <list list-type="order">
                    <list-item>
                        <p>The Bioconductor project (bioconductor.org) contains a wealth of resources for querying various sources of annotation from R. The paper should discuss how the 
                            <italic>haploR</italic> package provides features that are not available in existing resources.</p>
                    </list-item>
                    <list-item>
                        <p>The types of information available in HaploReg and RegulomeDB are not well described. Why were these particular resources selected for this package and how do they differ from each other?</p>
                    </list-item>
                    <list-item>
                        <p>The "future work" section mentions adding other web tools to the package in the future. What additional information will be provided by those tools and how were they selected for inclusion in the package?</p>
                    </list-item>
                </list> </p>
            <p> I was able to install the R-package and follow the examples given in the vignette. However, these examples would benefit from more explanation. 
                <list list-type="order">
                    <list-item>
                        <p>In the HaploReg example, querying the database with two rs IDs returns results for many additional rs IDs. Why is this?</p>
                    </list-item>
                    <list-item>
                        <p>Why is the first element returned by getStudyList() blank?</p>
                    </list-item>
                </list> </p>
            <p> In summary, the authors have provided a potentially useful R-package, but they need to include more explanation of how this package will benefit the bioinformatics community.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment2692-20081">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Zhbannikov</surname>
                            <given-names>Ilya</given-names>
                        </name>
                        <aff>Duke University, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>4</day>
                    <month>5</month>
                    <year>2017</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank the reviewer for careful reading of our paper and constructive remarks. We believe that the comments have identified important areas which required improvement. After completion of the suggested edits, the revised paper has benefited from an improvement in the overall presentation and clarity. Reviewer comments/suggestions (RC) are in italics font; our responses (AR) are in regular, black font.</p>
                <p>
                    <bold>RC1:</bold>
                </p>
                <p>
                    <italic>The Bioconductor project (bioconductor.org) contains a wealth of resources for querying various sources of annotation from R. The paper should discuss how the haploR package provides features that are not available in existing resources.</italic>
                </p>
                <p>
                    <bold>AR1:</bold>
                </p>
                <p>We wanted to automatically retrieve the information about annotated genetic variants listed as an output of our custom genomic pipeline. We decided to find an R package that would be able to do this rather than download very large annotation files from different projects in order to query them locally. Among a plethora of annotation packages from Bioconductor and CRAN (
                    <italic>annotate</italic>, 
                    <italic>mygene</italic>, 
                    <italic>ensembldb</italic>, 
                    <italic>biomaRt</italic>, 
                    <italic>myvariant</italic>, 
                    <italic>rsnps</italic>, 
                    <italic>rentrez</italic>), only 
                    <italic>myvariant</italic>, 
                    <italic>biomaRt</italic>, 
                    <italic>rentrez</italic> could potentially serve our needs. However, even the rich outputs of 
                    <italic>myvariant</italic>, 
                    <italic>biomaRt</italic> and 
                    <italic>rentrez</italic> did not contain ready-to-use information about LD, sequence conservation across mammals, the effect of SNPs on regulatory motifs, and the effect of SNPs on expression from eQTL studies. In the revised version of our paper we briefly (owing to limited space) emphasized the advantages of haploR. Please see the introductory section.</p>
                <p>
                    <bold>RC2:</bold>
                </p>
                <p>
                    <italic>The types of information available in HaploReg and RegulomeDB are not well described. Why were these particular resources selected for this package and how do they differ from each other?</italic>
                </p>
                <p>
                    <bold>AR2:</bold>
                </p>
                <p>HaploReg is a web resource for exploring annotations of genetically linked variants (i.e. variants in haplotype blocks). A particular advantage of HaploReg is that it allows exploration of the effects of SNPs on expression from eQTL studies. It also outputs SNPs that are genetically linked to the query, so we can discover effects of correlations. RegulomeDB is a resource that annotates SNPs with known and predicted regulatory elements in the intergenic regions of the human genome. The data mostly come from publicly available datasets (GEO, ENCODE, etc.). Both HaploReg and RegulomeDB were chosen as convenient tools for exploring effects of eQTL and identifying closely related variants. We added a description of the HaploReg and RegulomeDB output data to the package vignette (please see the Overview section).</p>
                <p>
                    <bold>RC3:</bold>
                </p>
                <p>
                    <italic>The "future work" section mentions adding other web tools to the package in the future.&#x00a0; What additional information will be provided by those tools and how were they selected for inclusion in the package?</italic>
                </p>
                <p>
                    <bold>AR3:</bold>
                </p>
                <p>We think that including additional resources on regulatory factors is beneficial since such factors can modulate gene expression and protein yield distinctly across individuals and cell types. This can help us to discover novel mechanisms of genetic associations.</p>
                <p>
                    <bold>RC4:</bold>
                </p>
                <p>
                    <italic>I was able to install the R-package and follow the examples given in the vignette. However, these examples would benefit from more explanation. In the HaploReg example, querying the database with two rs IDs returns results for many additional rs IDs. Why is this?</italic>
                </p>
                <p>
                    <bold>AR4:</bold>
                </p>
                <p>This happens because HaploReg returns information not only about the query SNPs but also about SNPs whose LD with the query is equal to or higher than a predefined threshold (0.8 by default).</p>
                <p>
                    <bold>RC5:</bold>
                </p>
                <p>
                    <italic>Why is the first element returned by getStudyList() blank?</italic>
                </p>
                <p>
                    <bold>AR5:</bold>
                </p>
                <p>This was because we used the study list returned by HaploReg 'as is', and its first element was blank. This is fixed in version 1.4.4 of the package (blanks were removed).</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report19826">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.11583.r19826</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Vitolo</surname>
                        <given-names>Claudia</given-names>
                    </name>
                    <xref ref-type="aff" rid="r19826a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4252-1176</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gascon</surname>
                        <given-names>Estibaliz</given-names>
                    </name>
                    <xref ref-type="aff" rid="r19826a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Pillosu</surname>
                        <given-names>Fatima</given-names>
                    </name>
                    <xref ref-type="aff" rid="r19826a1">1</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r19826a1">
                    <label>1</label>European Centre for Medium-Range Weather Forecasts, Reading, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>23</day>
                <month>2</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Vitolo C et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport19826" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper describes the implementation of the 
                <italic>haploR</italic> R-package, which is used to retrieve information from web-based genome annotation tools. This R-package aims to simplify the reproducibility of bioinformatics pipelines.</p>
            <p> </p>
            <p> Overall, we think the structure of the paper and the aim of the project are in line with the journal&#x2019;s guidelines. The 
                <italic>haploR </italic>package seems a valuable open source tool for bioinformaticians and R users as it facilitates data retrieval from web-based databases (such as HaploReg and RegulomeDB) and makes the scientific workflow more reproducible. We also appreciate the intention to keep improving the package by extending the list of supported databases.</p>
            <p> </p>
            <p> We mostly work on climate science and have a limited understanding of bioinformatics. However, we use R extensively and we decided to review this work from a generic R-user perspective. We focused our review on the paper and the source code; we considered the user manual and the vignette out of the scope of this review.</p>
            <p> </p>
            <p> In our opinion, this paper deserves publication but requires some further work. We decided to approve it with reservations because we noticed some ambiguities in the paper that need to be clarified. We also suggest small changes to the code that could make the functions in the package less error-prone and more future-proof. Our specific comments are listed below.</p>
            <p> </p>
            <p> 
                <bold>Major comments</bold> 
                <list list-type="order">
                    <list-item>
                        <p>INTRODUCTION</p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>We think the introduction is rather vague. Several phrases, such as &#x201c;in a number of situations&#x201d; or &#x201c;in a certain format&#x201d;, are too vague and require further explanation. For example, instead of saying &#x201c;in a certain format&#x201d;, the authors could explicitly name the formats that they are referring to (e.g. csv, json, etc.). Likewise, in the second sentence of the third paragraph (&#x201c;... saving the results of such analyses in different file formats ...&#x201d;), the authors should specify what the different file formats are.</p>
                                </list-item>
                                <list-item>
                                    <p>Just before the fourth paragraph, the authors should mention if this package could be added to one of the CRAN Task Views (
                                        <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/views/">https://cran.r-project.org/web/views/</ext-link>) and whether there are other packages with similar goals. If there are other related packages, it would be interesting to mention whether the data could be combined. &#x00a0;</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>METHODS</p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>The second sentence of the sub-section Implementation says &#x201c;Functions&#x2026;.are designed to obtain data from the resources HaploReg...and RegulomeDB&#x2026;.&#x201d;. Here, it is important to describe the structure of the retrieved data.</p>
                                </list-item>
                                <list-item>
                                    <p>We appreciate that most bioinformaticians are familiar with web-based databases such as HaploReg and RegulomeDB. However, a student might want to use this tool, and a more detailed description of these web databases would help them get started. Please also consider commenting on the use and interpretation of the retrieved information, for example by plotting a subset of the full dataset.</p>
                                </list-item>
                                <list-item>
                                    <p>The Operation section should include clear instructions for the installation and a complete description of package dependencies, including versions of the dependent packages.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>USE CASES</p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>This section is rather vague. The authors should clearly describe all the input arguments of the functions, as well as the expected results.</p>
                                </list-item>
                                <list-item>
                                    <p>Querying HaploReg - Input vector of SNPs 
                                        <list list-type="bullet">
                                            <list-item>
                                                <p>When writing example code, it is considered good practice to assign the result of a command to an object, e.g. x &lt;- queryHaploreg(query=c("rs10048158","rs4791078")). Please consider making this change throughout the paper.</p>
                                            </list-item>
                                            <list-item>
                                                <p>When we run the command x &lt;- queryHaploreg(query=c("rs10048158","rs4791078")) we get the following message: &#x201c;No encoding supplied: defaulting to UTF-8&#x201d;. Consider changing the encoding or removing non-Ascii characters from the table before outputting.</p>
                                            </list-item>
                                            <list-item>
                                                <p>After retrieving the data, please describe the structure of the retrieved object. In particular you should mention the expected number of columns and rows as well as the name and type of variables (the authors might find the str() function useful).</p>
                                            </list-item>
                                            <list-item>
                                                <p>We tried to print the object, but the result filled the screen and was unreadable. We suggest converting the data frame into a tibble (see the tibble package) to generate a more readable printed output.</p>
                                            </list-item>
                                            <list-item>
                                                <p>We checked the structure of the retrieved objects and the data types are all characters. Some of the columns clearly contain numeric variables (e.g. r2, D, AFR&#x2026;). We suggest converting these columns from character to numeric before outputting. This conversion is important because users might run into errors when generating basic statistics. For instance, running x &lt;- queryHaploreg(query=c("rs10048158","rs4791078")); quantile(x$AFR) generates the following error message: &#x201c;Error in (1 - h) * qs[i] : non-numeric argument to binary operator&#x201d;.</p>
                                            </list-item>
                                        </list> </p>
                                </list-item>
                                <list-item>
                                    <p>Querying HaploReg - Input text file with SNPs:&#x00a0;This example is reproducible but the authors do not specify how the &#x201c;extdata/snps.txt&#x201d; file is structured. We suggest writing something like &#x201c;the text file should list the rs-IDs in one column, with one rs-ID per row&#x201d;.&#x00a0;</p>
                                </list-item>
                                <list-item>
                                    <p>Querying HaploReg - Using a particular study:&#x00a0;When we extracted the list of studies, we noticed that we cannot subset it using names. Subsetting using indices is prone to errors because the list of studies could increase over time and their order could change. &#x00a0;</p>
                                </list-item>
                                <list-item>
                                    <p>Querying RegulomeDB 
                                        <list list-type="bullet">
                                            <list-item>
                                                <p>Please explain what the argument format is. It is not obvious to non-experts.</p>
                                            </list-item>
                                            <list-item>
                                                <p>The last sentence of this sub-section states that &#x201c;the output of this function is similar to that used in the queryHaploreg&#x2026;..&#x201d;. However, the outputs of queryHaploreg() and queryRegulome() are not similar. The former is a data.frame, the latter is a list. Even when comparing the data.frame from queryHaploreg() with the first element (res.table) of queryRegulome(), we found different numbers of rows, columns and variables, and different data types (the first contains factors and the second characters). What are the similarities between them?</p>
                                            </list-item>
                                        </list> </p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>CONCLUSION AND FUTURE WORK:&#x00a0;There is no discussion of the use cases and the conclusions are weak. You should clearly state the advantages of using this package over the original databases. For example, you could mention the opportunity to generate a more streamlined workflow, shorter retrieval times, a shallow learning curve, etc.</p>
                    </list-item>
                    <list-item>
                        <p>SOFTWARE AND DATA AVAILABILITY</p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Licence:&#x00a0;It is unclear what license the authors use. The authors write GPL-2 | GPL-3, but it is not possible to use both at the same time.</p>
                                </list-item>
                                <list-item>
                                    <p>Author contributions:&#x00a0;The authors mention that IYZ performed evaluation and validation tests. We were expecting these tests to be provided as unit tests. They don&#x2019;t seem to be included in the source code. We suggest following best practice by integrating unit tests using the testthat framework and using Travis CI (https://travis-ci.org/) for continuous integration. Travis CI works with Unix-based systems; the authors could also test the package on Windows using the AppVeyor service (
                                        <ext-link ext-link-type="uri" xlink:href="https://www.appveyor.com/">https://www.appveyor.com/</ext-link>).</p>
                                </list-item>
                                <list-item>
                                    <p>DESCRIPTION file: 
                                        <list list-type="bullet">
                                            <list-item>
                                                <p>According to the manual &#x201c;Writing R extensions&#x201d;, the description should mention the role of the authors (
                                                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file">https://cran.r-project.org/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file</ext-link>).</p>
                                            </list-item>
                                            <list-item>
                                                <p>The Depends section shows R (&gt;= 3.3). This should be made consistent with the Operation section, in which the authors mention having used R 3.3.2.</p>
                                            </list-item>
                                        </list> </p>
                                </list-item>
                                <list-item>
                                    <p>NAMESPACE file: You seem to use only a few functions from the XML and httr packages, so we suggest importing them individually (using importFrom rather than import) to avoid masking.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Minor comments</bold> 
                <list list-type="order">
                    <list-item>
                        <p>ABSTRACT 
                            <list list-type="bullet">
                                <list-item>
                                    <p>First line of the abstract, &#x201c;There exists a set of web-based tools for integration and exploring information linked to annotated genetic variants&#x201d;. We think that this statement would be more appropriate for the introduction because it does not add any key information about the work carried out. The abstract could start with the second sentence, maybe something like, e.g. &#x201c;This paper presents haploR, a novel R-package ...&#x201d;</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>INTRODUCTION 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Second sentence of the fourth paragraph: &#x201c;The package &#x2026; downloads results in the form of a data frame or a file&#x201d;. Technically, a data frame can be saved in a file. Please consider rewording this sentence.</p>
                                </list-item>
                                <list-item>
                                    <p>The second and the third paragraph could be joined because the topics are strongly related.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Grant information:&#x00a0;In most research journals this section is called &#x201c;Acknowledgments&#x201d;.</p>
                    </list-item>
                </list>
            </p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment2693-19826">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Zhbannikov</surname>
                            <given-names>Ilya</given-names>
                        </name>
                        <aff>Duke University, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>4</day>
                    <month>5</month>
                    <year>2017</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank the reviewers for their careful reading of the manuscript, their testing of the package and their constructive remarks. We have taken the comments on board to improve and clarify the manuscript. Please find below a detailed point-by-point response to all comments (reviewers&#x2019; comments/suggestions (RC) are in italic font; our responses (AR) are in regular, black font). Unfortunately, because of the limited size of the article, we could not reflect all of the reviewers&#x2019; suggestions explicitly in the article, but we addressed them in the corresponding package vignette and on the package web site (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/izhbannikov/haploR">https://github.com/izhbannikov/haploR</ext-link>, README section).</p>
                <p>Major comments</p>
                <p>INTRODUCTION</p>
                <p>
                    <bold>RC1:</bold>
                </p>
                <p>
                    <italic>We think the introduction is rather vague.&#x00a0; There are several sentences such as &#x201c;in a number of situations&#x201d; or &#x201c;in a certain format&#x201d; which are too vague and require further explanations. For example, instead of saying &#x201c;in a certain format&#x201d;,&#x00a0; the authors could explicitly mention the formats that they are referring to (e.g. csv, json, etc).&#x00a0; Again, in the second sentence of the third paragraph &#x201c;... saving the results of such analyses in different file formats ...&#x201d;&#x00a0; the authors should again specify what the different file formats are.</italic>
                </p>
                <p>
                    <bold>AR1:</bold>
                </p>
                <p>We rewrote the Introduction section and explicitly mentioned file types. Please also see the package vignette for workflow examples.</p>
                <p>
                    <bold>RC2:</bold>
                </p>
                <p>
                    <italic>Just before the fourth paragraph, the authors should mention if this package could be added to one of the CRAN Task Views (https://cran.r-project.org/web/views/) and whether there are other packages with similar goals. If there are other related packages, it would be interesting to mention whether the data could be combined.&#x00a0;</italic>
                </p>
                <p>
                    <bold>AR2:</bold>
                </p>
                <p>We added information about other related packages to the Introduction section. haploR is not yet listed in any CRAN Task View, but we are working on adding it.</p>
                <p>METHODS</p>
                <p>
                    <bold>RC2:</bold>
                </p>
                <p>&#x00a0;
                    <italic>The second sentence of the sub-section Implementation says &#x201c;Functions&#x2026;.are designed to obtain data from the resources HaploReg...and RegulomeDB&#x2026;.&#x201d;. &#x00a0;Here, it is important to describe the structure of the retrieved data. We appreciate that most bioinformaticians are familiar with web-based databases such as HaploReg and RegulomeDB. &#x00a0;However, a student might want to use this tool and having a more detailed description of these web databases would be useful to get started. Please, also consider commenting on the use and interpretation of the retrieved information, &#x00a0;for example plotting a subset of the full dataset. The Operation section should include clear instructions for the installation and a complete description &#x00a0;of package dependencies, including versions of the dependent packages.</italic>
                </p>
                <p>
                    <bold>AR2:</bold>
                </p>
                <p>Because of the limited length of the article (1,000 words maximum), we provide the data description and installation instructions on the package website (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/izhbannikov/haploR">https://github.com/izhbannikov/haploR</ext-link>) and within the corresponding revised vignette (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/izhbannikov/haploR/blob/master/vignettes/haplor-vignette.Rmd">https://github.com/izhbannikov/haploR/blob/master/vignettes/haplor-vignette.Rmd</ext-link>, or simply browseVignettes("haploR")).</p>
                <p>USE CASES</p>
                <p>
                    <bold>RC3</bold>:</p>
                <p>
                    <italic>This section is rather vague. The authors should clearly describe all the input arguments of the functions, as well as the expected results. Querying HaploReg - Input vector of SNPs</italic>
                </p>
                <p>
                    <bold>AR3:</bold>
                </p>
                <p>Because of the limited size of the paper, we now describe the input parameters in the package vignette and on the website. We apologize for the inconvenience.</p>
                <p>
                    <bold>RC4:</bold>
                </p>
                <p>
                    <italic>When writing example code, it is considered good practice to assign the result of a command to an object, e.g. x &lt;- queryHaploreg(query=c("rs10048158","rs4791078")). Please consider making this change throughout the paper.</italic>
                </p>
                <p>
                    <bold>AR4:</bold>
                </p>
                <p>Thank you for pointing this out. This issue is fixed in the revised article: the results of all data retrieval commands are now assigned to objects.</p>
                <p>
                    <bold>RC5:</bold>
                </p>
                <p>
                    <italic>When we run the command x &lt;- queryHaploreg(query=c("rs10048158","rs4791078")) we get the following message: &#x201c;No encoding supplied: defaulting to UTF-8&#x201d;. Consider changing the encoding or removing non-ASCII characters from the table before outputting.</italic>
                </p>
                <p>
                    <bold>AR5:</bold>
                </p>
                <p>We fixed this warning in version 1.4.4 of the package. An encoding parameter was added to the queryHaploreg function; its default is UTF-8.</p>
                <p>
                    <bold>RC6:</bold>
                </p>
                <p>
                    <italic>After retrieving the data, please describe the structure of the retrieved object. In particular you should mention the expected number of columns and rows as well as the name and type of variables (the authors might find the str() function useful).</italic>
                </p>
                <p>
                    <bold>AR6:</bold>
                </p>
                <p>Because of the limited length of the article (no more than 1,000 words), we describe this in the corresponding vignette. Please see the sections 
                    <bold>Querying HaploReg</bold>, 
                    <bold>Querying RegulomeDB</bold> and their subsections 
                    <bold>Output</bold>.</p>
                <p>
                    <bold>RC7:</bold>
                </p>
                <p>
                    <italic>We tried to print the object, but the result filled the screen and was unreadable.&#x00a0; We suggest converting the data frame into a tibble (see the tibble package) to generate more readable printed output.</italic>
                </p>
                <p>
                    <bold>AR7:</bold>
                </p>
                <p>Thank you for this suggestion. We now use 
                    <italic>tibble</italic> to generate readable printed output.</p>
                <p>
                    <bold>RC8:</bold>
                </p>
                <p>
                    <italic>We checked the structure of the retrieved objects and the data types are all characters. Some of the columns clearly contain numeric variables (e.g. r2, D, AFR&#x2026;). We suggest converting these columns from character to numeric before outputting. This conversion is important because users might run into errors when generating basic statistics. For instance, running x &lt;- queryHaploreg(query=c("rs10048158","rs4791078"));</italic>
                </p>
                <p>
                    <italic>quantile(x$AFR) generates the following error message: &#x201c;Error in (1 - h) * qs[i] : non-numeric argument to binary operator&#x201d;.</italic>
                </p>
                <p>
                    <bold>AR8:</bold>
                </p>
                <p>This issue is fixed in the current version (1.4.4) of the package, available from CRAN. Thank you very much for pointing that out.</p>
                <p>
                    <bold>RC9:</bold>
                </p>
                <p>
                    <italic>Querying HaploReg - Input text file with SNPs: This example is reproducible but the authors do not specify how the &#x201c;extdata/snps.txt&#x201d; file is structured. We suggest writing something like &#x201c;the text file should list the rs-IDs in one column, with one rs-ID per row&#x201d;.</italic>
                </p>
                <p>
                    <bold>AR9:</bold>
                </p>
                <p>We moved this example to the package vignette and the package web page, where we describe the structure of extdata/snps.txt.</p>
                <p>
                    <bold>RC10:</bold>
                </p>
                <p>
                    <italic>Querying HaploReg - Using a particular study: When we extracted the list of studies, we noticed that we cannot subset it using names. Subsetting using indices is prone to errors because the list of studies could increase over time and their order could change.</italic>&#x00a0;</p>
                <p>
                    <bold>AR10:</bold>
                </p>
                <p>Thank you for emphasizing this important point. This issue is fixed in version 1.4.4 of the package.</p>
                <p>
                    <bold>RC11:</bold>
                </p>
                <p>
                    <italic>Querying RegulomeDB: Please explain what the argument format is. It is not obvious to non-experts.</italic>
                </p>
                <p>
                    <bold>AR11:</bold>
                </p>
                <p>We added details on the format argument. Please see the package web site README, subsection &#x201c;Arguments&#x201d; of the section &#x201c;Querying RegulomeDB&#x201d;.</p>
                <p>
                    <bold>RC12:</bold>
                </p>
                <p>
                    <italic>The last sentence of this sub-section &#x201c;the output of this function is similar to that used in the queryHaploreg&#x2026;..&#x201d; The outputs of queryHaploreg() and queryRegulome() are not similar. The former is a data.frame, the latter is a list. Even when comparing the data.frame from queryHaploreg() with the first element (res.table) of queryRegulome(), we found different numbers of rows, columns and variables, and different data types (the first contains factors and the second characters). What are the similarities between them?</italic>
                </p>
                <p>
                    <bold>AR12:</bold>
                </p>
                <p>Thank you for this useful remark. We agree that these formats are technically different and that the similarity lies only in the type of information retrieved.</p>
                <p>CONCLUSION AND FUTURE WORK:</p>
                <p>
                    <bold>RC13:</bold>
                </p>
                <p>
                    <italic>There is no discussion of the use cases and the conclusions are weak. You should clearly state the advantages of using this package over the original databases. For example, you could mention the opportunity to generate a more streamlined workflow, shorter retrieval times, a shallow learning curve, etc.</italic>
                </p>
                <p>
                    <bold>AR13:</bold>
                </p>
                <p>We rewrote the conclusion according to your suggestions.</p>
                <p>SOFTWARE AND DATA AVAILABILITY</p>
                <p>
                    <bold>RC14:</bold>
                </p>
                <p>
                    <italic>Licence: It is unclear what license the authors use. The authors write GPL-2 | GPL-3, but it is not possible to use both at the same time.</italic>
                </p>
                <p>
                    <bold>AR14</bold>:</p>
                <p>Thank you for this remark. The license was changed to GPL-3 in version 1.4.4 of the package.</p>
                <p>
                    <bold>RC15:</bold>
                </p>
                <p>
                    <italic>Author contributions: The authors mention that IYZ performed evaluation and validation tests. We were expecting these tests to be provided as unit tests, but they do not seem to be included in the source code. We suggest following best practice by integrating unit tests using the testthat framework and using Travis CI (https://travis-ci.org/) for continuous integration. Travis CI works with Unix-based systems; the authors could also test the package on Windows using the AppVeyor service (https://www.appveyor.com/).</italic>
                </p>
                <p>
                    <bold>AR15:</bold>
                </p>
                <p>We added unit tests to version 1.4.4 of the package.</p>
                <p>DESCRIPTION file:</p>
                <p>
                    <bold>RC16:</bold>
                </p>
                <p>
                    <italic>According to the manual &#x201c;Writing R extensions&#x201d;, the description should mention the role of the authors (https://cran.r-project.org/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file).</italic>
                </p>
                <p>
                    <bold>AR16:</bold>
                </p>
                <p>We updated the DESCRIPTION file, and it now describes the roles of the listed contributors.</p>
                <p>
                    <bold>RC15:</bold>
                </p>
                <p>
                    <italic>The Depends section shows R (&gt;= 3.3). This should be made consistent with the Operation section, in which the authors mention having used R 3.3.2.</italic>
                </p>
                <p>
                    <bold>AR15:</bold>
                </p>
                <p>We changed the Depends section to R (&gt;= 3.3.2).</p>
                <p>
                    <bold>RC16:</bold>
                </p>
                <p>
                    <italic>NAMESPACE file: You seem to use only a few functions from the XML and httr packages, so we suggest importing them individually (using importFrom rather than import) to avoid masking.</italic>
                </p>
                <p>
                    <bold>AR16:</bold>
                </p>
                <p>Thank you for this suggestion. We now import only the functions we need, using &#x201c;importFrom&#x201d; directives.</p>
                <p>Minor comments</p>
                <p>ABSTRACT</p>
                <p>
                    <bold>RC17:</bold>
                </p>
                <p>
                    <italic>First line of the abstract, &#x201c;There exists a set of web-based tools for integration and exploring information linked to annotated genetic variants&#x201d;. &#x00a0;We think that this statement would be more appropriate for the introduction because it does not add any key information about the work carried out. The abstract could start with the second sentence, maybe something like, e.g. &#x201c;This paper presents haploR, a novel R-package ...&#x201d;</italic>
                </p>
                <p>
                    <bold>AR17:</bold>
                </p>
                <p>Thank you for this helpful suggestion. We adapted the text accordingly.</p>
                <p>INTRODUCTION</p>
                <p>
                    <bold>RC18:</bold>
                </p>
                <p>
                    <italic>Second sentence of the fourth paragraph: &#x201c;The package &#x2026; downloads results in the form of a data frame or a file&#x201d;. &#x00a0;Technically, a data frame can be saved in a file. Please consider rewording this sentence.</italic>
                </p>
                <p>
                    <bold>AR18:</bold>
                </p>
                <p>We reworded this sentence to: "The package connects to the web site, queries the database and downloads results."</p>
                <p>
                    <bold>RC19:</bold>
                </p>
                <p>
                    <italic>The second and the third paragraph could be joined because the topics are strongly related.</italic>
                </p>
                <p>
                    <bold>AR19:</bold>
                </p>
                <p>We joined the first and second paragraphs.</p>
                <p>
                    <bold>RC20:</bold>
                </p>
                <p>
                    <italic>Grant information: In most research journals this section is called &#x201c;Acknowledgments&#x201d;.</italic>
                </p>
                <p>
                    <bold>AR20:</bold>
                </p>
                <p>We changed the &#x201c;Grant Information&#x201d; section name to &#x201c;Acknowledgments&#x201d;.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report19824">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.11583.r19824</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Dancik</surname>
                        <given-names>Garrett M.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r19824a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1391-8641</uri>
                </contrib>
                <aff id="r19824a1">
                    <label>1</label>Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>13</day>
                <month>2</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Dancik GM</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport19824" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.10742.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors describe an 
                <italic>R</italic> package named 
                <italic>haploR</italic> for querying the HaploReg and RegulomeDB web-based databases. Because querying can be carried out in 
                <italic>R</italic>, 
                <italic>haploR</italic> adds convenience for querying these databases when subsequent downstream analyses in 
                <italic>R</italic> are desired.&#x00a0;</p>
            <p> The 
                <italic>R</italic> package is easy to use and works as described. However, the potential application of 
                <italic>haploR</italic> is only vaguely described. The authors should include concrete examples of downstream analyses in order to demonstrate when 
                <italic>haploR</italic> would be preferred to traditional queries executed from the web.&#x00a0;</p>
            <p> In addition, addressing the following items would add clarity to the manuscript and the tool: 
                <list list-type="order">
                    <list-item>
                        <p>The authors should describe when the results returned by 
                            <italic>haploR</italic> differ from the web-based results. For example,&#x00a0;whereas the results table from querying HaploReg on the web may indicate that a particular variant has "4 altered motifs", providing links to the variant entry where the motifs are listed,&#x00a0;
                            <italic>haploR</italic>&#x00a0;directly returns the motifs present. This is an advantage of 
                            <italic>haploR</italic> that should be described.</p>
                    </list-item>
                    <list-item>
                        <p>There are several spelling and grammatical errors which make the manuscript difficult to follow in some parts. For example, the Introduction states that "Large projects...are devoted to bring together", instead of "bringing together".</p>
                    </list-item>
                </list>
            </p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment2691-19824">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Zhbannikov</surname>
                            <given-names>Ilya</given-names>
                        </name>
                        <aff>Duke University, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>4</day>
                    <month>5</month>
                    <year>2017</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We thank the reviewer for the insightful and thorough feedback. It was clear from those comments that our original paper did not emphasize clearly enough the unique contribution of the R package 
                    <italic>haploR</italic>. These comments helped us to revise the note and the package vignette to clarify several aspects of the data retrieval methodology used in the package. We revised the paper, and this revision addresses all of the reviewer&#x2019;s concerns. Reviewer comments/suggestions (RC)&#x00a0;are in italic font; the author&#x2019;s responses (AR) are in regular, black font.</p>
                <p>
                    <bold>RC1:</bold>
                </p>
                <p>
                    <italic>The R package is easy to use and works as described. However, the potential application of haploR is only vaguely described. The authors should include concrete examples of downstream analyses in order to demonstrate when haploR would be preferred to traditional queries executed from the web.</italic>
                </p>
                <p>
                    <bold>AR1:</bold>
                </p>
                <p>We provided corresponding examples in the package vignette and also on the package web page: 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/izhbannikov/haploR">https://github.com/izhbannikov/haploR</ext-link>. Please see the &#x201c;Motivation and typical analysis workflow&#x201d; section.</p>
                <p>
                    <bold>RC2:</bold>
                </p>
                <p>
                    <italic>In addition, addressing the following items would add clarity to the manuscript and the tool:</italic>
                </p>
                <p>
                    <italic>The authors should describe when the results returned by haploR differ from the web-based results. For example, whereas the results table from querying HaploReg on the web may indicate that a particular variant has "4 altered motifs", providing links to the variant entry where the motifs are listed, haploR directly returns the motifs present. This is an advantage of haploR that should be described.</italic>
                </p>
                <p>
                    <bold>AR2:</bold>
                </p>
                <p>Thank you for this useful suggestion. Because of the limited article size (no more than 1,000 words), we emphasized this advantage in the package vignette (please see the end of the &#x201c;One or several genetic variants&#x201d; subsection).</p>
                <p>
                    <bold>RC3:</bold>
                </p>
                <p>
                    <italic>There are several spelling and grammatical errors which make the manuscript difficult to follow in some parts. For example, the Introduction states that "Large projects...are devoted to bring together", instead of "bringing together".</italic>
                </p>
                <p>
                    <bold>AR3:</bold>
                </p>
                <p>We addressed these errors in the revised article.</p>
                <p>We are happy to make any other changes that may be required.</p>
                <p>Sincerely,</p>
                <p>Ilya Zhbannikov</p>
            </body>
        </sub-article>
    </sub-article>
</article>
