<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.17548.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Expanding the Orthologous Matrix (OMA) programmatic interfaces: REST API and the 
                    <italic>OmaDB</italic> packages for R and Python</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Kaleb</surname>
                        <given-names>Klara</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2091-2114</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Vesztrocy</surname>
                        <given-names>Alex Warwick</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4074-4261</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Altenhoff</surname>
                        <given-names>Adrian M.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7492-1273</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Dessimoz</surname>
                        <given-names>Christophe</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2170-853X</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a4">4</xref>
                    <xref ref-type="aff" rid="a5">5</xref>
                    <xref ref-type="aff" rid="a6">6</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Centre for Life&#x2019;s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK</aff>
                <aff id="a2">
                    <label>2</label>Swiss Institute of Bioinformatics, Lausanne, Switzerland</aff>
                <aff id="a3">
                    <label>3</label>Department of Computer Science, ETH Zurich, Zurich, Switzerland</aff>
                <aff id="a4">
                    <label>4</label>Department of Computer Science, University College London, London, WC1E 6BT, UK</aff>
                <aff id="a5">
                    <label>5</label>Department of Computational Biology, University of Lausanne, Lausanne, 1015, Switzerland</aff>
                <aff id="a6">
                    <label>6</label>Center for Integrative Genomics, University of Lausanne, Lausanne, 1015, Switzerland</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:Christophe.Dessimoz@unil.ch">Christophe.Dessimoz@unil.ch</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>3</month>
                <year>2019</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2019</year>
            </pub-date>
            <volume>8</volume>
            <elocation-id>42</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>22</day>
                    <month>3</month>
                    <year>2019</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Kaleb K et al.</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/8-42/pdf"/>
            <abstract>
                <p>The Orthologous Matrix (OMA) is a well-established resource to identify orthologs among many genomes. Here, we present two recent additions to its programmatic interface, namely a REST API, and user-friendly R and Python packages called 
                    <italic toggle="yes">OmaDB</italic>. These should further facilitate the incorporation of OMA data into computational scripts and pipelines. The REST API can be freely accessed at 
                    <ext-link ext-link-type="uri" xlink:href="https://omabrowser.org/api">https://omabrowser.org/api</ext-link>. The R OmaDB package is available as part of Bioconductor at 
                    <ext-link ext-link-type="uri" xlink:href="http://bioconductor.org/packages/OmaDB/">http://bioconductor.org/packages/OmaDB/</ext-link>, and the omadb Python package is available from the Python Package Index (PyPI) at 
                    <ext-link ext-link-type="uri" xlink:href="https://pypi.org/project/omadb/">https://pypi.org/project/omadb/</ext-link>.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>orthologs</kwd>
                <kwd>paralogs</kwd>
                <kwd>hierarchical orthologous groups</kwd>
                <kwd>comparative genomics</kwd>
                <kwd>orthologous matrix</kwd>
                <kwd>oma</kwd>
                <kwd>API</kwd>
                <kwd>R</kwd>
                <kwd>python</kwd>
                <kwd>REST</kwd>
                <kwd>bioconductor</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/501100000268">
                    <funding-source>Biotechnology and Biological Sciences Research Council</funding-source>
                    <award-id>BB/M015009/1</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/501100000765">
                    <funding-source>University College London</funding-source>
                </award-group>
                <award-group id="fund-3" xlink:href="http://dx.doi.org/10.13039/501100001711">
                    <funding-source>Schweizerischer Nationalfonds zur F&#x00f6;rderung der Wissenschaftlichen Forschung</funding-source>
                    <award-id>150654</award-id>
                </award-group>
                <award-group id="fund-4">
                    <funding-source>Swiss State Secretariat for Education</funding-source>
                </award-group>
                <funding-statement>We acknowledge support by Swiss National Science Foundation grant 150654, UK BBSRC grant BB/M015009/1, the Swiss State Secretariat for Education, Research and Innovation (SERI), as well as a UCL Genetics, Evolution and Environment Departmental Summer Bursary (to KK).</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>Version 2 of our manuscript addresses the points from the peer-reviewers, whom we thank for their constructive feedback. We clarified the installation procedure for the package, which is currently in the development version of Bioconductor, due to be released in Spring 2019. Furthermore, we corrected typos, improved the documentation, and clarified potential differences in the output of the code examples that can arise due to updates of the OMA database.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Orthologs are pairs of protein-coding genes that have common ancestry and have diverged due to speciation events
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>. The detection of orthologs is of fundamental importance in many fields of biology, such as comparative genomics, because it allows existing biological knowledge to be propagated to the ever-growing volume of newly sequenced data
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>,
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>.</p>
            <p>The Orthologous Matrix (OMA) is a method and resource for the inference of orthologs among complete genomes
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>. The OMA database (
                <ext-link ext-link-type="uri" xlink:href="https://omabrowser.org/">https://omabrowser.org</ext-link>) is broad in scope and size, currently covering over 2,100 species from all three domains of life.</p>
            <p>The OMA browser has supported multiple ways of exporting the underlying data from its beginning. Users can download data either via bulk archives or interactively through the browser&#x2014;using where possible standard file formats, such as FASTA, OrthoXML
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>, or PhyloXML
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>. For programmatic access, early OMA database releases offered an Application Programming Interface (API) in the form of the Simple Object Access Protocol (SOAP). However, the complexity and limited adoption of SOAP have prompted us to switch recently to the simpler, faster, and more widely used Representational State Transfer (REST) architectural style for the OMA API
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>. Here, we provide a description of this new OMA REST API.</p>
            <p>Furthermore, the R environment is widely used in bioinformatics due to its flexibility as a high-level scripting language, statistical capabilities, and numerous bioinformatics libraries. In particular, the Bioconductor open source framework contains over 2,000 packages to facilitate either access to or manipulation of biological data
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>. This motivated us to develop the OmaDB Bioconductor package, which is implemented on top of the REST API and provides more idiomatic and user-friendly access to OMA data in R.</p>
            <p>Finally, to enable Python users to interact easily with the database, we have developed a similar package in that language. It follows Python conventions and supports commonly used complementary Python packages, as outlined below.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <p>We start by describing the OMA REST API, before moving on to detail the OmaDB Bioconductor package, and finally outline the omadb Python package.</p>
            <sec>
                <title>OMA REST API</title>
                <p>REST is an API architectural style based on URLs and HTTP methods. It is designed to be stateless and thus context independent: the server does not retain data between HTTP requests, which minimises server-side application state and eases parallelism. The combination of HTTP and the JSON data format makes it particularly suitable for web applications and easy to support in most programming languages.</p>
                <p>Since the backend of the OMA browser is almost fully based on Python and its frontend is supported by the Django web framework
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>, we have opted to use the Django REST framework (DRF) to implement a REST API in our latest release
                    <sup>
                        <xref ref-type="bibr" rid="ref-4">4</xref>
                    </sup>. Most API calls require querying the OMA database, stored in HDF5
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>
                    </sup>, using a custom Python library (&#x201c;pyoma&#x201d;). The query results are serialised in the format requested by the user &#x2014; typically JSON.</p>
                <p>Most data available through the OMA browser is now also accessible via the API, with the exception of the local synteny data. This includes individual genes and their attributes such as protein or cDNA sequences, cross-references, pairwise orthologs, hierarchical orthologous groups
                    <sup>
                        <xref ref-type="bibr" rid="ref-10">10</xref>
                    </sup>, as well as species trees and the corresponding taxonomy. The API documentation as well as the interactive interface can be found at 
                    <ext-link ext-link-type="uri" xlink:href="https://omabrowser.org/api/docs">https://omabrowser.org/api/docs</ext-link> (
                    <xref ref-type="fig" rid="f1">Figure 1</xref>).</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Showcase of the OMA REST API documentation page, with an example of the interactive query and response.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/20395/0775066b-cd87-4518-84de-47c304c2611a_figure1.gif"/>
                </fig>
            </sec>
            <sec>
                <title>OmaDB Bioconductor package</title>
                <p>To facilitate simplified access to the API and downstream analyses in the R environment, we have also developed an API wrapper package in R, now available in Bioconductor
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup> (
                    <ext-link ext-link-type="uri" xlink:href="http://bioconductor.org/packages/OmaDB/">http://bioconductor.org/packages/OmaDB/</ext-link>). The package abstracts away the server interface, so users need not know the structure of the database or the URL endpoints to access the required data.</p>
                <p>The package consists of a collection of functions that import OMA data into R objects, the type of which depends on the query supplied. Because of the volume of data available, selected object attributes are initially given as URL endpoints; these are loaded automatically when first accessed. OmaDB also facilitates further downstream analyses with other Bioconductor packages, such as GO enrichment analysis with topGO
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup>, sequence analysis with Biostrings
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>, phylogenetic analyses using ggtree
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup> or gene locus analyses with the help of GenomicRanges
                    <sup>
                        <xref ref-type="bibr" rid="ref-14">14</xref>
                    </sup>.</p>
                <p>The open source code is hosted at 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/DessimozLab/OmaDB/">https://github.com/DessimozLab/OmaDB/</ext-link>. In the Results section, we showcase usage of the latest version of the package (v2.0), which requires R version &gt;= 3.6 and Bioconductor version &gt;= 3.9. Note that, as of the time of publication, this version is in the Bioconductor development version. For details, see the Software Availability section.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Package Installation</italic>
                    </bold>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">if (!requireNamespace(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">"BiocManager"</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">))</styled-content>
    
                        <styled-content style="font-size:15px;color:#333333;">install.packages(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">"BiocManager"</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">BiocManager::install(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">"OmaDB"</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Load the package</styled-content>

                        <styled-content style="font-size:15px;color:#333333; font-weight:bold">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(OmaDB)</styled-content>
                    </preformat>
                </p>
            </sec>
            <sec>
                <title>omadb Python package</title>
                <p>For Python users, we provide an analogous package named 
                    <italic toggle="yes">omadb</italic>. Results are supplied to users as a hybrid attribute-dictionary object, so both attribute-based and key-based access are possible. Where a response lists the URL of a further API call, the package requests it automatically on the user's behalf.</p>
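                <p>To illustrate the idea of a hybrid attribute-dictionary object with automatic follow-up requests, the following is a minimal sketch in plain Python. It is not the actual omadb implementation: the class name AttrDict, the constant API_PREFIX, the helper fake_fetch and the sample record (including its "related" link) are hypothetical and chosen only for illustration. The fetcher is injected, so the example runs without network access.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
API_PREFIX = "https://omabrowser.org/api/"


class AttrDict(dict):
    """Dictionary whose keys can also be read as attributes (illustrative only)."""

    def __init__(self, data, fetch=None):
        super().__init__(data)
        # `fetch` is a hypothetical callable: it takes a URL and returns parsed JSON.
        self._fetch = fetch

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        value = super().__getitem__(key)
        is_link = isinstance(value, str) and value.startswith(API_PREFIX)
        if is_link and self._fetch is not None:
            # Replace the link by the data it points to, on first access only.
            value = self._fetch(value)
        if isinstance(value, dict) and not isinstance(value, AttrDict):
            # Wrap nested dictionaries so that attribute access works at every level.
            value = AttrDict(value, self._fetch)
        super().__setitem__(key, value)  # cache the resolved value
        return value


# Mock fetcher and mock record, standing in for real API responses.
requested = []


def fake_fetch(url):
    requested.append(url)
    return {"members": ["A", "B"]}


entry = AttrDict(
    {
        "canonicalid": "TRFL_HUMAN",
        "locus": {"start": 1, "end": 10},
        "related": API_PREFIX + "hypothetical/related/",
    },
    fetch=fake_fetch,
)

print(entry.canonicalid, entry["canonicalid"])  # attribute and key access agree
print(entry.locus.start)                        # nested dictionaries are wrapped too
print(entry.related.members)                    # first access triggers fake_fetch
print(entry.related.members, len(requested))    # resolved value is cached
```
                    </preformat>
                </p>
                <p>In this sketch, keys that clash with built-in dictionary method names (for example "items") remain reachable only through key-based access, which is one reason to offer both access styles.</p>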
                <p>For data that can be represented as a table, the 
                    <italic toggle="yes">pandas</italic> package
                    <sup>
                        <xref ref-type="bibr" rid="ref-15">15</xref>
                    </sup> is supported. HOGs can be analysed or displayed using the 
                    <italic toggle="yes">pyham</italic> library
                    <sup>
                        <xref ref-type="bibr" rid="ref-16">16</xref>
                    </sup>. Trees are retrievable as 
                    <italic toggle="yes">DendroPy</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup> or 
                    <italic toggle="yes">ETE3</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup> Tree objects. Gene Ontology enrichment analyses are possible through the use of the 
                    <italic toggle="yes">goatools</italic> package
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>.</p>
                <p>The open source code is hosted at 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/DessimozLab/pyomadb/">https://github.com/DessimozLab/pyomadb/</ext-link>. The package requires Python &gt;= 3.6, as well as a stable internet connection. It is also available from PyPI and can be installed using pip.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Package Installation</italic>
                    </bold>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Install in shell, using pip</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">$ pip install omadb</styled-content>


                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># In Python, load the package</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">&gt;&gt;&gt;</styled-content> 
                        <styled-content style="font-size:15px;color:#333333; font-weight:bold;">from</styled-content> 
                        <styled-content style="font-size:15px;color:#333333;">omadb import Client</styled-content>

                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Initialise the client</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">&gt;&gt;&gt; c = Client()</styled-content>
                    </preformat>
                </p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>We provide six illustrative examples in R. The first shows a direct call to the REST API, while the other five showcase the OmaDB R package (version 2.0). These examples are also available as a Jupyter notebook
                <sup>
                    <xref ref-type="bibr" rid="ref-20">20</xref>
                </sup> as part of the OmaDB R code repository. We have also provided analogous examples in Python, also in the form of a Jupyter notebook, included in its code repository&#x2014;with the exception of Example 6, which uses a package only available in R.</p>
            <p>Note that the results of the queries using the API and the packages may change as we continue to update the OMA database. The OMA database release of June 2018 was used to generate the examples below. </p>
            <sec>
                <title>Example 1 - Simply accessing the API, in R, via URLs</title>
                <p>One way to access the API is to directly send a request using httr
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup> in R. This approach requires the user to know the base URL of the API as well as the path of the endpoint of interest. Some additional processing of the response is usually needed. A simple example that retrieves information on the P53_RAT protein is provided below.</p>
                <p>Here we first formulate our URL of interest and use it to send a GET request to the API. This gives us the response JSON object, which can then be parsed into an R list.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333; font-weight:bold;">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(httr)</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">url &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">"https://omabrowser.org/api/protein/P53_RAT/"</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">response &lt;- GET(url)</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">response_content_list &lt;- httr::content(response, as =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">"parsed"</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
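                <p>For comparison, a similar direct request can be sketched in Python using only the standard library. The helper names protein_url and get_json are illustrative and are not part of the omadb package; the URL pattern is the one used in the R example above, and the request assumes that the API returns JSON for this endpoint.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

API_ROOT = "https://omabrowser.org/api"


def protein_url(entry_id):
    """Build the URL of the protein endpoint for one entry ID."""
    return "{}/protein/{}/".format(API_ROOT, quote(entry_id))


def get_json(url, timeout=30):
    """Send a GET request and parse the JSON body into Python objects."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access.
print(protein_url("P53_RAT"))
```
                    </preformat>
                </p>
                <p>Calling get_json(protein_url("P53_RAT")) would then send the request; as with the R example, the fields of the parsed response depend on the OMA database release.</p>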
            </sec>
            <sec>
                <title>Example 2 - Using a sequence to find its gene family (Hierarchical Orthologous Group) and function via gene ontologies</title>
                <p>Below is a simple workflow using the OmaDB package to annotate a given protein sequence, using the mapSequence() function.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333; font-weight:bold;">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(OmaDB)</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">sequence &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'MKLVFLVLLFLGALGLCLAGRRRSVQWCAVSQPEATKCFQWQRNMRKVRGPPVSCIKRD
SPIQCIQAIAENRADAVTLDGGFIYEAGLAPYKLRPVAAEVYGTERQPRTHYYAVAVVKKGGSFQLNELQGL
KSCHTGLRRTAGWNVPIGTLRPFLNWTGPPEPIEAAVARFFSASCVPGADKGQFPNLCRLCAGTGENKCAFS
SQEPYFSYSGAFKCLRDGAGDVAFIRESTVFEDLSDEAERDEYELLCPDNTRKPVDKFKDCHLARVPSHAVV
ARSVNGKEDAIWNLLRQAQEKFGKDKSPKFQLFGSPSGQKDLLFKDSAIGFSRVPPRIDSGLYLGSGYFTAI
QNLRKSEEEVAARRARVVWCAVGEQELRKCNQWSGLSEGSVTCSSASTTEDCIALVLKGEADAMSLDGGYVY
TAGKCGLVPVLAENYKSQQSSDPDPNCVDRPVEGYLAVAVVRRSDTSLTWNSVKGKKSCHTAVDRTAGWNIP
MGLLFNQTGSCKFDEYFSQSCAPGSDPRSNLCALCIGDEQGENKCVPNSNERYYGYTGAFRCLAENAGDVAF
VKDVTVLQNTDGNNNEAWAKDLKLADFALLCLDGKRKPVTEARSCHLAMAPNHAVVSRMDKVERLKQVLLHQ
QAKFGRNGSDCPDKFCLFQSETKNLLFNDNTECLARLHGKTTYEKYLGPQYVAGITNLKKCSTSPLLEACEF
LRK'</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">seq_annotation &lt;- mapSequence(sequence)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">length(seq_annotation$targets)</styled-content>    
                        <styled-content style="font-size:15px;color:#999988;">                           
                            <italic toggle="yes"># 1</italic>
                        </styled-content>
                    </preformat>
                </p>
                <p>The identified targets can be found in seq_annotation$targets. As the length of this object attribute is 1, the sequence mapping in this example identified a single target sequence. Further information can be obtained from this object as follows:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">seq_annotation$targets[[1]]$canonicalid</styled-content>            
                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># 'TRFL_HUMAN'</styled-content>
                    </preformat>
                </p>
                <p>Thus, our sequence is human lactotransferrin (also known as lactoferrin). Lactotransferrin is one of four subfamilies of transferrins in mammals
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>.</p>
                <p>To investigate the evolutionary history of genes more precisely, we turn to Hierarchical Orthologous Groups (HOGs)&#x2014;sets of genes which have descended from a single common ancestral gene within a taxonomic range of interest
                    <sup>
                        <xref ref-type="bibr" rid="ref-10">10</xref>
                    </sup>. For an introduction to HOGs, we refer the interested reader to the following short video: 
                    <ext-link ext-link-type="uri" xlink:href="https://youtu.be/5p5x5gxzhZA">https://youtu.be/5p5x5gxzhZA</ext-link>.</p>
                <p>By knowing the ID of the HOG to which our sequence belongs, we can obtain a list of all the HOG members (i.e. all genes in the HOG), as follows:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">hog_id &lt;- seq_annotation$targets[[</styled-content>
                        <styled-content style="font-size:15px;color:#008080;">1</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">]]$oma_hog_id</styled-content>   
                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># 'HOG:0413862.1a.1b'</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">hog &lt;- getHOG(id = hog_id, members = </styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TRUE, level =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Mammalia'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">hog$members</styled-content>
                    </preformat>
                </p>
                <p>Note that getHOG() accepts either the ID of a HOG or the ID of one of its member proteins. The call below therefore produces the same output.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">hog &lt;- getHOG(id =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'TRFL_HUMAN'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, members = </styled-content>
                        <styled-content style="font-size:15px;color:#000000;">TRUE, level =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Mammalia'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
                <p>We can easily retrieve the Gene Ontology (GO) terms
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> that are associated with each of the members using OmaDB:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">go_annotations &lt;- getProtein(hog$members$omaid,</styled-content> 
    
                        <styled-content style="font-size:15px;color:#333333;">attribute =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'gene_ontology'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
                <p>The resultant list of GO terms per gene is in the &#x201c;geneID2GO&#x201d; format by default, which is used by the topGO
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup> package.</p>
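                <p>To make the &#x201c;geneID2GO&#x201d; layout concrete, here is a minimal base R sketch. It assumes, as we understand the topGO convention, that the format is a named list mapping each gene identifier to a character vector of GO IDs. The gene names (geneA, geneB) are hypothetical, and the assignment of GO IDs to them is purely illustrative; it is not output from OMA.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;"># Hypothetical genes; the GO IDs are only used to illustrate the layout
example_geneID2GO &lt;- list(
    geneA = c("GO:0001501", "GO:0001503"),
    geneB = c("GO:0001816")
)

# Number of GO terms annotated to each gene
print(lengths(example_geneID2GO))

# Reverse view: the genes annotated with each GO term
example_GO2geneID &lt;- split(
    rep(names(example_geneID2GO), lengths(example_geneID2GO)),
    unlist(example_geneID2GO, use.names = FALSE)
)
print(example_GO2geneID)</styled-content>
                    </preformat>
                </p>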
                <p>To compare the function of lactotransferrins with their paralogous counterparts, we can retrieve a background set consisting of all members of the transferrin HOG defined at the root of the eukaryotes:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">bgHOG &lt;- getHOG(id =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'TRFL_HUMAN'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, members = TRUE, level =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Eukaryota'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">bgAnnnot &lt;- getProtein(bgHOG$members$omaid, attribute =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'gene_ontology'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
                <p>We can now construct a topGO object using the getTopGO function as seen below. Note that the background set of terms is set by getTopGO to all terms appearing in the list of annotations. This may not be appropriate in all cases&#x2014;the choice of background set requires careful consideration
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">bgAnnnotFormatted = formatTopGO(bgAnnnot, format =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'geneID2GO'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>


                        <styled-content style="font-size:15px;color:#333333; font-weight:bold;">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(topGO)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">myGO &lt;- getTopGO(annotations = bgAnnnotFormatted, format =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'geneID2GO'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">,</styled-content>  
     
                        <styled-content style="font-size:15px;color:#333333;">foregroundGenes = hog$members$entry_nr, ontology =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'BP'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">myRes &lt;- runTest(myGO, algorithm =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'classic'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, statistic =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'fisher'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">print(GenTable(myGO, myRes))</styled-content>
                    </preformat>
                </p>
                <p>As the output in 
                    <xref ref-type="table" rid="T1">Table 1</xref> indicates, several terms enriched in mammalian lactotransferrins relate to bone formation, consistent with previous reports in the literature (e.g. 
                    <xref ref-type="bibr" rid="ref-25">25</xref>). Other enriched terms relate to the immune system and the response to bacteria, consistent with the reported antimicrobial activity of lactotransferrin (e.g. 
                    <xref ref-type="bibr" rid="ref-26">26</xref>).</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Gene Ontology enrichment of Biological Process terms associated with mammalian lactotransferrins compared to all eukaryotic transferrins, as obtained from example 2.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">GO.ID</th>
                                <th align="left" colspan="1" rowspan="1">Term</th>
                                <th align="left" colspan="1" rowspan="1">P-value</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001501</td>
                                <td align="left" colspan="1" rowspan="1">skeletal system development</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001503</td>
                                <td align="left" colspan="1" rowspan="1">ossification</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001649</td>
                                <td align="left" colspan="1" rowspan="1">osteoblast differentiation</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001816</td>
                                <td align="left" colspan="1" rowspan="1">cytokine production</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001817</td>
                                <td align="left" colspan="1" rowspan="1">regulation of cytokine production</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0001818</td>
                                <td align="left" colspan="1" rowspan="1">negative regulation of cytokine production</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0002237</td>
                                <td align="left" colspan="1" rowspan="1">response to molecule of bacterial origin</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0002682</td>
                                <td align="left" colspan="1" rowspan="1">regulation of immune system process</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0002683</td>
                                <td align="left" colspan="1" rowspan="1">negative regulation of immune system process</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GO:0002761</td>
                                <td align="left" colspan="1" rowspan="1">regulation of myeloid leukocyte differentiation</td>
                                <td align="left" colspan="1" rowspan="1">&lt;1e-30</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
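                <p>For intuition about what the &#x2018;fisher&#x2019; statistic used in this example evaluates for a single GO term, the following minimal sketch applies Fisher's exact test to a 2x2 contingency table in base R. It assumes that over-representation is tested one-sidedly. All counts are hypothetical placeholders chosen for illustration; they are not derived from OMA data or from Table 1.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;"># Hypothetical counts for one GO term (illustration only, not OMA data)
fg_with_term &lt;- 8    # foreground genes annotated with the term
fg_total     &lt;- 10   # all foreground genes
bg_with_term &lt;- 12   # background genes with the term (foreground included)
bg_total     &lt;- 100  # all background genes (foreground included)

# Genes outside the foreground, with and without the term
rest_with_term    &lt;- bg_with_term - fg_with_term
rest_without_term &lt;- (bg_total - fg_total) - rest_with_term

# 2x2 contingency table: foreground/rest versus annotated/not annotated
counts &lt;- matrix(
    c(fg_with_term, fg_total - fg_with_term,
      rest_with_term, rest_without_term),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("foreground", "rest"),
                    c("annotated", "not_annotated"))
)

# One-sided test: is the term over-represented in the foreground?
res &lt;- fisher.test(counts, alternative = "greater")
print(res$p.value)</styled-content>
                    </preformat>
                </p>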
            </sec>
            <sec>
                <title>Example 3 - Taxonomic tree visualisation</title>
                <p>The taxonomic data obtained using the OmaDB package can easily be plugged into ggtree
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup> for phylogenetic tree visualisation. First, the tree is obtained using the getTaxonomy() function. In this example, the tree is rooted at the Hominoidea taxonomic level. By default, the tree is returned in Newick format.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">tax &lt;- getTaxonomy(root =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Hominoidea'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
                <p>The resultant object can directly be used to build a phylogenetic tree using the ggtree package as below:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333; font-weight:bold">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(ggtree)
tree &lt;- getTree(tax$newick)

mytree &lt;- ggtree(tree)</styled-content> </preformat>
                </p>
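                <p>As a brief illustration of the Newick format itself, the sketch below parses a hand-written Newick string with read.tree() from the ape package. This rests on two assumptions: ape is installed, and the species in the string are illustrative only and are not the output of getTaxonomy(). The last step shows that gsub() is vectorised, so tip labels can be reformatted without an explicit loop.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">library(ape)

# Hand-written Newick string (illustrative; not returned by the OMA API)
newick &lt;- "((Homo_sapiens,Pan_troglodytes),Pongo_abelii);"

# Parse the string into a 'phylo' object
example_tree &lt;- read.tree(text = newick)

# Replace underscores in all tip labels at once
tip_names &lt;- gsub("_", " ", example_tree$tip.label, fixed = TRUE)
print(tip_names)</styled-content>
                    </preformat>
                </p>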
                <p>The tree can be further annotated using species silhouettes from PhyloPic (
                    <ext-link ext-link-type="uri" xlink:href="http://phylopic.org/">http://phylopic.org/</ext-link>). This functionality is already enabled within the ggtree package and just requires obtaining the relevant image codes. The workflow to produce 
                    <xref ref-type="fig" rid="f2">Figure 2</xref> is below.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333; font-weight:bold">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(rphylopic)

labels &lt;- tree$tip.label

labelsFormatted &lt;- sapply(labels, FUN = function(x)
           gsub(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">"_"</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">" "</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, x, fixed = TRUE))

ids &lt;- sapply(labelsFormatted, FUN = function(x)         
           name_search(x)$canonicalName[1,1])

images &lt;- sapply( as.character(ids), FUN = function(x)  
           tryCatch(name_images(x)$same[[1]]$uid, error = 
           function(w) name_images(x)$supertaxa[[1]]$uid) )

d &lt;- data.frame(label = labels, images = as.character(images))</styled-content>


                        <styled-content style="font-size:15px;color:#333333; font-weight:bold">library</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">(dplyr)
library(ggimage)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">mytree %&lt;+% d + geom_tiplab(aes(image = images), geom =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'phylopic'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, 
    offset = 2.3, color =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'steelblue'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">) + geom_tiplab(offset = 0.3) +
    ggplot2::xlim(0, 7)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Species taxonomy tree obtained using example 3.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/20395/0775066b-cd87-4518-84de-47c304c2611a_figure2.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Example 4 - Visualising the distribution of PAM distances in the taxonomic space</title>
                <p>To obtain all orthologous pairs between two genomes, we can use the getGenomePairs() function. To limit server load, the resultant response is paginated and by default only returns the first page, capped at 100 entries. This is easily adjustable by setting the &#x2018;per_page&#x2019; parameter to either the number of orthologs required or simply to &#x2018;all&#x2019;.</p>
                <p>In this example, we compare the distribution of PAM (point accepted mutations; 
                    <xref ref-type="bibr" rid="ref-27">27</xref>) distances between orthologs of two species pairs, namely human-dog and human-mouse. First, we request the required data:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">mouse_id = getGenome(id=</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'Mus musculus'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)$taxon_id
human_id = getGenome(id=</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'Homo sapiens'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)$taxon_id
dog_id = getGenome(id=</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'Canis lupus familiaris'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)$taxon_id

human_mouse &lt;- getGenomePairs(genome_id1 = human_id, 
    genome_id2 = mouse_id, rel_type =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'1:1'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)

human_dog &lt;- getGenomePairs(genome_id1 = human_id, 
    genome_id2 = dog_id, rel_type =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'1:1'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
                <p>We can then bind the two resultant data frames and plot the results (
                    <xref ref-type="fig" rid="f3">Figure 3</xref>), as follows:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">human_mouse$Species &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Mus musculus'
</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">human_dog$Species &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Canis lupus familiaris'
</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">
all_pairs &lt;- rbind(human_mouse, human_dog)
all_pairs$Species &lt;- as.factor(all_pairs$Species)</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">library(ggplot2)

g &lt;- ggplot(all_pairs, aes(x =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">distance</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, fill = Species)) +
    geom_density(alpha = 0.5) + 
    xlab(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'evolutionary distance [PAM]'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">) +
    theme(legend.position =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'bottom'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">, panel.grid.major = 
    element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), axis.line = element_line(colour 
    =</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'black'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">))
print(g)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Distribution of evolutionary distances (in PAM units; 27) for human-dog (red) and human-mouse (blue) orthologous pairs, obtained using example 4.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/20395/0775066b-cd87-4518-84de-47c304c2611a_figure3.gif"/>
                </fig>
                <p>The two-sample Kolmogorov-Smirnov test can be performed on the two distributions, using the command:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">ks.test(human_dog$distance, human_mouse$distance)</styled-content>
                    </preformat>
                </p>
                <p>This returns a p-value &lt; 2.2e-16, indicating that the two distributions differ significantly. The median distance between dog and human is shorter than that between mouse and human (8.8 
                    <italic toggle="yes">vs.</italic> 11.8 PAM units). This is consistent with previous observations that rodents have a longer branch than humans and carnivores, in part due to their shorter generation time
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>.</p>
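                <p>Per-species medians such as those quoted above can be computed with tapply(). The sketch below demonstrates the computation, together with the two-sample Kolmogorov-Smirnov test, on simulated distances so that it runs without querying OMA. The data frame sim_pairs and its gamma-distributed values are assumptions made for illustration and do not reproduce the reported figures.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">set.seed(1)  # make the simulated example reproducible

# Simulated PAM distances (illustration only; not real OMA data)
sim_pairs &lt;- data.frame(
    distance = c(rgamma(500, shape = 2, rate = 0.15),
                 rgamma(500, shape = 2, rate = 0.20)),
    Species  = rep(c("Mus musculus", "Canis lupus familiaris"), each = 500)
)

# Median distance per species
medians &lt;- tapply(sim_pairs$distance, sim_pairs$Species, median)
print(medians)

# Two-sample Kolmogorov-Smirnov test on the simulated distributions
is_mouse &lt;- sim_pairs$Species == "Mus musculus"
ks &lt;- ks.test(sim_pairs$distance[is_mouse], sim_pairs$distance[!is_mouse])
print(ks$p.value)</styled-content>
                    </preformat>
                </p>
                <p>With the data frame built in this example, the equivalent call would be tapply(all_pairs$distance, all_pairs$Species, median).</p>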
            </sec>
            <sec>
                <title>Example 5 - Annotating protein sequences not present in OMA</title>
                <p>Although the OMA database currently analyses over 2,100 genomes, many more have been sequenced, and the gap keeps on widening. It is nevertheless possible to use OMA to infer the function of custom protein sequences through a fast approximate search against all sequences in OMA
                    <sup>
                        <xref ref-type="bibr" rid="ref-4">4</xref>
                    </sup>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Our mystery sequence is cystic fibrosis transmembrane conductance
# regulator in the Emperor penguin (UniProt ID: A0A087RGQ1_APTFO)</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">
mysterySeq &lt;-</styled-content>

                        <styled-content style="font-size:15px;color:#DD1144;">'FFFLLRWTKPILRKGYRRRLELSDIYQIPSADSADNLSEKLEREWDRELATSKKKPKLINALRRCFFWKFM
FYGIILYLGEVTKSVQPLLLGRIIASYDPDNSDERSIAYYLAIGLCLLFLVRTLLIHPAIFGLHHIGMQMRI
AMFSLIYKKILKLSSRVLDKISTGQLVSLLSNNLNKFDEGLALAHFVWIAPLQVALLMGLLWDMLEASAFSG
LAFLIVLAFFQAWLGQRMMKYRNKRAGKINERLVITSEIIENIQSVKAYCWEDAMEKMIESIRETELKLTRK
AAYVRYFNSSAFFFSGFFVVFLAVLPYAVIKGIILRKIFTTISFCIVLRMTVTRQFPGSVQTWYDSIGAINK
IQDFLLKKEYKSLEYNLTTTGVELDKVTAFWDEGIGELFVKANQENNNSKAPSTDNNLFFSNFPLHASPVLQ
DINFKIEKGQLLAVSGSTGAGKTSLLMLIMGELEPSQGRLKHSGRISFSPQVSWIMPGTIKENIIFGVSYDE
YRYKSVIKACQLEEDISKFPDKDYTVLGDGGIILSGGQRARISLARAVYKDADLYLLDSPFGHLDIFTEKEI
FESCVCKLMANKTRILVTSKLEHLKIADKILILHEGSCYFYGTFSELQGQRPDFSSELMGFDSFDQFSAERR
NSILTETLRRFSIEGEGTGSRNEIKKQSFKQTSDFNDKRKNSIIINPLNASRKFSVVQRNGMQVNGIEDGHN
DPPERRFSLVPDLEQGDVGLLRSSMLNTDHILQGRRRQSVLNLMTGTSVNYGPNFSKKGSTTFRKMSMVPQT
NLSSEIDIYTRRLSRDSVLDITDEINEEDLKECFTDDAESMGTVTTWNTYFRYVTIHKNLIFVLILCVTVFL
VEVAASLAGLWFLKQTALKANTTQSENSTSDKPPVIVTVTSSYYIIYIYVGVADTLLAMGIFRGLPLVHTLI
TVSKTLHQKMVHAVLHAPMSTFNSWKAGGMLNRFSKDTAVLDDLLPLTVFDFIQLILIVIGAITVVSILQPY
IFLASVPVIAAFILLRAYFLHTSQQLKQLESEARSPIFTHLVTSLKGLWTLRAFGRQPYFETLFHKALNLHT
ANWFLYLSTLRWFQMRIEMIFVVFFVAVAFISIVTTGDGSGKVGIILTLAMNIMGTLQWAVNSSIDVDSLMR
SVGRIFKFIDMPTEEMKNIKPHKNNQFSDALVIENRHAKEEKNWPSGGQMTVKDLTAKYSEGGAAVLENISF
SISSGQRVGLLGRTGSGKSTLLFAFLRLLNTEGDIQIDGVSWSTVSVQQWRKAFGVIPQKVFIFSGTFRMNL
DPYGQWNDEEIWKVAEEVGLKSVIEQFPGQLDFVLVDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHL
DPVTSQVIRKTLKHAFANCTVILSEHRLEAMLECQRFLVIEDNKLRQYESIQKLLNEKSSFRQAISHADRLK
LLPVHHRNSSKRKPRPKITALQEETEEEVQETRL'</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">myAnnotations &lt;- annotateSequence(mysterySeq)</styled-content>
                    </preformat>
                </p>
                <p>This results in 54 GO annotations. By comparison, this sequence has merely 15 GO annotations in UniProt-GOA
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> &#x2014; all of which are also predicted by this method in OMA.</p>
            </sec>
            <sec>
                <title>Example 6 - Combining OmaDB with BgeeDB for gene expression</title>
                <p>We go back to the lactotransferrin gene family from Example 2. We can use OmaDB in conjunction with the BgeeDB Bioconductor package
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup> to retrieve expression data from the Bgee database
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup> as follows.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">BiocManager::install("BgeeDB")
library(BgeeDB)</styled-content>


                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Bgee uses Ensembl gene IDs, obtainable using OmaDB&#x2019;s cross-references.</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">trfl_xrefs &lt;- getProtein(id='TRFL_HUMAN')$xref</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">trfl_ens_id &lt;- subset(trfl_xrefs, source ==</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'Ensembl Gene'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)$xref</styled-content>

                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># The Ensembl gene ID must not carry a version suffix, so strip it</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">trfl_ens_id &lt;- strsplit(trfl_ens_id,'.',fixed=TRUE)[[1]][1]</styled-content>


                        <styled-content style="font-size:15px;color:#333333;">my_stage &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#DD1144;">'UBERON:0034920'</styled-content> 
                        <styled-content style="font-size:15px;color:#999988; font-style:italic"># Infant stage</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">bgee.expr &lt;- Bgee$new(species=</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'Homo_sapiens'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">expr.data &lt;- loadTopAnatData(bgee.expr, stage = my_stage)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">gene.expr.tissue.ids &lt;- 
    unlist(expr.data$gene2anatomy[trfl_ens_id], use.names = F)</styled-content>

                        <styled-content style="font-size:15px;color:#333333;">tissues &lt;- expr.data$organ.names
print(tissues[tissues$ID %in% gene.expr.tissue.ids, ])</styled-content>
</preformat>
                </p>
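                <p>The strsplit() call above processes a single identifier. When several versioned identifiers need to be handled at once, one vectorised alternative is sub() with a regular expression, sketched below in base R. The identifiers are made-up placeholders used only to show the pattern; they are not real Ensembl gene IDs.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;"># Made-up versioned identifiers (placeholders, not real Ensembl gene IDs)
versioned_ids &lt;- c("ENSG99999999901.12", "ENSG99999999902.3",
                   "ENSG99999999903")

# Remove a trailing ".digits" version suffix; IDs without one are unchanged
plain_ids &lt;- sub("\\.[0-9]+$", "", versioned_ids)
print(plain_ids)</styled-content>
                    </preformat>
                </p>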
                <p>Among the tissues in which lactotransferrin is expressed according to Bgee (
                    <xref ref-type="table" rid="T2">Table 2</xref>), we note the bone marrow and the palpebral conjunctiva (the eyelid inner surface). This is consistent with the aforementioned involvement of lactotransferrin in bone formation and antimicrobial activity.</p>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Human tissues in which lactotransferrin is expressed at the infant stage, according to the Bgee database version 14 (output of Example 6).</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">ID</th>
                                <th align="left" colspan="1" rowspan="1">Name</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">UBERON:0001812</td>
                                <td align="left" colspan="1" rowspan="1">palpebral conjunctiva</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">UBERON:0000178</td>
                                <td align="left" colspan="1" rowspan="1">blood</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">UBERON:0002371</td>
                                <td align="left" colspan="1" rowspan="1">bone marrow</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">UBERON:0001154</td>
                                <td align="left" colspan="1" rowspan="1">vermiform appendix</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">UBERON:0002084</td>
                                <td align="left" colspan="1" rowspan="1">heart left ventricle</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>Further tutorials on the OmaDB package can be found in the accompanying vignettes:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#333333;">browseVignettes(</styled-content>
                        <styled-content style="font-size:15px;color:#DD1144;">'OmaDB'</styled-content>
                        <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                    </preformat>
                </p>
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion and outlook</title>
            <p>Orthology is used for various purposes, such as species tree inference, the study of gene evolution dynamics, and protein function prediction. The retrieval of orthologs is thus typically just the starting point of a larger analysis. This overhaul and expansion of the OMA programmatic interfaces should therefore facilitate the incorporation of OMA data into such analyses.</p>
            <p>Our R package will continue to be maintained in line with the biannual Bioconductor releases. Planned improvements include better performance. For example, each response is currently loaded in full into an R object of choice, which, depending on the response size, may introduce some delay. We will also continue to update the package and the API in sync with the OMA browser to incorporate new OMA functionality.</p>
            <p>Likewise, we will maintain and further develop the Python package. In particular, we will explore the possibility of further integration with the Biopython library
                <sup>
                    <xref ref-type="bibr" rid="ref-32">32</xref>
                </sup>.</p>
            <p>More generally, in OMA we will keep supporting the various ways of accessing the underlying data, including the interactive web browser and flat files in a variety of formats. The REST API is also complemented by a new SPARQL interface that enables highly specific queries, as well as federated queries over multiple resources
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>. However, the SPARQL query language is more complex to use.</p>
            <p>We very much welcome feedback and questions from the community. We also highly appreciate contributions to the code in the form of pull requests. Our preferred channel for support is the BioStar website
                <sup>
                    <xref ref-type="bibr" rid="ref-33">33</xref>
                </sup>, where we monitor all posts with the keyword &#x201c;oma&#x201d;.</p>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>Please note that this manuscript uses version 2.0 of the OmaDB R package, which is in the 
                <bold>development version</bold> of Bioconductor (v.3.9). Until the release of Bioconductor v.3.9 in Spring 2019, there are two possible ways of installing it:</p>
            <list list-type="bullet">
                <list-item>
                    <label>1) </label>
                    <p>Install the development version of R (v.3.6) &#x2014; required for Bioconductor v.3.9 &#x2014; and install OmaDB using the command:</p>
                    <p>
                        <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                            <styled-content style="font-size:15px;color:#333333;">BiocManager::install(</styled-content>
                            <styled-content style="font-size:15px;color:#DD1144;">'OmaDB'</styled-content>
                            <styled-content style="font-size:15px;color:#333333;">, version =</styled-content> 
                            <styled-content style="font-size:15px;color:#DD1144;">'devel'</styled-content>
                            <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                            <styled-content style="font-size:15px;color:#333333; font-style:italic">&#x2013;or&#x2013;</styled-content>
                        </preformat>
                    </p>
                </list-item>
                <list-item>
                    <label>2) </label>
                    <p>Install OmaDB 2.0 directly from the GitHub repository using the devtools R package:</p>
                    <p>
                        <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                            <styled-content style="font-size:15px;color:#333333;">install.packages(</styled-content>
                            <styled-content style="font-size:15px;color:#DD1144;">'devtools'</styled-content>
                            <styled-content style="font-size:15px;color:#333333;">)</styled-content>

                            <styled-content style="font-size:15px;color:#333333; font-weight:bold;">library</styled-content>
                            <styled-content style="font-size:15px;color:#333333;">(devtools)</styled-content>

                            <styled-content style="font-size:15px;color:#333333;">install_github(</styled-content>
                            <styled-content style="font-size:15px;color:#DD1144;">'dessimozlab/omadb'</styled-content>
                            <styled-content style="font-size:15px;color:#333333;">)</styled-content>
                        </preformat>
                    </p>
                </list-item>
            </list>
            <p>REST API available from: 
                <ext-link ext-link-type="uri" xlink:href="https://omabrowser.org/api">https://omabrowser.org/api</ext-link>
            </p>
            <p>Documentation available from: 
                <ext-link ext-link-type="uri" xlink:href="https://omabrowser.org/api/docs">https://omabrowser.org/api/docs</ext-link>
            </p>
            <p>R OmaDB package available from: 
                <ext-link ext-link-type="uri" xlink:href="http://bioconductor.org/packages/OmaDB/">http://bioconductor.org/packages/OmaDB/</ext-link>
            </p>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/DessimozLab/OmaDB/">https://github.com/DessimozLab/OmaDB/</ext-link>
            </p>
            <p>Archived source code as at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.2595086">http://doi.org/10.5281/zenodo.2595086</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-34">34</xref>
                </sup>
            </p>
            <p>License: GPL-2</p>
            <p>omadb Python package available from: 
                <ext-link ext-link-type="uri" xlink:href="https://pypi.org/project/omadb/">https://pypi.org/project/omadb/</ext-link>
            </p>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/DessimozLab/pyomadb/">https://github.com/DessimozLab/pyomadb/</ext-link>
            </p>
            <p>Archived source code as at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.2530250">http://doi.org/10.5281/zenodo.2530250</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-35">35</xref>
                </sup>
            </p>
            <p>License: GPL-3</p>
            <p>We also provide a Binder notebook that reproduces in Python the analyses done in R. This is available from: 
                <ext-link ext-link-type="uri" xlink:href="https://mybinder.org/v2/gh/DessimozLab/pyomadb/master?filepath=examples%2Fpyomadb-examples.ipynb">https://mybinder.org/v2/gh/DessimozLab/pyomadb/master?filepath=examples%2Fpyomadb-examples.ipynb</ext-link>
            </p>
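            <p>As a minimal sketch of how the REST API listed above might be queried without either package, the following Python snippet builds request URLs from the base address given in this section and decodes the JSON response. The helper names (build_url, fetch_json, lookup_protein) are hypothetical, and the protein endpoint path and the example identifier are assumptions that should be checked against the API documentation before use.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import json
import urllib.parse
import urllib.request

# Base URL as listed in this section. The endpoint paths used below are
# assumptions and should be verified against https://omabrowser.org/api/docs.
OMA_API_BASE = "https://omabrowser.org/api"


def build_url(*parts, **params):
    """Join path segments onto the API base and append optional query parameters."""
    path = "/".join(urllib.parse.quote(str(p), safe="") for p in parts)
    url = f"{OMA_API_BASE}/{path}/"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body (requires network access)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def lookup_protein(entry_id):
    """Hypothetical helper: fetch the record of one protein entry.

    Assumes a 'protein/ENTRY_ID/' endpoint; nothing is requested until
    this function is called.
    """
    return fetch_json(build_url("protein", entry_id))
```
                </preformat>
            </p>
            <p>For instance, build_url("protein", "P53_HUMAN") only assembles the address, whereas lookup_protein("P53_HUMAN") would send the request. Keeping URL construction separate from the network call makes it easy to inspect a query before running it.</p>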
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We thank Natasha Glover for helpful feedback on the manuscript, and Fr&#x00e9;d&#x00e9;ric Bastian for help with the example involving BgeeDB.</p>
        </ack>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Fitch</surname>
                            <given-names>WM</given-names>
                        </name>
			</person-group>:
                    <article-title>Distinguishing homologous from analogous proteins.</article-title>
                    <source>
				
                        <italic toggle="yes">Syst Zool.</italic>
			</source>
                    <year>1970</year>;<volume>19</volume>(<issue>2</issue>):<fpage>99</fpage>&#x2013;<lpage>113</lpage>.
                    <pub-id pub-id-type="pmid">5449325</pub-id>
                    <pub-id pub-id-type="doi">10.2307/2412448</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Sonnhammer</surname>
                            <given-names>EL</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gabald&#x00f3;n</surname>
                            <given-names>T</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Sousa da Silva</surname>
                            <given-names>AW</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Big data and other challenges in the quest for orthologs.</article-title>
                    <source>
				
                        <italic toggle="yes">Bioinformatics.</italic>
			</source>
                    <year>2014</year>;<volume>30</volume>(<issue>21</issue>):<fpage>2993</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">25064571</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btu492</pub-id>
                    <pub-id pub-id-type="pmcid">4201156</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Forslund</surname>
                            <given-names>K</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Pereira</surname>
                            <given-names>C</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Capella-Gutierrez</surname>
                            <given-names>S</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Gearing up to handle the mosaic nature of life in the quest for orthologs.</article-title>
                    <source>
				
                        <italic toggle="yes">Bioinformatics.</italic>
			</source>
                    <year>2018</year>;<volume>34</volume>(<issue>2</issue>):<fpage>323</fpage>&#x2013;<lpage>329</lpage>.
                    <pub-id pub-id-type="pmid">28968857</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btx542</pub-id>
                    <pub-id pub-id-type="pmcid">5860199</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Altenhoff</surname>
                            <given-names>AM</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Glover</surname>
                            <given-names>NM</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Train</surname>
                            <given-names>CM</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces.</article-title>
                    <source>
				
                        <italic toggle="yes">Nucleic Acids Res.</italic>
			</source>
                    <year>2018</year>;<volume>46</volume>(<issue>D1</issue>):<fpage>D477</fpage>&#x2013;<lpage>85</lpage>.
                    <pub-id pub-id-type="pmid">29106550</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkx1019</pub-id>
                    <pub-id pub-id-type="pmcid">5753216</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Schmitt</surname>
                            <given-names>T</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Messina</surname>
                            <given-names>DN</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Schreiber</surname>
                            <given-names>F</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Letter to the editor: SeqXML and OrthoXML: standards for sequence and orthology information.</article-title>
                    <source>
				
                        <italic toggle="yes">Brief Bioinform.</italic>
			</source>
                    <year>2011</year>;<volume>12</volume>(<issue>5</issue>):<fpage>485</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">21666252</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bib/bbr025</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Han</surname>
                            <given-names>MV</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Zmasek</surname>
                            <given-names>CM</given-names>
                        </name>
			</person-group>:
                    <article-title>phyloXML: XML for evolutionary biology and comparative genomics.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Bioinformatics.</italic>
			</source>
                    <year>2009</year>;<volume>10</volume>:<fpage>356</fpage>.
                    <pub-id pub-id-type="pmid">19860910</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-10-356</pub-id>
                    <pub-id pub-id-type="pmcid">2774328</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Carey</surname>
                            <given-names>VJ</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gentleman</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Orchestrating high-throughput genomic analysis with Bioconductor.</article-title>
                    <source>
				
                        <italic toggle="yes">Nat Methods.</italic>
			</source>
                    <year>2015</year>;<volume>12</volume>(<issue>2</issue>):<fpage>115</fpage>&#x2013;<lpage>21</lpage>.
                    <pub-id pub-id-type="pmid">25633503</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3252</pub-id>
                    <pub-id pub-id-type="pmcid">4509590</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <article-title>Django Software Foundation</article-title>. Django. [cited 2018].
                    <ext-link ext-link-type="uri" xlink:href="http://djangoproject.com">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Folk</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Heber</surname>
                            <given-names>G</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Koziol</surname>
                            <given-names>Q</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>An overview of the HDF5 technology suite and its applications.</article-title>
                    <source>
				
                        <italic toggle="yes">Proceedings of the EDBT.</italic>
			</source>
                    <year>2011</year>.
                    <pub-id pub-id-type="doi">10.1145/1966895.1966900</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Altenhoff</surname>
                            <given-names>AM</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gil</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gonnet</surname>
                            <given-names>GH</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Inferring hierarchical orthologous groups from orthologous gene pairs.</article-title>
                    <source>
				
                        <italic toggle="yes">PLoS One.</italic>
			</source>
                    <year>2013</year>;<volume>8</volume>(<issue>1</issue>):<fpage>e53786</fpage>.
                    <pub-id pub-id-type="pmid">23342000</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0053786</pub-id>
                    <pub-id pub-id-type="pmcid">3544860</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Alexa</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Rahnenfuhrer</surname>
                            <given-names>J</given-names>
                        </name>
			</person-group>:
                    <article-title>topGO: Enrichment analysis for Gene Ontology.</article-title> R package version 2.28.0.
                    <italic toggle="yes">Bioconductor</italic>. <year>2016</year>.
                    <pub-id pub-id-type="doi">10.18129/B9.bioc.topGO</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Pag&#x00e8;s</surname>
                            <given-names>H</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Aboyoun</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gentleman</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Biostrings: Efficient manipulation of biological strings</article-title>. R Package Version. <year>2017</year>;<volume>2</volume>(<issue>0</issue>).
                    <ext-link ext-link-type="uri" xlink:href="https://rdrr.io/bioc/Biostrings/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Yu</surname>
                            <given-names>G</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Smith</surname>
                            <given-names>DK</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Zhu</surname>
                            <given-names>H</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data.</article-title> McInerny G, editor.
                    <source>
				
                        <italic toggle="yes">Methods Ecol Evol.</italic>
			</source>
                    <year>2017</year>;<volume>8</volume>(<issue>1</issue>):<fpage>28</fpage>&#x2013;<lpage>36</lpage>.
                    <pub-id pub-id-type="doi">10.1111/2041-210X.12628</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Lawrence</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Pag&#x00e8;s</surname>
                            <given-names>H</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Software for computing and annotating genomic ranges.</article-title>
                    <source>
				
                        <italic toggle="yes">PLoS Comput Biol.</italic>
			</source>
                    <year>2013</year>;<volume>9</volume>(<issue>8</issue>):<fpage>e1003118</fpage>.
                    <pub-id pub-id-type="pmid">23950696</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003118</pub-id>
                    <pub-id pub-id-type="pmcid">3738458</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>McKinney</surname>
                            <given-names>W</given-names>
                        </name>
			</person-group>:
                    <article-title>pandas: a foundational Python library for data analysis and statistics.</article-title>
                    <source>
				
                        <italic toggle="yes">Python for High Performance and Scientific Computing.</italic>
			</source>
                    <year>2011</year>;<fpage>1</fpage>&#x2013;<lpage>9</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.dlr.de/sc/Portaldata/15/Resources/dokumente/pyhpc2011/submissions/pyhpc2011_submission_9.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Train</surname>
                            <given-names>CM</given-names>
                        </name>
					
                        <name name-style="western">
                            <surname>Pignatelli</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Altenhoff</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>iHam &amp; pyHam: visualizing and processing hierarchical orthologous groups.</article-title>
                    <source>
				
                        <italic toggle="yes">Bioinformatics.</italic>
			</source>
                    <year>2018</year>.
                    <pub-id pub-id-type="pmid">30508066</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bty994</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Sukumaran</surname>
                            <given-names>J</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Holder</surname>
                            <given-names>MT</given-names>
                        </name>
			</person-group>:
                    <article-title>DendroPy: a Python library for phylogenetic computing.</article-title>
                    <source>
				
                        <italic toggle="yes">Bioinformatics.</italic>
			</source>
                    <year>2010</year>;<volume>26</volume>(<issue>12</issue>):<fpage>1569</fpage>&#x2013;<lpage>71</lpage>.
                    <pub-id pub-id-type="pmid">20421198</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btq228</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Huerta-Cepas</surname>
                            <given-names>J</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Serra</surname>
                            <given-names>F</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Bork</surname>
                            <given-names>P</given-names>
                        </name>
			</person-group>:
                    <article-title>ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data.</article-title>
                    <source>
				
                        <italic toggle="yes">Mol Biol Evol.</italic>
			</source>
                    <year>2016</year>;<volume>33</volume>(<issue>6</issue>):<fpage>1635</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">26921390</pub-id>
                    <pub-id pub-id-type="doi">10.1093/molbev/msw046</pub-id>
                    <pub-id pub-id-type="pmcid">4868116</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Klopfenstein</surname>
                            <given-names>DV</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>L</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Pedersen</surname>
                            <given-names>BS</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>GOATOOLS: A Python library for Gene Ontology analyses.</article-title>
                    <source>
				
                        <italic toggle="yes">Sci Rep.</italic>
			</source>
                    <year>2018</year>;<volume>8</volume>(<issue>1</issue>):<fpage>10872</fpage>.
                    <pub-id pub-id-type="pmid">30022098</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41598-018-28948-z</pub-id>
                    <pub-id pub-id-type="pmcid">6052049</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Kluyver</surname>
                            <given-names>T</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Ragan-Kelley</surname>
                            <given-names>B</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>P&#x00e9;rez</surname>
                            <given-names>F</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Jupyter Notebooks &#x2013; a publishing format for reproducible computational workflows.</article-title>In:
                    <italic toggle="yes">ELPUB</italic>.<year>2016</year>;<fpage>87</fpage>&#x2013;<lpage>90</lpage>.
                    <pub-id pub-id-type="doi">10.3233/978-1-61499-649-1-87</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Wickham</surname>
                            <given-names>H</given-names>
                        </name>
			</person-group>:
                    <article-title>httr: Tools for Working with URLs and HTTP</article-title>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/httr/httr.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Lambert</surname>
                            <given-names>LA</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Perri</surname>
                            <given-names>H</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Meehan</surname>
                            <given-names>TJ</given-names>
                        </name>
			</person-group>:
                    <article-title>Evolution of duplications in the transferrin family of proteins.</article-title>
                    <source>
				
                        <italic toggle="yes">Comp Biochem Physiol B Biochem Mol Biol.</italic>
			</source>
                    <year>2005</year>;<volume>140</volume>(<issue>1</issue>):<fpage>11</fpage>&#x2013;<lpage>25</lpage>.
                    <pub-id pub-id-type="pmid">15621505</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cbpc.2004.09.012</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Ashburner</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Ball</surname>
                            <given-names>CA</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Blake</surname>
                            <given-names>JA</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.</article-title>
                    <source>
				
                        <italic toggle="yes">Nat Genet.</italic>
			</source>
                    <year>2000</year>;<volume>25</volume>(<issue>1</issue>):<fpage>25</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">10802651</pub-id>
                    <pub-id pub-id-type="doi">10.1038/75556</pub-id>
                    <pub-id pub-id-type="pmcid">3037419</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Gaudet</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Dessimoz</surname>
                            <given-names>C</given-names>
                        </name>
			</person-group>:
                    <article-title>Gene Ontology: Pitfalls, Biases, and Remedies.</article-title>
                    <source>
				
                        <italic toggle="yes">Methods Mol Biol.</italic>
				</source>In: Dessimoz C, &#x0160;kunca N, editors.
                    <italic toggle="yes">The Gene Ontology Handbook</italic>. New York, NY: Springer New York;<year>2017</year>;<volume>1446</volume>:<fpage>189</fpage>&#x2013;<lpage>205</lpage>.
                    <pub-id pub-id-type="pmid">27812944</pub-id>
                    <pub-id pub-id-type="doi">10.1007/978-1-4939-3743-1_14</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Naot</surname>
                            <given-names>D</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Grey</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Reid</surname>
                            <given-names>IR</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Lactoferrin--a novel bone growth factor.</article-title>
                    <source>
				
                        <italic toggle="yes">Clin Med Res.</italic>
			</source>
                    <year>2005</year>;<volume>3</volume>(<issue>2</issue>):<fpage>93</fpage>&#x2013;<lpage>101</lpage>.
                    <pub-id pub-id-type="pmid">16012127</pub-id>
                    <pub-id pub-id-type="doi">10.3121/cmr.3.2.93</pub-id>
                    <pub-id pub-id-type="pmcid">1183439</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Orsi</surname>
                            <given-names>N</given-names>
                        </name>
			</person-group>:
                    <article-title>The antimicrobial activity of lactoferrin: current status and perspectives.</article-title>
                    <source>
				
                        <italic toggle="yes">Biometals.</italic>
			</source>
                    <year>2004</year>;<volume>17</volume>(<issue>3</issue>):<fpage>189</fpage>&#x2013;<lpage>96</lpage>.
                    <pub-id pub-id-type="pmid">15222464</pub-id>
                    <pub-id pub-id-type="doi">10.1023/B:BIOM.0000027691.86757.e2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Dayhoff</surname>
                            <given-names>MO</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Schwartz</surname>
                            <given-names>RM</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Orcutt</surname>
                            <given-names>BC</given-names>
                        </name>
			</person-group>:
                    <article-title>A model of evolutionary change in proteins.</article-title>In:
                    <italic toggle="yes">Atlas of Protein Sequence and Structure</italic>.<year>1978</year>;<fpage>345</fpage>&#x2013;<lpage>52</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="http://chagall.med.cornell.edu/BioinfoCourse/PDFs/Lecture2/Dayhoff1978.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Easteal</surname>
                            <given-names>S</given-names>
                        </name>
			</person-group>:
                    <article-title>Generation time and the rate of molecular evolution.</article-title>
                    <source>
				
                        <italic toggle="yes">Mol Biol Evol.</italic>
			</source>
                    <year>1985</year>;<volume>2</volume>(<issue>5</issue>):<fpage>450</fpage>&#x2013;<lpage>3</lpage>.
                    <pub-id pub-id-type="pmid">3870871</pub-id>
                    <pub-id pub-id-type="doi">10.1093/oxfordjournals.molbev.a040361</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Huntley</surname>
                            <given-names>RP</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Sawford</surname>
                            <given-names>T</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Mutowo-Meullenet</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>The GOA database: gene Ontology annotation updates for 2015.</article-title>
                    <source>
				
                        <italic toggle="yes">Nucleic Acids Res.</italic>
			</source>
                    <year>2015</year>;<volume>43</volume>(<issue>Database issue</issue>):<fpage>D1057</fpage>&#x2013;<lpage>63</lpage>.
                    <pub-id pub-id-type="pmid">25378336</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gku1113</pub-id>
                    <pub-id pub-id-type="pmcid">4383930</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Komljenovic</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Roux</surname>
                            <given-names>J</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Wollbrett</surname>
                            <given-names>J</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>BgeeDB, an R package for retrieval of curated expression datasets and for gene list expression localization enrichment tests [version 2; referees: 2 approved, 1 approved with reservations].</article-title>
                    <source>
				
                        <italic toggle="yes">F1000Res.</italic>
			</source>
                    <year>2016</year>;<volume>5</volume>:<fpage>2748</fpage>.
                    <pub-id pub-id-type="pmid">30467516</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.9973.2</pub-id>
                    <pub-id pub-id-type="pmcid">6113886</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Bastian</surname>
                            <given-names>F</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Parmentier</surname>
                            <given-names>G</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Roux</surname>
                            <given-names>J</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Bgee: Integrating and Comparing Heterogeneous Transcriptome Data Among Species.</article-title>In:
                    <italic toggle="yes">Data Integration in the Life Sciences</italic>. (Lecture Notes in Computer Science). Springer Berlin Heidelberg.<year>2008</year>;<fpage>124</fpage>&#x2013;<lpage>31</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-3-540-69828-9_12</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Cock</surname>
                            <given-names>PJ</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Antao</surname>
                            <given-names>T</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Chang</surname>
                            <given-names>JT</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Biopython: freely available Python tools for computational molecular biology and bioinformatics.</article-title>
                    <source>
				
                        <italic toggle="yes">Bioinformatics.</italic>
			</source>
                    <year>2009</year>;<volume>25</volume>(<issue>11</issue>):<fpage>1422</fpage>&#x2013;<lpage>3</lpage>.
                    <pub-id pub-id-type="pmid">19304878</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btp163</pub-id>
                    <pub-id pub-id-type="pmcid">2682512</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Parnell</surname>
                            <given-names>LD</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Lindenbaum</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Shameer</surname>
                            <given-names>K</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>BioStar: an online question &amp; answer resource for the bioinformatics community.</article-title>
                    <source>
				
                        <italic toggle="yes">PLoS Comput Biol.</italic>
			</source>
                    <year>2011</year>;<volume>7</volume>(<issue>10</issue>):<fpage>e1002216</fpage>.
                    <pub-id pub-id-type="pmid">22046109</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002216</pub-id>
                    <pub-id pub-id-type="pmcid">3203049</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>klarakaleb</surname>
                        </name>
					
                        <name name-style="western">
                            <surname>Altenhoff</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>bioc-gitadmin</surname>
                        </name>
						
                        <etal/>
			</person-group>:
                    <article-title>DessimozLab/OmaDB: v1.99.1 (Version 1.99.2).</article-title>
                    <source>
				
                        <italic toggle="yes">Zenodo.</italic>
			</source>
                    <year>2019</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.2595086">http://www.doi.org/10.5281/zenodo.2595086</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Warwick Vesztrocy</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Altenhoff</surname>
                            <given-names>A</given-names>
                        </name>
			</person-group>:
                    <article-title>DessimozLab/pyomadb: v2.0.0 (Version 2.0.0).</article-title>
                    <source>
				
                        <italic toggle="yes">Zenodo.</italic>
			</source>
                    <year>2019</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.2530250">http://www.doi.org/10.5281/zenodo.2530250</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report46493">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.20395.r46493</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gatto</surname>
                        <given-names>Laurent</given-names>
                    </name>
                    <xref ref-type="aff" rid="r46493a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1520-2268</uri>
                </contrib>
                <aff id="r46493a1">
                    <label>1</label>University of Louvain (UCLouvain), Brussels, Belgium</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>12</day>
                <month>4</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Gatto L</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport46493" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.17548.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I thank the authors for their detailed reply, which carefully addresses all my points and suggestions.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Computational biology and bioinformatics, research software, reproducible research, omics.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report46494">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.20395.r46494</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Greshake Tzovaras</surname>
                        <given-names>Bastian</given-names>
                    </name>
                    <xref ref-type="aff" rid="r46494a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-9925-9623</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Tran</surname>
                        <given-names>Ngoc-Vinh</given-names>
                    </name>
                    <xref ref-type="aff" rid="r46494a2">2</xref>
                    <role>Co-referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6772-7595</uri>
                </contrib>
                <aff id="r46494a1">
                    <label>1</label>Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory&#x00a0;(LBNL), Berkeley, CA, USA</aff>
                <aff id="r46494a2">
                    <label>2</label>Department for Applied Bioinformatics, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt am Main, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>1</day>
                <month>4</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Greshake Tzovaras B and Tran NV</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport46494" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.17548.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have successfully addressed all prior comments.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>bioinformatics, evolutionary biology</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report42908">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.19190.r42908</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Gatto</surname>
                        <given-names>Laurent</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42908a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1520-2268</uri>
                </contrib>
                <aff id="r42908a1">
                    <label>1</label>University of Louvain (UCLouvain), Brussels, Belgium</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>11</day>
                <month>2</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Gatto L</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport42908" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.17548.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Kaleb et al. present two packages, namely R/Bioconductor OmaDB and Python omadb, that allow users to query and use data from the Orthologous Matrix database. The article is well written and the authors provide six examples that convincingly demonstrate the usefulness and reach of their work.</p>
            <p>I have a couple of comments and suggestions below, presented in chronological order. The only serious one is a request for the authors to describe the outputs of their examples in more detail (see below), to facilitate adoption by users who are not familiar with R.
                <list list-type="bullet">
                    <list-item>
                        <p>In https://omabrowser.org/api/docs, the pagination example has a typo. The genomes should be replaced with genome:</p>
                    </list-item>
                </list> ```</p>
            <p> $ curl -I "https://omabrowser.org/api/genomes/?page=2"</p>
            <p> HTTP/1.1 404 Not Found</p>
            <p> Server: nginx</p>
            <p> Date: Mon, 11 Feb 2019 06:04:30 GMT</p>
            <p> Content-Type: text/html; charset=utf-8</p>
            <p> Connection: keep-alive</p>
            <p> X-Frame-Options: SAMEORIGIN</p>
            <p> Vary: Cookie</p>
            <p> Set-Cookie: __utmmobile=d41d8cd98f00b204e9800998ecf8427e; expires=Wed, 10-Feb-2021 06:04:30 UTC; Path=/</p>
            <p> Set-Cookie: sessionid=9zb42ljib7apkubml1e1t742i5p6f3a6; expires=Mon, 25-Feb-2019 06:04:30 GMT; HttpOnly; Max-Age=1209600; Path=/</p>
            <p> </p>
            <p> $ curl -I "https://omabrowser.org/api/genome/?page=2"</p>
            <p> HTTP/1.1 200 OK</p>
            <p> Server: nginx</p>
            <p> Date: Mon, 11 Feb 2019 06:04:32 GMT</p>
            <p> Content-Type: application/json</p>
            <p> Connection: keep-alive</p>
            <p> Link: ; rel="first", ; rel="prev", ; rel="next", ; rel="last"</p>
            <p> X-Total-Count: 2198</p>
            <p> Vary: Accept, Cookie</p>
            <p> Allow: GET, HEAD, OPTIONS</p>
            <p> X-Frame-Options: SAMEORIGIN</p>
            <p> Set-Cookie: __utmmobile=d41d8cd98f00b204e9800998ecf8427e; expires=Wed, 10-Feb-2021 06:04:32 UTC; Path=/</p>
            <p> Set-Cookie: sessionid=9h5n3ouuwvh4ock9dz3q4yiprmb8iw0d; expires=Mon, 25-Feb-2019 06:04:32 GMT; HttpOnly; Max-Age=1209600; Path=/</p>
            <p> Strict-Transport-Security: max-age=15768000</p>
            <p> ``` 
                <list list-type="bullet">
                    <list-item>
                        <p>In the introduction, the authors explain that 'Most data available through the OMA browser is now accessible via the API'. I think it would be useful to know what data isn't available and whether the browser and REST API would ever be equivalent in terms of data served. This might be partly addressed later, in the discussion, where the authors mention 'support for local synteny'. Some additional details would be useful to redirect users to the appropriate interface. Similarly, it would be useful to know if the R and python packages provide access to the same data, or if differences also exist there.</p>
                    </list-item>
                    <list-item>
                        <p>I didn't see mention of the R and python packages on the OmaDB web page. This would be a useful addition for visitors.</p>
                    </list-item>
                    <list-item>
                        <p>In the Bioconductor package section, the authors explain that data is provided in 'R friendly objects, namely S3 objects and data frames'. I would suggest rephrasing this to refer only to objects, as S4 objects are also returned and the details of the class system are probably not necessary in the context of this document.</p>
                    </list-item>
                    <list-item>
                        <p>Regarding the R package, I would suggest adding URL and BugReports fields to the package's DESCRIPTION file. This helps users find the GitHub repository and report issues. I also noted that in the 'getting started' vignette, some sections appear to be missing a space after the section markup. I have sent a pull request fixing these and some other minor issues.</p>
                    </list-item>
                    <list-item>
                        <p>Note that the HTML and R versions of the vignette shouldn't be included in the package source.</p>
                    </list-item>
                    <list-item>
                        <p>In the Python package section, the authors mention that this package is 
                            <italic>also</italic> named 'omadb'. I would argue that the packages have different names, as programming languages are case sensitive, and suggest dropping the 'also' to avoid any confusion.</p>
                    </list-item>
                    <list-item>
                        <p>In the first sentence of the Results section, the authors should replace 'R library' with 'R package', as they are referring to their package, not the location where the package is installed (the library).</p>
                    </list-item>
                    <list-item>
                        <p>In general, it would be very useful for the authors to describe the different outputs they obtain. I am not expecting the authors to provide full details of the REST API responses, but describing how the results match the text would be important. For example, in example 1, they only show how to produce the `response_content_list` response. Here, it would be useful to explain that this R list directly maps the REST JSON message, and to point to the specific documentation entry point. Such an explanation motivates the example in the text and helps users who aren't familiar with REST understand the relation between the server and the package.</p>
                    </list-item>
                    <list-item>
                        <p>Similarly in example 2, the authors create the `seq_annotation` variable and mention that only one target sequence was identified. Here, it would be useful to show that `length(seq_annotation$targets)` is equal to 1, to back their claim, to indicate how users can verify the number of targets, and motivate the use of the first list index in later code chunks.</p>
                    </list-item>
                    <list-item>
                        <p>Still in example 2, the authors query and extract the hog members. These data are however already present in the first output of that example, under `seq_annotation$targets[[1]]$oma_hog_members`. It would be useful to explain why the authors send a second query to obtain that data and clarify whether `oma_hog_members` is always equivalent to calling `getHOG` and `getProtein`.</p>
                    </list-item>
                    <list-item>
                        <p>When trying to reproduce the code, I first failed to run the code chunks calling `getProtein`. Later, the authors clarify the software requirements in more detail. It would however be useful to briefly mention, early on in the Results section, what version was used for the examples.</p>
                    </list-item>
                    <list-item>
                        <p>In example 5, I would suggest updating to the new function name, as `getAnnotations` is expected to be deprecated in the next release, especially as the new version of the package is required anyway for the `getProtein` function.</p>
                    </list-item>
                </list> ```</p>
            <p> &gt; myAnnotations &lt;- getAnnotation(mysterySeq)</p>
            <p> Warning message:</p>
            <p> 'getAnnotation' is deprecated.</p>
            <p> Use 'annotateSequence' instead.</p>
            <p> See help("Deprecated")</p>
            <p> ``` 
                <list list-type="bullet">
                    <list-item>
                        <p>Another example where an explanation of the output is important is example 5. The authors call `myAnnotation &lt;- getAnnotations(mysterySeq)` and then refer to 54 GO annotation results. In repeating their analysis, I obtain a data frame with 55 observations (see below). It is thus unclear whether I have a different result, whether one observation should be dropped, or whether my output is completely wrong (was I even expecting a data frame?).</p>
                    </list-item>
                </list> </p>
            <p> ```</p>
            <p> &gt; dim(myAnnotations)</p>
            <p> [1] 55 13</p>
            <p> ``` 
                <list list-type="bullet">
                    <list-item>
                        <p>In general, given that the package accesses an online repository that is (or can be) updated regularly, results may change, which would also explain why I obtained different results.</p>
                    </list-item>
                </list>
            </p>
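            <p>The pagination headers quoted in the first comment can also be checked programmatically. The following is a minimal Python sketch, not part of the OmaDB or omadb packages: it parses a raw header block such as the one returned by the successful request above and estimates the number of pages from X-Total-Count. The helper names parse_head_response and page_count and the page size of 100 are assumptions made for illustration; the actual page size should be taken from the API documentation.</p>
            <preformat>
```python
import math


def parse_head_response(raw):
    """Split a raw HTTP HEAD response into (status_code, headers)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    # The status line looks like "HTTP/1.1 200 OK".
    status_code = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive; a repeated header keeps its last value.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def page_count(total_items, per_page):
    """Number of pages needed to serve total_items at per_page items each."""
    return math.ceil(total_items / per_page)


# A subset of the headers from the successful request quoted in the report.
RAW = """
HTTP/1.1 200 OK
Content-Type: application/json
X-Total-Count: 2198
"""

status, headers = parse_head_response(RAW)
total = int(headers["x-total-count"])
# per_page=100 is an assumed page size, not taken from the API documentation.
pages = page_count(total, per_page=100)
print(status, total, pages)
```
            </preformat>
            <p>Under the assumed page size, a status other than 200 or a missing X-Total-Count header would indicate that the wrong endpoint was queried, as in the 404 response shown for the misspelled URL.</p>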
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Computational biology and bioinformatics, research software, reproducible research, omics.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment4500-42908">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Dessimoz</surname>
                            <given-names>Christophe</given-names>
                        </name>
                        <aff>University of Lausanne, Switzerland</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>19</day>
                    <month>3</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>In https://omabrowser.org/api/docs, the pagination example has a typo. The genomes should be replaced with </italic>
                    <italic>genome</italic>
                    <italic>.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: Thanks for spotting this typo. It has been fixed.</p>
                <p> </p>
                <p> 
                    <italic>In the introduction, the authors explain that 'Most data available through the OMA browser is now accessible via the API'. I think it would be useful to know what data isn't available and whether the browser and REST API would ever be equivalent in terms of data served. This might be partly addressed later, in the discussion, where the authors mention 'support for local synteny'. Some additional details would be useful to redirect users to the appropriate interface. Similarly, it would be useful to know if the R and python packages provide access to the same data, or if differences also exist there.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: As mentioned in the Discussion, the package does not yet provide the data on local synteny, mainly due to the complexity of representing these data in the API format. We aim to bridge this gap in the next release. We have now amended the Methods section to state the differences between the OMA browser and the API more explicitly, and the Discussion to reassure users that the API and the OMA browser will be kept in sync with further OMA developments.</p>
                <p> Both R and Python packages use the same API and supply the same data.</p>
                <p> </p>
                <p> 
                    <italic>I didn't see mention of the R and python packages on the OmaDB web page. This would be a useful addition for visitors.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: We already had a link to the R package in the /api/docs, but we have now also included a link directly from the "compute" menu.</p>
                <p> </p>
                <p> 
                    <italic>In the Bioconductor package section, the authors explain that data is provided in 'R friendly objects, namely S3 objects and data frames'. I would suggest rephrasing this to refer only to objects, as S4 objects are also returned and the details of the class system are probably not necessary in the context of this document.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: This has now been amended.</p>
                <p> </p>
                <p> 
                    <italic>Regarding the R package, I would suggest adding URL and BugReports fields to the package's DESCRIPTION file. This helps users find the GitHub repository and report issues. I also noted that in the 'getting started' vignette, some sections appear to be missing a space after the section markup. I have sent a pull request fixing these and some other minor issues.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: Thank you for the pull request; it has now been merged into OmaDB version 2.0.</p>
                <p> </p>
                <p> 
                    <italic>Note that the </italic>
                    <italic>html</italic>
                    <italic> and R version of the vignette shouldn't be included in the package source.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: This has now been amended.</p>
                <p> </p>
                <p> 
                    <italic>In the python package section, the authors mention that this package is also named '</italic>
                    <italic>omadb</italic>
                    <italic>'. I would argue that the packages have different names, as programming languages are case sensitive and suggest to drop the also to avoid any confusion.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: Amended as requested.</p>
                <p> </p>
                <p> 
                    <italic>In the first sentence of the result section, authors should replace R library by R package, as they are referring to their package, not the location where the package is being installed (the library).</italic>
                </p>
                <p> </p>
                <p> RESPONSE: This has now been amended.</p>
                <p> </p>
                <p> 
                    <italic>In general, it would be very useful for the authors to describe the different outputs they have. I am not expecting the authors to provide full details of the REST API responses, but describing how the results match the text would be important. For example, in example 1, they only show how to produce the `response_content_list` response. Here, it would be useful to explain that this R list directly maps the REST </italic>
                    <italic>json</italic>
                    <italic> message, and point to the specific documentation entry point. Such an explanation motivates the example in the text and helps users, that aren't familiar with REST, to understand the relation between the server and the package.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: We agree, and further information on the output generated in the manuscript has now been added to example 1.</p>
                <p> </p>
                <p> 
                    <italic>Similarly</italic>
                    <italic> in example 2, the authors create the `seq_annotation` variable and mention that only one target sequence was identified. Here, it would be useful to show that `length(seq_annotation$targets)` is equal to 1, to back their claim, to indicate how users can verify the number of targets, and motivate the use of the first list index in later code chunks.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: This has now been updated.</p>
                <p> </p>
                <p> 
                    <italic>Still</italic>
                    <italic> in example 2, the authors query and extract the hog members. These data are however already present in the first output of that example, under `seq_annotation$targets[[1]]$oma_hog_members`. It would be useful to explain why the authors send a second query to obtain that data and clarify whether `oma_hog_members` is always equivalent to calling `getHOG` and `getProtein`.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: It is true that the HOG members can also be accessed directly via the oma_hog_members attribute. However, the members are only loaded once the attribute is accessed, so that no unnecessary requests are made. More importantly, if the HOG members were loaded via the oma_hog_members attribute, it would not be obvious for which taxonomic level they are loaded. We therefore prefer to keep the current, slightly more verbose way of accessing the data.</p>
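                <p>The lazy loading described in this response can be illustrated with a toy Python sketch. It is not the OmaDB implementation: the class LazyTarget and the function fake_request are hypothetical stand-ins, and the simulated request simply returns a fixed list. The sketch only shows that the request is issued on first access of the attribute, not before, and that later accesses reuse the cached result.</p>
                <preformat>
```python
from functools import cached_property


class LazyTarget:
    """Toy stand-in for a response object; not the OmaDB implementation."""

    def __init__(self, fetch_members):
        # fetch_members is a hypothetical callable standing in for an API request.
        self._fetch_members = fetch_members

    @cached_property
    def oma_hog_members(self):
        # Runs on first access only; the result is then cached on the instance.
        return self._fetch_members()


calls = []


def fake_request():
    # Record each simulated request so the number of calls can be inspected.
    calls.append("request")
    return ["member_a", "member_b"]


target = LazyTarget(fake_request)
print(len(calls))  # nothing has been requested at construction time
members = target.oma_hog_members
again = target.oma_hog_members  # served from the cache, no second request
print(len(calls), members)
```
                </preformat>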
                <p> </p>
                <p> 
                    <italic>When trying to reproduce the code, I first failed to run the code chunks calling `getProtein`. Later, the authors clarify the software requirements in more details. It would </italic>
                    <italic>however</italic>
                    <italic> be useful to briefly mention, early on in the Results section, what version was used for the examples.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: We agree that this can be confusing, and we have now amended the Methods and Results sections to explicitly mention the use of OmaDB v2.0 for the examples.</p>
                <p> </p>
                <p> 
                    <italic>In example 5, I would suggest updating to the new function name, as `getAnnotations` is expected to be deprecated in the next release, especially as the new version of the package is required anyway for the `getProtein` function.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: This was a mistake on our side and has now been amended.</p>
                <p> </p>
                <p> 
                    <italic>Another example where an explanation of the output is important is example 5. The authors call `myAnnotation &lt;- getAnnotations(mysterySeq)` and then refer to 54 GO annotation results. In repeating their analysis, I obtain a data frame with 55 observations (see below). It is thus unclear whether I have a different result, whether one observation should be dropped, or whether my output is completely wrong (was I even expecting a data frame?). In general, given that the package accesses an online repository that is (or can be) updated regularly, results may change, which would also explain why I obtained different results.</italic>
                </p>
                <p> </p>
                <p> RESPONSE: We can confirm that this is due to the December 2018 OMA release, in which 55 results are indeed returned for that particular query. The fact that results may vary due to the continued updates of the OMA database is now explicitly mentioned at the beginning of the Methods section of the manuscript. We have also stated which version of the database was used to generate the examples in the manuscript.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report42912">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.19190.r42912</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Greshake Tzovaras</surname>
                        <given-names>Bastian</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42912a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-9925-9623</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Tran</surname>
                        <given-names>Ngoc-Vinh</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42912a2">2</xref>
                    <role>Co-referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6772-7595</uri>
                </contrib>
                <aff id="r42912a1">
                    <label>1</label>Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory&#x00a0;(LBNL), Berkeley, CA, USA</aff>
                <aff id="r42912a2">
                    <label>2</label>Department for Applied Bioinformatics, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt am Main, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>5</day>
                <month>2</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Greshake Tzovaras B and Tran NV</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport42912" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.17548.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors describe a new REST API as an interface to the well-established Orthologous Matrix database. As identifying and evaluating orthologs is a central step in many biological analyses, an easy way to query the over 2,100 species in OMA is highly valuable. To further facilitate querying the data through their API, the authors present packages for R and Python. The API is well documented on the OMA website and the R package comes with vignettes describing different use cases. The manuscript presented here focuses on the OmaDB R-package and showcases some of its functions.</p>
            <p> </p>
            <p> Being somewhat "ahead of its time", the R package as described in the manuscript requires the development versions of both R (v3.6) and Bioconductor (v3.9). The package installation instructions at the beginning of the manuscript only gloss over this; more complete instructions are found only in the 
                <italic>Software availability </italic>section at the end.</p>
            <p> We recommend including more explicit warnings and instructions about the required versions at the beginning; otherwise, potential users might be confused when trying to follow along with the examples given in the manuscript (as happened to us; it took us some time to figure out what was going on).</p>
            <p> </p>
            <p> While the Python package is not extensively discussed in this manuscript, the authors provide a Binder that can be used to reproduce the same analyses using Python. We recommend putting a link to it (
                <ext-link ext-link-type="uri" xlink:href="https://mybinder.org/v2/gh/DessimozLab/pyomadb/master?filepath=examples%2Fpyomadb-examples.ipynb">https://mybinder.org/v2/gh/DessimozLab/pyomadb/master?filepath=examples%2Fpyomadb-examples.ipynb</ext-link>) in the manuscript, to help users with taking up the Python library.</p>
            <p> </p>
            <p> We welcome the switch from the SOAP API to a more modern REST implementation, and the packages provided to interface with the API will be valuable to many researchers working with orthologs.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>bioinformatics, evolutionary biology</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment4499-42912">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Dessimoz</surname>
                            <given-names>Christophe</given-names>
                        </name>
                        <aff>University of Lausanne, Switzerland</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>19</day>
                    <month>3</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <list list-type="order">
                        <list-item>
                            <p>
                                <italic>Being somewhat "ahead of its time", the R package as described in the manuscript requires the development versions of both R (v3.6) and Bioconductor (v3.9). The package installation instructions at the beginning of the manuscript only gloss over this; more complete instructions are found only in the Software availability section at the end.</italic>
                            </p>
                            <p> </p>
                            <p> 
                                <italic>We recommend including more explicit warnings and instructions about the required versions at the beginning; otherwise, potential users might be confused when trying to follow along with the examples given in the manuscript (as happened to us; it took us some time to figure out what was going on).</italic>
                            </p>
                        </list-item>
                    </list> RESPONSE: We agree that this might cause confusion and we have now updated the OmaDB package section in Methods to mention explicitly that the package version used in the manuscript is 2.0, which until April 2019 is in the development version of Bioconductor. We also point the readers to the Software Availability section where further instructions for package installation are provided. We are hesitant to amend the package installation instructions at the beginning of the manuscript to reflect this as it will change shortly. 
                    <list list-type="order">
                        <list-item>
                            <p>While the Python package is not extensively discussed in this manuscript, the authors provide a Binder that can be used to reproduce the same analyses using Python. We recommend putting a link to it (https://mybinder.org/v2/gh/DessimozLab/pyomadb/master?filepath=examples%2Fpyomadb-examples.ipynb) in the manuscript, to help users with taking up the Python library.</p>
                        </list-item>
                    </list> </p>
                <p> RESPONSE: This has now been added.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
