<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.26459.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Uncovering host-microbiome interactions in global systems with collaborative programming: a novel approach integrating social and data sciences</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Oberstaller</surname>
                        <given-names>Jenna</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Adapa</surname>
                        <given-names>Swamy Rakesh</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dayhoff II</surname>
                        <given-names>Guy W.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Gibbons</surname>
                        <given-names>Justin</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Keller</surname>
                        <given-names>Thomas E.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Li</surname>
                        <given-names>Chang</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Lim</surname>
                        <given-names>Jean</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-4578-5318</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Pham</surname>
                        <given-names>Minh</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3981-0951</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sarkar</surname>
                        <given-names>Anujit</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sharma</surname>
                        <given-names>Ravi</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Wani</surname>
                        <given-names>Agaz H.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Vianello</surname>
                        <given-names>Andrea</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2323-6767</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Duong</surname>
                        <given-names>Linh M.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Wang</surname>
                        <given-names>Chenggi</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Atkinson</surname>
                        <given-names>Celine Grace F.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Barrow</surname>
                        <given-names>Madeleine</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Van Bibber</surname>
                        <given-names>Nathan W.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dahrendorff</surname>
                        <given-names>Jan</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dean</surname>
                        <given-names>David A. E.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Dokur</surname>
                        <given-names>Omkar</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Ferreira</surname>
                        <given-names>Gloria C.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Hastings</surname>
                        <given-names>Mitchell</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Herbert</surname>
                        <given-names>Gregory S.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7312-6147</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Huq</surname>
                        <given-names>Khandaker Tasnim</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Kim</surname>
                        <given-names>Youngchul</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Liao</surname>
                        <given-names>Xiangyun</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Liu</surname>
                        <given-names>XiaoMing</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Mansuri</surname>
                        <given-names>Fahad</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Martin</surname>
                        <given-names>Lynn B.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Miller</surname>
                        <given-names>Elizabeth M.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5046-380X</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Natarajan</surname>
                        <given-names>Ojas</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7887-2640</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Pang</surname>
                        <given-names>Jinyong</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Prieto</surname>
                        <given-names>Francesca</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Radulovic</surname>
                        <given-names>Peter W.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sheth</surname>
                        <given-names>Vyoma</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sumpter</surname>
                        <given-names>Matthew</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5854-9686</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Sutherland</surname>
                        <given-names>Desirae</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Vijayakumar</surname>
                        <given-names>Nisha</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Jiang</surname>
                        <given-names>Rays H. Y.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>University of South Florida, Tampa, FL, 33612, USA</aff>
                <aff id="a2">
                    <label>2</label>Moffitt Cancer Center, Tampa, FL, 33612, USA</aff>
                <aff id="a3">
                    <label>3</label>Texas A&amp;M, College Station, TX, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:Jiang2@usf.edu">Jiang2@usf.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>17</day>
                <month>12</month>
                <year>2020</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2020</year>
            </pub-date>
            <volume>9</volume>
            <elocation-id>1478</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>23</day>
                    <month>9</month>
                    <year>2020</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Oberstaller J et al.</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/9-1478/pdf"/>
            <abstract>
                <p>Microbiome data are growing exponentially, powered by rapid technological advancement. As the scope and depth of microbiome research increase, cross-disciplinary research is urgently needed to interpret and harness the unprecedented data output. However, conventional research settings pose challenges to much-needed interdisciplinary efforts because of barriers in scientific terminology, methodology and research culture. To breach these barriers, our University of South Florida OneHealth Codeathon was designed as an interactive, hands-on event that solves real-world data problems. The format brought together students, postdocs, faculty, researchers, and clinicians in a uniquely cross-disciplinary, team-focused setting. Teams were formed to encourage equitable distribution of diverse domain experts and proficient programmers, with beginners to experts on each team. To unify the intellectual framework, we focused on microbiome interactions at different scales, from clinical to environmental sciences, leveraging local expertise in genetics, genomics, clinical data, and social and geospatial sciences. As a result, teams developed working methods and pipelines to address major challenges in current microbiome research, including data integration, experimental power calculations, geospatial mapping, and machine-learning classifiers. This broad, transdisciplinary and efficient workflow can serve as an example for future workshops that aim to deliver useful data-science products.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>hackathon</kwd>
                <kwd>codeathon</kwd>
                <kwd>data science</kwd>
                <kwd>transdisciplinary</kwd>
                <kwd>gut microbiome</kwd>
                <kwd>oral microbiome</kwd>
                <kwd>human migration microbiome</kwd>
                <kwd>Clinical Informatics</kwd>
                <kwd>Bioinformatics</kwd>
                <kwd>Operational Taxonomic Unit (OTU)</kwd>
                <kwd>16S rRNA</kwd>
                <kwd>machine learning</kwd>
                <kwd>Geographic Information Systems (GIS)</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100008900">
                    <funding-source>University of South Florida</funding-source>
                </award-group>
                <funding-statement>The OneHealth Codeathon is supported by the USF Conference and Event award (Fall 2019 cycle) to RHYJ, the USF Genomics program, the Institute for the Advanced Study of Culture and the Environment, and USF library systems. RHYJ&#x2019;s microbiome research is supported by a COPH interdisciplinary research award (2019&#x2013;20 cycle) and a USF-Exeter international catalytic research award.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <sec>
                <title>OneHealth Codeathon: Genesis of a working model for applied, interdisciplinary problem-solving</title>
                <p>The National Institutes of Health National Center for Biotechnology Information (NIH NCBI) model for codeathons (intensely collaborative, time-limited data workshops that encourage teams of participants to produce software prototypes addressing a common biomedical topic) is an effective avenue for generating software prototypes in the biomedical informatics space. Our previous &#x201c;Iron Hack&#x201d; event
                    <sup>
                        <xref ref-type="bibr" rid="ref-1">1</xref>
                    </sup>, centered on rare iron-related diseases, was a transdisciplinary twist on this NCBI model designed to complement and unite local University of South Florida (USF) research programs, inspiring participation from clinicians, genetic counsellors, and researchers from a diversity of biomedical fields at all career stages.</p>
                <p>For this year&#x2019;s event, we set out to expand on the traditional codeathon foundation, working with the local research community to select challenges that would draw on skillsets less traditionally attracted to codeathons (e.g. social science researchers), while also supporting emerging USF research initiatives and addressing wider challenges in biomedical data science. This year&#x2019;s event (dubbed the USF OneHealth Codeathon) therefore focused on the fast-evolving field of host-microbiome interactions, with team projects designed around data-centric problems encountered by our interdisciplinary participants in their research and practice. The event took place on USF&#x2019;s Tampa campus on February 26&#x2013;28, 2020.</p>
                <p>As a result of these intense collaborative efforts, teams developed resources that are relevant not only to microbiome studies but also to general bioinformatics problems. The objective of this report is to demonstrate the utility of a codeathon model to rapidly develop tools for human and environmental health research, with the added community-building benefits of (1) providing opportunities for meaningful, long-term, cross-departmental interactions that stimulate collaborations and creative project design, and (2) offering in-depth exposure to applied data science for members of traditionally less-computational fields.</p>
            </sec>
            <sec>
                <title>Critical gaps OneHealth Codeathon projects sought to address</title>
                <p>We addressed challenges related to the host microbiome, including the great need for novel genomics tools to handle large, recently generated, heterogeneous microbiome datasets. We established six OneHealth Codeathon teams to develop six computational-tool prototypes broadly focused on (1) power calculation for microbiome study design, (2) geographic information systems analysis of microbiome data and associated risk factors, (3) mining archaeological microbiome data, and (4) searching for ecological drivers of earth microbiomes (
                    <xref ref-type="fig" rid="f1">Figure 1</xref>). These team-efforts have led to the convergence of social science, ecology and medical communities with genomics data-science researchers to produce promising computational tools, strengthened through an iterative process of soliciting ideas and feedback from domain experts.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Scope of human holobiont interactions with microbiomes in various contexts explored through USF&#x2019;s OneHealth Codeathon.</title>
                        <p>Two teams (Teams MicroPower Plus and Zero) focused on developing practical computational tools for microbiome study-design and data-analysis. Four teams (Teams Geo, Animal, Track and Yolo) focused on exploring different aspects of host-microbiome interactions from environmental consequences to clinical presentations.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure1.gif"/>
                </fig>
                <p>The remainder of this report is organized into subsections by project, beginning with a detailed description of each of the six projects, the motivations behind them, and the gaps they seek to fill. We next describe the methodology and implementation of each project as a usable software application, how to operate it, and the results it produced. Finally, we discuss the pros and cons of this new, highly interdisciplinary and community-driven twist on more traditional hackathons.</p>
            </sec>
        </sec>
        <sec>
            <title>Team 1 &#x2013; MicroPower Plus</title>
            <p>
                <bold>Project title:</bold> Microbiome power-calculation tool for biologists: towards rigorous, reproducible microbiome study-design</p>
            <p>
                <bold>Rationale:</bold> Measured differences between sample groups can result from any number of experimental artifacts not reflective of actual biology, including differing definitions of what a clinical population signifies within different studies, how samples are prepared, and analytical decisions (e.g., bioinformatic and statistical tool-selection, parameter-settings
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>). Statistical power calculations are a key part of quality study design, informing the sample size required to have sufficient statistical power to detect differences between experimental groups. The size of this difference between groups, the effect size, should also be taken into account during experimental planning, because smaller effects are more easily obscured by experimental noise. Sufficiently powered studies are critical for robust biological conclusions, and funding agencies increasingly require power and sample-size analyses before they will consider applications for support.</p>
            <p>R-based software packages that model the relationship between sample size and detectable effect size using PERMANOVA-based methods have been developed to estimate the number of samples required for microbiome experimental design
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>, given input data from pilot studies. These tools are generally not accessible to biologists with limited computational experience or only a cursory grasp of statistics. We sought to build on these methods to create a more intuitive calculator and guide for biologists, who often need only a quick point-and-click reference for experimental planning.</p>
            <p>
                <bold>Goal:</bold> To provide an intuitive power- and effect-size calculator-tool for biologists with limited computational experience.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Data-sources and processing</title>
                <p>Predicted effect sizes detectable at a range of sample sizes and power-levels were precomputed on OTU tables from a variety of human body-site datasets from the Human Microbiome Project (HMP) using the R package 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/brendankelly/micropower">micropower</ext-link> (v0.4) (Jaccard distance method)
                    <sup>
                        <xref ref-type="bibr" rid="ref-5">5</xref>,
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>. We used these precomputed data as a reference for quick and interactive power calculations for commonly used sample sizes by body-site.</p>
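                <p>The link between sample size, effect size and power that such a precomputed reference captures can be illustrated with a small simulation. The sketch below is not the micropower procedure: it draws hypothetical presence/absence tables, computes Jaccard distances, runs a permutation-based PERMANOVA test, and reports the fraction of simulations that reach significance. The function names, the effect model (the <monospace>shift</monospace> parameter) and all default settings are assumptions made for illustration only.</p>
                <code language="python">
```python
import numpy as np


def jaccard_matrix(presence):
    """Pairwise Jaccard distance between rows of a presence/absence table."""
    p = np.asarray(presence, dtype=bool)
    shared = np.logical_and(p[:, None, :], p[None, :, :]).sum(axis=2)
    union = np.logical_or(p[:, None, :], p[None, :, :]).sum(axis=2)
    # Guard against empty unions (two samples with no taxa at all)
    return 1.0 - shared / np.maximum(union, 1)


def pseudo_f(sq_dist, labels):
    """PERMANOVA pseudo-F statistic from a matrix of squared distances."""
    n = len(labels)
    groups = np.unique(labels)
    # The full symmetric matrix counts every pair twice, hence the factor 2
    ss_total = sq_dist.sum() / (2 * n)
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        ss_within += sq_dist[np.ix_(idx, idx)].sum() / (2 * len(idx))
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova_p(dist, labels, n_perm, rng):
    """Permutation p-value: how often shuffled labels match or beat the observed F."""
    sq = dist ** 2
    observed = pseudo_f(sq, labels)
    hits = sum(pseudo_f(sq, rng.permutation(labels)) >= observed
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)


def estimate_power(n_per_group, shift, n_taxa=50, n_sim=40,
                   n_perm=99, alpha=0.05, seed=0):
    """Fraction of simulated two-group studies in which PERMANOVA is significant.

    Hypothetical effect model: every taxon is present with probability 0.5;
    `shift` (between 0 and 0.5) raises that probability for one half of the
    taxa in group 0 and for the other half in group 1, and lowers it elsewhere.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_per_group)
    half = n_taxa // 2
    probs = np.full((2, n_taxa), 0.5)
    probs[0, :half] += shift
    probs[0, half:] -= shift
    probs[1, :half] -= shift
    probs[1, half:] += shift
    rejections = 0
    for _ in range(n_sim):
        # A taxon is present when its probability exceeds a uniform draw
        table = probs[labels] > rng.random((2 * n_per_group, n_taxa))
        p = permanova_p(jaccard_matrix(table), labels, n_perm, rng)
        rejections += int(alpha >= p)
    return rejections / n_sim


strong = estimate_power(n_per_group=10, shift=0.4)
null = estimate_power(n_per_group=10, shift=0.0)
print(f"estimated power with a large shift: {strong:.2f}; with no shift: {null:.2f}")
```
                </code>
                <p>Repeating such a simulation over a grid of sample sizes and effect sizes yields the kind of lookup table that an interactive tool can consult without rerunning the simulation for every query.</p>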
                <p>We added functionality for calculating the effect size of an experimental intervention, given a control group and an experimental group, using linear modeling. Our tool computes the Bray-Curtis distance between all samples, then uses the adonis function from the 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=vegan">vegan</ext-link> package (v2.5.6) to calculate the correlation parameter Pearson&#x2019;s R
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup>.</p>
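                <p>As a rough illustration of this calculation, the sketch below computes pairwise Bray-Curtis dissimilarities from a toy count table and then the PERMANOVA R<sup>2</sup>, the share of variation in the distance matrix explained by group membership, which is one of the quantities that adonis reports. It is a minimal Python stand-in written for this explanation, not the MicroPower Plus code; the function names and the toy data are hypothetical, and how the tool derives its reported correlation parameter from the adonis output is not shown here.</p>
                <code language="python">
```python
import numpy as np


def bray_curtis_matrix(counts):
    """Pairwise Bray-Curtis dissimilarity between rows (samples) of a count table.

    Assumes every sample contains at least one count.
    """
    counts = np.asarray(counts, dtype=float)
    # Sum of absolute differences over taxa, divided by the combined total
    diff = np.abs(counts[:, None, :] - counts[None, :, :]).sum(axis=2)
    total = (counts[:, None, :] + counts[None, :, :]).sum(axis=2)
    return diff / total


def permanova_r2(dist, groups):
    """PERMANOVA R^2: between-group share of the total sum of squared distances."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = len(groups)
    sq = dist ** 2
    # Total sum of squares uses each pair of samples once
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares, accumulated group by group
    ss_within = 0.0
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    return (ss_total - ss_within) / ss_total


# Hypothetical toy table: 4 samples (rows) by 3 taxa (columns)
otu = np.array([
    [10, 0, 5],
    [8, 1, 6],
    [0, 9, 4],
    [1, 10, 3],
])
groups = ["control", "control", "treated", "treated"]

r2 = permanova_r2(bray_curtis_matrix(otu), groups)
print(f"PERMANOVA R^2 for the toy data: {r2:.3f}")
```
                </code>
                <p>In this toy table the two groups share few abundant taxa, so most of the variation lies between groups and the resulting value is close to 1; real datasets typically give much smaller values.</p>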
                <p>A conceptual overview of MicroPower Plus functionality is provided in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>.</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>MicroPower Plus functionality conceptual overview.</title>
                        <p>Input-data are OTU or ASV tables selected from curated, published microbiome studies of various human body-sites from which effect size has been pre-calculated for several common sample-sizes using complementary methodologies. The user can then use the interactive, graphical output to explore the relationships between effect-size, sample size and statistical power to use as a quick reference for their own experimental planning.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure2.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Operation and Implementation</title>
                <p>The MicroPower Plus
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup> workflow is implemented in a user-friendly R-
                    <ext-link ext-link-type="uri" xlink:href="https://shiny.rstudio.com/">Shiny</ext-link> web application. RStudio and the R packages 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=shiny">shiny</ext-link>, 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=plotly">plotly</ext-link> and 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=tidyverse">tidyverse</ext-link> are required to operate MicroPower Plus
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>&#x2013;
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. Further documentation and a tutorial are available at the GitHub repository as listed in the code-availability section.</p>
                <p>After the required packages are installed, all necessary tutorial files can be downloaded from GitHub to the user&#x2019;s local computer, and MicroPower Plus can be launched by opening the &#x201c;app.R&#x201d; file in RStudio.</p>
            </sec>
        </sec>
        <sec sec-type="cases">
            <title>Use cases</title>
            <p>MicroPower Plus
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup> is most useful as a statistical reference-guide for biologists to make quick calculations to aid in experimental design of microbiome studies. We built a user-interface around the human gut microbiome reference dataset that allows the user to visualize the relationship between sample size, effect size and statistical power as a proof of concept using R Shiny
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup>. Resulting effect size is reported as a bar graph, with reference to effect sizes reported in the literature for comparisons. We created an additional tool that allows the user to input their own data, calculate the effect size from their experiment and report it as a bar graph. Future iterations of this tool will include interactive visualizations for the pre-computed reference data from other body-sites.</p>
            <p>The provided tutorial walks the user through an example power calculation (
                <xref ref-type="fig" rid="f3">Figure 3</xref>) and effect size calculation (
                <xref ref-type="fig" rid="f4">Figure 4</xref>) using the pre-computed human gut microbiome datasets.</p>
            <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                <label>Figure 3. </label>
                <caption>
                    <title>MicroPower Plus power-calculation interface. </title>
                    <p>The user selects the sample type, the sample size for each group and a distance measure. When the user moves the power slider, the estimated effect-size graph (red) changes to the minimum effect size required to attain the given power level. The gray bars reference effect sizes calculated from the indicated sources. By comparing the estimated effect size to the reference effect sizes, the user can get a sense of how large a difference would have to be between their samples to detect significance using different experimental designs.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure3.gif"/>
            </fig>
            <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                <label>Figure 4. </label>
                <caption>
                    <title>MicroPower Plus effect-size calculation (concept).</title>
                    <p>The user uploads a matrix of their microbiome measurements and enters the group names used to assign the sample columns to groups. MicroPower Plus then calculates the effect size for the experiment (red bar). The gray bars are effect sizes calculated from the indicated literature. Comparing the red bar to the gray bars allows the user to get a sense of the magnitude of their experimental effect.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure4.gif"/>
            </fig>
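            <p>The effect-size calculation outlined in Figure 4 can be illustrated with a minimal sketch. It assumes that the effect size is the PERMANOVA omega-squared statistic computed from a pairwise distance matrix; the text above does not name the statistic, so this choice, the function name and the input layout are all assumptions made for illustration rather than the MicroPower Plus implementation.</p>
            <preformat>
```python
from itertools import combinations


def permanova_omega_squared(dist, groups):
    """Return (pseudo_F, omega_squared) for a square distance matrix.

    dist   : list of lists, symmetric pairwise distances between samples
    groups : list of group labels, one per sample (same order as dist)
    Hypothetical helper; not taken from the MicroPower Plus code base.
    """
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    if a == 1 or n == a:
        raise ValueError("need at least two groups and replicated samples")

    # Total sum of squares: squared distances over all sample pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n

    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        pair_sum = sum(dist[i][j] ** 2 for i, j in combinations(members, 2))
        ss_within += pair_sum / len(members)

    ss_among = ss_total - ss_within
    ms_within = ss_within / (n - a)
    pseudo_f = (ss_among / (a - 1)) / ms_within
    # Omega-squared corrects the among-group share of variance for chance.
    omega_sq = (ss_among - (a - 1) * ms_within) / (ss_total + ms_within)
    return pseudo_f, omega_sq
```
</preformat>
            <p>Called with a distance matrix and one group label per sample, the function returns a value that could be drawn as the red bar and compared with literature-derived values. The sketch fails with a division by zero when all within-group distances are zero.</p>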
        </sec>
        <sec>
            <title>Team 2 &#x2013; GEO</title>
            <p>
                <bold>Project title:</bold> Environmental Chemicals: Impact on Human Microbiomes</p>
            <p>
                <bold>Rationale:</bold> Environmental exposures to chemicals are a public health concern because of their ubiquitous effects on human health and the environment. Industrial and manufacturing sectors contribute to these exposures by releasing chemicals into the environment. Chemicals commonly found in commercial products, such as heavy metals and chlorinated hydrocarbon solvents, can persist in the environment for extended periods, increasing the latency of exposure
                <sup>
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>.</p>
            <p>A lack of information led to relatively few rules for handling and disposing of chemicals in the first part of the 20
                <sup>th</sup> century, which resulted in the uncontrolled release of hazardous chemicals and toxins into the environment. Toxic waste dumps and their associated human and environmental health consequences received national attention in the late 1970s
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>
                </sup>. In response to public outcry, Congress created &#x201c;Superfund&#x201d; in the 1980s to fund toxic waste clean-up at industrial sites
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>,
                    <xref ref-type="bibr" rid="ref-15">15</xref>
                </sup>. Superfund sites require long-term remediation efforts, and sites are evaluated for eligibility on a point-based system requiring a preliminary assessment and site-inspection (known as the Hazard Ranking System, or HRS)
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup>. Reports from the public or an agency are also considered when assessing whether a site qualifies. Superfund sites are prioritized by HRS score for placement on the National Priorities List (NPL)
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup>. Currently there are 1335 NPL sites around the U.S., each having specific chemical contaminations.</p>
            <p>Human exposure to toxic chemicals has been shown to elicit different effects depending on the host&#x2019;s immune response, with long-term exposures associating with a range of serious maladies varying from cancers acting on various bodily tissues to neurological effects
                <sup>
                    <xref ref-type="bibr" rid="ref-17">17</xref>
                </sup>. The gut microbiome is hypothesized to have a unique role in enhancing and maintaining host health through the microbiome-gut-brain axis and can impact endocrine, immunological and nutrient signals
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>. Microbiome dysbiosis can occur with exposure to toxic environmental contaminants via ingestion or inhalation and can lead to several chronic conditions. Due to its diverse functions in the body, the gut microbiome acts as an indicator for health, and there is a growing body of literature exploring the interactions of environmental contaminants with the host microbiome
                <sup>
                    <xref ref-type="bibr" rid="ref-13">13</xref>,
                    <xref ref-type="bibr" rid="ref-17">17</xref>,
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>.</p>
            <p>Environmental contaminants present in Superfund sites around the U.S. can significantly affect the health of the population in the surrounding areas. To illustrate this effect, we created a tool for visualizing the impact of environmental toxicants on the gut microbiome.</p>
            <p>
                <bold>Goals:</bold> 1) To illustrate trends in environmental chemical exposures from U.S. Superfund sites over time. 2) To create a tool for visualizing the impact of exposure to environmental chemicals on the gut microbiome around the U.S.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Implementation and Operation</title>
                <p>
                    <bold>Data-sources and processing:</bold> We processed and combined datasets from the American Gut Project (AGP), census data, and EPA Superfund data to search for informative patterns using the R package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/bioc/html/phyloseq.html">phyloseq</ext-link>  1.30.0
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>. We identified the most abundant taxa by Superfund site and geographic location. We then performed basic association analyses to assess relationships between abundant or rare taxa, Superfund sites and contaminants. Archived code is available; see 
                    <italic toggle="yes">Software availability</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>.</p>
                <p>
                    <italic toggle="yes">1) American Gut Project data:</italic> The American Gut Project (AGP) is a large-scale, crowdsourced collection of microbial sequence data (n = 29,778) that aims to characterize the human gut microbiome and its associated factors, including diet, lifestyle, overall health, and the broader environment. The metadata file obtained from 
                    <ext-link ext-link-type="uri" xlink:href="ftp://ftp.microbio.me/AmericanGut/latest">AGP sample information</ext-link> (file 04-meta) was reduced to responses from participants within the United States. Variables previously reported in published studies to be associated with air-pollution-mediated differences in microbial communities were also selected and included in subsequent testing for associations with Superfund-site proximity.</p>
                <p>
                    <italic toggle="yes">2) Superfund data:</italic> Superfund sites and associated contamination data for current NPL sites were retrieved from EPA data using the R 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/hepplerj/superfundr">superfundr</ext-link> 0.0.0.9000 package. The data were prepared and transformed using Statistical Analysis Software (SAS v 9.4, Cary, NC). We focused on 10 priority chemicals listed by the EPA.</p>
                <p>
                    <italic toggle="yes">3) Census data:</italic> Select data from the American Community Survey (ACS) were downloaded from the U.S. Census Bureau&#x2019;s American FactFinder website via the download center (U.S. Census Bureau, 2020). This population-based data source provides descriptive socio-demographic data (e.g., sex, race, ethnicity and economic indicators) by zip code across the nation. Once the datasets for each variable were downloaded, the variables were merged on a linking variable common to every dataset (zip code). After data-cleaning, percentages were calculated for each variable. All data-cleaning was conducted using Statistical Analysis Software (SAS v 9.4).</p>
                <p>Loading and filtering OTU tables was memory-intensive because the initial dataset is very large. Initial attempts to load the OTU table on a laptop with 16 GB of memory failed. To solve the problem, we performed this filtering on a high-performance computing cluster with 180 GB of memory.</p>
                <p>
                    <italic toggle="yes">Merging data across disparate datasets:</italic> Several distinct datasets across the AGP, Superfund, and ACS provided unique information connected only by geographic location and could be merged by an appropriate linking variable (e.g., zip code). Data from all three sources were combined for a total of ~1000 samples. We further reduced the dataset to only samples that were directly related to the gut for downstream prediction using machine learning approaches.</p>
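                <p>The merge on a shared zip code can be sketched as an inner join across the three sources. This is a minimal illustration only: the field names (<monospace>zip</monospace>, <monospace>contaminant</monospace>, <monospace>pct_below_poverty</monospace>) and the function name are hypothetical, and the actual processing was done in R and SAS as described above.</p>
                <preformat>
```python
def merge_by_zip(agp_rows, superfund_rows, acs_rows):
    """Inner-join three lists of dicts on their shared 'zip' field.

    All field names are hypothetical placeholders for illustration.
    """
    # Several Superfund sites can share a zip code, so collect contaminants per zip.
    contaminants_by_zip = {}
    for site in superfund_rows:
        contaminants_by_zip.setdefault(site["zip"], set()).add(site["contaminant"])

    # ACS data are assumed to hold one summary row per zip code.
    acs_by_zip = {row["zip"]: row for row in acs_rows}

    merged = []
    for sample in agp_rows:
        zip_code = sample["zip"]
        # Keep only samples present in all three sources (inner join).
        if zip_code in contaminants_by_zip and zip_code in acs_by_zip:
            record = dict(sample)
            record["contaminants"] = sorted(contaminants_by_zip[zip_code])
            record["pct_below_poverty"] = acs_by_zip[zip_code]["pct_below_poverty"]
            merged.append(record)
    return merged
```
</preformat>
                <p>Zip codes are best kept as strings in such a join, since numeric parsing drops leading zeros and silently breaks matches for some regions.</p>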
                <p>ArcMap version 10.7 (2020) was used to create choropleth maps from the combined ACS and Superfund datasets to evaluate the association of chemicals found at EPA Superfund sites with select population-based socio-demographic data by zip code over time. The same work can be done with the open-source QGIS Geographic Information System from the Open Source Geospatial Foundation Project (
                    <ext-link ext-link-type="uri" xlink:href="http://qgis.org">http://qgis.org</ext-link>).</p>
                <p>
                    <bold>Machine learning analysis on data collected from individuals near Superfund sites:</bold> We selected individuals that were self-identified to be within 5 km of Superfund sites from the final combined dataset. We next performed a classification analysis using random forests implemented via the R package ranger
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>. For each contaminant, we classified each individual as exposed or unexposed based on their proximity to a Superfund site with that contaminant. We then performed 10-fold cross-validation and reported the accuracy for the most and least informative contaminants with regard to the microbiome.</p>
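                <p>The cross-validation step can be sketched in a few lines. The example below is not the random-forest analysis: it swaps in a majority-class baseline so that the fold logic stays self-contained, and all function names are hypothetical. It shows how held-out accuracy is averaged over 10 folds for one binary exposed/unexposed label.</p>
                <preformat>
```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class(train_labels):
    """Baseline 'model': always predict the most common training label."""
    return max(sorted(set(train_labels)), key=train_labels.count)


def cross_validated_accuracy(labels, k=10, seed=0):
    """Mean held-out accuracy of the majority-class baseline over k folds.

    Assumes at least k samples, so that no fold is empty.
    """
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        # Train on everything outside the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in held_out_set]
        prediction = majority_class(train)
        hits = sum(1 for i in held_out if labels[i] == prediction)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies)
```
</preformat>
                <p>Because this baseline ignores the microbiome entirely, its accuracy equals the share of the larger class; comparing a classifier against it helps in reading accuracies obtained on imbalanced exposure labels.</p>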
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>Geographic distribution of select Superfund-site contaminants and abundance of 
                <italic toggle="yes">Bacteroidetes</italic> OTUs are shown in 
                <xref ref-type="fig" rid="f5">Figure 5</xref>. We next explored a potential relationship between abundance of this bacterial phylum and individual contaminants, and further possible predictive efficacy of contaminants for certain OTUs, using proof-of-concept modeling. We restricted samples to those within 5 km of a Superfund site for these analyses. We constructed a random forest using each contaminant as a binary predictor-variable. We found a strong relationship between several contaminants and microbial composition. The two most predictive contaminants were polycyclic aromatic hydrocarbons and poly-chlorinated biphenyls (PAH, 94% and PCB, 81%, respectively). The contaminant with the lowest accuracy was lead (60%).</p>
            <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                <label>Figure 5. </label>
                <caption>
                    <title>Contaminant associations with the most dominant bacterial phylum.</title>
                    <p>Geographic distribution of select Superfund-site contaminants (circles color-coded by contaminant) and abundance of Bacteroidetes OTUs (underlying heatmap) from samples collected within 5 km of a Superfund site. We found a strong relationship between several contaminants and microbial composition.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure5.gif"/>
            </fig>
            <p>It is worth noting that PAH are known to bio-amplify as they move through food webs. Health outcomes linked to PAH exposure include various forms of cancer as well as developmental impacts. PCB have been banned from manufacturing since 1979, yet they do not readily break down and remain a hazard over long periods of time. Because of these properties, they are commonly listed as Superfund contaminants of high concern. In conclusion, we found that microbial composition varied significantly among individuals living near Superfund sites with high or low levels of several contaminants, notably PAH and PCB.</p>
        </sec>
        <sec>
            <title>Team 3 - ZERO</title>
            <p>
                <bold>Project title:</bold> Creating a web app to study human gut microbiome variation across geographic regions of the world</p>
            <p>Project Rationales, Descriptions and Goals</p>
            <p>
                <bold>Rationale:</bold> The human gut microbiome is one of the sites in the human body most densely populated by bacteria. It performs numerous functions, and its dysbiosis has been associated with several diseases. A major goal of microbiome researchers has been to understand the diversity of the gut microbiome across human populations. Although several studies have been undertaken for this purpose, these studies are limited in scope and comparative ability. Therefore, the rationale of the present work was to create a web tool equipped with reference databases, reference populations and the necessary scripts for users to upload, analyze and visualize their own microbiome data on the server, with additional options to compare against the reference populations. Results can subsequently be downloaded by the user. Finally, all reference-population data are to be made available for download, along with the scripts needed to run the program on a local computer without uploading raw data. Such a tool will be extremely useful to interdisciplinary researchers who have microbiome-related research questions but are not experienced in writing code or handling large microbiome datasets, or who do not have access to advanced computational facilities. The code, instructions and guidelines are available through a GitHub repository. A flowchart summarizing the approach is provided in 
                <xref ref-type="fig" rid="f6">Figure 6</xref>.</p>
            <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                <label>Figure 6. </label>
                <caption>
                    <title>Proposed Team Zero web-app workflow.</title>
                    <p>Users will be able to upload fastq files for analyses and choose reference-datasets for comparison. The in-built pipeline will then generate the Amplicon Sequence Variants (ASVs) from which the most informative for differentiating populations will be chosen using a Gaussian-Mixture EM algorithm followed by unsupervised K-means clustering. Heatmaps and PCA-plots describing the data will be generated and made available for download. </p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure6.gif"/>
            </fig>
            <p>
                <bold>Goals:</bold> 1) To download raw microbiome data (V4 region of the 16S rRNA gene) from various world populations and generate an amplicon sequence variant (ASV) table for comparison purposes. 2) To construct simple but informative plots, such as heatmaps and principal component analysis (PCA) plots, to visualize relationships and patterns in the data through the proposed web app. 3) To provide all raw sequencing data, bash scripts and R scripts needed to run all steps of the analyses, as well as appropriate documentation and guidelines for an easy and error-free run of the pipeline on the user&#x2019;s local computer.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Data sources and processing</title>
                <p>We first mined microbiome data from various world populations by geographical region. We narrowed our focus to studies on the human gut microbiome involving the V4 region of the 16S rRNA gene. A total of 1428 samples spanning populations from China, the Indian subcontinent (Himalayan region), Brazil and Europe meeting these criteria were incorporated. Raw data were downloaded from the European Nucleotide Archive (Accessions: China, 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJNA396815">PRJNA396815</ext-link>; Indian subcontinent, 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJEB29137">PRJEB29137</ext-link>; Europe, 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJNA497734">PRJNA497734</ext-link>; Brazil, 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJEB19103">PRJEB19103</ext-link>) (
                    <xref ref-type="table" rid="T1">Table 1</xref>).</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Team Zero web-app data-sources by population.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Population</th>
                                <th align="left" colspan="1" rowspan="1">ENA study
                                    <break/>accession No.</th>
                                <th align="left" colspan="1" rowspan="1">No. of
                                    <break/>samples</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">China</td>
                                <td colspan="1" rowspan="1" valign="top">PRJNA396815</td>
                                <td colspan="1" rowspan="1" valign="top">200</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">Indian subcontinent</td>
                                <td colspan="1" rowspan="1" valign="top">PRJEB29137</td>
                                <td colspan="1" rowspan="1" valign="top">77</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">Europe</td>
                                <td colspan="1" rowspan="1" valign="top">PRJNA497734</td>
                                <td colspan="1" rowspan="1" valign="top">1000</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">Brazil</td>
                                <td colspan="1" rowspan="1" valign="top">PRJEB19103</td>
                                <td colspan="1" rowspan="1" valign="top">150</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>Despite this initial filtration step, the estimated analysis time was still too long for the Codeathon time restrictions. Thus, in a second step to reduce data volume, 5000 sequences were subsampled using 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/lh3/seqtk">Seqtk</ext-link> 1.3-r115-dirty
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup> from each of the forward and reverse fastq files for each of the samples. All the downstream analyses were based on the subsampled reads. The fastq files were analyzed using the standard 
                    <ext-link ext-link-type="uri" xlink:href="https://benjjneb.github.io/dada2/">DADA2</ext-link> 1.14.1 pipeline
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> to generate the distribution of ASVs observed in this dataset. The corresponding classification of each ASV was obtained using the Silva database (v132)
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>. The bacterial count table was further utilized for downstream analysis.</p>
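                <p>The paired subsampling step can be illustrated with a short sketch. It is not Seqtk itself: it is a stdlib-only stand-in, with hypothetical function names and an arbitrary default seed, showing why forward and reverse files must be sampled at the same record positions so that read pairs stay matched.</p>
                <preformat>
```python
import random


def read_fastq_records(lines):
    """Group FASTQ lines into 4-line records (header, sequence, '+', qualities)."""
    lines = [line.rstrip("\n") for line in lines]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of four lines")
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def subsample_pairs(forward_lines, reverse_lines, n_reads=5000, seed=100):
    """Pick the same n_reads record positions from forward and reverse files.

    Hypothetical helper; the seed value is an arbitrary illustration.
    """
    forward = read_fastq_records(forward_lines)
    reverse = read_fastq_records(reverse_lines)
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse files must hold the same number of reads")
    # Samples with fewer reads than requested are kept whole.
    n_keep = min(n_reads, len(forward))
    # One shared set of positions keeps read pairs synchronized.
    keep = sorted(random.Random(seed).sample(range(len(forward)), n_keep))
    return [forward[i] for i in keep], [reverse[i] for i in keep]
```
</preformat>
                <p>The inputs can be any iterables of lines, for example two open file handles. Sampling each file independently would break the pairing that the downstream DADA2 merging step relies on.</p>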
                <p>The resulting ASV table contained 1,428 samples with 2,655 bacterial taxa. Because the ASV table was very sparse (only 1.231% of its elements had read counts &gt; 0), we used a Gaussian-mixture model to remove the bacteria with low read coverage. A total of 1,783 taxa were removed, and the remaining ASV table was normalized for each sample by the proportion of reads in each taxon using orders-of-magnitude multipliers (1-e
                    <sup>8</sup>). The distribution of the standard deviation in read number was calculated, and taxa at the tail ends of the distribution were eliminated, leaving 237 taxa. Similarly, individual samples at the extreme low end of the read-number distribution (365 samples) were removed using the Gaussian-mixture model. Unfortunately, this step eliminated all Chinese-population samples, so all downstream analyses were performed only on the populations from Europe, Brazil and the Indian subcontinent.</p>
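                <p>The sparsity measure and the per-sample normalization can be sketched as follows. The sketch covers only these two steps, not the Gaussian-mixture filtering. The function names are hypothetical, and the default multiplier of 1e8 is our reading of the multiplier quoted above, which should be treated as an assumption.</p>
                <preformat>
```python
def nonzero_fraction(table):
    """Fraction of cells in a samples-by-taxa count table that are above zero."""
    cells = [count for row in table for count in row]
    return sum(1 for count in cells if count > 0) / len(cells)


def normalize_rows(table, scale=1e8):
    """Convert each sample's counts to proportions, multiplied by `scale`.

    The default scale is an assumed value chosen for illustration.
    """
    normalized = []
    for row in table:
        total = sum(row)
        if total == 0:
            # A sample with no reads cannot be converted to proportions.
            normalized.append([0.0] * len(row))
        else:
            normalized.append([scale * count / total for count in row])
    return normalized
```
</preformat>
                <p>Rows are samples and columns are taxa. After normalization every sample with at least one read sums to the chosen multiplier, which removes differences in sequencing depth before clustering.</p>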
            </sec>
            <sec>
                <title>Modeling relationships between population and bacterial taxa</title>
                <p>We used the resulting filtered dataset to perform K-means clustering to determine the optimal number of categories, finding k=18 to be most informative for the data. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were utilized to measure model robustness.</p>
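                <p>Choosing the number of clusters with information criteria can be sketched as below. This is an illustration under stated assumptions rather than the workflow code: k-means is run with a deterministic start, and AIC and BIC are computed by treating the clusters as spherical Gaussians with one shared variance, which is only one of several conventions. All function names are hypothetical.</p>
                <preformat>
```python
import math


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def k_means(points, k, n_iter=100):
    """Lloyd's algorithm with a deterministic start (first k points as centers)."""
    centers = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    rss = sum(squared_distance(p, centers[a]) for p, a in zip(points, assignment))
    return assignment, centers, rss


def information_criteria(rss, n, d, k):
    """AIC and BIC under an assumed spherical-Gaussian reading of k-means."""
    variance = max(rss / (n * d), 1e-12)  # guard against log(0)
    log_likelihood = -0.5 * n * d * (math.log(2 * math.pi * variance) + 1)
    n_params = k * d + 1  # k centers plus one shared variance
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n) - 2 * log_likelihood
    return aic, bic
```
</preformat>
                <p>Running <monospace>k_means</monospace> over a range of k and keeping the value with the lowest BIC gives one way to select the number of categories; in practice several random starts per k would be needed, since a single deterministic start can settle in a poor local optimum.</p>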
            </sec>
            <sec>
                <title>Operation and implementation</title>
                <p>We incorporated a set of unsupervised machine learning back-end computational methods to investigate the datasets for encoded geographical information. We used Python v3.6.9 along with the Django web framework and conda 4.7.12 to build our workflow. The machine learning components of the workflow, which identify ASVs distinguishing populations by geography, are performed using 
                    <ext-link ext-link-type="uri" xlink:href="https://www.tensorflow.org/">TensorFlow2</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>. Data preprocessing and data visualization are mediated through R scripts (see Implementation and Software Availability).</p>
                <p>Herein we implemented a web-based application
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup> for the deposition and rapid analysis of microbiome data. Importantly, users are able to (1) download a prepared database along with the server source code, or (2) construct their own database for analysis. The web-based application source code, the preprocessing and data visualization scripts, and instructions for their usage are available online as listed in the 
                    <italic toggle="yes">Software availability</italic> section.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <sec>
                <title>The unsupervised classification algorithm indicates strong bacterial association with geographic populations</title>
                <p>Our k-means parameter-exploration indicated 18 classes within the sample ASV data. The result indicates at least one or two bacterial groups are enriched for each class (
                    <xref ref-type="fig" rid="f7">Figure 7A</xref>). Classification further indicated differences in community composition by geographical location (
                    <xref ref-type="fig" rid="f7">Figure 7B</xref>). We performed a PCA to further characterize the relationship between sample categories detected via clustering. We found that the samples from classes 1, 6, 9 and 14 form clearly distinct clusters from each other (
                    <xref ref-type="fig" rid="f7">Figure 7C</xref>), further indicative of underlying geographic patterns. We identified important bacterial taxa contributing to sample classification (
                    <xref ref-type="fig" rid="f8">Figure 8</xref>) and plotted relative contribution of each ASV (classified up to genus-level) driving ordination (
                    <xref ref-type="fig" rid="f9">Figure 9</xref>). Differential relative abundance of these ASVs across all geographic populations indicated distinct geographical patterns, with several ASVs strongly associating with Indian, Brazilian, or European (to a lesser extent) populations (
                    <xref ref-type="fig" rid="f9">Figure 9</xref>). The classifications of the ASVs corresponding to 
                    <xref ref-type="fig" rid="f9">Figure 9</xref> are provided in 
                    <xref ref-type="table" rid="T2">Table 2</xref>.</p>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Samples cluster distinctly by OTU composition and geographic population.</title>
                        <p>The color scales indicate the 18 categories used for the classification and normalized reads-number for the studied samples. (
                            <bold>A</bold>) The heatmap indicates enrichment for at least one or two bacterial OTUs in each cluster. (
                            <bold>B</bold>) Enrichment of K-means category by geographic location. The 18 classes showed maximum differential abundance across the three studied populations. (
                            <bold>C</bold>) The PCA plot shows the sample affinities for classes 1, 6, 9 and 14, which showed the strongest geographical pattern.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure7.gif"/>
                </fig>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>Bacteria driving sample classification. </title>
                        <p>The X-axis shows the major ASVs, and their relative contribution to the PCA (
                            <xref ref-type="fig" rid="f7">Figure 7C</xref>) is shown on the Y-axis.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure8.gif"/>
                </fig>
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>The top 13 bacterial taxa driving sample-classification have strong population associations. </title>
                        <p>The color of the boxplot indicates geographic affiliations. The X-axis indicates the top 13 ASVs and the Y-axis shows the corresponding number of normalized reads.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure9.gif"/>
                </fig>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Classification of ASVs displaying highest geographical patterns as shown in 
                            <xref ref-type="fig" rid="f9">Figure 9</xref>.</title>
                        <p>Classifications were obtained only up to genus level, since the studied region was limited to V4 of the 16S rRNA gene. When two ASVs were affiliated with the same genus, they were distinguished by adding a serial number as a suffix. For example, Bacteroides_1 and Bacteroides_2 belong to the same genus.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">ASV</th>
                                <th align="left" colspan="1" rowspan="1">Phylum</th>
                                <th align="left" colspan="1" rowspan="1">Class</th>
                                <th align="left" colspan="1" rowspan="1">Order</th>
                                <th align="left" colspan="1" rowspan="1">Family</th>
                                <th align="left" colspan="1" rowspan="1">Genus</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_1</td>
                                <td colspan="1" rowspan="1" valign="top">Actinobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Actinobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Bifidobacteriales</td>
                                <td colspan="1" rowspan="1" valign="top">Bifidobacteriaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Bifidobacterium</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_2</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroides_1</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_3</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroides_2</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_4</td>
                                <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                                <td colspan="1" rowspan="1" valign="top">Ruminococcaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Faecalibacterium_1</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_5</td>
                                <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                                <td colspan="1" rowspan="1" valign="top">Ruminococcaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Faecalibacterium_2</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_6</td>
                                <td colspan="1" rowspan="1" valign="top">Proteobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Gammaproteobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Enterobacteriales</td>
                                <td colspan="1" rowspan="1" valign="top">Enterobacteriaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Escherichia/Shigella</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_7</td>
                                <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                                <td colspan="1" rowspan="1" valign="top">Lachnospiraceae</td>
                                <td colspan="1" rowspan="1" valign="top">Blautia</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_8</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                                <td colspan="1" rowspan="1" valign="top">Prevotellaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Prevotella_9</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_9</td>
                                <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                                <td colspan="1" rowspan="1" valign="top">Lachnospiraceae</td>
                                <td colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_15</td>
                                <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                                <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                                <td colspan="1" rowspan="1" valign="top">Lachnospiraceae</td>
                                <td colspan="1" rowspan="1" valign="top">Agathobacter</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_17</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroides_3</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_26</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                                <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                                <td colspan="1" rowspan="1" valign="top">Prevotellaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Prevotella_10</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1" valign="top">ASV_45</td>
                                <td colspan="1" rowspan="1" valign="top">Proteobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Gammaproteobacteria</td>
                                <td colspan="1" rowspan="1" valign="top">Aeromonadales</td>
                                <td colspan="1" rowspan="1" valign="top">Succinivibrionaceae</td>
                                <td colspan="1" rowspan="1" valign="top">Succinivibrio</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
        </sec>
        <sec>
            <title>Conclusions</title>
            <p>Our work aimed to create a web app for studying geographical patterns in the human microbiome and for selecting features that distinguish populations. Using publicly available resources, we were able to include different geographical populations and select features that identify differences across groups. The resources for our study are deposited in our GitHub repository (see 
                <italic toggle="yes">Software availability</italic>). One limitation of this study is that factors such as age, gender and other participant phenotypes, which could contribute to geographical patterns, were not included in these analyses. Nevertheless, we were able to create a web-app prototype for identifying features of human gut microbiome composition related to geographical population. In the future, this work can be extended to other variable regions of the 16S rRNA gene and to other body sites such as the oral cavity and skin. Similarly, batch-effect correction tools need to be incorporated for unbiased comparison across different studies.</p>
        </sec>
        <sec>
            <title>Team 4 - YOLO</title>
            <p>
                <bold>Project title:</bold> A web-based machine learning pipeline for disease prediction using microbial data</p>
            <p>Project Rationales, Descriptions and Goals</p>
            <p>
                <bold>Rationale:</bold> High-throughput sequencing technologies have generated an increasing amount of microbial data, such as microbiome data. Machine learning methods applied to these data are powerful tools for identifying functionally active microbes and predicting disease status. Although machine learning algorithms are popular approaches for investigating the microbiome, adopting them effectively usually requires specialized training. In addition, model selection and hyper-parameter tuning can be time-consuming even for experienced practitioners. Our project therefore focused on using AI efficiently to solve big-data problems, freeing researchers for other cognitively demanding tasks, by developing a GUI-based pipeline for training machine learning algorithms on taxonomic microbiome data. Our pipeline expands access to computational tools for researchers in non-computational disciplines and thereby supports cross-disciplinary study. As a proof of concept, we used our pipeline to train a predictive algorithm for obesity status based on operational taxonomic units (OTUs), which may be applied toward generating health-related features from clinical, historical, or forensic samples. Our code uses three methods, K-nearest neighbors (KNN), support vector machine (SVM), and adaptive boosting (AdaBoost), which achieved accuracies near 84%, 91%, and 86%, respectively. Both KNN and SVM used 10-fold cross-validation to guard against overfitting. Training completed almost instantaneously on a 16 GB MacBook, demonstrating feasibility. Outputs are rendered as interactive graphical visualizations to improve ease of use. Although previous projects have applied these computational techniques to microbiome data, our pipeline removes barriers for researchers without coding backgrounds while streamlining the workflow for all users.</p>
            <p>Studies have revealed significant diversity in gut microbiome composition related to various phenotypes. Obesity has been associated with phylum-level changes in the microbiota, reduced bacterial diversity, and altered representation of bacterial genes. For example, studies of lean and obese mice suggest a strong relationship between the gut microbiome and obesity. Sequencing of phylogenetic marker genes, in particular 16S rRNA gene amplicon sequencing, has revolutionized the field of microbial ecology. This PCR-based method has the advantage of identifying difficult-to-culture bacteria. Various bioinformatic pipelines can then group these sequences into clusters called OTUs. OTUs are defined by sequence similarity to each other rather than by comparison to a reference taxonomic dataset, which may be biased towards existing taxonomic classification
                <sup>
                    <xref ref-type="bibr" rid="ref-27">27</xref>
                </sup>.</p>
            <p>
                <bold>Goals:</bold> We were interested in finding out whether there is an association between gut microbiome OTUs and obesity. Additionally, we wanted to use these data to distinguish between lean, overweight, and obese phenotypes in humans. We successfully developed a machine learning-based pipeline that shows the association between gut microbiome OTUs and obesity with high accuracy. Furthermore, this pipeline can predict whether sample OTU data come from a lean, overweight, or obese human phenotype. Our work is significant because a heavy coding background is not required to use high-accuracy machine learning tools.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Data preprocessing</title>
                <p>To develop our pipeline
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>, sample microbiome data were retrieved from a previously published twin study
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup>. First, we cleaned the data by removing duplicate entries, which left 151 samples. Second, to deal with the sparsity of OTU count data, we replaced every zero entry with a small random positive number. Third, the data were normalized using the centered log-ratio (CLR) transformation
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup>. Dimensionality reduction was then performed. We used the Max-min Markov Blanket method to recursively select a small subset of features that are important to the outcome of interest (obese or lean in this case). A total of 10 highly informative OTUs were selected during this process, and various machine learning methods were then explored, guided by a recent review article
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup>.</p>
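                <p>The zero-replacement and CLR steps described above can be sketched as follows. This is a minimal illustration written in Python, not the R code of the pipeline; the function names, the bounds of the random replacement values, and the example counts are assumptions made only for this sketch.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def replace_zeros(counts, rng, low=1e-6, high=1e-3):
    """Replace each zero count with a small random positive value.

    The bounds `low` and `high` are illustrative assumptions.
    """
    return [c if c > 0 else rng.uniform(low, high) for c in counts]


def clr(values):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


rng = random.Random(0)            # fixed seed so the sketch is reproducible
sample = [120, 0, 35, 0, 845]     # hypothetical OTU counts for one sample
filled = replace_zeros(sample, rng)
transformed = clr(filled)
```
]]></preformat>
                <p>By construction, the CLR values of a sample sum to zero, which provides a quick sanity check after the transformation.</p>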
            </sec>
            <sec>
                <title>Data transformations and machine learning methods</title>
                <p>
                    <italic toggle="yes">Principal component analysis (PCA)</italic> is an unsupervised dimensionality reduction technique that finds correlated structure among the features of a dataset and transforms them into a smaller set of principal components (i.e. uncorrelated linear combinations of features that embody the information contained within the dataset) that do not carry redundant information.</p>
                <p>
                    <italic toggle="yes">Random forest</italic> describes a supervised machine learning strategy that builds an ensemble of decision trees, each of which splits samples into successively smaller groups based on specific features and associated threshold values. This method is planned for future versions.</p>
                <p>
                    <italic toggle="yes">SVM</italic> is a supervised machine learning method that is useful for classification, regression, and outlier detection. SVMs remain effective in high-dimensional settings, including those where the number of features exceeds the number of samples. We used a linear SVM classifier, which seeks the maximum-margin hyperplane separating the classes in the feature space. The linear SVM was trained and evaluated using 10-fold cross-validation with 3 repeats.</p>
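                <p>The resampling scheme mentioned above (10-fold cross-validation with 3 repeats) can be illustrated with the following Python sketch. It only generates the train/test index splits and does not train a model; the function name, the seed, and the use of 151 samples (the number remaining after duplicate removal) are assumptions for illustration and do not reproduce the internals of the caret package.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import random


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        # Strided slicing gives k folds whose sizes differ by at most one.
        folds = [indices[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train_idx, test_idx


# 151 samples, 10 folds, 3 repeats: 30 train/test splits in total.
splits = list(repeated_kfold(151))
```
]]></preformat>
                <p>Within each repeat, every sample appears in exactly one test fold, so the accuracy averaged over all splits uses each sample for testing once per repeat.</p>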
                <p>
                    <italic toggle="yes">KNN</italic> is a machine learning algorithm that can be used for classification and regression. In our pipeline, a KNN classifier was used to classify disease status, with the class of each sample determined by majority vote among its K nearest data points.</p>
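                <p>The majority-vote rule of KNN can be sketched in a few lines of Python. The toy feature values, the class labels, the choice of Euclidean distance, and K = 3 are assumptions for illustration only and are not taken from the study data or from the pipeline settings.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical transformed abundances of two OTUs for six samples.
train_x = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_y = ["lean", "lean", "lean", "obese", "obese", "obese"]

label_a = knn_predict(train_x, train_y, (0.15, 0.15))
label_b = knn_predict(train_x, train_y, (2.1, 2.0))
```
]]></preformat>
                <p>Choosing an odd K for a two-class problem avoids tied votes; in practice K would be tuned by cross-validation.</p>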
                <p>
                    <italic toggle="yes">AdaBoost</italic> is a machine learning meta-algorithm that can be used to improve the performance of other machine learning algorithms. An AdaBoost classifier was used to train multiple tree classifiers (where each tree has a subset of available features), iteratively giving more weight to misclassified samples in the next training round. The GitHub README and description are available in the Software availability section.</p>
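                <p>The reweighting of misclassified samples can be illustrated with a single round of the classic discrete AdaBoost update. Which AdaBoost variant the pipeline uses is not stated here, so the following Python sketch is an assumption-laden illustration; the function name and the four-sample example are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math


def adaboost_reweight(weights, correct):
    """One round of the discrete AdaBoost weight update.

    weights: current sample weights (summing to 1)
    correct: booleans, True where the weak learner classified the sample correctly
    Assumes the weighted error lies strictly between 0 and 1.
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1.0 - err) / err)  # vote strength of this weak learner
    # Misclassified samples are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha


# Four equally weighted samples; the weak learner gets the last one wrong.
new_weights, alpha = adaboost_reweight([0.25] * 4, [True, True, True, False])
```
]]></preformat>
                <p>After the update, the misclassified samples together carry half of the total weight, which forces the next tree to concentrate on them.</p>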
            </sec>
            <sec>
                <title>Operation and implementation</title>
                <p>We implemented several machine learning models, namely k-nearest neighbors, AdaBoost, and support vector machines, to predict disease from pre-processed microbiome data. The pipeline has three main steps. 1) Users prepare the biome OTU table for downstream analysis, such as PCA and machine learning. 2) The processed data can be used to perform PCA for exploratory analysis. 3) The data are fed into machine learning models to select highly predictive features and to make the final prediction of disease status.</p>
                <p>Feature selection and machine learning were implemented using 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=MXM">MXM</ext-link> 1.4.5
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup> and 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=caret">caret</ext-link> 6.0-85 R packages
                    <sup>
                        <xref ref-type="bibr" rid="ref-33">33</xref>
                    </sup>, respectively, in R version 3.6. To make this implementation easy for others to use, we designed a Shiny application with an intuitive graphical user interface (GUI). Users can plot, visualize, and download the results generated through the app.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>We show that machine learning can be used to differentiate disease from normal states using OTU information. We used pre-processed data from a twin study with 281 samples and 5462 OTUs
                <sup>
                    <xref ref-type="bibr" rid="ref-29">29</xref>
                </sup>. For exploratory analysis we performed PCA (
                <xref ref-type="fig" rid="f10">Figure 10</xref> and 
                <xref ref-type="fig" rid="f11">Figure 11</xref>; analyses and plots were generated using our Shiny app). We performed feature selection to identify the most informative features, shown in 
                <xref ref-type="table" rid="T3">Table 3</xref>. Abundance of significant OTUs is shown in 
                <xref ref-type="fig" rid="f12">Figure 12</xref>. Using a set of predefined hyperparameters for each model, we achieved a 10-fold cross-validated accuracy of 0.936 with a linear support vector machine (
                <xref ref-type="fig" rid="f13">Figure 13</xref>). The 10 OTUs we identified as important to obesity status are listed in 
                <xref ref-type="table" rid="T3">Table 3</xref>. Although we have not yet assigned functional annotations to these OTUs, future studies could use them as candidate functional groups to aid experimental design for validating clinical and public health microbiome findings.</p>
            <fig fig-type="figure" id="f10" orientation="portrait" position="float">
                <label>Figure 10. </label>
                <caption>
                    <title>A principal component analysis of microbiome data from over 5400 OTUs involving 281 individuals by disease-class. </title>
                    <p>The PCA identifies linear combinations of OTUs (features) that describe microbiome composition; samples are colored by disease class. PC1 and PC2 explain only a small amount of the variance in OTUs observed across different disease classes.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure10.gif"/>
            </fig>
            <fig fig-type="figure" id="f11" orientation="portrait" position="float">
                <label>Figure 11. </label>
                <caption>
                    <title>PCA plot explaining variation for ancestry between African Americans (AA) and Europeans (EA). </title>
                    <p>The same number of OTUs and individuals are used as in 
                        <xref ref-type="fig" rid="f10">Figure 10</xref> for different classes. This PCA plot shows more separation in the OTU clusters based on ancestry than by different disease classes (shown in 
                        <xref ref-type="fig" rid="f10">Figure 10</xref>).</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure11.gif"/>
            </fig>
            <table-wrap id="T3" orientation="portrait" position="anchor">
                <label>Table 3. </label>
                <caption>
                    <title>Informative OTUs identified by the feature selection process.</title>
                    <p>These 10 OTUs are all bacterial and belong to two distinct phyla. Most of the OTUs are classified to genus level.</p>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Phylum</th>
                            <th align="left" colspan="1" rowspan="1">Class</th>
                            <th align="left" colspan="1" rowspan="1">Order</th>
                            <th align="left" colspan="1" rowspan="1">Family</th>
                            <th align="left" colspan="1" rowspan="1">Genus</th>
                            <th align="left" colspan="1" rowspan="1">Species</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Erysipelotrichia</td>
                            <td colspan="1" rowspan="1" valign="top">Erysipelotrichales</td>
                            <td colspan="1" rowspan="1" valign="top">Erysipelotrichaceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Eubacterium</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">biforme</italic>
</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">Ruminococcaceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Faecalibacterium</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">prausnitzii</italic>
</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">Lachnospiraceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Roseburia</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">Veillonellaceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Phascolarctobacterium</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                            <td colspan="1" rowspan="1" valign="top">Rikenellaceae</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">Veillonellaceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Megasphaera</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Firmicutes</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridia</td>
                            <td colspan="1" rowspan="1" valign="top">Clostridiales</td>
                            <td colspan="1" rowspan="1" valign="top">Lachnospiraceae</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                            <td colspan="1" rowspan="1" valign="top">Prevotellaceae</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">Prevotella</italic>
</td>
                            <td colspan="1" rowspan="1" valign="top">

                                <italic toggle="yes">copri</italic>
</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidetes</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidia</td>
                            <td colspan="1" rowspan="1" valign="top">Bacteroidales</td>
                            <td colspan="1" rowspan="1" valign="top">Rikenellaceae</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1" valign="top">NA</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <fig fig-type="figure" id="f12" orientation="portrait" position="float">
                <label>Figure 12. </label>
                <caption>
                    <title>Abundance of significant OTUs selected by machine learning. </title>
                    <p>These OTUs are highly predictive for the classification of disease vs. normal class.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure12.gif"/>
            </fig>
            <fig fig-type="figure" id="f13" orientation="portrait" position="float">
                <label>Figure 13. </label>
                <caption>
                    <title>Cross-validation using the linear support vector machine approach.</title>
                    <p>The model showed the best cross-validation score with cost=5 (accuracy = 0.936). </p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure13.gif"/>
            </fig>
        </sec>
        <sec>
            <title>Team 5 - TRACK</title>
            <p>
                <bold>Project title: Tracking ancient global epidemics</bold>
            </p>
            <p>Project Rationales, Descriptions and Goals</p>
            <p>
                <bold>Rationale</bold>: As the collection of human microbiome data grows, developing user-friendly tools to search proteomics databases has become critical. Bridging the gap between computer science and biological science expertise will facilitate microbiome analysis for both explanatory and predictive purposes, making significant additions to general knowledge in this field. Such effective and convenient methods of sifting through vast datasets would be well-suited to the investigation of not only modern-day microbiome samples, but also preserved historical microbial and proteomic data retrieved from ancient populations at archaeological sites worldwide. Proteomic determination of the microbes of deceased individuals would provide another dimension to forensic analysis by uncovering the pathogens that might have been responsible for their death. The significance of this determination goes beyond simply detecting the presence of bacterial peptides, also extending to tracking the migration and virulence of diseases over time in human populations.</p>
            <p>Exploring ancient or paleolithic host-microbiome interactions is an emerging approach to studying widespread microbial infectious diseases, and even pandemics, by identifying pathogen-expressed proteins in human dental calculus. This approach is supplemented by data from metabolomic analyses, anthropological and paleopathological data from the skeletal material, archaeological contexts, and archival data. Examining the protein content of dental calculus has typically given insight into the diet and oral health of past communities
                <sup>
                    <xref ref-type="bibr" rid="ref-34">34</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup>.</p>
            <p>Because dental calculus forms through bacterial plaque accumulation around the gingiva, it consists primarily of bacteria and therefore lends itself well to oral microbiome analysis. For example, 85&#x2013;95% of the calculus in a medieval sample was found to be composed of bacterial proteins
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup>. This indicates a novel method of examining the constituents of the oral microbiome and its variation across cultures, geographies, and various historical periods.</p>
            <p>The availability of a unique set of data from the first quarantine in the world will enable substantial focus on infectious diseases and the modeling of ancient epidemics (
                <xref ref-type="fig" rid="f14">Figure 14</xref>). Archival records show that all of the approximately 1500 individuals in this project died of an infectious disease. The addition of body responses to the environment and diseases (metabolites), as well as dietary data (stable isotopes to detect malnutrition), will be trialed, providing the best chance to recognize the pathogen responsible and its overall effects. In genetics and medicine, the combination of code, workflow, logic and available data will provide over 300 years of data on epidemics (especially bubonic plague), including the first influenza pandemic, dated 1580, and outbreaks of typhus and measles. It will be possible to reach ca. 600 years of data at one location using historical and medical records. Plague and similar fever-provoking illnesses are replaced by smallpox, measles and flu in later times, as medicine provides therapies, mobility increases and diet changes, with many plants cultivated on continents other than those where they originated. Our TRACK prototypes will enable investigations of pathogen evolution, microbiome adaptations and changes in human immune responses.</p>
            <fig fig-type="figure" id="f14" orientation="portrait" position="float">
                <label>Figure 14. </label>
                <caption>
                    <title>Mask worn by doctors visiting people in quarantine in Venice to protect themselves during the 17th century. </title>
                    <p>
                        <bold>Left</bold>: Masque port&#x00e9; vers 1630 par les m&#x00e9;decins visitant les pestif&#x00e9;r&#x00e9;s from R. Blanchard, in 
                        <italic toggle="yes">Archives de parasitologie</italic>, 1900. Pl. V. 
                        <bold>Right</bold>: drawing of a doctor wearing the mask. From Thomas Bartholin, Plague doctor, 
                        <italic toggle="yes">Thom&#x00e6; Bartholini Historiarum anatomicarum rariorum</italic>, Hafniae: Sumptibus P. Hauboldt, 1654, p. 143</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure14.gif"/>
            </fig>
            <p>
                <bold>Goals:</bold> To achieve the transdisciplinary goals inherent to the nature of this paleo-omics project, a central database able to hold different data types is required. Towards this objective, we created and implemented a paleo-omics workflow consisting of: 1) a search engine to query the multi-data database, 2) a retrieval pipeline for paleo proteins, and 3) a query gateway for microbiome-human host interactions (
                <xref ref-type="fig" rid="f15">Figure 15</xref>).</p>
            <fig fig-type="figure" id="f15" orientation="portrait" position="float">
                <label>Figure 15. </label>
                <caption>
                    <title>Prototype paleo-data center workflow.</title>
                    <p>Data derived from laboratory-based analyses of biopaleological samples are processed and analyzed by established analytical software. Results from these analyses are then compared to existing databases, such as RefSeq, and both the known and unknown information are stored in a centralized Paleo-pathology database. A search engine and a web user interface (UI) then provide users access to this centralized Paleo-pathology database. The dedicated proteomics database can be expanded and rebuilt by data scientists with new data sets and novel data structures. Abbreviations: 
                        <italic toggle="yes">BLAST</italic> (Basic Local Alignment Search Tool): a popular algorithm for comparing biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA or RNA sequences
                        <sup>
                            <xref ref-type="bibr" rid="ref-42">42</xref>
                        </sup>. 
                        <italic toggle="yes">CIGAR</italic> (Concise Idiosyncratic Gapped Alignment Report): a string format used to represent information such as which bases align (either a match/mismatch) with the reference, are deleted from the reference, and are insertions that are not in the reference. 
                        <italic toggle="yes">MaxQuant</italic>: a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure15.gif"/>
            </fig>
            <p>While mass spectrometry (MS), shotgun sequencing, and 16S rRNA sequencing data can all be employed in paleo-omics, we focused on an MS-based meta-proteomics approach as a proof-of-concept demonstration of our prototype within the time constraints of the Codeathon. We applied it to data derived from human dental calculus protein samples taken from archaeological sites.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Data sources and processing</title>
                <p>MS data and shotgun sequencing data obtained from ancient human dental calculus samples were used for these analyses
                    <sup>
                        <xref ref-type="bibr" rid="ref-36">36</xref>,
                        <xref ref-type="bibr" rid="ref-37">37</xref>
                    </sup>.</p>
                <p>
                    <italic toggle="yes">(1) MS data</italic>: peptides were identified from raw data files by comparing spectra from the second spectrometer of a tandem-MS (MS2) to reference spectra available in protein databases. Many existing proteomics software packages, such as MaxQuant, have been designed for analyzing large MS data sets, such as the MaxQB database, and thus can perform this task
                    <sup>
                        <xref ref-type="bibr" rid="ref-38">38</xref>
                    </sup>.</p>
                <p>
                    <italic toggle="yes">(2) Shotgun sequencing data</italic>: the resulting short reads, in FASTQ format, were first screened for human DNA. Sequence reads were aligned to a human reference genome (Genome Reference Consortium Human Build 38) to identify human sequences using the 
                    <ext-link ext-link-type="uri" xlink:href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml">Bowtie</ext-link> version 1.3.0 and 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/lh3/bwa">BWA</ext-link> programs version 0.7.17 
                    <sup>
                        <xref ref-type="bibr" rid="ref-39">39</xref>,
                        <xref ref-type="bibr" rid="ref-40">40</xref>
                    </sup>. Reads not aligning to the human reference genome were characterized as non-human.</p>
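                <p>The human/non-human split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the aligner output is available as SAM text and relies on the SAM FLAG bit 0x4, which marks a read as unmapped. The function name split_reads_by_mapping and the example records are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical helper: split SAM alignment records into human and non-human reads.
UNMAPPED_FLAG = 0x4  # SAM FLAG bit that is set when a read has no alignment


def split_reads_by_mapping(sam_lines):
    """Return (human_ids, non_human_ids) from an iterable of SAM text lines."""
    human, non_human = [], []
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        read_id, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            non_human.append(read_id)  # did not align to the human reference
        else:
            human.append(read_id)      # aligned to the human reference
    return human, non_human


# Made-up records for illustration only (not data from the study).
example_sam = [
    "@SQ\tSN:chr1\tLN:248956422",
    "read1\t0\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\tchr1\t500\t30\t50M\t*\t0\t0\t*\t*",
]
human_ids, non_human_ids = split_reads_by_mapping(example_sam)
print("human:", human_ids)
print("non-human:", non_human_ids)
```
]]></preformat>
                <p>In practice a mapping-quality threshold might also be applied before a read is called human; that choice is not specified in the text and is left out of this sketch.</p>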
                <p>All processed data were stored in a high-performance database for future analysis. A web user interface and a search/analysis engine
                    <sup>
                        <xref ref-type="bibr" rid="ref-41">41</xref>
                    </sup> were developed to access these data.</p>
            </sec>
            <sec>
                <title>Assessing presence of select pathogens</title>
                <p>We performed targeted pathogen searches for sequences of oral pathogenic microbes and other human pathogens, including the major human malaria parasite 
                    <italic toggle="yes">Plasmodium falciparum</italic>. We identified pathogenic oral microbes similar to previously published results, but no significant hits to 
                    <italic toggle="yes">P. falciparum</italic> from these two test-sets were identified. We additionally searched for marker oral microbiome species for other human infectious diseases as reported in detail in the results section.</p>
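                <p>One way to screen search output for significant pathogen hits is sketched below. It assumes the searches were run with BLAST and written in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The e-value and identity cut-offs, the function name significant_hits, and the example rows are illustrative assumptions, not values taken from this work.</p>
                <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical filter for BLAST tabular output (12 standard columns).
def significant_hits(blast_lines, max_evalue=1e-5, min_identity=90.0):
    """Keep hits whose e-value and percent identity pass the given cut-offs.

    The default thresholds are placeholders chosen for illustration.
    """
    hits = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):  # skip blanks and comments
            continue
        cols = line.rstrip("\n").split("\t")
        query, subject = cols[0], cols[1]
        identity = float(cols[2])   # column 3: percent identity
        evalue = float(cols[10])    # column 11: e-value
        if evalue <= max_evalue and identity >= min_identity:
            hits.append((query, subject, identity, evalue))
    return hits


# Made-up rows; the identifiers are placeholders, not real accessions.
example_rows = [
    "pep_001\tsubject_A\t98.5\t40\t1\t0\t1\t40\t10\t49\t2e-30\t85.1",
    "pep_002\tsubject_B\t71.0\t38\t11\t0\t1\t38\t5\t42\t0.5\t22.3",
    "pep_003\tsubject_A\t95.0\t20\t1\t0\t1\t20\t7\t26\t3e-04\t40.0",
]
for hit in significant_hits(example_rows):
    print(hit)
```
]]></preformat>
                <p>Mapping subject identifiers to organisms (for example, to flag a target species) would need a separate lookup table, which is not shown here.</p>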
            </sec>
            <sec>
                <title>Operation and Implementation</title>
                <p>Source-code for our prototype is available through our GitHub repository (see 
                    <italic toggle="yes">Software availability</italic> section). This implementation requires the following software packages to reproduce: Python version 3.6.0; Flask version 1.1; R version 3.4.4; Perl version 5.26.1; BLAST version 2.10.0.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>To test our prototype
                <sup>
                    <xref ref-type="bibr" rid="ref-41">41</xref>
                </sup>, we searched for pathogen sequences against the two archaeological samples in the database, one from Denmark 1100-1450 AD
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup> and one from the United Kingdom 1770-1855 AD
                <sup>
                    <xref ref-type="bibr" rid="ref-34">34</xref>
                </sup>. The medieval Danish samples came with a complete dental pathology characterization and individual-level data. Consistent with the reported results
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup>, both bacteria associated with oral disease and bacteria normally found in the oral microbiome could be recovered (
                <xref ref-type="fig" rid="f16">Figure 16</xref>). For instance, the species 
                <italic toggle="yes">Porphyromonas gingivalis</italic> is frequently present in individuals with periodontal diseases, while 
                <italic toggle="yes">Streptococcus sanguinis</italic> is present in both medieval and contemporary individuals with satisfactory oral health.</p>
            <p>This approach can also be used to discover other bacteria linked to health and possibly reveal other correlations between microbiome bacteria and health status, as well as recent evolutionary changes. In archaeology, the current focus is on revealing specific pathogens, and there is no established reference material for investigating the past microbiome or its effects on health. Even in recent studies, any conclusions about medieval or older individuals are based on direct comparison with the contemporary microbiome. By using archaeological methods (chronological seriation) together with software developed from our code, it will be possible to investigate correlations between microbiome and health by searching individuals dating to older periods. Such work could provide a reference standard for archaeologists and evolutionary data for health professionals. For example, using the existing data, we found the opportunistic respiratory pathogen 
                <italic toggle="yes">Haemophilus parainfluenzae</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-43">43</xref>,
                    <xref ref-type="bibr" rid="ref-44">44</xref>
                </sup> present less frequently in this set of medieval samples (Wilcoxon test, p &lt; 0.05), raising interesting questions about transitions in human society and infectious diseases. This group appears in Neolithic agrarian human oral microbiomes (7440 BCE)
                <sup>
                    <xref ref-type="bibr" rid="ref-45">45</xref>
                </sup>, but is at low levels in human groups practicing hunting and gathering (2000 BCE, modern day South Africa). Questions of interest to both health professionals and archaeologists that could be answered by employing our code may be when this pathogen became more frequent and why.</p>
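            <p>The kind of rank-based comparison mentioned above can be illustrated with a short sketch. The text does not state which Wilcoxon variant was used; the code below assumes the unpaired rank-sum form (equivalent to the Mann-Whitney U statistic) for two independent groups of samples. The function names and the toy abundance values are hypothetical, and the sketch stops at the U statistics without deriving a p-value.</p>
            <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical sketch of the Wilcoxon rank-sum (Mann-Whitney U) statistic.
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Toy relative abundances, invented for illustration only.
medieval = [0.01, 0.02, 0.02, 0.03]
modern = [0.04, 0.05, 0.02, 0.06]
u_medieval, u_modern = rank_sum_u(medieval, modern)
print(u_medieval, u_modern)
```
]]></preformat>
            <p>A small U for one group indicates that its values tend to rank below those of the other group; a significance level would normally be obtained from a statistics library rather than computed by hand.</p>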
            <fig fig-type="figure" id="f16" orientation="portrait" position="float">
                <label>Figure 16. </label>
                <caption>
                    <title>Medieval oral microbiome with bacterial species as markers for oral diseases.</title>
                    <p>
                        <bold>A</bold>. A total of over 200 bacterial species have been recovered from a metaproteomics study using medieval dental calculus
                        <sup>
                            <xref ref-type="bibr" rid="ref-36">36</xref>
                        </sup>. Label-free protein quantitation (LFQ) was used to quantify all samples and conduct comparative analysis. The taxa abundance levels were normalized on a scale from 0 to 10, and the circle sizes indicate the frequency of taxa occurrences in the study. 
                        <bold>B</bold>. Representative species of oral diseases (e.g. 
                        <italic toggle="yes">Porphyromonas gingivalis</italic> and 
                        <italic toggle="yes">Filifactor alocis</italic> ), oral health ( 
                        <italic toggle="yes">Cardiobacterium hominis</italic> and 
                        <italic toggle="yes">Streptococcus sanguinis</italic>), and potential respiratory disease markers (
                        <italic toggle="yes">Haemophilus spp.</italic>)
                        <sup>
                            <xref ref-type="bibr" rid="ref-43">43</xref>,
                            <xref ref-type="bibr" rid="ref-44">44</xref>
                        </sup>. Modern-day oral microbiomes from dental plaques and calculus are from the HMP database.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure16.gif"/>
            </fig>
            <p>Understanding the origins and evolution of pathogens is essential to preparing for future pandemics. The only successful attempt to combine archaeology with genetics and health studies to investigate past pathogens, the reconstruction of the 1918 flu pathogen
                <sup>
                    <xref ref-type="bibr" rid="ref-46">46</xref>
                </sup>, proved to be both technically challenging and costly, even though fewer than a hundred years had passed since the pandemic, because that work tried to reproduce an active virus that is now extinct. It was also very useful in demonstrating that the strong virulence reported in historical sources, but unconfirmed in medicine, was real. Since 1919, only COVID-19 has demonstrated a similar virulence, showing that data from the historical record can be critical in addressing new types of known viruses and pathogens, which can regain traits unseen for a century or more within that category of pathogens (respiratory viruses with flu-like symptoms in this case). That work also showed how essential the choice of suitable burial grounds is for such research. Our work uses new -omics analyses that provide new sources of data and could prove equally valuable, revealing the history of recent pathogens, characteristics that may have been present only occasionally, and their successes and failures. Future pathogens might reuse and recombine successful traits (symptoms, virulence) from past epidemics; our preparedness therefore depends on knowing what to expect and on learning from the past.</p>
            <p>The results of our work are therefore limited to enabling future interdisciplinary research and opening a path to answering new questions. Proteomic and metabolomic data from pre-modern individuals are still rare, and there is no existing database, besides data from a few academic papers, that our software code could search. Yet enabling new studies through a working proof-of-concept will accelerate the production of databases for ancient individuals. Existing archaeological studies grew out of early full-genome sequencing and have been severely limited by such approaches. New -omics analyses combined with our approach can provide valuable information on older pathogens. Future work may focus on epidemics initially, but it also has the potential to reveal and explain more subtle and complex relationships between the human microbiome and health.</p>
        </sec>
        <sec>
            <title>Team 6 - Animal</title>
            <p>
                <bold>Project title: Capturing ecological and host drivers of microbiomes</bold>
            </p>
            <p>Project Rationales, Descriptions and Goals</p>
            <p>
                <bold>Rationale:</bold> One primary goal of host-microbiome studies is to capture and understand ecological and host drivers of microbial diversity. Research on host-microbiome associations across host species has been facilitated by the increasing accessibility of high-throughput sequencing techniques and the availability of integrated microbiome datasets, such as the Earth Microbiome Project dataset
                <sup>
                    <xref ref-type="bibr" rid="ref-47">47</xref>
                </sup>. These have yielded useful insights on how host-microbiome associations are impacted by host diet
                <sup>
                    <xref ref-type="bibr" rid="ref-48">48</xref>
                </sup>, host taxonomy or phylogeny
                <sup>
                    <xref ref-type="bibr" rid="ref-49">49</xref>
                </sup>, host immune system
                <sup>
                    <xref ref-type="bibr" rid="ref-50">50</xref>
                </sup>, and environmental factors
                <sup>
                    <xref ref-type="bibr" rid="ref-51">51</xref>
                </sup>. However, host traits vary immensely across species, and this diversity has been undersampled in microbiome studies. As a result, the effects of other host factors on host-microbiome associations, including body mass and life history, have been understudied relative to previously characterized host and environmental effects.</p>
            <p>
                <bold>Goal:</bold> In this project, we aim to investigate the effects of various host traits, including diet, host taxonomy, body mass, and longevity, in relation to environmental factors, on the intestinal, fecal, foregut, and stomach microbiomes of Metazoan (animal) species. We first mined available microbiome and metadata datasets, then applied unsupervised learning directly to rarefied OTU abundance data to uncover clusters of microbial community similarity among animals.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Data sources and processing</title>
                <p>A rarefied OTU table (1000 reads per sample) and metadata of internal animal microbiomes from the Earth Microbiome Project
                    <sup>
                        <xref ref-type="bibr" rid="ref-47">47</xref>
                    </sup> were obtained from Woodhams 
                    <italic toggle="yes">et al.</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-52">52</xref>
                    </sup>. The OTU table was filtered to remove plant samples (Kingdom Plantae), OTUs with &lt;10 total counts across samples, and OTUs occurring in &lt;2 samples.</p>
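                <p>The filtering rules above can be sketched in a few lines. This is an illustrative reading, not the authors' code: it assumes the OTU table is held as nested dictionaries of counts, that plant samples are identified by a kingdom label in the sample metadata, and that the count and prevalence thresholds are applied after plant samples are dropped. The names filter_otu_table, otu_counts and sample_kingdom, and all toy values, are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical sketch of the OTU-table filtering step.
def filter_otu_table(otu_counts, sample_kingdom, min_total=10, min_samples=2):
    """Drop plant samples, then drop low-count and low-prevalence OTUs.

    otu_counts:     {otu_id: {sample_id: read_count}}
    sample_kingdom: {sample_id: kingdom label from the sample metadata}
    """
    # Keep every sample whose host kingdom is not Plantae.
    kept_samples = [s for s, kingdom in sample_kingdom.items() if kingdom != "Plantae"]
    filtered = {}
    for otu, counts in otu_counts.items():
        row = {s: counts.get(s, 0) for s in kept_samples}
        total = sum(row.values())                       # total counts across samples
        prevalence = sum(1 for c in row.values() if c > 0)  # samples with the OTU
        if total >= min_total and prevalence >= min_samples:
            filtered[otu] = row
    return filtered


# Toy data invented for illustration.
sample_kingdom = {"s1": "Animalia", "s2": "Animalia", "s3": "Plantae"}
otu_counts = {
    "otu_a": {"s1": 8, "s2": 5, "s3": 0},    # kept: 13 counts in 2 samples
    "otu_b": {"s1": 12, "s2": 0, "s3": 4},   # dropped: only 1 animal sample
    "otu_c": {"s1": 3, "s2": 2, "s3": 40},   # dropped: 5 counts without plants
}
filtered_table = filter_otu_table(otu_counts, sample_kingdom)
print(filtered_table)
```
]]></preformat>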
            </sec>
            <sec>
                <title>Metadata collection</title>
                <p>For each sequenced species in our dataset, we added metadata for body mass and maximum longevity, if available. Body mass data was collected from the Pantheria archives
                    <sup>
                        <xref ref-type="bibr" rid="ref-53">53</xref>
                    </sup>, the Caviede Vidal dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref-54">54</xref>
                    </sup>, and the 
                    <ext-link ext-link-type="uri" xlink:href="https://eol.org/">Encyclopedia of Life</ext-link>. Body mass data was categorized to create three equally sized groups (excluding 
                    <italic toggle="yes">Homo sapiens</italic>): big (&gt; 58.7 kg), medium (&gt; 19.57 kg, &#x2264; 58.7 kg), and small (&#x2264; 19.57 kg). Maximum longevity data was obtained from AnAge
                    <sup>
                        <xref ref-type="bibr" rid="ref-55">55</xref>
                    </sup>.</p>
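                <p>The body mass categories can be expressed as a small lookup. The thresholds are the ones stated above; the function name body_mass_group and the example hosts and masses are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Cut points in kilograms, taken from the category definitions in the text.
BIG_ABOVE_KG = 58.7
MEDIUM_ABOVE_KG = 19.57


def body_mass_group(mass_kg):
    """Map a body mass in kg to 'big', 'medium' or 'small'."""
    if mass_kg > BIG_ABOVE_KG:
        return "big"
    if mass_kg > MEDIUM_ABOVE_KG:
        return "medium"
    return "small"


# Invented masses for illustration; not values from the study's metadata.
toy_masses_kg = {
    "host_a": 0.02,
    "host_b": 19.57,
    "host_c": 45.0,
    "host_d": 58.7,
    "host_e": 120.0,
}
groups = {host: body_mass_group(mass) for host, mass in toy_masses_kg.items()}
print(groups)
print(Counter(groups.values()))
```
]]></preformat>
                <p>Boundary values fall into the lower category, matching the inclusive upper bounds given for the small and medium groups.</p>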
            </sec>
            <sec>
                <title>Unsupervised learning analysis</title>
                <p>To explore distinct microbial composition structures across samples, an unsupervised cluster analysis was performed on the processed OTU table. OTUs present in fewer than 5% of samples were discarded to obtain robust clusters. A sample-wise distance matrix was then computed using the Jensen-Shannon distance on the OTU table of relative abundances. The PAM (partitioning around medoids) clustering analysis was completed using the 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=cluster">cluster</ext-link> version 2.1.0 package in R software version 3.6.1
                    <sup>
                        <xref ref-type="bibr" rid="ref-56">56</xref>
                    </sup>. The number of clusters was chosen to maximize the silhouette coefficient
                    <sup>
                        <xref ref-type="bibr" rid="ref-57">57</xref>
                    </sup>. To visualize results of the cluster analysis, principal component analysis was completed using 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=ade4">ade4</ext-link> version 1.7-13 package in R software. Individual samples were plotted in the space of the top two principal components.</p>
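                <p>The clustering logic described above (Jensen-Shannon distance, medoid-based clustering, and choosing the number of clusters by silhouette) can be sketched in plain Python. This is an illustration under stated assumptions rather than the analysis itself, which used the R cluster package: the medoids here are found by exhaustive search, which is only feasible for tiny inputs and stands in for the PAM heuristic, and the toy relative-abundance profiles are invented. All function names are hypothetical.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math
from itertools import combinations


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base-2 logarithm)
    between two relative-abundance vectors that each sum to 1."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


def distance_matrix(profiles):
    n = len(profiles)
    return [[jensen_shannon_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def k_medoids_exhaustive(dist, k):
    """Try every set of k medoids and keep the one with the lowest total
    distance. A brute-force stand-in for PAM, usable only on tiny inputs."""
    n = len(dist)
    best = min(combinations(range(n), k),
               key=lambda meds: sum(min(dist[i][m] for m in meds) for i in range(n)))
    # Label each sample with the position of its nearest medoid.
    return [min(range(k), key=lambda c: dist[i][best[c]]) for i in range(n)]


def mean_silhouette(dist, labels):
    """Average silhouette width over all samples (needs 2 or more clusters)."""
    scores = []
    for i, own in enumerate(labels):
        same = [dist[i][j] for j, lab in enumerate(labels) if lab == own and j != i]
        if not same:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist[i][j] for j, lab in enumerate(labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != own
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy relative-abundance profiles (rows: samples, columns: OTUs); invented.
profiles = [
    [0.80, 0.10, 0.10], [0.70, 0.20, 0.10], [0.75, 0.15, 0.10],
    [0.10, 0.80, 0.10], [0.10, 0.70, 0.20], [0.15, 0.75, 0.10],
]
dist = distance_matrix(profiles)
candidates = {k: k_medoids_exhaustive(dist, k) for k in (2, 3)}
best_k = max(candidates, key=lambda k: mean_silhouette(dist, candidates[k]))
labels = candidates[best_k]
print(best_k, labels)
```
]]></preformat>
                <p>For real data sets the exhaustive search would be replaced by a proper PAM implementation, and the candidate range for the number of clusters would be wider than the two values tried here.</p>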
            </sec>
            <sec>
                <title>ANOVA F-test and correlation analysis</title>
                <p>For feature selection, ANOVA F-tests were implemented in Python to identify quantitative metadata variables whose means differed significantly between clusters. Pearson correlation analysis was also performed in Python to evaluate linear relationships between metadata variables.</p>
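                <p>The two statistics named above can be written out directly, which makes explicit what is being compared. The sketch below computes a one-way ANOVA F statistic across cluster groups and a Pearson correlation coefficient between two variables; it does not compute p-values. The function names and toy values are hypothetical, and the authors' actual Python implementation is not described in the text.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math


def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square divided by
    within-group mean square, for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy values of one metadata variable, split by cluster; invented.
variable_by_cluster = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_score = anova_f(variable_by_cluster)

# Toy values of two metadata variables measured on the same samples; invented.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(f_score, r)
```
]]></preformat>
                <p>In a typical Python workflow, scipy.stats.f_oneway and scipy.stats.pearsonr return these statistics together with their p-values; whether the original analysis used them is an assumption.</p>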
            </sec>
            <sec>
                <title>Operation and implementation</title>
                <p>The analyses can be performed on a local computer or server with R and Python installed. A step-by-step tutorial of the unsupervised clustering approach is available at 
                    <ext-link ext-link-type="uri" xlink:href="https://enterotype.embl.de/enterotypes.html">https://enterotype.embl.de/enterotypes.html</ext-link>. The R Markdown and Python code used for the analyses is also available, as listed in the 
                    <italic toggle="yes">Software availability</italic> section
                    <sup>
                        <xref ref-type="bibr" rid="ref-58">58</xref>
                    </sup>.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>We analyzed 726 samples spanning 199 terrestrial and freshwater Metazoan species within seven classes (
                <xref ref-type="fig" rid="f17">Figure 17</xref>). Our unsupervised learning approach generated three sample clusters (
                <xref ref-type="fig" rid="f18">Figure 18</xref>). The largest and most diverse cluster (cluster 1) comprised ~92% of all samples (n=667) from 21 Metazoan orders. These included lepidoptera (butterflies and moths; n=165), primates (n=85), anura (n=79), chiroptera (bats; n=44), carnivora (n=41), passeriformes (perching birds; n=37), hymenoptera (n=27), artiodactyla (n=26), diprotodontia (n=24), rodentia (n=23), lagomorpha (n=19), columbiformes (n=18), cypriniformes (n=18), squamata (n=17), anseriformes (n=9), gasterosteiformes (n=9), coleoptera (n=7), pilosa (n=7), cingulata (n=6), casuariiformes (n=5), and hemiptera (n=1). Cluster 2 comprised 34 samples from bats (n=16), butterflies and moths (n=10), perching birds (n=6), the dung beetle 
                <italic toggle="yes">Teuchestes fossor</italic> (n=1), and the giant anteater  
                <italic toggle="yes">Myrmecophaga tridactyla</italic> (n=1). Cluster 3 was the smallest (n=25) and exclusively comprised butterfly and moth samples belonging to seven species. These included 
                <italic toggle="yes">Maculinea alcon</italic> (n=9), 
                <italic toggle="yes">Durbania amakosa</italic> (n=5), 
                <italic toggle="yes">Spalgis epeus</italic> (n=5), 
                <italic toggle="yes">Lycaena clarki</italic> (n=2), 
                <italic toggle="yes">Surendra vivarna</italic> (n=2), 
                <italic toggle="yes">Anthene usamba</italic> (n=1), and 
                <italic toggle="yes">Rapla iarbus</italic> (n=1).</p>
            <fig fig-type="figure" id="f17" orientation="portrait" position="float">
                <label>Figure 17. </label>
                <caption>
                    <title>Number of samples (y-axis) analyzed for each host class (x-axis) in this study.</title>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure17.gif"/>
            </fig>
            <fig fig-type="figure" id="f18" orientation="portrait" position="float">
                <label>Figure 18. </label>
                <caption>
                    <title>Principal component analysis (PCA) plot showing the three animal clusters. </title>
                    <p>The data clusters were generated by the Partitioning Around Medoids (PAM) clustering algorithm on Jensen-Shannon divergence calculated from OTU relative abundances. Each point on the plot represents a sample, and each cluster was labelled with its general taxonomic composition and sample sizes.</p>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/29214/ca8f781b-3111-4461-98e0-346e2215b98f_figure18.gif"/>
            </fig>
            <p>ANOVA indicated that the most significant differences in cluster means were in measures of microbial alpha diversity: Simpson diversity, Shannon diversity, Faith&#x2019;s phylogenetic diversity, and Chao1 diversity (
                <xref ref-type="table" rid="T4">Table 4</xref>). Digestive habitat type, host taxonomy/phylogeny, immune complexity, and life stage were also significantly different between clusters, along with DNA extraction methods and environmental variables. Notably, body mass and maximum longevity were also significantly different between clusters.</p>
            <table-wrap id="T4" orientation="portrait" position="anchor">
                <label>Table 4. </label>
                <caption>
                    <title>ANOVA F scores and p-values of metadata variables significantly associated (p&lt;0.05) with cluster groupings.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Metadata Variable</th>
                            <th align="left" colspan="1" rowspan="1">F Score</th>
                            <th align="left" colspan="1" rowspan="1">p value</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Simpson diversity</td>
                            <td colspan="1" rowspan="1" valign="top">135.93</td>
                            <td colspan="1" rowspan="1" valign="top">7.71E-51</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Shannon diversity</td>
                            <td colspan="1" rowspan="1" valign="top">85.60</td>
                            <td colspan="1" rowspan="1" valign="top">4.30E-34</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Digestive habitat type (intestine, fecal,
                                <break/>foregut, stomach)</td>
                            <td colspan="1" rowspan="1" valign="top">43.29</td>
                            <td colspan="1" rowspan="1" valign="top">1.75E-18</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Faith's phylogenetic diversity</td>
                            <td colspan="1" rowspan="1" valign="top">33.94</td>
                            <td colspan="1" rowspan="1" valign="top">8.14E-15</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Host phylum</td>
                            <td colspan="1" rowspan="1" valign="top">28.96</td>
                            <td colspan="1" rowspan="1" valign="top">7.99E-13</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Host phylogeny (nDMS proxy)</td>
                            <td colspan="1" rowspan="1" valign="top">28.95</td>
                            <td colspan="1" rowspan="1" valign="top">8.03E-13</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Immune complexity (ordinal score)</td>
                            <td colspan="1" rowspan="1" valign="top">27.31</td>
                            <td colspan="1" rowspan="1" valign="top">3.67E-12</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Observed OTUs</td>
                            <td colspan="1" rowspan="1" valign="top">26.29</td>
                            <td colspan="1" rowspan="1" valign="top">9.49E-12</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Life stage (larvae, juvenile/pupae, infant,
                                <break/>adult)</td>
                            <td colspan="1" rowspan="1" valign="top">20.40</td>
                            <td colspan="1" rowspan="1" valign="top">2.40E-09</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Chao1 diversity</td>
                            <td colspan="1" rowspan="1" valign="top">20.26</td>
                            <td colspan="1" rowspan="1" valign="top">2.76E-09</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Preservation method (ethanol, freezing,
                                <break/>RNAlater, others)</td>
                            <td colspan="1" rowspan="1" valign="top">17.91</td>
                            <td colspan="1" rowspan="1" valign="top">2.55E-08</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Maximum temperature of the warmest
                                <break/>month</td>
                            <td colspan="1" rowspan="1" valign="top">17.38</td>
                            <td colspan="1" rowspan="1" valign="top">4.25E-08</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Host family</td>
                            <td colspan="1" rowspan="1" valign="top">14.50</td>
                            <td colspan="1" rowspan="1" valign="top">6.68E-07</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Longitude</td>
                            <td colspan="1" rowspan="1" valign="top">11.64</td>
                            <td colspan="1" rowspan="1" valign="top">1.06E-05</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Body mass</td>
                            <td colspan="1" rowspan="1" valign="top">10.50</td>
                            <td colspan="1" rowspan="1" valign="top">3.19E-05</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Mean diurnal temperature range</td>
                            <td colspan="1" rowspan="1" valign="top">10.46</td>
                            <td colspan="1" rowspan="1" valign="top">3.33E-05</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Surrounding habitat (freshwater,
                                <break/>terrestrial)</td>
                            <td colspan="1" rowspan="1" valign="top">5.55</td>
                            <td colspan="1" rowspan="1" valign="top">0.004051</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Host order</td>
                            <td colspan="1" rowspan="1" valign="top">5.24</td>
                            <td colspan="1" rowspan="1" valign="top">0.005518</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Mean temperature of the driest quarter</td>
                            <td colspan="1" rowspan="1" valign="top">5.08</td>
                            <td colspan="1" rowspan="1" valign="top">0.006419</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Maximum longevity</td>
                            <td colspan="1" rowspan="1" valign="top">3.73</td>
                            <td colspan="1" rowspan="1" valign="top">0.024552</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Precipitation seasonality</td>
                            <td colspan="1" rowspan="1" valign="top">3.39</td>
                            <td colspan="1" rowspan="1" valign="top">0.034152</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">DNA extraction method (DNeasy
                                <break/>PowerSoil, E.Z.N.A. Stool DNA Kit,
                                <break/>PowerFecal, QIAamp DNA Stool Mini Kit,
                                <break/>ZR Fecal DNA Miniprep Kit)</td>
                            <td colspan="1" rowspan="1" valign="top">3.16</td>
                            <td colspan="1" rowspan="1" valign="top">0.042965</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Vegetation index (NDVI MODIS)</td>
                            <td colspan="1" rowspan="1" valign="top">3.14</td>
                            <td colspan="1" rowspan="1" valign="top">0.043888</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>Cluster-specific correlation analyses showed that alpha diversity in clusters 1 and 2 was consistently positively correlated with host taxonomy, immune complexity, diet, maximum longevity and latitude. Body mass, vegetation index, terrain complexity, mean temperature of the driest quarter and precipitation of the warmest and coldest quarters showed positive correlations with alpha diversity in cluster 1, but not cluster 2. Latitude and country were positively correlated with alpha diversity in cluster 2, but not cluster 1. Alpha diversity in cluster 3, which comprised butterflies and moths, was positively correlated with environmental variables (terrain complexity, mean diurnal temperature range, precipitation seasonality, elevation) and host factors (digestive habitat type and diet).</p>
            <p>The results support our premise that host traits, including but not limited to body mass and maximum longevity, are undersampled in microbial diversity studies. These understudied host traits could shape animal internal microbiomes together with well-characterized host traits and environmental variables. Based on our results, we propose comprehensive sampling of host traits in future microbiome studies, which may yield new and unexpected patterns of microbial community organization that serve as a baseline for deeper investigations.</p>
            <sec>
                <title>Lessons learned</title>
                <p>Throughout this process we identified several areas where improvements could be made for future disease-focused hackathons. A few of these are described below.</p>
                <p>
                    <bold>Collaboration across domains</bold> requires extensive communication with minimal use of jargon, and active learning from diverse backgrounds. We aimed to expand on the traditional foundation of codeathons, and we generated novel tools by leveraging the research strengths of the local community. However, the six teams faced some challenges in working together efficiently, with barriers in communicating the feasibility and significance of particular problems. In-depth yet succinct explanations of the technical problems are critical for successful operation.</p>
                <p>
                    <bold>Scalability of R</bold> was called into question during prototype development. For computations on large datasets, a more efficient implementation can be developed once the prototype has proven useful to the community. However, the granularity of solutions available in R makes it the preferred tool for designing and experimenting with different solutions.</p>
                <p>
                    <bold>Meticulous documentation</bold> of each analysis step remains crucial for effective dissemination of our approach and results. Documentation is a necessary component of any project and also an excellent opportunity to apply the skillsets of non-coders, as well as to heighten engagement of trainees by reinforcing project rationale. Good documentation, including simple flowcharts, is a very useful tool for keeping focus. Non-coding participants who want to gain some experience can often quickly learn Markdown and become vital contributors to repositories.</p>
            </sec>
        </sec>
        <sec sec-type="conclusion">
            <title>Conclusion and next steps</title>
            <p>Interdisciplinary collaborations have proven to be very productive, as shown by our six working prototypes addressing broad microbiome-related challenges, ranging from power calculations and AI classifiers to GIS integration and large data set visualizations. Although working across fields was challenging, we found that parsing a complex question into distinct parts can help experts from different domains work together and accomplish tasks that none of them could accomplish in isolation. The codeathon workflow is thus a useful research model for many urgent societal problems that suffer from knowledge-transfer and communication issues. We have made all data and code publicly available for further exploration of these tools. Importantly, we are developing impactful projects to further pursue the intersectional research spurred by this event, including microbiome-related machine learning and data mining across archaeological time and geography.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>All data underlying the results are available as part of the article and no additional source data are required.</p>
        </sec>
        <sec>
            <title>Software availability</title>
            <sec>
                <title>Team 1</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/Team1_MicroPowerPlus">https://github.com/USFOneHealthCodeathon2020/Team1_MicroPowerPlus</ext-link>.</p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4031770">https://doi.org/10.5281/zenodo.4031770</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>.</p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
            <sec>
                <title>Team 2</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/Team2_GEO">https://github.com/USFOneHealthCodeathon2020/Team2_GEO</ext-link>.
                </p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4034466">https://doi.org/10.5281/zenodo.4034466</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>.</p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
            <sec>
                <title>Team 3</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/projectZer0">https://github.com/USFOneHealthCodeathon2020/projectZer0</ext-link>.</p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4031780">https://doi.org/10.5281/zenodo.4031780</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup>.
                </p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
            <sec>
                <title>Team 4</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/Team-YOLO">https://github.com/USFOneHealthCodeathon2020/Team-YOLO</ext-link>.</p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4031776">https://doi.org/10.5281/zenodo.4031776</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>.</p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
            <sec>
                <title>Team 5</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/Team5_MinhRays">https://github.com/USFOneHealthCodeathon2020/Team5_MinhRays</ext-link>.
                </p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4031785">https://doi.org/10.5281/zenodo.4031785</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-41">41</xref>
                    </sup>.</p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
            <sec>
                <title>Team 6</title>
                <p>
                    <bold>Source code available from:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/USFOneHealthCodeathon2020/Team6_LimSharma">https://github.com/USFOneHealthCodeathon2020/Team6_LimSharma</ext-link>.
                </p>
                <p>
                    <bold>Archived source code at time of publication:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.4031778">https://doi.org/10.5281/zenodo.4031778</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-58">58</xref>
                    </sup>.</p>
                <p>
                    <bold>License:</bold> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License 3.0</ext-link>.</p>
            </sec>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We thank Paige Hunt for coordinating and helping to organize the event. We also thank John Adams, Derek Wildman and Monica Uddin from the USF Genomics Program for their support. We thank Ming Ji, Derek Wildman and Christopher Collazo for forming the evaluation panel. We acknowledge the support of the USF Genomics Program, USF Omics Hub, USF Library, and the Institute for the Advanced Study of Culture and the Environment for extensive logistical help in organizing the OneHealth Host-Microbiome Interactions Codeathon.</p>
        </ack>
        <sec>
            <title>Author contributions</title>
            <table-wrap id="T2A" orientation="portrait" position="anchor">
                <caption>
                    <title>&#x00a0;</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Contributor Role</th>
                            <th align="left" colspan="1" rowspan="1">Role Definition</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Conceptualization</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ, JO, JG, CW, TEK, AS, SA contributed ideas and contributed to the formulation or evolution of overarching research goals and aims.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Data Curation</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ, JO, JG, SA, TEK, GD, AS, MP, AW, CL, JL contributed to the management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later reuse.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Formal Analysis</td>
                            <td colspan="1" rowspan="1" valign="top">All authors participated in application of statistical, mathematical, computational, or other formal techniques to analyze or synthesize study data.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Funding Acquisition</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ contributed to acquisition of the financial support for the project leading to this publication. </td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Investigation</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to conducting the research and investigation process, specifically performing the experiments, or data/evidence collection.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Methodology</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to development or design of methodology; creation of models. </td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Project Administration</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ, JO, JG, SA, TEK, GD, AS, MP, AW, CL, JL contributed to management and coordination responsibility for the research activity planning and execution.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Resources</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ, JO, JG, SA, TEK, GD, AS, MP, AW, CL, JL contributed to provision of study materials, computing resources, or other analysis tools.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Software</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Supervision</td>
                            <td colspan="1" rowspan="1" valign="top">RHYJ, JO, JG, SA, TEK, GD, AS, MP, AW, CL, JL provided oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team.</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Validation</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to the verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs. </td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Visualization</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to the preparation, creation and/or presentation of the published work, specifically visualization/data presentation. </td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Writing &#x2013; Original Draft Preparation</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to the creation and/or presentation of the published work, specifically writing the initial draft (including substantive translation). </td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1" valign="top">Writing &#x2013; Review &amp; Editing</td>
                            <td colspan="1" rowspan="1" valign="top">All authors contributed to the preparation, creation and/or presentation of the published work by those from the original research group, specifically critical review, commentary or revision &#x2013; including pre- or post-publication stages.</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ferreira</surname>
                            <given-names>GC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oberstaller</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fonseca</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Iron Hack - A symposium/hackathon focused on porphyrias, Friedreich&#x2019;s ataxia, and other rare iron-related diseases [version 1; peer review: 2 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>:<fpage>1135</fpage>.
                    <pub-id pub-id-type="pmid">31824661</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.19140.1</pub-id>
                    <pub-id pub-id-type="pmcid">6894363</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Debelius</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Song</surname>
                            <given-names>SJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vazquez-Baeza</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Tiny microbes, enormous impacts: what matters in gut microbiome studies?</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2016</year>;<volume>17</volume>(<issue>1</issue>):<fpage>217</fpage>.
                    <pub-id pub-id-type="pmid">27760558</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-016-1086-x</pub-id>
                    <pub-id pub-id-type="pmcid">5072314</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Debelius</surname>
                            <given-names>JW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>V&#x00e1;zquez-Baeza</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McDonald</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Turning Participatory Microbiome Research into Usable Data: Lessons from the American Gut Project.</article-title>
                    <source>

                        <italic toggle="yes">J Microbiol Biol Educ.</italic>
</source>
                    <year>2016</year>;<volume>17</volume>(<issue>1</issue>):<fpage>46</fpage>&#x2013;<lpage>50</lpage>.
                    <pub-id pub-id-type="pmid">27047589</pub-id>
                    <pub-id pub-id-type="doi">10.1128/jmbe.v17i1.1034</pub-id>
                    <pub-id pub-id-type="pmcid">4798814</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <collab>NIH Human Microbiome Portfolio Analysis Team</collab>:
                    <article-title>A review of 10 years of human microbiome research activities at the US National Institutes of Health, Fiscal Years 2007-2016.</article-title>
                    <source>

                        <italic toggle="yes">Microbiome.</italic>
</source>
                    <year>2019</year>;<volume>7</volume>(<issue>1</issue>):<fpage>31</fpage>.
                    <pub-id pub-id-type="pmid">30808411</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s40168-019-0620-y</pub-id>
                    <pub-id pub-id-type="pmcid">6391833</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kelly</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gross</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bittinger</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2015</year>;<volume>31</volume>(<issue>15</issue>):<fpage>2461</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">25819674</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv183</pub-id>
                    <pub-id pub-id-type="pmcid">4514928</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <collab>Human Microbiome Project Consortium</collab>:
                    <article-title>A framework for human microbiome research.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2012</year>;<volume>486</volume>(<issue>7402</issue>):<fpage>215</fpage>&#x2013;<lpage>21</lpage>.
                    <pub-id pub-id-type="pmid">22699610</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature11209</pub-id>
                    <pub-id pub-id-type="pmcid">3377744</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Oksanen</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Blanchet</surname>
                            <given-names>FG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Friendly</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>vegan: Community Ecology Package. R package version 2.5-6</article-title>.<year>2019</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/vegan/vegan.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Oberstaller</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>DokurOmkar</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gibbons</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/Team1_MicroPowerPlus: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4031770">http://www.doi.org/10.5281/zenodo.4031770</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <collab>RStudio Team</collab>:
                    <article-title>RStudio: Integrated Development Environment for R</article-title>. RStudio, PBC: Boston, MA.<year>2020</year>.</mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chang</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cheng</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Allaire</surname>
                            <given-names>JJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>shiny: Web Application Framework for R</article-title>. R package version 1.5.0.<year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=shiny">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <collab>Plotly Technologies Inc.</collab>:
                    <article-title>Collaborative data science</article-title>.<year>2015</year>.</mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wickham</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Averick</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bryan</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Welcome to the tidyverse.</article-title>
                    <source>

                        <italic toggle="yes">J Open Source Softw.</italic>
</source>
                    <year>2019</year>;<volume>4</volume>(<issue>43</issue>):<fpage>1686</fpage>.
                    <pub-id pub-id-type="doi">10.21105/joss.01686</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Laue</surname>
                            <given-names>HE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brennan</surname>
                            <given-names>KJM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gillet</surname>
                            <given-names>V</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Associations of prenatal exposure to polybrominated diphenyl ethers and polychlorinated biphenyls with long-term gut microbiome structure: a pilot study.</article-title>
                    <source>

                        <italic toggle="yes">Environ Epidemiol.</italic>
</source>
                    <year>2019</year>;<volume>3</volume>(<issue>1</issue>):<fpage>e039</fpage>.
                    <pub-id pub-id-type="pmid">30778401</pub-id>
                    <pub-id pub-id-type="doi">10.1097/EE9.0000000000000039</pub-id>
                    <pub-id pub-id-type="pmcid">6376400</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <collab>U.S. Environmental Protection Agency</collab>:
                    <article-title>What is Superfund?</article-title> <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.epa.gov/superfund/what-superfund">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Friis</surname>
                            <given-names>RH</given-names>
                        </name>
</person-group>:
                    <article-title>Essentials of Environmental Health</article-title>. Jones &amp; Bartlett Learning.<year>2019</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://books.google.co.in/books/about/Essentials_of_Environmental_Health.html?id=L4bsu11O9RQC&amp;redir_esc=y">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <collab>U.S. Environmental Protection Agency</collab>:
                    <article-title>Hazard Ranking System Guidance Manual</article-title>.<year>2020</year>.
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jin</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zeng</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Effects of environmental pollutants on gut microbiota.</article-title>
                    <source>

                        <italic toggle="yes">Environ Pollut.</italic>
</source>
                    <year>2017</year>;<volume>222</volume>:<fpage>1</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">28086130</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.envpol.2016.11.045</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Capellini</surname>
                            <given-names>FM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vencia</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Amadori</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Characterization of MDCK cells and evaluation of their ability to respond to infectious and non-infectious stressors.</article-title>
                    <source>

                        <italic toggle="yes">Cytotechnology.</italic>
</source>
                    <year>2020</year>;<volume>72</volume>(<issue>1</issue>):<fpage>97</fpage>&#x2013;<lpage>109</lpage>.
                    <pub-id pub-id-type="pmid">31802289</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s10616-019-00360-z</pub-id>
                    <pub-id pub-id-type="pmcid">7002637</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>McMurdie</surname>
                            <given-names>PJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Holmes</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2013</year>;<volume>8</volume>(<issue>4</issue>):<fpage>e61217</fpage>.
                    <pub-id pub-id-type="pmid">23630581</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0061217</pub-id>
                    <pub-id pub-id-type="pmcid">3632530</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>CancerGenetics007</surname>
                        </name>

                        <name name-style="western">
                            <surname>Keller</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>dahrendorff</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oberstaller</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/Team2_GEO: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4034466">http://www.doi.org/10.5281/zenodo.4034466</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wright</surname>
                            <given-names>MN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ziegler</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.</article-title>
                    <source>

                        <italic toggle="yes">J Stat Softw.</italic>
</source>
                    <year>2017</year>;<volume>77</volume>(<issue>1</issue>).
                    <pub-id pub-id-type="doi">10.18637/jss.v077.i01</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <article-title>Seqtk: a fast and lightweight tool for processing FASTA or FASTQ sequences</article-title>.<year>2013</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/lh3/seqtk/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Callahan</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McMurdie</surname>
                            <given-names>PJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rosen</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>DADA2: High-resolution sample inference from Illumina amplicon data.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2016</year>;<volume>13</volume>(<issue>7</issue>):<fpage>581</fpage>&#x2013;<lpage>583</lpage>.
                    <pub-id pub-id-type="pmid">27214047</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3869</pub-id>
                    <pub-id pub-id-type="pmcid">4927377</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Quast</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pruesse</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yilmaz</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The SILVA ribosomal RNA gene database project: improved data processing and web-based tools.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2013</year>;<volume>41</volume>(<issue>Database issue</issue>):<fpage>D590</fpage>&#x2013;<lpage>D596</lpage>.
                    <pub-id pub-id-type="pmid">23193283</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gks1219</pub-id>
                    <pub-id pub-id-type="pmcid">3531112</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Abadi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>TensorFlow: Large-scale machine learning on heterogeneous systems</article-title>.<year>2015</year>.</mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Anujit-sarkar</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oberstaller</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/projectZer0: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4031780">http://www.doi.org/10.5281/zenodo.4031780</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mysara</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vandamme</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Props</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Reconciliation between operational taxonomic units and species boundaries.</article-title>
                    <source>

                        <italic toggle="yes">FEMS Microbiol Ecol.</italic>
</source>
                    <year>2017</year>;<volume>93</volume>(<issue>4</issue>):<fpage>fix029</fpage>.
                    <pub-id pub-id-type="pmid">28334218</pub-id>
                    <pub-id pub-id-type="doi">10.1093/femsec/fix029</pub-id>
                    <pub-id pub-id-type="pmcid">5812548</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Agaz</surname>
                            <given-names>NW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bibber</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dean</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/Team-YOLO: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4031776">http://www.doi.org/10.5281/zenodo.4031776</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Turnbaugh</surname>
                            <given-names>PJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hamady</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yatsunenko</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A core gut microbiome in obese and lean twins.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2009</year>;<volume>457</volume>(<issue>7228</issue>):<fpage>480</fpage>&#x2013;<lpage>484</lpage>.
                    <pub-id pub-id-type="pmid">19043404</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature07540</pub-id>
                    <pub-id pub-id-type="pmcid">2677729</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Randolph</surname>
                            <given-names>TW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhao</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Copeland</surname>
                            <given-names>W</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Kernel-Penalized Regression for Analysis of Microbiome Data.</article-title>
                    <source>

                        <italic toggle="yes">Ann Appl Stat.</italic>
</source>
                    <year>2018</year>;<volume>12</volume>(<issue>1</issue>):<fpage>540</fpage>&#x2013;<lpage>566</lpage>.
                    <pub-id pub-id-type="pmid">30224943</pub-id>
                    <pub-id pub-id-type="doi">10.1214/17-AOAS1102</pub-id>
                    <pub-id pub-id-type="pmcid">6138053</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhou</surname>
                            <given-names>YH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gallins</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>A Review and Tutorial of Machine Learning Methods for Microbiome Host Trait Prediction.</article-title>
                    <source>

                        <italic toggle="yes">Front Genet.</italic>
</source>
                    <year>2019</year>;<volume>10</volume>:<fpage>579</fpage>.
                    <pub-id pub-id-type="pmid">31293616</pub-id>
                    <pub-id pub-id-type="doi">10.3389/fgene.2019.00579</pub-id>
                    <pub-id pub-id-type="pmcid">6603228</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsagris</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Feature selection with the R package <italic toggle="yes">MXM</italic> [version 2; peer review: 2 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2018</year>;<volume>7</volume>:<fpage>1505</fpage>.
                    <pub-id pub-id-type="pmid">31656581</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.16216.2</pub-id>
                    <pub-id pub-id-type="pmcid">6792475</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kuhn</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Building Predictive Models in R Using the caret Package.</article-title>
                    <source>

                        <italic toggle="yes">J Stat Softw.</italic>
</source>
                    <year>2008</year>;<volume>28</volume>(<issue>5</issue>).
                    <pub-id pub-id-type="doi">10.18637/jss.v028.i05</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hendy</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Warinner</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bouwman</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Proteomic evidence of dietary sources in ancient dental calculus.</article-title>
                    <source>

                        <italic toggle="yes">Proc Biol Sci.</italic>
</source>
                    <year>2018</year>;<volume>285</volume>(<issue>1883</issue>).
                    <pub-id pub-id-type="doi">10.1098/rspb.2018.0977</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hendy</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Welker</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Demarchi</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A guide to ancient protein studies.</article-title>
                    <source>

                        <italic toggle="yes">Nat Ecol Evol.</italic>
</source>
                    <year>2018</year>;<volume>2</volume>(<issue>5</issue>):<fpage>791</fpage>&#x2013;<lpage>799</lpage>.
                    <pub-id pub-id-type="pmid">29581591</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41559-018-0510-x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jersie-Christensen</surname>
                            <given-names>RR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lanigan</surname>
                            <given-names>LT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lyon</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Quantitative metaproteomics of medieval dental calculus reveals individual oral health status.</article-title>
                    <source>

                        <italic toggle="yes">Nat Commun.</italic>
</source>
                    <year>2018</year>;<volume>9</volume>(<issue>1</issue>):<fpage>4744</fpage>.
                    <pub-id pub-id-type="pmid">30459334</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41467-018-07148-3</pub-id>
                    <pub-id pub-id-type="pmcid">6246597</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Velsko</surname>
                            <given-names>IM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yates</surname>
                            <given-names>JAF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aron</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Microbial differences between dental plaque and historic dental calculus are related to oral biofilm maturation stage.</article-title>
                    <source>

                        <italic toggle="yes">Microbiome.</italic>
</source>
                    <year>2019</year>;<volume>7</volume>(<issue>1</issue>):<fpage>102</fpage>.
                    <pub-id pub-id-type="pmid">31279340</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s40168-019-0717-3</pub-id>
                    <pub-id pub-id-type="pmcid">6612086</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tyanova</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Temu</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cox</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>The MaxQuant computational platform for mass spectrometry-based shotgun proteomics.</article-title>
                    <source>

                        <italic toggle="yes">Nat Protoc.</italic>
</source>
                    <year>2016</year>;<volume>11</volume>(<issue>12</issue>):<fpage>2301</fpage>&#x2013;<lpage>2319</lpage>.
                    <pub-id pub-id-type="pmid">27809316</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nprot.2016.136</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Langmead</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Salzberg</surname>
                            <given-names>SL</given-names>
                        </name>
</person-group>:
                    <article-title>Fast gapped-read alignment with Bowtie 2.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2012</year>;<volume>9</volume>(<issue>4</issue>):<fpage>357</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">22388286</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.1923</pub-id>
                    <pub-id pub-id-type="pmcid">3322381</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Durbin</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>:
                    <article-title>Fast and accurate long-read alignment with Burrows-Wheeler transform.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2010</year>;<volume>26</volume>(<issue>5</issue>):<fpage>589</fpage>&#x2013;<lpage>95</lpage>.
                    <pub-id pub-id-type="pmid">20080505</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btp698</pub-id>
                    <pub-id pub-id-type="pmcid">2828108</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-41">
                <label>41</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pham</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rays Jiang</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nguyen</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/Team5_MinhRays: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4031785">http://www.doi.org/10.5281/zenodo.4031785</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-42">
                <label>42</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Altschul</surname>
                            <given-names>SF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gish</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Miller</surname>
                            <given-names>W</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Basic local alignment search tool.</article-title>
                    <source>

                        <italic toggle="yes">J Mol Biol.</italic>
</source>
                    <year>1990</year>;<volume>215</volume>(<issue>3</issue>):<fpage>403</fpage>&#x2013;<lpage>10</lpage>.
                    <pub-id pub-id-type="pmid">2231712</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0022-2836(05)80360-2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-43">
                <label>43</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hofstra</surname>
                            <given-names>JJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Matamoros</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>van de Pol</surname>
                            <given-names>MA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Changes in microbiota during experimental human Rhinovirus infection.</article-title>
                    <source>

                        <italic toggle="yes">BMC Infect Dis.</italic>
</source>
                    <year>2015</year>;<volume>15</volume>:<fpage>336</fpage>.
                    <pub-id pub-id-type="pmid">26271750</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12879-015-1081-y</pub-id>
                    <pub-id pub-id-type="pmcid">4659412</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-44">
                <label>44</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kosikowska</surname>
                            <given-names>U</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Biernasiuk</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rybojad</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Haemophilus parainfluenzae as a marker of the upper respiratory tract microbiota changes under the influence of preoperative prophylaxis with or without postoperative treatment in patients with lung cancer.</article-title>
                    <source>

                        <italic toggle="yes">BMC Microbiol.</italic>
</source>
                    <year>2016</year>;<volume>16</volume>:<fpage>62</fpage>.
                    <pub-id pub-id-type="pmid">27052615</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12866-016-0679-6</pub-id>
                    <pub-id pub-id-type="pmcid">4823876</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-45">
                <label>45</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Adler</surname>
                            <given-names>CJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dobney</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Weyrich</surname>
                            <given-names>LS</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Sequencing ancient calcified dental plaque shows changes in oral microbiota with dietary shifts of the Neolithic and Industrial revolutions.</article-title>
                    <source>

                        <italic toggle="yes">Nat Genet.</italic>
</source>
                    <year>2013</year>;<volume>45</volume>(<issue>4</issue>):<fpage>450</fpage>&#x2013;<lpage>5, 455e1</lpage>.
                    <pub-id pub-id-type="pmid">23416520</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ng.2536</pub-id>
                    <pub-id pub-id-type="pmcid">3996550</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-46">
                <label>46</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jordan</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>The Deadliest Flu: The Complete Story of the Discovery and Reconstruction of the 1918 Pandemic Virus</article-title>.<year>2019</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.cdc.gov/flu/pandemic-resources/reconstruction-1918-virus.html">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-47">
                <label>47</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Thompson</surname>
                            <given-names>LR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sanders</surname>
                            <given-names>JG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McDonald</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A communal catalogue reveals Earth's multiscale microbial diversity.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2017</year>;<volume>551</volume>(<issue>7681</issue>):<fpage>457</fpage>&#x2013;<lpage>463</lpage>.
                    <pub-id pub-id-type="pmid">29088705</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature24621</pub-id>
                    <pub-id pub-id-type="pmcid">6192678</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-48">
                <label>48</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ley</surname>
                            <given-names>RE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hamady</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lozupone</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Evolution of mammals and their gut microbes.</article-title>
                    <source>

                        <italic toggle="yes">Science.</italic>
</source>
                    <year>2008</year>;<volume>320</volume>(<issue>5883</issue>):<fpage>1647</fpage>&#x2013;<lpage>51</lpage>.
                    <pub-id pub-id-type="pmid">18497261</pub-id>
                    <pub-id pub-id-type="doi">10.1126/science.1155725</pub-id>
                    <pub-id pub-id-type="pmcid">2649005</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-49">
                <label>49</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lim</surname>
                            <given-names>SJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bordenstein</surname>
                            <given-names>SR</given-names>
                        </name>
</person-group>:
                    <article-title>An introduction to phylosymbiosis.</article-title>
                    <source>

                        <italic toggle="yes">Proc Biol Sci.</italic>
</source>
                    <year>2020</year>;<volume>287</volume>(<issue>1922</issue>):<fpage>20192900</fpage>.
                    <pub-id pub-id-type="pmid">32126958</pub-id>
                    <pub-id pub-id-type="doi">10.1098/rspb.2019.2900</pub-id>
                    <pub-id pub-id-type="pmcid">7126058</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-50">
                <label>50</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Franzenburg</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Walter</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>K&#x00fc;nzel</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Distinct antimicrobial peptide expression determines host species-specific bacterial associations.</article-title>
                    <source>

                        <italic toggle="yes">Proc Natl Acad Sci U S A.</italic>
</source>
                    <year>2013</year>;<volume>110</volume>(<issue>39</issue>):<fpage>E3730</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">24003149</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.1304960110</pub-id>
                    <pub-id pub-id-type="pmcid">3785777</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-51">
                <label>51</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kohl</surname>
                            <given-names>KD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brun</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Magallanes</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Gut microbial ecology of lizards: insights into diversity in the wild, effects of captivity, variation across gut regions and transmission.</article-title>
                    <source>

                        <italic toggle="yes">Mol Ecol.</italic>
</source>
                    <year>2017</year>;<volume>26</volume>(<issue>4</issue>):<fpage>1175</fpage>&#x2013;<lpage>1189</lpage>.
                    <pub-id pub-id-type="pmid">27862531</pub-id>
                    <pub-id pub-id-type="doi">10.1111/mec.13921</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-52">
                <label>52</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Woodhams</surname>
                            <given-names>DC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bletz</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Becker</surname>
                            <given-names>CG</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Host-associated microbiomes are predicted by immune system complexity and climate.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2020</year>;<volume>21</volume>(<issue>1</issue>):<fpage>23</fpage>.
                    <pub-id pub-id-type="pmid">32014020</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-019-1908-8</pub-id>
                    <pub-id pub-id-type="pmcid">6996194</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-53">
                <label>53</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Collen</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McRae</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Deinet</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Predicting how populations decline to extinction.</article-title>
                    <source>

                        <italic toggle="yes">Philos Trans R Soc Lond B Biol Sci.</italic>
</source>
                    <year>2011</year>;<volume>366</volume>(<issue>1577</issue>):<fpage>2577</fpage>&#x2013;<lpage>86</lpage>.
                    <pub-id pub-id-type="pmid">21807738</pub-id>
                    <pub-id pub-id-type="doi">10.1098/rstb.2011.0015</pub-id>
                    <pub-id pub-id-type="pmcid">3138608</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-54">
                <label>54</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Caviedes-Vidal</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McWhorter</surname>
                            <given-names>TJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lavin</surname>
                            <given-names>SR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The digestive adaptation of flying vertebrates: high intestinal paracellular absorption compensates for smaller guts.</article-title>
                    <source>

                        <italic toggle="yes">Proc Natl Acad Sci U S A.</italic>
</source>
                    <year>2007</year>;<volume>104</volume>(<issue>48</issue>):<fpage>19132</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="pmid">18025481</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.0703159104</pub-id>
                    <pub-id pub-id-type="pmcid">2141920</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-55">
                <label>55</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tacutu</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thornton</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Johnson</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Human Ageing Genomic Resources: new and updated databases.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2018</year>;<volume>46</volume>(<issue>D1</issue>):<fpage>D1083</fpage>&#x2013;<lpage>D1090</lpage>.
                    <pub-id pub-id-type="pmid">29121237</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkx1042</pub-id>
                    <pub-id pub-id-type="pmcid">5753192</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-56">
                <label>56</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Arumugam</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Raes</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pelletier</surname>
                            <given-names>E</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Enterotypes of the human gut microbiome.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2011</year>;<volume>473</volume>(<issue>7346</issue>):<fpage>174</fpage>&#x2013;<lpage>80</lpage>.
                    <pub-id pub-id-type="pmid">21508958</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature09944</pub-id>
                    <pub-id pub-id-type="pmcid">3728647</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-57">
                <label>57</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lovmar</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ahlford</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jonsson</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Silhouette scores for assessment of SNP genotype clusters.</article-title>
                    <source>

                        <italic toggle="yes">BMC Genomics.</italic>
</source>
                    <year>2005</year>;<volume>6</volume>:<fpage>35</fpage>.
                    <pub-id pub-id-type="pmid">15760469</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2164-6-35</pub-id>
                    <pub-id pub-id-type="pmcid">555759</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-58">
                <label>58</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Swadtasnim</surname>
                            <given-names>SO</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sumpter</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>USFOneHealthCodeathon2020/Team6_LimSharma: v1.0.0 (Version v1.0.0).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4031778">http://www.doi.org/10.5281/zenodo.4031778</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report76301">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.29214.r76301</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Tripathy</surname>
                        <given-names>Sucheta</given-names>
                    </name>
                    <xref ref-type="aff" rid="r76301a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0611-8088</uri>
                </contrib>
                <aff id="r76301a1">
                    <label>1</label>Computational Genomics lab, Kolkata, West Bengal, India</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>16</day>
                <month>2</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Tripathy S</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport76301" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.26459.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This study describes the analysis of several types of metagenomics data by several groups of participants. While the concept is great and the rationale is sound, the description of the output as open source software is far from accurate. The GitHub links for each of the projects lead to several files, often R scripts with hard-coded values, which cannot be used directly without proper installation instructions. Such instructions are lacking for most of the projects in this study. Only projectZero provides installation instructions. However, when I tried to follow them, I could not install the package and got the following error:</p>
            <p> </p>
            <p> PackagesNotFoundError: The following packages are not available from current channels:</p>
            <p> </p>
            <p> &#x00a0; - python[version='3.6.10,3.6.9.*',build=h0371630_0]</p>
            <p> </p>
            <p> For project 5, the instructions are also not very clear. For example, the user is asked to run:</p>
            <p> sudo python3 ./server_setup.py</p>
            <p> However, the instructions do not say where the file server_setup.py is located. It is not apparent whether it ships with the distribution or has to be obtained from somewhere else.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Partly</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>No</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Genomics, Computational methods and development of pipelines for data analysis. Genome engineering.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard; however, I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
</article>
