<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.24345.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Interactive exploratory data analysis of Integrative Human Microbiome Project data using Metaviz</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 3 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Wagner</surname>
                        <given-names>Justin</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Kancherla</surname>
                        <given-names>Jayaram</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5855-5031</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Braccia</surname>
                        <given-names>Domenick</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Matsumara</surname>
                        <given-names>James</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Felix</surname>
                        <given-names>Victor</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Crabtree</surname>
                        <given-names>Jonathan</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Mahurkar</surname>
                        <given-names>Anup</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4999-2296</uri>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Corrada Bravo</surname>
                        <given-names>H&#x00e9;ctor</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1255-4444</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Department of Computer Science, University of Maryland, College Park, College Park, Maryland, 20742, USA</aff>
                <aff id="a2">
                    <label>2</label>Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, Maryland, 20742, USA</aff>
                <aff id="a3">
                    <label>3</label>Institute for Advanced Computer Studies, University of Maryland, College Park, College Park, Maryland, 20742, USA</aff>
                <aff id="a4">
                    <label>4</label>Institute for Genome Sciences, University of Maryland, Baltimore, Baltimore, Maryland, 21201, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:hcorrada@umd.edu">hcorrada@umd.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>21</day>
                <month>6</month>
                <year>2021</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2020</year>
            </pub-date>
            <volume>9</volume>
            <elocation-id>601</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>21</day>
                    <month>4</month>
                    <year>2021</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Wagner J et al.</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/9-601/pdf"/>
            <abstract>
                <p>The rich data produced by the second phase of the Human Microbiome Project (iHMP) offers a unique opportunity to test hypotheses that interactions between microbial communities and a human host might impact an individual&#x2019;s health or disease status. In this work we describe infrastructure that integrates Metaviz, an interactive microbiome data analysis and visualization tool, with the iHMP Data Coordination Center web portal and the 
                    <italic toggle="yes">HMP2Data</italic> R/Bioconductor package. We describe integrative statistical and visual analyses of two datasets from iHMP using Metaviz along with the 
                    <italic toggle="yes">metagenomeSeq</italic> R/Bioconductor package for statistical testing of differential abundance. These use cases demonstrate the utility of a combined approach to access and analyze data from this resource.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>metagenomics</kwd>
                <kwd>visualization</kwd>
                <kwd>R/Bioconductor</kwd>
                <kwd>Integrative Human Microbiome Project</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000002">
                    <funding-source>National Institutes of Health</funding-source>
                    <award-id>R01GM114267</award-id>
                </award-group>
                <award-group id="fund-2">
                    <funding-source>National Institutes of Health</funding-source>
                    <award-id>U54DK102556</award-id>
                </award-group>
                <funding-statement>This work was partially funded by NIH grants R01 GM114267 and U54 DK102556. </funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Updated</label>
                <title>Changes from Version 1</title>
                <p>In this version, we prioritized improving the usability and accessibility of Metaviz for interactive visual exploration of the Human Microbiome Project data resources. This included adding tutorials and updating documentation for use cases such as generating diversity plots. The documentation updates also include a new vignette in the metavizr R/Bioconductor package for visualizing variables from protected resources in a local R session. We also provide default workspaces for the HMP datasets available in Metaviz, linked from metaviz.org. We include information on how we expect to update the data served through Metaviz based on changes to the HMP2Data R/Bioconductor package. We have also updated the figures with higher-resolution images. Finally, we provide more explanation of the taxonomic features that we found to be differentially abundant and detail the provenance of the taxonomic assignments. We have updated the references section to reflect citation changes due to these updates.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Metagenomics allows researchers to perform a microbial community census and investigate associations between host phenotype and community status. Metagenomics has been used successfully to track pathogen spread
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup> and identify intervention strategies in childhood malnutrition
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>. Integrative analysis of samples using multiple sequencing technologies allows for comparison at various levels of granularity. The second phase of the Human Microbiome Project (iHMP) offers a unique opportunity to test hypotheses of interactions between the microbial community and the human host. To examine the iHMP data resource, we use Metaviz
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, an interactive microbiome exploratory data analysis and visualization tool, and 
                <italic toggle="yes">metagenomeSeq</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>, an R/Bioconductor package for statistical testing of differential abundance, for combined visual and statistical analysis.</p>
            <sec>
                <title>Human Microbiome Project Phase II</title>
                <p>The second phase of the HMP, also called the Integrative Human Microbiome Project (iHMP), consisted of three focused studies &#x2013; Inflammatory Bowel Disease (IBD), Type II Diabetes (T2D), and the Multi-Omic Microbiome Study: Pregnancy Initiative (MOMS-PI)
                    <sup>
                        <xref ref-type="bibr" rid="ref-5">5</xref>
                    </sup>. The overall goal of the project was to identify associations between human microbiome community census data and the conditions studied. Each study was structured for its specific condition and consisted of a separate cohort.</p>
            </sec>
            <sec>
                <title>Metaviz</title>
                <p>Metaviz
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup> is a web-based interactive visualization tool for microbiome data analysis. The architecture consists of a JavaScript front-end suite of charts (based on D3.js and Canvas) and a navigation component that lets users select portions of taxonomic hierarchies to visualize and analyze. Metaviz supports two backend data stores &#x2013; a graph database and the 
                    <italic toggle="yes">metavizr</italic> R/Bioconductor package. Metaviz is tightly integrated with the 
                    <italic toggle="yes">metagenomeSeq</italic> statistical testing package so differential abundance testing results can be viewed directly in a Metaviz session. We host an instance of Metaviz that we call the UMD Metagenome Browser (
                    <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu">http://metaviz.cbcb.umd.edu</ext-link>).</p>
            </sec>
            <sec>
                <title>Related work</title>
                <p>Visualization tools for large-scale sequencing consortium projects provide a mechanism to explore and interact with data from multiple studies. These applications help users analyze individual datasets and examine trends across the entire project. MAGI is a web application that enables a user to examine TCGA data
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup>. The Earth Microbiome Project provides an interactive visualization web-application to analyze its data
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>.  EMPeror offers interactive 3D visualizations of PCA plots to show distances between microbiome samples
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>
                    </sup>.  QIIME packages a number of tools for static plotting of Principal Coordinate Analysis and stacked bar plots
                    <sup>
                        <xref ref-type="bibr" rid="ref-10">10</xref>
                    </sup>. MetaPhlAn2 uses a visualization package called GraPhlAn to produce phylogenetic trees and other plots
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup>. The 
                    <italic toggle="yes">HMP2Data</italic> R/Bioconductor package provides processed 16S sequencing data from the iHMP project in Bioconductor data structures
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. We implemented Metaviz using design patterns from Epiviz
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup>, an interactive epigenetics visualization tool that visualizes data from a variety of epigenetic sequencing projects. We show how we leverage the microbiome measurement-based design of Metaviz to implement interactive exploration and hypothesis testing of the iHMP resource.</p>
            </sec>
        </sec>
        <sec>
            <title>Implementation</title>
            <sec>
                <title>Metaviz integration with HMP infrastructure</title>
                <p>The HMP Data Access and Coordination Center maintains a data repository and web portal (
                    <ext-link ext-link-type="uri" xlink:href="https://www.ihmpdcc.org">https://www.ihmpdcc.org</ext-link>). From this web portal, users can browse metadata for datasets, raw sequencing files, and processed files including taxonomic community profile abundance matrices. We implemented several mechanisms to interact with the HMP data resources through Metaviz
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>.</p>
            </sec>
            <sec>
                <title>Data loaded into UMD Metagenome Browser</title>
                <p>We loaded the 16S community profile abundance matrices for the samples from the IBD, T2D, and MOMS-PI studies as provided by the 
                    <italic toggle="yes">HMP2Data</italic> Bioconductor package
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup> into the 
                    <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu/">UMD Metagenome Browser</ext-link>. A user can select each dataset from the application start screen. 
                    <xref ref-type="fig" rid="f1">Figure 1</xref> details the number of samples from each project currently available in the UMD Metagenome Browser, with metadata to the extent available as of May 2020 from the 
                    <italic toggle="yes">HMP2Data</italic> package.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Metaviz and iHMP data infrastructure integration.</title>
                        <p>Top: iHMP data accessible through the UMD Metagenome Browser. Middle (
                            <bold>A</bold>, 
                            <bold>B</bold>): Single sample link from data portal to UMD Metagenome Browser. Bottom (
                            <bold>C</bold>, 
                            <bold>D</bold>): Multi-sample manifest file upload and selection in the UMD Metagenome Browser. We provide several mechanisms to access the HMP datasets from Metaviz. First, we loaded the three datasets (IBD, T2D, and MOMS-PI) directly into the hosted instance of Metaviz. A user can choose any of these datasets from the data selection screen and then choose samples within each dataset. We also link to the HMP Data Portal for single samples as shown in the Middle panel (
                            <bold>A</bold>, 
                            <bold>B</bold>). Finally, the HMP Data Portal provides a &#x201c;cart&#x201d; functionality where a user can select multiple samples and download a manifest listing those files (
                            <bold>C</bold>). A user can upload a manifest file containing selections from the 16S community abundance profiles from the same dataset (IBD, T2D, or MOMS-PI) to the UMD Metagenome Browser and a new Metaviz workspace is created with those files (
                            <bold>D</bold>).</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/56081/425442f8-093e-49a2-aa3b-1fd532d81e50_figure1.gif"/>
                </fig>
            </sec>
            <sec>
                <title>HMP Data Portal linking to Metaviz</title>
                <p>When browsing the samples available from the HMP Data Portal, a user can view an individual abundance matrix in Metaviz using the Metaviz tool link on the file description page. Clicking the link redirects the user to the UMD Metagenome Browser, which opens a new workspace containing a FacetZoom navigation utility and a heatmap for that sample. 
                    <xref ref-type="fig" rid="f1">Figure 1A</xref> shows the direct link functionality for samples in the IBD dataset and resulting workspace in Metaviz (
                    <xref ref-type="fig" rid="f1">Figure 1B</xref>).</p>
            </sec>
            <sec>
                <title>Metaviz import of Data Portal Manifest</title>
                <p>In the HMP data portal, a user can select files with a shopping cart utility and download the selections as a manifest file. In the UMD Metagenome Browser, the user can upload the manifest file to create a Metaviz workspace on the fly for those samples. Currently, only files from the same project can be viewed in one workspace. Resolving taxonomic hierarchies across datasets in Metaviz is future work that could use a utility such as the 
                    <italic toggle="yes">metagenomeFeatures</italic> R/Bioconductor package
                    <sup>
                        <xref ref-type="bibr" rid="ref-14">14</xref>
                    </sup>. 
                    <xref ref-type="fig" rid="f1">Figure 1C</xref> shows the manifest file workflow for samples from the IBD dataset and resulting workspace in Metaviz (
                    <xref ref-type="fig" rid="f1">Figure 1D</xref>).</p>
            </sec>
            <sec>
                <title>Metaviz usage</title>
                <p>For ease of use, we provide tutorials at 
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/tutorials/metaviz/">https://epiviz.github.io/tutorials/metaviz/</ext-link>. As a community resource, we plan to update the Metaviz database within a month of Bioconductor releases of HMP2Data. We maintain links to the HMP Data Portal through the update of 
                    <italic toggle="yes">HMP2Data</italic> package URLs and provide default workspaces for the HMP2 datasets as well as those in the 
                    <italic toggle="yes">HMP16SData</italic> R/Bioconductor package (
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/metaviz-workspaces/">https://epiviz.github.io/metaviz-workspaces/</ext-link>). For generating data summaries, we recommend using the HMP2Data package with appropriate R libraries to summarize sample information and the interactive HMP Data Portal for data summaries over different samples or study attributes. We provide instructions in the 
                    <italic toggle="yes">metavizr</italic> vignette for visualizing data that a user adds to an analysis session, such as protected variables from dbGaP.</p>
            </sec>
        </sec>
        <sec>
            <title>Operation</title>
            <p>The HMP Data Portal and Metaviz are web applications that can run in any modern browser. We recommend using Firefox (version 65 or later) or Chrome (version 65 or later) for best performance. 
                <italic toggle="yes">Metavizr</italic> is a Bioconductor package and general guidelines from Bioconductor for requirements and installation should be followed (
                <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/install/">https://bioconductor.org/install/</ext-link>).</p>
        </sec>
        <sec sec-type="cases">
            <title>Use cases</title>
            <sec>
                <title>
                    <italic toggle="yes">metavizr</italic> analysis of WGS vs 16S data from same samples</title>
                <p>In the IBD cohort of the iHMP dataset, investigators sequenced a subset of samples using whole metagenome and 16S sequencing. We developed functions in 
                    <italic toggle="yes">metavizr</italic> to compare 16S and whole metagenome data for individual samples. Using the taxonomic profiles of the IBD samples, we matched the taxonomic features detected by both sequencing methods. With this subset of features, we generated a single taxonomic hierarchy and then loaded the 16S and whole metagenome abundance measurements into a 
                    <italic toggle="yes">metavizr</italic> object. 
                    <xref ref-type="fig" rid="f2">Figure 2</xref> shows an example analysis with stacked plots and scatter plots that link to a single FacetZoom to compare the degree of consistency of the data across sequencing methods.</p>
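                <p>The feature-matching step described above can be sketched as follows. This is a minimal illustration in Python and not the <italic toggle="yes">metavizr</italic> implementation: the lineage strings, the abundance values, and the helper name <italic toggle="yes">merge_shared_features</italic> are hypothetical and chosen only to show how taxa detected by both methods might be paired before loading.</p>
                <preformat preformat-type="code">
```python
# Hypothetical per-method profiles: taxon lineage mapped to an abundance value.
# The lineages and numbers are made up for illustration only.
profile_16s = {
    "k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales": 540,
    "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales": 380,
    "k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Burkholderiales": 80,
}
profile_wgs = {
    "k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales": 610,
    "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales": 350,
    "k__Bacteria|p__Verrucomicrobia|c__Verrucomicrobiae|o__Verrucomicrobiales": 40,
}


def merge_shared_features(profile_a, profile_b):
    """Keep only taxa found in both profiles and pair their abundances."""
    shared = sorted(set(profile_a).intersection(profile_b))
    return {taxon: (profile_a[taxon], profile_b[taxon]) for taxon in shared}


merged = merge_shared_features(profile_16s, profile_wgs)
for taxon, (abundance_16s, abundance_wgs) in merged.items():
    # Print the lowest rank of each shared lineage with both measurements.
    print(taxon.split("|")[-1], abundance_16s, abundance_wgs)
```
                </preformat>
                <p>Taxa reported by only one method (here the hypothetical Burkholderiales and Verrucomicrobiales entries) are dropped, so the merged table supports a single shared hierarchy.</p>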
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Comparison between 16S and WGS taxonomic profiling using metavizr.</title>
                        <p>We identified taxa present in the taxonomic hierarchy for each method and created a merged dataset. A FacetZoom (bottom) shows the common taxonomic features, two Stacked Plots (middle) show the proportion of all features aggregated to the Order level, and a set of scatter plots (top) shows, for each sample, WGS abundance on the X-axis and 16S abundance on the Y-axis. For WGS, the relative proportions output by MetaPhlAn for taxa at the order level were transformed to counts per 1000 reads. The scatter plots show the variability in taxonomic community census estimates between sequencing methods. A similar static stacked plot visualization is shown in the main HMP consortium manuscript at the genus and species level across samples for comparison
                            <sup>
                                <xref ref-type="bibr" rid="ref-16">16</xref>
                            </sup>. Metaviz allows users to make specific selections in the FacetZoom to compare taxa at various levels. The scatter plot also allows comparison at single-sample resolution. Code to create this Metaviz session is available at the following gist: 
                            <ext-link ext-link-type="uri" xlink:href="https://gist.github.com/jkanche/9216d465d18ab106be7a43f5340eb38a">https://gist.github.com/jkanche/9216d465d18ab106be7a43f5340eb38a</ext-link>.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/56081/425442f8-093e-49a2-aa3b-1fd532d81e50_figure2.gif"/>
                </fig>
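                <p>The rescaling of WGS relative proportions to counts per 1000 reads mentioned in the Figure 2 caption can be illustrated with a short Python sketch. It assumes that the relative abundances are expressed as percentages summing to roughly 100 at a given rank; the function name <italic toggle="yes">to_counts_per_1000</italic> and the example values are hypothetical, and the code linked in the caption remains the reference for what was actually done.</p>
                <preformat preformat-type="code">
```python
def to_counts_per_1000(relative_abundance_percent):
    """Scale percentages (assumed to sum to about 100) to counts per 1000 reads."""
    return {
        taxon: round(percent * 10)  # 1 percent corresponds to 10 reads per 1000
        for taxon, percent in relative_abundance_percent.items()
    }


# Hypothetical order-level relative abundances for one WGS sample, in percent.
wgs_order_percent = {
    "o__Clostridiales": 61.2,
    "o__Bacteroidales": 35.1,
    "o__Burkholderiales": 3.7,
}

wgs_order_counts = to_counts_per_1000(wgs_order_percent)
print(wgs_order_counts)
```
                </preformat>
                <p>Putting both methods on a count-like scale lets the WGS and 16S values share the axes of a scatter plot; rounding to integers is a choice made for this sketch only.</p>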
            </sec>
            <sec>
                <title>IBD dataset</title>
                <p>The IBD study consisted of two phases: a pilot, which we refer to in this work as the IBD Stool Pilot, and a larger phase that we call IBD iHMP. We use the taxonomic profiles for each phase available from the 
                    <italic toggle="yes">HMP2Data</italic> package and use the same taxonomic classification identifiers as the package. To upload project data to the UMD Metagenome Browser, we extracted the 16S count table and taxonomic annotation using the 
                    <italic toggle="yes">otu_table</italic>() and 
                    <italic toggle="yes">tax_table</italic>() methods of the 
                    <italic toggle="yes">HMP2Data</italic> package. We then used 
                    <italic toggle="yes">metagenomeSeq</italic> and 
                    <italic toggle="yes">metavizr</italic> to import the count data along with taxonomy and sample metadata into a neo4j graph database
                    <sup>
                        <xref ref-type="bibr" rid="ref-15">15</xref>
                    </sup> using the 
                    <italic toggle="yes">metavizr</italic> neo4j import functionality. We used Metaviz
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup> for exploratory analysis and 
                    <italic toggle="yes">metagenomeSeq</italic> for confirmatory statistical testing. We examined the IBD Stool Pilot and IBD iHMP datasets separately.</p>
            </sec>
            <sec>
                <title>IBD Stool Pilot dataset</title>
                <p>The IBD Stool Pilot dataset contains 16S and whole metagenome sequencing results of stool samples from 41 Crohn&#x2019;s disease (CD) subjects and 10 ulcerative colitis (UC) subjects. We focused our analysis on 16S sequencing and used Metaviz to visually identify taxa that showed a difference in abundance between CD and UC subjects.  
                    <xref ref-type="fig" rid="f3">Figure 3</xref> shows a typical visualization.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Metaviz Analysis of IBD Stool 16S Pilot Dataset.</title>
                        <p>A Metaviz workspace with a FacetZoom taxonomic hierarchy, a heatmap, and a boxplot for a specific feature, in this instance s__:369227. This identifier comes from the community abundance profiles available in the HMP2Data package. We identified taxonomic features at each level of the hierarchy using this integrated view, and the results for features with a potential differential abundance are listed in Supplementary Table 1. The workspace is available at: 
                            <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu/?ws=oLq2Fr9AwVc">http://metaviz.cbcb.umd.edu/?ws=oLq2Fr9AwVc</ext-link>.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/56081/425442f8-093e-49a2-aa3b-1fd532d81e50_figure3.gif"/>
                </fig>
                <p>We also used 
                    <italic toggle="yes">metagenomeSeq</italic> to test the differential abundance of features aggregated to each level of the taxonomy using the 
                    <italic toggle="yes">fitFeatureModel</italic> method, which is based on a zero-inflated log-normal linear model. As shown in 
                    <xref ref-type="table" rid="T1">Table 1</xref>, two species had an absolute log fold-change greater than 1 and a Benjamini-Hochberg adjusted p-value less than 0.1. Visually inspecting the IBD Stool Pilot data, with counts aggregated to each level of the taxonomy, we found that the following features appeared differentially abundant: &#x201c;c__Betaproteobacteria&#x201d;, &#x201c;o__Burkholderiales&#x201d;, &#x201c;f__Ruminococcaceae&#x201d;, &#x201c;g__Lachnospira&#x201d;, &#x201c;g__[Ruminococcus]&#x201d;, &#x201c;g__Faecalibacterium&#x201d;, &#x201c;s__:589277&#x201d;, &#x201c;s__:333166&#x201d;, &#x201c;s__:564806&#x201d;, &#x201c;s__:369227&#x201d;, &#x201c;s__:358104&#x201d;, &#x201c;s__:369486&#x201d;, &#x201c;s__gnavus:360015&#x201d;, &#x201c;s__prausnitzii:851865&#x201d;. These taxonomic features describe paths in the taxonomy of the kingdom Bacteria, which was derived from the SILVA
                    <sup>
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup> database. The documentation (
                    <ext-link ext-link-type="uri" xlink:href="https://ibdmdb.org/tunnel/public/HMP2/16S/1806/products">https://ibdmdb.org/tunnel/public/HMP2/16S/1806/products</ext-link>) for the abundance profiles used in this analysis states that each taxonomic string was generated from the sequence of an OTU derived with the UPARSE algorithm
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup> that was mapped to the SILVA database. Among these features, &#x201c;c__Betaproteobacteria&#x201d; refers to the class Betaproteobacteria and &#x201c;o__Burkholderiales&#x201d; to the order Burkholderiales, while values towards the leaves of the taxonomy refer to entries in the SILVA database that have an identifier and a sequence but have not been given formal names in the binomial nomenclature system. Comparing the visual analysis results and the 
                    <italic toggle="yes">metagenomeSeq</italic> differential abundance testing results in 
                    <xref ref-type="table" rid="T1">Table 1</xref> shows that the taxonomic feature s__:369227 (a member of the family Lachnospiraceae, whose members are strictly anaerobic
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>) was identified using both methods. Members of Lachnospiraceae are abundant in the human intestinal tract and have been linked specifically to the production of butyric acid
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>. Also, colonization with a specific strain of Lachnospiraceae in obese mice has been linked to development of hyperglycemia
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>. The second taxon in Table 1, s__:363232, is a member of the genus 
                    <italic toggle="yes">Dorea</italic>, which has recently been associated with diarrhea-predominant irritable bowel syndrome
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>.</p>
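                <p>The aggregation of counts to each level of the taxonomy, as described above, can be sketched in a few lines of Python. This is a minimal illustration and not the 
                    <italic toggle="yes">metagenomeSeq</italic> implementation: the function name aggregate_counts, the feature identifiers otu_a to otu_c, and all counts are hypothetical, and only the rank prefixes (k__, p__, c__, o__, f__, g__, s__) follow the feature labels listed in the text.</p>
                <preformat preformat-type="code"><![CDATA[
```python
# Rank order assumed for the toy lineages below (kingdom through species).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def aggregate_counts(counts, lineages, rank):
    """Sum per-sample counts of all features that share a lineage down to `rank`.

    counts   -- dict mapping a feature id to a list of per-sample counts
    lineages -- dict mapping a feature id to taxon labels ordered as RANKS
    rank     -- name of the rank to aggregate to, for example "family"
    """
    depth = RANKS.index(rank) + 1
    totals = {}
    for feature, sample_counts in counts.items():
        # Key on the full path so that identically named taxa under
        # different parents are never merged by accident.
        path = tuple(lineages[feature][:depth])
        if path not in totals:
            totals[path] = [0] * len(sample_counts)
        totals[path] = [a + b for a, b in zip(totals[path], sample_counts)]
    return totals


# Hypothetical features and counts for three samples (illustration only).
clostridiales = ["k__Bacteria", "p__Firmicutes", "c__Clostridia", "o__Clostridiales"]
lineages = {
    "otu_a": clostridiales + ["f__Lachnospiraceae", "g__Dorea", "s__"],
    "otu_b": clostridiales + ["f__Lachnospiraceae", "g__Lachnospira", "s__"],
    "otu_c": clostridiales + ["f__Ruminococcaceae", "g__Faecalibacterium", "s__prausnitzii"],
}
counts = {"otu_a": [10, 0, 3], "otu_b": [5, 2, 0], "otu_c": [1, 7, 4]}

family_totals = aggregate_counts(counts, lineages, "family")
for path, totals in family_totals.items():
    print(path[-1], totals)
```
]]></preformat>
                <p>Keying each aggregate on the full path, rather than on the final label alone, mirrors the description of taxonomic features as paths in the taxonomy.</p>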
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>metagenomeSeq analysis of IBD Stool 16S Pilot dataset.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="center" colspan="1" rowspan="1" valign="top"/>
                                <th align="center" colspan="1" rowspan="1" valign="top">Log fold change</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Standard error</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">p-value</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Adjusted p-value</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">s__:369227</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.864583442</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.431193725</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.53061E-05</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.000734694</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">s__:363232</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.193035074</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.275415013</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.47914E-05</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.000734694</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn>
                            <p>We used the fitFeatureModel method of 
                                <italic toggle="yes">metagenomeSeq</italic> and aggregated counts to each level of the taxonomic hierarchy. Our analysis identified s__:369227 under family 
                                <italic toggle="yes">Lachnospiraceae</italic> and s__:363232 under genus 
                                <italic toggle="yes">Dorea</italic> as differentially abundant between samples from subjects diagnosed with ulcerative colitis and Crohn&#x2019;s disease.</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
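                <p>The selection rule behind Table 1 (absolute log fold-change greater than 1 and Benjamini-Hochberg adjusted p-value less than 0.1) can be sketched as follows. The two feature rows are copied from Table 1; the helper names benjamini_hochberg and select_features, the raw p-values passed to the adjustment step, and the third row are hypothetical and serve only as illustration. The sketch is not the 
                    <italic toggle="yes">metagenomeSeq</italic> code.</p>
                <preformat preformat-type="code"><![CDATA[
```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * n / rank over the current rank and all larger ranks.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_features(rows, min_abs_lfc=1.0, max_adj_p=0.1):
    """Keep features with |log fold change| > min_abs_lfc and adjusted p < max_adj_p."""
    return [
        name
        for name, lfc, adj_p in rows
        if abs(lfc) > min_abs_lfc and adj_p < max_adj_p
    ]


# Hypothetical raw p-values, used only to show the adjustment step.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# (feature, log fold change, adjusted p-value). The first two rows are copied
# from Table 1; the third is a hypothetical row that fails the fold-change cut.
table_rows = [
    ("s__:369227", 1.864583442, 0.000734694),
    ("s__:363232", 1.193035074, 0.000734694),
    ("hypothetical_feature", 0.4, 0.02),
]
selected = select_features(table_rows)
print(selected)
```
]]></preformat>
                <p>The running minimum in the adjustment step can assign the same adjusted value to neighbouring ranks, which is consistent with the two rows of Table 1 sharing one adjusted p-value.</p>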
            </sec>
            <sec>
                <title>IBD iHMP</title>
                <p>The IBD iHMP dataset consists of samples from subjects with CD, subjects with UC, and subjects without IBD (nonIBD). For these samples, we analyzed the 16S sequencing data of an ileum biopsy from the first visit for each subject, which yielded 72 samples: 32 from CD, 18 from UC, and 22 from nonIBD. We used 
                    <italic toggle="yes">metagenomeSeq</italic> to compute an F-statistic to determine whether any taxonomic feature is associated with at least one group using the 
                    <italic toggle="yes">fitZig</italic> method (based on a zero-inflated Normal linear model on log-transformed counts, appropriate for multi-category experimental designs). 
                    <xref ref-type="fig" rid="f4">Figure 4</xref> shows an example using Metaviz to visualize abundance profiles for phylum Fusobacteria, which was found to be differentially abundant across the three groups. Differential abundance of members of this phylum has previously been reported in studies of IBD
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>. Analysis code and results are available as 
                    <italic toggle="yes">Extended data</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>.</p>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>IBD Biopsy iHMP Multiple Groups Analysis.</title>
                        <p>Using statistical analysis, we identified taxonomic features in the phylum Fusobacteria that showed a difference in abundance among the three subject diagnosis categories: UC, CD, and nonIBD. This Metaviz workspace is available at: 
                            <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu/?ws=wHsHT56U8Ru">http://metaviz.cbcb.umd.edu/?ws=wHsHT56U8Ru</ext-link>.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/56081/425442f8-093e-49a2-aa3b-1fd532d81e50_figure4.gif"/>
                </fig>
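                <p>As a rough illustration of what an F-statistic across the three diagnosis groups measures, the sketch below computes a plain one-way ANOVA F-statistic on log-transformed counts. This is a deliberately simplified stand-in and not the 
                    <italic toggle="yes">fitZig</italic> model: it omits the zero-inflation component and any normalization. The function name one_way_f_statistic, the log2(count + 1) transform, and the toy counts are assumptions made for the example.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math


def one_way_f_statistic(groups):
    """Return the one-way ANOVA F-statistic for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical counts of one taxonomic feature in each diagnosis group.
toy_counts = {"CD": [0, 3, 7], "UC": [1, 3, 7], "nonIBD": [15, 31, 63]}

# Assumed transform for this sketch: log2 of the count plus a pseudocount of 1.
log_groups = [[math.log2(c + 1) for c in values] for values in toy_counts.values()]

f_stat = one_way_f_statistic(log_groups)
print(round(f_stat, 3))
```
]]></preformat>
                <p>A large value indicates that the group means differ by more than the within-group spread would suggest, which is the sense in which a feature is associated with at least one group.</p>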
            </sec>
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion</title>
            <p>In this work we presented software infrastructure linking Metaviz to the iHMP data resources
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>. We detailed the 16S taxonomic community profile data from iHMP available in the UMD Metagenome Browser. We then described linking the UMD Metagenome Browser to the iHMP Data Portal for single files and the manifest file utility for multiple file selections. We also performed visual exploratory and confirmatory differential abundance analysis of data from the IBD study. We first visualized 16S and whole metagenome sequencing abundance measurements for the same samples in 
                <italic toggle="yes">metavizr</italic>. We then used Metaviz and 
                <italic toggle="yes">metagenomeSeq</italic> to analyze two datasets, IBD Stool Pilot and IBD iHMP, to examine taxonomic feature abundances in samples from subjects with UC, subjects with CD, and subjects without IBD. These illustrative analyses demonstrate the utility of Metaviz for integrative analysis with the HMP data resources. Visual inspection of taxonomic features coupled with statistical testing provides an effective mechanism to explore and test associations between bacterial communities and their human hosts.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <sec>
                <title>Source data</title>
                <p>The 16S abundance matrices for the IBD, T2D, and MOMS-PI studies were downloaded from the 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/data/experiment/html/HMP2Data.html">
                        <italic toggle="yes">HMP2Data</italic>
                    </ext-link> Bioconductor package. These datasets were then loaded into the neo4j graph database using import methods available in the 
                    <italic toggle="yes">metavizr</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup> Bioconductor package. The import scripts are available at 
                    <ext-link ext-link-type="uri" xlink:href="https://gist.github.com/jkanche/c57d8220a33b41e21c4f6769a7aef7e4">https://gist.github.com/jkanche/c57d8220a33b41e21c4f6769a7aef7e4</ext-link>.</p>
            </sec>
            <sec>
                <title>Extended data</title>
                <p>Figshare: Differential Abundance Analysis - IBD (
                    <xref ref-type="fig" rid="f4">Figure 4</xref>). 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.12404222.v2">https://doi.org/10.6084/m9.figshare.12404222.v2</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>.</p>
                <p>This file contains differential abundance analysis code and results.</p>
                <p>Extended data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license</ext-link> (CC-BY 4.0).</p>
            </sec>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>
                <bold>Metaviz is available at:</bold> 
                <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu">http://metaviz.cbcb.umd.edu</ext-link>.</p>
            <p>
                <bold>Source code available from:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/epiviz/Metaviz">https://github.com/epiviz/Metaviz</ext-link>.</p>
            <p>
                <bold>Archived source code at time of publication:</bold> 
                <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.3871869">http://doi.org/10.5281/zenodo.3871869</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>.</p>
            <p>
                <bold>License:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/epiviz/Metaviz/blob/master/LICENSE">Artistic License version 2.0</ext-link>.</p>
        </sec>
    </body>
    <back>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Miller</surname>
                            <given-names>RR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Montoya</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gardy</surname>
                            <given-names>JL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Metagenomics for pathogen detection in public health.</article-title>
                    <source>

                        <italic toggle="yes">Genome Med.</italic>
</source>
                    <year>2013</year>;<volume>5</volume>(<issue>9</issue>):<fpage>81</fpage>.
                    <pub-id pub-id-type="pmid">24050114</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gm485</pub-id>
                    <pub-id pub-id-type="pmcid">3978900</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Blanton</surname>
                            <given-names>LV</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Charbonneau</surname>
                            <given-names>MR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Salih</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Gut bacteria that prevent growth impairments transmitted by microbiota from malnourished children.</article-title>
                    <source>

                        <italic toggle="yes">Science.</italic>
</source>
                    <year>2016</year>;<volume>351</volume>(<issue>6275</issue>).
                    <pub-id pub-id-type="pmid">26912898</pub-id>
                    <pub-id pub-id-type="doi">10.1126/science.aad3311</pub-id>
                    <pub-id pub-id-type="pmcid">4787260</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wagner</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chelaru</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kancherla</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Metaviz: interactive statistical and visual analysis of metagenomic data.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2018</year>;<volume>46</volume>(<issue>6</issue>):<fpage>2777</fpage>&#x2013;<lpage>2787</lpage>.
                    <pub-id pub-id-type="pmid">29529268</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gky136</pub-id>
                    <pub-id pub-id-type="pmcid">5887897</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Paulson</surname>
                            <given-names>JN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stine</surname>
                            <given-names>OC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bravo</surname>
                            <given-names>HC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Differential abundance analysis for microbial marker-gene surveys.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2013</year>;<volume>10</volume>(<issue>12</issue>):<fpage>1200</fpage>&#x2013;<lpage>1202</lpage>.
                    <pub-id pub-id-type="pmid">24076764</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.2658</pub-id>
                    <pub-id pub-id-type="pmcid">4010126</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <collab>Integrative HMP (iHMP) Research Network Consortium</collab>:
                    <article-title>The integrative human microbiome project: Dynamic analysis of microbiome-host omics profiles during periods of human health and disease.</article-title>
                    <source>

                        <italic toggle="yes">Cell Host Microbe.</italic>
</source>
                    <year>2014</year>;<volume>16</volume>(<issue>3</issue>):<fpage>276</fpage>&#x2013;<lpage>289</lpage>.
                    <pub-id pub-id-type="pmid">25211071</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.chom.2014.08.014</pub-id>
                    <pub-id pub-id-type="pmcid">5109542</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kancherla</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chelaru</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hcorrada</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>epiviz/Metaviz: release for submission (Version 0.1.1).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.3871869">http://www.doi.org/10.5281/zenodo.3871869</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Leiserson</surname>
                            <given-names>MDM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gramazio</surname>
                            <given-names>CC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>MAGI: Visualization and collaborative annotation of genomic aberrations.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2015</year>;<volume>12</volume>(<issue>6</issue>):<fpage>483</fpage>&#x2013;<lpage>484</lpage>.
                    <pub-id pub-id-type="pmid">26020500</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3412</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Thompson</surname>
                            <given-names>LR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sanders</surname>
                            <given-names>JG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McDonald</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A communal catalogue reveals Earth&#x2019;s multiscale microbial diversity.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2017</year>;<volume>551</volume>(<issue>7681</issue>):<fpage>457</fpage>&#x2013;<lpage>463</lpage>.
                    <pub-id pub-id-type="pmid">29088705</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature24621</pub-id>
                    <pub-id pub-id-type="pmcid">6192678</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>V&#x00e1;zquez-Baeza</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pirrung</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gonzalez</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>EMPeror: a tool for visualizing high-throughput microbial community data.</article-title>
                    <source>

                        <italic toggle="yes">Gigascience.</italic>
</source>
                    <year>2013</year>;<volume>2</volume>(<issue>1</issue>):<fpage>16</fpage>.
                    <pub-id pub-id-type="pmid">24280061</pub-id>
                    <pub-id pub-id-type="doi">10.1186/2047-217X-2-16</pub-id>
                    <pub-id pub-id-type="pmcid">4076506</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Caporaso</surname>
                            <given-names>JG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kuczynski</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stombaugh</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>QIIME allows analysis of high-throughput community sequencing data.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2010</year>;<volume>7</volume>(<issue>5</issue>):<fpage>335</fpage>&#x2013;<lpage>336</lpage>.
                    <pub-id pub-id-type="pmid">20383131</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.f.303</pub-id>
                    <pub-id pub-id-type="pmcid">3156573</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Truong</surname>
                            <given-names>DT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Franzosa</surname>
                            <given-names>EA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tickle</surname>
                            <given-names>TL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>MetaPhlAn2 for enhanced metagenomic taxonomic profiling.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2015</year>;<volume>12</volume>(<issue>10</issue>):<fpage>902</fpage>&#x2013;<lpage>903</lpage>.
                    <pub-id pub-id-type="pmid">26418763</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3589</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Stansfield</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smirnova</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhao</surname>
                            <given-names>N</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>HMP2Data: 16s rRNA sequencing data from the Human Microbiome Project 2</article-title>.<year>2019</year>.
                    <pub-id pub-id-type="doi">10.18129/B9.bioc.HMP2Data</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chelaru</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smith</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Goldstein</surname>
                            <given-names>N</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Epiviz: Interactive visual analytics for functional genomics data.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2014</year>;<volume>11</volume>(<issue>9</issue>):<fpage>938</fpage>&#x2013;<lpage>940</lpage>.
                    <pub-id pub-id-type="pmid">25086505</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3038</pub-id>
                    <pub-id pub-id-type="pmcid">4149593</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Olson</surname>
                            <given-names>ND</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shah</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kancherla</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>
                        <italic toggle="yes">metagenomeFeatures</italic>: An R package for working with 16S rRNA reference databases and marker-gene survey feature data.</article-title>
                    <source>

                        <italic toggle="yes">bioRxiv.</italic>
</source>
                    <year>2018</year>.
                    <pub-id pub-id-type="doi">10.1101/339812</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <article-title>The Neo4j Graph Database</article-title>.
                    <ext-link ext-link-type="uri" xlink:href="https://neo4j.com/docs/operations-manual/3.1/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <collab>Human Microbiome Project Consortium</collab>:
                    <article-title>Structure, function and diversity of the healthy human microbiome.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2012</year>;<volume>486</volume>(<issue>7402</issue>):<fpage>207</fpage>&#x2013;<lpage>14</lpage>.
                    <pub-id pub-id-type="pmid">22699609</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature11234</pub-id>
                    <pub-id pub-id-type="pmcid">3564958</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yilmaz</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Parfrey</surname>
                            <given-names>LW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yarza</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The SILVA and "all-species Living Tree Project (LTP)" taxonomic frameworks.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2014</year>;<volume>42</volume>(<issue>Database issue</issue>):<fpage>D643</fpage>&#x2013;<lpage>D648</lpage>.
                    <pub-id pub-id-type="pmid">24293649</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkt1209</pub-id>
                    <pub-id pub-id-type="pmcid">3965112</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Edgar</surname>
                            <given-names>RC</given-names>
                        </name>
</person-group>:
                    <article-title>UPARSE: highly accurate OTU sequences from microbial amplicon reads.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2013</year>;<volume>10</volume>(<issue>10</issue>):<fpage>996</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">23955772</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.2604</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Meehan</surname>
                            <given-names>CJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Beiko</surname>
                            <given-names>RG</given-names>
                        </name>
</person-group>:
                    <article-title>A phylogenomic view of ecological specialization in the Lachnospiraceae, a family of digestive tract-associated bacteria.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol Evol.</italic>
</source>
                    <year>2014</year>;<volume>6</volume>(<issue>3</issue>):<fpage>703</fpage>&#x2013;<lpage>713</lpage>.
                    <pub-id pub-id-type="pmid">24625961</pub-id>
                    <pub-id pub-id-type="doi">10.1093/gbe/evu050</pub-id>
                    <pub-id pub-id-type="pmcid">3971600</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kameyama</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Itoh</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>Intestinal Colonization by a Lachnospiraceae Bacterium Contributes to the Development of Diabetes in Obese Mice.</article-title>
                    <source>

                        <italic toggle="yes">Microbes Environ.</italic>
</source>
                    <year>2014</year>;<volume>29</volume>(<issue>4</issue>):<fpage>427</fpage>&#x2013;<lpage>430</lpage>.
                    <pub-id pub-id-type="pmid">25283478</pub-id>
                    <pub-id pub-id-type="doi">10.1264/jsme2.ME14054</pub-id>
                    <pub-id pub-id-type="pmcid">4262368</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Maharshak</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ringel</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Katibian</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Fecal and Mucosa-Associated Intestinal Microbiota in Patients with Diarrhea-Predominant Irritable Bowel Syndrome.</article-title>
                    <source>

                        <italic toggle="yes">Dig Dis Sci.</italic>
</source>
                    <year>2018</year>;<volume>63</volume>(<issue>7</issue>):<fpage>1890</fpage>&#x2013;<lpage>1899</lpage>.
                    <pub-id pub-id-type="pmid">29777439</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s10620-018-5086-4</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Strauss</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kaplan</surname>
                            <given-names>GG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Beck</surname>
                            <given-names>PL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Invasive potential of gut mucosa-derived Fusobacterium nucleatum positively correlates with IBD status of the host.</article-title>
                    <source>

                        <italic toggle="yes">Inflamm Bowel Dis.</italic>
</source>
                    <year>2011</year>;<volume>17</volume>(<issue>9</issue>):<fpage>1971</fpage>&#x2013;<lpage>1978</lpage>.
                    <pub-id pub-id-type="pmid">21830275</pub-id>
                    <pub-id pub-id-type="doi">10.1002/ibd.21606</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bravo</surname>
                            <given-names>HC</given-names>
                        </name>
</person-group>:
                    <article-title>Differential Abundance Analysis - IBD (Figure 4).</article-title>
                    <source>

                        <italic toggle="yes">figshare.</italic>
</source>Online resource.<year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.6084/m9.figshare.12404222.v2">http://www.doi.org/10.6084/m9.figshare.12404222.v2</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bravo</surname>
                            <given-names>HC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chelaru</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wagner</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>metavizr: R Interface to the metaviz web app for interactive metagenomics data analysis and visualization</article-title>.<year>2020</year>.
                    <pub-id pub-id-type="doi">10.18129/B9.bioc.metavizr</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report87967">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.56081.r87967</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Smirnova</surname>
                        <given-names>Ekaterina</given-names>
                    </name>
                    <xref ref-type="aff" rid="r87967a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9577-4102</uri>
                </contrib>
                <aff id="r87967a1">
                    <label>1</label>Department of Biostatistics, Virginia Commonwealth University, Richmond, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>23</day>
                <month>6</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Smirnova E</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport87967" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.24345.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>No further comments to make.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Biostatistics (working on Human Microbiome Project data)</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report64702">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.26859.r64702</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Waldron</surname>
                        <given-names>Levi</given-names>
                    </name>
                    <xref ref-type="aff" rid="r64702a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2725-0694</uri>
                </contrib>
                <aff id="r64702a1">
                    <label>1</label>Graduate School of Public Health and Health Policy, Institute for Implementation Science in Population Health, City University of New York, New York, NY, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>20</day>
                <month>7</month>
                <year>2020</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Waldron L</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport64702" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.24345.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors present a web tool for exploratory analysis of iHMP data, based on their existing metaviz software. The figures from the manuscript were easy to reproduce using the links and code provided. It seems like a useful new way to view, explore, and download iHMP data. I have only a couple of minor comments:
                <list list-type="order">
                    <list-item>
                        <p>The figures are low-resolution and text is a bit blurry. The lightest yellow text in Figure 2 isn't readable.</p>
                    </list-item>
                    <list-item>
                        <p>From
                            <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu/">http://metaviz.cbcb.umd.edu/</ext-link> it's not immediately obvious how to find the iHMP (HMP2) data. IBD and T2D do appear a ways down a list of available datasets, but this list takes a very long time to load for me ("loading datasets and sample annotations..."). I didn't find MOMS-PI but I may not have waited long enough. It would be worth providing direct links in the manuscript to workspaces for each dataset, like is already done for Figure 3.</p>
                    </list-item>
                    <list-item>
                        <p>Perhaps explain the feature variables like c__Betaproteobacteria, o__Burkholderiales, f__Ruminococcaceae, g__Lachnospira, g__[Ruminococcus], g__Faecalibacterium, s__:589277, s__:333166, s__:564806, s__:369227, s__:358104, s__:369486, s__gnavus:360015, s__prausnitzii:851865 for readers who are familiar with scientific taxonomy names but not this format.</p>
                    </list-item>
                </list>
            </p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>bioinformatics, Bioconductor, metagenomics, human microbiome analysis.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment6597-64702">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Corrada Bravo</surname>
                            <given-names>Hector</given-names>
                        </name>
                        <aff>University of Maryland, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>21</day>
                    <month>4</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Reviewer Comment: </bold>The figures are low-resolution and text is a bit blurry. The lightest yellow text in Figure 2 isn't readable.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We updated the color scheme for Figure 2. We also updated figures with a higher resolution.</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>From 
                    <ext-link ext-link-type="uri" xlink:href="http://metaviz.cbcb.umd.edu/">http://metaviz.cbcb.umd.edu/</ext-link> it's not immediately obvious how to find the iHMP (HMP2) data.&#x00a0; IBD and T2D do appear a ways down a list of available datasets, but this list takes a very long time to load for me ("loading datasets and sample annotations..."). I didn't find MOMS-PI but I may not have waited long enough. It would be worth providing direct links in the manuscript to workspaces for each dataset, like is already done for Figure 3.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We developed default workspaces for IBD, T2D, and MOMS-PI and made them available at 
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/metaviz-workspaces/">https://epiviz.github.io/metaviz-workspaces/</ext-link>
                </p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>Perhaps explain the feature variables like c__Betaproteobacteria, o__Burkholderiales, f__Ruminococcaceae, g__Lachnospira, g__[Ruminococcus], g__Faecalibacterium, s__:589277, s__:333166, s__:564806, s__:369227, s__:358104, s__:369486, s__gnavus:360015, s__prausnitzii:851865 for readers who are familiar with scientific taxonomy names but not this format.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We added the following text to the manuscript: &#x201c;These taxonomic features describe paths in the taxonomy of the Kingdom Bacteria that was derived from the SILVA
                    <sup>16</sup> database. The documentation (
                    <ext-link ext-link-type="uri" xlink:href="https://ibdmdb.org/tunnel/public/HMP2/16S/1806/products">https://ibdmdb.org/tunnel/public/HMP2/16S/1806/products</ext-link>) for the abundance profiles used in this analysis denotes that this taxonomic string was generated using the sequence of an OTU derived with the UPARSE algorithm
                    <sup>17</sup> that was mapped to the SILVA database. Among these features, "c__Betaproteobacteria" refers to Class Betaproteobacteria, "o__Burkholderiales" to Order Burkholderiales, while values towards the leaves of the taxonomy refer to entries in the SILVA database that have an identifier and a sequence but have not been provided formal names in the binomial nomenclature system&#x201d;</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report64700">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.26859.r64700</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Pasolli</surname>
                        <given-names>Edoardo</given-names>
                    </name>
                    <xref ref-type="aff" rid="r64700a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0799-3490</uri>
                </contrib>
                <aff id="r64700a1">
                    <label>1</label>University of Naples Federico II, Naples, Italy</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>16</day>
                <month>7</month>
                <year>2020</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Pasolli E</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport64700" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.24345.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript by Justin Wagner 
                <italic>et al</italic>. presents an infrastructure that integrates Metaviz, an interactive microbiome data analysis and visualization tool that was previously published by the same authors, with the iHMP Data Coordination Center web portal and the HMP2Data R/Bioconductor package. The authors give an overall overview of the infrastructure in addition to a couple of use cases.</p>
            <p> </p>
            <p>This is a nice and timely contribution that helps in using the large and complex set of available HMP data. The manuscript is already well structured; I have just a few comments:
                <list list-type="order">
                    <list-item>
                        <p>Please check all the links reported in the text. For example, the link to the data repository and web portal (https://ihmpdcc.org) doesn't seem to work.</p>
                    </list-item>
                    <list-item>
                        <p>How will the proposed infrastructure deal with likely updates (in terms of new data) of the HMP web portal?</p>
                    </list-item>
                    <list-item>
                        <p>Which are the options available to save/export the results generated in the proposed infrastructure? Please comment more about this in the text.</p>
                    </list-item>
                </list>
            </p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Microbiome and bioinformatics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment6596-64700">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Corrada Bravo</surname>
                            <given-names>Hector</given-names>
                        </name>
                        <aff>University of Maryland, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>21</day>
                    <month>4</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Reviewer Comment: </bold>Please check all the links reported in the text. For example, the link to the data repository and web portal (
                    <ext-link ext-link-type="uri" xlink:href="https://ihmpdcc.org/">https://ihmpdcc.org</ext-link>) doesn't seem to work.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We have updated the link to 
                    <ext-link ext-link-type="uri" xlink:href="http://www.ihmpdcc.org/">http://www.ihmpdcc.org</ext-link> and checked that it resolves.</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>How will the proposed infrastructure deal with likely updates (in terms of new data) of the HMP web portal?</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We updated the manuscript with the following: &#x201c;As a community resource, we plan to update the Metaviz database within a month of Bioconductor releases of HMP2Data. We maintain links to the HMP Data Portal through the update of HMP2Data package URLs and provide default workspaces for the HMP2 datasets as well as those in HMP16SData R/Bioconductor package (
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/metaviz-workspaces/">https://epiviz.github.io/metaviz-workspaces/</ext-link>)&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>Which are the options available to save/export the results generated in the proposed infrastructure? Please comment more about this in the text.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We added the following tutorial: 
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/tutorials/metaviz/save_plots/">https://epiviz.github.io/tutorials/metaviz/save_plots/</ext-link>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report64704">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.26859.r64704</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Smirnova</surname>
                        <given-names>Ekaterina</given-names>
                    </name>
                    <xref ref-type="aff" rid="r64704a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9577-4102</uri>
                </contrib>
                <aff id="r64704a1">
                    <label>1</label>Department of Biostatistics, Virginia Commonwealth University, Richmond, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>6</day>
                <month>7</month>
                <year>2020</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Smirnova E</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport64704" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.24345.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript &#x201c;Interactive exploratory data analysis of Integrative Human Microbiome Project data using Metaviz&#x201d; by Wagner 
                <italic>et al.</italic> provides a novel interactive tool for visualization of Integrative Human Microbiome Project (iHMP) data. This tool encompasses both the HMP Data coordination center (DAC) portal and the Bioconductor HMP2Data package. Given the size and complexity of HMP data, this visualization tool allows users to explore major data summaries and patterns. As such, this work is very timely and is highly recommended for publication.</p>
            <p> Major comments: 
                <list list-type="order">
                    <list-item>
                        <p>How does the tool handle HMP DAC and HMP2Data package data updates? The manuscript discusses that users can upload an additional data manifest from the HMP DAC portal, but what if data already available through the Metaviz tool are updated on the DAC or in Bioconductor?</p>
                    </list-item>
                    <list-item>
                        <p>Is there an option to download data summaries, such as a demographics table or the number of samples by body site?</p>
                    </list-item>
                    <list-item>
                        <p>The MOMS-PI and T2D studies have protected metadata available through dbGaP. If users apply for dbGaP access and receive it, is there an option to merge the metadata by sample IDs and visualize it using this tool?</p>
                    </list-item>
                    <list-item>
                        <p>It would be helpful to provide instructions on how to create diversity boxplots by disease status (as in Figure 3). I tried creating these but could not produce them on the same plot.</p>
                    </list-item>
                    <list-item>
                        <p>I recommend making alpha diversity plots by disease status a default plot when the data are selected using the visualization tool.</p>
                    </list-item>
                </list>
            </p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Biostatistics (working on Human Microbiome Project data)</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment6595-64704">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Corrada Bravo</surname>
                            <given-names>Hector</given-names>
                        </name>
                        <aff>University of Maryland, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>21</day>
                    <month>4</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>Reviewer Comment: </bold>How does the tool handle HMP DAC and HMP2Data package data updates? The manuscript discusses that users can upload an additional data manifest from the HMP DAC portal, but what if data already available through the Metaviz tool are updated on the DAC or in Bioconductor?</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We updated the manuscript with the following: &#x201c;As a community resource, we plan to update the Metaviz database within a month of Bioconductor releases of HMP2Data. We maintain links to the HMP Data Portal through the update of HMP2Data package URLs and provide default workspaces for the HMP2 datasets as well as those in HMP16SData R/Bioconductor package (
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/metaviz-workspaces/">https://epiviz.github.io/metaviz-workspaces/</ext-link>)&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>Is there an option to download data summaries, such as a demographics table or the number of samples by body site?</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We updated the manuscript with the following: &#x201c;For generating data summaries, we recommend using the HMP2Data package with appropriate R libraries to summarize sample information and the interactive HMP Data Portal for data summaries over different samples or study attributes. We provide instructions in the metavizr vignette to handle visualizing data that would be added to an analysis session like protected variables from dbGAP&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>The MOMS-PI and T2D studies have protected metadata available through dbGaP. If users apply for dbGaP access and receive it, is there an option to merge the metadata by sample IDs and visualize it using this tool?</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We added a vignette to the metavizr Bioconductor package (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/epiviz/metavizr/pull/2/commits/756cc049a2c9264355e15df42cfa64d71ee141ae#diff-46f4a8e59ff35c66f1220d9cdc6abf687d92771b424f4a6dfed1434c998cd500">https://github.com/epiviz/metavizr/pull/2/commits/756cc049a2c9264355e15df42cfa64d71ee141ae#diff-46f4a8e59ff35c66f1220d9cdc6abf687d92771b424f4a6dfed1434c998cd500</ext-link>) that shows how to add a protected variable and then generate new plots with it.</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>It would be helpful to provide instructions on how to create diversity boxplots by disease status (as in Figure 3). I tried creating these but could not produce them on the same plot.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We added a tutorial at metaviz.org for including a diversity plot (
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/tutorials/metaviz/diversity_plot/">https://epiviz.github.io/tutorials/metaviz/diversity_plot/</ext-link>).</p>
                <p> </p>
                <p> 
                    <bold>Reviewer Comment: </bold>I recommend making alpha diversity plots by disease status a default plot when the data are selected using the visualization tool.</p>
                <p> </p>
                <p> 
                    <bold>Author Response</bold>: We created default workspaces at 
                    <ext-link ext-link-type="uri" xlink:href="https://epiviz.github.io/metaviz-workspaces/">https://epiviz.github.io/metaviz-workspaces/</ext-link> with alpha diversity plots grouped by either disease status or another attribute of interest.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
