<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.15591.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Revealing HIV viral load patterns using unsupervised machine learning and cluster summarization</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved, 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Farooq</surname>
                        <given-names>Samir A.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Weisenthal</surname>
                        <given-names>Samuel J.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Trayhan</surname>
                        <given-names>Melissa</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>White</surname>
                        <given-names>Robert J.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Bush</surname>
                        <given-names>Kristen</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Mariuz</surname>
                        <given-names>Peter R.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Zand</surname>
                        <given-names>Martin S.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7095-8682</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Rochester Center for Health Informatics, University of Rochester Medical Center, Rochester, NY, 14642, USA</aff>
                <aff id="a2">
                    <label>2</label>Clinical and Translational Science Institute, University of Rochester Medical Center, Rochester, NY, 14642, USA</aff>
                <aff id="a3">
                    <label>3</label>Department of Medicine - Division of Nephrology, University of Rochester Medical Center, Rochester, NY, 14642, USA</aff>
                <aff id="a4">
                    <label>4</label>Department of Medicine, Division of Infectious Diseases, Strong Memorial Hospital AIDS Center, University of Rochester Medical Center, Rochester, NY, 14642, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:martin_zand@urmc.rochester.edu">martin_zand@urmc.rochester.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>7</month>
                <year>2018</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2018</year>
            </pub-date>
            <volume>7</volume>
            <elocation-id>1144</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>18</day>
                    <month>7</month>
                    <year>2018</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Farooq SA et al.</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/7-1144/pdf"/>
            <abstract>
                <p>HIV RNA viral load (VL) is an important outcome variable in studies of HIV-infected persons. Only a handful of methods classify patients by VL patterns. Most place limits on the use of viral load measurements, are specific to a particular study design, and do not account for complex temporal variation. To address this issue, we propose a set of four unambiguous computable characteristics (features) of time-varying HIV viral load patterns, along with a novel centroid-based classification algorithm, which we use to classify a population of 1,576 HIV-positive clinic patients into one of five different viral load patterns (clusters) often found in the literature: durably suppressed viral load (DSVL), sustained low viral load (SLVL), sustained high viral load (SHVL), high viral load suppression (HVLS), and rebounding viral load (RVL). The centroid algorithm summarizes these clusters in terms of their centroids and radii. We show that this allows new VL patterns to be assigned pattern membership based on the distance from the centroid relative to its radius, which we term radial normalization classification. This approach provides an objective and quantitative way to assign VL pattern membership with a concise and interpretable model that aids clinical decision making. It also facilitates meta-analyses by providing computably distinct HIV categories. Finally, we propose that this novel centroid algorithm could also be useful in the areas of cluster comparison for outcomes research and data reduction in machine learning.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Machine learning</kwd>
                <kwd>HIV</kwd>
                <kwd>viral load</kwd>
                <kwd>feature extraction</kwd>
                <kwd>HIV categories</kwd>
                <kwd>centroid</kwd>
                <kwd>cluster summarization</kwd>
                <kwd>clinical interpretability</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000060">
                    <funding-source>National Institute of Allergy and Infectious Diseases</funding-source>
                    <award-id>P30AI078498</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/100006108">
                    <funding-source>National Center for Advancing Translational Sciences</funding-source>
                    <award-id>UL1TR002001</award-id>
                    <award-id>TL1TR002000</award-id>
                </award-group>
                <funding-statement>This work was partially funded by the University of Rochester Clinical and Translational Science Institute grants UL1 TR002001 and TL1 TR002000 from the National Center for Advancing Translational Sciences (NCATS), a component of the National Institutes of Health (NIH). This publication was also made possible through core services and support from the University of Rochester Center for AIDS Research (CFAR), an NIH-funded program (P30 AI078498).</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>The primary clinical goal of HIV treatment and patient engagement is suppression of the HIV viral load (VL), as measured by low or undetectable circulating HIV RNA levels. However, VL most often fluctuates over repeated measurements, with a range that spans 8 orders of magnitude, from 0 (undetectable) to 10
                <sup>7</sup> copies/mL. VL is regularly monitored for signs of progression of HIV infection. Standard HIV treatment protocols are based on VL measurements
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>, especially when monitoring responses to antiretroviral therapy (ART). Monitoring VL helps to determine whether ART has successfully suppressed a patient&#x2019;s VL
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>. Individuals with sustained high viral loads (SHVL) are at greater risk of secondary transmission, clinical progression to AIDS, and death
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>. In contrast, a significant reduction in VL and high viral load suppression (HVLS) both lead to immune recovery, as measured by CD4 T cell levels
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>, and can reduce or eliminate the risks of SHVL. Furthermore, patients sustaining low-level viral load (SLVL), or with a rising VL after previous suppression, have a high incidence of treatment failure
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>. Thus, developing an objective measure of VL status, and categorizing patients by time-varying patterns of VL, is critical both for standardizing therapy and for comparing the efficacy of research protocols.</p>
            <p>Reports in the current literature differ in their definition of &#x201c;high viral load&#x201d;
                <sup>
                    <xref ref-type="bibr" rid="ref-9">9</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-12">12</xref>
                </sup>, and in their findings on how long it takes a patient on highly active anti-retroviral therapy (HAART) to suppress their VL
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>,
                    <xref ref-type="bibr" rid="ref-9">9</xref>,
                    <xref ref-type="bibr" rid="ref-10">10</xref>,
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>. We summarize some of the published approaches here (for greater detail see 
                <xref ref-type="other" rid="SF7">Supplementary File 1</xref>). With respect to VL levels, Terzian 
                <italic toggle="yes">et al.</italic> defined SHVL as two consecutive viral load measurements (VLM) 
                <italic toggle="yes">&#x2265;</italic>100,000 copies/mL
                <sup>
                    <xref ref-type="bibr" rid="ref-9">9</xref>
                </sup>. Durably suppressed viral load (DSVL) was defined as all VLM &lt;400 copies/mL. In contrast, Greub 
                <italic toggle="yes">et al.</italic> focused on detecting low-level viral rebound (LLVR) by first considering patients with an initial consecutive VLM pair &lt;50 copies/mL, and classified LLVR as a subsequent maximum VLM of 51&#x2013;500 copies/mL
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>. Alternatively, Rose 
                <italic toggle="yes">et al.</italic> investigated the use of five different frameworks to categorize suppressed versus not-suppressed VL
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup>. Their approach excluded patients with VLM&lt;200 at baseline, and stratified the remainder with regard to VL suppression using an 8 month window centered around 24 months after the start of VLM (18&#x2013;30 months). Another approach was used by Phillips 
                <italic toggle="yes">et al.</italic>, and characterized VLM responses to ART
                <sup>
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>, utilizing a 24&#x2013;40 week window and a rule-based method to identify two populations of HIV patients (Viral Failure and Viral Rebound). Despite these studies, no formal standard has been adopted by the field to classify a patient as having DSVL, SHVL, HVLS, SLVL, or rebounding viral load (RVL) patterns.</p>
            <p>Classifying patient VL states outside of research studies is further complicated because real-world VL measurements are taken intermittently over time, and missing data are common due to a variety of factors (e.g. travel, social circumstances, non-adherence). This leads to incomplete and irregularly spaced data points. In addition, differences in the sensitivity of the available clinical VL assays result in multiple cutoff points for &#x201c;undetectable&#x201d; viral loads analyzed at different facilities. Thus, there is a need for analytic techniques that can adjust for these details and classify VL states consistently, both across research studies using different methodologies and in clinical practice.</p>
            <p>Machine learning methods can provide objective, unsupervised classification of patient clinical status
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>
                </sup>. These methods begin by collecting a set of features from patient data (e.g. demographics, laboratory measurements, therapies) and then performing computational clustering to identify similar patient classes. Some groups have applied machine learning methods to HIV research studies
                <sup>
                    <xref ref-type="bibr" rid="ref-15">15</xref>
                </sup> to predict HIV VL responses
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup> or CD4 T cell counts
                <sup>
                    <xref ref-type="bibr" rid="ref-17">17</xref>
                </sup>, to distinguish between suppressed and viremic patients
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>, and to select therapeutic regimens
                <sup>
                    <xref ref-type="bibr" rid="ref-19">19</xref>
                </sup>. None, however, have used machine learning to create a standard classification for VL status with irregularly sampled VL measurements across a cohort of patients.</p>
            <p>To address these issues, we propose a set of unambiguous features which, when combined as a feature vector, capture the distinct dynamic patterns present in VL measurements over time. In addition, we have developed a novel centroid algorithm to cluster HIV positive subjects based on these patterns. Here we present the derivation of this method, and demonstrate its application to clustering 1,576 HIV patients with repeated VL measurements over a 5 year period. We found that patient VL measurements can be clustered into five time-varying patterns that correspond well to clinically relevant states. We note that the method and resulting categories can be used to standardize definitions of VL patterns across research studies, and potentially for clinical classification.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Human subjects protection</title>
                <p>This proposal was reviewed and approved by the University of Rochester Human Subjects Review Board (protocol number RSRB00068884). Consent was waived by the review board due to de-identification of the data set. The analysis in this paper is presented in compliance with the current Centers for Medicare &amp; Medicaid Services (CMS) cell size suppression policy
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>. Data were coded such that patients could not be identified directly in compliance with the Department of Health and Human Services Regulations for the Protection of Human Subjects (45 CFR 46.101(b)(4)).</p>
            </sec>
            <sec>
                <title>Study data</title>
                <p>We obtained medical encounter data from all patients with an HIV diagnosis in the University of Rochester Medical Center&#x2019;s electronic medical record system (EMR) between 2011&#x2013;2016, including age, gender, race, ethnicity, and VL measurements. There were a total of 1,892 patients with at least one VL measured, with 1,576 of these patients having at least three VL measurements.</p>
                <p>Measurements 
                    <italic toggle="yes">&#x2264;</italic>48 copies/mL, recorded as the categorical values &#x201c;NEG&#x201d;, &#x201c;POS &lt; 20&#x201d;, or &#x201c;POS &lt; 48&#x201d;, were transformed into numerical values of 0, 20, and 48, respectively. The deidentified study data containing only viral load measurements and relative time are available at 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1313245">https://doi.org/10.5281/zenodo.1313245</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>.</p>
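                <p>As an illustration only, the recoding of censored assay results described above could be sketched in Python as follows. The names CENSORED_VALUES and vl_to_number, and the handling of comma-separated numeric strings, are assumptions made for this sketch and are not taken from the study code.</p>
                <preformat><![CDATA[
```python
# Hypothetical helper: the three labels and their numeric codes come from the
# text above; the names and the numeric fallback are illustrative assumptions.
CENSORED_VALUES = {"NEG": 0.0, "POS < 20": 20.0, "POS < 48": 48.0}


def vl_to_number(raw):
    """Convert one raw viral load entry (copies/mL) to a float."""
    text = str(raw).strip()
    if text in CENSORED_VALUES:
        return CENSORED_VALUES[text]
    # Anything else is assumed to be a plain numeric result, possibly
    # written with thousands separators.
    return float(text.replace(",", ""))
```
]]></preformat>
                <p>Under these assumptions, a categorical entry such as NEG maps to 0 and any other entry is parsed as an ordinary number of copies/mL.</p>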
            </sec>
            <sec>
                <title>Hardware and software specifications</title>
                <p>Analyses were performed on a Windows 8 server with Intel(R) Xeon(R) CPUs E5-2620 v2 @ 2.10GHz and 256GB of RAM. Python 2.7 was used for most data mining and machine learning under Spyder v.3 installed from Anaconda2 (64-bit). The default packages available in Anaconda were used for analysis, including, but not limited to: NumPy, scikit-learn, SciPy, datetime, csv, math, Matplotlib, pip, operator, copy, random, and time. Using pip we installed the webcolors and pydotplus packages for rendering a decision tree. SQLite was used to store, query, and clean the data. Analytic code is available for download at 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/SamirRCHI/Viral_Load_Data_Categorization">https://github.com/SamirRCHI/Viral_Load_Data_Categorization</ext-link>.</p>
            </sec>
            <sec>
                <title>Viral load analysis methods</title>
                <p>Since VL data is asynchronous and noisy, with variable numbers of data points for each subject, we excluded patients with 
                    <italic toggle="yes">&#x2264;</italic> 2 VL measurements as too few to accurately assess VL patterns. Based on temporal patterns of VL described in the literature, the VL pair distribution of our data (
                    <xref ref-type="other" rid="FS2">Figure S1</xref>), and a further extensive investigation into the data, we hypothesized six potential temporal VL patterns, defined in 
                    <xref ref-type="table" rid="T1">Table 1</xref> and illustrated in 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>.</p>
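                <p>The exclusion rule above can be sketched as follows. This is a minimal illustration, assuming that measurements arrive as (patient identifier, relative time, viral load) tuples; the function name and row layout are hypothetical and do not describe the published analytic code.</p>
                <preformat><![CDATA[
```python
from collections import defaultdict

MIN_MEASUREMENTS = 3  # patients with 2 or fewer measurements are excluded


def group_and_filter(rows, min_count=MIN_MEASUREMENTS):
    """Group (patient_id, time, vl) rows by patient and drop short series."""
    series = defaultdict(list)
    for patient_id, time, vl in rows:
        series[patient_id].append((time, vl))
    # Sort each retained series by time so the pattern can be read in order.
    return {
        pid: sorted(points)
        for pid, points in series.items()
        if len(points) >= min_count
    }
```
]]></preformat>
                <p>Sorting by relative time is a choice made for this sketch, so that irregularly spaced measurements are at least ordered before any pattern is assessed.</p>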
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Viral load state definitions.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Abbrv.</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Name</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Definition</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#009E73" valign="top">
                                    <italic toggle="yes">DSVL</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Durably Suppressed Viral Load</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Viral loads that are consistently suppressed at or near the undetectable range.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#000BB2" valign="top">
                                    <italic toggle="yes">SLVL</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Sustained Low Viral Load</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Viral load counts that are consistently slightly above the undetectable range.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#D55E00" valign="top">
                                    <italic toggle="yes">SHVL</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Sustained High Viral Load</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Viral load counts that are consistently in a range considered high risk for HIV complications (e.g. opportunistic infections, malignancy).</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#CC79A7" valign="top">
                                    <italic toggle="yes">HVLS</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">High Viral Load Suppression</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">A viral load pattern in which the terminal portion of the curve has a negative slope and the terminal data point is in the low or suppressed range. This could have a few different speeds or styles of suppression - rapid, gradual, or slow.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#000000" valign="top">
                                    <italic toggle="yes">RVL</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Rebounding Viral Load</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">A viral load pattern in which viral loads are unstable, with the measurement at one time step seemingly being independent of the next measurement.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" style="color:#FF0000" valign="top">
                                    <italic toggle="yes">EVL</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Emerging Viral Load</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Having a steady, or rapid, emergence of high viral load while the first few measurements of viral load were suppressed. While we have found no mention of this type of pattern in the literature, and found that this pattern did not occur in our data set, VL data sets could contain this pattern.</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn>
                            <p>*Colors are used throughout the manuscript to identify clusters</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Possible HIV viral load patterns.</title>
                        <p>Examples of each type of viral load pattern. Note that actual viral load patterns are noisier and may often be more difficult to distinguish. The magnitudes of viral load values reflect those found in the dataset.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure1.gif"/>
                </fig>
                <p>It is important to note that these definitions are pattern based, and do not explicitly select absolute VL cutoff levels or a specific temporal window, as other reports have done
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>&#x2013;
                        <xref ref-type="bibr" rid="ref-10">10</xref>,
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup>. This has the advantage of allowing the absolute VL levels and critical time windows to emerge from the analysis. It also does not preclude incorporation of absolute levels (e.g. VLM
                    <italic toggle="yes">&gt;</italic>400) at a later stage into the pattern specification.</p>
            </sec>
            <sec>
                <title>Feature vector definition</title>
                <p>Mathematical notations for this work are described in 
                    <xref ref-type="table" rid="T2">Table 2</xref>. We next designed a feature vector to capture characteristics that would allow us to distinguish between VL patterns. VL values at the lower limit of detection are a function of the specific assay used, and appear in our data set as 0, 20, and 48 copies/mL (
                    <xref ref-type="other" rid="FS2">Figure S1</xref>). Thus, plots of the log
                    <sub>10</sub> transformed data have discretely spaced values at the lower level of detection, capturing the undetectable range of viral load. Additionally, we adjusted the data by log
                    <sub>10</sub>[
                    <italic toggle="yes">V L</italic> + 10] to avoid log
                    <sub>10</sub>[0]. We add 10 to VL (instead of 1) to reduce the distance, after transformation, between the undetectable values 0, 20, and 48 copies/mL. In our notation, all values related to viral load are therefore assumed to have been transformed in this way. For example, 
                    <italic toggle="yes">min
                        <sub>V L</sub>
                    </italic> = log
                    <sub>10</sub>[0 + 10] = 1 and 
                    <italic toggle="yes">max
                        <sub>V L</sub>
                    </italic> = log
                    <sub>10</sub>[10
                    <sup>7</sup> + 10] 
                    <italic toggle="yes">&#x2248;</italic> 7.</p>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Mathematical notation.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Symbol</th>
                                <th align="left" colspan="1" rowspan="1">Description</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">N</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1">The number of usable patients in the data. In our case, 1576 patients.
                                    <xref ref-type="other" rid="tfn1">*</xref>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">p</italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Refers to a single patient.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">V</italic>&#x00a0;
                                    <italic toggle="yes">LM
                                        <sub>p</sub>
                                    </italic>
                                </td>
                                <td align="left" colspan="1" rowspan="1">The total number of viral load measurements recorded for patient 
                                    <italic toggle="yes">p</italic> has taken.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <inline-formula>
                                        <mml:math display="inline" id="M16">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mrow>
                                                                <mml:mi>V</mml:mi>
                                                                <mml:mspace width="0.1em"/>
                                                                <mml:mi>L</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>
                                </td>
                                <td align="left" colspan="1" rowspan="1">All viral load counts of patient 
                                    <italic toggle="yes">p</italic> in order of time.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <inline-formula>
                                        <mml:math display="inline" id="M17">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mrow>
                                                                <mml:mi>V</mml:mi>
                                                                <mml:mspace width="0.1em"/>
                                                                <mml:mi>L</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Refers to the 
                                    <italic toggle="yes">i
                                        <sup>th</sup>
                                    </italic> viral load count of patient 
                                    <italic toggle="yes">p</italic> in 
                                    <inline-formula>
                                        <mml:math display="inline" id="M18">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mrow>
                                                                <mml:mi>V</mml:mi>
                                                                <mml:mspace width="0.1em"/>
                                                                <mml:mi>L</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>, where 1 &#x2264; 
                                    <italic toggle="yes">i</italic> &#x2264; 
                                    <italic toggle="yes">V</italic>&#x00a0;
                                    <italic toggle="yes">LM
                                        <sub>p</sub>
                                    </italic>.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <inline-formula>
                                        <mml:math display="inline" id="M19">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mi>t</mml:mi>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>
                                </td>
                                <td align="left" colspan="1" rowspan="1">All temporal instances corresponding to 
                                    <inline-formula>
                                        <mml:math display="inline" id="M20">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mrow>
                                                                <mml:mi>V</mml:mi>
                                                                <mml:mspace width="0.1em"/>
                                                                <mml:mi>L</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <inline-formula>
                                        <mml:math display="inline" id="M21">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mi>t</mml:mi>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>
                                </td>
                                <td align="left" colspan="1" rowspan="1">Temporal instance of viral load 
                                    <inline-formula>
                                        <mml:math display="inline" id="M22">
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="true">
                                                            <mml:mrow>
                                                                <mml:mi>V</mml:mi>
                                                                <mml:mspace width="0.1em"/>
                                                                <mml:mi>L</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:math>
                                    </inline-formula>, where 1 &#x2264; 
                                    <italic toggle="yes">i</italic> &#x2264; 
                                    <italic toggle="yes">V</italic>&#x00a0;
                                    <italic toggle="yes">LM
                                        <sub>p</sub>
                                    </italic>.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">max</italic>
                                    <sub>
                                        <italic toggle="yes">V</italic>&#x00a0;
                                        <italic toggle="yes">L</italic>
                                    </sub>
                                </td>
                                <td align="left" colspan="1" rowspan="1">The maximum viral load for 
                                    <italic toggle="yes">all</italic> patients, (10
                                    <sup>7</sup>).
                                    <xref ref-type="other" rid="tfn2">**</xref>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">min</italic>
                                    <sub>
                                        <italic toggle="yes">V</italic>&#x00a0;
                                        <italic toggle="yes">L</italic>
                                    </sub>
                                </td>
                                <td align="left" colspan="1" rowspan="1">The minimum viral load for 
                                    <italic toggle="yes">all</italic> patients, (0).
                                    <xref ref-type="other" rid="tfn2">**</xref>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">&#x25cb;</td>
                                <td align="left" colspan="1" rowspan="1">Hadamard product: element-wise multiplication of arrays.</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn>
                            <p id="tfn1">*This is after selecting for patients with 
                                <italic toggle="yes">&#x2265;</italic> 3 measurements.</p>
                            <p id="tfn2">**This value changes after transformation of the data.</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
                <p>Using the transformed VL data, we next extract several relevant features of the VL measurements over time. These features are used for machine learning classification of individual patient VL time series and are designed to distinguish patterns in VL change while minimizing the effects of noise. We do not limit feature extraction based on the total elapsed time of viral load measurements because the optimal time point for determining viral load class is not well established. The extracted features are: relative area of viral exposure, weighted recency reliability, adjusted maximal difference, and interquartile range. They are defined as follows:</p>
                <list list-type="bullet">
                    <list-item>
                        <label>1.</label>
                        <p>
                            <italic toggle="yes">Relative area of viral exposure</italic> (
                            <italic toggle="yes">Area</italic>) - the area under the viral load curve relative to the total viral load area possible, which has a range of [0,1]. We choose a normalized, relative score because the total time span between the first and last viral load measurements differs between patients. This feature is similar to the mean or median, except that it is sensitive to the dimension of time and hence yields more information. The feature is calculated by summing the areas of the trapezoids formed by each pair of consecutive viral load values, then dividing by the total possible area (
                            <xref ref-type="other" rid="e1">Equation 1</xref>).</p>
                    </list-item>
                </list>
                <p>
                    <disp-formula id="e1">
                        <mml:math display="block" id="math1">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mover accent="true">
                                        <mml:mi>A</mml:mi>
                                        <mml:mo>&#x02d9;</mml:mo>
                                    </mml:mover>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mstyle displaystyle="false">
                                            <mml:msubsup>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>i</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>2</mml:mn>
                                                </mml:mrow>
                                                <mml:mrow>
                                                    <mml:mi>V</mml:mi>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:msub>
                                                        <mml:mi>M</mml:mi>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:msub>
                                                </mml:mrow>
                                            </mml:msubsup>
                                            <mml:mrow>
                                                <mml:mfrac>
                                                    <mml:mrow>
                                                        <mml:msub>
                                                            <mml:mrow>
                                                                <mml:mover accent="true">
                                                                    <mml:mrow>
                                                                        <mml:mi>V</mml:mi>
                                                                        <mml:mspace width="0.1em"/>
                                                                        <mml:mi>L</mml:mi>
                                                                    </mml:mrow>
                                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                                </mml:mover>
                                                            </mml:mrow>
                                                            <mml:mrow>
                                                                <mml:mi>p</mml:mi>
                                                                <mml:mo>,</mml:mo>
                                                                <mml:mi>i</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msub>
                                                        <mml:mo>+</mml:mo>
                                                        <mml:msub>
                                                            <mml:mrow>
                                                                <mml:mover accent="true">
                                                                    <mml:mrow>
                                                                        <mml:mi>V</mml:mi>
                                                                        <mml:mspace width="0.1em"/>
                                                                        <mml:mi>L</mml:mi>
                                                                    </mml:mrow>
                                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                                </mml:mover>
                                                            </mml:mrow>
                                                            <mml:mrow>
                                                                <mml:mi>p</mml:mi>
                                                                <mml:mo>,</mml:mo>
                                                                <mml:mi>i</mml:mi>
                                                                <mml:mo>&#x2212;</mml:mo>
                                                                <mml:mn>1</mml:mn>
                                                            </mml:mrow>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                    <mml:mn>2</mml:mn>
                                                </mml:mfrac>
                                                <mml:mo stretchy="false">(</mml:mo>
                                                <mml:msub>
                                                    <mml:mover accent="true">
                                                        <mml:mi>t</mml:mi>
                                                        <mml:mo>&#x2192;</mml:mo>
                                                    </mml:mover>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mover accent="true">
                                                        <mml:mi>t</mml:mi>
                                                        <mml:mo>&#x2192;</mml:mo>
                                                    </mml:mover>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:mn>1</mml:mn>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:mo stretchy="false">)</mml:mo>
                                            </mml:mrow>
                                        </mml:mstyle>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>m</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:msub>
                                            <mml:mi>x</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mi>m</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:msub>
                                            <mml:mi>n</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo stretchy="false">)</mml:mo>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:msub>
                                            <mml:mover accent="true">
                                                <mml:mi>t</mml:mi>
                                                <mml:mo>&#x2192;</mml:mo>
                                            </mml:mover>
                                            <mml:mrow>
                                                <mml:mi>p</mml:mi>
                                                <mml:mo>,</mml:mo>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                                <mml:msub>
                                                    <mml:mi>M</mml:mi>
                                                    <mml:mi>p</mml:mi>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:msub>
                                            <mml:mover accent="true">
                                                <mml:mi>t</mml:mi>
                                                <mml:mo>&#x2192;</mml:mo>
                                            </mml:mover>
                                            <mml:mrow>
                                                <mml:mi>p</mml:mi>
                                                <mml:mo>,</mml:mo>
                                                <mml:mn>1</mml:mn>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="17em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>1</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math>
                    </disp-formula>
                </p>
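                <p>As an illustration, Equation 1 can be sketched in Python as follows. The function name, argument names, and example values are hypothetical and are not taken from the authors' code; the default bounds use the transformed values given in the text (minimum 1, maximum approximately 7). The sketch assumes that the VL values have already been transformed and that the time points are sorted in increasing order.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
import math

# Transformed bounds taken from the text: min_VL = log10(0 + 10) = 1 and
# max_VL = log10(10**7 + 10), which is approximately 7.
MIN_VL = math.log10(0 + 10)
MAX_VL = math.log10(10**7 + 10)


def relative_area(vl, t, max_vl=MAX_VL, min_vl=MIN_VL):
    """Relative area of viral exposure for one patient (Equation 1).

    vl: transformed viral load values in time order.
    t:  measurement times (any consistent unit), same length as vl.
    """
    if len(vl) != len(t) or len(vl) < 2:
        raise ValueError("need at least two paired measurements")
    total_span = t[-1] - t[0]
    if total_span <= 0:
        raise ValueError("time points must span a positive interval")

    # Numerator: sum of trapezoid areas between consecutive measurements.
    # Equation 1 is followed as written, so VL values are not shifted by
    # min_vl before integration.
    area = 0.0
    for i in range(1, len(vl)):
        area += (vl[i] + vl[i - 1]) / 2.0 * (t[i] - t[i - 1])

    # Denominator: total possible area over the patient's observation span.
    return area / ((max_vl - min_vl) * total_span)


# Hypothetical patient: three transformed VL values at days 0, 30, and 90.
example = relative_area([4.0, 3.0, 1.0], [0, 30, 90])
print(f"relative area: {example:.3f}")
```
]]></code>
                <p>Dividing by the patient's own observation span is what makes the score comparable across patients with different follow-up lengths. Because the sketch follows Equation 1 as written, a series held at the transformed minimum evaluates to about 1/6 rather than 0.</p>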
                <list list-type="bullet">
                    <list-item>
                        <label>2.</label>
                        <p>
                            <italic toggle="yes">Weighted recency reliability</italic> (
                            <italic toggle="yes">wRR</italic>) - Due to viral load noise, the last measurement may not accurately reflect a patient&#x2019;s viral load trend. For example, a patient may have a VL series whose average slope is negative, indicating high viral load suppression (HVLS) over time. If, however, the last measurement is slightly higher than the trend, heavily weighting this last measurement could lead to misclassifying the patient as having a rebounding viral load (RVL). To account for this, we calculate a weighted mean in which the weight of each VL measurement increases with time. More specifically, the weight function follows an inverse square root function 
                            <inline-formula>
                                <mml:math display="inline" id="M2">
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>f</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>x</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                        <mml:mo>=</mml:mo>
                                        <mml:mfrac>
                                            <mml:mn>1</mml:mn>
                                            <mml:mrow>
                                                <mml:msqrt>
                                                    <mml:mi>x</mml:mi>
                                                </mml:msqrt>
                                            </mml:mrow>
                                        </mml:mfrac>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:math> </inline-formula> rather than an inverse function 
                            <inline-formula>
                                <mml:math display="inline" id="M3">
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>g</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>x</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                        <mml:mo>=</mml:mo>
                                        <mml:mfrac>
                                            <mml:mn>1</mml:mn>
                                            <mml:mi>x</mml:mi>
                                        </mml:mfrac>
                                        <mml:mo stretchy="false">)</mml:mo>
                                        <mml:mo>.</mml:mo>
                                    </mml:mrow>
                                </mml:math> </inline-formula> This has the advantage of avoiding rapid convergence of 
                            <italic toggle="yes">g</italic>(
                            <italic toggle="yes">x</italic>) to zero when time is measured in units of days (
                            <xref ref-type="other" rid="e2">Equation 2</xref>). Weighted recency is then calculated as the dot product of the viral loads and weights divided by the sum of the weights (
                            <xref ref-type="other" rid="e3">Equation 3</xref>).</p>
                    </list-item>
                </list>
                <p>
                    <disp-formula id="e2">
                        <mml:math display="block" id="math4">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="false">
                                            <mml:mrow>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>e</mml:mi>
                                                <mml:mi>i</mml:mi>
                                                <mml:mi>g</mml:mi>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>t</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>p</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mn>1</mml:mn>
                                    <mml:mrow>
                                        <mml:msqrt>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mover accent="true">
                                                        <mml:mi>t</mml:mi>
                                                        <mml:mo>&#x2192;</mml:mo>
                                                    </mml:mover>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>V</mml:mi>
                                                        <mml:mspace width="0.1em"/>
                                                        <mml:mi>L</mml:mi>
                                                        <mml:msub>
                                                            <mml:mi>M</mml:mi>
                                                            <mml:mi>p</mml:mi>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mover accent="true">
                                                        <mml:mi>t</mml:mi>
                                                        <mml:mo>&#x2192;</mml:mo>
                                                    </mml:mover>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:msqrt>
                                        <mml:mo>+</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="20.5em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>2</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>
                    <disp-formula id="e3">
                        <mml:math display="block" id="math5">
                            <mml:mrow>
                                <mml:mi>w</mml:mi>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mover accent="false">
                                                    <mml:mrow>
                                                        <mml:mi>w</mml:mi>
                                                        <mml:mi>e</mml:mi>
                                                        <mml:mi>i</mml:mi>
                                                        <mml:mi>g</mml:mi>
                                                        <mml:mi>h</mml:mi>
                                                        <mml:mi>t</mml:mi>
                                                    </mml:mrow>
                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                </mml:mover>
                                            </mml:mrow>
                                            <mml:mi>p</mml:mi>
                                        </mml:msub>
                                        <mml:mo>&#x2022;</mml:mo>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mover accent="true">
                                                    <mml:mrow>
                                                        <mml:mi>V</mml:mi>
                                                        <mml:mspace width="0.1em"/>
                                                        <mml:mi>L</mml:mi>
                                                    </mml:mrow>
                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                </mml:mover>
                                            </mml:mrow>
                                            <mml:mi>p</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mstyle displaystyle="false">
                                            <mml:msubsup>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>i</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                                <mml:mrow>
                                                    <mml:mi>V</mml:mi>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:msub>
                                                        <mml:mi>M</mml:mi>
                                                        <mml:mi>p</mml:mi>
                                                    </mml:msub>
                                                </mml:mrow>
                                            </mml:msubsup>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mrow>
                                                        <mml:mover accent="false">
                                                            <mml:mrow>
                                                                <mml:mi>w</mml:mi>
                                                                <mml:mi>e</mml:mi>
                                                                <mml:mi>i</mml:mi>
                                                                <mml:mi>g</mml:mi>
                                                                <mml:mi>h</mml:mi>
                                                                <mml:mi>t</mml:mi>
                                                            </mml:mrow>
                                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                        </mml:mover>
                                                    </mml:mrow>
                                                    <mml:mrow>
                                                        <mml:mi>p</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:mi>i</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:mstyle>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="24em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>3</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math>
                    </disp-formula>
                </p>
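                <p>As an illustration only, Equations 2 and 3 can be sketched in Python. The function names, the assumption that measurement times are given in days and sorted in ascending order (so that the last entry is the most recent measurement), and the example values are ours and are not taken from the study data or code.</p>
                <code language="python"><![CDATA[
```python
import math


def recency_weights(times):
    """Equation 2: weight_i = 1 / (sqrt(t_last - t_i) + 1).

    Assumes `times` is sorted ascending, so times[-1] is the most
    recent measurement (an assumption of this sketch).
    """
    t_last = times[-1]
    return [1.0 / (math.sqrt(t_last - t) + 1.0) for t in times]


def weighted_recency(times, viral_loads):
    """Equation 3: dot product of weights and viral loads over the weight sum."""
    weights = recency_weights(times)
    dot = sum(w * v for w, v in zip(weights, viral_loads))
    return dot / sum(weights)


# Hypothetical patient: measurement days and viral load values (made up).
days = [0, 91, 100]
vls = [5.0, 3.0, 2.0]

weights = recency_weights(days)   # [1/11, 1/4, 1]
wr = weighted_recency(days, vls)  # 141/59, roughly 2.39
```
]]></code>
                <p>In this made-up example the most recent measurement receives weight 1, while the measurement taken 100 days earlier still receives weight 1/11. Under an inverse function it would receive a weight close to 1/100, which is the rapid decay the inverse square root is meant to avoid.</p>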
                <list list-type="bullet">
                    <list-item>
                        <label/>
                        <p>We were also interested in how reliably 
                            <italic toggle="yes">wR</italic> represents the patient&#x2019;s viral load trend. To this end, we calculated the absolute deviations of the viral load measurements from 
                            <italic toggle="yes">wR</italic> (
                            <xref ref-type="other" rid="e4">Equation 4</xref>). Rather than averaging the deviations, we take the median to reduce the effect of outliers, and we call the inverse of this median (after adding 1) our weighted recency reliability measure (
                            <xref ref-type="other" rid="e5">Equation 5</xref>). Taking the inverse forces the result to lie within [0,1], a property we use in our next proposed feature, adjusted maximal difference.</p>
                    </list-item>
                </list>
                <p>
                    <disp-formula id="e4">
                        <mml:math display="block" id="math6">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="true">
                                            <mml:mrow>
                                                <mml:mi>d</mml:mi>
                                                <mml:mi>e</mml:mi>
                                                <mml:mi>v</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>p</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mrow>
                                    <mml:mo stretchy="false">|</mml:mo>
                                    <mml:mrow>
                                        <mml:mi>w</mml:mi>
                                        <mml:msub>
                                            <mml:mi>R</mml:mi>
                                            <mml:mi>p</mml:mi>
                                        </mml:msub>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mover accent="true">
                                                    <mml:mrow>
                                                        <mml:mi>V</mml:mi>
                                                        <mml:mspace width="0.1em"/>
                                                        <mml:mi>L</mml:mi>
                                                    </mml:mrow>
                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                </mml:mover>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:mi>p</mml:mi>
                                                <mml:mo>,</mml:mo>
                                                <mml:mi>i</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mo stretchy="false">|</mml:mo>
                                </mml:mrow>
                            </mml:mrow>
                            <mml:mspace width="24.5em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>4</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>
                    <disp-formula id="e5">
                        <mml:math display="block" id="math7">
                            <mml:mrow>
                                <mml:mi>w</mml:mi>
                                <mml:mi>R</mml:mi>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mn>1</mml:mn>
                                    <mml:mrow>
                                        <mml:mtext>median(</mml:mtext>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mover accent="true">
                                                    <mml:mrow>
                                                        <mml:mi>d</mml:mi>
                                                        <mml:mi>e</mml:mi>
                                                        <mml:mi>v</mml:mi>
                                                    </mml:mrow>
                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                </mml:mover>
                                            </mml:mrow>
                                            <mml:mi>p</mml:mi>
                                        </mml:msub>
                                        <mml:mtext>)</mml:mtext>
                                        <mml:mo>+</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="22.5em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>5</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
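                <p>A minimal sketch of Equations 4 and 5 follows, assuming that 
                    <italic toggle="yes">wR</italic> has already been computed for the patient. The function names and the example values are hypothetical and serve only to show the arithmetic.</p>
                <code language="python"><![CDATA[
```python
import statistics


def absolute_deviations(w_r, viral_loads):
    """Equation 4: dev_i = |wR - VL_i| for each measurement."""
    return [abs(w_r - v) for v in viral_loads]


def weighted_recency_reliability(w_r, viral_loads):
    """Equation 5: wRR = 1 / (median(dev) + 1)."""
    devs = absolute_deviations(w_r, viral_loads)
    return 1.0 / (statistics.median(devs) + 1.0)


# Hypothetical values: wR is assumed to be 2.5 for this patient.
vls = [5.0, 3.0, 2.0]
w_r = 2.5

devs = absolute_deviations(w_r, vls)          # [2.5, 0.5, 0.5]
wrr = weighted_recency_reliability(w_r, vls)  # 1 / (0.5 + 1) = 2/3

# A perfectly flat series has zero deviations, so wRR reaches its maximum of 1.
flat_wrr = weighted_recency_reliability(2.0, [2.0, 2.0, 2.0])
```
]]></code>
                <p>Because the median deviation is never negative, the score in this sketch is largest (equal to 1) when every measurement coincides with 
                    <italic toggle="yes">wR</italic> and shrinks towards 0 as the typical deviation grows.</p>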
                <list list-type="bullet">
                    <list-item>
                        <label>3.</label>
                        <p>
                            <italic toggle="yes">Adjusted maximal difference</italic> (
                            <italic toggle="yes">Adj MD</italic>) - This is the time-independent difference between the &#x201c;peak&#x201d; and last VL measurements. To distinguish between viral load suppression and emergence, we calculate the &#x201c;peak&#x201d; as the measurement with the maximum absolute deviation (
                            <xref ref-type="other" rid="e4">Equation 4</xref>) and retain the sign of the result. We expected the positive scores to isolate the EVL group effectively. Instead, we found that retaining the positive (emergent) scores led to mis-categorization of the SHVL and RVL groups without clearly identifying EVL patterns. This, along with further investigation of the data, led us to conclude that the EVL pattern may not exist in our data, although we refrain from generalizing this to all healthcare facilities. With this consideration, we force (ground) the positive scores down to zero for proper labeling of SHVL and RVL (
                            <xref ref-type="other" rid="e6">Equation 6</xref>).</p>
                        <p>Because viral load measurements vary, we are hesitant to use the final viral load measurement alone to judge suppression. We therefore propose using 
                            <italic toggle="yes">wR</italic> instead. To reduce the chance of rebounding patients being falsely labeled as suppressed, we multiply our result by 
                            <italic toggle="yes">wRR</italic>, as rebounding patients are expected to have a low score in the range [0,1]. The maximal difference is needed to ensure that suppression-type viral load patterns are classified appropriately (
                            <xref ref-type="other" rid="e7">Equation 7</xref>).</p>
                    </list-item>
                </list>
                <p>
                    <disp-formula id="e6">
                        <mml:math display="block" id="math8">
                            <mml:mrow>
                                <mml:mtext>grnd</mml:mtext>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>x</mml:mi>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mrow>
                                    <mml:mo>{</mml:mo>
                                    <mml:mrow>
                                        <mml:mtable>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mrow>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:mn>1</mml:mn>
                                                    </mml:mrow>
                                                </mml:mtd>
                                            </mml:mtr>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mn>0</mml:mn>
                                                </mml:mtd>
                                            </mml:mtr>
                                        </mml:mtable>
                                        <mml:mspace width="1em"/>
                                        <mml:mtable>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mrow>
                                                        <mml:mi>x</mml:mi>
                                                        <mml:mo>&lt;</mml:mo>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:mrow>
                                                </mml:mtd>
                                            </mml:mtr>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mrow>
                                                        <mml:mi>x</mml:mi>
                                                        <mml:mo>&#x2265;</mml:mo>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:mrow>
                                                </mml:mtd>
                                            </mml:mtr>
                                        </mml:mtable>
                                    </mml:mrow>
                                </mml:mrow>
                            </mml:mrow>
                            <mml:mspace width="24em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>6</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
                <p>
                    <disp-formula id="e7">
                        <mml:math display="block" id="math9">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mover accent="true">
                                        <mml:mi>D</mml:mi>
                                        <mml:mo>&#x02c5;</mml:mo>
                                    </mml:mover>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mtext>grnd</mml:mtext>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>w</mml:mi>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>&#x2212;</mml:mo>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="true">
                                            <mml:mrow>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>p</mml:mi>
                                        <mml:mo>,</mml:mo>
                                        <mml:mo>argmax</mml:mo>
                                        <mml:mo>&#x2061;</mml:mo>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mover accent="true">
                                                    <mml:mrow>
                                                        <mml:mi>d</mml:mi>
                                                        <mml:mi>e</mml:mi>
                                                        <mml:mi>v</mml:mi>
                                                    </mml:mrow>
                                                    <mml:mo stretchy="true">&#x2192;</mml:mo>
                                                </mml:mover>
                                            </mml:mrow>
                                            <mml:mi>p</mml:mi>
                                        </mml:msub>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>&#x22c5;</mml:mo>
                                <mml:mi>max</mml:mi>
                                <mml:mo>&#x2061;</mml:mo>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="true">
                                            <mml:mrow>
                                                <mml:mi>d</mml:mi>
                                                <mml:mi>e</mml:mi>
                                                <mml:mi>v</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>&#x22c5;</mml:mo>
                                <mml:mi>w</mml:mi>
                                <mml:mi>R</mml:mi>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                            </mml:mrow>
                            <mml:mspace width="10em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>7</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
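                <p>Equations 6 and 7 can be read as the following sketch, again taking 
                    <italic toggle="yes">wR</italic> as given. The function names, the tie-breaking rule for the peak (the first measurement with the largest deviation) and the example values are assumptions made for illustration, not details reported in the text.</p>
                <code language="python"><![CDATA[
```python
import statistics


def grnd(x):
    """Equation 6: -1 for negative inputs, 0 otherwise (positive scores grounded)."""
    return -1 if x < 0 else 0


def adjusted_maximal_difference(w_r, viral_loads):
    """Equation 7: grnd(wR - VL_peak) * max(dev) * wRR."""
    devs = [abs(w_r - v) for v in viral_loads]           # Equation 4
    # Index of the "peak": the measurement deviating most from wR.
    # Ties resolve to the first such index (an assumption of this sketch).
    peak = max(range(len(devs)), key=devs.__getitem__)
    wrr = 1.0 / (statistics.median(devs) + 1.0)           # Equation 5
    return grnd(w_r - viral_loads[peak]) * devs[peak] * wrr


# Hypothetical suppression pattern: the peak lies above wR, so the sign is kept.
suppressed = adjusted_maximal_difference(2.5, [5.0, 3.0, 2.0])  # -1 * 2.5 * 2/3

# Hypothetical emergent pattern: the peak lies below wR, so the score is grounded.
emergent = adjusted_maximal_difference(4.0, [2.0, 2.0, 5.0])    # 0
```
]]></code>
                <p>In this sketch a negative score marks a patient whose largest excursion sits above 
                    <italic toggle="yes">wR</italic>, and multiplying by 
                    <italic toggle="yes">wRR</italic> shrinks that score when the measurements scatter widely around 
                    <italic toggle="yes">wR</italic>.</p>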
                <list list-type="bullet">
                    <list-item>
                        <label>4.</label>
                        <p>
                            <italic toggle="yes">Interquartile range</italic> (
                            <italic toggle="yes">IQR</italic>) - This feature is added to further separate the rebounding patients and follows the standard interquartile range calculation (
                            <xref ref-type="other" rid="e8">Equation 8</xref>).</p>
                    </list-item>
                </list>
                <p>
                    <disp-formula id="e8">
                        <mml:math display="block" id="math10">
                            <mml:mrow>
                                <mml:mi>I</mml:mi>
                                <mml:mi>Q</mml:mi>
                                <mml:msub>
                                    <mml:mi>R</mml:mi>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:msub>
                                    <mml:mtext>Q</mml:mtext>
                                    <mml:mn>3</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="true">
                                            <mml:mrow>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>&#x2212;</mml:mo>
                                <mml:msub>
                                    <mml:mtext>Q</mml:mtext>
                                    <mml:mn>1</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mover accent="true">
                                            <mml:mrow>
                                                <mml:mi>V</mml:mi>
                                                <mml:mspace width="0.1em"/>
                                                <mml:mi>L</mml:mi>
                                            </mml:mrow>
                                            <mml:mo stretchy="true">&#x2192;</mml:mo>
                                        </mml:mover>
                                    </mml:mrow>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                            <mml:mspace width="22em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>8</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
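                <p>As an illustration, Equation 8 can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in the study: the function name, the example values, and the choice of linear interpolation between order statistics for the quartiles are all assumptions made for the example.</p>
                <code language="python"><![CDATA[
```python
from statistics import quantiles


def interquartile_range(viral_loads):
    """Return Q3 - Q1 of one patient's viral load series (Equation 8)."""
    if len(viral_loads) < 2:
        raise ValueError("need at least two measurements")
    # Assumption: method="inclusive" interpolates linearly between the
    # sorted measurements; other quartile conventions give other values.
    q1, _median, q3 = quantiles(viral_loads, n=4, method="inclusive")
    return q3 - q1


# Hypothetical viral load values (arbitrary units) for two patients.
stable = [1.7, 1.7, 1.8, 1.7, 1.9]
rebounding = [1.7, 4.5, 2.0, 5.1, 1.8]
print(interquartile_range(stable), interquartile_range(rebounding))
```
]]></code>
                <p>In this hypothetical example the rebounding series yields a much larger value than the stable one, which is the behaviour the feature is intended to capture.</p>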
            </sec>
            <sec>
                <title>Statistical analysis</title>
                <p>Machine learning methods for cluster classification were compared by calculating 
                    <italic toggle="yes">F</italic>
                    <sub>1</sub> scores, the harmonic mean of precision and recall
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>, defined by 
                    <xref ref-type="other" rid="e9">Equation 9</xref>&#x2013;
                    <xref ref-type="other" rid="e11">Equation 11</xref>.</p>
                <p>
                    <disp-formula id="e9">
                        <mml:math display="block" id="math11">
                            <mml:mrow>
                                <mml:mi>p</mml:mi>
                                <mml:mi>r</mml:mi>
                                <mml:mi>e</mml:mi>
                                <mml:mi>c</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>s</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>o</mml:mi>
                                <mml:mi>n</mml:mi>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>l</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="15em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>9</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
                <p>
                    <disp-formula id="e10">
                        <mml:math display="block" id="math12">
                            <mml:mrow>
                                <mml:mi>r</mml:mi>
                                <mml:mi>e</mml:mi>
                                <mml:mi>c</mml:mi>
                                <mml:mi>a</mml:mi>
                                <mml:mi>l</mml:mi>
                                <mml:mi>l</mml:mi>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="24em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>10</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
                <p>
                    <disp-formula id="e11">
                        <mml:math display="block" id="math13">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>F</mml:mi>
                                    <mml:mn>1</mml:mn>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mn>2</mml:mn>
                                <mml:mo>&#x2217;</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>c</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>n</mml:mi>
                                        <mml:mo>&#x2217;</mml:mo>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>c</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>l</mml:mi>
                                        <mml:mi>l</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>c</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>n</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>c</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>l</mml:mi>
                                        <mml:mi>l</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="22em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>11</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
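                <p>Equations 9 to 11 can be sketched for a multi-class problem by treating each cluster label in turn as the positive class. The following is a minimal sketch, not the code used in the study. The function name and example labels are hypothetical, the denominator of Equation 10 is read as true positives plus false negatives, and returning 0 when a denominator is zero is an assumed convention.</p>
                <code language="python"><![CDATA[
```python
def f1_per_class(y_true, y_pred):
    """Return {label: (precision, recall, f1)}, one entry per label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Assumed convention: a zero denominator yields a score of 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical true and predicted cluster labels for five patients.
y_true = ["RVL", "RVL", "HVLS", "HVLS", "DSVL"]
y_pred = ["RVL", "HVLS", "HVLS", "HVLS", "DSVL"]
for label, (precision, recall, f1) in f1_per_class(y_true, y_pred).items():
    print(label, round(precision, 3), round(recall, 3), round(f1, 3))
```
]]></code>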
            </sec>
            <sec>
                <title>Analytic terminology</title>
                <p>Here we formally define keywords appearing in the analysis: Let 
                    <italic toggle="yes">Feature extraction</italic> be the process of determining the values 
                    <italic toggle="yes">&#x0226;</italic>, 
                    <italic toggle="yes">wRR</italic>, 
                    <inline-formula>
                        <mml:math display="inline" id="M25">
                            <mml:mrow>
                                <mml:mover accent="true">
                                    <mml:mi>D</mml:mi>
                                    <mml:mo>&#x02c5;</mml:mo>
                                </mml:mover>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>, and 
                    <italic toggle="yes">IQR</italic> from a set of patients (using their viral load patterns) with the formulations given above. Then a 
                    <italic toggle="yes">feature vector</italic> (
                    <italic toggle="yes">
                        <inline-formula>
                            <mml:math display="inline" id="M26">
                                <mml:mover accent="true">
                                    <mml:mi>F</mml:mi>
                                    <mml:mo>&#x2192;</mml:mo>
                                </mml:mover>
                            </mml:math>
                        </inline-formula>
                        <sub>p</sub>
                    </italic>) contains the values 
                    <italic toggle="yes">&#x0226;
                        <sub>p</sub>
                    </italic>, 
                    <italic toggle="yes">wRR
                        <sub>p</sub>
                    </italic>, 
                    <inline-formula>
                        <mml:math display="inline" id="M27">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mover accent="true">
                                        <mml:mi>D</mml:mi>
                                        <mml:mo>&#x02c5;</mml:mo>
                                    </mml:mover>
                                    <mml:mi>p</mml:mi>
                                </mml:msub>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>, and 
                    <italic toggle="yes">IQR
                        <sub>p</sub>
                    </italic> extracted from patient 
                    <italic toggle="yes">p</italic>&#x2019;s viral load pattern.  The words 
                    <italic toggle="yes">sample</italic> or 
                    <italic toggle="yes">point</italic> are also used here interchangeably. The term 
                    <italic toggle="yes">feature</italic> (
                    <italic toggle="yes">F</italic>) can be thought of as a column vector for all patients in the dataset consisting of the four attributes: 
                    <italic toggle="yes">F
                        <sub>&#x0226;</sub>
                    </italic>, 
                    <italic toggle="yes">F
                        <sub>wRR</sub>
                    </italic>, 
                    <inline-formula>
                        <mml:math display="inline" id="M">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mo>F</mml:mo>
                                    <mml:mover accent="true">
                                        <mml:mi>D</mml:mi>
                                        <mml:mo>&#x02c5;</mml:mo>
                                    </mml:mover>
                                </mml:msub>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>, and 
                    <italic toggle="yes">F
                        <sub>IQR</sub>
                    </italic>. Finally, the terms 
                    <italic toggle="yes">label assignment</italic>, 
                    <italic toggle="yes">VL pattern membership assignment</italic>, 
                    <italic toggle="yes">patient categorization</italic>, and 
                    <italic toggle="yes">prediction</italic>, all refer to the same principle: To assign the most appropriate label which characterizes the viral load pattern of a patient. However, while the principle is the same, the method of assigning such an appropriate label differs depending on the categorization or the learning method used.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <sec>
                <title>Feature extraction and normalization</title>
                <p>We began by transforming viral load data by min-max normalization
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup> to equally weight the temporal features of the VL series (
                    <xref ref-type="other" rid="e12">Equation 12</xref>). That is, we normalize the features, 
                    <italic toggle="yes">F</italic>, to the range [0, 1] using 
                    <xref ref-type="other" rid="e12">Equation 12</xref> where 
                    <italic toggle="yes">F
                        <sup>*</sup>
                    </italic> = 
                    <italic toggle="yes">f</italic>(
                    <italic toggle="yes">F</italic>).</p>
                <p>
                    <disp-formula id="e12">
                        <mml:math display="block" id="math14">
                            <mml:mrow>
                                <mml:msup>
                                    <mml:mi>F</mml:mi>
                                    <mml:mo>&#x2217;</mml:mo>
                                </mml:msup>
                                <mml:mo>=</mml:mo>
                                <mml:mi>f</mml:mi>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>F</mml:mi>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>F</mml:mi>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mi>min</mml:mi>
                                        <mml:mo>&#x2061;</mml:mo>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>F</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>max</mml:mi>
                                        <mml:mo>&#x2061;</mml:mo>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>F</mml:mi>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mi>min</mml:mi>
                                        <mml:mo>&#x2061;</mml:mo>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mi>F</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                            <mml:mspace width="21.5em"/>
                            <mml:mo stretchy="false">(</mml:mo>
                            <mml:mn>12</mml:mn>
                            <mml:mo stretchy="false">)</mml:mo>
                        </mml:math> </disp-formula>
                </p>
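                <p>Equation 12 applied to a single feature column can be sketched as follows. This is an illustrative sketch only: the function name and example values are hypothetical, and mapping a constant column to zeros (where Equation 12 would divide by zero) is an assumption.</p>
                <code language="python"><![CDATA[
```python
def min_max_normalize(column):
    """Rescale one feature column to the range [0, 1] (Equation 12)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information,
        # so every value is mapped to 0 instead of dividing by zero.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical raw values of one feature across four patients.
raw_feature = [2.0, 4.0, 6.0, 3.0]
print(min_max_normalize(raw_feature))
```
]]></code>
                <p>Each of the four features would be normalized separately in this way, so that no single feature dominates a Euclidean distance because of its scale.</p>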
                <p>Next, we examined each of the four features for all patients with 
                    <italic toggle="yes">&#x2265;</italic> 3 viral load measurements (
                    <italic toggle="yes">N</italic> = 1,576 patients), and did not find distinct bi-variate clustering (
                    <xref ref-type="other" rid="FS3">Figure S2</xref>). A feature correlation coefficient analysis (
                    <xref ref-type="other" rid="ST9">Supplementary Table S1</xref>) revealed that the 
                    <italic toggle="yes">Adj MD</italic> feature is linearly independent of 
                    <italic toggle="yes">Area</italic> and 
                    <italic toggle="yes">wRR</italic>. In contrast, there is modest linear dependence between 
                    <italic toggle="yes">IQR</italic> and 
                    <italic toggle="yes">Adj MD</italic>, and between 
                    <italic toggle="yes">Area</italic> and both 
                    <italic toggle="yes">wRR</italic> and IQR. As expected, the largest linear dependency is between 
                    <italic toggle="yes">wRR</italic> and 
                    <italic toggle="yes">IQR</italic>. These results suggest the separation between viral load patterns will be most noticeable between the 
                    <italic toggle="yes">Area</italic> and the 
                    <italic toggle="yes">Adj MD</italic> features - as we designed them to be. Also, although 
                    <italic toggle="yes">Adj MD</italic> is dependent upon 
                    <italic toggle="yes">wRR</italic>, we find that their correlation coefficient is very low (0.033).</p>
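                <p>A pairwise correlation analysis of this kind can be sketched as below. The sketch assumes the Pearson correlation coefficient, which the text does not name explicitly. The function name and the feature values are hypothetical and are not taken from the study data.</p>
                <code language="python"><![CDATA[
```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)


# Hypothetical normalized feature columns for five patients.
features = {
    "Area": [0.10, 0.40, 0.35, 0.80, 0.05],
    "wRR": [0.90, 0.30, 0.50, 0.20, 0.95],
    "IQR": [0.05, 0.60, 0.40, 0.70, 0.02],
}
for name_a, name_b in combinations(features, 2):
    print(name_a, name_b, round(pearson(features[name_a], features[name_b]), 3))
```
]]></code>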
            </sec>
            <sec>
                <title>Hierarchical clustering</title>
                <p>We then performed hierarchical clustering of the individual subject VL patterns using a Euclidean distance metric and Ward&#x2019;s linkage criterion
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> to minimize the total within-cluster variance. Patients showed a clear separation into 5 distinct groups, which had clinical significance (
                    <xref ref-type="fig" rid="f2">Figure 2</xref> and 
                    <xref ref-type="fig" rid="f3">Figure 3</xref>). The cluster with the lowest viral loads and the highest weighted recency reliability (n=442) corresponds to the DSVL patient group. The patients corresponding to the SHVL group (orange; n=46) exhibited the highest relative 
                    <italic toggle="yes">Area</italic> and very low 
                    <italic toggle="yes">IQR</italic>. Compared to the DSVL cluster, the blue cluster (n=535) has slightly greater area and 
                    <italic toggle="yes">IQR</italic> with a significant difference in the weighted recency reliability. Using this information, along with the general patterns shown by 
                    <xref ref-type="fig" rid="f3">Figure 3</xref>, we identify this as the SLVL group. The algorithm also identifies the RVL (black; n=237) and HVLS (purple; n=316) clusters. The RVL cluster has a low weighted recency reliability and high 
                    <italic toggle="yes">IQR</italic>. In contrast, the HVLS cluster has a lower area, a higher weighted recency reliability (indicating little variation in the terminal portion of the VL time series), and, most importantly, very low adjusted maximal differences (
                    <xref ref-type="fig" rid="f4">Figure 4</xref>).</p>
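                <p>The clustering step can be illustrated with a small, self-contained sketch of agglomerative clustering under Ward&#x2019;s criterion. It is not the implementation used in the study. The function name and example points are hypothetical, and this naive version is meant only to show the merge rule, namely that at each step the two clusters whose merger least increases the total within-cluster sum of squared Euclidean distances are joined. In practice an established library routine would normally be used instead.</p>
                <code language="python"><![CDATA[
```python
def ward_clusters(points, n_clusters):
    """Greedy agglomerative clustering with Ward's criterion.

    Returns one integer cluster label per input point.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (idx_a, cen_a), (idx_b, cen_b) = clusters[a], clusters[b]
                n_a, n_b = len(idx_a), len(idx_b)
                sq_dist = sum((u - v) ** 2 for u, v in zip(cen_a, cen_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = n_a * n_b / (n_a + n_b) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (idx_a, cen_a), (idx_b, cen_b) = clusters[a], clusters[b]
        n_a, n_b = len(idx_a), len(idx_b)
        centroid = [(n_a * u + n_b * v) / (n_a + n_b) for u, v in zip(cen_a, cen_b)]
        merged = (idx_a + idx_b, centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    labels = [0] * len(points)
    for label, (members, _centroid) in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical normalized two-feature vectors forming two obvious groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
print(ward_clusters(points, n_clusters=2))
```
]]></code>
                <p>The pairwise search makes this sketch far slower than a library implementation, but it keeps the merge cost explicit.</p>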
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Dendrogram of hierarchically clustered patients.</title>
                        <p>Clustered using the Euclidean distance along with Ward&#x2019;s method. Numbers on the bottom axis show number of patients in each cluster. The corresponding viral load pattern plots can be found in 
                            <xref ref-type="other" rid="f3">Figure 3</xref>. 
                            <styled-content style="#009E73" style-type="color">DSVL</styled-content> = Durably Suppressed Viral Load, 
                            <styled-content style="#D55E00" style-type="color">SHVL</styled-content> = Sustained High Viral Load, 
                            <styled-content style="#000BB5" style-type="color">SLVL</styled-content> = Sustained Low Viral Load, RVL = Rebounding Viral Load, 
                            <styled-content style="#CC79A7" style-type="color">HVLS</styled-content> = High Viral Load Suppression</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure2.gif"/>
                </fig>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Extracted patient viral load patterns.</title>
                        <p>For each cluster categorization of the patient from 
                            <xref ref-type="other" rid="f2">Figure 2</xref>, the days since first viral load measurement are plotted against the viral load counts. The points on the plots indicate the last viral load measurement.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure3.gif"/>
                </fig>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Feature segregation from hierarchical clustering.</title>
                        <p>Each patient is colored corresponding to the results from the hierarchical clustering in 
                            <xref ref-type="fig" rid="f2">Figure 2</xref>. The artificial line of points is a result of the grounding function used in 
                            <italic toggle="yes">Adj MD</italic>. 
                            <italic toggle="yes">Area</italic> = relative area of viral load exposure, 
                            <italic toggle="yes">wRR</italic> = weighted recency reliability, 
                            <italic toggle="yes">IQR</italic> = interquartile range, <italic toggle="yes">Adj MD</italic> = adjusted maximal viral load difference.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure4.gif"/>
                </fig>
                <p>VL patterns are similar within clusters and dissimilar between clusters (
                    <xref ref-type="fig" rid="f3">Figure 3</xref>). Interestingly, there are patients within each cluster whose last VL measurement occurs near 1,827 days, which is the full five-year span of the VL monitoring data set. This suggests that these clusters do not disappear after some elapsed time; rather, each type of pattern can be found at virtually any time point.</p>
                <p>We found large VL spikes within the time series of the HVLS group. We hypothesize that this may be due to the asynchronous timing of measurements between subjects, the natural variation in biological responses, or patient variability in adherence to therapy. This observation also reflects one limitation of asynchronous outcomes data sampling, which lacks the &#x201c;completion&#x201d; endpoint characteristic of most prospective, randomized clinical trials. If measurements ended at a spike, the adjusted maximal difference feature may be weighted in favor of the patient being classified as RVL. This may indicate that some patients classified as suppressing their viral loads should have been classified as having rebounding viral loads. Alternatively, it may indicate that these features do not restrict a patient forever to one category, but allow for dynamic classification as a function of biological or therapeutic responses.</p>
            </sec>
            <sec>
                <title>Comparison of categorization methods</title>
                <p>Using the same data set, we next compared our VL pattern categorization method to those previously published in the literature. Visually, we find that the SLVL group detected by our method is very similar to the LLVR group defined by Greub 
                    <italic toggle="yes">et al.</italic> (
                    <xref ref-type="fig" rid="f5">Figure 5</xref>). Furthermore, the methods that attempt to capture SHVL, viral rebound, and viral failure patients appear to be less successful than our method at identifying SHVL and RVL patients. The RMVL repeat continuous method visually appears to have performed very well in identifying patients who have suppressed their viral loads. However, the results suggest that our analysis performs slightly better in identifying the suppression group (HVLS), as we find that the last VL measurements (black dots in 
                    <xref ref-type="fig" rid="f5">Figure 5</xref>) are consistently low using our method.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Comparison of patient categories with existing methods.</title>
                        <p>A 2D binning of VLM counts for every patient category. Each row uses a different categorization method, and the method name is located to the right of the row, and the title of each subplot is the category assigned by the indicated method. The columns of each 2D bin are normalized based on the maximum number of logged viral load measurement (VLM) counts in the column: log
                            <sub>10</sub>[1 + 
                            <italic toggle="yes">V L M</italic> ]. Bin color for a count of 0 is copper, and other bin colors range from white to teal (the maximum of the log
                            <sub>10</sub>[
                            <italic toggle="yes">V L M counts</italic>] in the column of the bin). The black dots represent the last viral load measurement for the patient (opacity 
                            <italic toggle="yes">&#x2265;</italic> 0.3; 2D bins have variable opacity for the dots). The bottom row is our analysis is the same as 
                            <xref ref-type="other" rid="f3">Figure 3</xref>, but represented as a 2D bin. DSVL = durably suppressed viral load; LLVR = low level viral rebound; SLVL = sustained low viral load; SHVL = sustained high viral load; HVLS = high viral load suppression, RVL = rebounding viral load.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure5.gif"/>
                </fig>
                <p>The other methods may not have performed as well because they rely on a window or a consecutive pair measure, which may be too subjective for assigning VL pattern membership. Furthermore, notice that patients with baseline VL&lt;200 (
                    <xref ref-type="fig" rid="f5">Figure 5</xref>) contain VL patterns which can reach as high as 10
                    <sup>6</sup> copies/ml, which is in contrast to Rose 
                    <italic toggle="yes">et al.</italic>&#x2019;s assumption that these patients have consistently low viral variation. Lastly we wish to emphasize that while some of these categorization methods are successful in identifying a specific group of patients, our method is unique as it attempts to associate each VL pattern to a specific category, without using categories such as &#x201c;Not Suppressed&#x201d;, &#x201c;Unspecified&#x201d;, or &#x201c;Omitted&#x201d;.</p>
            </sec>
            <sec>
                <title>Supervised learning of VL patterns</title>
                <p>We next used the classes identified by hierarchical clustering to compare several machine learning models, with the goal of identifying methods that could be trained to prospectively assign HIV patients to VL categories (i.e. SHVL, HVLS, SLVL, DSVL, and RVL). Unsupervised learning (e.g. hierarchical clustering) is useful for establishing the data structure of VL categories and their locations in the feature vector space. Once the model is established (e.g. cluster boundaries), supervised learning methods are better suited for prospective cluster assignment, given a robust "ground truth" for model training, as they do not depend on re-analysis of the entire population.</p>
                <p>To this end, we compared the predictive power of several supervised learning methods for HIV cluster assignment, including k-nearest neighbors (kNN), decision tree (DT), support vector machine (SVM), AdaBoost, and random forests. Models were trained on the original data set, and we then ranked their predictive power by their average 
                    <italic toggle="yes">F</italic>
                    <sub>1</sub> score derived by leave-one-out cross-validation (LOOCV) on the clustered results (
                    <xref ref-type="table" rid="T3">Table 3</xref>). We compared the ability of these methods to reconstruct the originally identified clusters, even when allowing for variability in cluster numbers (e.g. kNN with 
                    <italic toggle="yes">k</italic>={7, 9}, or DT without a maximum depth specification). All methods performed comparably well, with the notable exception of AdaBoost. This generally high performance was expected because the VL pattern categories are well separated as a result of the clustering. kNN with k=5 was computationally efficient and yielded the best results in 
                    <xref ref-type="table" rid="T3">Table 3</xref>.</p>
                <table-wrap id="T3" orientation="portrait" position="anchor">
                    <label>Table 3. </label>
                    <caption>
                        <title>

                            <italic toggle="yes">F</italic>
                            <sub>1</sub> prediction scores using LOOCV.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Group:</th>
                                <th align="right" colspan="1" rowspan="1" style="color:#009E73">DSVL</th>
                                <th align="right" colspan="1" rowspan="1" style="color:#000BB5">SLVL</th>
                                <th align="right" colspan="1" rowspan="1" style="color:#D55E00">SHVL</th>
                                <th align="right" colspan="1" rowspan="1" style="color:#CC79A7">HVLS</th>
                                <th align="right" colspan="1" rowspan="1" style="color:#000000">RVL</th>
                                <th align="right" colspan="1" rowspan="1">Average</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Patients:</th>
                                <th align="right" colspan="1" rowspan="1">442</th>
                                <th align="right" colspan="1" rowspan="1">535</th>
                                <th align="right" colspan="1" rowspan="1">46</th>
                                <th align="right" colspan="1" rowspan="1">316</th>
                                <th align="right" colspan="1" rowspan="1">237</th>
                                <th align="right" colspan="1" rowspan="1">
                                    <italic toggle="yes">F</italic>
                                    <sub>1</sub> Score</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">kNN,k=5</td>
                                <td align="right" colspan="1" rowspan="1">0.9966</td>
                                <td align="right" colspan="1" rowspan="1">0.9925</td>
                                <td align="right" colspan="1" rowspan="1">0.9677</td>
                                <td align="right" colspan="1" rowspan="1">0.9889</td>
                                <td align="right" colspan="1" rowspan="1">0.9810</td>
                                <td align="right" colspan="1" rowspan="1">0.9853</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">kNN,k=9</td>
                                <td align="right" colspan="1" rowspan="1">0.9943</td>
                                <td align="right" colspan="1" rowspan="1">0.9907</td>
                                <td align="right" colspan="1" rowspan="1">0.9583</td>
                                <td align="right" colspan="1" rowspan="1">0.9841</td>
                                <td align="right" colspan="1" rowspan="1">0.9725</td>
                                <td align="right" colspan="1" rowspan="1">0.9800</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">kNN,k=7</td>
                                <td align="right" colspan="1" rowspan="1">0.9943</td>
                                <td align="right" colspan="1" rowspan="1">0.9897</td>
                                <td align="right" colspan="1" rowspan="1">0.9362</td>
                                <td align="right" colspan="1" rowspan="1">0.9873</td>
                                <td align="right" colspan="1" rowspan="1">0.9746</td>
                                <td align="right" colspan="1" rowspan="1">0.9764</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Random Forest</td>
                                <td align="right" colspan="1" rowspan="1">0.9909</td>
                                <td align="right" colspan="1" rowspan="1">0.9841</td>
                                <td align="right" colspan="1" rowspan="1">0.9556</td>
                                <td align="right" colspan="1" rowspan="1">0.9685</td>
                                <td align="right" colspan="1" rowspan="1">0.9645</td>
                                <td align="right" colspan="1" rowspan="1">0.9727</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Decision Tree (DT)</td>
                                <td align="right" colspan="1" rowspan="1">0.9898</td>
                                <td align="right" colspan="1" rowspan="1">0.9795</td>
                                <td align="right" colspan="1" rowspan="1">0.9670</td>
                                <td align="right" colspan="1" rowspan="1">0.9512</td>
                                <td align="right" colspan="1" rowspan="1">0.9432</td>
                                <td align="right" colspan="1" rowspan="1">0.9661</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SVM</td>
                                <td align="right" colspan="1" rowspan="1">0.9955</td>
                                <td align="right" colspan="1" rowspan="1">0.9833</td>
                                <td align="right" colspan="1" rowspan="1">0.9111</td>
                                <td align="right" colspan="1" rowspan="1">0.9666</td>
                                <td align="right" colspan="1" rowspan="1">0.9387</td>
                                <td align="right" colspan="1" rowspan="1">0.9590</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">DT,max depth=5</td>
                                <td align="right" colspan="1" rowspan="1">0.9757</td>
                                <td align="right" colspan="1" rowspan="1">0.9659</td>
                                <td align="right" colspan="1" rowspan="1">0.9032</td>
                                <td align="right" colspan="1" rowspan="1">0.9375</td>
                                <td align="right" colspan="1" rowspan="1">0.9168</td>
                                <td align="right" colspan="1" rowspan="1">0.9398</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Polyhedron</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9727</td>
                                <td align="right" colspan="1" rowspan="1">0.9474</td>
                                <td align="right" colspan="1" rowspan="1">0.9011</td>
                                <td align="right" colspan="1" rowspan="1">0.8985</td>
                                <td align="right" colspan="1" rowspan="1">0.9109</td>
                                <td align="right" colspan="1" rowspan="1">0.9261</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Bounding Box</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9865</td>
                                <td align="right" colspan="1" rowspan="1">0.9630</td>
                                <td align="right" colspan="1" rowspan="1">0.8764</td>
                                <td align="right" colspan="1" rowspan="1">0.9038</td>
                                <td align="right" colspan="1" rowspan="1">0.8614</td>
                                <td align="right" colspan="1" rowspan="1">0.9182</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Push and Pull</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9737</td>
                                <td align="right" colspan="1" rowspan="1">0.9347</td>
                                <td align="right" colspan="1" rowspan="1">0.8842</td>
                                <td align="right" colspan="1" rowspan="1">0.9027</td>
                                <td align="right" colspan="1" rowspan="1">0.8767</td>
                                <td align="right" colspan="1" rowspan="1">0.9144</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Best Rep.</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9589</td>
                                <td align="right" colspan="1" rowspan="1">0.9280</td>
                                <td align="right" colspan="1" rowspan="1">0.9011</td>
                                <td align="right" colspan="1" rowspan="1">0.8598</td>
                                <td align="right" colspan="1" rowspan="1">0.8923</td>
                                <td align="right" colspan="1" rowspan="1">0.9080</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Mean</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9401</td>
                                <td align="right" colspan="1" rowspan="1">0.9004</td>
                                <td align="right" colspan="1" rowspan="1">0.9072</td>
                                <td align="right" colspan="1" rowspan="1">0.8212</td>
                                <td align="right" colspan="1" rowspan="1">0.8968</td>
                                <td align="right" colspan="1" rowspan="1">0.8931</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Smallest Disk</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9627</td>
                                <td align="right" colspan="1" rowspan="1">0.9017</td>
                                <td align="right" colspan="1" rowspan="1">0.8889</td>
                                <td align="right" colspan="1" rowspan="1">0.8246</td>
                                <td align="right" colspan="1" rowspan="1">0.8717</td>
                                <td align="right" colspan="1" rowspan="1">0.8899</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">
                                    <italic toggle="yes">Median</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1">0.9271</td>
                                <td align="right" colspan="1" rowspan="1">0.8882</td>
                                <td align="right" colspan="1" rowspan="1">0.9167</td>
                                <td align="right" colspan="1" rowspan="1">0.7967</td>
                                <td align="right" colspan="1" rowspan="1">0.8953</td>
                                <td align="right" colspan="1" rowspan="1">0.8848</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">AdaBoost</td>
                                <td align="right" colspan="1" rowspan="1">0.9227</td>
                                <td align="right" colspan="1" rowspan="1">0.8248</td>
                                <td align="right" colspan="1" rowspan="1">0.5797</td>
                                <td align="right" colspan="1" rowspan="1">0.5033</td>
                                <td align="right" colspan="1" rowspan="1">0.6475</td>
                                <td align="right" colspan="1" rowspan="1">0.6956</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn>
                            <p>LOOCV = leave-one-out cross validation, kNN = k nearest neighbor, DT = decision tree</p>
                        </fn>
                        <fn>
                            <p>SVM = support vector machine, 
                                <styled-content style="#009E73" style-type="color">DSVL</styled-content> = Durably Suppressed Viral Load</p>
                        </fn>
                        <fn>
                            <p>
                                <styled-content style="#D55E00" style-type="color">SHVL</styled-content> = Sustained High Viral Load, 
                                <styled-content style="#000BB5" style-type="color">SLVL</styled-content> = Sustained Low Viral Load</p>
                        </fn>
                        <fn>
                            <p>RVL = Rebounding Viral Load, 
                                <styled-content style="#CC79A7" style-type="color">HVLS</styled-content> = High Viral Load Suppression</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
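                <p>The evaluation procedure behind Table 3 can be illustrated with a minimal, standard-library sketch of leave-one-out cross-validation for a kNN classifier, followed by per-class and averaged F1 scores. It is not the pipeline used in this study: the two-dimensional feature vectors, the two class labels shown, the choice of k=3, and all function names are hypothetical stand-ins chosen only to keep the example small.</p>
                <code language="python">
```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loocv_predictions(samples, k):
    """Predict each sample's label from all of the remaining samples."""
    preds = []
    for i, (features, _) in enumerate(samples):
        held_out_train = samples[:i] + samples[i + 1:]
        preds.append(knn_predict(held_out_train, features, k))
    return preds


def per_class_f1(truth, preds):
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for cls in sorted(set(truth)):
        tp = sum(1 for t, p in zip(truth, preds) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(truth, preds) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(truth, preds) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical toy data: (feature vector, cluster label) pairs.
toy = [
    ((0.00, 0.00), "DSVL"), ((0.10, 0.20), "DSVL"),
    ((0.20, 0.10), "DSVL"), ((0.15, 0.15), "DSVL"),
    ((5.00, 5.00), "SHVL"), ((5.10, 5.20), "SHVL"),
    ((5.20, 4.90), "SHVL"), ((4.90, 5.10), "SHVL"),
]
truth = [label for _, label in toy]
preds = loocv_predictions(toy, k=3)
scores = per_class_f1(truth, preds)
macro_f1 = sum(scores.values()) / len(scores)  # unweighted average over classes
```
                </code>
                <p>Because the toy clusters are well separated, every held-out sample is recovered, which mirrors the reason given above for the generally high scores. The unweighted average over classes is an assumption about how an average F1 score might be formed.</p>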
                <p>We next considered the trade-off of predictive precision versus model interpretability. Critical clinical evaluation of machine learning results is important to protect against mis-categorization and clinical error. For this reason, many have advocated using models that are more clinically interpretable. kNN is dependent upon the entire training set for prediction, as it does not inherently &#x201c;learn&#x201d; patterns
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>; hence, it does not meet our interpretability criterion. In comparison, SVM offers a simpler model, but its results could be non-intuitive for clinicians. Although decision trees offer the best interpretability, overly complex trees may be generated, as occurred in our study (
                    <xref ref-type="other" rid="FS4">Figure S3</xref>).</p>
                <p>We found that pruned decision tree rules, with a maximum depth of 5 levels, met this interpretability criterion, albeit at a slight cost in predictive power (
                    <xref ref-type="table" rid="T3">Table 3</xref>). The extracted decision rules are shown in 
                    <xref ref-type="fig" rid="f6">Figure 6</xref>. Each category has a rule with a high proportion of true positive samples following the rule relative to all samples for the category (support). Similarly, a high proportion of the predicted class was found in the rule (precision), indicating that the rules can be summarized into a majority rule. Note that the sum of the support does not necessarily add up to one for each class because some samples belonging to that class may have been otherwise placed into a different rule, making the precision of that rule weaker.</p>
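                <p>As a small illustration of these two quantities, the sketch below computes support and precision for a single rule. The feature name, the threshold of 0.8, the sample values, and the function name are all hypothetical and are not taken from the extracted rules.</p>
                <code language="python">
```python
def rule_support_precision(samples, rule, cls):
    """Support and precision of one rule that predicts class cls.

    samples: list of (features, label) pairs; rule: predicate on features.
    """
    in_rule = [label for feats, label in samples if rule(feats)]
    true_pos = sum(1 for label in in_rule if label == cls)
    class_total = sum(1 for _, label in samples if label == cls)
    # Support: true positives in the rule relative to all samples of the class.
    support = true_pos / class_total if class_total else 0.0
    # Precision: true positives relative to every sample that satisfies the rule.
    precision = true_pos / len(in_rule) if in_rule else 0.0
    return support, precision


# Hypothetical samples described by a single feature.
samples = [
    ({"area": 0.90}, "SHVL"), ({"area": 0.85}, "SHVL"),
    ({"area": 0.50}, "SHVL"), ({"area": 0.40}, "SHVL"),
    ({"area": 0.82}, "RVL"), ({"area": 0.30}, "RVL"),
]


def high_area(feats):
    """Hypothetical rule: area of at least 0.8 predicts SHVL."""
    return feats["area"] >= 0.8


support, precision = rule_support_precision(samples, high_area, "SHVL")
```
                </code>
                <p>Here two of the four SHVL samples satisfy the rule (support 0.5) and two of the three samples satisfying the rule are SHVL (precision about 0.67). The remaining SHVL samples would be captured by other rules or misassigned, which is why the supports for a class need not sum to one.</p>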
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Extracted rules from pruned decision tree and polyhedral CM rule region.</title>
                        <p>Support is the fraction of true positives satisfying the rule relative to all samples of the class. Precision is the proportion of true positives versus all positives in the rule. Rules are sorted in order of application, first by the level of the decision tree depth (Depth), and then by descending precision. The colored regions represent the values for which the rule holds (rule feature space). For the centroid method (CM; shaded gray), bounds were calculated by the polyhedron method, where the rectangular bar is the center and the radius is the area inside the parentheses. 
                            <italic toggle="yes">Area</italic> = relative area of viral load exposure, 
                            <italic toggle="yes">wRR</italic> = weighted recency reliability, 
                            <italic toggle="yes">IQR</italic> = interquartile range, AdjMD = adjusted maximal viral load difference.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure6.gif"/>
                </fig>
                <p>As an alternative interpretable model, we explored the use of centroid cluster summarization, which is often used in clustering algorithms and is flexible enough to accommodate different centroid determination methods
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>,
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>. To determine the effects of different centroid determination algorithms, we compared seven different methods: multidimensional mean, multidimensional median, best representative center, bounding box method, smallest disk method, polyhedral center, and a novel &#x201c;push and pull&#x201d; (PnP) method inspired by force-directed graph drawing algorithms such as the Fruchterman-Reingold algorithm
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>,
                        <xref ref-type="bibr" rid="ref-27">27</xref>
                    </sup> (see 
                    <xref ref-type="other" rid="SF7">Supplementary File 1</xref>). Force-directed clustering methods maximize inter-cluster center distances while minimizing intra-cluster distances, and are the basis for modularity clustering in graph theory
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>.</p>
                <p>We then combined the centroid cluster summarization approach with a radius-based classification prediction algorithm. Let 
                    <italic toggle="yes">c
                        <sub>i</sub>
                    </italic> be the 
                    <italic toggle="yes">i</italic>th cluster center with corresponding radius 
                    <italic toggle="yes">r
                        <sub>i</sub>
                    </italic>, where 
                    <italic toggle="yes">r
                        <sub>i</sub>
                    </italic> is calculated as the distance to the farthest intra-cluster sample from 
                    <italic toggle="yes">c
                        <sub>i</sub>
                    </italic>. Then, for a new sample 
                    <italic toggle="yes">s</italic> choose its predicted cluster membership 
                    <italic toggle="yes">j</italic> such that 
                    <inline-formula>
                        <mml:math display="inline" id="M15">
                            <mml:mrow>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mrow>
                                                <mml:mrow>
                                                    <mml:mo>&#x2016;</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mi>s</mml:mi>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>c</mml:mi>
                                                            <mml:mi>j</mml:mi>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                    <mml:mo>&#x2016;</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                            <mml:mn>2</mml:mn>
                                        </mml:msub>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>r</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> is a minimum. We refer to this method as 
                    <italic toggle="yes">radial normalization</italic> classification.</p>
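                <p>A minimal sketch of radial normalization classification, as defined above, is given below using only the Python standard library. The multidimensional mean is used as the centroid purely for illustration (it is one of the seven centroid methods compared), and the toy points, the labels A and B, and the function names are hypothetical. The sketch assumes every cluster has a positive radius.</p>
                <code language="python">
```python
import math


def mean_center(points):
    """Multidimensional mean of a list of equal-length points."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def fit_radii(centers, samples):
    """r_i = distance from center c_i to its farthest intra-cluster sample."""
    radii = {}
    for label, center in centers.items():
        members = [x for x, y in samples if y == label]
        radii[label] = max(math.dist(center, x) for x in members)
    return radii


def classify(sample, centers, radii):
    """Return the label j that minimizes ||s - c_j||_2 / r_j."""
    return min(centers, key=lambda j: math.dist(sample, centers[j]) / radii[j])


# Hypothetical toy clusters: A is wide, B is tight.
samples = [
    ((0.0, 0.0), "A"), ((2.0, 0.0), "A"), ((0.0, 2.0), "A"), ((2.0, 2.0), "A"),
    ((10.0, 10.0), "B"), ((10.5, 10.0), "B"),
    ((10.0, 10.5), "B"), ((10.5, 10.5), "B"),
]
centers = {
    label: mean_center([x for x, y in samples if y == label])
    for label in ("A", "B")
}
radii = fit_radii(centers, samples)

# This point is nearer to the center of B in plain Euclidean distance,
# but after dividing by each radius it is assigned to the wider cluster A.
predicted = classify((6.5, 6.5), centers, radii)
```
                </code>
                <p>In this toy case the normalized distances are 5.5 for A and 15.0 for B, so the sample is assigned to A even though the center of B is closer in absolute terms. Dividing by the radius lets wide clusters claim samples that a plain nearest-center rule would give to a tight neighboring cluster.</p>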
                <p>Comparing the 
                    <italic toggle="yes">F</italic>
                    <sub>1</sub> scores of the centroid radial normalization methods (italicized in 
                    <xref ref-type="table" rid="T3">Table 3</xref>) with those of the common machine learning algorithms, we find that the centroid summarization loses some predictive power. However, the centroid summary is highly interpretable because the entire model can be expressed concisely (
                    <xref ref-type="other" rid="ST10">Supplementary Table S2</xref>), and understood clearly. For example, a clinician classifying a patient by VL time series values would compare observed feature values with the ranges given in 
                    <xref ref-type="fig" rid="f6">Figure 6</xref>, and find the classification within which the patient&#x2019;s data fit best. In the case of the centroid method, if an observed value appears to fall in multiple categories, then the patient should be assigned to the category whose center is closest (this allows a clinician to cross-check model predictions).</p>
            </sec>
            <sec>
                <title>Temporal state variation</title>
                <p>HIV patient viral load states are often fluid, with class changes (e.g. SHVL 
                    <italic toggle="yes">&#x2192;</italic> HVLS) occurring due to therapy, viral genetics, and social and other factors. To examine this aspect of the classes, we used the kNN (k=5) model, fit to the original clusters, to predict the class state of each patient with 
                    <italic toggle="yes">&#x2265;</italic> 3 VL measurements using only partially retained VL data. For example, if a patient has 6 VL measurements, then we predict the class state at 3, 4, 5, and 6 measurements, which may yield SHVL 
                    <italic toggle="yes">&#x2192;</italic> SHVL 
                    <italic toggle="yes">&#x2192;</italic> RVL 
                    <italic toggle="yes">&#x2192;</italic> HVLS as its prediction. We then constructed a state-transfer network using the trace-route method
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup>, revealing several interesting relationships:</p>
                <list list-type="bullet">
                    <list-item>
                        <label>1.</label>
                        <p>Patients on therapy appear to suppress their viral loads at a positive linear rate throughout the entire 900-day span. This differs markedly from the literature, which suggests that if a patient is going to suppress their VL, it will be within 32 weeks, or 224 days
                            <sup>
                                <xref ref-type="bibr" rid="ref-13">13</xref>
                            </sup> (
                            <xref ref-type="fig" rid="f7">Figure 7A</xref>).</p>
                    </list-item>
                </list>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Class state variation.</title>
                        <p>
                            <bold>A</bold>) Classification using kNN, with k=5, trained on the original five clusters to predict on partially retained viral load for patients with 
                            <italic toggle="yes">&#x2265;</italic> 900 days of data. The number of patients in each class between 0 and 900 days is shown relative to the first state classification (i.e. third viral load measurement). 
                            <bold>B</bold>) A trace-route map of class state transfers (
                            <italic toggle="yes">class</italic>
                            <sub>1</sub> &#x2192; 
                            <italic toggle="yes">class</italic>
                            <sub>2</sub>) as a function of partially retained viral load derived from the model. Nodes represent viral load classifications and arrows reflect the volume of state transitions between successive VL measurements (e.g. SHVL
                            <italic toggle="yes">&#x2192;</italic>DSVL). Self-loops (e.g. RVL
                            <italic toggle="yes">&#x2192;</italic>RVL) indicate no change in state, reflecting stable classification.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17007/4f79706a-94d2-4b0d-b43b-23416a31d185_figure7.gif"/>
                </fig>
                <list list-type="bullet">
                    <list-item>
                        <label>2.</label>
                        <p>DSVL classification appears unstable for the first 400 days, suggesting that patients in this class should be monitored carefully during this initial period (
                            <xref ref-type="fig" rid="f7">Figure 7A</xref>).</p>
                    </list-item>
                    <list-item>
                        <label>3.</label>
                        <p>The number of patients classified as SHVL drops considerably until 
                            <italic toggle="yes">&#x223c;</italic>500 days after first classification. After this point, those patients who have not yet left the SHVL category may not do so (
                            <xref ref-type="fig" rid="f7">Figure 7A</xref>).</p>
                    </list-item>
                    <list-item>
                        <label>4.</label>
                        <p>The two sets of classes {DSVL, SLVL} and {SHVL, RVL, HVLS} are well separated (i.e. without much transfer between sets; 
                            <xref ref-type="fig" rid="f7">Figure 7B</xref>). This appears to suggest that patients whose viral load is consistently low or durably suppressed tend not to transfer into a high viral load state (i.e. RVL or SHVL), at least in this data cohort.</p>
                    </list-item>
                    <list-item>
                        <label>5.</label>
                        <p>SHVL patients in this cohort tended to transfer out of the class at a much higher rate than patients transferred in, suggesting positive patient care (
                            <xref ref-type="fig" rid="f7">Figure 7B</xref>). This observation is consistent with reports in the literature that entry into treatment, with adherence to a HAART regimen, generally results in viral load suppression.</p>
                    </list-item>
                    <list-item>
                        <label>6.</label>
                        <p>The state transfer diagram illustrates that the most frequent state transition over time is remaining within the same cluster assignment (
                            <xref ref-type="fig" rid="f7">Figure 7B</xref>).</p>
                    </list-item>
                </list>
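                <p>The partially retained classification and the resulting transition counts described in this section can be sketched as follows. This is a simplified illustration only: the threshold classifier stands in for the fitted kNN model and engineered features, the state labels and viral load values (log10 copies/ml) are hypothetical, and the plain transition tally is not an implementation of the trace-route method.</p>
                <code language="python">
```python
from collections import Counter


def state_sequence(vl_series, classify_prefix, min_measurements=3):
    """Classify every prefix that holds at least min_measurements values."""
    return [
        classify_prefix(vl_series[:n])
        for n in range(min_measurements, len(vl_series) + 1)
    ]


def transition_counts(sequences):
    """Count class_1 -> class_2 transitions between successive predictions."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts


def toy_classifier(prefix):
    """Hypothetical stand-in for the fitted model: looks at the latest value."""
    return "high" if prefix[-1] >= 3.0 else "low"


# Hypothetical viral load series in log10 copies/ml, one list per patient.
patients = [
    [5.0, 4.8, 4.5, 2.0, 1.7, 1.7],
    [1.7, 1.7, 1.7, 1.7],
]
sequences = [state_sequence(series, toy_classifier) for series in patients]
counts = transition_counts(sequences)
```
                </code>
                <p>A patient with six measurements yields four predicted states (at 3, 4, 5, and 6 measurements), matching the example given above. In this toy data the most frequent transition is a self-loop, since most successive predictions are identical.</p>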
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion</title>
            <p>Researchers have previously performed HIV population case studies using differing schema to classify VL patterns
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>,
                    <xref ref-type="bibr" rid="ref-9">9</xref>,
                    <xref ref-type="bibr" rid="ref-10">10</xref>,
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>. We have developed a unique method for standardizing the algorithmic classification of VL patterns using a set of optimally segregating features. These features have been specifically engineered to optimize unsupervised clustering of temporal sequences of VL data that are asynchronous and noisy. Our findings demonstrate their success in identifying five viral load patterns often reported in the literature
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-12">12</xref>
                </sup>. It is possible that additional viral load patterns may emerge in the future, for example due to new HIV variants that are resistant to current therapies. The method reported here is flexible enough to recognize such new temporal patterns of VL responses. It is also general enough that models could be trained on other viral infections that have patterns of natural or treatment-related patient responses (e.g. hepatitis B and C, parvovirus B19), although this may require defining new features that capture disease-specific pattern variants.</p>
            <p>A common practice in data analytics is to calculate the centroid as the average of the points
                <sup>
                    <xref ref-type="bibr" rid="ref-22">22</xref>,
                    <xref ref-type="bibr" rid="ref-30">30</xref>
                </sup>. However, 
                <xref ref-type="table" rid="T3">Table 3</xref> suggests that the mean is not necessarily the best centroid for HIV viral load data. We note two advantages of the centroid algorithm: first, we can choose the centroid best corresponding to the shape of the data, and second, we can use it to mathematically determine the amount of overlap between n-dimensional cluster spheres (i.e. viral load categories). This method may facilitate cross-comparison of HIV research studies by providing a standard for VL pattern classification. Such standardization would be immensely useful in meta-analyses
                <sup>
                    <xref ref-type="bibr" rid="ref-31">31</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-35">35</xref>
                </sup>, potentially revealing the influence of different patient care strategies or new relationships between different patient populations.</p>
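            <p>The overlap test mentioned above can be illustrated with a minimal sketch. It is not the centroid algorithm proposed in this paper; it assumes only that each cluster is summarized by a center and a radius, and it relies on the general geometric fact that two spheres intersect when the distance between their centers is smaller than the sum of their radii. The function names (<monospace>centroid</monospace>, <monospace>enclosing_radius</monospace>, <monospace>overlap_depth</monospace>) and the example points are hypothetical and chosen purely for illustration.</p>
            <code language="python"><![CDATA[
```python
import math
from statistics import mean, median


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points, method="mean"):
    """Coordinate-wise centroid of equal-length points ("mean" or "median")."""
    reducer = {"mean": mean, "median": median}[method]
    return [reducer(axis) for axis in zip(*points)]


def enclosing_radius(points, center):
    """Radius of the sphere around `center` that just contains every point."""
    return max(euclidean(p, center) for p in points)


def overlap_depth(center_a, radius_a, center_b, radius_b):
    """Sum of radii minus center distance; positive means the spheres intersect."""
    return radius_a + radius_b - euclidean(center_a, center_b)


# Illustrative 2-D points (not study data); the same code works in n dimensions.
skewed_cluster = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (1.0, 1.0)]
mean_center = centroid(skewed_cluster, "mean")      # pulled toward the outlier
median_center = centroid(skewed_cluster, "median")  # stays near the dense region

# Two unit spheres whose centers are 1.5 apart interpenetrate by 0.5.
depth = overlap_depth((0.0, 0.0), 1.0, (1.5, 0.0), 1.0)
```
]]></code>
            <p>In this toy example the single distant point shifts the mean-based center away from the dense region while the median-based center stays close to it, which is one way in which the choice of centroid can matter for skewed clusters.</p>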
            <p>Our work also explored the trade-off with respect to predictive accuracy between model interpretability and more complex, "black box" approaches to classification. The interpretability versus predictability problem is well known in the deep learning literature
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-38">38</xref>
                </sup>. Interpretability is a desirable attribute in clinical classification systems, allowing clinicians to integrate causal physiology and diagnostic information with data features in a way that promotes clearer bedside clinical reasoning. Using an interpretable model for assigning viral load pattern membership may be advantageous when a clinician wishes to use the assigned pattern membership to aid in making a critical clinical decision (e.g. choosing between treatment options), or when examining features that may be linked to a mechanism (e.g. slope of VL decline and viral genotype). A "black box" or more complex model may make such decisions or interpretations more difficult
                <sup>
                    <xref ref-type="bibr" rid="ref-39">39</xref>
                </sup>, which can favor the use of simpler models at the expense of some predictive power.</p>
            <p>Along these lines, we have also proposed a novel centroid-based algorithm for summarization of clustering results. This algorithm is not meant to supplant other well-defined supervised learning algorithms, but rather to aid in interpretable assignment of VL patterns from other data sets into one of the five categories. The algorithm results are concise, allowing investigators to build the model in their preferred programming language. Hence this method may improve and standardize HIV population research by giving precise definitions to the varying temporal VL patterns, and may potentially improve patient care.</p>
            <p>Several caveats apply to our work. As noted, this is a single-center study, and thus our method should be tested with a much larger data set to cross-validate the categories represented by the clusters. In addition, our feature vector was designed specifically around VL patterns previously reported in the literature, rather than derived by objectively clustering the data with a standard time-series clustering method
                <sup>
                    <xref ref-type="bibr" rid="ref-40">40</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-42">42</xref>
                </sup>. This may limit generalizability to other VL analyses. In addition, some of our features are slightly collinear, with the greatest correlation coefficient being between 
                <italic toggle="yes">IQR</italic> and 
                <italic toggle="yes">wRR</italic> (-0.717). However, while HVLS and RVL both have a varied range of 
                <italic toggle="yes">IQR</italic>, it is clear that the HVLS class has greater 
                <italic toggle="yes">wRR</italic> than the RVL class because HVLS patients have a long, consistent viral load tail. Furthermore, 
                <italic toggle="yes">IQR</italic> helps distinguish the HVLS class from the SLVL or SHVL class; hence both 
                <italic toggle="yes">IQR</italic> and 
                <italic toggle="yes">wRR</italic> are necessary despite the slight correlation. Finally, because our method normalizes time into 
                <italic toggle="yes">number of days since first VL measurement</italic>, we lose the ability to look for seasonal or yearly patterns in the data.</p>
            <p>Our data set did not include patients in whom VL was initially suppressed and then rebounded (EVL). Although we originally hypothesized the existence of six distinct VL patterns, we found that the 
                <italic toggle="yes">emergent VL</italic> group was not a pattern identified in our data. Perhaps this is a consequence of a high rate of local patient engagement in therapy in this cohort study, access to care, or the effectiveness of highly active anti-retroviral therapy regimens. We hypothesize that these conditions may not always exist (e.g. in areas where HAART is expensive, when people may lose the ability to pay for therapy), and that in such cases the EVL pattern may indeed be present and significant. Based on the formulation of the 
                <italic toggle="yes">adj MD</italic> and 
                <italic toggle="yes">wRR</italic> features, we hypothesize that a consequence of the grounding function is that any EVL pattern, if it exists, will be grouped under RVL. This grouping may be appropriate, as one can argue that going from a suppressed state to a high VL state is a form of rebounding. Clinical treatment of these patterns is likely to be similar. Further work with data sets that contain EVL patterns will need to be done to test these hypotheses. Unfortunately, we are not aware of any such data currently in the public domain.</p>
            <p>Our method used hierarchical clustering to define groups, with a cutoff for group specification at a high level in the branching tree (i.e. level 5). Such thresholds or tuning parameters are characteristic of most unsupervised clustering algorithms
                <sup>
                    <xref ref-type="bibr" rid="ref-22">22</xref>,
                    <xref ref-type="bibr" rid="ref-43">43</xref>
                </sup>. However, identification of important sub-clusters by using a lower threshold is also possible. Clustering results may change depending on the parameter chosen, revealing finer between-cluster differences as the number of clusters increases. The hierarchical clustering algorithm has the advantage that a proper cut-off can be easily visualized. For example, choosing a lower cut-off may reveal that the suppression group splits into categories with different rates of HIV viral load suppression during treatment. Researchers wishing to engineer a new feature vector for VL pattern segregation may find useful the 
                <xref ref-type="other" rid="SM1">Supplementary material</xref> on features we considered but subsequently removed due to poor performance.</p>
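            <p>The effect of moving the cut-off can be sketched with a small, self-contained example. This is not the clustering code used in the study: it is a naive agglomerative procedure written for illustration, the average-linkage criterion and the Euclidean distance are assumptions, and the function names (<monospace>agglomerate</monospace>, <monospace>cut</monospace>) and the example points are hypothetical. In practice an established library implementation would normally be preferred.</p>
            <code language="python"><![CDATA[
```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(points, cluster_a, cluster_b):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(points):
    """Naive agglomerative clustering; returns one partition per tree level.

    levels[0] holds one cluster per point and every later level merges the two
    closest clusters of the level before it, so levels[-1] is a single cluster.
    """
    clusters = [[i] for i in range(len(points))]
    levels = [clusters]
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        levels.append(clusters)
    return levels


def cut(levels, n_clusters):
    """Partition of the tree that contains exactly `n_clusters` groups."""
    if not 1 <= n_clusters <= len(levels):
        raise ValueError("n_clusters must be between 1 and the number of points")
    return levels[len(levels) - n_clusters]


# Illustrative points (not study data): one loose group that hides two
# tighter sub-groups, plus one well-separated group.
points = [(0.0, 0.0), (0.1, 0.0), (0.5, 0.0), (0.6, 0.0), (5.0, 5.0), (5.1, 5.0)]
levels = agglomerate(points)
coarse = cut(levels, 2)  # high cut-off: two groups
fine = cut(levels, 3)    # lower cut-off: the loose group splits in two
```
]]></code>
            <p>Because each level of the tree is obtained by merging two clusters of the level below, lowering the cut-off can only split existing groups and never reshuffles members between them, which is what makes sub-clusters straightforward to interpret.</p>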
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusions</title>
            <p>We have proposed a set of four unambiguous features that have been successfully used in segregating five different types of temporal viral load patterns: durably suppressed viral load (DSVL), sustained low viral load (SLVL), sustained high viral load (SHVL), high viral load suppression (HVLS), and rebounding viral load (RVL). We have also proposed a novel centroid-based cluster summary algorithm. The use of this algorithm may improve meta-analyses or population studies of viral load patterns by standardizing the classification of HIV patient categories. Furthermore, the segregation process used in this paper (i.e. identifying domain-specific features, performing unsupervised clustering, interpreting the results with a cluster summary) can be used to model other viral infections and the response of VL levels over time to treatment or natural disease progression. We also found that using a temporal state variation method is important when considering patient viral load classifications, as changes in patient response can continue to occur beyond previously estimated time frames.</p>
        </sec>
        <sec>
            <title>Abbreviations</title>
            <p>AdjMD = adjusted maximal viral load difference, 
                <italic toggle="yes">Area</italic> = relative area of viral load exposure, ART = anti-retroviral therapy, DT = decision tree, EVL = 
                <styled-content style="#FF0000" style-type="color">emerging viral load</styled-content>, 
                <styled-content style="#009E73" style-type="color">DSVL</styled-content> = Durably Suppressed Viral Load, HAART = highly active antiretroviral therapy, HIV = human immunodeficiency virus, 
                <italic toggle="yes">IQR</italic> = interquartile range, kNN = k nearest neighbor, LLVL = low level viral load, LOOCV = leave-one-out cross validation, 
                <styled-content style="#D55E00" style-type="color">SHVL</styled-content> = Sustained High Viral Load, 
                <styled-content style="#000BB5" style-type="color">SLVL</styled-content> = Sustained Low Viral Load, RVL = Rebounding Viral Load, 
                <styled-content style="#CC79A7" style-type="color">HVLS</styled-content> = High Viral Load Suppression, SVM = support vector machine, VL = viral load, VLM = viral load measurement, 
                <italic toggle="yes">wRR</italic> = weighted recency reliability.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>Full access to the data is available on GitHub (Data S1): 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1313245">https://doi.org/10.5281/zenodo.1313245</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>
                </sup>
            </p>
            <p>
                <bold>Data S1: Viral load data.</bold> The data set used for this study is provided in completely deidentified CSV format, in which the first column holds a random identifier for a unique subject. The subsequent values are given as 
                <italic toggle="yes">t
                    <sub>i,j</sub>
                </italic>, 
                <italic toggle="yes">VL
                    <sub>i,j</sub>
                </italic>, where 
                <italic toggle="yes">t
                    <sub>i,j</sub>
                </italic> is the time from a universal 
                <italic toggle="yes">T</italic>
                <sub>0</sub> for the VL measurement 
                <italic toggle="yes">j</italic> for patient 
                <italic toggle="yes">i</italic>, and 
                <italic toggle="yes">VL
                    <sub>i,j</sub>
                </italic> is the corresponding VL measurement. The length of each record (row) varies with the number of VL measurements present for that subject. The study data and the code used for analysis can be found at 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1313245">https://doi.org/10.5281/zenodo.1313245</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>
                </sup>.</p>
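            <p>A minimal sketch of how records in this layout might be read is given below. It is not the analysis code deposited with the data. It assumes that the file has no header row, that every row is a subject identifier followed by alternating time and VL values, and that times are numeric and expressed in days; the function names (<monospace>parse_viral_load_rows</monospace>, <monospace>days_since_first</monospace>) and the two sample rows are hypothetical and do not come from the study data.</p>
            <code language="python"><![CDATA[
```python
import csv
import io


def parse_viral_load_rows(handle):
    """Map each subject identifier to a time-sorted list of (time, VL) pairs."""
    records = {}
    for row in csv.reader(handle):
        fields = [field.strip() for field in row if field.strip()]
        if not fields:
            continue  # skip blank lines
        subject, values = fields[0], fields[1:]
        if len(values) % 2 != 0:
            raise ValueError(f"subject {subject}: unpaired time/VL value")
        times = map(float, values[0::2])
        loads = map(float, values[1::2])
        records[subject] = sorted(zip(times, loads))
    return records


def days_since_first(series):
    """Shift times so that a subject's first measurement falls on day 0."""
    if not series:
        return []
    start = series[0][0]
    return [(t - start, vl) for t, vl in series]


# Synthetic rows of different lengths (made-up values, assumed layout).
SAMPLE = "a1,100,50000,130,400,190,20\nb2,40,48,400,48\n"
records = parse_viral_load_rows(io.StringIO(SAMPLE))
normalized = {subject: days_since_first(series) for subject, series in records.items()}
```
]]></code>
            <p>Re-expressing each series relative to the subject's first measurement makes records of different lengths and start dates comparable, at the cost of discarding calendar information such as season or year.</p>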
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We would like to thank Yusuf Bilgic (State University of New York at Geneseo) and James Java (University of Rochester) for discussions regarding the statistical analyses.</p>
        </ack>
        <sec id="SM1" sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p id="FN1">
                <bold>Supplementary Figures:</bold>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15591/92750418-668c-4847-8a52-d620f4a9e75b.zip">Click here to access the data</ext-link>.</p>
            <list list-type="bullet">
                <list-item>
                    <p id="FS2">
                        <bold>Figure S1: Viral load distribution.</bold> For each pair of viral load measurements, we calculate the change in days and the change in viral load counts for all patients and plot the results as a scatter plot. The horizontal line of dots that appears between 0 and 2 is an artifact of using 20 and 48 in the data to replace the &#x201c;Pos 
                        <italic toggle="yes">&lt;</italic>20&#x201d; and &#x201c;Pos 
                        <italic toggle="yes">&lt;</italic>48&#x201d; values which appeared in our data. The sequential range of viral load measurements shows that VL measurements taken within 10 days of each other may vary by 
                        <italic toggle="yes">&#x00b1;</italic>10
                        <sup>5</sup> copies/mL.</p>
                </list-item>
                <list-item>
                    <p id="FS3">
                        <bold>Figure S2: Patient feature extraction.</bold> Feature extraction on 1576 patients displayed as 2D slices of the 4-dimensional feature space. Each slice plots one dimension against another as a scatter plot.</p>
                </list-item>
                <list-item>
                    <p id="FS4">
                        <bold>Figure S3: Decision Tree.</bold> While some useful rules may be pruned, the tree is otherwise complicated and difficult to draw useful conclusions from.</p>
                </list-item>
                <list-item>
                    <p id="FS5">
                        <bold>Figure S4: Seven centroid calculations on clustered viral load data.</bold> For each cluster, the seven methods of calculating a globular cluster center are shown in comparison to each other (calculated on the normalized and clustered viral load data). Since the PnP method can have a center outside the range of [0,1], an indicator is shown for when the center goes beyond the range.</p>
                </list-item>
                <list-item>
                    <p id="FS6">
                        <bold>Figure S5: Centroid methods.</bold> Gives a visual of how the seven methods work on an example point set. The green target signifies the exact center which is found according to the different methods in our algorithm.</p>
                </list-item>
            </list>
            <p id="SF7">
                <bold>Supplementary File 1: Review of existing viral load categorization methods and features and centroid detection methodologies that were considered but not used.</bold> A review of currently published viral load categorization methods.</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15591/57439f11-8a6c-42af-8367-c9f64b3d9be1.pdf">Click here to access the data</ext-link>.</p>
            <p id="ST8">
                <bold>Supplementary Tables:</bold>
            </p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/15591/5b7d6238-300c-4bd5-95fa-f25961d89ffa.pdf">Click here to access the data</ext-link>.</p>
            <list list-type="bullet">
                <list-item>
                    <p id="ST9">
                        <bold>Supplementary Table S1: Correlation coefficient matrix of features.</bold>
                    </p>
                </list-item>
                <list-item>
                    <p id="ST10">
                        <bold>Supplementary Table S2: Centroids and radii from polyhedral CM.</bold>
                    </p>
                </list-item>
            </list>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <collab>Centers for Disease Control and Prevention (CDC)</collab>:
                    <article-title>Vital signs: HIV prevention through care and treatment--United States.</article-title>
                    <source>

                        <italic toggle="yes">MMWR Morb Mortal Wkly Rep.</italic>
</source>
                    <year>2011</year>;<volume>60</volume>(<issue>47</issue>):<fpage>1618</fpage>&#x2013;<lpage>23</lpage>.
                    <pub-id pub-id-type="pmid">22129997</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yehia</surname>
                            <given-names>BR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fleishman</surname>
                            <given-names>JA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Metlay</surname>
                            <given-names>JP</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Sustained viral suppression in HIV-infected patients receiving antiretroviral therapy.</article-title>
                    <source>

                        <italic toggle="yes">JAMA.</italic>
</source>
                    <year>2012</year>;<volume>308</volume>(<issue>4</issue>):<fpage>339</fpage>&#x2013;<lpage>42</lpage>.
                    <pub-id pub-id-type="pmid">22820781</pub-id>
                    <pub-id pub-id-type="doi">10.1001/jama.2012.5927</pub-id>
                    <pub-id pub-id-type="pmcid">3541503</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mellors</surname>
                            <given-names>JW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mu&#x00f1;oz</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Giorgi</surname>
                            <given-names>JV</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Plasma viral load and CD4+ lymphocytes as prognostic markers of HIV-1 infection.</article-title>
                    <source>

                        <italic toggle="yes">Ann Intern Med.</italic>
</source>
                    <year>1997</year>;<volume>126</volume>(<issue>12</issue>):<fpage>946</fpage>&#x2013;<lpage>54</lpage>.
                    <pub-id pub-id-type="pmid">9182471</pub-id>
                    <pub-id pub-id-type="doi">10.7326/0003-4819-126-12-199706150-00003</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sterling</surname>
                            <given-names>TR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vlahov</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Astemborski</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Initial plasma HIV-1 RNA levels and progression to AIDS in women and men.</article-title>
                    <source>

                        <italic toggle="yes">N Engl J Med.</italic>
</source>
                    <year>2001</year>;<volume>344</volume>(<issue>10</issue>):<fpage>720</fpage>&#x2013;<lpage>725</lpage>.
                    <pub-id pub-id-type="pmid">11236775</pub-id>
                    <pub-id pub-id-type="doi">10.1056/NEJM200103083441003</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dybul</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fauci</surname>
                            <given-names>AS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bartlett</surname>
                            <given-names>JG</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Guidelines for using antiretroviral agents among HIV-infected adults and adolescents: recommendations of the Panel on Clinical Practices for Treatment of HIV.</article-title>
                    <source>

                        <italic toggle="yes">MMWR Recommendations and reports: Morbidity and mortality weekly report Recommendations and reports/Centers for Disease Control.</italic>
</source>
                    <year>2002</year>;<volume>51</volume>(<issue>RR-7</issue>):<fpage>1</fpage>&#x2013;<lpage>55</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.cdc.gov/mmwr/PDF/rr/rr5107.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Attia</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Egger</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>M&#x00fc;ller</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Sexual transmission of HIV according to viral load and antiretroviral therapy: systematic review and meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2009</year>;<volume>23</volume>(<issue>11</issue>):<fpage>1397</fpage>&#x2013;<lpage>1404</lpage>.
                    <pub-id pub-id-type="pmid">19381076</pub-id>
                    <pub-id pub-id-type="doi">10.1097/QAD.0b013e32832b7dca</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Viard</surname>
                            <given-names>JP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Burgard</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hubert</surname>
                            <given-names>JB</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Impact of 5 years of maximally successful highly active antiretroviral therapy on CD4 cell count and HIV-1 DNA level.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2004</year>;<volume>18</volume>(<issue>1</issue>):<fpage>45</fpage>&#x2013;<lpage>49</lpage>.
                    <pub-id pub-id-type="pmid">15090828</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Greub</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cozzi-Lepri</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ledergerber</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Intermittent and sustained low-level HIV viral rebound in patients receiving potent antiretroviral therapy.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2002</year>;<volume>16</volume>(<issue>14</issue>):<fpage>1967</fpage>&#x2013;<lpage>1969</lpage>.
                    <pub-id pub-id-type="pmid">12351960</pub-id>
                    <pub-id pub-id-type="doi">10.1097/00002030-200209270-00017</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Terzian</surname>
                            <given-names>AS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bodach</surname>
                            <given-names>SD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wiewel</surname>
                            <given-names>EW</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Novel use of surveillance data to detect HIV-infected persons with sustained high viral load and durable virologic suppression in New York City.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2012</year>;<volume>7</volume>(<issue>1</issue>):<fpage>e29679</fpage>.
                    <pub-id pub-id-type="pmid">22291892</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0029679</pub-id>
                    <pub-id pub-id-type="pmcid">3265470</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rose</surname>
                            <given-names>CE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gardner</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Craw</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A Comparison of Methods for Analyzing Viral Load Data in Studies of HIV Patients.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2015</year>;<volume>10</volume>(<issue>6</issue>):<fpage>e0130090</fpage>.
                    <pub-id pub-id-type="pmid">26090989</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0130090</pub-id>
                    <pub-id pub-id-type="pmcid">4474923</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>de Jong</surname>
                            <given-names>MD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Simmons</surname>
                            <given-names>CP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thanh</surname>
                            <given-names>TT</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Fatal outcome of human influenza A (H5N1) is associated with high viral load and hypercytokinemia.</article-title>
                    <source>

                        <italic toggle="yes">Nat Med.</italic>
</source>
                    <year>2006</year>;<volume>12</volume>(<issue>10</issue>):<fpage>1203</fpage>&#x2013;<lpage>1207</lpage>.
                    <pub-id pub-id-type="pmid">16964257</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nm1477</pub-id>
                    <pub-id pub-id-type="pmcid">4333202</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ylitalo</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>S&#x00f8;rensen</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Josefsson</surname>
                            <given-names>AM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Consistent high viral load of human papillomavirus 16 and risk of cervical carcinoma in situ: a nested case-control study.</article-title>
                    <source>

                        <italic toggle="yes">Lancet.</italic>
</source>
                    <year>2000</year>;<volume>355</volume>(<issue>9222</issue>):<fpage>2194</fpage>&#x2013;<lpage>2198</lpage>.
                    <pub-id pub-id-type="pmid">10881892</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0140-6736(00)02402-8</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Phillips</surname>
                            <given-names>AN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Staszewski</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Weber</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>HIV viral load response to antiretroviral therapy according to the baseline CD4 cell count and viral load.</article-title>
                    <source>

                        <italic toggle="yes">JAMA.</italic>
</source>
                    <year>2001</year>;<volume>286</volume>(<issue>20</issue>):<fpage>2560</fpage>&#x2013;<lpage>2567</lpage>.
                    <pub-id pub-id-type="pmid">11722270</pub-id>
                    <pub-id pub-id-type="doi">10.1001/jama.286.20.2560</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kononenko</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Machine learning for medical diagnosis: history, state of the art and perspective.</article-title>
                    <source>

                        <italic toggle="yes">Artif Intell Med.</italic>
</source>
                    <year>2001</year>;<volume>23</volume>(<issue>1</issue>):<fpage>89</fpage>&#x2013;<lpage>109</lpage>.
                    <pub-id pub-id-type="pmid">11470218</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0933-3657(01)00077-X</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dubey</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Applications of Machine Learning: Cutting Edge Technology in HIV Diagnosis, Treatment and Further Research.</article-title>
                    <source>

                        <italic toggle="yes">Computational Molecular Biology.</italic>
</source>
                    <year>2016</year>;<volume>6</volume>(<issue>3</issue>):<fpage>1</fpage>&#x2013;<lpage>6</lpage>.
                    <pub-id pub-id-type="doi">10.5376/cmb.2016.06.0003</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rosa</surname>
                            <given-names>RS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Santos</surname>
                            <given-names>RH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brito</surname>
                            <given-names>AY</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Insights on prediction of patients&#x2019; response to anti-HIV therapies through machine learning.</article-title> In:
                    <italic toggle="yes">Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE</italic>;<year>2014</year>;<fpage>3697</fpage>&#x2013;<lpage>3704</lpage>.
                    <pub-id pub-id-type="doi">10.1109/IJCNN.2014.6889659</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rodr&#x00ed;guez</surname>
                            <given-names>JO</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Prieto</surname>
                            <given-names>SE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Correa</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Predictions of CD4 lymphocytes' count in HIV patients from complete blood count.</article-title>
                    <source>

                        <italic toggle="yes">BMC Med Phys.</italic>
</source>
                    <year>2013</year>;<volume>13</volume>(<issue>1</issue>):<fpage>3</fpage>.
                    <pub-id pub-id-type="pmid">24034560</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1756-6649-13-3</pub-id>
                    <pub-id pub-id-type="pmcid">3847222</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ramirez</surname>
                            <given-names>CM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sinclair</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Epling</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Immunologic profiles distinguish aviremic HIV-infected adults.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2016</year>;<volume>30</volume>(<issue>10</issue>):<fpage>1553</fpage>&#x2013;<lpage>1562</lpage>.
                    <pub-id pub-id-type="pmid">26854811</pub-id>
                    <pub-id pub-id-type="doi">10.1097/QAD.0000000000001049</pub-id>
                    <pub-id pub-id-type="pmcid">5679214</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Parbhoo</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bogojeska</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zazzi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Combining Kernel and Model Based Learning for HIV Therapy Selection.</article-title>
                    <source>

                        <italic toggle="yes">AMIA Jt Summits Transl Sci Proc.</italic>
</source>
                    <year>2017</year>;<volume>2017</volume>:<fpage>239</fpage>&#x2013;<lpage>248</lpage>.
                    <pub-id pub-id-type="pmid">28815137</pub-id>
                    <pub-id pub-id-type="pmcid">5543338</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <collab>Centers for Medicare &amp; Medicaid Services</collab>:
                    <article-title>CMS Cell Size Suppression Policy</article-title>.<year>2015</year>. [Online; accessed 29-November-2017].
                    <ext-link ext-link-type="uri" xlink:href="https://www.resdac.org/resconnect/articles/26">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="book">
                    <collab>SamirRCHI</collab>:
                    <article-title>Samir-RCHI/Viral_Load_Data_Categorization: HIV Viral Load Categorization Release (Version v0.1-alpha).</article-title>
                    <italic toggle="yes">Zenodo</italic>.<year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.1313245">http://www.doi.org/10.5281/zenodo.1313245</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Han</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pei</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kamber</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Data mining: concepts and techniques</article-title>. Elsevier;<year>2011</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://myweb.sabanciuniv.edu/rdehkharghani/files/2016/02/The-Morgan-Kaufmann-Series-in-Data-Management-Systems-Jiawei-Han-Micheline-Kamber-Jian-Pei-Data-Mining.-Concepts-and-Techniques-3rd-Edition-Morgan-Kaufmann-2011.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Punj</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stewart</surname>
                            <given-names>DW</given-names>
                        </name>
</person-group>:
                    <article-title>Cluster Analysis in Marketing Research: Review and Suggestions for Application.</article-title>
                    <source>

                        <italic toggle="yes">J Mark Res.</italic>
</source>
                    <year>1983</year>;<volume>20</volume>(<issue>2</issue>):<fpage>134</fpage>&#x2013;<lpage>148</lpage>.
                    <pub-id pub-id-type="doi">10.2307/3151680</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Keller</surname>
                            <given-names>JM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gray</surname>
                            <given-names>MR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Givens</surname>
                            <given-names>JA</given-names>
                        </name>
</person-group>:
                    <article-title>A fuzzy K-nearest neighbor algorithm.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Transactions on Systems, Man, and Cybernetics.</italic>
</source>
                    <year>1985</year>;<volume>SMC-15</volume>(<issue>4</issue>):<fpage>580</fpage>&#x2013;<lpage>585</lpage>.
                    <pub-id pub-id-type="doi">10.1109/TSMC.1985.6313426</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Maire</surname>
                            <given-names>F</given-names>
                        </name>
</person-group>:
                    <article-title>An algorithm for the exact computation of the centroid of higher dimensional polyhedra and its application to kernel machines</article-title>. In:
                    <italic toggle="yes">Third IEEE International Conference on Data Mining</italic>.
                    <source>

                        <italic toggle="yes">IEEE Comput Soc.</italic>
</source>
                    <year>2003</year>.
                    <pub-id pub-id-type="doi">10.1109/ICDM.2003.1250988</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kobourov</surname>
                            <given-names>SG</given-names>
                        </name>
</person-group>:
                    <article-title>Spring Embedders and Force Directed Graph Drawing Algorithms</article-title>. arXiv preprint arXiv:1201.3011.<year>2012</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/pdf/1201.3011.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Fruchterman</surname>
                            <given-names>TMJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Reingold</surname>
                            <given-names>EM</given-names>
                        </name>
</person-group>:
                    <article-title>Graph drawing by force&#x2010;directed placement.</article-title>
                    <source>

                        <italic toggle="yes">Software: Practice and Experience.</italic>
</source>
                    <year>1991</year>;<volume>21</volume>(<issue>11</issue>):<fpage>1129</fpage>&#x2013;<lpage>1164</lpage>.
                    <pub-id pub-id-type="doi">10.1002/spe.4380211102</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Noack</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Modularity clustering is force-directed layout.</article-title>
                    <source>

                        <italic toggle="yes">Phys Rev E Stat Nonlin Soft Matter Phys.</italic>
</source>
                    <year>2009</year>;<volume>79</volume>(<issue>2 Pt 2</issue>):<fpage>026102</fpage>.
                    <pub-id pub-id-type="pmid">19391801</pub-id>
                    <pub-id pub-id-type="doi">10.1103/PhysRevE.79.026102</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zand</surname>
                            <given-names>MS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Trayhan</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Farooq</surname>
                            <given-names>SA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Properties of healthcare teaming networks as a function of network construction algorithms.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2017</year>;<volume>12</volume>(<issue>4</issue>):<fpage>e0175876</fpage>.
                    <pub-id pub-id-type="pmid">28426795</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0175876</pub-id>
                    <pub-id pub-id-type="pmcid">5398561</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Abdi</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <article-title>Centroids.</article-title>
                    <source>

                        <italic toggle="yes">Wiley Interdiscip Rev Comput Stat.</italic>
</source>
                    <year>2009</year>;<volume>1</volume>(<issue>2</issue>):<fpage>259</fpage>&#x2013;<lpage>260</lpage>.
                    <pub-id pub-id-type="doi">10.1002/wics.31</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Etter</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Landovitz</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sibeko</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Recommendations for the follow-up of study participants with breakthrough HIV infections during HIV/AIDS biomedical prevention studies.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2013</year>;<volume>27</volume>(<issue>7</issue>):<fpage>1119</fpage>&#x2013;<lpage>1128</lpage>.
                    <pub-id pub-id-type="pmid">23262497</pub-id>
                    <pub-id pub-id-type="doi">10.1097/QAD.0b013e32835dc08e</pub-id>
                    <pub-id pub-id-type="pmcid">4286368</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Olsen</surname>
                            <given-names>CM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Knight</surname>
                            <given-names>LL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Green</surname>
                            <given-names>AC</given-names>
                        </name>
</person-group>:
                    <article-title>Risk of melanoma in people with HIV/AIDS in the pre- and post-HAART eras: a systematic review and meta-analysis of cohort studies.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2014</year>;<volume>9</volume>(<issue>4</issue>):<fpage>e95096</fpage>.
                    <pub-id pub-id-type="pmid">24740329</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0095096</pub-id>
                    <pub-id pub-id-type="pmcid">3989294</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Blaser</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wettstein</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Estill</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Impact of viral load and the duration of primary infection on HIV transmission: systematic review and meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">AIDS.</italic>
</source>
                    <year>2014</year>;<volume>28</volume>(<issue>7</issue>):<fpage>1021</fpage>&#x2013;<lpage>1029</lpage>.
                    <pub-id pub-id-type="pmid">24691205</pub-id>
                    <pub-id pub-id-type="doi">10.1097/QAD.0000000000000135</pub-id>
                    <pub-id pub-id-type="pmcid">4058443</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Boender</surname>
                            <given-names>TS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sigaloff</surname>
                            <given-names>KC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McMahon</surname>
                            <given-names>JH</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Long-term Virological Outcomes of First-Line Antiretroviral Therapy for HIV-1 in Low- and Middle-Income Countries: A Systematic Review and Meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">Clin Infect Dis.</italic>
</source>
                    <year>2015</year>;<volume>61</volume>(<issue>9</issue>):<fpage>1453</fpage>&#x2013;<lpage>1461</lpage>.
                    <pub-id pub-id-type="pmid">26157050</pub-id>
                    <pub-id pub-id-type="doi">10.1093/cid/civ556</pub-id>
                    <pub-id pub-id-type="pmcid">4599392</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Boerma</surname>
                            <given-names>RS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boender</surname>
                            <given-names>TS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bussink</surname>
                            <given-names>AP</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Suboptimal Viral Suppression Rates Among HIV-Infected Children in Low- and Middle-Income Countries: A Meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">Clin Infect Dis.</italic>
</source>
                    <year>2016</year>;<volume>63</volume>(<issue>12</issue>):<fpage>1645</fpage>&#x2013;<lpage>1654</lpage>.
                    <pub-id pub-id-type="pmid">27660236</pub-id>
                    <pub-id pub-id-type="doi">10.1093/cid/ciw645</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <label>36</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bologna</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Symbolic Rule Extraction from the DIMLP Neural Network.</article-title> In:
                    <italic toggle="yes">Lecture Notes in Computer Science</italic>. Springer Berlin Heidelberg;<year>2000</year>;<fpage>240</fpage>&#x2013;<lpage>254</lpage>.
                    <pub-id pub-id-type="doi">10.1007/10719871_17</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bologna</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>A model for single and multiple knowledge based networks.</article-title>
                    <source>

                        <italic toggle="yes">Artif Intell Med.</italic>
</source>
                    <year>2003</year>;<volume>28</volume>(<issue>2</issue>):<fpage>141</fpage>&#x2013;<lpage>163</lpage>.
                    <pub-id pub-id-type="pmid">12893117</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0933-3657(03)00055-1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Intrator</surname>
                            <given-names>O</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Intrator</surname>
                            <given-names>N</given-names>
                        </name>
</person-group>:
                    <article-title>Interpreting neural-network results: a simulation study.</article-title>
                    <source>

                        <italic toggle="yes">Comput Stat Data Anal.</italic>
</source>
                    <year>2001</year>;<volume>37</volume>(<issue>3</issue>):<fpage>373</fpage>&#x2013;<lpage>393</lpage>.
                    <pub-id pub-id-type="doi">10.1016/S0167-9473(01)00016-0</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shickel</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tighe</surname>
                            <given-names>PJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bihorac</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis.</article-title>
                    <source>

                        <italic toggle="yes">IEEE J Biomed Health Inform.</italic>
</source>
                    <year>2017</year>;<fpage>1</fpage>&#x2013;<lpage>1</lpage>.
                    <pub-id pub-id-type="pmid">29989977</pub-id>
                    <pub-id pub-id-type="doi">10.1109/JBHI.2017.2767063</pub-id>
                    <pub-id pub-id-type="pmcid">6043423</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <label>40</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Klapper-Rybicka</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schraudolph</surname>
                            <given-names>NN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schmidhuber</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>Unsupervised Learning in LSTM Recurrent Neural Networks.</article-title> In:
                    <italic toggle="yes">Artificial Neural Networks &#x2014; ICANN 2001</italic>. Springer Berlin Heidelberg;<year>2001</year>;<fpage>684</fpage>&#x2013;<lpage>691</lpage>.
                    <pub-id pub-id-type="doi">10.1007/3-540-44668-0_95</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-41">
                <label>41</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bahadori</surname>
                            <given-names>MT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kale</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fan</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Functional subspace clustering with application to time series.</article-title>In:
                    <italic toggle="yes">International Conference on Machine Learning</italic>.<year>2015</year>;<fpage>228</fpage>&#x2013;<lpage>237</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="http://www-bcf.usc.edu/~fanyingy/publications/ICML-BKFL15.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-42">
                <label>42</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kontaki</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Papadopoulos</surname>
                            <given-names>AN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Manolopoulos</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>Continuous subspace clustering in streaming time series.</article-title>
                    <source>

                        <italic toggle="yes">Inf Syst.</italic>
</source>
                    <year>2008</year>;<volume>33</volume>(<issue>2</issue>):<fpage>240</fpage>&#x2013;<lpage>260</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.is.2007.09.001</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-43">
                <label>43</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Karypis</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Han</surname>
                            <given-names>EH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kumar</surname>
                            <given-names>V</given-names>
                        </name>
</person-group>:
                    <article-title>Chameleon: hierarchical clustering using dynamic modeling.</article-title>
                    <source>

                        <italic toggle="yes">Computer.</italic>
</source>
                    <year>1999</year>;<volume>32</volume>(<issue>8</issue>):<fpage>68</fpage>&#x2013;<lpage>75</lpage>.
                    <pub-id pub-id-type="doi">10.1109/2.781637</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report39645">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17007.r39645</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Telenti</surname>
                        <given-names>Amalio</given-names>
                    </name>
                    <xref ref-type="aff" rid="r39645a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6290-7677</uri>
                </contrib>
                <aff id="r39645a1">
                    <label>1</label>The Scripps Research Institute, La Jolla, CA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>10</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Telenti A</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport39645" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15591.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This report brings machine learning approaches to the classification of patterns of viral control in HIV-infected individuals. This is welcome because, although this is a mature field in HIV, it signals the opportunity for new models in this and in other infections.</p>
            <p> 
                <bold>Strengths</bold>: very well-developed models and excellent reporting of the results through figures and documentation. The code and datasets are available on GitHub.</p>
            <p> 
                <bold>Weakness</bold>: the model is trained and implemented on a suboptimal dataset. Treatment response in HIV infection (and thus the modeling of viral response) is well understood and is best modeled with knowledge of the time of treatment initiation and a full understanding of the variables influencing treatment response. Having a cohort that is described solely by &#x201c;time from first measured viral load&#x201d; is, to all intents and purposes, suboptimal. An additional issue is the reliance on a limited number of viral load determinations for an unclear number of individuals. Depending on the circumstances of sampling, having three viral loads over an undisclosed time period is not devoid of uncontrolled biases. Lastly, the text is equivocal about the use of the last time point &#x2013; the reviewer understands that the information contained in the last point may be weighted because of the possibility that it is noisy. Unfortunately, in real life, that is the moment where strong predictive models are needed. It is possible that this was actually the goal of the authors.</p>
            <p> 
                <bold>Summary</bold>: this work is a valuable contribution to the field, and the basic concepts and models will hopefully be deployed in the study of datasets that are more appropriate for this exercise. It is desirable that future modeling includes a more ambitious plan to move from the current train-test approach to one that establishes the generalization of the model. It will also be critical to observe the predictive value of the model on longer term outcomes.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Host and pathogen genomics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report37937">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17007.r37937</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Blower</surname>
                        <given-names>Sally</given-names>
                    </name>
                    <xref ref-type="aff" rid="r37937a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r37937a1">
                    <label>1</label>Center for Biomedical Modeling, Semel Institute of Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>9</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Blower S</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport37937" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.15591.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This is an extremely interesting study that proposes a novel quantitative methodology for classifying HIV patients by viral load patterns. The authors propose four computable characteristics of time-varying viral load patterns and a novel classification algorithm. They demonstrate their approach by classifying a group of 1,576 HIV-positive patients into five categories based on viral load patterns.</p>
            <p> This is an extremely well-written, interesting paper with excellent figures. The proposed methodology has great importance and utility for both research studies and clinical programs.</p>
            <p> My only very minor comments are: 
                <list list-type="order">
                    <list-item>
                        <p>In the descriptions of mathematical notation given in Table 2, I suggest changing the description "refers to a single patient" to "refers to a specific patient".</p>
                    </list-item>
                    <list-item>
                        <p>In Equation 10, please clarify what the denominator means, i.e., how "positive" differs from "true positive" or "false positive".</p>
                    </list-item>
                    <list-item>
                        <p>In the paragraph on page 6, under the heading Analytic terminology, editing is needed on line 8, where it says: clusters. interchangeably.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>mathematical modeling of HIV</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
